Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more class-based FFTs in pycbc_live #4096

Merged
merged 18 commits into from
Aug 12, 2022

Conversation

spxiwh
Copy link
Contributor

@spxiwh spxiwh commented Jul 26, 2022

This patch (carrying on from #4016) implements class-based FFTs in some of the more time-critical areas of PyCBC Live (focusing on the early warning setup for now). This isn't so trivial because:

  • Some of this is standard module code, where most of the time you wouldn't want class-based FFTs.
  • The sizes of the inputs can vary with PyCBC Live (based on template duration), and so we need to cache a variable number of FFT memory arrays and plan.

I think this is the cleanest I can do this, without any affect on other code. There is a hack to turn this on in resample.py in the pycbc_live executable.

Comments/suggestions welcomed!

Note that I haven't touched PSD estimation and autogating, which both do repeatedly call to function-API FFTs, and may be important in the broader PyCBC Live setup.

@titodalcanton
Copy link
Contributor

Could you show a before/after profile tree?

@@ -1245,6 +1245,37 @@ def verify_segment_options_multi_ifo(cls, opt, parser, ifos):
required_opts_multi_ifo(opt, parser, ifo, cls.required_opts_list)


def create_class_fft_for_cache(npoints_time, delta_t, dtype, ifft=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function belong here? It seems like a convenience method for FFTs in general no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahnitz @spxiwh I don't think it makes sense to move this into the FFT library proper, or if you do, you need to generalize it substantially. For instance, this code assumes you are transforming between a time series and a frequency series; it doesn't consider the array to array class at all.

Also, please think about the logic here. It may be true in a particular use case that any FFT instance of a given size can be reused, but that doesn't have to be true generally. It really depends on a particular user to know when an FFT with a certain set of its arguments the same as another can actually be reused; what is the "right choice" here is not (for example) the right choice in the MatchedFilter class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahnitz I moved this into pycbc.fft.class_api. Is this a better place for it??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@josh-willis Would you be okay with this living in the fft module if it allowed generic array <-> array R2C or C2R transforms (with a "NotImplementedError if C2C or similar is attempted)? Normalizations are handled outside of the class based API anyway, so I think the only real change to do that is to convert back to FSeries/TSeries within the libraries being changed. If not, any suggestions on where should I put this?

Regarding logic and use cases. There are two places this is being added (for the moment):

  • The Live data handler. This is specific to PyCBC Live and doesn't get called from anywhere else.
  • The high-pass filter in strain.py, which is used by many codes. In this case the old behaviour will be the default, and only by very specifically saying "use class-based cached FFTs" will the new behaviour be activated.

.... I think that autogating may also need this if being used in the full PyCBC Live analysis, but again I'd use the same behaviour as that used in strain.py.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spxiwh I guess so? I think my point is that the function should live at no higher level of generality than its use demands. For example, if the autogating, pycbc_live, and HPF are all similar enough use cases that it should live in strain and be called from the codes that need it, then I would prefer that. Things in FFT need to be broadly supporting FFTs as we use them.

I would clarify in the documentation (docstrnig) that this will create aligned-memory, out-of-place FFTs, if you leave it in the fft/class_api.py, since those are restrictions from the most general thing that the class-based FFT supports (ideally that would even be in the function name, but that is plenty long enough already).

Is there are reason that you create and manage the cache yourself, using python dictionaries, rather than using functools.lru_cache as you did before? Though code provided by others is not foolproof, I feel better using a widely used cache that is core python, and which if it has bugs will be something that others will fix. In particular, I worry about things being thread-safe; there is an open issue to clarify that functools.lru_cache is thread-safe here; I don't know what the answer is for what you have created.

pycbc/strain/strain.py Outdated Show resolved Hide resolved
@@ -1517,7 +1553,21 @@ def overwhitened_data(self, delta_f):
buffer_length = int(1.0 / delta_f)
e = len(self.strain)
s = int(e - buffer_length * self.sample_rate - self.reduced_pad * 2)
fseries = make_frequency_series(self.strain[s:e])
npoints_time = e - s
if npoints_time not in self.make_freq_cache:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have this block of code duplicated nearly verbatim in three places. It looks like these could be condensed into a single function callable (or static method of this class?) to avoid the code duplication and keep the previous more straightforward structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken another cut at cleaning this block up. Does it look better now?

@spxiwh
Copy link
Contributor Author

spxiwh commented Aug 2, 2022

@titodalcanton Profile before the change (with only one live detector, as I'm running this on streaming data):

image

and after the change:

image

I have seen some cases where the old code was spending more time doing FFT plans than this, but I still think we want to reduce data reading time as much as possible, as this is an operation that would only get slower if we increased number of processes.

dtype=fseries.dtype), delta_f=delta_f)
pycbc.fft.fft(overwhite2, fseries_trimmed)

vec, fseries_trimmed, fft_class = self.fft_cache[npoints_time][2]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spxiwh Sorry my previous comment amount redundancy was actually about lines like 1603-1606

@spxiwh spxiwh force-pushed the pr_class_based_ffts_for_live branch from f8bb770 to 281eac8 Compare August 8, 2022 14:04
@spxiwh
Copy link
Contributor Author

spxiwh commented Aug 8, 2022

I've just made some changes to this in response to comments, hopefully it's good to go now.

  • I moved the new functionality out of the fft module.
  • I've used the functools cache instead of our own.
  • I've refactored how this works so it's less code needed to actually call into the new functions

timeseries.dtype,
ifft=True,
uid=486876761
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that the number you select for the uid here is just so that it's different from anything else? If there's a reason it has this value (also below) then please add a code comment explaining what determines the value.

If it's just to be unique from other calls of the same function, would it be clearer to just define some constants in this file, like LFILTER_UNIQUE_ID = 1, or whatever?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Josh. This is just to give different outputs if called with the same inputs from different functions. The number just has to be unique. I made this a constant, with some inline comments, as suggested.

normalize_by_rate=False)

# FFT contents of timeseries into tfreq
tfreq = execute_cached_fft(timeseries, uid=91236752,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about magic number UID.

Copy link
Contributor

@josh-willis josh-willis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if you haven't, I would recommend profiling one more time with this latest version of changes, just so we know there are no surprises. But I will go ahead and approve, LGTM.

Thanks!

@spxiwh spxiwh force-pushed the pr_class_based_ffts_for_live branch from 90ce99e to 7afad24 Compare August 11, 2022 15:29
@spxiwh
Copy link
Contributor Author

spxiwh commented Aug 11, 2022

Thanks Josh. I've been profiling this in the online analysis over the past days, and the profile above is still holding.

I did add one more feature to this: The ability to copy the output data to avoid it being unintentionally being overwritten. This is on by default, so if anyone uses this, they just get a class-based FFT, but there is some additional memory overhead. If one wants to avoid that, they can either set the appropriate options (or just call the class-based FFT API directly). Now that these is approval, I'd like to merge this tomorrow if there are no objections? (As I have at least one more patch that will use this).

@josh-willis
Copy link
Contributor

@spxiwh It's certainly fine with me if you merge this whenever you are ready; if you also want to wait on approval from @ahnitz and @titodalcanton then I leave that up to you.

@spxiwh spxiwh merged commit 40377be into gwastro:master Aug 12, 2022
@spxiwh spxiwh deleted the pr_class_based_ffts_for_live branch August 12, 2022 10:05
connor-mcisaac pushed a commit to connor-mcisaac/pycbc that referenced this pull request Oct 12, 2022
* Use class-based API in StrainBuffer - PART 1

* Fixes to previous

* FIx to previous

* Attempt to move other block to strain API

* Add cache-based class-based FFTs in resample and strain

* Pycbc live to use cached FFTs

* Some fixes to previous

* CodeClimating

* Some reordering as per discussion on PR

* Typo

* Various fixes

* Remove additional complication that isn't needed

* Significant rework of cached_FFT

* Some clean up

* More cleanup

* Code climating

* Add UIDs as constants

* Copy output by default
acorreia61201 pushed a commit to acorreia61201/pycbc that referenced this pull request Apr 4, 2024
* Use class-based API in StrainBuffer - PART 1

* Fixes to previous

* FIx to previous

* Attempt to move other block to strain API

* Add cache-based class-based FFTs in resample and strain

* Pycbc live to use cached FFTs

* Some fixes to previous

* CodeClimating

* Some reordering as per discussion on PR

* Typo

* Various fixes

* Remove additional complication that isn't needed

* Significant rework of cached_FFT

* Some clean up

* More cleanup

* Code climating

* Add UIDs as constants

* Copy output by default
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants