Use more class-based FFTs in pycbc_live #4096

spxiwh · 2022-07-26T16:24:10Z

This patch (carrying on from #4016) implements class-based FFTs in some of the more time-critical areas of PyCBC Live (focusing on the early warning setup for now). This isn't so trivial because:

Some of this is standard module code, where most of the time you wouldn't want class-based FFTs.
The sizes of the inputs can vary with PyCBC Live (based on template duration), and so we need to cache a variable number of FFT memory arrays and plan.

I think this is the cleanest I can do this, without any affect on other code. There is a hack to turn this on in resample.py in the pycbc_live executable.

Comments/suggestions welcomed!

Note that I haven't touched PSD estimation and autogating, which both do repeatedly call to function-API FFTs, and may be important in the broader PyCBC Live setup.

titodalcanton · 2022-07-28T13:01:35Z

Could you show a before/after profile tree?

ahnitz · 2022-07-28T13:09:55Z

pycbc/strain/strain.py

@@ -1245,6 +1245,37 @@ def verify_segment_options_multi_ifo(cls, opt, parser, ifos):
            required_opts_multi_ifo(opt, parser, ifo, cls.required_opts_list)


+def create_class_fft_for_cache(npoints_time, delta_t, dtype, ifft=False):


Does this function belong here? It seems like a convenience method for FFTs in general no?

@ahnitz @spxiwh I don't think it makes sense to move this into the FFT library proper, or if you do, you need to generalize it substantially. For instance, this code assumes you are transforming between a time series and a frequency series; it doesn't consider the array to array class at all.

Also, please think about the logic here. It may be true in a particular use case that any FFT instance of a given size can be reused, but that doesn't have to be true generally. It really depends on a particular user to know when an FFT with a certain set of its arguments the same as another can actually be reused; what is the "right choice" here is not (for example) the right choice in the MatchedFilter class.

@ahnitz I moved this into pycbc.fft.class_api. Is this a better place for it??

@josh-willis Would you be okay with this living in the fft module if it allowed generic array <-> array R2C or C2R transforms (with a "NotImplementedError if C2C or similar is attempted)? Normalizations are handled outside of the class based API anyway, so I think the only real change to do that is to convert back to FSeries/TSeries within the libraries being changed. If not, any suggestions on where should I put this?

Regarding logic and use cases. There are two places this is being added (for the moment):

The Live data handler. This is specific to PyCBC Live and doesn't get called from anywhere else.

The high-pass filter in strain.py, which is used by many codes. In this case the old behaviour will be the default, and only by very specifically saying "use class-based cached FFTs" will the new behaviour be activated.

.... I think that autogating may also need this if being used in the full PyCBC Live analysis, but again I'd use the same behaviour as that used in strain.py.

@spxiwh I guess so? I think my point is that the function should live at no higher level of generality than its use demands. For example, if the autogating, pycbc_live, and HPF are all similar enough use cases that it should live in strain and be called from the codes that need it, then I would prefer that. Things in FFT need to be broadly supporting FFTs as we use them.

I would clarify in the documentation (docstrnig) that this will create aligned-memory, out-of-place FFTs, if you leave it in the fft/class_api.py, since those are restrictions from the most general thing that the class-based FFT supports (ideally that would even be in the function name, but that is plenty long enough already).

Is there are reason that you create and manage the cache yourself, using python dictionaries, rather than using functools.lru_cache as you did before? Though code provided by others is not foolproof, I feel better using a widely used cache that is core python, and which if it has bugs will be something that others will fix. In particular, I worry about things being thread-safe; there is an open issue to clarify that functools.lru_cache is thread-safe here; I don't know what the answer is for what you have created.

pycbc/strain/strain.py

ahnitz · 2022-07-28T13:14:02Z

pycbc/strain/strain.py

@@ -1517,7 +1553,21 @@ def overwhitened_data(self, delta_f):
            buffer_length = int(1.0 / delta_f)
            e = len(self.strain)
            s = int(e - buffer_length * self.sample_rate - self.reduced_pad * 2)
-            fseries = make_frequency_series(self.strain[s:e])
+            npoints_time = e - s
+            if npoints_time not in self.make_freq_cache:


You have this block of code duplicated nearly verbatim in three places. It looks like these could be condensed into a single function callable (or static method of this class?) to avoid the code duplication and keep the previous more straightforward structure.

I've taken another cut at cleaning this block up. Does it look better now?

spxiwh · 2022-08-02T15:53:25Z

@titodalcanton Profile before the change (with only one live detector, as I'm running this on streaming data):

and after the change:

I have seen some cases where the old code was spending more time doing FFT plans than this, but I still think we want to reduce data reading time as much as possible, as this is an operation that would only get slower if we increased number of processes.

ahnitz · 2022-08-03T14:00:29Z

pycbc/strain/strain.py

-                                                  dtype=fseries.dtype), delta_f=delta_f)
-                pycbc.fft.fft(overwhite2, fseries_trimmed)
+
+                vec, fseries_trimmed, fft_class = self.fft_cache[npoints_time][2]


@spxiwh Sorry my previous comment amount redundancy was actually about lines like 1603-1606

spxiwh · 2022-08-08T14:19:55Z

I've just made some changes to this in response to comments, hopefully it's good to go now.

I moved the new functionality out of the fft module.
I've used the functools cache instead of our own.
I've refactored how this works so it's less code needed to actually call into the new functions

josh-willis · 2022-08-09T01:54:19Z

pycbc/filter/resample.py

+                timeseries.dtype,
+                ifft=True,
+                uid=486876761
+            )


I'm assuming that the number you select for the uid here is just so that it's different from anything else? If there's a reason it has this value (also below) then please add a code comment explaining what determines the value.

If it's just to be unique from other calls of the same function, would it be clearer to just define some constants in this file, like LFILTER_UNIQUE_ID = 1, or whatever?

Thanks Josh. This is just to give different outputs if called with the same inputs from different functions. The number just has to be unique. I made this a constant, with some inline comments, as suggested.

josh-willis · 2022-08-09T01:54:44Z

pycbc/filter/resample.py

+                                       normalize_by_rate=False)
+
+            # FFT contents of timeseries into tfreq
+            tfreq = execute_cached_fft(timeseries, uid=91236752,


Same comment as above about magic number UID.

josh-willis

So, if you haven't, I would recommend profiling one more time with this latest version of changes, just so we know there are no surprises. But I will go ahead and approve, LGTM.

Thanks!

spxiwh · 2022-08-11T15:32:10Z

Thanks Josh. I've been profiling this in the online analysis over the past days, and the profile above is still holding.

I did add one more feature to this: The ability to copy the output data to avoid it being unintentionally being overwritten. This is on by default, so if anyone uses this, they just get a class-based FFT, but there is some additional memory overhead. If one wants to avoid that, they can either set the appropriate options (or just call the class-based FFT API directly). Now that these is approval, I'd like to merge this tomorrow if there are no objections? (As I have at least one more patch that will use this).

josh-willis · 2022-08-11T15:51:09Z

@spxiwh It's certainly fine with me if you merge this whenever you are ready; if you also want to wait on approval from @ahnitz and @titodalcanton then I leave that up to you.

* Use class-based API in StrainBuffer - PART 1 * Fixes to previous * FIx to previous * Attempt to move other block to strain API * Add cache-based class-based FFTs in resample and strain * Pycbc live to use cached FFTs * Some fixes to previous * CodeClimating * Some reordering as per discussion on PR * Typo * Various fixes * Remove additional complication that isn't needed * Significant rework of cached_FFT * Some clean up * More cleanup * Code climating * Add UIDs as constants * Copy output by default

spxiwh requested review from ahnitz, josh-willis and titodalcanton July 26, 2022 16:24

titodalcanton added the low latency label Jul 26, 2022

ahnitz reviewed Jul 28, 2022

View reviewed changes

pycbc/strain/strain.py Outdated Show resolved Hide resolved

ahnitz reviewed Jul 28, 2022

View reviewed changes

ahnitz reviewed Aug 3, 2022

View reviewed changes

spxiwh force-pushed the pr_class_based_ffts_for_live branch from f8bb770 to 281eac8 Compare August 8, 2022 14:04

josh-willis reviewed Aug 9, 2022

View reviewed changes

josh-willis approved these changes Aug 9, 2022

View reviewed changes

spxiwh added 15 commits August 10, 2022 06:31

Use class-based API in StrainBuffer - PART 1

9b51dc0

Fixes to previous

f20bc27

FIx to previous

710c541

Attempt to move other block to strain API

4ad5b0a

Add cache-based class-based FFTs in resample and strain

08ea923

Pycbc live to use cached FFTs

7a308b1

Some fixes to previous

328b96f

CodeClimating

dd05619

Some reordering as per discussion on PR

9442701

Typo

89942c3

Various fixes

07212b9

Remove additional complication that isn't needed

aa5dc01

Significant rework of cached_FFT

4a9a4e7

Some clean up

f51a797

More cleanup

7dec230

spxiwh added 3 commits August 10, 2022 06:31

Code climating

bcb2abb

Add UIDs as constants

5563186

Copy output by default

7afad24

spxiwh force-pushed the pr_class_based_ffts_for_live branch from 90ce99e to 7afad24 Compare August 11, 2022 15:29

spxiwh merged commit 40377be into gwastro:master Aug 12, 2022

spxiwh deleted the pr_class_based_ffts_for_live branch August 12, 2022 10:05

spxiwh mentioned this pull request Aug 30, 2022

Speed up psd computation for PyCBC Live #4106

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use more class-based FFTs in pycbc_live #4096

Use more class-based FFTs in pycbc_live #4096

spxiwh commented Jul 26, 2022

titodalcanton commented Jul 28, 2022

ahnitz Jul 28, 2022

josh-willis Aug 2, 2022

spxiwh Aug 2, 2022

spxiwh Aug 3, 2022

josh-willis Aug 3, 2022

ahnitz Jul 28, 2022

spxiwh Aug 2, 2022

spxiwh commented Aug 2, 2022

ahnitz Aug 3, 2022

spxiwh commented Aug 8, 2022

josh-willis Aug 9, 2022

spxiwh Aug 9, 2022

josh-willis Aug 9, 2022

josh-willis left a comment

spxiwh commented Aug 11, 2022

josh-willis commented Aug 11, 2022

		@@ -1245,6 +1245,37 @@ def verify_segment_options_multi_ifo(cls, opt, parser, ifos):
		required_opts_multi_ifo(opt, parser, ifo, cls.required_opts_list)


		def create_class_fft_for_cache(npoints_time, delta_t, dtype, ifft=False):

Use more class-based FFTs in pycbc_live #4096

Use more class-based FFTs in pycbc_live #4096

Conversation

spxiwh commented Jul 26, 2022

titodalcanton commented Jul 28, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spxiwh commented Aug 2, 2022

Choose a reason for hiding this comment

spxiwh commented Aug 8, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

josh-willis left a comment

Choose a reason for hiding this comment

spxiwh commented Aug 11, 2022

josh-willis commented Aug 11, 2022