-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use more class-based FFTs in pycbc_live #4096
Conversation
Could you show a before/after profile tree? |
pycbc/strain/strain.py
Outdated
@@ -1245,6 +1245,37 @@ def verify_segment_options_multi_ifo(cls, opt, parser, ifos): | |||
required_opts_multi_ifo(opt, parser, ifo, cls.required_opts_list) | |||
|
|||
|
|||
def create_class_fft_for_cache(npoints_time, delta_t, dtype, ifft=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this function belong here? It seems like a convenience method for FFTs in general no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahnitz @spxiwh I don't think it makes sense to move this into the FFT library proper, or if you do, you need to generalize it substantially. For instance, this code assumes you are transforming between a time series and a frequency series; it doesn't consider the array to array class at all.
Also, please think about the logic here. It may be true in a particular use case that any FFT instance of a given size can be reused, but that doesn't have to be true generally. It really depends on a particular user to know when an FFT with a certain set of its arguments the same as another can actually be reused; what is the "right choice" here is not (for example) the right choice in the MatchedFilter class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahnitz I moved this into pycbc.fft.class_api. Is this a better place for it??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@josh-willis Would you be okay with this living in the fft module if it allowed generic array <-> array R2C or C2R transforms (with a "NotImplementedError if C2C or similar is attempted)? Normalizations are handled outside of the class based API anyway, so I think the only real change to do that is to convert back to FSeries/TSeries within the libraries being changed. If not, any suggestions on where should I put this?
Regarding logic and use cases. There are two places this is being added (for the moment):
- The Live data handler. This is specific to PyCBC Live and doesn't get called from anywhere else.
- The high-pass filter in strain.py, which is used by many codes. In this case the old behaviour will be the default, and only by very specifically saying "use class-based cached FFTs" will the new behaviour be activated.
.... I think that autogating may also need this if being used in the full PyCBC Live analysis, but again I'd use the same behaviour as that used in strain.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spxiwh I guess so? I think my point is that the function should live at no higher level of generality than its use demands. For example, if the autogating, pycbc_live, and HPF are all similar enough use cases that it should live in strain
and be called from the codes that need it, then I would prefer that. Things in FFT need to be broadly supporting FFTs as we use them.
I would clarify in the documentation (docstrnig) that this will create aligned-memory, out-of-place FFTs, if you leave it in the fft/class_api.py
, since those are restrictions from the most general thing that the class-based FFT supports (ideally that would even be in the function name, but that is plenty long enough already).
Is there are reason that you create and manage the cache yourself, using python dictionaries, rather than using functools.lru_cache
as you did before? Though code provided by others is not foolproof, I feel better using a widely used cache that is core python, and which if it has bugs will be something that others will fix. In particular, I worry about things being thread-safe; there is an open issue to clarify that functools.lru_cache
is thread-safe here; I don't know what the answer is for what you have created.
pycbc/strain/strain.py
Outdated
@@ -1517,7 +1553,21 @@ def overwhitened_data(self, delta_f): | |||
buffer_length = int(1.0 / delta_f) | |||
e = len(self.strain) | |||
s = int(e - buffer_length * self.sample_rate - self.reduced_pad * 2) | |||
fseries = make_frequency_series(self.strain[s:e]) | |||
npoints_time = e - s | |||
if npoints_time not in self.make_freq_cache: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have this block of code duplicated nearly verbatim in three places. It looks like these could be condensed into a single function callable (or static method of this class?) to avoid the code duplication and keep the previous more straightforward structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken another cut at cleaning this block up. Does it look better now?
@titodalcanton Profile before the change (with only one live detector, as I'm running this on streaming data): and after the change: I have seen some cases where the old code was spending more time doing FFT plans than this, but I still think we want to reduce data reading time as much as possible, as this is an operation that would only get slower if we increased number of processes. |
pycbc/strain/strain.py
Outdated
dtype=fseries.dtype), delta_f=delta_f) | ||
pycbc.fft.fft(overwhite2, fseries_trimmed) | ||
|
||
vec, fseries_trimmed, fft_class = self.fft_cache[npoints_time][2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spxiwh Sorry my previous comment amount redundancy was actually about lines like 1603-1606
f8bb770
to
281eac8
Compare
I've just made some changes to this in response to comments, hopefully it's good to go now.
|
timeseries.dtype, | ||
ifft=True, | ||
uid=486876761 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that the number you select for the uid here is just so that it's different from anything else? If there's a reason it has this value (also below) then please add a code comment explaining what determines the value.
If it's just to be unique from other calls of the same function, would it be clearer to just define some constants in this file, like LFILTER_UNIQUE_ID = 1
, or whatever?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Josh. This is just to give different outputs if called with the same inputs from different functions. The number just has to be unique. I made this a constant, with some inline comments, as suggested.
pycbc/filter/resample.py
Outdated
normalize_by_rate=False) | ||
|
||
# FFT contents of timeseries into tfreq | ||
tfreq = execute_cached_fft(timeseries, uid=91236752, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above about magic number UID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if you haven't, I would recommend profiling one more time with this latest version of changes, just so we know there are no surprises. But I will go ahead and approve, LGTM.
Thanks!
90ce99e
to
7afad24
Compare
Thanks Josh. I've been profiling this in the online analysis over the past days, and the profile above is still holding. I did add one more feature to this: The ability to copy the output data to avoid it being unintentionally being overwritten. This is on by default, so if anyone uses this, they just get a class-based FFT, but there is some additional memory overhead. If one wants to avoid that, they can either set the appropriate options (or just call the class-based FFT API directly). Now that these is approval, I'd like to merge this tomorrow if there are no objections? (As I have at least one more patch that will use this). |
@spxiwh It's certainly fine with me if you merge this whenever you are ready; if you also want to wait on approval from @ahnitz and @titodalcanton then I leave that up to you. |
* Use class-based API in StrainBuffer - PART 1 * Fixes to previous * FIx to previous * Attempt to move other block to strain API * Add cache-based class-based FFTs in resample and strain * Pycbc live to use cached FFTs * Some fixes to previous * CodeClimating * Some reordering as per discussion on PR * Typo * Various fixes * Remove additional complication that isn't needed * Significant rework of cached_FFT * Some clean up * More cleanup * Code climating * Add UIDs as constants * Copy output by default
* Use class-based API in StrainBuffer - PART 1 * Fixes to previous * FIx to previous * Attempt to move other block to strain API * Add cache-based class-based FFTs in resample and strain * Pycbc live to use cached FFTs * Some fixes to previous * CodeClimating * Some reordering as per discussion on PR * Typo * Various fixes * Remove additional complication that isn't needed * Significant rework of cached_FFT * Some clean up * More cleanup * Code climating * Add UIDs as constants * Copy output by default
This patch (carrying on from #4016) implements class-based FFTs in some of the more time-critical areas of PyCBC Live (focusing on the early warning setup for now). This isn't so trivial because:
I think this is the cleanest I can do this, without any affect on other code. There is a hack to turn this on in resample.py in the pycbc_live executable.
Comments/suggestions welcomed!
Note that I haven't touched PSD estimation and autogating, which both do repeatedly call to function-API FFTs, and may be important in the broader PyCBC Live setup.