Skip to content

NlogN algorithm for least-squares fitting of periodic templates to noisy, non-equispaced time-series data.

License

Notifications You must be signed in to change notification settings

PrincetonUniversity/FastTemplatePeriodogram

Repository files navigation

Fast Template Periodogram

http://img.shields.io/travis/PrincetonUniversity/FastTemplatePeriodogram/master.svg?style=flat https://codecov.io/gh/PrincetonUniversity/FastTemplatePeriodogram/coverage.svg?branch=master
Authors:John Hoffman (johnh2o2@gmail.com) Jake VanderPlas (jakevdp@uw.edu)
Version:1.0.0

Check out the Scipy 2017 talk

Description

plots/templates_and_periodograms.png

The Fast Template Periodogram extends the Lomb-Scargle periodogram ([Barning1963], [Vanicek1971], [Lomb1976], [Scargle1982], [ZechmeisterKurster2009]) for arbitrary (periodic) signal shapes. It naturally handles non-uniformly sampled time-series data.

Template:Periodic signal shape, expressed as a truncated Fourier series of length H.
Periodogram:Least-squares estimator of the power spectral density (Lomb-Scargle); for more general models, proportional to the "goodness of fit" statistic for the best-fit model at that frequency. See [VanderPlas2017] for more details.

Uses the the non-equispaced Fast Fourier Transform to efficiently compute frequency-dependent sums.

The ftperiodogram library is complete with API documentation and consistency checks using py.test.

Installing

Installing with pip should work:

$ pip install ftperiodogram

If this doesn't work, consult the instructions in CONDA_INSTALL.md for installing ftperiodogram and its dependencies with with conda.

Examples

See the Examples.ipynb located in the notebooks/ directory.

To run this notebook, use the jupyter notebook command from inside the notebooks/ directory:

$ cd notebooks/
$ jupyter notebook

Updates

See the issues section for known bugs. You can also submit bugs and suggest improvements through that interface.

More information

Previous implementations

The gatspy library has an implementation of both single and multiband template fitting, however this implementation uses non-linear least-squares fitting to compute the optimal parameters (amplitude, phase, constant offset) of the template fit at each frequency. That process scales as N_obs*N_f, where N is the number of observations and N_f is the number of frequencies at which to calculate the periodogram.

This is more or less the procedure used in [Sesar_etal_2017] to perform template fits to Pan-STARRS photometry, however they used a more sophisticated multiband model that locked the phases, amplitudes and offsets of all bands together. They found that template fitting was significantly more accurate for estimating periods of RR Lyrae stars, but the computational resources needed for these fits were enormous (~30 minutes per object per CPU core).

How does the fast template periodogram improve things?

By rederiving periodic template fitting (or periodic matched filter analysis) in the context of least-squares spectral analysis, we found a significantly better way to perform these fits. Details will be presented in a paper (Hoffman et al. 2017, in prep), but the important part is you can reduce the non-linearity of the problem to the following:

  • Finding the zeros of an order 6H-1 complex polynomial at each trial frequency.
    • This is done via the numpy.polynomial library, which performs singular-value decomposition on the polynomial "companion matrix", and scales as O(H^3).
  • Computing the coefficients of these polynomials for all trial frequencies simultaneously by leveraging the non-equispaced fast Fourier transform, a process that scales as O(HN_f log(HN_f)).

This provides two advantages:

Improved computational speed and scaling:plots/timing_vs_ndata_const_freq.png

Speed comparison for a test case using a constant number of trial frequencies but varying the number of observations.

Numerically stable and accurate:plots/correlation_with_nonlinopt.png

Accuracy comparison between the fast template periodogram and a gatspy-like method that uses the scipy.optimize.minimize function to find the optimal phase shift parameter. The minimization method is given 10 random starting values and the best result is kept. Though in most cases the truly optimal solution is found, in many cases a sub-optimal solution is chosen instead (i.e. only a locally optimal solution was chosen).

How is this different than the multi-harmonic periodogram?

The multi-harmonic periodogram ([Bretthorst1988], [SchwarzenbergCzerny1996]) is another extension of Lomb-Scargle that fits a truncated Fourier series to the data at each trial frequency. This algorithm can also be made to scale as HN_f logHN_f [Palmer2009].

However, the multi-harmonic periodogram is fundamentally different than template fitting. In template fitting, the relative amplitudes and phases of the Fourier series are fixed. In a multi-harmonic periodogram, the relative amplitudes and phases of the Fourier series are free parameters.

The multiharmonic periodogram is more flexible than the template periodogram, but less sensitive to a given signal. If you're hoping to find a non-sinusoidal signal with an unknown shape, it might make more sense to use a multi-harmonic periodogram.

For more discussion of the multiharmonic periodogram and related extensions, see [VanderPlas_etal_2015] and [VanderPlas2017].

TODO

  • Multi-band extensions
  • Speed improvements

References

[ZechmeisterKurster2009]The generalised Lomb-Scargle periodogram. A new formalism for the floating-mean and Keplerian periodograms
[Lomb1976]Least-squares frequency analysis of unequally spaced data
[Scargle1982]Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data
[Barning1963]The numerical analysis of the light-curve of 12 Lacertae
[Vanicek1971]Further Development and Properties of the Spectral Analysis by Least-Squares
[VanderPlas2017](1, 2) Understanding the Lomb-Scargle Periodogram
[Sesar_etal_2017]Machine-Learned Identification of RR Lyrae Stars from Sparse, Multi-band Data: the PS1 Sample
[Bretthorst1988]Bayesian Spectrum Analysis and Parameter Estimation
[SchwarzenbergCzerny1996]Fast and Statistically Optimal Period Search in Uneven Sampled Observations
[Palmer2009]A FAST CHI-SQUARED TECHNIQUE FOR PERIOD SEARCH OF IRREGULARLY SAMPLED DATA
[VanderPlas_etal_2015]Periodograms for Multiband Astronomical Time Series