List of likelihood-free examples

This is a list of example applications for use with likelihood-free (aka plug-and-play) methods of statistical inference like ABC. They are split into 3 categories:

  1. Simulator code available.

    Simulator code allows these applications to be used as part of any likelihood-free method.

  2. Simulator output available.

    Here a dataset is available of (input, output) or (parameter, data) pairs. These can be used in a subset of likelihood-free methods.

  3. Other examples.

    These are examples described in the literature for which simulators could be implemented with some effort.

This list is a work in progress. Please add or get in touch about any further examples you are familiar with. Challenging applications are particularly welcome to help motivate new methods. These include simulators with high-dimensional inputs or outputs, or which are very computationally intensive to run.

Simulator code available

  • Boarding school flu models. The R package pomp contains several simulators for SIR epidemic models (see for example Britton) and data from an influenza outbreak in a British boarding school (described here). Use simulate on euler.sim, gillespie.sir and bbs.

  • Dacca cholera model. An SDE SIRS cholera model and data from Dacca district over the years 1891 to 1940. This is analysed in King et al. A simulator is given in the R package pomp: use simulate on the dacca example.

  • g-and-k distribution. This is a flexible 4-parameter distribution with an intractable pdf. It has often been used as a test in the ABC literature, starting with Allingham et al. My unfinished R package gk simulates from this distribution. Simulation requires only 1 line of code so it is very easy to reimplement in other languages. (ABC has also been used on Tukey's g-and-h distribution, which is somewhat similar, by Peters and Sisson. A book length treatment of many other intractable distributions defined by quantiles is by Gilchrist.)

  • Gompertz population model. A stochastic Gompertz population model with log-normal measurement error. A simulator is given in the R package pomp: use simulate on the gompertz example.

  • Moving average models. Marin et al use inference for an MA(1) model as an example of ABC parameter inference where the exact posterior is available. Pudlo et al extend the example to ABC model choice between MA(1) and MA(2) models, which is quite challenging. Again, exact posterior values are available. The models can be simulated using the arima.sim command in R and are simple to reimplement.

  • OU2 model. A bivariate discrete-time Ornstein-Uhlenbeck process. A simulator is given in the R package pomp: use simulate on the ou2 example.

  • Ricker and blowfly models. These are ecological models of population size data used in Wood. Simulator code is available in the R package synlik under rickerSimul and blowSimul. Alternative simulators (and a blowflies variant) are in the R package pomp: use simulate on the ricker and blowflies examples.

  • RW2 model. A 2-D normal random walk model. A simulator is given in the R package pomp: use simulate on the rw2 example.

  • Spatial extremes. Max-stable processes can be used to model spatial extreme data. These processes can be simulated using the R package SpatialExtremes. An ABC analysis was performed by Erhardt and Smith and is implemented in the R package ABCExtremes. This includes code to calculate the summary statistics they used.

  • Stable distributions. This family of distributions has similar properties to the normal distribution but heavier tails. Most have an intractable pdf. The R stabledist package simulates from these distributions. For more background see wikipedia. They have been used in ABC analyses of financial data by Peters et al and Jasra et al amongst others. These include complex models that use stable distributions for noise, which could take some effort to reimplement.

  • Trait dynamics. This is a model of ecological dynamics of traits in species from Jabot. A simulator is included in the EasyABC R package.

  • Weak lensing. The cosmology model simulator in Lin and Kilbinger uses the publicly available CAMELUS algorithm.

  • Sunyaev-Zeldovich cluster formation. This cosmological process can be simulated using the numcosmo python library. For details see Ishida et al who analyse this application as an example of their cosmoabc python package. The package includes an input file for numcosmo simulations of the type used in the paper.

  • Voles prey-predator model. This is a prey-predator state space model, used by Fasiolo and Wood to describe population abundance dynamics of Fennoscandian voles. Simulator code is available in the R package volesModel under volesSimulator. Real data is also available using data(voles_data).

  • Ising/Potts model. This is a Markov random field that can be used to classify pixels in image analysis. The inverse temperature parameter $\beta$ has a sufficient statistic, but the normalising constant is intractable. ABC for the Potts model was introduced by Grelaud et al. ABC-SMC and ABC-MCMC algorithms are implemented in the R package bayesImageS. Perfect sampling for the Ising model is possible for $\beta$ below its critical value $\beta_{crit}$ using coupling from the past (CFTP; Propp and Wilson). Bounding chains for Swendsen-Wang can be used to simulate from the cold states ($\beta > \beta_{crit}$), as shown by Huber. The R package PottsUtils provides other algorithms for simulating from Ising/Potts models. Real data is available using data(Menteith) in the R package bayess.

  • Exponential random graph model (ERGM). A Markov model for social networks, based on structural motifs such as triangles and k-stars. Like the Ising/Potts model, this model has sufficient statistics but an intractable normalising constant. ABC-SMC for ERGM was introduced by Everitt. His MATLAB code is available in the online supplementary material. The R package ergm implements the tie-no-tie (TNT) sampler for simulating from the model. For more details, see Morris, Handcock & Hunter and Hunter et al. Real data is available from the Stanford large network dataset collection (SNAP).
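
For the g-and-k entry above, the "1 line of code" can be sketched in Python as follows. This is a minimal sketch, assuming the usual quantile-function parameterisation with the constant c fixed at 0.8. The names gk_quantile and simulate_gk are illustrative and are not taken from the gk package.

```python
import math
import random


def gk_quantile(z, a, b, g, k, c=0.8):
    """Transform a standard normal value z into a g-and-k value.

    a: location, b: scale (> 0), g: skewness, k: kurtosis.
    c = 0.8 is the conventional fixed constant (an assumption here).
    tanh(g*z/2) equals (1 - exp(-g*z)) / (1 + exp(-g*z)).
    """
    return a + b * (1.0 + c * math.tanh(g * z / 2.0)) * z * (1.0 + z * z) ** k


def simulate_gk(n, a, b, g, k, rng=None):
    """Draw n iid g-and-k values by transforming standard normal draws."""
    if rng is None:
        rng = random.Random()
    return [gk_quantile(rng.gauss(0.0, 1.0), a, b, g, k) for _ in range(n)]
```

With g = 0 and k = 0 the transformation reduces to a + b*z, so the sketch then returns ordinary normal draws, which is a convenient sanity check.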

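For the moving average entry above, a reimplementation outside R might look like the following Python sketch. It assumes Gaussian innovations and the sign convention y_t = e_t + theta_1 e_{t-1} + ... + theta_q e_{t-q}. The function name simulate_ma is hypothetical.

```python
import random


def simulate_ma(n, thetas, sigma=1.0, rng=None):
    """Simulate n observations from an MA(q) model, q = len(thetas).

    y_t = e_t + sum_j thetas[j-1] * e_{t-j}, with e_t ~ N(0, sigma^2).
    """
    if rng is None:
        rng = random.Random()
    q = len(thetas)
    # q extra innovations so that the first observation has a full history.
    eps = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    series = []
    for t in range(n):
        value = eps[t + q]
        for j, theta in enumerate(thetas, start=1):
            value += theta * eps[t + q - j]
        series.append(value)
    return series
```

Calling simulate_ma(100, [0.6]) gives an MA(1) series and simulate_ma(100, [0.6, 0.2]) an MA(2) series, which is the pair of models needed for the model choice example.
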
Simulator output available

  • coal dataset. This data is generated from a coalescent model and is available in the abctools R package. 100,000 simulated datasets, 7 summary statistics, 2 parameters. 100 further simulated datasets are available to use as observations. A larger dataset for the same example (1,000,000 simulations) is available from Matt Nunes's webpage.

  • human dataset. This genetic data on humans under 3 demographic models is available in the abc.data R package. 150,000 simulations, 3 summary statistics, 4 parameters, 3 models, and observed data from 3 populations.

  • musigma2 dataset. This is made up of simulations from a normal model with unknown mean and variance. The summary statistics are the mean and log variance of 50 iid samples. The data is available in the abc.data R package together with observations taken from the classic iris dataset. Details from the true posterior are also provided. 150,000 simulations, 2 summary statistics, 2 parameters.
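
The model behind the musigma2 entry above is simple enough to simulate directly, which can be useful for checking a method against the stored simulations. The Python sketch below assumes that "variance" means the sample variance with an n - 1 denominator. It does not reproduce the prior used to generate the dataset, which is not stated here. The function name simulate_summaries is hypothetical.

```python
import math
import random
import statistics


def simulate_summaries(mu, sigma2, n=50, rng=None):
    """Simulate n iid N(mu, sigma2) values and return (mean, log variance)."""
    if rng is None:
        rng = random.Random()
    sd = math.sqrt(sigma2)
    sample = [rng.gauss(mu, sd) for _ in range(n)]
    # statistics.variance uses the n - 1 denominator (an assumption about
    # how the dataset's summary statistic was computed).
    return statistics.mean(sample), math.log(statistics.variance(sample))
```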

Other examples

  • M/G/1 queues. A queue with Markov arrival and non-Markov service times, in which only departure times are observed, has an intractable likelihood but is very simple to write a simulator for. ABC analyses have been performed by Blum and Francois, and by Fearnhead and Prangle, amongst others. Recently an exact MCMC approach has been proposed by Shestopaloff and Neal, which could be used for comparison.

  • Quantum system identification in the atom maser. Catana et al discuss a quantum experimental system which can be represented by a birth-death process (their algorithm 1). If full observations are available a likelihood can be calculated, although this is expensive. Their paper concentrates on finding partial observations which can, when analysed by ABC, give comparable inferential results.
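
To illustrate how simple a simulator for the M/G/1 queue entry above can be, here is a minimal Python sketch. It assumes exponential inter-arrival times and, purely as an illustrative choice, uniform service times. The entry itself only requires that service times be non-Markov. The function name simulate_mg1 and its parameter names are hypothetical.

```python
import random


def simulate_mg1(n, service_min, service_max, arrival_rate, rng=None):
    """Simulate n inter-departure times from a single-server queue.

    Inter-arrival times are Exponential(arrival_rate); service times are
    Uniform(service_min, service_max) (an assumed service distribution).
    The queue starts empty at time 0.
    """
    if rng is None:
        rng = random.Random()
    arrival = 0.0
    departure = 0.0
    gaps = []
    for _ in range(n):
        arrival += rng.expovariate(arrival_rate)
        service = rng.uniform(service_min, service_max)
        # Service starts once the customer has arrived and the server is free.
        new_departure = max(arrival, departure) + service
        gaps.append(new_departure - departure)
        departure = new_departure
    return gaps
```

Only the inter-departure times are returned, matching the setting in which arrival times are unobserved.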