Discrete convolution statistic

This repository contains the Python 3 implementation of the discrete convolution statistic for hypothesis testing (by G. Prevedello and K.R. Duffy), together with simulations that benchmark this statistic against Pearson's chi-squared test.

Given k>1, h>0, and random variables X₁, ..., X_k, Y₁, ..., Y_h, each with discrete support {0,...,a} for an integer a>0 that may differ between variables, this statistical procedure tests the null hypothesis of goodness of fit

H₀: X₁ + ... + X_k ~ z,

where z is a probability mass vector with positive entries,

and the null hypothesis of equality in distribution

H₀: X₁ + ... + X_k ~ Y₁ + ... + Y_h.

The discrete convolution statistic also makes it possible to test the null hypothesis of sub-independence for X₁, ..., X_k.
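Under sub-independence (and in particular under independence), the probability mass vector of X₁ + ... + X_k is the discrete convolution of the marginal mass vectors, which is the object the hypotheses above compare against. A minimal sketch in plain Python follows. The helper `convolve_pmfs` and the example marginals are hypothetical, written only for illustration, and are not part of this repository.

```python
from functools import reduce


def convolve_pmfs(x, y):
    """Discrete convolution of two probability mass vectors x and y.

    Hypothetical helper: entry s of the result is the sum of x[i] * y[j]
    over all pairs with i + j == s.
    """
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


# Illustrative marginals with supports {0, 1} and {0, 1, 2}
x1 = [0.7, 0.3]
x2 = [0.5, 0.25, 0.25]

# Mass vector of X1 + X2 on {0, 1, 2, 3} when the sum follows the convolution
z = reduce(convolve_pmfs, [x1, x2])
print(z)
```

Folding with `reduce` extends the same helper to any number k of marginals.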

Code

The function that calculates the discrete convolution statistic is defined in code/discrete_convolution_statistics.py and is called from the notebooks code/Example.ipynb and code/Simulations_run.ipynb.

The notebook code/Example.ipynb serves as a minimal example of how to apply the discrete convolution statistic function.

For the hypotheses presented above, the notebook code/Simulations_run.ipynb runs Monte Carlo simulations to estimate the proportion of rejections from different tests under several parametrisations of the random variables, with k=2, h=1, and

X₁ ~ x₁ = (1-p, p), X₂ ~ x₂ = (1-q, q),

Y ~ z(r) = (1-r)((1-p)(1-q), 1-pq-(1-p)(1-q), pq) + r(1-a, 0, a),

with r in [0,1] and a = pq + (pq(1-p)(1-q))^(1/2).
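The parametrisation above can be sketched in a few lines of plain Python. This is an illustration, not the code from the notebooks: the function name `z_mixture` is hypothetical, and it assumes the first component of z(r) is the convolution of x₁ and x₂, whose middle entry is p(1-q) + q(1-p).

```python
import math


def z_mixture(p, q, r):
    """Mass vector z(r) on {0, 1, 2} (hypothetical helper, for illustration)."""
    # Assumed first component: convolution of (1-p, p) and (1-q, q)
    conv = [(1 - p) * (1 - q), p * (1 - q) + q * (1 - p), p * q]
    # Second component: no mass on the value 1, mass a on the value 2
    a = p * q + math.sqrt(p * q * (1 - p) * (1 - q))
    alt = [1 - a, 0.0, a]
    # Mixture with weight r on the second component
    return [(1 - r) * c + r * d for c, d in zip(conv, alt)]


# r = 0 gives the convolution itself; r = 1 gives (1-a, 0, a)
print(z_mixture(0.3, 0.6, 0.0))
print(z_mixture(0.3, 0.6, 1.0))
```

With r = 0, X₁ + X₂ ~ z(r) holds for independent X₁ and X₂, and increasing r moves z(r) away from the convolution.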

The results of these simulations are stored as the CSV files csv/Fig1.csv, csv/Fig2.csv and csv/Fig3.csv, which are then loaded in code/Simulations_plot.ipynb to create the plots stored in the folder figures.
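To illustrate what a Monte Carlo estimate of a rejection proportion looks like, here is a minimal standard-library sketch for the goodness-of-fit hypothesis with independent Bernoulli X₁ and X₂. It uses Pearson's chi-squared test (the benchmark) rather than the discrete convolution statistic, and it is not the code from the notebooks. The function names, sample size, number of repetitions and significance level are all assumptions chosen for illustration.

```python
import math
import random


def pearson_pvalue_3cells(counts, z):
    """Pearson chi-squared goodness-of-fit p-value for three cells."""
    n = sum(counts)
    stat = sum((c - n * zi) ** 2 / (n * zi) for c, zi in zip(counts, z))
    # With 2 degrees of freedom the chi-squared survival function is exp(-x / 2)
    return math.exp(-stat / 2)


def rejection_proportion(p, q, z, n=200, n_sim=1000, alpha=0.05, seed=0):
    """Estimate how often H0: X1 + X2 ~ z is rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _sim in range(n_sim):
        counts = [0, 0, 0]
        for _obs in range(n):
            # One draw of X1 + X2 with independent Bernoulli(p) and Bernoulli(q)
            s = (rng.random() < p) + (rng.random() < q)
            counts[s] += 1
        if pearson_pvalue_3cells(counts, z) < alpha:
            rejections += 1
    return rejections / n_sim


p, q = 0.3, 0.6
# Convolution of (1-p, p) and (1-q, q): the null hypothesis holds for this z
z_null = [(1 - p) * (1 - q), p * (1 - q) + q * (1 - p), p * q]
print(rejection_proportion(p, q, z_null))
```

When the null hypothesis holds, the estimated proportion should be close to the chosen level alpha. Passing a mass vector z that differs from the convolution gives an estimate of the power instead.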
