GPcounts is Gaussian process regression package for counts data with negative binomial and zero-inflated negative binomial likelihoods described in the paper "Non-parametric modelling of temporal and spatial counts data from RNA-seq experiments". It is implemented in python, using the tensorflow and GPflow.
This is now published in Bioinformatics.
- Clone GPcounts repository:
git clone https://github.com/ManchesterBioinference/GPcounts.git
- Install:
- Install requirements and package
cd GPcounts
pip install -r requirements.txt
cd
git clone https://github.com/markvdw/RobustGP
cd RobustGP
python setup.py install
cd
cd GPcounts
python setup.py install
cd
Run the GPcounts/demo-notebooks
cd GPcounts/demo-notebooks
jupyter notebook
File name |
Description |
---|---|
bulk_time_series | Applying GPcounts with negative binomial likelihood on bulk RNA-Seq time course data. We compare with Gaussian likelihoood results and show how to infer trajectories and carry out one-sample and two-samples tests |
scRNA-Seq_time_series | Applying GPcounts with negative binomial likelihood on scRNA-seq gene expression data to find DE genes. We also demonstrate the use of sparse inference to improve computational efficiency. |
GPcounts_spatial | Applying GPcounts with negative binomial likelihood to identify spatially expressed genes on spatial data from Mouse Olfactory Bulb. We demonstrate how to use the 'scaled' version which is based on data normalisation via multiplication of the NB mean by a location specific scale factor. |
GPcounts_spatial_smf_scales | Applying GPcounts with negative binomial likelihood to identify spatially expressed genes on spatial data from Mouse Olfactory Bulb. We show how to calculate the scales factor using python's 'statsmodels' module instead of R code used in the above notebook. This way is easier and faster. |
Branching_GPcounts | Applying GPcounts on the single-cell data to estimate the most probable branching locations for individual genes. This notebook demonstrates how to build a GPcounts model and plot the posterior model fit and posterior branching times. The application of this approach can be extended to the bulk time series data to identify the differentiation or the perturbation points,at which the two time-courses start to diverge for the first time. |
Visium_sagittal_anterior_mouse_brain_Sparse | Applying GPcounts on visium data to identify spatially expressed genes on spatial data from Sagittal Anterior Mouse Brain. This notebook demonstrates applying a sparse GPcounts model on visium data and generate scales factor usign python's 'statsmodels' module. |
In order to reproduce the paper results we have recorded the original packages used in a different requirements file paper results .