sigfit

Flexible Bayesian inference of mutational signatures

sigfit is an R package to estimate signatures of mutational processes and their activities from mutation count data. Starting from a set of single-nucleotide variants (SNVs), it allows both estimation of the exposure of samples to predefined mutational signatures (including whether the signatures are present at all) and identification of signatures de novo from the mutation counts. These two procedures are often called signature fitting and signature extraction, respectively. In addition, sigfit implements novel methods to combine signature fitting and extraction in a single inferential process, thus facilitating the deconvolution of rare or admixed signatures. The package provides interfaces to four different Bayesian models of signatures (multinomial, Poisson, normal and negative binomial), as well as a range of functions to generate publication-quality graphics of the corresponding mutational catalogues, signatures and exposures. Furthermore, the signature fitting and extraction methods in sigfit can be applied directly to mutational profiles beyond SNV data, including indel or rearrangement count data, and even real-valued data such as DNA methylation profiles.

Enhancements in version 2.2 (October 2021)

  • New COSMIC v3.2 signatures for indels and doublet base substitutions (18 ID and 11 DBS signatures)
  • Support for processing and plotting mutational spectra of indels and doublets (see vignette section 'Alternative mutation type definitions')

Enhancements in version 2.1 (May 2021)

  • New COSMIC v3.2 signatures (78 SBS signatures)
  • Functions to calculate cosine similarity and L2 distance between signatures

Enhancements in version 2.0 (November 2019)

  • New models for analysis of real-valued data ("normal") and robust fitting to sparse data ("negbin")
  • New COSMIC v3 signatures (67 SBS signatures) and test datasets
  • Straightforward analysis and plotting of signatures defined over arbitrary mutation types
  • Support for mutational opportunities in all signature models
  • Support for signature and exposure priors in all signature models
  • Enhanced plotting functionalities
  • Increased MCMC sampling efficiency
  • Extended package vignette

Installation

sigfit is an R package. As it is in early development, it is not yet on CRAN, but it can be installed from within an R session using the devtools package.

devtools::install_github("kgori/sigfit", build_vignettes = TRUE,
                         build_opts = c("--no-resave-data", "--no-manual"))

The arguments build_vignettes and build_opts are necessary for the package vignette to be built.

For solutions to some of the problems that may arise during installation, see the Troubleshooting installation section.

Usage guide

See the package vignette for detailed usage examples:

browseVignettes("sigfit")

If the vignette is not available because it was not built during installation, you can download it and build it using the rmarkdown package (the vignette will be saved as an HTML file in your current working directory):

download.file("https://raw.githubusercontent.com/kgori/sigfit/master/vignettes/sigfit_vignette.Rmd",
              destfile = "sigfit_vignette.Rmd")
rmarkdown::render("sigfit_vignette.Rmd")
browseURL("sigfit_vignette.html")

You can also browse the package vignette on GitHub.

Citation

To cite sigfit in publications, please use:

  • Kevin Gori, Adrian Baez-Ortega. sigfit: flexible Bayesian inference of mutational signatures. bioRxiv, 372896 (2018). doi: 10.1101/372896.

The corresponding BibTeX entry is:

@Article{sigfit,
    title = {sigfit: flexible Bayesian inference of mutational signatures},
    author = {Gori, Kevin and Baez-Ortega, Adrian},
    journal = {bioRxiv},
    year = {2018},
    pages = {372896},
    doi = {10.1101/372896}
}

Licence

Authors: Kevin Gori and Adrian Baez-Ortega
Transmissible Cancer Group, University of Cambridge

sigfit is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses.

Troubleshooting installation

Below are the solutions to some of the problems that may arise during installation.

Problem:

Error: 'rstan_config' is not an exported object from 'namespace:rstantools'

Solution:
Update rstantools: devtools::install_github("stan-dev/rstantools")


Problem:

C++14 standard requested but CXX14 is not defined

Solution:
Provide R with C++14 options via the file ~/.R/Makevars, e.g.

CXX14 = g++
CXX14FLAGS = -g -O2
CXX14PICFLAGS = -fpic
CXX14STD = -std=gnu++14
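
As a minimal sketch, the settings above could be appended from a shell as follows. The write_makevars helper is hypothetical (it is not part of sigfit or R). It assumes a POSIX shell and that g++ is the compiler R should use, and it leaves the file unchanged if CXX14 is already defined there.

```shell
#!/bin/sh
# Hypothetical helper (not part of sigfit): append the C++14 settings shown
# above to a Makevars file, unless CXX14 is already defined in it.
write_makevars() {
    dir="${1:-$HOME/.R}"      # target directory; defaults to ~/.R
    file="$dir/Makevars"
    mkdir -p "$dir"
    # Leave an existing CXX14 definition untouched
    if [ -f "$file" ] && grep -q '^CXX14[[:space:]]*=' "$file"; then
        echo "CXX14 already defined in $file; nothing changed"
        return 0
    fi
    cat >> "$file" <<'EOF'
CXX14 = g++
CXX14FLAGS = -g -O2
CXX14PICFLAGS = -fpic
CXX14STD = -std=gnu++14
EOF
    echo "C++14 settings appended to $file"
}
```

Calling write_makevars with no arguments targets ~/.R/Makevars; passing a directory as the first argument writes the file there instead, which is convenient for a dry run.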

Problem:

make: *** [stanExports_sigfit_ext.o] Error 1
ERROR: compilation failed for package ‘sigfit’

This is preceded by a series of compilation errors related to StanHeaders, such as:

.../StanHeaders/include/stan/math/rev/mat/functor/adj_jac_apply.hpp:619:15: error: invalid use of ‘auto’

Solution:
Provide R with C++14 options via the file ~/.R/Makevars, as described above.


Problem:

g++: error: unrecognized command line option '-std=gnu++14'

Solution:
Upgrade the gcc and g++ compilers to version 5 or later. On Ubuntu, version 5 can be installed as follows:

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-5 g++-5
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 60 --slave /usr/bin/g++ g++ /usr/bin/g++-5

Alternatively, if you have version 4.8.1 or higher of gcc and g++, you can enable C++14 features by substituting -std=gnu++1y for -std=gnu++14 in ~/.R/Makevars (see above).
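
The choice between the two flags can be sketched as a small shell function. The cxx14_std_flag helper below is hypothetical (it is not part of sigfit). It assumes a purely numeric version string, such as the one printed by g++ -dumpversion, and it applies the version thresholds stated above (5 and 4.8.1).

```shell
#!/bin/sh
# Hypothetical helper (not part of sigfit): given a g++ version string such as
# "4.8.5" or "9", print the -std flag to use for CXX14STD in ~/.R/Makevars.
cxx14_std_flag() {
    # Split "major.minor.patch" on dots; missing fields default to 0
    IFS=. read -r major minor patch <<EOF
$1
EOF
    major="${major:-0}"; minor="${minor:-0}"; patch="${patch:-0}"

    if [ "$major" -ge 5 ]; then
        # Version 5 and later recognise -std=gnu++14
        echo "-std=gnu++14"
    elif [ "$major" -eq 4 ] && { [ "$minor" -gt 8 ] || { [ "$minor" -eq 8 ] && [ "$patch" -ge 1 ]; }; }; then
        # Versions from 4.8.1 up to (not including) 5 need -std=gnu++1y
        echo "-std=gnu++1y"
    else
        echo "g++ $1 is too old for C++14; upgrade the compiler" >&2
        return 1
    fi
}

# Example (requires g++ on the PATH):
#   cxx14_std_flag "$(g++ -dumpversion)"
```

The printed flag is the value to place after "CXX14STD =" in ~/.R/Makevars.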