Multi-variant colocalisation with summary genetic association data. Merge JAM results for two distinct traits, for posterior inference about shared genetic variants associated with both traits.
You can install cojam from GitHub with:
# install.packages("devtools")
devtools::install_github("simisc/cojam")
GWAS are generally analysed one SNP at a time, generating summary data that obscure the presence of causal variants because many SNPs in linkage disequilibrium (LD) with the causal SNP will also have significant associations. JAM is a scalable fine-mapping algorithm to identify candidate causal SNPs that best explain the joint pattern of single-SNP summary associations (Newcombe et al. 2016 Genetic Epidemiology, Newcombe et al. 2018 Genetic Epidemiology). The summary associations are conditioned on each other by estimating their correlation from a reference genotype panel (e.g. UK Biobank).
Because summary data for different traits are usually available from distinct studies, cojam first models the causal genetic variants for each trait, then combines these results to draw inferences about the existence of joint causal variants. Building on the colocalisation framework used in coloc (Giambartolomei et al. 2014 PLOS Genetics, Wallace 2020 PLOS Genetics), cojam assigns joint models for the two traits to one of five mutually exclusive hypotheses. The two hypotheses of interest are:
- H3: associations with both traits, distinct SNPs
- H4: associations with both traits, including at least one shared SNP (colocalisation)
In cojam, support for H4 over H3 is quantified by the Bayes factor (the ratio of marginal likelihoods), BF = p(D|H4) / p(D|H3) = PosteriorOdds[H4:H3] / PriorOdds[H4:H3]. The posterior odds are estimated from JAM’s independent rjMCMC posterior samples; the prior odds are calculated combinatorially, also assuming independence between traits, and taking into account the (potentially different) priors on the proportion of causal variants in each JAM model. The effects of the independence assumption on the prior and posterior odds cancel out in the BF.
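As a rough illustration of the posterior-odds step, the sketch below pairs toy posterior samples for two traits and counts how often the paired models share a SNP. It is not cojam's implementation and calls nothing from cojam or R2BGLiMS: the sample lists `samples_trait1` and `samples_trait2`, the helper `classify_pair`, and the prior odds value are all hypothetical, chosen only to make the arithmetic concrete.

```r
# Toy posterior samples: each element is the set of SNPs selected in one
# rjMCMC iteration (hypothetical data, not real JAM output).
samples_trait1 <- list(c("rs1"), c("rs1", "rs2"), c("rs3"), c("rs1"))
samples_trait2 <- list(c("rs1"), c("rs4"), c("rs2"), c("rs4"))

# Hypothetical helper: label a pair of models as H3, H4, or neither.
classify_pair <- function(m1, m2) {
  if (length(m1) == 0 || length(m2) == 0) {
    return("other")  # at least one trait has no associated SNP
  }
  if (length(intersect(m1, m2)) > 0) "H4" else "H3"
}

# Assuming independence between traits, treat every pairing of one sample
# from each trait as a draw from the joint posterior.
pairs <- expand.grid(i = seq_along(samples_trait1),
                     j = seq_along(samples_trait2))
labels <- mapply(
  function(i, j) classify_pair(samples_trait1[[i]], samples_trait2[[j]]),
  pairs$i, pairs$j
)

# Posterior odds of H4 against H3, estimated by counting joint models.
posterior_odds <- sum(labels == "H4") / sum(labels == "H3")

# Assumed placeholder; cojam derives the prior odds combinatorially.
prior_odds <- 0.05

bf <- posterior_odds / prior_odds
```

Pairing every sample from one chain with every sample from the other is only one simple way to use the independence assumption; the prior odds here are a placeholder rather than the combinatorial calculation described above.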
When two traits both have genetic drivers in the same region, this suggests some relationship between the traits. For a chosen prior odds of H4 against H3 (reflecting the dependence between colocalised traits), multiplying by BF provides the resulting posterior odds.
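The final step is a single multiplication, sketched here with made-up numbers. Both `bf` and `chosen_prior_odds` are hypothetical values, not output from cojam.

```r
bf <- 6.5               # hypothetical Bayes factor for H4 against H3
chosen_prior_odds <- 2  # hypothetical prior odds of H4 against H3

# Posterior odds of H4 against H3 under the chosen prior.
posterior_odds <- chosen_prior_odds * bf

# Equivalent probability of H4, conditional on either H3 or H4 being true.
posterior_prob_h4 <- posterior_odds / (1 + posterior_odds)
```

Because the Bayes factor does not depend on the prior odds, the same `bf` can be reused to explore how sensitive the conclusion is to different prior beliefs about the relationship between the traits.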
Caution: The current version of cojam does not include methods for assessing whether merged chains have efficiently searched the joint model space, or for visualising multi-SNP results. Both JAM models should be thoroughly checked before passing them to cojam, using methods provided in R2BGLiMS.
To do.