gnomAD update #183
Update gnomAD to version 4 in the crg2-hg38 branch.
Include gnomAD_faf95_popmax column.
Look into whether or not there is a GRCh37 version that we can use to update the GRCh37 pipeline.
Please see this document for a summary of a previous gnomAD update, and this associated pull request.
NOTE: you will need to be in branch crg2-hg38, not master, to run the hg38 crg2 pipeline. For cre, switch to branch hg38 for report generation.
gnomAD is a database of exomes and genomes from (mostly) healthy individuals. We use gnomAD as a control cohort; a variant with a population allele frequency (AF) of 1% or higher is almost certainly not the cause of an extremely rare monogenic disease. The gnomAD AFs allow us to filter down the variants in an individual with rare monogenic disease so that we can more easily identify the variant or variants associated with their phenotype. Here we will be updating the gnomAD SNV/indel annotation source (they also provide SV AFs).
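As a rough illustration of the filtering idea above, here is a minimal Python sketch. It is not how crg2 or cre implement the filter; the record layout, the `gnomad_af` key, and the choice to treat variants absent from gnomAD as rare are all assumptions made for the example.

```python
from typing import Optional

# 1% cutoff, as described above: variants at or above this AF are filtered out.
AF_THRESHOLD = 0.01


def is_rare(gnomad_af: Optional[float], threshold: float = AF_THRESHOLD) -> bool:
    """Return True if a variant should be kept as a rare-disease candidate.

    Assumption: a missing AF (variant not observed in gnomAD) counts as rare.
    """
    return gnomad_af is None or gnomad_af < threshold


# Hypothetical variant records; the IDs and AF values are invented for illustration.
variants = [
    {"id": "var_a", "gnomad_af": 0.25},    # common, filtered out
    {"id": "var_b", "gnomad_af": 0.0001},  # rare, kept
    {"id": "var_c", "gnomad_af": None},    # absent from gnomAD, kept
    {"id": "var_d", "gnomad_af": 0.01},    # exactly 1%, filtered out
]

rare_ids = [v["id"] for v in variants if is_rare(v["gnomad_af"])]
print(rare_ids)
```

In practice the threshold used for reporting may be stricter than 1% and may depend on the inheritance model; the value here only mirrors the figure quoted in the paragraph above.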
gnomAD AFs are available in a VCF (or per-chromosome VCFs that can be combined). We use vcfanno to add these AFs to the VCF generated by crg2 in this rule. vcfanno requires a config that specifies which fields from one VCF are used to annotate another VCF, and any operations to apply to those fields. In crg2-hg38, that config is here.
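To make the shape of a vcfanno config concrete, the sketch below renders one `[[annotation]]` block as TOML text from Python. The `file`, `fields`, `names`, and `ops` keys and the `self` op are part of vcfanno's config format; the file path and the source INFO field name for the faf95 value are placeholders, and should be checked against the header of the gnomAD VCF actually downloaded. This is not the crg2-hg38 config itself.

```python
import json
from typing import List


def vcfanno_annotation_block(
    file: str, fields: List[str], names: List[str], ops: List[str]
) -> str:
    """Render one vcfanno [[annotation]] block as TOML text.

    fields: INFO fields to read from the annotation VCF
    names:  names those values get in the annotated VCF
    ops:    one operation per field (e.g. "self" copies the value unchanged)
    """
    if not (len(fields) == len(names) == len(ops)):
        raise ValueError("fields, names and ops must have the same length")
    # json.dumps produces double-quoted strings and arrays, which are valid TOML here.
    lines = [
        "[[annotation]]",
        f"file = {json.dumps(file)}",
        f"fields = {json.dumps(fields)}",
        f"names = {json.dumps(names)}",
        f"ops = {json.dumps(ops)}",
    ]
    return "\n".join(lines) + "\n"


block = vcfanno_annotation_block(
    file="gnomad.placeholder.vcf.gz",    # placeholder path, not the real resource
    fields=["AF", "FAF95_INFO_FIELD"],   # second name is a placeholder; check the VCF header
    names=["gnomad_af", "gnomAD_faf95_popmax"],
    ops=["self", "self"],
)
print(block)
```

The printed block can be compared against the existing config in the repository to see which entries would need to change for a new gnomAD release.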
You will need to: