mtCN result from WDL output files #4

Open
pkrithivasan opened this issue Aug 30, 2023 · 1 comment

@pkrithivasan

Hi, we have successfully run the v2.5_MongoSwirl_Single pipeline on a single WGS sample, and are reviewing the output files. Could you please point us to the output file that indicates the mtDNA copy number (mtCN)? Thanks!

@rahulg603
Owner

rahulg603 commented Sep 25, 2023

Hello! Great to hear that the run was successful. If you ran this using Terra/Cromwell you usually have the option to save outputs in a table – the mtDNA mean coverage can be found in "mean_coverage" and the mtDNA median coverage can be found in "median_coverage". This can also be found in the "stats_outputs" file, which contains all of the per-sample statistics computed during the pipeline run – usually named "[sample_name]_mtanalysis_diagnostic_statistics.tsv".
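If it helps, here is a minimal sketch of pulling those two values out of the statistics file. It assumes the TSV has a header row containing "mean_coverage" and "median_coverage" columns and one row per sample; that layout is an assumption, so check it against your actual file. The function name `read_mt_coverage` is hypothetical and not part of the pipeline.

```python
import csv
import io


def read_mt_coverage(tsv_handle):
    """Return (mean_coverage, median_coverage) from the first data row.

    Assumes a tab-separated file with a header row that includes
    'mean_coverage' and 'median_coverage' (assumed layout, not confirmed).
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    row = next(reader)  # single-sample run: take the first (only) row
    return float(row["mean_coverage"]), float(row["median_coverage"])


# Toy input standing in for [sample_name]_mtanalysis_diagnostic_statistics.tsv
example = io.StringIO("s\tmean_coverage\tmedian_coverage\nS1\t3000.5\t2990\n")
mean_cov, median_cov = read_mt_coverage(example)
```

For a real file, pass `open(path, newline="")` instead of the `StringIO` object.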

To get mtCN, you'll need to get the mean nuclear DNA coverage. Often this is pre-computed as part of WGS QC. If it is not available, you can use samtools idxstats, samtools flagstat, and GATK CollectQualityYieldMetrics to quickly approximate this number. We implement this in the multi pipeline here (idxstats and flagstat) and here (CollectQualityYieldMetrics).

In the methods section of the paper (in "Computing mean nucDNA coverage in UKB") you can find the formula we used to efficiently compute the approximate nuclear genome coverage depth. You can see how we combined the outputs of idxstats/flagstat to obtain the relevant values here.
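As a rough illustration only (this is not the formula from the paper, and not the pipeline's code), nuclear coverage can be approximated from `samtools idxstats` output as mapped reads times read length divided by total reference length. The sketch below assumes the standard four-column idxstats layout (name, length, mapped, unmapped), a fixed read length that you supply, and that the mitochondrial contig is named "chrM" or "MT". The function name is hypothetical.

```python
def approx_nuclear_coverage(idxstats_text, read_length, mito_names=("chrM", "MT")):
    """Crude nuclear coverage estimate from `samtools idxstats` text.

    coverage ~= (mapped nuclear reads * read_length) / nuclear reference length
    Ignores soft-clipping, duplicates, and unmappable regions.
    """
    total_length = 0
    total_mapped = 0
    for line in idxstats_text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        # Skip the unplaced-reads row ("*") and the mitochondrial contig
        if name == "*" or name in mito_names:
            continue
        total_length += int(length)
        total_mapped += int(mapped)
    if total_length == 0:
        raise ValueError("no nuclear contigs found in idxstats output")
    return total_mapped * read_length / total_length


# Toy idxstats output: one nuclear contig, chrM, and the '*' row
toy = "chr1\t1000\t200\t0\nchrM\t100\t5000\t0\n*\t0\t0\t10\n"
nuc_cov = approx_nuclear_coverage(toy, read_length=150)  # 200 * 150 / 1000
```

This counts every non-mitochondrial contig in the reference, including alt and decoy contigs if present; restricting to autosomes may be preferable depending on your reference.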

Once you have the appropriate statistics, compute mtCN as we did: mtCN = 2 * mtDNA coverage / nucDNA coverage. The factor of 2 accounts for the nuclear genome being diploid.
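A small worked example of that ratio, with made-up coverage values (the function name and the numbers are illustrative only):

```python
def mtdna_copy_number(mt_coverage, nuc_coverage):
    """mtCN = 2 * mtDNA coverage / nucDNA coverage."""
    if nuc_coverage <= 0:
        raise ValueError("nuclear coverage must be positive")
    return 2 * mt_coverage / nuc_coverage


# Hypothetical sample: 3000x mtDNA coverage, 30x nuclear coverage
mtcn = mtdna_copy_number(3000, 30)  # 2 * 3000 / 30 = 200.0
```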

Hope this is helpful!
