Uses the MultiQC tool developed and maintained by Phil Ewels: ewels/MultiQC
Docker images are available from here
This app downloads the quality metric files as specified by the location and filename extension in the "dx_sp" field in the config.yaml. Then runs the MultiQC tool to generate a run-wide quality report. Modules and columns to be displayed are specified in the config file, the full list of supported modules can be found here.
It is essential to use a config file that has a "dx_sp" section with "primary" for file extenstions in app/outputs of the primary workflow and "secondary" for file extenstions in app/outputs of the secondary workflow.
multiqc_docker
: Docker image of the MultiQC tool (can be found in 001_Reference:/assets/MultiQC)multiqc_config_file
: A config.yaml file specifying which modules to run and the search patterns to recognise QC files for each module- input data, which may come from different sources depending on the use-case
Option 1 - for production run on workflow output folders:
project_for_multiqc
- the name of the project to be used, eg 002_211008_A01303_0030_AHKTMYDRXY_TWE- This project must have an 'output' folder in its root directory (project:/output/).
primary_workflow_output
- the exact name of the output directory of the primary workflow, eg. dias_single_v1.2.3-TWE_v1.0.5-211011-1)- This folder must have subfolders for each app, like project:/output/primary_workflow/app_output)
- OPTIONAL
secondary_workflow_output
- The exact name of the output directory of the secondary workflow, eg. dias_multi_v1.2.3-TWE_v1.0.5-211011-1)- This folder must have subfolders for each app, like project:/output/primary_workflow/secondary_workflow/app_output)
Option 2 - for testing and development:
- set
single_folder
to TRUE and provide project_for_multiqc
- the project name andprimary_workflow_output
- the absolute path to a specific folder in the project with all QC files For example, if you want files from project:/folder/subfolder/data_files, you need to input '/folder/subfolder' into the field This will download all files from only this folder and will break if there are further subfolders inside!
OPTIONAL: to calculate target bases coverage at custom depths set calc_custom_coverage to TRUE and provide depths as a list of comma separated integer values --> config.yaml should be changed accordingly to recognise and display the custom coverage columns
The following outputs are placed in the DNAnexus project in the specified output folder:
- a HTML QC report (with the name of the runfolder-workflow)
- a folder containing the qc metrics data in text format. (The folder is named after project-workflow-multiqc_data)
- original config_file.yaml that was used to generate the report
- The app downloads files within the $project_for_multiQC:/output/$primary_workflow_output/QCapp directories of the project, where 'QCapp' and filename extensions are specified in the config file.
- The app runs the dockerised MultiQC tool with the config file provided.
- MultiQC parses all recognised files and displays them in the html report.
- The MultiQC outputs are uploaded to DNAnexus.
dx run multiqc-applet_ID \
-imultiqc_docker='{}' \
-imultiqc_config_file='{}' \
-iproject_for_multiqc='{}' \
-iprimary_workflow_output='{}' \
-isecondary_workflow_output='{}' \
--destination='{}'