This repository is in active development and is not yet ready for production use.
Next generation sequencing and bioinformatic and genomic analysis at the Colorado Department of Public Health and Environment (CDPHE) is not CLIA validated at this time. These workflows and their outputs are not to be used for diagnostic purposes and should only be used for public health action and surveillance purposes. CDPHE is not responsible for the incorrect or inappropriate use of these workflows or their results.
The following documentation describes the Colorado Department of Public Health and Environment's workflows for the assembly and analysis of next-generation sequencing data of H5 influenza on GCP's Terra.bio platform.
The workflow currently allows for various primer schemes and references for alignment. Some are only for the HA gene segment while others are whole genome. See Workspace data below.
The workflow is split into multiple WDL files, but it is launched as a single workflow.
| Name | Description | Subworkflow/task calls |
|---|---|---|
| `h5_testing` | The main workflow that calls all other subworkflows and tasks. | `version_capture_tasks.workflow_metadata`, `h5_structs.declare_structs`, `primer_tasks.primer_level_tasks`, `reference_tasks.reference_level_tasks`, `version_capture_tasks.capture_versions`, `other_tasks.transfer_vc` |
| `h5_structs` | Contains struct definitions and subworkflow to declare the primer schemes and references structs. | None |
| `primer_tasks` | Subworkflow and task declarations for primer-level tasks. | `fastqc_raw`, `seqyclean`, `fastqc_clean`, `sample_qc_file`, `multiqc_fastqc`, `multiqc_seqyclean`, `transfer` |
| `reference_tasks` | Subworkflow and task declarations for reference-level tasks. | `align_bwa`, `trim_primers_ivar`, `generate_consensus_ivar`, `alignment_metrics`, `calculate_percent_coverage`, `concat_sample_reference_metrics`, `multiqc_samtools`, `transfer` |
| `version_capture_tasks` | Task declarations for tool and workflow version capture. | `workflow_metadata`, `capture_versions` |
| `other_tasks` | Task declarations for all other tasks. | `transfer`, `multiqc`, `concat_all_samples_metrics` |
Note: Commonly used inputs such as `sample_name` are not noted below.
- Call `version_capture_tasks.workflow_metadata` task to get the workflow version and analysis date.
- Call `h5_structs.declare_structs` subworkflow to declare reference and primer scheme struct objects.
- Scatter samples to create `Sample` struct objects.
- Scatter `PrimerScheme` array.
    - Create primer sample list, excluding those with empty fastq files.
    - Call `primer_tasks.primer_level_tasks` subworkflow
        - Scatter samples
            - Call `fastqc_raw`. Input - raw fastq files.
            - Call `seqyclean`. Input - raw fastq files.
            - Call `fastqc_clean`. Input - cleaned fastq files from `seqyclean`.
        - Call `sample_qc_file`. Input - `data.txt` files from `fastqc_raw` and `fastqc_clean`, `SummaryStatistics.txt` from `seqyclean`.
        - Call `multiqc_fastqc`. Input - `data.txt` files from `fastqc_raw` and `fastqc_clean`.
        - Call `multiqc_seqyclean`. Input - `SummaryStatistics.txt` files from `seqyclean`.
        - Call `other_tasks.transfer` task to transfer all primer-level task outputs to their respective directories.
    - Call `reference_tasks.reference_level_tasks` subworkflow
        - Scatter samples
            - Call `align_bwa`. Input - `reference` name and fasta file, cleaned fastq files from `seqyclean`.
            - Call `trim_primers_ivar`. Input - `bwa` from `align_bwa`, `primer_bed` file.
            - Call `generate_consensus_ivar`. Input - `trimmed.sorted.bam` from `trim_primers_ivar`, `reference` fasta file.
            - Call `alignment_metrics`. Input - `trimmed.sorted.bam` from `trim_primers_ivar`.
            - Call `calculate_percent_coverage`. Input - {to fill in}
            - Call `concat_sample_reference_metrics`. Input - `stats.txt` and `coverage.txt` from `alignment_metrics`, `coverage_stats.csv` from `calculate_percent_coverage`.
        - Call `multiqc_samtools`. Input - `stats.txt` and `coverage.txt` from `alignment_metrics`.
        - Call `other_tasks.transfer` task to transfer all reference-level task outputs to their respective directories.
- Call `version_capture_tasks.capture_versions`. Input - `analysis_date` and `workflow_version` from `workflow_metadata`, `Array[VersionInfo]` objects from all tools used.
- Call `other_tasks.transfer` task to transfer `version_capture.csv` from `version_capture_tasks.capture_versions`.
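
The primer-level filtering step described above (building a per-primer sample list and excluding samples with empty fastq files) can be illustrated with a short Python sketch. This is not the workflow's WDL code. The `Sample` dataclass, its fields, and the `samples_by_primer` helper are hypothetical names chosen for illustration, and "empty" is assumed here to mean a zero-byte file (a gzipped file with no reads would still have a non-zero size).

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Sample:
    """Hypothetical stand-in for the workflow's Sample struct."""
    name: str
    primer: str      # name of the primer scheme used for this sample
    fastq1: Path
    fastq2: Path


def is_nonempty(path: Path) -> bool:
    """Return True when the file exists and holds at least one byte."""
    return path.is_file() and path.stat().st_size > 0


def samples_by_primer(samples, primer_names):
    """Group samples under each declared primer scheme.

    Samples whose fastq files are empty (assumed: zero bytes) are skipped,
    as are samples that name a primer scheme that was not declared.
    """
    grouped = {primer: [] for primer in primer_names}
    for sample in samples:
        if sample.primer not in grouped:
            continue
        if is_nonempty(sample.fastq1) and is_nonempty(sample.fastq2):
            grouped[sample.primer].append(sample)
    return grouped
```

Each list in the returned dictionary plays the role of the primer sample list that would then be passed to the primer-level and reference-level subworkflows.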
| Name | Description |
|---|---|
| `AVRL_H5N1_250bp_bed` | cattle-specific, tiled whole genome |
| `houston_bed` | Rice, H1, H3, H5, N1, N2 - Olivar (tiled) |
| `human_h5_200_bed` | cattle-H5-specific, 200bp, tiled HA gene |
| `human_h5_250_bed` | cattle-H5-specific, 250bp, tiled HA gene |
| `bovine_texas_029328_01_UtoT_fasta` | Cattle reference, whole genome |
| `bovine_texas_029328_01_UtoT_ha_fasta` | Cattle reference, HA gene |
| `darwin_9_2021_h3n2_ha_h3_fasta` | CDC vaccine strain reference for H3N2, HA gene |
| `victoria_4897_2022_h1n1_ha_h1_fasta` | CDC vaccine strain reference for H1N1, HA gene |
| `vietnam_1203_2024_h5n1_ha_v2_fasta` | CDC vaccine strain reference for H5N1, HA gene |
| `contaminants_fasta` | Adapters and contaminants fasta file |