Add neg ctrl test data to CI; bump to version 3.3.5

peterk87 · Sep 15, 2023 · 4b7db19
1 parent 0946f98
Showing 4 changed files with 27 additions and 11 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -157,6 +157,11 @@ jobs:
       - name: Fetch IBV test seq
         run: |
           curl -SLk --silent https://github.com/CFIA-NCFAD/nf-test-datasets/raw/nf-flu/nanopore/fastq/SRR24826962.sampled.fastq.gz > reads/SRR24826962.fastq.gz
+      - name: Fetch negative control seq data
+        run: |
+          curl -SLk --silent https://github.com/CFIA-NCFAD/nf-test-datasets/raw/nf-flu/nanopore/fastq/ntc-bc15.fastq.gz > reads/ntc-bc15.fastq.gz
+          curl -SLk --silent https://github.com/CFIA-NCFAD/nf-test-datasets/raw/nf-flu/nanopore/fastq/ntc-bc31.fastq.gz > reads/ntc-bc31.fastq.gz
+          curl -SLk --silent https://github.com/CFIA-NCFAD/nf-test-datasets/raw/nf-flu/nanopore/fastq/ntc-bc47.fastq.gz > reads/ntc-bc47.fastq.gz
       - name: Check IBV data
         run: |
           file reads/SRR24826962.fastq.gz
@@ -177,6 +182,9 @@ jobs:
           echo "ERR6359501,$(realpath run1)" | tee -a samplesheet.csv
           echo "ERR6359501,$(realpath run2)" | tee -a samplesheet.csv
           echo "SRR24826962,$(realpath reads/SRR24826962.fastq.gz)" | tee -a samplesheet.csv
+          echo "ntc-bc15,$(realpath reads/ntc-bc15.fastq.gz)" | tee -a samplesheet.csv
+          echo "ntc-bc31,$(realpath reads/ntc-bc31.fastq.gz)" | tee -a samplesheet.csv
+          echo "ntc-bc47,$(realpath reads/ntc-bc47.fastq.gz)" | tee -a samplesheet.csv
       - name: Cache subsampled influenza.fna
         uses: actions/cache@v3
         id: cache-influenza-fna
@@ -205,6 +213,13 @@ jobs:
             --input samplesheet.csv \
             --ncbi_influenza_fasta influenza-10k.fna.zst \
             --ncbi_influenza_metadata influenza.csv.zst
+      - name: Tree of results
+        run: tree -h results/
+      - name: Upload .nextflow.log
+        uses: actions/upload-artifact@v1.0.0
+        with:
+          name: nextflow-log-nanopore-${{ matrix.nxf_ver }}
+          path: .nextflow.log
       - name: Upload pipeline_info/
         if: success()
         uses: actions/upload-artifact@v1.0.0
@@ -223,8 +238,3 @@ jobs:
         with:
           name: nanopore-test-results-multiqc-${{ matrix.nxf_ver }}
           path: results/MultiQC/multiqc_report.html
-      - name: Upload .nextflow.log
-        uses: actions/upload-artifact@v1.0.0
-        with:
-          name: nextflow-log-nanopore-${{ matrix.nxf_ver }}
-          path: .nextflow.log
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -3,6 +3,12 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [[3.3.5](https://github.com/CFIA-NCFAD/nf-flu/releases/tag/3.3.5)] - 2023-09-15
+
+### Fixes
+
+* handling of empty IRMA `amended_consensus/` when running a negative control or blank sequence (#47)
+
 ## [[3.3.4](https://github.com/CFIA-NCFAD/nf-flu/releases/tag/3.3.4)] - 2023-08-18
 
 ### Fixes

diff --git a/docs/output.md b/docs/output.md
@@ -66,19 +66,19 @@ The primary output from [IRMA][] are the consensus sequences for gene segments,
 <summary>Output files</summary>
 
 - `blast/ncbi/blast_db/`
-  - Nucleotide BLAST database of [NCBI Influenza DB][] and reference database (if provided option `--ref_db`): `influenza_db.*`
+  - Nucleotide [BLAST] database of [NCBI Influenza DB][] and reference database (if provided option `--ref_db`): `influenza_db.*`
 - `blast/ref_db/blast_db/`
-  - Nucleotide BLAST database of the reference database (if provided option `--ref_db`) ref_fasta.fixed.*`
+  - Nucleotide [BLAST] database of the reference database (if provided option `--ref_db`): `ref_fasta.fixed.*`
 - `blast/blastn/irma`
-  - Nucleotide BLAST tabular output files (`-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle"`) of sample IRMA assembled gene segments against the [NCBI Influenza DB][] and the reference database (if provided option `--ref_db`)
+  - Nucleotide [BLAST] tabular output files (`-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle"`) of sample IRMA assembled gene segments against the [NCBI Influenza DB][] and the reference database (if provided option `--ref_db`)
 - `blast/blastn/consensus`
-  - Nucleotide BLAST tabular output files (`-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle"`) of sample final consensus assembled gene segments against the [NCBI Influenza DB][] and the reference database (if provided option `--ref_db`)
+  - Nucleotide [BLAST] tabular output files (`-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle"`) of sample final consensus assembled gene segments against the [NCBI Influenza DB][] and the reference database (if provided option `--ref_db`)
 - `blast/blastn/against_ref_db`
   - Nucleotide BLAST tabular output files (`-outfmt "6 qaccver saccver pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen qcovs stitle"`) of sample final consensus assembled gene segments against the reference database only (if provided option `--ref_db`)
 
 </details>
 
-Nucleotide [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi) (`blastn`) is used to query [IRMA][] assembled gene segment sequences against [Influenza sequences from NCBI](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNucleotide/) (and optionally, against user-specified sequences (`--ref_db`) to predict the H and N subtype of each sample if possible (i.e. if segments 4 (hemagglutinin) and/or 6 (neuraminidase) were assembled) and to determine the closest matching reference sequence for each segment for reference mapped assembly.
+Nucleotide [BLAST][] (`blastn`) is used to query [IRMA][] assembled gene segment sequences against [Influenza sequences from NCBI](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNucleotide/) (and optionally, against user-specified sequences (`--ref_db`)) to predict the H and N subtype of each sample if possible (i.e. if segments 4 (hemagglutinin) and/or 6 (neuraminidase) were assembled) and to determine the closest matching reference sequence for each segment for reference mapped assembly.
 
 ### Coverage Plots
 

diff --git a/nextflow.config b/nextflow.config
@@ -151,7 +151,7 @@ manifest {
   description     = 'Influenza A virus genome assembly pipeline'
   homePage        = 'https://github.com/CFIA-NCFAD/nf-flu'
   author          = 'Peter Kruczkiewicz, Hai Nguyen'
-  version         = '3.3.4'
+  version         = '3.3.5'
   nextflowVersion = '!>=22.10.1'
   mainScript      = 'main.nf'
   doi             = '10.5281/zenodo.7011213'
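
The CI change in this commit adds one samplesheet row per negative control by repeating near-identical `echo` lines. As a rough sketch only (not part of the commit), the same rows could be produced in a loop. The helper name `samplesheet_rows` is hypothetical, and the sketch assumes the reads are stored as `<reads_dir>/<sample>.fastq.gz`, matching the layout used in the workflow. Unlike the workflow, it does not call `realpath`, so the caller is expected to pass an absolute directory.

```shell
# Hypothetical helper (not in the repository): print one "sample,path" row
# per sample ID, assuming reads live at <reads_dir>/<sample>.fastq.gz.
samplesheet_rows() {
  reads_dir="$1"
  shift
  for sample in "$@"; do
    printf '%s,%s/%s.fastq.gz\n' "$sample" "$reads_dir" "$sample"
  done
}

# Example: emit rows for the three negative controls added in this commit.
samplesheet_rows "$PWD/reads" ntc-bc15 ntc-bc31 ntc-bc47
```

In the workflow, the output could be piped to `tee -a samplesheet.csv` in the same way as the existing `echo` lines.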