Update docs
peterk87 committed Jul 11, 2023
1 parent b21b89d commit 2ed8bf5
Showing 6 changed files with 25 additions and 16 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,14 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [[3.3.0](https://github.com/CFIA-NCFAD/nf-flu/releases/tag/3.3.0)] - 2023-07-11
+
+This release migrates to more recently updated Influenza virus sequences since the last update for the [NCBI Influenza DB FTP data](https://ftp.ncbi.nih.gov/genomes/INFLUENZA/) was in 2020-10-13. By default, all Orthomyxoviridae virus sequences were parsed from the daily updated NCBI Viruses [`AllNucleotide.fa`](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNucleotide/) and [`AllNuclMetadata.csv.gz`](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNuclMetadata/AllNuclMetadata.csv.gz) and uploaded to [Figshare](https://figshare.com/articles/dataset/2023-06-14_-_NCBI_Viruses_-_Orthomyxoviridae/23608782) as Zstd compressed files. nf-flu no longer uses the [influenza.fna.gz](https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz) and [genomeset.dat.gz](https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz) files for Influenza sequences and metadata, respectively.
+
+### Fixes
+
+* More up-to-date Influenza sequences database used by default (#24)
+
 ## [[3.2.1](https://github.com/CFIA-NCFAD/nf-flu/releases/tag/3.2.1)] - 2023-07-07

 ### Fixes
13 changes: 7 additions & 6 deletions README.md
@@ -17,13 +17,14 @@ After reference sequence selection, the pipeline performs read mapping to each r

 ## Pipeline summary

-1. Download latest [NCBI Influenza DB][] sequences and metadata (or use user-specified files)
-2. Merge reads of re-sequenced samples ([`cat`](http://www.linfo.org/cat.html)) (if needed)
+1. Download latest [NCBI Orthomyxoviridae sequences](https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=11308&lvl=3&keep=1&srchmode=1&unlock) and metadata (parsed from [NCBI Viruses FTP data](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNucleotide/)).
+2. Merge reads of re-sequenced samples ([`cat`](http://www.linfo.org/cat.html)) (if needed).
 3. Assembly of Influenza gene segments with [IRMA][] using the built-in FLU module
-4. Nucleotide [BLAST][] search against [NCBI Influenza DB][]
-5. Automatically select top match references for segments
-6. H/N subtype prediction and Excel XLSX report generation based on BLAST results
-7. Perform Variant calling and genome assembly for all segments.
+4. Nucleotide [BLAST][] search against [NCBI Influenza DB][] sequences
+5. H/N subtype prediction and Excel XLSX report generation based on BLAST results.
+6. Automatically select top match reference sequences for segments
+7. Read mapping, variant calling and consensus sequence generation for each segment against top reference sequence based on BLAST results.
+8. MultiQC report generation.

 ## Quick Start

2 changes: 1 addition & 1 deletion docs/output.md
@@ -78,7 +78,7 @@ The primary output from [IRMA][] are the consensus sequences for gene segments,

 </details>

-Nucleotide [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi) (`blastn`) is used to query [IRMA][] assembled gene segment sequences against the [NCBI Influenza DB][] sequences (and optionally, against user-specified sequences (`--ref_db`) to predict the H and N subtype of each sample if possible (i.e. if segments 4 (hemagglutinin) and/or 6 (neuraminidase) were assembled) and to determine the closest matching reference sequence for each segment for reference mapped assembly.
+Nucleotide [BLAST](https://blast.ncbi.nlm.nih.gov/Blast.cgi) (`blastn`) is used to query [IRMA][] assembled gene segment sequences against [Influenza sequences from NCBI](https://ftp.ncbi.nlm.nih.gov/genomes/Viruses/AllNucleotide/) (and optionally, against user-specified sequences (`--ref_db`) to predict the H and N subtype of each sample if possible (i.e. if segments 4 (hemagglutinin) and/or 6 (neuraminidase) were assembled) and to determine the closest matching reference sequence for each segment for reference mapped assembly.

 ### Coverage Plots

8 changes: 4 additions & 4 deletions docs/usage.md
@@ -261,17 +261,17 @@ Maximum of top blastn result reported

 - Optional
 - Type: string
-- Default: `https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz`
+- Default: `https://api.figshare.com/v2/file/download/41415330`

-Path/URL to NCBI Influenza DB sequences FASTA file.
+Path/URL to Zstandard compressed NCBI Influenza virus sequences FASTA file.

 #### `--ncbi_influenza_metadata`

 - Optional
 - Type: string
-- Default: `https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz`
+- Default: `https://api.figshare.com/v2/file/download/41415333`

-Path/URL to NCBI Influenza DB metadata file.
+Path/URL to Zstandard compressed NCBI Influenza virus sequences metadata CSV file.

 ### Generic options

2 changes: 1 addition & 1 deletion nextflow.config
@@ -153,7 +153,7 @@ manifest {
 description = 'Influenza A virus genome assembly pipeline'
 homePage = 'https://github.com/CFIA-NCFAD/nf-flu'
 author = 'Peter Kruczkiewicz, Hai Nguyen'
-version = '3.2.1'
+version = '3.3.0'
 nextflowVersion = '>=21.10'
 mainScript = 'main.nf'
 doi = '10.5281/zenodo.7011213'
8 changes: 4 additions & 4 deletions nextflow_schema.json
@@ -213,14 +213,14 @@
 },
 "ncbi_influenza_fasta": {
 "type": "string",
-"default": "https://ftp.ncbi.nih.gov/genomes/INFLUENZA/influenza.fna.gz",
-"description": "Path/URL to NCBI Influenza DB sequences FASTA file.",
+"default": "https://api.figshare.com/v2/file/download/41415330",
+"description": "Path/URL to Zstandard compressed NCBI Influenza virus sequences FASTA file.",
 "fa_icon": "fas fa-file-alt"
 },
 "ncbi_influenza_metadata": {
 "type": "string",
-"default": "https://ftp.ncbi.nih.gov/genomes/INFLUENZA/genomeset.dat.gz",
-"description": "Path/URL to NCBI Influenza DB metadata file.",
+"default": "https://api.figshare.com/v2/file/download/41415333",
+"description": "Path/URL to Zstandard compressed NCBI Influenza virus sequences metadata CSV file.",
 "fa_icon": "fas fa-file-csv"
 }
 },
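The schema hunk above changes only the `default` values of `ncbi_influenza_fasta` and `ncbi_influenza_metadata`, so the new Figshare URLs apply only when a run does not set these parameters itself. The following is a minimal Python sketch of that fallback behaviour, not how Nextflow or nf-flu resolve parameters internally. The `resolve_params` helper and the local file path are hypothetical, introduced purely for illustration; the parameter names and default URLs are taken from the diff.

```python
import json

# Fragment mirroring the two properties shown in the nextflow_schema.json hunk.
SCHEMA_FRAGMENT = json.loads("""
{
  "ncbi_influenza_fasta": {
    "type": "string",
    "default": "https://api.figshare.com/v2/file/download/41415330"
  },
  "ncbi_influenza_metadata": {
    "type": "string",
    "default": "https://api.figshare.com/v2/file/download/41415333"
  }
}
""")


def resolve_params(schema, overrides):
    """Hypothetical helper: a user override wins, otherwise the schema default is used."""
    resolved = {}
    for name, spec in schema.items():
        value = overrides.get(name, spec["default"])
        # Both parameters are declared as "type": "string" in the schema.
        if spec["type"] == "string" and not isinstance(value, str):
            raise TypeError(f"--{name} must be a string, got {type(value).__name__}")
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    # Hypothetical local path; only the FASTA parameter is overridden here,
    # so the metadata parameter falls back to its Figshare default.
    user_overrides = {"ncbi_influenza_fasta": "/data/orthomyxoviridae.fasta.zst"}
    for name, value in resolve_params(SCHEMA_FRAGMENT, user_overrides).items():
        print(f"--{name} {value}")
```

Because both parameters accept a path or a URL, a run that points them at local Zstandard compressed files would not need to download from Figshare at all, assuming the files follow the same format as the defaults.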
