Move downloaded Nextclade executable to data/nextclade
Avoids clash of downloaded Nextclade executable with the Nextclade
command available in the environment.

Includes the side-effect of the downloaded executable being removed
as part of `bin/clean` when running the workflow without the
`keep_all_files=True` config param. This ensures that the workflow will
start from a clean slate.
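
As a minimal sketch of why calling the executable through a path such as `data/nextclade` cannot be confused with a `nextclade` command installed in the environment, the snippet below builds a stub in a scratch directory and runs the same kind of check the rule uses. The scratch directory, the stub script, and its version string are assumptions made for illustration; nothing here downloads or runs the real Nextclade.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical scratch directory standing in for the workflow's working directory.
workdir="$(mktemp -d)"
cd "$workdir"
mkdir -p data

# Stub standing in for the downloaded Nextclade binary (not the real executable).
printf '#!/bin/sh\necho "nextclade 0.0.0-stub"\n' > data/nextclade
chmod +x data/nextclade

# A name containing a slash is treated as a path and is not searched for on PATH,
# so this check cannot pick up a `nextclade` provided by the environment.
if ! command -v data/nextclade >/dev/null 2>&1; then
    echo "[ERROR] Nextclade executable not found" >&2
    exit 1
fi

stub_version="$(data/nextclade --version)"
echo "[ INFO] Nextclade version: $stub_version"
```

Because the file lives under `data/`, any cleanup that removes that directory would also remove the stub, which mirrors the clean-slate side effect described above.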
joverlee521 committed Jul 26, 2024
1 parent 6c6a4ff commit 9a2ca57
Showing 1 changed file with 13 additions and 12 deletions.
25 changes: 13 additions & 12 deletions workflow/snakemake_rules/nextclade.smk
@@ -65,7 +65,7 @@ if config.get("s3_dst") and config.get("s3_src"):

rule use_nextclade_cache:
input:
-nextclade="./nextclade",
+nextclade="data/nextclade",
nextclade_dataset=lambda w: f"data/nextclade_data/sars-cov-2{w.reference.replace('_','-')}.zip",
params:
dst_source=config["s3_dst"],
@@ -166,40 +166,40 @@ rule get_sequences_without_nextclade_annotations:
rule download_nextclade_executable:
"""Download Nextclade"""
output:
-nextclade="nextclade",
+nextclade="data/nextclade",
benchmark:
f"benchmarks/download_nextclade_executable_{database}.txt"
shell:
"""
if [ "$(uname)" = "Darwin" ]; then
-curl -fsSL "https://github.com/nextstrain/nextclade/releases/latest/download/nextclade-x86_64-apple-darwin" -o "nextclade"
+curl -fsSL "https://github.com/nextstrain/nextclade/releases/latest/download/nextclade-x86_64-apple-darwin" -o {output.nextclade:q}
else
-curl -fsSL "https://github.com/nextstrain/nextclade/releases/latest/download/nextclade-x86_64-unknown-linux-gnu" -o "nextclade"
+curl -fsSL "https://github.com/nextstrain/nextclade/releases/latest/download/nextclade-x86_64-unknown-linux-gnu" -o {output.nextclade:q}
fi
-chmod +x nextclade
+chmod +x {output.nextclade:q}
-if ! command -v ./nextclade &>/dev/null; then
+if ! command -v {output.nextclade:q} &>/dev/null; then
echo "[ERROR] Nextclade executable not found"
exit 1
fi
-NEXTCLADE_VERSION="$(./nextclade --version)"
+NEXTCLADE_VERSION="$({output.nextclade:q} --version)"
echo "[ INFO] Nextclade version: $NEXTCLADE_VERSION"
"""


rule download_nextclade_dataset:
"""Download Nextclade dataset"""
input:
-"nextclade",
+nextclade="data/nextclade",
output:
dataset="data/nextclade_data/{dataset_name}.zip",
benchmark:
f"benchmarks/download_nextclade_dataset_{database}_{{dataset_name}}.txt"
shell:
"""
-./nextclade dataset get --name="{wildcards.dataset_name}" --output-zip={output.dataset} --verbose
+{input.nextclade:q} dataset get --name="{wildcards.dataset_name}" --output-zip={output.dataset} --verbose
"""


@@ -210,7 +210,7 @@ rule run_wuhan_nextclade:
metrics which will ultimately end up in metadata.tsv.
"""
input:
-nextclade_path="nextclade",
+nextclade_path="data/nextclade",
dataset="data/nextclade_data/sars-cov-2.zip",
sequences=f"data/{database}/nextclade.sequences.fasta",
params:
@@ -245,7 +245,7 @@ rule run_21L_nextclade:
Like wuhan nextclade, but TSV only, no alignments output
"""
input:
-nextclade_path="nextclade",
+nextclade_path="data/nextclade",
dataset=lambda w: f"data/nextclade_data/sars-cov-2-21L.zip",
sequences=f"data/{database}/nextclade_21L.sequences.fasta",
output:
@@ -266,6 +266,7 @@ rule run_21L_nextclade:

rule nextclade_tsv_concat_versions:
input:
+nextclade="data/nextclade",
tsv=f"data/{database}/nextclade{{reference}}_new_raw.tsv",
dataset=lambda w: f"data/nextclade_data/sars-cov-2{w.reference.replace('_','-')}.zip",
output:
@@ -276,7 +277,7 @@ rule nextclade_tsv_concat_versions:
"""
if [ -s {input.tsv} ]; then
# Get version numbers
-nextclade_version="$(./nextclade --version)"
+nextclade_version="$({input.nextclade:q} --version)"
dataset_version="$(unzip -p {input.dataset} pathogen.json | jq -r '.version.tag')"
timestamp="$(date -u +"%Y-%m-%dT%H:%M:%SZ")"