DNA-mapping pipeline gets stuck at origFASTQ rule when using a custom cluster config yaml #887

Open
TobiasHohl opened this issue Mar 6, 2023 · 4 comments

TobiasHohl commented Mar 6, 2023

I tried running the DNA-mapping pipeline with a customized cluster config YAML, to work around issues with one node on our server, using the following command:
DNA-mapping --DAG --trim --trimmerOptions '-a nexteraF=CTGTCTCTTATA -A nexteraR=CTGTCTCTTATA' --dedup --mapq 2 -j 20 -i ./fastq/ -o ./mapping/ hg38 --clusterConfigFile custom-cluster-config.yaml

The pipeline always got stuck while executing the first rules (origFASTQ1 and origFASTQ2), regardless of the name I gave the custom config YAML or the output folder I assigned. This did not happen with the default cluster configuration.
Also, when I start the pipeline with the default configuration and interrupt it after the origFASTQ folder and its contents have been created, I can restart with the custom config and the pipeline finishes.

Custom cluster config:

CollectAlignmentSummaryMetrics:
  memory: 2G
CollectInsertSizeMetrics:
  memory: 1G
FASTQdownsample:
  memory: 4G
__default__:
  memory: 1G
bamCoverage:
  memory: 4G
bamCoverage_RPKM:
  memory: 5G
bamCoverage_coverage:
  memory: 5G
bamCoverage_filtered:
  memory: 4G
bamCoverage_raw:
  memory: 5G
bamCoverage_unique_mappings:
  memory: 5G
bamPE_fragment_size:
  memory: 10G
bowtie2:
  memory: 4G
bwa:
  memory: 4G
bwamem2:
  memory: 6G
create_snpgenome:
  memory: 30G
filter_reads:
  memory: 3G
filter_reads_umi:
  memory: 10G
plotCorrelation_pearson:
  memory: 3G
plotCorrelation_pearson_allelic:
  memory: 5G
plotCorrelation_spearman:
  memory: 3G
plotCorrelation_spearman_allelic:
  memory: 2G
plotCoverage:
  memory: 1G
plotEnrichment:
  memory: 1G
plotFingerprint:
  memory: 1G
plotPCA:
  memory: 4G
plotPCA_allelic:
  memory: 4G
plot_heatmap_CSAW_up:
  memory: 10G
snakePipes_cluster_logDir: cluster_logs
snakemake_cluster_cmd: module load slurm; sbatch --ntasks-per-node 1 -p bioinfo --mem-per-cpu
  {cluster.memory} -c {threads} -e cluster_logs/{rule}.%j.err -o cluster_logs/{rule}.%j.out
  -x deep9 -J {rule}.snakemake
snakemake_latency_wait: 300
snp_split:
  memory: 10G
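
To make the config above easier to read, here is a minimal Python sketch of how a per-rule memory value and the `snakemake_cluster_cmd` template could combine into one submission command. It assumes the usual Snakemake cluster-config convention that a rule without its own entry falls back to `__default__`. `render_cluster_cmd` is a hypothetical helper written for illustration, not snakePipes or Snakemake code, and it does not diagnose the hang.

```python
from types import SimpleNamespace

# A small excerpt of the config quoted above (only two entries, for brevity).
cluster_config = {
    "__default__": {"memory": "1G"},
    "bowtie2": {"memory": "4G"},
}

# The cluster command from the config, joined onto one logical line.
CLUSTER_CMD = (
    "module load slurm; sbatch --ntasks-per-node 1 -p bioinfo --mem-per-cpu "
    "{cluster.memory} -c {threads} -e cluster_logs/{rule}.%j.err "
    "-o cluster_logs/{rule}.%j.out -x deep9 -J {rule}.snakemake"
)


def render_cluster_cmd(rule, threads, config, template=CLUSTER_CMD):
    """Hypothetical helper: fill the template for one rule.

    Assumption: rule-specific settings override the `__default__` entry.
    """
    settings = dict(config.get("__default__", {}))
    settings.update(config.get(rule, {}))
    # str.format resolves {cluster.memory} as attribute access on `cluster`.
    return template.format(
        cluster=SimpleNamespace(**settings), threads=threads, rule=rule
    )


# origFASTQ1 has no entry of its own, so it picks up the default memory.
print(render_cluster_cmd("origFASTQ1", 1, cluster_config))
```

Under that assumption, the origFASTQ rules would be submitted with the `__default__` memory of 1G, since the config has no entry for them. The `%j` placeholder is passed through unchanged for Slurm, which substitutes the job ID.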

Edit: formatting

@TobiasHohl TobiasHohl added the bug label Mar 6, 2023
@katsikora
Contributor

Hi Tobi,

I'll check whether I can reproduce this.
Just to confirm: the only change you made was to the cluster command, to exclude deep9?

Best wishes,

Katarzyna
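
One way to answer that question without comparing both files by eye is to diff the two configs key by key. The sketch below is a hedged illustration: `diff_configs` is a hypothetical helper, and the "default" command shown is an assumption (the custom command minus `-x deep9`, shortened here), since the actual default config is not quoted in this thread. In practice the two dictionaries would come from loading the YAML files, for example with PyYAML's `yaml.safe_load`.

```python
def diff_configs(default, custom):
    """Hypothetical helper: return {key: (default_value, custom_value)}
    for every top-level key whose value differs between the two configs.
    A key missing from one side is reported with None on that side.
    """
    changes = {}
    for key in sorted(set(default) | set(custom)):
        old, new = default.get(key), custom.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Assumed stand-ins for the loaded YAML files (commands shortened).
default_cfg = {
    "__default__": {"memory": "1G"},
    "bowtie2": {"memory": "4G"},
    "snakemake_cluster_cmd": "sbatch -p bioinfo -J {rule}.snakemake",
}
custom_cfg = {
    "__default__": {"memory": "1G"},
    "bowtie2": {"memory": "4G"},
    "snakemake_cluster_cmd": "sbatch -p bioinfo -x deep9 -J {rule}.snakemake",
}

changed = diff_configs(default_cfg, custom_cfg)
for key, (old, new) in changed.items():
    print(f"{key}:\n  default: {old}\n  custom:  {new}")
```

If only `snakemake_cluster_cmd` shows up in the result, the memory settings are identical and the node exclusion is the sole difference between the two runs.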

@TobiasHohl
Member Author

Hi Katarzyna,

Yes. When I start the pipeline with the following command, it works:
DNA-mapping --DAG --trim --trimmerOptions "-a nexteraF=CTGTCTCTTATA -A nexteraR=CTGTCTCTTATA" --dedup --mapq 2 -j 20 -i ./fastq/ -o ./sp/ hg38

The pipeline only gets stuck when I use the cluster config YAML shown above.

Thanks for looking into this!

Best
Tobi

@katsikora
Contributor

Hi Tobi,

Thanks for submitting the issue. I can indeed reproduce it.
I'll look into what might be causing this.

Best,

Katarzyna

@katsikora
Contributor

Hi Tobi,

As an update, I will attempt to circumvent this and other Slurm-related issues in snakePipes by implementing the native Slurm support available in more recent Snakemake versions. This is currently pending, as I need IT to resolve an issue related to the slurm folder on the package partition.

Best,

Katarzyna
