Cluster config
The `cluster_config.json` file is designed to interact with the SLURM workload manager and contains parameters relevant to job submissions, such as the number of cores `n` per job, the memory `mem` of RAM per job, your cluster account name, etc. The `cluster_config.json` file looks something like this:
```json
{
    "__default__" : {
        "account" : "your_account_name_on_the_cluster",
        "time" : "0-06:00:00",
        "n" : 48,
        "tasks" : 1,
        "mem" : "180G",
        "name" : "DL.{rule}",
        "output" : "logs/{wildcards}.%N.{rule}.out.log"
    }
}
```
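Snakemake will refuse to start if this file is not valid JSON (note, for example, that values like `"180G"` must be quoted, and trailing commas are not allowed). A quick way to validate the file after editing, assuming Python is available on your login node:

```bash
# Prints the re-formatted JSON on success; fails with a parse error otherwise
python -m json.tool cluster_config.json
```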
Most importantly, you should replace `your_account_name_on_the_cluster` with your account name on your cluster. The maximum runtime `time`, number of cores `n`, and memory `mem` are written to this config file for you when you use the `metaGEM.sh` script.
For example, if you run the following command:
```bash
metaGEM.sh -t qfilter -j 10 -c 2 -m 8 -h 2
```
Then the `metaGEM.sh` parser will submit 10 jobs with the `cluster_config.json` file configured like this:
```json
{
    "__default__" : {
        "account" : "satoshi",
        "time" : "0-02:00:00",
        "n" : 2,
        "tasks" : 1,
        "mem" : "8G",
        "name" : "DL.{rule}",
        "output" : "logs/{wildcards}.%N.{rule}.out.log"
    }
}
```
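Under the hood, the parser rewrites these fields in place before invoking Snakemake. As a rough sketch of what that amounts to (the actual `metaGEM.sh` implementation may differ; this is only an illustration using `sed`):

```bash
# Illustration only: patch cores, memory, and runtime in cluster_config.json.
# The real metaGEM.sh parser may do this differently.
sed -i 's/"n" : [0-9]*/"n" : 2/' cluster_config.json
sed -i 's/"mem" : "[0-9]*G"/"mem" : "8G"/' cluster_config.json
sed -i 's/"time" : "[^"]*"/"time" : "0-02:00:00"/' cluster_config.json
```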
As you can see, the submitted jobs will now request 2 cores and 8 GB of RAM per job, with a max runtime of 2 hours. The Snakemake log will be stored in a file called `nohup.out`, and the logs for the individual jobs will be stored in the subfolder called `logs`. Make sure that the `logs` folder already exists before submitting jobs; this can be done automatically by running:
```bash
metaGEM.sh -t createFolders
```
Note that this will also create subfolders for all entries under `folder` in the `config.yaml` file, which includes `logs`. If you just want to make sure that the `logs` folder exists, then simply run:
```bash
mkdir -p logs
```
If you are developing or testing new rules, you may want to submit jobs without going through the `metaGEM.sh` parser. In this case you can run the following command to submit jobs manually:

```bash
nohup snakemake all -j 200 -k --cluster-config cluster_config.json --cluster "sbatch -A {cluster.account} -p {cluster.part} --mem {cluster.mem} -t {cluster.time} -n {cluster.n} --ntasks {cluster.tasks} --cpus-per-task {cluster.n} --output {cluster.output}" &
```
The above command will submit 200 jobs based on what is requested by the rule `all` in the Snakefile. Note that `-p {cluster.part}` assumes a `part` entry (the SLURM partition name) in `cluster_config.json`, which is not shown in the snippets above; if your copy of the file does not define it, add it or drop the `-p` flag. For reference, the full metaGEM workflow covers:
- Quality filter reads with fastp
- Assembly with megahit
- Draft bin sets with CONCOCT, MaxBin2, and MetaBAT2
- Refine & reassemble bins with metaWRAP
- Taxonomic assignment with GTDB-tk
- Relative abundances with bwa
- Reconstruct & evaluate genome-scale metabolic models with CarveMe and memote
- Species metabolic coupling analysis with SMETANA
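Whether you submit jobs through `metaGEM.sh` or manually as above, it can be worth previewing what Snakemake plans to run before submitting, and then keeping an eye on things once jobs are in flight. A few standard commands for this (nothing metaGEM-specific):

```bash
# Dry run: list the jobs Snakemake would submit, without running anything
snakemake all -n

# Follow the top-level Snakemake log
tail -f nohup.out

# Check the state of your submitted jobs in the SLURM queue
squeue -u $USER
```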