Eukaryotic draft bins with EukRep and EukCC

Francisco Zorrilla edited this page Mar 22, 2021 · 2 revisions

EukRep is run through the following Snakemake rule:

rule eukrep:
    input:
        f'/home/fz274/rds/hpc-work/routy/all_bins/dump'
    output:
        f'/home/fz274/rds/hpc-work/routy/all_bins/euks_all.csv'
    message:
        """
        Filters CONCOCT bins for eukaryotic MAGs/contigs with the EukRep script filter_euk_bins.py.
        Parameters are set very loosely to capture any bin of at least 1 Mbp that contains at least 1 kbp of eukaryotic DNA.
            1. minl = minimum contig length, matching min_contig length from the assembly = 1 kbp
            2. eukratio = minimum ratio of euk to prok DNA in a bin = 0
            3. minbp = minimum MAG length = 1 Mbp
            4. minbpeuks = minimum euk length in a MAG = 1 kbp

        Assumes that CONCOCT bins have been dumped into folder /home/fz274/rds/hpc-work/routy/all_bins/dump
        """
    shell:
        """
        set +u;source activate {config[envs][metabagpipes]};set -u;
        cd $(dirname {input})

        filter_euk_bins.py --output euks_all.csv \
                           --threads 56 \
                           --minl 1000 \
                           --eukratio 0 \
                           --minbp 1000000 \
                           --minbpeuks 1000 \
                           dump/*.fa
        """

EukCC is run through the following Snakemake rule:

rule eukcc:
    input:
        f'/home/fz274/rds/hpc-work/routy/all_bins/mags/{{binIDs}}.fa'
    output:
        f'/home/fz274/rds/hpc-work/routy/all_bins/eukcc/{{binIDs}}'
    message:
        """
        Grabs MAGs from the input folder and runs them in series through EukCC to get
        completeness and lineage info. Only bins larger than 0.5 Mbp were included (~300 in total).
        Running with pygmes, as in the command below, was attempted but did not work:
        eukcc --db eukccdb -o . --ncorespplacer 1 --ncores 16 --pygmes --diamond uniref50_pygmes.dmnd genome.fna
        """
    shell:
        """
        set +u;source activate metagem2;set -u;
        eukcc --db /home/fz274/rds/hpc-work/eukcc/eukccdb --outdir {output} --ncorespplacer 1 --ncores 16 {input}
        """