Experimental tool to taxonomically classify microbial genomes in a few minutes

ayixon/genomescanner

- EXPERIMENTAL tool to taxonomically classify microbial genomes in a few minutes

COPYRIGHT="Copyright (C) 2022 Ayixon Sánchez Reyes"
This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation. This program is distributed WITHOUT ANY WARRANTY.

DEPENDENCIES: Mash, JolyTree, apcalc, ncbi-entrez-direct, orthoani, BLAST, Biopython, bPTP, mptp

Before you begin, install the following:

sudo apt update

sudo apt upgrade -y

sudo apt install -y build-essential

Install the dependencies

sudo apt install -y mptp apcalc ncbi-entrez-direct ncbi-blast+

Note: on some distributions, the ncbi-blast+ package installed through apt does not work. In that case, install it with conda instead.

Install Python packages

pip install orthoani

Install Conda packages

mamba install -y -c bioconda jolytree

mamba install -y -c bfurneaux bptp
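After installation, it can help to confirm that the tools are actually reachable from the shell. The following is a minimal sketch of such a check, not part of GenoScanner itself. The function name `check_commands` is hypothetical, and the executable names in the commented example (for instance `JolyTree.sh`, `calc`, `bPTP.py`) are assumptions about what each package installs; adjust them to your system.

```shell
# Report which of the given commands are missing from PATH.
# Prints one "missing: <name>" line per absent command and
# returns 1 if at least one command is missing, 0 otherwise.
check_commands() {
    local status=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            status=1
        fi
    done
    return "$status"
}

# Example call. The executable names below are assumptions, not confirmed
# by this README:
# check_commands mash JolyTree.sh calc esearch efetch orthoani blastn mptp bPTP.py
```

If the function prints nothing and returns 0, every listed command was found.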

Usage:

 ./GenoScanner.sh  -i <input_file> -d <database_file> -m <model>

Options:

  -i <input_file>    Input genome file in FASTA format (.fasta, .fna, or .fa)

  -d <database_file> Database file in .msh format

  -m <model>         Select one of two models: mptp or bptp

  -h                 Display this help message
- bPTP is the default model because, in our experience, it works best for testing the speciation hypothesis.
- The working directory must contain the Mash database (.msh) and the query genome in FASTA format.
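The requirements above can be checked before launching a run. This is a minimal sketch under stated assumptions: the function `validate_inputs` is hypothetical and is not part of GenoScanner.sh, and the file names in the example are placeholders. It only mirrors the documented constraints (accepted extensions, a .msh database, the two model names, bPTP as the default, and both files present in the working directory).

```shell
# Validate the arguments described in the usage text.
# Arguments: <input_file> <database_file> [model]; model defaults to bptp.
# Prints "ok: ..." and returns 0 on success, prints an error and returns 1 otherwise.
validate_inputs() {
    local input="$1" db="$2" model="${3:-bptp}"
    case "$input" in
        *.fasta|*.fna|*.fa) ;;
        *) echo "error: input must end in .fasta, .fna, or .fa"; return 1 ;;
    esac
    case "$db" in
        *.msh) ;;
        *) echo "error: database must be a .msh file"; return 1 ;;
    esac
    case "$model" in
        mptp|bptp) ;;
        *) echo "error: model must be mptp or bptp"; return 1 ;;
    esac
    # Both files are expected in the current working directory.
    [ -f "$input" ] || { echo "error: $input not found"; return 1; }
    [ -f "$db" ] || { echo "error: $db not found"; return 1; }
    echo "ok: $input $db $model"
}

# Example with placeholder file names:
# validate_inputs query.fna database.msh mptp && ./GenoScanner.sh -i query.fna -d database.msh -m mptp
```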

Download the preconfigured database here (Mash DB format, .msh)

This database contains ~18,000 genomic records with standing in nomenclature

Rationale: compare a query genome against a curated Mash database; select the nearest phylogenetic neighbors; estimate the ANI of the query against the references; store the genomes in a folder and pass them to JolyTree for phylogenetic estimation. Finally, the tree is subjected to speciation hypothesis testing under the Poisson Tree Processes model.
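The neighbor-selection step can be illustrated with a short sketch. It is not the code used by GenoScanner.sh; the function name `nearest_neighbors` and the default of 10 neighbors are assumptions made for illustration. It reads the tab-separated output of `mash dist` (reference, query, distance, p-value, shared hashes), sorts by distance, and keeps the closest references. The last column is only the rough approximation ANI ≈ 100 × (1 − Mash distance); the pipeline lists orthoani as a dependency, so its own ANI estimate is presumably computed separately.

```shell
# Select the N references closest to the query from `mash dist` output.
# stdin : tab-separated lines: reference, query, distance, p-value, shared hashes
# stdout: reference, Mash distance, approximate ANI in percent
# Note  : sort -g (general numeric) is used so values such as 1e-05 sort correctly.
nearest_neighbors() {
    local n="${1:-10}"
    sort -t "$(printf '\t')" -k3,3g | head -n "$n" |
        awk -F '\t' '{ printf "%s\t%s\t%.2f\n", $1, $3, 100 * (1 - $3) }'
}

# Example with placeholder file names:
# mash dist database.msh query.fna | nearest_neighbors 10
```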

Fast genome classifier addresses the "Phylophenetic Species Concept" by testing the following hypotheses:

Genomic coherence, measured through Mash genomic distance and ANI

The phylogenetic hypothesis of monophyly

Molecular speciation under the Poisson Tree Processes model
@@ Ayixon Sánchez-Reyes   ayixon@gmail.com @@

Computational Microbiology   

Microbiological Observatory 

Institute of Biotechnology, UNAM, Cuernavaca, MEXICO 

VERSION=1.0. Written by Ayixon Sánchez Reyes    
###########################################################################################################               
