Automates running of OrthoMCL software from http://orthomcl.org/orthomcl/
The brief overview of running the OrthoMCL pipeline is as follows:
-
Run the following command to setup the database, verify the setup and generate an OrthoMCL configuration file.
perl scripts/orthomcl-setup-database.pl --user orthomcl_database_user --password orthomcl_database_password --host orthomcl_database_host --database orthomcl_database --outfile configure_outfile.conf [--no-create-database]
-
Run the following command to start OrthoMCL.
perl scripts/orthomcl-pipeline.pl -i input/ -o output/ -m orthomcl.conf --nocompliant
Where
input/
contains a set of gene annotations in FASTA format, one file per genome (e.g.genome1.fasta
,genome2.fasta
, must end in .fasta),output/
is the location to store the OrthoMCL output files,orthomcl.conf
is the OrthoMCL configuration file generated in step 2, and--nocompliant
adjusts gene names in fasta files to make them unique.
A walkthrough of using the OrthoMCL pipeline on example data can be found at https://github.com/apetkau/microbial-informatics-2014/tree/master/labs/orthomcl and a virtual machine containing a pre-installed version of the OrthoMCL pipeline can be found at https://www.corefacility.ca/wiki/bin/view/BioinformaticsWorkshop/Software2014.
Please see the Installation documentation for details on how to install.
Usage: orthomcl-pipeline -i [input dir] -o [output dir] -m [orthmcl config] [Options]
Options:
-i|--input-dir: The input directory containing the files to process.
-o|--output-dir: The output directory for the job.
-s|--split: The number of times to split the fasta files for blasting
-c|--config: The main config file (optional, overrides default config).
-m|--orthomcl-config: The orthomcl config file
--compliant: If fasta data is already compliant (headers match, etc) (default).
--nocompliant: If fasta data is not already compliant (headers match, etc).
--print-config: Prints default config file being used.
--print-orthomcl-config: Prints example orthomcl config file.
--yes: Automatically answers yes to every question (could overwrite/delete old data).
--scheduler: Defined scheduler (sge or fork).
--no-cleanup: Does not remove temporary tables from database.
-h|--help: Show help.
Examples:
orthomcl-pipeline -i input/ -o output/ -m orthomcl.config
Runs orthomcl using the input fasta files under input/ and orthomcl.confg as config file.
Places data in output/. Gets other parameters (blast, etc) from default config file.
orthomcl-pipeline -i input/ -o output/ -m orthomcl.config -c orthomcl-pipeline.conf
Runs orthomcl using the given input/output directories. Overrides parameters (blast, etc)
from file orthomcl-pipeline.conf.
orthomcl-pipeline --print-config
Prints default orthomcl-pipeline.conf config file (which can then be changed).
orthomcl-pipeline --print-orthomcl-config
Prints orthomcl example config file which must be changed to properly run.
orthomcl-pipeline -i input/ -o output/ -m orthomcl.confg --compliant
Runs orthmcl with the given input/output/config files.
Skips the orthomclAdjustFasta stage on input files.
orthomcl-pipeline -i input/ -o output/ -m orthomcl.confg --no-cleanup
Runs orthmcl with the given input/output/config files.
Does not cleanup temporary tables.