AGORA stands for “Algorithm for Gene Order Reconstruction in Ancestors” and was developed by Matthieu Muffato in the DYOGEN Laboratory at the École normale supérieure in Paris in 2008.
AGORA is used to generate ancestral genomes for the Genomicus online server for gene order comparison, and has been in constant use in the group since.
This code may be freely distributed and modified under the terms of the GNU General Public License version 3 (GPL v3) and the CeCILL licence version 2 of the CNRS. These licences are contained in the files:
- LICENSE-GPL.txt (or on www.gnu.org)
- LICENCE-CeCILL.txt (or on www.cecill.info)
Copyright for this code is held jointly by the Dyogen (DYnamic and Organisation of GENomes) team of the Institut de Biologie de l'Ecole Normale Supérieure (IBENS) 46 rue d'Ulm Paris, the European Bioinformatics Institute outstation of the European Molecular Biology Laboratory, Genome Research Ltd, and the individual authors.
- Copyright © 2006-2022 IBENS/Dyogen : Alexandra LOUIS, Thi Thuy Nga NGUYEN, Matthieu MUFFATO and Hugues ROEST CROLLIUS
- Copyright © 2020-2021 EMBL-European Bioinformatics Institute
- Copyright © 2021-2022 Genome Research Ltd
Email agora {at} bio {dot} ens {dot} psl {dot} eu
Matthieu Muffato, Alexandra Louis, Nga Thi Thuy Nguyen, Joseph M. Lucas, Camille Berthelot, Hugues Roest Crollius. Reconstruction of hundreds of reference ancestral genomes across the eukaryotic kingdom. Nat Ecol Evol (Jan 2023).
To simplify deployment, AGORA already embeds a modified version of LibsDyogen version 1.0 (6/11/2015), a Python library for bioinformatics and comparative genomics developed by the same group.
AGORA is written in Python 3, which is widely available. You can create a Python 3 environment with all the dependencies using conda:
conda env create --file conda_env.yml
Alternatively you can add the required dependencies to an existing environment (e.g. a Python virtualenv):
pip install -r requirements.txt
AGORA is compatible with PyPy (an alternative, faster implementation of Python), which significantly speeds up the reconstructions at the cost of higher memory use.
Once everything is installed, run this to check the installation:
./checkAgoraIntegrity.sh
It should run for a few minutes and end with this message in green:
The ancestral genomes are available in tmp/ancGenomes/
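After the check completes, it can be handy to confirm programmatically that the output directory was populated. The sketch below only lists whatever files it finds under tmp/ancGenomes/ (the path from the message above). It makes no assumption about how AGORA names those files, and the helper name list_output_files is hypothetical, not part of AGORA.

```python
from pathlib import Path


def list_output_files(directory="tmp/ancGenomes"):
    """Return the sorted relative paths of all regular files under `directory`.

    Hypothetical helper, not part of AGORA. Returns an empty list if the
    directory does not exist.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    files = list_output_files()
    if files:
        print(f"{len(files)} file(s) found in tmp/ancGenomes/:")
        for name in files:
            print("  " + name)
    else:
        print("No files found in tmp/ancGenomes/; the check may not have completed.")
```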
In a nutshell, you need to gather:
- a species tree (e.g. species-tree.nwk)
- the list of genes of each species (e.g. matching the pattern genes/genes.%s.list)
- gene trees (e.g. gene-trees.nhx), or orthology groups for each ancestor (e.g. matching the pattern orthologyGroups/orthologyGroups.%s.list)

and then try one of these:
src/agora-basic.py species-tree.nwk gene-trees.nhx genes/genes.%s.list
src/agora-basic.py species-tree.nwk orthologyGroups/orthologyGroups.%s.list genes/genes.%s.list
If the ancestral genomes are too fragmented, run src/agora-generic.py instead of src/agora-basic.py.
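Before launching a reconstruction, a quick pre-flight check can catch missing input files. The sketch below assumes that the %s placeholder in a pattern such as genes/genes.%s.list is replaced by a species (or ancestor) name, which this README does not spell out. The species names in the usage example are placeholders, and the function names expand_pattern, find_missing and build_command are hypothetical, not part of AGORA. The command it builds is simply the first invocation shown above, expressed as an argument list.

```python
from pathlib import Path


def expand_pattern(pattern, names):
    """Map each name to the path obtained by substituting it for %s.

    Assumption: the pattern contains exactly one %s placeholder.
    """
    if pattern.count("%s") != 1:
        raise ValueError(f"expected exactly one %s in pattern: {pattern!r}")
    return {name: pattern.replace("%s", name) for name in names}


def find_missing(pattern, names, root="."):
    """Return the names whose expanded path is not an existing file under root."""
    paths = expand_pattern(pattern, names)
    return [name for name, path in paths.items() if not (Path(root) / path).is_file()]


def build_command(species_tree, families, genes_pattern, script="src/agora-basic.py"):
    """Build the argument list for the invocations shown above.

    `families` is either the gene-trees file or the orthology-groups pattern.
    """
    return [script, species_tree, families, genes_pattern]


if __name__ == "__main__":
    species = ["SpeciesA", "SpeciesB"]  # placeholder names; use those in your species tree
    missing = find_missing("genes/genes.%s.list", species)
    if missing:
        print("Missing gene lists for:", ", ".join(missing))
    else:
        cmd = build_command("species-tree.nwk", "gene-trees.nhx", "genes/genes.%s.list")
        print("Ready to run:", " ".join(cmd))
```

The returned list can be passed to subprocess.run to launch AGORA from Python. Setting script to src/agora-generic.py should give the alternative mentioned above, assuming it accepts the same arguments, as the text suggests.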
Check out our user manual for more information about the input file formats, what these two scripts do, and how to tune AGORA even further. The manual is also available as docx and pdf.