Pipeline for detecting and annotating genomic islands and relationships between the respective genomes

IslandCompare

Genomic island prediction software developed to facilitate the analysis of microbial population datasets. IslandCompare is designed to process sets of microbial genomes and present genomic island content with an interactive visual to enable exploration of cross-genome genomic island content.

IslandCompare consists of nothing more than a Galaxy workflow JSON file and a client-side-only web UI that invokes the workflow via Galaxy's API. A command line interface is also available; it talks to Galaxy's API to invoke the same workflow.
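As a rough illustration of what "invoking the workflow via Galaxy's API" involves, the sketch below builds (but does not send) an HTTP request to Galaxy's workflow invocation endpoint. It is not the IslandCompare CLI or UI code. The server URL, API key, workflow ID, history ID and input mapping are all hypothetical placeholders, and the input labels the IslandCompare workflow actually expects are not shown here.

```python
import json
import urllib.request


def build_invocation_request(galaxy_url, api_key, workflow_id, history_id, inputs):
    """Build (without sending) a POST request asking Galaxy to invoke a workflow."""
    url = f"{galaxy_url.rstrip('/')}/api/workflows/{workflow_id}/invocations"
    payload = {"history_id": history_id, "inputs": inputs}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Galaxy accepts the user's API key in the x-api-key header.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_invocation_request(
    "https://galaxy.example.org",
    "API_KEY",
    "WORKFLOW_ID",
    "HISTORY_ID",
    {"0": {"src": "hdca", "id": "COLLECTION_ID"}},
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it; that step is left out so the sketch can be run without a Galaxy server.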

IslandCompare operates on GenBank or EMBL formatted data. It will attempt to stitch together draft genomes, as some tools do not work with multi-contig datasets. It will also accept a pre-constructed phylogenetic tree in Newick format. The resulting output includes a GFF3 file containing all of the results, along with the generated Newick file, any stitched datasets, and a GFF3 file containing only the genomic islands.
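Because the results are delivered as GFF3, they can be inspected with a few lines of code. The sketch below counts records of one feature type per sequence in a GFF3 stream. The sample records, the sequence names, and the feature type `genomic_island` are assumptions made for illustration; check the type column of an actual IslandCompare output file before relying on it.

```python
import io
from collections import Counter


def count_features(gff3_lines, feature_type):
    """Count GFF3 records of one feature type per sequence ID."""
    counts = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; no more feature records
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed GFF3 record
        seqid, _source, ftype = columns[:3]
        if ftype == feature_type:
            counts[seqid] += 1
    return counts


# Made-up sample data, not real IslandCompare output.
SAMPLE = """##gff-version 3
genomeA\texample\tgenomic_island\t1000\t5000\t.\t+\t.\tID=gi1
genomeA\texample\tgenomic_island\t9000\t12000\t.\t-\t.\tID=gi2
genomeB\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
genomeB\texample\tgenomic_island\t2000\t4000\t.\t+\t.\tID=gi3
"""

island_counts = count_features(io.StringIO(SAMPLE), "genomic_island")
print(dict(island_counts))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` wrapper.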

Use

IslandCompare is publicly hosted for your use at https://islandcompare.ca, where you can upload data, run an analysis, and visualize the result.

If you prefer to deploy your own instance of IslandCompare, a containerized deployment of Galaxy is available along with scripts to automatically deploy the IslandCompare workflow and dependencies. See the following section (Installation) for more information. The primary intended means of interacting with a local deployment of IslandCompare is via the command line interface.

Installation

Automated

Automated deployments were prepared using Terraform. See ./deployment/README.md for more information regarding deployment and running an analysis.

Manual

  • Install and configure a Galaxy instance. The minimum required version is Galaxy 20.09.
  • Install CVMFS and configure it for the MicrobeDB database.
  • Download the workflow and import it via Galaxy's web interface.
  • Publicly share the workflow via the workflow settings.
  • Manually install all tools. See http://github.com/brinkmanlab/galaxy-tools for instructions.
  • Install toolshed.g2.bx.psu.edu/brinkmanlab/microbedb/2f6ef3a184df
  • The visualization plugin must be installed into Galaxy. See the multiviz repo for more information.
  • Execute the rgi_database_builder data manager tool from the Local Data admin panel. Name: "latest", URL: "https://card.mcmaster.ca/latest/data".
  • Execute the microbedb_all_fasta data manager tool from the Local Data admin panel. Builds: true, DB: "path/to/cvmfs/microbedb.brinkmanlab.ca/mount".
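Before working through the manual steps, it can be worth confirming that the target instance meets the Galaxy 20.09 minimum. The helper below is a minimal sketch that compares a release string against that minimum. It assumes the instance reports its release as a `YY.MM`-style string, such as the `version_major` field returned by Galaxy's `/api/version` endpoint; fetching that value is left out so the example runs offline.

```python
def meets_minimum_version(version_major, minimum="20.09"):
    """Return True if a Galaxy release string such as '21.01' is >= minimum."""

    def as_tuple(version):
        # "20.09" -> (20, 9), so comparison is numeric rather than lexical.
        return tuple(int(part) for part in version.split("."))

    return as_tuple(version_major) >= as_tuple(minimum)


# Example release strings for illustration.
for release in ("20.05", "20.09", "23.0"):
    status = "ok" if meets_minimum_version(release) else "too old"
    print(release, status)
```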

Front-end

See ./ui/README.md for instructions to build the IslandCompare website.

Repository layout

  • ./ui - Web front end source code. See ./ui/README.md.
  • ./docs - High level documentation of the workflow
  • ./workflow/*.rule - Rules used in the Apply rules to collection tool throughout the workflow
  • ./workflow/prepare_worflow - Script that cleans up workflows downloaded from Galaxy and copies them into the repository
  • ./workflow/workflow_notes - Text file documenting various settings of the tools throughout the workflow for reproduction purposes
  • ./workflow/scripts/*.gawk - All scripts used in the awkscript tool throughout the workflow
  • ./workflow/workflows/*.ga - Galaxy workflows and subworkflows
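The `.ga` files are JSON documents, so they can be inspected directly, for example to see which tools a manual installation needs to provide. The sketch below collects the tool IDs referenced by a workflow's steps. It assumes the usual exported layout, with a top-level `steps` mapping whose entries carry a `tool_id` and with subworkflows embedded under a `subworkflow` key. The sample workflow and its tool IDs are invented for illustration and are not taken from this repository.

```python
import json


def required_tools(workflow):
    """Return the sorted, de-duplicated tool IDs used by a workflow's steps."""
    tool_ids = set()
    for step in workflow.get("steps", {}).values():
        tool_id = step.get("tool_id")
        if tool_id:  # input steps have a null tool_id
            tool_ids.add(tool_id)
        nested = step.get("subworkflow")
        if isinstance(nested, dict):
            # Assumption: embedded subworkflows use the same layout.
            tool_ids.update(required_tools(nested))
    return sorted(tool_ids)


# Invented sample standing in for the contents of a .ga file.
SAMPLE_GA = json.loads("""
{
  "name": "example",
  "steps": {
    "0": {"type": "data_input", "tool_id": null},
    "1": {"type": "tool", "tool_id": "example_tool_a"},
    "2": {"type": "subworkflow", "tool_id": null,
          "subworkflow": {"steps": {
            "0": {"type": "tool", "tool_id": "example_tool_b"},
            "1": {"type": "tool", "tool_id": "example_tool_a"}}}}
  }
}
""")

tools = required_tools(SAMPLE_GA)
print(tools)
```

For a real workflow, replace the sample with `json.load(open(path))` on one of the `.ga` files.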

Deployment

Terraform is used to deploy the various resources needed to run Galaxy to the cloud provider of choice.

  • ./destinations - Terraform modules responsible for deployment into the various providers
  • ./deployment - Usage examples for the destination modules

Notes

  • See workflow_notes for details of the configuration settings of each tool in the workflow.
  • ./.github/workflows/nodejs.yml tells the GitHub CI how to automatically deploy the front-end to preconfigured URLs. Changes pushed to ./ui/ are built and deployed automatically.
