This repository lets you reproduce the analysis in the poreFUME paper.
Run run.sh to set up the analysis. This will:
- Clone the poreFUME repository.
- Download the input nanopore, PacBio and Sanger sequence data files and raw nanopore data from ENA
- Download the data files already processed by poreFUME. These are not strictly necessary, but they allow the user to skip the poreFUME pipeline itself.
- Run the install.sh script of poreFUME, which takes care of the daligner, DAZZ_DB, POA and nanocorrect dependencies.
Next, you can open the calculateResistome.ipynb notebook to run poreFUME and some auxiliary analyses (e.g. sequence identity to the Sanger set), or go directly to analyzeResistome.ipynb to reproduce the figures in the paper.
The minimal requirements are Python 2.7, pandas, numpy and biopython, as described in the poreFUME install document.
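Before opening the notebooks, it can help to confirm that the Python packages listed above are importable. The snippet below is a minimal sketch and not part of poreFUME: the helper name find_missing and the list REQUIRED_MODULES are made up for illustration, and it assumes the usual import names (biopython is imported as Bio). It does not check package versions or the external tools handled by install.sh.

```python
from __future__ import print_function

import importlib
import sys

# Import names of the packages listed above (assumption: biopython is
# imported as "Bio"; pandas and numpy use their package names).
REQUIRED_MODULES = ["pandas", "numpy", "Bio"]


def find_missing(modules):
    """Return the module names from `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Report the interpreter version so it can be compared with the
    # Python 2.7 requirement stated above.
    print("Python version: %d.%d" % sys.version_info[:2])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If any module is reported missing, the poreFUME install document remains the authoritative place to look for how to install it.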
poreFUME makes use of the CARD database, so when using it please cite McArthur et al. 2013. The Comprehensive Antibiotic Resistance Database. Antimicrobial Agents and Chemotherapy, 57, 3348-3357. Furthermore, Nanocorrect and Nanopolish are used, which can be cited as Loman NJ, Quick J, Simpson JT: A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat Methods 2015, 12:733–735.
Check the poreFUME README and poreFUME install documents for more details and specific dependencies of poreFUME.