GitHub - C3BI-pasteur-fr/taxodb_ncbi: NCBI Taxonomy Database formater

taxodb_ncbi.py is a simple python script used to format the NCBI taxonomy Database. It requires bsddb3 python library and Berkeley DB library to work.

INSTALL

Install Berkeley DB

Mac OSX

brew install berkeley-db4

Ubuntu/Debian

sudo apt-get install libdb-dev

CentOS

sudo yum install libdb-devel

Install bsddb3

pip install bsddb3

Install taxodb_ncbi.py

python setup.py install

GETTING DATA

taxodb_ncbi.py only requires 'Taxonomy nodes' (nodes.dmp) and 'Taxonomy names' (names.dmp) files to work. These files are provided within NCBI Taxonomy database.

Download required files from NCBI taxonomy database:

  $ wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz
  $ tar zxf taxdump.tar.gz

USAGE

$ python taxodb_ncbi.py -h
usage: taxodb_ncbi.py [-h] -n file -d file [-k File] [-t File] [-b File]
                      [-f string]

program uses to format the NCBI Taxonomy Database

optional arguments:
  -h, --help            show this help message and exit

Options:
  -n file, --names file
                        names.dmp from NCBI taxonomy databank (default: None)
  -d file, --nodes file
                        nodes.dmp from NCBI taxonomy databank (default: None)
  -k File, --flatdb File
                        Output file: flat databank format. (default: None)
  -t File, --tab File   Output file: tabulated format. Organism with
                        classification. (default: None)
  -b File, --bdb File   Output file: Berleley db format (default: None)
  -f string, --format string
                        By default, reports only full taxonomy ie taxonomies
                        that have 'species', 'subspecies' or 'no rank' at the
                        final position. Otherwise, reports all taxonomies even
                        if they are partial (default: full)

RUNNING

Create Berkeley DB database and associated databank and tabulates files:

python taxodb_ncbi.py -n names.dmp -d nodes.dmp -k taxodb_ncbi.out -t taxodb_ncbi.out -b taxodb_ncbi.bdb

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
src		src
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

INSTALL

GETTING DATA

USAGE

RUNNING

About

Releases 8

Packages

Contributors 3

Languages

License

C3BI-pasteur-fr/taxodb_ncbi

Folders and files

Latest commit

History

Repository files navigation

INSTALL

GETTING DATA

USAGE

RUNNING

About

Resources

License

Stars

Watchers

Forks

Releases 8

Packages 0

Contributors 3

Languages

Packages