Bacterial GTDB-Tk classification file does not exist error when using gtdb_to_ncbi_majority_vote.py #595

Open
schmittel opened this issue Jun 22, 2024 · 1 comment
Labels
error Help required for a GTDB-Tk error.

schmittel commented Jun 22, 2024

Hi,

I'm running into the issue described in the title. Here's the output I'm getting from gtdb_to_ncbi_majority_vote.py, including the parameters I used:

gtdb_to_ncbi_majority_vote.py v0.2.0: Translate GTDB to NCBI classification via majority vote.
  by Donovan Parks (donovan.parks@gmail.com)

[2024-06-22 07:40:06] INFO: GTDB to NCBI majority vote v0.2.0
[2024-06-22 07:40:06] INFO: gtdb_to_ncbi_majority_vote.py --gtdbtk_output_dir /gtdbtk/metagenomes_markers/classify --output_file /ncbi_conversion/gtdb_to_ncbi.txt --bac120_metadata_file /ncbi_conversion/bac120_metadata.tsv
[2024-06-22 07:40:06] INFO: Parsing GTDB-Tk classifications:
[2024-06-22 07:40:06] WARNING: Bacterial GTDB-Tk classification file does not exist.
[2024-06-22 07:40:06] WARNING: Assuming there are no bacterial genomes to reclassify.
[2024-06-22 07:40:06] INFO:  - identified 0 archaeal classifications
[2024-06-22 07:40:06] INFO:  - identified 0 bacterial classifications
[2024-06-22 07:40:06] INFO: Identifying GTDB-Tk classification trees:
[2024-06-22 07:40:06] INFO:  - identified 0 bacterial tree(s)
[2024-06-22 07:40:06] INFO: Parsing NCBI taxonomy from GTDB metadata files:
[2024-06-22 07:40:06] INFO: Processing bacterial metadata file.
[2024-06-22 07:40:22] INFO:  - read NCBI taxonomy for 584,382 genomes
[2024-06-22 07:40:22] INFO:  - identified 107,235 GTDB species clusters
[2024-06-22 07:40:22] INFO:  - identified genomes in 4,896 GTDB families
[2024-06-22 07:40:22] INFO: Determining NCBI majority vote classifications for GTDB species clusters.
[2024-06-22 07:40:24] INFO:  - identified 107,235 GTDB species clusters with an NCBI classification
[2024-06-22 07:40:24] INFO: Determining NCBI majority vote classification for each genome:
[2024-06-22 07:40:24] INFO: Results written to: /ebio/abt3_scratch/jmarsh/tract_score3/ncbi_conversion/gtdb_to_ncbi.txt
[2024-06-22 07:40:25] INFO: Done.

Here are the contents of my classify folder:

/gtdbtk/metagenomes_markers/classify/gtdbtk.ar53.classify.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.ar53.summary.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.1.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.2.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.3.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.4.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.5.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.6.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.7.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.classify.tree.8.tree
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.summary.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.bac120.tree.mapping.tsv
/gtdbtk/metagenomes_markers/classify/gtdbtk.backbone.bac120.classify.tree

I can't figure out why this isn't working. I get the same warning with gtdb_to_ncbi_majority_vote.py versions 0.2.0 and 0.2.1. Many thanks for your help.
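
One way to narrow this down is to check where the summary files actually sit relative to the path passed as --gtdbtk_output_dir. The sketch below is only a diagnostic aid and not part of GTDB-Tk: the function name `locate_summaries` and the two locations it checks (the directory itself and a `classify/` subdirectory beneath it) are assumptions for illustration. The file names come from the listing above, with `gtdbtk` assumed as the prefix.

```python
import sys
from pathlib import Path

# Summary file names as they appear in the directory listing above.
SUMMARY_NAMES = {
    "bacterial": "{prefix}.bac120.summary.tsv",
    "archaeal": "{prefix}.ar53.summary.tsv",
}


def locate_summaries(output_dir, prefix="gtdbtk"):
    """Return, per domain, the candidate summary paths that exist on disk.

    The candidate locations are an assumption made for this check only:
    the directory itself and a classify/ subdirectory beneath it.
    """
    root = Path(output_dir)
    candidates = [root, root / "classify"]
    found = {}
    for domain, template in SUMMARY_NAMES.items():
        name = template.format(prefix=prefix)
        found[domain] = [str(d / name) for d in candidates if (d / name).is_file()]
    return found


if __name__ == "__main__" and len(sys.argv) == 2:
    # Pass the same path that was given to --gtdbtk_output_dir.
    for domain, paths in locate_summaries(sys.argv[1]).items():
        print(f"{domain}: {paths if paths else 'no summary file found'}")
```

If the bacterial summary only shows up directly in the directory that was passed, it may be worth retrying with the parent GTDB-Tk output directory as --gtdbtk_output_dir. Whether the script expects that layout is an assumption to confirm against its help text.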

@schmittel schmittel added the error Help required for a GTDB-Tk error. label Jun 22, 2024
donovan-h-parks (Collaborator) commented

Hi,

Sorry for the slow reply. Were you able to resolve this issue?

Cheers,
Donovan
