Error in gff3_fix #128

Open
DiegoSafian opened this issue Dec 27, 2022 · 1 comment

Comments

@DiegoSafian

DiegoSafian commented Dec 27, 2022

Hi,
After successfully using gff3_QC, gff3_fix is giving me the following error:

(genometools) [safiand@login001 grass]$ gff3_fix -qc_r test.txt -g turneri_annotation.gff3 -og new_corrected.gff3
INFO     Checking QC report file (test.txt)...
INFO     Checking GFF3 file (turneri_annotation.gff3)...
INFO     Reading QC report file: (test.txt)...
INFO     Reading GFF3 file: (turneri_annotation.gff3)...
Traceback (most recent call last):
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/bin/gff3_fix", line 8, in <module>
    sys.exit(script_main())
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/bin/gff3_fix.py", line 95, in script_main
    gff3_fix.fix.main(gff3=gff3, output_gff=args.output_gff, error_dict=error_dict, line_num_dict=line_num_dict, logger=logger_stderr)
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 692, in main
    split(gff3=gff3, error_list=error_dict[error_code], logger=logger)
  File "/camp/home/safiand/home/users/safiand/.conda/envs/genometools/lib/python3.10/site-packages/gff3tool/lib/gff3_fix/fix.py", line 165, in split
    childrenlist.append(c1['attributes']['ID'])
KeyError: 'ID'

So I tried gff3_ID_generator.py, but it also gives me a similar error:

(genometools) [safiand@login001 grass]$ python gff3_ID_generator.py -g turneri_annotation.gff3 -og new.gff3
INFO     Reading input gff3 file: (turneri_annotation.gff3)
INFO     Generate new ID for features in (turneri_annotation.gff3)
Traceback (most recent call last):
  File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 333, in <module>
    main(in_gff=args.gff, merge_report=args.merge_report, out_merge_report=args.out_merge_report, out_gff=args.output_gff, uuid_on=args.universally_unique_identifier, prefix=args.idprefix, digitlen=args.digitlen, report=args.report, alias=args.alias)
  File "/camp/lab/cardoso-moreiam/home/users/safiand/genome_annotation/turneri/busco/turneri_rna_prot_multiples_species/grass/gff3_ID_generator.py", line 238, in main
    ID_dict[child['attributes']['ID']] = [newcID]
KeyError: 'ID'

What can I do to solve this problem? Am I doing something wrong?

My GFF3 file looks like this:

(genometools) [safiand@login001 grass]$ head turneri_annotation.gff3 -n 20
# gffread augustus.hints.gtf -o turnerifiltered.gff3 --merge -L -g GCA_922788865.1_HVK001PTURNERI_genomic.shortID.fna
# gffread v0.11.6
##gff-version 3
CAKLNU010000942.1       gffcl   locus   724     2835    .       +       .       ID=RLOC_00000001;transcripts=jg1.t1
CAKLNU010000942.1       AUGUSTUS        transcript      724     2835    .       +       .       ID=jg1.t1;geneID=jg1;locus=RLOC_00000001
CAKLNU010000942.1       AUGUSTUS        CDS     724     1083    .       +       0       Parent=jg1.t1
CAKLNU010000942.1       AUGUSTUS        CDS     1181    1625    0.34    +       0       Parent=jg1.t1
CAKLNU010000942.1       AUGUSTUS        CDS     2270    2835    0.42    +       2       Parent=jg1.t1
CAKLNU010000422.1       gffcl   locus   1528    9153    .       +       .       ID=RLOC_00000002;transcripts=jg2.t1
CAKLNU010000422.1       AUGUSTUS        transcript      1528    9153    .       +       .       ID=jg2.t1;geneID=jg2;locus=RLOC_00000002
CAKLNU010000422.1       AUGUSTUS        CDS     1528    1574    0.69    +       1       Parent=jg2.t1
CAKLNU010000422.1       AUGUSTUS        CDS     1718    1788    0.68    +       2       Parent=jg2.t1
CAKLNU010000422.1       AUGUSTUS        CDS     9010    9153    0.6     +       0       Parent=jg2.t1
CAKLNU010000746.1       gffcl   locus   834     3644    .       -       .       ID=RLOC_00000003;transcripts=jg3.t1
CAKLNU010000746.1       AUGUSTUS        transcript      834     3644    .       -       .       ID=jg3.t1;geneID=jg3;locus=RLOC_00000003
CAKLNU010000746.1       AUGUSTUS        CDS     834     878     0.96    -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     988     1011    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     1310    1336    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     2483    2518    1       -       2       Parent=jg3.t1
CAKLNU010000746.1       AUGUSTUS        CDS     2597    2695    1       -       2       Parent=jg3.t1

Thanks!

@mpoelchau
Contributor

@DiegoSafian apologies, I completely missed this issue. Can you try removing the locus features from your gff3 file, to see if that is what the ID generator is erroring out on?
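
For anyone wanting to try this, here is a minimal sketch of one way to drop the locus features before rerunning the tools. It is not part of gff3tool; the function names `drop_feature_type` and `types_missing_id` are hypothetical, and it assumes the file is tab-delimited with the feature type in column 3 and attributes in column 9, as in standard GFF3. The second helper is an extra diagnostic: since both tracebacks end in `KeyError: 'ID'` while reading a child feature, and the CDS lines in the excerpt carry only `Parent=`, it may help to see which feature types lack an `ID` attribute. Whether the locus lines or the ID-less CDS lines are the actual cause has not been confirmed in this thread.

```python
def drop_feature_type(lines, feature_type="locus"):
    """Yield GFF3 lines, skipping features whose type (column 3) matches.

    Assumption: columns are tab-separated, as in standard GFF3.
    """
    for line in lines:
        # Pass comments, directives, and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3 and cols[2] == feature_type:
            continue  # skip this feature
        yield line


def types_missing_id(lines):
    """Return {feature_type: count} for features with no ID attribute."""
    counts = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column feature line
        # Collect the attribute keys from "key=value;key=value" in column 9.
        keys = {
            field.split("=", 1)[0].strip()
            for field in cols[8].split(";")
            if field.strip()
        }
        if "ID" not in keys:
            counts[cols[2]] = counts.get(cols[2], 0) + 1
    return counts


# Example usage (the output file name is illustrative):
# with open("turneri_annotation.gff3") as src, open("no_locus.gff3", "w") as dst:
#     dst.writelines(drop_feature_type(src))
# with open("no_locus.gff3") as check:
#     print(types_missing_id(check))
```

If the second helper reports a nonzero count for CDS, that would be consistent with the traceback, but it is only a hint and not a confirmed diagnosis.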
