Many ChEBI chemicals not available for metabolic models #96

Open
marcfeuermann opened this issue May 6, 2022 · 17 comments

Comments

@marcfeuermann

Hello,
While creating models for secondary metabolite biosynthesis, I realized that some ChEBI chemicals are not available in the tool.
As an example, within the P. expansum patulin biosynthesis pathway, I cannot get:

CHEBI:5325 gentisyl alcohol
CHEBI:145109 (+)-isoepoxydon
CHEBI:145112 (E)-ascladiol
CHEBI:145110 phyllostine
CHEBI:145111 isopatulin

This is probably not limited to this pathway; many ChEBI chemicals are probably not yet available for creating models.
Thanks a lot.
Best regards,
Marc.

@cmungall
Member

cmungall commented May 6, 2022 via email

@deustp01

deustp01 commented May 9, 2022

> If we want to extend this

Unless the larger import of ChEBI into go-lego has some major technical cost (memory, performance, ...), my guess from the Reactome experience is that loading everything beats trying to anticipate user needs. We never succeeded at the anticipation part.

A local fix: our data model allows reference instances to be added to our central database one at a time. We have a series of wizards that let a curator supply a valid ID (ChEBI, UniProt, etc.); the wizard fetches the relevant information and adds it to the local project. When the project is saved, the new reference instance is added to the central repository and becomes accessible to all users. No clue here whether the OWL structure and the rest of Noctua are compatible with this.

@pgaudet

pgaudet commented Jun 2, 2022

Would it make sense to load the ChEBI pH 7.3 set?

https://ftp.expasy.org/databases/rhea/tsv/chebi_pH7_3_mapping.tsv

That's 90k terms though, it seems like a lot?

@cmungall
Member

cmungall commented Jun 2, 2022

I think supplementing imports/chebi_import with 7.3 terms plus their is-a ancestors is a good strategy.

@deustp01

deustp01 commented Jun 2, 2022

> think supplementing imports/chebi_import with 7.3 terms

An issue possibly to discuss with ChEBI is that the form of an ionizable molecule that is prevalent at pH 7.3 is not always tagged as such in ChEBI. Can these tags somehow magically be applied to all appropriate ChEBI instances?

@pgaudet

pgaudet commented Jun 24, 2022

@cmungall can this be done in the near future? Or not? Is this a lot of work?
This is a blocker for Marc's biosynthesis models.

Alternatively, could we manually upload all the Rhea pH 7.3 terms into the imports/chebi file?

Thanks, Pascale

@cmungall
Member

> An issue possibly to discuss with ChEBI is that the form of an ionizable molecule that is prevalent at pH 7.3 is not always tagged as such in ChEBI. Can these tags somehow magically be applied to all appropriate ChEBI instances?

It is Rhea that makes this file, and some of these are computed. E.g., all the entries Marc provides are marked computational.

@balhoff how much work would this be?

We want to take all entries in chebi_pH7_3_mapping.tsv plus their is-a ancestors only and supplement the go-lego file with this.
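
The selection described above (every entry in chebi_pH7_3_mapping.tsv plus is-a ancestors only) can be sketched roughly as follows. This is a minimal illustration, not the actual go-lego build step: the column name `CHEBI_PH7_3`, the helper names, and the toy IDs and hierarchy are all assumptions, and in practice the is-a links would come from ChEBI itself.

```python
import csv
import io


def read_ph73_targets(tsv_text, target_column="CHEBI_PH7_3"):
    """Collect the pH 7.3 ChEBI IDs listed in a mapping TSV.

    The column name is an assumption about the file layout.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {f"CHEBI:{row[target_column]}" for row in reader}


def with_isa_ancestors(seeds, isa_parents):
    """Return the seed terms plus every term reachable via is-a parent links."""
    keep = set()
    stack = list(seeds)
    while stack:
        term = stack.pop()
        if term in keep:
            continue  # already visited, skip to avoid repeated work
        keep.add(term)
        stack.extend(isa_parents.get(term, ()))
    return keep


# Toy data: these IDs and parent links are made up for illustration.
SAMPLE_TSV = "CHEBI\tCHEBI_PH7_3\tORIGIN\n1\t2\tcomputation\n3\t3\tchebi\n"
SAMPLE_ISA = {
    "CHEBI:2": ["CHEBI:10"],
    "CHEBI:3": ["CHEBI:10", "CHEBI:11"],
    "CHEBI:10": ["CHEBI:20"],
    "CHEBI:99": ["CHEBI:20"],  # not a pH 7.3 target, so it is not pulled in
}

terms = with_isa_ancestors(read_ph73_targets(SAMPLE_TSV), SAMPLE_ISA)
print(sorted(terms))
```

Only upward is-a links are followed, so siblings and descendants of the seed terms stay out of the import unless they are themselves listed in the mapping file.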

@balhoff
Member

balhoff commented Jun 27, 2022

This should be in the next go-lego snapshot.

@balhoff
Member

balhoff commented Jun 27, 2022

Actually that will be held up by geneontology/go-ontology#23567 geneontology/go-ontology#23568.

@pgaudet

pgaudet commented Jul 5, 2022

Is the blocking issue resolved? (I thought it was #23568)

Thanks, Pascale

@balhoff
Member

balhoff commented Jul 5, 2022

Thanks, I fixed the link to the issue. I just merged my PR, and the additional CHEBI terms should be in the next go-lego snapshot. Thanks for your help!

@cmungall
Member

cmungall commented Oct 11, 2022 via email

@deustp01

> curator guidance on picking chebi terms

Not sure you want this. Coming up with filters to get rid of ones irrelevant to biology seems hard and risky. Why not dump the whole collection of ChEBI chemicals into Noctua (with arrangements for periodic updates to capture changes in ChEBI)? Curators, just as they do now (and perhaps with explicit guidance pointing to Rhea as a resource on these issues), will need to figure out which ChEBI term represents the correct charge state and stereochemistry for their particular organism and environment.

@cmungall
Member

I think we do want to eliminate choices over protonation state and just select the pH 7.3 form.
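
As a rough sketch of what selecting the pH 7.3 form could look like on the curation side, the lookup below replaces a chosen ChEBI ID with its pH 7.3 counterpart when the mapping has one. The function name, the in-memory dict, and the IDs are hypothetical; nothing here reflects how Noctua actually handles this.

```python
def to_ph73_form(chebi_id, ph73_map):
    """Return the pH 7.3 form of chebi_id, or the ID unchanged if unmapped."""
    return ph73_map.get(chebi_id, chebi_id)


# Made-up mapping from any protonation state to its pH 7.3 form.
PH73_MAP = {"CHEBI:1": "CHEBI:2", "CHEBI:2": "CHEBI:2"}

# CHEBI:7 is absent from the mapping, so it passes through untouched.
normalized = [to_ph73_form(c, PH73_MAP) for c in ["CHEBI:1", "CHEBI:2", "CHEBI:7"]]
print(normalized)
```

Passing unmapped IDs through unchanged is one possible choice; rejecting them instead would be the stricter option if only pH 7.3 forms should ever be selectable.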

@deustp01

deustp01 commented Oct 11, 2022

As far as I can tell, pH 7.3 charge states are sometimes, but not always, noted in the ChEBI entries, so this will require some clean-up, either at ChEBI (always eager to get someone else to do the work) or magically, at import-to-Noctua time.

Maybe, for now, import everything because that already supports better, easier curation, and work on figuring out how to prune the list.

@cmungall
Member

cmungall commented Oct 11, 2022 via email

@deustp01

> conservative approach

This should not exclude anything we want, so it's a good start.
