Assess existence of experimental models #33
Comments
Found a treasure trove of data on experimental models for diseases (and perhaps specific phenotypes) on Monarch:
However, these files don't include gene-level info (which we would want if we have a particular gene therapy target in mind), so I'm checking whether I can extract that from the larger Monarch knowledge graph:
They also only provide MONDO IDs for each disease, so I need to find an effective way to map these back to the HPO/OMIM/DECIPHER/ORPH IDs provided by HPO. |
What's the argument against just using Mammalian Phenotype Ontology overlap? Also, here are some of the messages we sent relating to this previously:
Here's one of the genes that has a mouse model for respiratory failure: http://www.informatics.jax.org/reference/J:120296
Here's the list of Mammalian Phenotype Ontology genes (for respiratory failure): http://www.informatics.jax.org/mp/annotations/MP:0001953
Gene therapy for ABCA3 in respiratory failure is already being looked into: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8798122/ |
Several reasons:
* Monarch includes MPO, as well as other model organism databases beyond just mouse.
* We need to map MPO to HPO terms. UPHENO provides this, which is also integrated in Monarch.
|
Sounds good!
|
Preliminary summary plot showing the proportion of orthologous genes overlapping between HPO and non-human ontology databases (within a given phenotype), repeated across many phenotypes: Will include this in the final report as well as showing how we can use this to prioritise gene/phenotype-specific therapeutic targets. |
Great, didn’t think about looking at zebrafish models etc as well!
Can you explain the x-axis?
|
Sure!
Dividing one over the other thus gives you the proportion of HPO gene annotations recapitulated in the equivalent phenotype of another species. This proportion will be influenced by both evolutionary distance and how well studied each species is (notice the difference between mice and rats, despite the fact that they're equally related to humans). |
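For anyone following along, here is a minimal sketch of that proportion in Python. It assumes the numerator is the number of HPO genes whose orthologs are also annotated to the equivalent phenotype in the other species, and the denominator is the total number of HPO genes for that phenotype. The function name and gene symbols are made up for illustration, not taken from the actual analysis.

```python
def recapitulated_proportion(hpo_genes, model_orthologs):
    """Fraction of a phenotype's HPO gene annotations that also appear
    (as human orthologs) in the equivalent phenotype of another species.

    Assumption: both inputs are collections of human gene symbols, i.e. the
    model organism genes have already been converted to human orthologs.
    """
    hpo = set(hpo_genes)
    if not hpo:
        return 0.0  # no HPO annotations, so nothing to recapitulate
    return len(hpo & set(model_orthologs)) / len(hpo)


# Hypothetical annotations for a single phenotype
hpo_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
mouse_orthologs = {"GENE_A", "GENE_B", "GENE_C", "GENE_X"}
rat_orthologs = {"GENE_A"}

proportions = {
    "mouse": recapitulated_proportion(hpo_genes, mouse_orthologs),
    "rat": recapitulated_proportion(hpo_genes, rat_orthologs),
}
print(proportions)
```

In this toy case the better-annotated species scores higher even though both are equally related to humans, which is the same pattern as the mouse/rat difference in the plot.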
Here are some gene therapy target phenotypes identified by our previous analyses. The exact phenotypes will likely change once we add ChatGPT annotations to our filtering strategy with the round of enrichment results, but for now these can serve as an example, with the heatmap colored by the "equivalence score", which is essentially UPHENO's way of quantifying how well a phenotype matches up across species (on a scale from 0-1). Data comes from here.
Not sure exactly on what basis they computed Jaccard similarity, but I'll look into this some more.
upheno_top_targets_heatmap.pdf
Looks like UPHENO has been thinking about adding fly ontology mappings as well, though it seems there hasn't been any activity on this since 2016. Just pinged them to get an update: |
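As a reminder of what a Jaccard similarity measures in general, here is a small Python sketch. Which features UPHENO actually compares is not stated in this thread, so the "feature" sets below are purely hypothetical placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical feature sets describing a human and a mouse phenotype term
human_term_features = {"feature_1", "feature_2", "feature_3"}
mouse_term_features = {"feature_2", "feature_3", "feature_4"}

print(jaccard(human_term_features, mouse_term_features))
```

The score is 1 when the two sets are identical and 0 when they share nothing, which matches the 0-1 range of the scores shown in the heatmap.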
This HPO publication, in which they did the mapping with Exomiser.
Though this figure suggests there's already a mapping for fly and frog as well. I'll reach out to the HPO team to confirm where I might find this, and to confirm the methodology they used to do the phenotype mapping: |
@bschilder would you be up for a quick call on the matter? I will sort you out with fuzzy and proper matches as well. |
Absolutely! Thank you so much for reaching out! Setting up a time for us to meet. |
Met with @matentzn, who was extremely helpful in explaining the cross-species phenotype matching procedure to me and pointing me to some additional resources.
For mapping MONDO IDs in the Monarch models file, I'm switching to using this file, as it avoids issues observed here:
With these changes:
> library(HPOExplorer)
> model <- get_monarch("disease_to_model")
[100%] Downloaded 883280 bytes...
> model$db <- stringr::str_split(model$subject,":", simplify = TRUE)[,1]
> model <- map_mondo(dat = model,
+ input_col="object",
+ output_col="OMIM_ID",
+ to=c("OMIM","Orphanet"))
[100%] Downloaded 1082741 bytes...
476 / 5,154 (9.24%) OMIM_ID missing.
The only issue is that, as far as I can tell, MONDO doesn't seem to contain any mappings between MONDO IDs and DECIPHER IDs. DECIPHER IDs only make up a small fraction of the HPO annotations, but it would be nice to have a complete mapping nonetheless:
> phenos <- make_phenos_dataframe(add_disease_data = TRUE)
> phenos$disease_db <- stringr::str_split(phenos$disease_id,":", simplify = TRUE)[,1]
> table(phenos$disease_db) |
To summarise, the phenotype matching procedure is meant to capture semantic similarity using a semi-heuristic model (a combination of explicit rules and data-driven methods). Data inputs come from a variety of sources. Ultimately, they link together concepts (species, diseases, phenotypes, genes, pathways, etc.) in a knowledge graph derived from a mix of NLP queries to the published literature and other databases. @matentzn this is probably a poor attempt to explain this properly, but if there's a paper or docs page you could point me to, that would be quite helpful! Thanks! |
We have this for DECIPHER: https://github.com/monarch-initiative/mondo/blob/master/src/ontology/mappings/mondo_hasdbxref_decipher.sssom.tsv Which will do the job for you!
It's simpler than that.
I requested an FBcv profile for you here: monarch-initiative/monarch-semantic-similarity-profiles#16 So you can take a look at what it looks like. |
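For reference, a rough Python sketch of how the DECIPHER SSSOM mapping file linked above could be turned into a MONDO-to-DECIPHER lookup. It assumes the standard SSSOM TSV layout (a commented metadata header followed by `subject_id`, `predicate_id` and `object_id` columns) and that MONDO IDs sit in `subject_id`. The sample rows and IDs are made up, and the function name is hypothetical.

```python
import csv
import io

# Made-up sample in SSSOM TSV layout; the real file would be read from disk.
SAMPLE_SSSOM = (
    "# curie_map:\n"
    "#   MONDO: http://purl.obolibrary.org/obo/MONDO_\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "MONDO:9999901\toboInOwl:hasDbXref\tDECIPHER:901\tsemapv:UnspecifiedMatching\n"
    "MONDO:9999902\toboInOwl:hasDbXref\tDECIPHER:902\tsemapv:UnspecifiedMatching\n"
)


def read_sssom_mappings(handle):
    """Return {subject_id: [object_id, ...]} from an SSSOM TSV file handle."""
    # Skip the commented metadata header that precedes the column names.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    mapping = {}
    for row in reader:
        mapping.setdefault(row["subject_id"], []).append(row["object_id"])
    return mapping


mondo_to_decipher = read_sssom_mappings(io.StringIO(SAMPLE_SSSOM))
print(mondo_to_decipher)
```

Keeping a list per MONDO ID allows for one-to-many mappings, in case a single MONDO disease is cross-referenced to more than one DECIPHER entry.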
Ah, amazing! I had totally missed that because I was using this file, which I assumed included all the other ones:
I've implemented many of these functions within a new package for accessing/processing knowledge graphs in general (HPOExplorer was getting too bloated):
I was also just alerted to the
I've also begun exploring some of the graph query resources/tools you alerted me to on our call:
Ahhh, this makes so much more sense now! Thanks for explaining that in more detail, and for the paper (super interesting work!). Along those lines, I've found the
Thank you so much! I really appreciate this, and all your other help. |
Assess whether there is an existing experimental model for each candidate therapeutic target.
We can check this by seeing if there is an MPO or UPHENO annotation for the same phenotype.