This issue is about controlled dictionaries and about creating tabby input enriched by ontology lookup.
Current state
The current sfb tabby requires sample[organism] to be expressed as an ID in the NCBI organismal taxonomy, formatted as, e.g., NCBITaxon:9606.
This translates (by string substitution) to http://purl.obolibrary.org/obo/NCBITaxon_9606, which can be looked up (also via an API), e.g. in OLS: NCBITaxon:9606, yielding e.g. the label (i.e. Latin name) and the exact synonym (GenBank common name, i.e. English name).
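A minimal sketch of the string substitution described above. The function name `curie_to_iri` and the constant `OBO_PURL_BASE` are made up for illustration and are not part of datalad-tabby; the ontology lookup itself is deliberately left out.

```python
# Hypothetical helper, not an existing datalad-tabby function.
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_iri(curie: str) -> str:
    """Expand an OBO-style CURIE such as 'NCBITaxon:9606' into its PURL."""
    prefix, sep, local_id = curie.partition(":")
    # Reject values without a colon, or with an empty prefix or local ID.
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    # The OBO PURL joins prefix and local ID with an underscore.
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


print(curie_to_iri("NCBITaxon:9606"))
# prints http://purl.obolibrary.org/obo/NCBITaxon_9606
```

This only rewrites the identifier; it says nothing about whether the ID actually exists in the taxonomy.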
For feeding this info to the catalog, I like using the OpenMINDS controlled term for Species, because it has fields such as name (required), preferredOntologyIdentifier, and synonym. These map nicely onto Latin name, IRI, and English name, and make it easy to create a catalog template for displaying this information.
Consequently, the dataset attribute is currently modelled as follows in crc-schema-draft/src/sfb1451_schema.yaml (lines 89 to 101 in 3a233f5); note that using Species as the attribute IRI is probably not a good idea:
Classification of organism(s) associated with, or studied
for the dataset. One or more organisms can be given, one per
column. Organisms must be identified by their ID in the
NCBI organismal taxonomy, which can be searched at
https://www.ebi.ac.uk/ols4/ontologies/ncbitaxon. For
example, the identifier for human or homo sapiens is
NCBITaxon:9606. The column value should be NCBITaxon:9606 in
this case.
There is currently no range (or string pattern) defined, and there is no custom Species object definition.
Note: the same applies to sample[organismPart] / openminds:UBERONParcellation.
Questions
Should we define our own Species object (with an IRI pointing at OpenMINDS or not) with the three properties listed above?
Should we keep it as a string and just provide a string-matching pattern for validation?
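For the second option, a string-matching check could look roughly like the sketch below. The pattern is an assumption (prefix `NCBITaxon:` followed by digits only), and `is_valid_organism` is a hypothetical name; the schema currently defines no such pattern.

```python
import re

# Assumed pattern: the literal prefix followed by one or more digits.
NCBITAXON_PATTERN = re.compile(r"NCBITaxon:[0-9]+")


def is_valid_organism(value: str) -> bool:
    """Return True if the whole value looks like an NCBITaxon CURIE."""
    # fullmatch() rejects leading/trailing whitespace and partial matches.
    return NCBITAXON_PATTERN.fullmatch(value) is not None


print(is_valid_organism("NCBITaxon:9606"))   # True
print(is_valid_organism("NCBITaxon_9606"))   # False
```

A check like this validates only the form of the value, not whether the ID exists in the taxonomy.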
Thoughts
The problem is that a datalad-tabby convention could convert NCBITaxon:1234 into a full IRI, but it couldn't (and probably shouldn't) perform an ontology lookup. This ties into the question of which stage of our processing we are modelling. Without a lookup, it cannot produce a valid openMINDS Species object (the required name is missing).
I really like an OpenMINDS-like representation for feeding data that is based on a controlled dictionary into the catalog.
I am tempted to define my own Species object that would require only an IRI / preferred ontology identifier and leave the other fields optional. These fields could then be filled in during preparation for the catalog, and our schema would sort of live in the middle of the tabby-to-catalog process.
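To make the idea concrete, here is a rough sketch of such an object in Python rather than in the schema language. Everything in it is an assumption for illustration: the class layout, the `enrich` helper, and the shape of the lookup result (a plain dict with `label` and `synonym` keys) are hypothetical, and `synonym` is simplified to a single optional string.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Species:
    # Only the identifier is required at the tabby stage.
    preferred_ontology_identifier: str
    # Filled in later, during preparation for the catalog.
    name: Optional[str] = None
    synonym: Optional[str] = None

    def is_complete(self) -> bool:
        """True once the name required by openMINDS is present."""
        return self.name is not None


def enrich(species: Species, lookup: dict) -> Species:
    """Fill missing fields from a lookup result (hypothetical dict shape)."""
    return replace(
        species,
        name=species.name or lookup.get("label"),
        synonym=species.synonym or lookup.get("synonym"),
    )


raw = Species("http://purl.obolibrary.org/obo/NCBITaxon_9606")
full = enrich(raw, {"label": "Homo sapiens", "synonym": "human"})
print(raw.is_complete(), full.is_complete())  # False True
```

The point of the split is that the object is valid in our schema right after tabby conversion and only becomes an openMINDS-compatible record after the enrichment step.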