IDMP-523 - Challenges with the representation of a list for defining protein sequences #353

Open
ElisaKendall opened this issue May 2, 2023 · 2 comments

@ElisaKendall
Contributor

What is the correct relation between the protein sequence and the amino acid residues?

AlaLeuGlu a idmp-sub:ProteinSequence .
Ala/Leu/Glu a idmp-sub:AminoAcidResidue .

  1. AlaLeuGlu cmns-col:hasConstituent Ala, Leu, Glu .
  2. AlaLeuGlu cmns-col:hasMember Ala, Leu, Glu .

It can’t be both due to the disjointness between hasConstituent and hasMember.
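The conflict can be made concrete with a small check. This is a minimal sketch, assuming the disjointness is an OWL disjoint-object-properties axiom, so that no single (subject, object) pair may be linked by both properties. Triples are modelled as plain Python tuples, and the helper name `disjointness_violations` is hypothetical.

```python
# Triples are plain (subject, predicate, object) tuples; names mirror the issue text.
HAS_CONSTITUENT = "cmns-col:hasConstituent"
HAS_MEMBER = "cmns-col:hasMember"
RESIDUES = ("Ala", "Leu", "Glu")


def disjointness_violations(triples, prop_a=HAS_CONSTITUENT, prop_b=HAS_MEMBER):
    """Return the (subject, object) pairs that are linked by both properties."""
    pairs_a = {(s, o) for s, p, o in triples if p == prop_a}
    pairs_b = {(s, o) for s, p, o in triples if p == prop_b}
    return sorted(pairs_a & pairs_b)


# Option 1 and option 2 from the question, each on its own and then combined.
option_1 = [("AlaLeuGlu", HAS_CONSTITUENT, r) for r in RESIDUES]
option_2 = [("AlaLeuGlu", HAS_MEMBER, r) for r in RESIDUES]

print(disjointness_violations(option_1))             # either option alone is fine
print(disjointness_violations(option_1 + option_2))  # asserting both is not
```

Either option alone yields no violations; asserting both yields one violation per residue, which is why the two statements cannot hold for the same individuals.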

The protein (substance) is defined by its protein sequence. The protein specifies physical proteins, which are matter composed of gazillions (about 1E23/g) of scattered protein molecules.
The ala, leu or glu in the Dayhoff notation also designates a structure that defines a molecular group that is part of the protein molecule. All of them can be described by the structural representation:
X-NH-R-(C=O)-Y, where X and Y are wildcards (X for a hydrogen atom or the preceding chain, Y for an OH group or the continuing chain) and R is a wildcard defining the type of amino acid.

The first option seems the more wrong of the two, if we can say so:
it is the constituency of the structure, not the sequence itself, that has the amino acid as a constituent:

AlaLeuGlu-Constituency a idmp-sub:StructuralConstituency, cmns-col:Constituency .
AlaLeuGlu-Constituency cmns-dsg:defines AlaLeuGlu .
AlaLeuGlu-Constituency cmns-col:hasConstituent Ala,Leu,Glu .

However, AlaLeuGlu is also a cmns-strcol:List, therefore:
AlaLeuGlu cmns-col:hasMember Ala, Leu, Glu . (by being a cmns-col:Collection)
and
AlaLeuGlu cmns-col:hasConstituent AlaC, LeuC, GluC . (by being a cmns-strcol:List)
where AlaC is a cmns-strcol:IndexedConstituent.

Because of the disjoint-property axiom,
AlaC and Ala must be distinct individuals.
Instead we have:
AlaLeuGlu cmns-col:hasConstituent AlaC .
AlaC ?x Ala .

The predicate ?x links the constituent to the thing that plays the constituent role.
What is x? Is it cmns-pts:playsRole?

If I have understood you correctly, we should not use such a predicate ?x, but instead make the individual both of type Constituent and of the class that plays the role (AminoAcidResidue). That is no longer allowed, however, because of the disjointness.

This problem occurs in all cases where the whole is a list.
The axiom
cmns-strcol:List subclass of (cmns-col:hasConstituent only cmns-strcol:IndexedConstituent)
has to go and must be replaced with

cmns-strcol:List subclass of (cmns-col:hasMember only cmns-strcol:IndexedConstituent)

Now the element can be both a constituent and the element type. If we want to use cmns-col:hasConstituent from the list, we have to put the cmns-col:Constituency in between.

I would certainly like to use cmns-strcol:hasFirst and cmns-strcol:hasLast to define the N- and C-terminals, and if those predicates are defined in terms of IndexedConstituent, then it is awkward to write:
aList a List .
aList cmns-dsg:isDefinedIn aListConstituency .
aListConstituency cmns-col:hasConstituent aListConstituent .
aListConstituent cmns-pts:isPlayedBy aListElement .
aListConstituent cmns-strcol:hasIndexValue aListElementIndexValue .
aListElementIndexValue cmns-qtu:hasNumericValue 1 . (cmns-x:hasOrdinalValue??)

instead of the simpler
aList a List.
aList cmns-col:hasMember aListElement.
aListElement cmns-col:hasIndexValue [ cmns-x:hasNumericValue 1 ] . # or even simpler: aListElement cmns-col:hasOrdinalIndexValue 1

In my example:
AlaLeuGlu a ProteinSequence .
AlaLeuGlu cmns-col:hasMember Ala, Leu, Glu .
Ala cmns-x:hasOrdinalIndexValue 1 .
Leu cmns-x:hasOrdinalIndexValue 2 .
Glu cmns-x:hasOrdinalIndexValue 3 .

making 7 statements
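With this pattern the order, and with it the N- and C-terminal, can be read straight off the members. The following is a minimal sketch under stated assumptions: triples are plain Python tuples, `cmns-x:hasOrdinalIndexValue` is the placeholder predicate used above (not an existing Commons property), and `ordered_members` is a hypothetical helper.

```python
HAS_MEMBER = "cmns-col:hasMember"
HAS_ORDINAL = "cmns-x:hasOrdinalIndexValue"  # placeholder predicate from this issue


def ordered_members(triples, collection):
    """Return the members of `collection`, sorted by their ordinal index value."""
    members = {o for s, p, o in triples if s == collection and p == HAS_MEMBER}
    index = {s: o for s, p, o in triples if p == HAS_ORDINAL and s in members}
    missing = members - set(index)
    if missing:
        raise ValueError(f"members without an index value: {sorted(missing)}")
    return sorted(members, key=index.__getitem__)


# The seven statements of the simple pattern.
triples = [
    ("AlaLeuGlu", "rdf:type", "ProteinSequence"),
    ("AlaLeuGlu", HAS_MEMBER, "Ala"),
    ("AlaLeuGlu", HAS_MEMBER, "Leu"),
    ("AlaLeuGlu", HAS_MEMBER, "Glu"),
    ("Ala", HAS_ORDINAL, 1),
    ("Leu", HAS_ORDINAL, 2),
    ("Glu", HAS_ORDINAL, 3),
]

sequence = ordered_members(triples, "AlaLeuGlu")
n_terminal, c_terminal = sequence[0], sequence[-1]
print(sequence, n_terminal, c_terminal)
```

One caveat of the sketch: the index hangs on the element itself, so it assumes each residue individual occupies exactly one position in exactly one list.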

Instead of
AlaLeuGlu a ProteinSequence .
AlaLeuGlu cmns-col:hasMember Ala, Leu, Glu .
AlaLeuGlu cmns-dsg:isDefinedIn AlaLeuGlu-C .
AlaLeuGlu-C cmns-col:hasConstituent Ala-C, Leu-C, Glu-C .
Ala-C cmns-pts:isPlayedBy Ala .
Ala-C cmns-strcol:hasIndexValue [ cmns-qtu:hasNumericValue 1 ] .
Leu-C cmns-pts:isPlayedBy Leu .
Leu-C cmns-strcol:hasIndexValue [ cmns-qtu:hasNumericValue 2 ] .
Glu-C cmns-pts:isPlayedBy Glu .
Glu-C cmns-strcol:hasIndexValue [ cmns-qtu:hasNumericValue 3 ] .

with 17 statements.

All of that just to define a 3-peptide.
In this case, nobody will use IDMP-O; people will instead use only the Dayhoff string encoding “ala-leu-glu”.
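The two statement counts can be reproduced, and extrapolated to longer peptides, with a short sketch. It assumes that every object in a comma-separated object list counts as one statement and that each blank node for an index value contributes one more; the generator functions are hypothetical and only mirror the two patterns compared above.

```python
def simple_pattern(seq, residues):
    """Member-based pattern: 1 + 2n statements for n residues."""
    triples = [(seq, "rdf:type", "ProteinSequence")]
    for position, residue in enumerate(residues, start=1):
        triples.append((seq, "cmns-col:hasMember", residue))
        triples.append((residue, "cmns-x:hasOrdinalIndexValue", position))
    return triples


def constituency_pattern(seq, residues):
    """Constituency-based pattern: 2 + 5n statements for n residues."""
    constituency = f"{seq}-C"
    triples = [
        (seq, "rdf:type", "ProteinSequence"),
        (seq, "cmns-dsg:isDefinedIn", constituency),
    ]
    for position, residue in enumerate(residues, start=1):
        role = f"{residue}-C"
        index_node = f"_:index{position}"  # stands in for the [ ... ] blank node
        triples += [
            (seq, "cmns-col:hasMember", residue),
            (constituency, "cmns-col:hasConstituent", role),
            (role, "cmns-pts:isPlayedBy", residue),
            (role, "cmns-strcol:hasIndexValue", index_node),
            (index_node, "cmns-qtu:hasNumericValue", position),
        ]
    return triples


tripeptide = ["Ala", "Leu", "Glu"]
print(len(simple_pattern("AlaLeuGlu", tripeptide)))        # 1 + 2*3
print(len(constituency_pattern("AlaLeuGlu", tripeptide)))  # 2 + 5*3
```

Under these assumptions the gap grows linearly: a sequence of n residues needs 1 + 2n statements in the simple pattern and 2 + 5n in the constituency pattern.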

IMHO the disjoint property axiom should be dropped and the usage guidelines in cmns-col be clarified, but that will probably not happen.

So we would have to make IndexedConstituent a subclass of Member rather than of Constituent,
with
Member equivalent (cmns-col:isMemberOf cmns-col:Collection)
and then we would have to rename the class as well to avoid confusion.

Or
we must replace
cmns-strcol:List subclass of (cmns-col:hasConstituent only cmns-strcol:IndexedConstituent)
with
cmns-strcol:List subclass of (cmns-dsg:isDefinedIn some (cmns-col:Constituency and (cmns-col:hasConstituent only cmns-strcol:IndexedConstituent)))
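What the second replacement would demand of instance data can be sketched as a closed-world check. This is only an illustration: OWL reasoning is open-world, so a real reasoner would not conclude the same from missing statements, and the function name `satisfies_second_proposal` is hypothetical.

```python
RDF_TYPE = "rdf:type"
IS_DEFINED_IN = "cmns-dsg:isDefinedIn"
HAS_CONSTITUENT = "cmns-col:hasConstituent"
CONSTITUENCY = "cmns-col:Constituency"
INDEXED = "cmns-strcol:IndexedConstituent"


def satisfies_second_proposal(triples, lst):
    """True if `lst` is defined in some Constituency whose constituents are
    all IndexedConstituents (closed-world reading of the proposed axiom)."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    for s, p, candidate in triples:
        if s != lst or p != IS_DEFINED_IN:
            continue
        if CONSTITUENCY not in types.get(candidate, ()):
            continue
        constituents = [
            o for s2, p2, o in triples if s2 == candidate and p2 == HAS_CONSTITUENT
        ]
        # "only" is vacuously satisfied when there are no constituents at all.
        if all(INDEXED in types.get(c, ()) for c in constituents):
            return True
    return False


triples = [
    ("AlaLeuGlu", RDF_TYPE, "cmns-strcol:List"),
    ("AlaLeuGlu", IS_DEFINED_IN, "AlaLeuGlu-C"),
    ("AlaLeuGlu-C", RDF_TYPE, CONSTITUENCY),
]
for residue in ("Ala", "Leu", "Glu"):
    triples.append(("AlaLeuGlu-C", HAS_CONSTITUENT, f"{residue}-C"))
    triples.append((f"{residue}-C", RDF_TYPE, INDEXED))

print(satisfies_second_proposal(triples, "AlaLeuGlu"))
```

In this reading the list itself carries no hasConstituent statements, so its hasMember statements can point at the residues without touching the disjointness axiom.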

@ElisaKendall ElisaKendall added the enhancement New feature or request label May 2, 2023
@ElisaKendall ElisaKendall self-assigned this May 2, 2023
@mereolog mereolog added this to the Release 0.4.0 milestone Jun 9, 2023
@mereolog mereolog removed this from the Release 0.4.0 milestone Jul 6, 2023
@mereolog mereolog added this to the Release 0.5.0 milestone Aug 3, 2023
@mereolog mereolog modified the milestones: Release 0.5.0, Release 1.0.0 Oct 4, 2023
@mereolog mereolog removed this from the Release 1.0.0 milestone Jan 4, 2024
@mereolog
Contributor

@ElisaKendall is this still an issue?
The ticket is almost 10 months old - can we close it?

@ElisaKendall
Contributor Author

@mereolog To simplify things we are not using the pattern that @tw-osthus describes above at the moment, but that doesn't mean that we won't want to address this later this year. I think it needs to stay open.
