Datatype system design choices #185
mkroetzsch started this conversation in Design issue
Replies: 2 comments
Further reading: The documentation of supported SPARQL types in Oxigraph might be instructive for getting a rough idea of how the RDF side of types looks and how it can be implemented in a DBMS context.
After the above technical musings, some notes on usability:
Having good adaptivity would suggest a system with fewer primitive types, where specific types use the same internal handling with merely some extra constraints on the values. However:
So one possible initial approach might be:
Still to be clarified:
This discussion is to gather some thoughts about Nemo's logical datatype system, i.e., the types that a user would be allowed to select on the rules level.
What's in a datatype?
Datatypes have at least the following tasks:
Note that logical values can have indirect or composite representations on the physical layer. For example, a date might be stored in the form of several physical values (in several columns) and a nested function term might be represented through tuples in several relations. Built-in predicates and functions need to undergo a similar translation into physical built-ins to realise their semantics.
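As a rough illustration of a composite physical representation, the following sketch stores one logical date column as three integer columns and shows a built-in that is translated to work on the physical layout. The column layout and the names `encode_dates`, `decode_dates` and `year_of` are assumptions made for this example only, not Nemo's actual storage design.

```python
from datetime import date

# Assumed physical layout: one logical date column becomes three
# integer columns (year, month, day).
def encode_dates(dates):
    """Split logical dates into three physical integer columns."""
    years = [d.year for d in dates]
    months = [d.month for d in dates]
    days = [d.day for d in dates]
    return years, months, days

def decode_dates(years, months, days):
    """Rebuild the logical dates from the three physical columns."""
    return [date(y, m, d) for y, m, d in zip(years, months, days)]

def year_of(years, months, days):
    """Physical counterpart of a hypothetical logical built-in year(date).

    It only needs to read the year column; the other columns are ignored.
    """
    return list(years)

logical = [date(2023, 5, 17), date(1999, 12, 31)]
columns = encode_dates(logical)
assert decode_dates(*columns) == logical
```

The point of the sketch is that the logical built-in never sees the split: only its physical translation knows which columns to touch.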
Note that "lexical" means "unicode string" for us. Low-level encoding of unicode glyphs (e.g., UTF-8 vs. UTF-16) is not a concern of the datatype, but of I/O routines that have to turn glyphs into bytes. In memory, we work with glyphs in all cases.
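A minimal sketch of this separation, using Python's `str` as a stand-in for the in-memory unicode string: the same lexical value is obtained no matter which byte encoding the I/O routine used. The helper names `read_lexical` and `write_lexical` are hypothetical.

```python
def write_lexical(value: str, encoding: str) -> bytes:
    """I/O side: turn an in-memory unicode string into bytes."""
    return value.encode(encoding)

def read_lexical(raw: bytes, encoding: str) -> str:
    """I/O side: turn bytes back into an in-memory unicode string."""
    return raw.decode(encoding)

lexical = "Gödel 42"
as_utf8 = write_lexical(lexical, "utf-8")
as_utf16 = write_lexical(lexical, "utf-16")

# The byte sequences differ, but the lexical value is the same.
assert as_utf8 != as_utf16
assert read_lexical(as_utf8, "utf-8") == read_lexical(as_utf16, "utf-16") == lexical
```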
Also note that our lexical value, in contrast to RDF and XSD, does not need to encode our "internal" logical datatype. For example, "42"^^xsd:int might be the lexical value of a value in a logical integer datatype, but also of a value in a general RDF term type.
What kind of datatypes are there?
One can imagine datatypes of several basic forms:
Types have a natural hierarchical relation based on set inclusion of their value spaces. Moreover, there could be built-in functions to convert values in different ways (in the style of "toString: int -> string" or "langTag: LanguageTaggedString -> string").
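To make the two ideas concrete, here is a small sketch of a type hierarchy given by declared inclusions, plus the two conversion functions mentioned above. The type names in `PARENT` and the choice to place every basic type directly below a general RDF term type are assumptions for illustration, not a proposal for the final hierarchy.

```python
from dataclasses import dataclass

# Assumed direct inclusions between value spaces: child -> parent.
PARENT = {
    "int": "rdf_term",
    "string": "rdf_term",
    "LanguageTaggedString": "rdf_term",
}

def is_subtype(sub: str, sup: str) -> bool:
    """True if the value space of `sub` is included in that of `sup`."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = PARENT.get(current)  # None once the root is passed
    return False

@dataclass(frozen=True)
class LanguageTaggedString:
    text: str
    lang: str

def to_string(value: int) -> str:
    """toString: int -> string"""
    return str(value)

def lang_tag(value: LanguageTaggedString) -> str:
    """langTag: LanguageTaggedString -> string"""
    return value.lang

assert is_subtype("int", "rdf_term")
assert not is_subtype("rdf_term", "int")
assert to_string(42) == "42"
assert lang_tag(LanguageTaggedString("Hallo", "de")) == "de"
```

Note that the hierarchy and the conversions are independent mechanisms here: inclusion needs no function call, while `to_string` maps between value spaces that are not related by inclusion.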
Which datatypes do we need?
This needs discussion, but we should be open to future extensions. One can derive some possible demands from related systems and technologies that we support:
Primitive types:
The basic types are subsumed by the RDF types. Many XSD types could be practically realised as restrictions of general types.
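One way to picture "restrictions of general types" is a single general integer representation with per-type constraints on top. The sketch below uses the value ranges that XSD defines for `xsd:int` and `xsd:nonNegativeInteger`; the `RestrictedType` class and `admissible_types` helper are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class RestrictedType:
    """A specific type: the general integer type plus a constraint."""
    name: str
    accepts: Callable[[int], bool]

# Constraints follow the XSD value ranges for these two types.
XSD_INT = RestrictedType("xsd:int", lambda v: -2**31 <= v <= 2**31 - 1)
XSD_NON_NEGATIVE = RestrictedType("xsd:nonNegativeInteger", lambda v: v >= 0)

RESTRICTIONS = [XSD_INT, XSD_NON_NEGATIVE]

def admissible_types(value: int) -> list:
    """Names of all restricted types whose constraint the value satisfies."""
    return [t.name for t in RESTRICTIONS if t.accepts(value)]

assert admissible_types(42) == ["xsd:int", "xsd:nonNegativeInteger"]
assert admissible_types(2**31) == ["xsd:nonNegativeInteger"]
assert admissible_types(-1) == ["xsd:int"]
```

All values share the same internal handling; only the membership check differs per type.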
Structured types:
The final part ("object terms") overlaps with the more general notion of "frame-like atoms". As terms, these would correspond to "frame-like functions", but the storage issues are very similar.
How should our datatypes be denoted?
Specific basic questions to answer early on