ordered properties
How should we order properties in NIEM JSON?
There is a mismatch:
- XML elements are ordered. The XML infoset presents element and character content in an order.
- RDF triples are not ordered. RDF is defined such that all triples reside in a graph, and a graph is a set of triples with no defined order.
Ordered properties are useful when they are ordered deliberately, which is why NIEM defines structures:sequenceID to deliberately order properties.
The fact that XML content is always ordered can be harmful when different environments treat that order differently. For example, if data is serialized through a relational database, or across Java objects, the input order may not be retained on output.
NIEM JSON uses JSON-LD, which is a concrete syntax for RDF, which means that the order of JSON-LD properties as they appear in a JSON file is not significant.
Do we know of a reasonable way to order RDF data, and therefore NIEM JSON data?
In response to issue https://github.com/NIEM/NIEM-JSON-Spec/issues/3.
What does normal data look like?
Base example:
{
"@context": {
"@vocab": "http://release.niem.gov/niem/niem-core/5.0/"
},
"PersonGivenName": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ],
"PersonSurName": "Sutherland"
}
Yields
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Dempsey" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Frederick" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "George" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Kiefer" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Rufus" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "William" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .
The properties are unordered: the given names come back sorted alphabetically rather than in the order they were written.
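To make the loss of order concrete, here is a minimal Python sketch. It is not a JSON-LD processor: the `to_triples` helper is a hypothetical stand-in that only handles `@vocab` and plain string values, and it models the RDF graph as a Python set. Under those assumptions, reversing the array of given names produces exactly the same set of triples.

```python
import json

doc = json.loads("""
{
  "@context": {"@vocab": "http://release.niem.gov/niem/niem-core/5.0/"},
  "PersonGivenName": ["Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus"],
  "PersonSurName": "Sutherland"
}
""")


def to_triples(node, subject="_:b0"):
    """Hypothetical helper: expand keys with @vocab and emit a set of triples."""
    vocab = node["@context"]["@vocab"]
    triples = set()
    for key, value in node.items():
        if key.startswith("@"):
            continue  # skip keywords such as @context
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.add((subject, vocab + key, item))
    return triples


# Same data, with the given names written in the opposite order.
reordered = dict(doc)
reordered["PersonGivenName"] = list(reversed(doc["PersonGivenName"]))

# A graph is a set, so both documents describe the same RDF data.
print(to_triples(doc) == to_triples(reordered))  # True
```

A real JSON-LD processor does considerably more work, but the set comparison is the part that matters here: nothing in the triples records which name was written first.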
Experimenting with ordering content.
The JSON-LD spec, section 4.3.1 "Lists" describes ordering content of a single property:
Example with list:
{
"@context": {
"@vocab": "http://release.niem.gov/niem/niem-core/5.0/"
},
"PersonGivenName": {
"@list": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ]
},
"PersonSurName": "Sutherland"
}
The list declaration can be put into the context:
{
"@context": {
"@vocab": "http://release.niem.gov/niem/niem-core/5.0/",
"PersonGivenName": { "@container": "@list" }
},
"PersonGivenName": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ],
"PersonSurName": "Sutherland"
}
Both yield the same RDF data:
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b1 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Kiefer" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b2 .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "William" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b3 .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Frederick" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b4 .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Dempsey" .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b5 .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "George" .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b6 .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Rufus" .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
The result uses the RDF representation for lists (rdf:first, rdf:rest, rdf:nil), which gives a property (PersonGivenName) a value that is a list object (_:b1).
The @list construct provides no way to order different properties (e.g., to say a family name comes before a given name).
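The rdf:first / rdf:rest chain can be sketched in Python to show why it preserves order even though the triples themselves are unordered. The helper names `list_to_triples` and `triples_to_list` and the blank node labels are assumptions made for this illustration, not part of any RDF library.

```python
import random

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_to_triples(values, first_label=1):
    """Encode values as an rdf:first/rdf:rest chain. Returns (head, triples)."""
    if not values:
        return RDF + "nil", []
    nodes = [f"_:b{first_label + i}" for i in range(len(values))]
    triples = []
    for i, (node, value) in enumerate(zip(nodes, values)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", value))
        triples.append((node, RDF + "rest", rest))
    return nodes[0], triples


def triples_to_list(head, triples):
    """Walk the chain from head to rdf:nil, ignoring the order of the triples."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    values = []
    node = head
    while node != RDF + "nil":
        values.append(first[node])
        node = rest[node]
    return values


names = ["Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus"]
head, triples = list_to_triples(names)

# Shuffle the triples: the order is carried by the links, not by position.
shuffled = triples[:]
random.shuffle(shuffled)
print(triples_to_list(head, shuffled) == names)  # True
```

The order survives because it is encoded in the structure of the list object. That same structure is what ties the list to a single property, which is why it does not help with ordering PersonSurName relative to PersonGivenName.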
A writeup on ordering properties in RDF: Sergey Melnik & Stefan Decker: Representing Order in RDF.
- None of these options seems to be what we want to do for NIEM.
Options listed
- Have a property point to a "list object" that has properties 1, 2, etc. pointing to the values
- Have a property point to a "list object" that has head pointing to a value, rest pointing to a list or nil.
- turn a triple into an object with "source", "dest", "type" (the property name), and a "next" pointing at the next item in the list.
- change property names into specialized properties that have an index attached: property "creator" turns into "creator:1", "creator:2".
- have a reified object point to "source", "type" (the property name), and properties 1, 2, etc. pointing to the values
- turn triples into reified objects, each having "source" (property subject), "dest" (property object), "type" (property name), "order" (the index of the triple)
- give a triple an "Order" property; their example presumes some reification of triples behind the scenes.
- reify property, and give the triple a "next" pointing to the next object in the sequence
We would like an RDF/JSON encoding for the given XML:
<PersonGivenName structures:sequenceID="1">Kiefer</PersonGivenName>
<PersonGivenName structures:sequenceID="2">William</PersonGivenName>
<PersonGivenName structures:sequenceID="3">Frederick</PersonGivenName>
<PersonGivenName structures:sequenceID="4">Dempsey</PersonGivenName>
<PersonGivenName structures:sequenceID="5">George</PersonGivenName>
<PersonGivenName structures:sequenceID="6">Rufus</PersonGivenName>
<PersonSurName structures:sequenceID="7">Sutherland</PersonSurName>
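For comparison, this is roughly how a consumer of the XML could recover the intended order from structures:sequenceID, whatever order the elements arrive in. The sketch assumes a wrapping nc:Person element, nc: prefixes on the name elements, and integer sequenceID values; none of these appear in the fragment above, and the elements are deliberately listed out of order.

```python
import xml.etree.ElementTree as ET

NC = "http://release.niem.gov/niem/niem-core/5.0/"
STRUCTURES = "http://release.niem.gov/niem/structures/5.0/"

# Assumed wrapper element and prefixes; elements are deliberately out of order.
xml_text = f"""
<nc:Person xmlns:nc="{NC}" xmlns:structures="{STRUCTURES}">
  <nc:PersonSurName structures:sequenceID="7">Sutherland</nc:PersonSurName>
  <nc:PersonGivenName structures:sequenceID="2">William</nc:PersonGivenName>
  <nc:PersonGivenName structures:sequenceID="1">Kiefer</nc:PersonGivenName>
</nc:Person>
""".strip()


def in_sequence(parent):
    """Return (local name, text) pairs for child elements, sorted by sequenceID."""
    seq_attr = f"{{{STRUCTURES}}}sequenceID"
    children = [child for child in parent if seq_attr in child.attrib]
    children.sort(key=lambda child: int(child.attrib[seq_attr]))
    return [(child.tag.split("}")[1], child.text) for child in children]


person = ET.fromstring(xml_text)
print(in_sequence(person))
# [('PersonGivenName', 'Kiefer'), ('PersonGivenName', 'William'), ('PersonSurName', 'Sutherland')]
```

Note that the attribute sits on the element occurrence, so it orders different properties relative to each other. That is the behavior the JSON-LD candidates below try to reproduce.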
A candidate, using structures:sequenceID as a property:
{
"@context": {
"nc": "http://release.niem.gov/niem/niem-core/5.0/",
"structures": "http://release.niem.gov/niem/structures/5.0/",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
},
"nc:PersonGivenName": [
{
"rdf:value": "Kiefer",
"structures:sequenceID": "1"
},
{
"rdf:value": "William",
"structures:sequenceID": "2"
},
{
"rdf:value": "Frederick",
"structures:sequenceID": "3"
},
{
"rdf:value": "Dempsey",
"structures:sequenceID": "4"
},
{
"rdf:value": "George",
"structures:sequenceID": "5"
},
{
"rdf:value": "Rufus",
"structures:sequenceID": "6"
}
],
"nc:PersonSurName": {
"rdf:value": "Sutherland",
"structures:sequenceID": "7"
}
}
Yields RDF:
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b1 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b2 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b3 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b4 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b5 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b6 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> _:b7 .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Kiefer" .
_:b2 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "William" .
_:b3 <http://release.niem.gov/niem/structures/5.0/sequenceID> "3" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Frederick" .
_:b4 <http://release.niem.gov/niem/structures/5.0/sequenceID> "4" .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Dempsey" .
_:b5 <http://release.niem.gov/niem/structures/5.0/sequenceID> "5" .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "George" .
_:b6 <http://release.niem.gov/niem/structures/5.0/sequenceID> "6" .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Rufus" .
_:b7 <http://release.niem.gov/niem/structures/5.0/sequenceID> "7" .
_:b7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Sutherland" .
The results are data objects, each of which has a sequenceID. This does not express an ordering of the properties; it just applies an index to the data objects themselves.
This candidate does not support a case where an object is the value of 2 or more properties, each of which must appear in a specified order in its instance.
A successful formulation would express order of properties, not of objects. The sequence is a characteristic of the relationship between a property and the object that holds the property.
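The shared-object limitation can be shown with a few triples. In this sketch, ex:first and ex:second are hypothetical property names, and _:shared is one object used as the value of both. If the object needs position 1 under one property and position 4 under the other, both sequenceID values end up on the object, and nothing says which belongs to which property.

```python
SEQUENCE_ID = "http://release.niem.gov/niem/structures/5.0/sequenceID"

# ex:first and ex:second are hypothetical properties, used only for illustration.
triples = {
    ("_:b0", "ex:first", "_:shared"),
    ("_:b0", "ex:second", "_:shared"),
    ("_:shared", SEQUENCE_ID, "1"),
    ("_:shared", SEQUENCE_ID, "4"),
}


def candidate_positions(triples, subject, prop):
    """Map each value of prop on subject to the sequenceID values on that value."""
    result = {}
    for s, p, o in triples:
        if s == subject and p == prop:
            result[o] = sorted(
                v for s2, p2, v in triples if s2 == o and p2 == SEQUENCE_ID
            )
    return result


first = candidate_positions(triples, "_:b0", "ex:first")
second = candidate_positions(triples, "_:b0", "ex:second")

# Both properties see the same two positions; the data cannot tell them apart.
print(first == second == {"_:shared": ["1", "4"]})  # True
```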
A candidate encoding is to represent properties as individual objects. We encode all properties as individual objects that are connected to the base object (person_1) via a new property, structures:sequence.
{
"@context": {
"nc": "http://release.niem.gov/niem/niem-core/5.0/",
"structures": "http://release.niem.gov/niem/structures/5.0/",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"@base": "http://example.org/instance/"
},
"@id": "person_1",
"structures:sequence": [
{
"structures:sequenceID": "1",
"nc:PersonGivenName": "Kiefer"
},
{
"structures:sequenceID": "2",
"nc:PersonGivenName": "William"
},
{
"structures:sequenceID": "3",
"nc:PersonGivenName": "Frederick"
},
{
"structures:sequenceID": "4",
"nc:PersonGivenName": "Dempsey"
},
{
"structures:sequenceID": "5",
"nc:PersonGivenName": "George"
},
{
"structures:sequenceID": "6",
"nc:PersonGivenName": "Rufus"
},
{
"structures:sequenceID": "7",
"nc:PersonSurName": "Sutherland"
}
]
}
Yields
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b0 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b1 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b2 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b3 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b4 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b5 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b6 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Kiefer" .
_:b0 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b1 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "William" .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b2 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Frederick" .
_:b2 <http://release.niem.gov/niem/structures/5.0/sequenceID> "3" .
_:b3 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Dempsey" .
_:b3 <http://release.niem.gov/niem/structures/5.0/sequenceID> "4" .
_:b4 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "George" .
_:b4 <http://release.niem.gov/niem/structures/5.0/sequenceID> "5" .
_:b5 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Rufus" .
_:b5 <http://release.niem.gov/niem/structures/5.0/sequenceID> "6" .
_:b6 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .
_:b6 <http://release.niem.gov/niem/structures/5.0/sequenceID> "7" .
Results:
- Each property is encoded as its own object.
- None of these name properties is a property of person_1.
- The sequence identifier is required, since sequenceID does not provide a total ordering of properties. Two or more properties may have the same value of sequenceID, in which case their relative order is ambiguous.
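The ambiguity described in the last point can be sketched by grouping entries on their sequenceID value. The tuple layout, the helper names, and the duplicated value "2" are assumptions made for illustration; the sketch also assumes sequenceID values are integers.

```python
from itertools import groupby


def sequence_groups(entries):
    """Group (sequenceID, property, value) entries by sequenceID, ascending."""
    keyed = sorted(entries, key=lambda entry: int(entry[0]))
    return [
        (seq, [(prop, value) for _, prop, value in group])
        for seq, group in groupby(keyed, key=lambda entry: int(entry[0]))
    ]


def ambiguous_groups(entries):
    """Return the sequenceID values shared by more than one property occurrence."""
    return [seq for seq, members in sequence_groups(entries) if len(members) > 1]


entries = [
    ("7", "nc:PersonSurName", "Sutherland"),
    ("1", "nc:PersonGivenName", "Kiefer"),
    ("2", "nc:PersonGivenName", "William"),
    ("2", "nc:PersonGivenName", "Frederick"),  # hypothetical duplicate sequenceID
]

print(ambiguous_groups(entries))  # [2]
```

The groups are ordered relative to each other, but the members inside a group with more than one entry have no defined relative order. That makes the result a partial order rather than a single list.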
This encoding may cover the data requirements well, but it doesn't do anything useful for anyone. The JSON encoding is overly complex. The RDF doesn't represent the right concept. No reasoner would ever make sense of the resulting RDF. This is a bad idea.
We could use RDF reification to express the semantic. This example uses @graph to provide a pile of objects.
{
"@context": {
"nc": "http://release.niem.gov/niem/niem-core/5.0/",
"structures": "http://release.niem.gov/niem/structures/5.0/",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"@base": "http://example.org/instance/"
},
"@graph": [
{
"@type": "rdf:Statement",
"rdf:subject": { "@id": "person_1" },
"rdf:predicate": { "@id": "nc:PersonGivenName" },
"rdf:object": "Kiefer",
"structures:sequenceID": "1"
},
{
"@type": "rdf:Statement",
"rdf:subject": { "@id": "person_1" },
"rdf:predicate": { "@id": "nc:PersonGivenName" },
"rdf:object": "William",
"structures:sequenceID": "2"
}
]
}
This yields the RDF:
_:b0 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "Kiefer" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://example.org/instance/person_1> .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "William" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://example.org/instance/person_1> .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
This encoding handles the data requirements perfectly, but the JSON is terrible, and there's no reason to think that the resulting RDF data is going to be useful to anyone for reasoning.
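To give a sense of how mechanical (and verbose) the reified form is, here is a sketch that generates such a @graph from an ordered list of property occurrences. The `reify` function is hypothetical. The sketch writes rdf:subject and rdf:predicate as { "@id": ... } node references, on the assumption that they are meant to be read as IRIs rather than plain strings, and it assigns sequenceID values by position.

```python
import json

CONTEXT = {
    "nc": "http://release.niem.gov/niem/niem-core/5.0/",
    "structures": "http://release.niem.gov/niem/structures/5.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "@base": "http://example.org/instance/",
}


def reify(subject, ordered_properties):
    """Turn an ordered list of (predicate, value) pairs into rdf:Statement objects."""
    graph = []
    for index, (predicate, value) in enumerate(ordered_properties, start=1):
        graph.append({
            "@type": "rdf:Statement",
            "rdf:subject": {"@id": subject},
            "rdf:predicate": {"@id": predicate},
            "rdf:object": value,
            "structures:sequenceID": str(index),
        })
    return graph


doc = {
    "@context": CONTEXT,
    "@graph": reify("person_1", [
        ("nc:PersonGivenName", "Kiefer"),
        ("nc:PersonGivenName", "William"),
    ]),
}

print(json.dumps(doc, indent=2))
```

Every property occurrence costs five keys in the JSON, and the original person object no longer carries any of its own properties. This illustrates the usability objection above.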
Characteristics of a good representation for sequenceID:
- Some properties of an object must have a fixed relative order.
- The order of a property is a characteristic of an occurrence of a property and its relationship to other properties.
- The order of a property is not a characteristic of the value of the property.
- Different properties (e.g., given and sur name) must be ordered relative to each other. Order is not only a characteristic of a single property name.
- Orders are partial; properties are not provided as a single list, as the interpretation does not disambiguate properties with the same value of sequenceID.
We have not yet found a good way to order properties in JSON-LD/RDF that shares the semantic established by sequenceID.