ordered properties

Webb Roberts edited this page Oct 23, 2020 · 1 revision

NIEM JSON / RDF and ordered properties

How should we order properties in NIEM JSON?

Introduction

There is a mismatch:

  • XML elements are ordered. The XML infoset presents element and character content in an order.
  • RDF triples are not ordered. RDF is defined such that all triples reside in a graph, which is a set of triples with no defined order.

Ordered properties are useful when they are ordered deliberately, which is why NIEM defines structures:sequenceID to deliberately order properties.

The fact that XML content is always ordered can be harmful when that order is respected differently in different environments. For example, if data is serialized through a relational database or across Java objects, the input order may not be retained on output.

NIEM JSON uses JSON-LD, which is a serialization of RDF. This means that the order of JSON-LD properties as they appear in a JSON file is not carried into the RDF data.

Do we know of a reasonable way to order RDF data, and therefore NIEM JSON data?

This page is a response to issue https://github.com/NIEM/NIEM-JSON-Spec/issues/3.

Research

JSON-LD data with no order

What does normal data look like?

Base example: link

{
  "@context": {
    "@vocab": "http://release.niem.gov/niem/niem-core/5.0/"
  },
  "PersonGivenName": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ],
  "PersonSurName": "Sutherland"
}

Yields

_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Dempsey" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Frederick" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "George" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Kiefer" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Rufus" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "William" .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .

The properties are unordered: the triples come out sorted alphabetically by value, not in the order given in the JSON array.
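One way to see why the order disappears is to model an RDF graph as a set of triples. The following is a minimal Python sketch using plain tuples rather than any RDF library; the names `graph`, `reversed_graph`, and `NC` are illustrative only.

```python
# Model an RDF graph as a set of (subject, predicate, object) tuples.
NC = "http://release.niem.gov/niem/niem-core/5.0/"

given_names = ["Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus"]

graph = {("_:b0", NC + "PersonGivenName", name) for name in given_names}
graph.add(("_:b0", NC + "PersonSurName", "Sutherland"))

# The same statements added in reverse order produce an equal graph,
# so the input order cannot be recovered from the graph itself.
reversed_graph = {
    ("_:b0", NC + "PersonGivenName", name) for name in reversed(given_names)
}
reversed_graph.add(("_:b0", NC + "PersonSurName", "Sutherland"))

same_graph = graph == reversed_graph

# Any order seen on output is imposed by the serializer, for example by sorting.
sorted_names = sorted(o for (_, p, o) in graph if p == NC + "PersonGivenName")
```

Sorting is only one order a serializer might choose; the point is that any order on output comes from the tool, not from the data.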

JSON-LD lists

Experimenting with ordering content.

The JSON-LD spec, section 4.3.1 "Lists", describes ordering the content of a single property:

Example with list: link

{
  "@context": {
    "@vocab": "http://release.niem.gov/niem/niem-core/5.0/"
  },
  "PersonGivenName": {
    "@list": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ]
  },
  "PersonSurName": "Sutherland"
}

The list declaration can be put into the context: link

{
  "@context": {
    "@vocab": "http://release.niem.gov/niem/niem-core/5.0/",
    "PersonGivenName": { "@container": "@list" }
  },
  "PersonGivenName": [ "Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus" ],
  "PersonSurName": "Sutherland"
}

Both yield the same RDF data:

_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b1 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Kiefer" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b2 .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "William" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b3 .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Frederick" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b4 .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Dempsey" .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b5 .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "George" .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b6 .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> "Rufus" .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .

The result uses the RDF representation for lists (rdf:first, rdf:rest, rdf:nil), which gives a property (PersonGivenName) a value that is a list object (_:b1).

The @list construct provides no way to order different properties (e.g., to say a family name comes before a given name).
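A consumer of the list form has to walk the rdf:first / rdf:rest chain to get the values back in order. The sketch below assumes triples are held as plain Python tuples; the function name `read_rdf_list` is hypothetical, and it does not guard against malformed or cyclic lists.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil."""
    first = {s: o for (s, p, o) in triples if p == RDF + "first"}
    rest = {s: o for (s, p, o) in triples if p == RDF + "rest"}
    items = []
    node = head
    while node != RDF + "nil":
        items.append(first[node])
        node = rest[node]
    return items

# Rebuild the list triples from the example above (_:b1 through _:b6).
names = ["Kiefer", "William", "Frederick", "Dempsey", "George", "Rufus"]
triples = set()
for i, name in enumerate(names, start=1):
    node = f"_:b{i}"
    next_node = f"_:b{i + 1}" if i < len(names) else RDF + "nil"
    triples.add((node, RDF + "first", name))
    triples.add((node, RDF + "rest", next_node))

recovered = read_rdf_list(triples, "_:b1")
```

The order survives even though `triples` is an unordered set, because it is encoded in the links between list nodes. It still covers only the values of one property.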

A writeup on ordering properties in RDF

A writeup on ordering properties in RDF: Sergey Melnik & Stefan Decker: Representing Order in RDF.

  • None of these options seems to be what we want to do for NIEM.

Options listed

  1. Have a property point to a "list object" that has properties 1, 2, etc. pointing to the values.
  2. Have a property point to a "list object" that has head pointing to a value, and rest pointing to a list or nil.
  3. Turn a triple into an object with "source", "dest", "type" (the property name), and a "next" pointing at the next item in the list.
  4. Change property names into specialized properties that have an index attached: property "creator" turns into "creator:1", "creator:2".
  5. Have a reified object point to "source", "type" (the property name), and properties 1, 2, etc. pointing to the values.
  6. Turn triples into reified objects, each having "source" (property subject), "dest" (property object), "type" (property name), and "order" (the index of the triple).
  7. Give a triple an "Order" property; their example presumes some reification of triples behind the scenes.
  8. Reify the property, and give the triple a "next" pointing to the next object in the sequence.

Using structures:sequenceID as a property

We would like an RDF/JSON encoding for the given XML:

<PersonGivenName structures:sequenceID="1">Kiefer</PersonGivenName>
<PersonGivenName structures:sequenceID="2">William</PersonGivenName>
<PersonGivenName structures:sequenceID="3">Frederick</PersonGivenName>
<PersonGivenName structures:sequenceID="4">Dempsey</PersonGivenName>
<PersonGivenName structures:sequenceID="5">George</PersonGivenName>
<PersonGivenName structures:sequenceID="6">Rufus</PersonGivenName>
<PersonSurName structures:sequenceID="7">Sutherland</PersonSurName>
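For reference, this is roughly how an XML consumer could apply sequenceID. It is a sketch only: the wrapping nc:Person element and the nc: prefix are added here as assumptions so that the fragment parses, the children are deliberately shuffled, and sequenceID values are assumed to be integers.

```python
import xml.etree.ElementTree as ET

NC = "http://release.niem.gov/niem/niem-core/5.0/"
STRUCTURES = "http://release.niem.gov/niem/structures/5.0/"

# Hypothetical wrapper element; document order is intentionally not sequence order.
xml_text = f"""
<nc:Person xmlns:nc="{NC}" xmlns:structures="{STRUCTURES}">
  <nc:PersonSurName structures:sequenceID="7">Sutherland</nc:PersonSurName>
  <nc:PersonGivenName structures:sequenceID="2">William</nc:PersonGivenName>
  <nc:PersonGivenName structures:sequenceID="1">Kiefer</nc:PersonGivenName>
</nc:Person>
"""

def ordered_properties(parent):
    """Return (local name, text) pairs for child elements, sorted by sequenceID."""
    def sequence_key(child):
        return int(child.get(f"{{{STRUCTURES}}}sequenceID"))
    return [
        (child.tag.split("}")[1], child.text)
        for child in sorted(parent, key=sequence_key)
    ]

root = ET.fromstring(xml_text)
result = ordered_properties(root)
```

The sort key is read from an attribute on each property occurrence, which is the behavior an RDF/JSON encoding would need to reproduce.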

A candidate, using structures:sequenceID as a property: link

{
  "@context": {
    "nc": "http://release.niem.gov/niem/niem-core/5.0/",
    "structures": "http://release.niem.gov/niem/structures/5.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  },
  "nc:PersonGivenName": [
    {
      "rdf:value": "Kiefer",
      "structures:sequenceID": "1"
    },
    {
      "rdf:value": "William",
      "structures:sequenceID": "2"
    },
    {
      "rdf:value": "Frederick",
      "structures:sequenceID": "3"
    },
    {
      "rdf:value": "Dempsey",
      "structures:sequenceID": "4"
    },
    {
      "rdf:value": "George",
      "structures:sequenceID": "5"
    },
    {
      "rdf:value": "Rufus",
      "structures:sequenceID": "6"
    }
  ],
  "nc:PersonSurName": {
    "rdf:value": "Sutherland",
    "structures:sequenceID": "7"
  }
}

Yields RDF:

_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b1 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b2 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b3 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b4 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b5 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> _:b6 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> _:b7 .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Kiefer" .
_:b2 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "William" .
_:b3 <http://release.niem.gov/niem/structures/5.0/sequenceID> "3" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Frederick" .
_:b4 <http://release.niem.gov/niem/structures/5.0/sequenceID> "4" .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Dempsey" .
_:b5 <http://release.niem.gov/niem/structures/5.0/sequenceID> "5" .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "George" .
_:b6 <http://release.niem.gov/niem/structures/5.0/sequenceID> "6" .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Rufus" .
_:b7 <http://release.niem.gov/niem/structures/5.0/sequenceID> "7" .
_:b7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#value> "Sutherland" .

The results are data objects, each of which has a sequenceID. This does not express an ordering of the properties; it just applies an index to the data objects themselves.

This candidate does not support a case where an object is the value of 2 or more properties, each of which must appear in a specified order in its instance.

A successful formulation would express order of properties, not of objects. The sequence is a characteristic of the relationship between a property and the object that holds the property.
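The shared-object problem can be sketched in Python. Plain dicts stand in for RDF nodes here, and the data is hypothetical: a single value object is the value of two properties, so its one sequenceID cannot give each occurrence its own position.

```python
# Hypothetical data: one value node reused as the value of two properties.
shared = {"rdf:value": "Sutherland", "structures:sequenceID": "2"}

person = {
    "nc:PersonSurName": shared,
    "nc:PersonMaidenName": shared,  # same node, therefore the same sequenceID
}

# Look up the position that each property occurrence would get.
positions = {
    prop: value["structures:sequenceID"] for prop, value in person.items()
}

# The index belongs to the object, not to the relationship between the
# person and each property, so both occurrences report the same position.
distinct_positions = len(set(positions.values()))
```

Here `distinct_positions` is 1 even though there are two property occurrences that might need different places in the sequence.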

Property objects

A candidate encoding is to represent properties as individual objects. We encode all properties as individual objects that are connected to the base object (person_1) via a new property structures:sequence.

{
  "@context": {
    "nc": "http://release.niem.gov/niem/niem-core/5.0/",
    "structures": "http://release.niem.gov/niem/structures/5.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "@base": "http://example.org/instance/"
  },
  "@id": "person_1",
  "structures:sequence": [
    { 
      "structures:sequenceID": "1",
      "nc:PersonGivenName": "Kiefer"
    },
    { 
      "structures:sequenceID": "2",
      "nc:PersonGivenName": "William"
    },
    { 
      "structures:sequenceID": "3",
      "nc:PersonGivenName": "Frederick"
    },
    { 
      "structures:sequenceID": "4",
      "nc:PersonGivenName": "Dempsey"
    },
    { 
      "structures:sequenceID": "5",
      "nc:PersonGivenName": "George"
    },
    { 
      "structures:sequenceID": "6",
      "nc:PersonGivenName": "Rufus"
    },
    { 
      "structures:sequenceID": "7",
      "nc:PersonSurName": "Sutherland"
    }
  ]
}

Yields

<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b0 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b1 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b2 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b3 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b4 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b5 .
<http://example.org/instance/person_1> <http://release.niem.gov/niem/structures/5.0/sequence> _:b6 .
_:b0 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Kiefer" .
_:b0 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b1 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "William" .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b2 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Frederick" .
_:b2 <http://release.niem.gov/niem/structures/5.0/sequenceID> "3" .
_:b3 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Dempsey" .
_:b3 <http://release.niem.gov/niem/structures/5.0/sequenceID> "4" .
_:b4 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "George" .
_:b4 <http://release.niem.gov/niem/structures/5.0/sequenceID> "5" .
_:b5 <http://release.niem.gov/niem/niem-core/5.0/PersonGivenName> "Rufus" .
_:b5 <http://release.niem.gov/niem/structures/5.0/sequenceID> "6" .
_:b6 <http://release.niem.gov/niem/niem-core/5.0/PersonSurName> "Sutherland" .
_:b6 <http://release.niem.gov/niem/structures/5.0/sequenceID> "7" .

Results:

  1. Each property is encoded as its own object.
  2. None of these name properties is a property of person_1.
  3. The sequence identifier is required, since sequenceID does not provide a total ordering of properties. Two or more properties may have the same value of sequenceID, in which case their relative order is ambiguous.

This encoding may cover the data requirements well, but it doesn't do anything useful for anyone. The JSON encoding is overly complex. The RDF doesn't represent the right concept: the names are attached to intermediate objects rather than to the person. No reasoner would ever make sense of the resulting RDF. This is a bad idea.
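To illustrate the extra work this encoding pushes onto consumers, here is a sketch of the flattening step needed to get back to ordered (property, value) pairs. The function name `ordered_pairs` is hypothetical, and sequenceID values are assumed to be integer strings, so they are converted before sorting (as strings, "10" would sort before "2").

```python
def ordered_pairs(node):
    """Return (property, value) pairs from a structures:sequence array, by sequenceID."""
    entries = sorted(
        node["structures:sequence"],
        key=lambda entry: int(entry["structures:sequenceID"]),
    )
    pairs = []
    for entry in entries:
        for prop, value in entry.items():
            if prop != "structures:sequenceID":
                pairs.append((prop, value))
    return pairs

# Hypothetical input in the property-object encoding, deliberately out of order.
person = {
    "@id": "person_1",
    "structures:sequence": [
        {"structures:sequenceID": "10", "nc:PersonSurName": "Sutherland"},
        {"structures:sequenceID": "2", "nc:PersonGivenName": "William"},
        {"structures:sequenceID": "1", "nc:PersonGivenName": "Kiefer"},
    ],
}

pairs = ordered_pairs(person)
```

Entries that share a sequenceID keep whatever order the input array happened to have, which is exactly the ambiguity noted in the results list.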

Reification

We could use RDF reification to express these semantics. This example uses @graph to provide a pile of objects. link

{
  "@context": {
    "nc": "http://release.niem.gov/niem/niem-core/5.0/",
    "structures": "http://release.niem.gov/niem/structures/5.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "@base": "http://example.org/instance/"
  },
  "@graph": [
    { 
      "@type": "rdf:Statement",
      "rdf:subject": "person_1",
      "rdf:predicate": "nc:PersonGivenName",
      "rdf:object": "Kiefer",
      "structures:sequenceID": "1"
    },
    { 
      "@type": "rdf:Statement",
      "rdf:subject": "person_1",
      "rdf:predicate": "nc:PersonGivenName",
      "rdf:object": "William",
      "structures:sequenceID": "2"
    }
  ]
}

This yields the RDF:

_:b0 <http://release.niem.gov/niem/structures/5.0/sequenceID> "1" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "Kiefer" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> "nc:PersonGivenName" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> "person_1" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
_:b1 <http://release.niem.gov/niem/structures/5.0/sequenceID> "2" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "William" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> "nc:PersonGivenName" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> "person_1" .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .

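This encoding handles the data requirements perfectly, but the JSON is terrible, and there's no reason to think that the resulting RDF will be useful to anyone for reasoning. Note also that, as written, rdf:subject and rdf:predicate come out as string literals ("person_1", "nc:PersonGivenName") rather than IRIs; to produce IRIs they would need to be given as node references such as { "@id": "person_1" }.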
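The verbosity is easy to see when the reified form is generated mechanically. The sketch below builds rdf:Statement objects from an ordered list of (predicate, value) pairs. The function name `reify` is hypothetical, and the use of `{"@id": ...}` for subject and predicate is a choice made in this sketch so that they expand to IRIs.

```python
import json

def reify(subject, ordered_properties):
    """Build one rdf:Statement object per property occurrence, numbered in order."""
    statements = []
    for index, (predicate, value) in enumerate(ordered_properties, start=1):
        statements.append({
            "@type": "rdf:Statement",
            # {"@id": ...} marks these as node references, not string literals.
            "rdf:subject": {"@id": subject},
            "rdf:predicate": {"@id": predicate},
            "rdf:object": value,
            "structures:sequenceID": str(index),
        })
    return statements

graph = reify("person_1", [
    ("nc:PersonGivenName", "Kiefer"),
    ("nc:PersonGivenName", "William"),
    ("nc:PersonSurName", "Sutherland"),
])

document = json.dumps({"@graph": graph}, indent=2)
```

Each property occurrence turns into a five-key object, which is the cost being objected to above.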

Findings

Characteristics of a good representation for sequenceID:

  1. Some properties of an object must have a fixed relative order.
  2. The order of a property is a characteristic of an occurrence of a property and its relationship to other properties.
  3. The order of a property is not a characteristic of the value of the property.
  4. Different properties (e.g., given name and surname) must be ordered relative to each other. Order is not only a characteristic of a single property name.
  5. Orders are partial; properties are not provided as a single list, as the interpretation does not disambiguate properties with the same value of sequenceID.
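The partial-order point can be made concrete with a small sketch. Property occurrences are grouped into tiers by sequenceID: tiers are ordered relative to each other, while occurrences inside a tier are left as an unordered set. The tuple layout and the name `order_groups` are assumptions for illustration.

```python
from itertools import groupby

def order_groups(occurrences):
    """Group (sequenceID, property, value) occurrences into ordered tiers."""
    ranked = sorted(occurrences, key=lambda occurrence: occurrence[0])
    return [
        {(prop, value) for (_, prop, value) in group}
        for _, group in groupby(ranked, key=lambda occurrence: occurrence[0])
    ]

# Hypothetical occurrences: two given names share sequenceID 1.
occurrences = [
    (2, "nc:PersonSurName", "Sutherland"),
    (1, "nc:PersonGivenName", "Kiefer"),
    (1, "nc:PersonGivenName", "William"),
]

tiers = order_groups(occurrences)
```

The two given names land in the same tier, so nothing says which comes first, while both are known to precede the surname.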

We have not yet found a good way to order properties in JSON-LD/RDF that shares the semantics established by sequenceID.