
JSON Type Notation

Wesley Miaw edited this page Jul 15, 2016 · 4 revisions

JSON Type Notation (JTON) provides a lightweight way to specify the data types of an object. Typically the object will be represented as JSON, but this is not strictly necessary. Like JSON, JTON is easy for humans to read and write and easy for machines to parse and generate. JTON type values are specified in (a small subset of) JavaScript. The JTON for a type mirrors the structure of the values of that type, so a JTON object looks a lot like an example of the type it specifies. People like examples.

JTON does not attempt to encompass all possible types. This is in the interest of simplicity. Nevertheless, JTON is powerful enough that it should not be overly constraining.

In the following, the term type specifier refers to a JTON object and the term value refers to a value that conforms to a type specifier.

Why JSON?

JSON vs. XML is a perennial debate. Aside from just following the crowd there is essentially one reason to choose JSON: simplicity. The JSON specification is a single, simple web page and pretty much completely defines the format. The simplicity of JSON is beautiful. Compare this to the many hundreds of pages of complex XML specifications (OK, that’s not quite a fair comparison, but you get the point).

The simplicity of JSON has practical implications for software tools. JSON software tools are themselves simple, providing a basic set of functionality which is common across languages and implementations. Compare this to XML tools, where the many possible levels of complexity mean that every tool or library provides a slightly different level of functionality.

A big advantage of XML, however, is the possibility to formally define data types using XML schemas. Formal data type definitions enable automatic validation of encoded data, reduce programming errors based on type mismatches (for example programmers making incorrect assumptions about the value range of numbers) and provide a precise way for engineers to communicate – for example when discussing new proposals. Automatic validation has security advantages and facilitates testing and earlier discovery of bugs. Data binding tools can be used to generate code from XML schemas providing programmers with an easy-to-use object interface to the XML data.

JSON reduces the need for data binding (schema based code generation): JSON software tools de-serialize and serialize to/from easy-to-use object representations of the data directly. If those objects have been validated against the expected incoming data format, then they are about as robust and easy to use as custom-generated data binding classes.

In an attempt to obtain the simplicity benefits of JSON together with the benefits of formal data type definitions we describe here a type notation for JSON. To meet the objective, this has to be as simple as JSON itself – simply re-writing XML schemas in JSON (for example see JSON Schema) or adapting RELAX NG doesn’t cut it (IMO).

JSON Type Notation (JTON)

Objects

The type specifier for an object is a JSON object, called a type object. Each member of the type object specifies a possible member of a value object, except that member names beginning with ‘#’ are reserved for additional type information. The values of the members of the type object are the type specifiers for the members of the value object (read it twice).

The following special member names are defined:

Member Name Value
#mandatory A list of object members whose presence is mandatory.
#defaults An object specifying default values for some or all of the members.
#extensible A boolean indicating whether the type is extensible. When validating an extensible object type, unrecognized members are allowed and ignored. The default is true.
#all A type specifier to which all members of the object without explicit type specifiers must conform.
#conditions A list of conditions to which the members of the object must be compliant. The syntax for a condition is:
condition
    member-name
    'member-name'
    condition and condition
    condition or condition
    condition xor condition
    not condition
    ( condition )
The member-name and 'member-name' conditions are true if the member referred to is present. The other choices have their obvious meaning. Operators have the following precedence: not > and > (or, xor). Operators of equal precedence are left associative. Brackets “(”, “)” may be used to control evaluation order.
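The condition grammar and precedence rules above can be sketched as a small recursive-descent evaluator. This is only an illustration, not a reference implementation: the function names tokenize and evaluateCondition are hypothetical, and the sketch assumes that quoted names use straight single quotes and contain no quote characters themselves.

```javascript
// Hypothetical helper names; not part of JTON itself.
const KEYWORDS = new Set(["and", "or", "xor", "not"]);

// Split a condition into tokens: quoted names, brackets, keywords, bare names.
// Stray characters (such as an unmatched quote) are skipped, not diagnosed.
function tokenize(text) {
  const tokens = [];
  for (const m of text.matchAll(/'([^']*)'|[()]|[^\s()']+/g)) {
    if (m[1] !== undefined) tokens.push({ type: "name", value: m[1] });
    else if (m[0] === "(" || m[0] === ")" || KEYWORDS.has(m[0])) tokens.push({ type: m[0] });
    else tokens.push({ type: "name", value: m[0] });
  }
  return tokens;
}

// Evaluate a condition against a value object.
// Precedence: not > and > (or, xor); equal precedence is left associative.
function evaluateCondition(text, value) {
  const tokens = tokenize(text);
  let pos = 0;
  const peek = () => (pos < tokens.length ? tokens[pos].type : "end");

  function parseOrXor() {
    let left = parseAnd();
    while (peek() === "or" || peek() === "xor") {
      const op = tokens[pos++].type;
      const right = parseAnd(); // always parse the operand before combining
      left = op === "or" ? left || right : left !== right;
    }
    return left;
  }
  function parseAnd() {
    let left = parseNot();
    while (peek() === "and") {
      pos++;
      const right = parseNot();
      left = left && right;
    }
    return left;
  }
  function parseNot() {
    if (peek() === "not") {
      pos++;
      return !parseNot();
    }
    return parsePrimary();
  }
  function parsePrimary() {
    if (peek() === "(") {
      pos++;
      const inner = parseOrXor();
      if (peek() !== ")") throw new SyntaxError("expected )");
      pos++;
      return inner;
    }
    if (peek() === "name") {
      const name = tokens[pos++].value;
      // A name is true when the member is present on the value object.
      return Object.prototype.hasOwnProperty.call(value, name);
    }
    throw new SyntaxError("unexpected token: " + peek());
  }

  const result = parseOrXor();
  if (pos !== tokens.length) throw new SyntaxError("unexpected trailing input");
  return result;
}
```

With this reading, the quoted form is what lets a condition refer to a member whose name is itself an operator word, for example 'and'.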

Arrays

The type specifier for a variable-length array is a JSON array with one element. The single element gives the type specifier for array elements.

An array with a fixed number of elements greater than 1 (i.e. a tuple) can be specified as a JSON array with that number of elements, each element giving the type specifier for the corresponding element in the array.
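The two array forms could be checked along the following lines. This is a minimal sketch: matchesArray and matchesBasic are hypothetical names, the element check is a stand-in that only understands the bare specifiers number, string and boolean, and rejecting an empty specifier array is an assumption, since the text does not cover that case.

```javascript
// Stand-in element check (assumption): handles only the bare
// specifiers "number", "string" and "boolean" via typeof.
function matchesBasic(spec, value) {
  return typeof value === spec;
}

// Check a value against an array type specifier.
// matchesElement(spec, value) decides whether one element conforms.
function matchesArray(spec, value, matchesElement) {
  if (!Array.isArray(spec) || !Array.isArray(value)) return false;
  if (spec.length === 0) return false; // not defined by the text; rejected here
  if (spec.length === 1) {
    // Variable-length array: every element must match the single specifier.
    return value.every((item) => matchesElement(spec[0], item));
  }
  // Tuple: same length, each position checked against its own specifier.
  return (
    value.length === spec.length &&
    spec.every((itemSpec, i) => matchesElement(itemSpec, value[i]))
  );
}
```

Under this sketch, [ "number" ] accepts [1, 2, 3] as well as the empty array, while [ "string", "number" ] accepts only two-element arrays such as ["a", 1].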

Basic Types

The type specifier for a basic type is a string beginning with one of the following values:

  • number
  • integer
  • string
  • hex
  • binary
  • boolean
  • enum

The number and integer specifiers can optionally be postfixed with a range qualifier:

range-qualifier
    ( number-or-dash, number-or-dash ) 

number-or-dash
    -
    number

The numbers specify the minimum and maximum values, where ‘-’ means unbounded.
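As an illustration, a range-qualified specifier such as integer(0,100) might be parsed and applied as below. The names parseNumeric and matchesNumeric are hypothetical, bounds are treated as inclusive (the text calls them minimum and maximum values), and only plain decimal literals are accepted as bounds, which is an assumption of this sketch.

```javascript
// Parse "number", "integer", or either followed by a range qualifier.
function parseNumeric(spec) {
  const bound = "(-?\\d+(?:\\.\\d+)?|-)"; // a decimal number, or "-" for unbounded
  const pattern = new RegExp(
    "^(number|integer)(?:\\(\\s*" + bound + "\\s*,\\s*" + bound + "\\s*\\))?$"
  );
  const m = pattern.exec(spec);
  if (!m) throw new SyntaxError("not a numeric type specifier: " + spec);
  const toBound = (text, unbounded) =>
    text === undefined || text === "-" ? unbounded : Number(text);
  return {
    integer: m[1] === "integer",
    min: toBound(m[2], -Infinity),
    max: toBound(m[3], Infinity),
  };
}

// Check a value against a numeric type specifier (bounds inclusive).
function matchesNumeric(spec, value) {
  const { integer, min, max } = parseNumeric(spec);
  if (typeof value !== "number" || !Number.isFinite(value)) return false;
  if (integer && !Number.isInteger(value)) return false;
  return value >= min && value <= max;
}
```

For example, matchesNumeric("integer(0,-)", 5) accepts the value because only the lower bound is set, while matchesNumeric("integer(0,100)", 50.5) rejects it for not being an integer.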

Additional symbols int16, int32, int64, uint16, uint32, uint64 and double are defined as shorthand for integer or number with the corresponding range (and accuracy) restrictions (as per their usual meanings from C++).

The string, binary, and hex specifiers can optionally be postfixed with a length qualifier:

length-qualifier
    ( number )
    ( number, number-or-dash )

The first form specifies a fixed length, the second form the minimum and maximum length. The length for string is the number of Unicode characters, for hex the number of hex characters, and for binary the number of octets. hex values are not case-sensitive.
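The length rules for string and hex can be sketched as follows. The names parseLength and matchesLength are hypothetical, "Unicode characters" is interpreted here as Unicode code points, and binary is left out because the text does not say how binary values are encoded in JSON.

```javascript
// Parse "string" or "hex", optionally followed by a length qualifier.
function parseLength(spec) {
  const m = /^(string|hex)(?:\(\s*(\d+)\s*(?:,\s*(\d+|-)\s*)?\))?$/.exec(spec);
  if (!m) throw new SyntaxError("not a string or hex type specifier: " + spec);
  const [, kind, first, second] = m;
  if (first === undefined) return { kind, min: 0, max: Infinity }; // no qualifier
  const min = Number(first);
  if (second === undefined) return { kind, min, max: min }; // fixed length
  return { kind, min, max: second === "-" ? Infinity : Number(second) };
}

// Check a value against a string or hex type specifier.
function matchesLength(spec, value) {
  const { kind, min, max } = parseLength(spec);
  if (typeof value !== "string") return false;
  // hex values are not case-sensitive, so both cases are accepted.
  if (kind === "hex" && !/^[0-9a-fA-F]*$/.test(value)) return false;
  // Array.from iterates by code point, so astral characters count once.
  const length = Array.from(value).length;
  return length >= min && length <= max;
}
```

Counting with Array.from rather than the string's length property matters for characters outside the Basic Multilingual Plane, which JavaScript stores as two UTF-16 code units.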

The enum specifier must be postfixed with an enum value list:

enum-value-list
    ( enum-values )

enum-values
    enum-value
    enum-value | enum-values

enum-value
    enum-char
    enum-char enum-value

enum-char
    any char except | or )
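The enum grammar above amounts to a non-empty list of non-empty tokens separated by |. A minimal sketch of parsing and checking it follows; parseEnum and matchesEnum are hypothetical names, and treating enum values as JSON strings is an assumption of this sketch.

```javascript
// Parse "enum(a|b|c)" into its list of allowed tokens.
// Each token is one or more characters other than "|" and ")".
function parseEnum(spec) {
  const m = /^enum\(([^|)]+(?:\|[^|)]+)*)\)$/.exec(spec);
  if (!m) throw new SyntaxError("not an enum type specifier: " + spec);
  return m[1].split("|");
}

// Check a value against an enum type specifier.
function matchesEnum(spec, value) {
  return typeof value === "string" && parseEnum(spec).includes(value);
}
```

Because any character other than | or ) is allowed, whitespace inside a token is significant: enum(a b|c) defines the two tokens "a b" and "c".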

The (tbd) date specifier indicates that a value has the format of an ISO 8601 date or combined date and time string.
The (tbd) url specifier indicates that the value must be a URL (although this is not saying much).

Any

The type specifier any matches any type.

Choice

A type specifier which is an object with a single member, ‘#choice’, specifies a choice type. The value of the #choice member must be an array containing a list of alternative type specifiers. A value is valid according to this type specifier if it is valid according to any one of the type specifiers in the array.
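A choice check might look like the sketch below. The names isChoice, matchesChoice and matchesBasic are hypothetical, and the alternative check passed in is a stand-in that only understands the bare specifiers number, string and boolean. "Any one of" is read as "at least one of".

```javascript
// Stand-in check for alternatives (assumption): handles only the bare
// specifiers "number", "string" and "boolean" via typeof.
function matchesBasic(spec, value) {
  return typeof value === spec;
}

// A choice type is an object whose single member is "#choice", holding an array.
function isChoice(spec) {
  if (spec === null || typeof spec !== "object" || Array.isArray(spec)) return false;
  const keys = Object.keys(spec);
  return keys.length === 1 && keys[0] === "#choice" && Array.isArray(spec["#choice"]);
}

// A value is valid if at least one alternative accepts it.
function matchesChoice(spec, value, matchesAlternative) {
  if (!isChoice(spec)) throw new TypeError("not a choice type specifier");
  return spec["#choice"].some((alternative) => matchesAlternative(alternative, value));
}
```

For instance, matchesChoice({ "#choice": ["number", "string"] }, "abc", matchesBasic) accepts the value through the second alternative.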

Example

score = { "testid" : "string", "result" : "integer(0,100)" }

student =
{   "#mandatory"  : [ "name", "gender", "dob" ],
    "#extensible" : true,
    "name"        : "string",
    "gender"      : "enum(male|female)",
    "height"      : "integer(0,-)",
    "dob"         : "date",
    "password"    : "hex(16)",
    "homepage"    : "url",
    "id"          : { "#choice" : [ "uint32", "hex(8)" ] },
    "sat"         : "double",
    "testscores"  : [ score ]
}
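To show how the object-level members of such a type object might be applied, here is a minimal sketch that checks only #mandatory and #extensible against a trimmed copy of the student type. The names studentType, strictType and checkObjectShape are hypothetical; member types, #all, #defaults and #conditions are deliberately not checked.

```javascript
// Trimmed copy of the student type from the example above.
const studentType = {
  "#mandatory": ["name", "gender", "dob"],
  "#extensible": true,
  "name": "string",
  "gender": "enum(male|female)",
  "dob": "date",
  "height": "integer(0,-)",
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Return a list of object-level problems; an empty list means none were found.
// Only member presence is examined, not the member values.
function checkObjectShape(typeObject, value) {
  const errors = [];
  for (const name of typeObject["#mandatory"] || []) {
    if (!has(value, name)) errors.push("missing mandatory member: " + name);
  }
  // #extensible defaults to true when absent.
  const extensible = has(typeObject, "#extensible") ? typeObject["#extensible"] : true;
  for (const name of Object.keys(value)) {
    // Names beginning with "#" are reserved, so they never count as declared.
    const declared = !name.startsWith("#") && has(typeObject, name);
    if (!declared && !extensible) errors.push("unexpected member: " + name);
  }
  return errors;
}

// Same type, but closed to unrecognized members (variant made up for illustration).
const strictType = Object.assign({}, studentType, { "#extensible": false });
```

With these definitions, a value carrying an extra nickname member passes against studentType but is reported against strictType, and a value without dob is reported against both.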

Constraints

JTON places the following constraints on the JSON types it can specify:

  • All elements of an array must have the same type, or the array must be fixed length (and, in practice, small, i.e. a tuple).
  • Enumeration tokens may not include the characters ‘)’ or ‘|’.
  • Object member names may not start with ‘#’.

These constraints are either trivial or just exclude things which should probably be avoided anyway as they could lead to confusion.

JTON is unable to express complex constraints, such as the presence of one member being conditional on the presence or value of another. These things need to be validated by the user.
