Skip to content

Tilda JSON Syntax: View Columns

Laurent Hasson edited this page Aug 14, 2019 · 24 revisions

<-- View Syntax

Columns

Views contain columns as a baseline for their definition. That’s pretty obvious. Although many elements are shared with Object columns, view columns are very different: for example, they allow aggregates, expressions, are picked from a source column in either another view or a root table etc... A view column looks like the following:

// A regular view column
{ "sameAs":"someSchema.SomeTable.someColumn" // let's assume it's a date time column
 ,"name": "someColumnX"
 ,"as":"_2"
 ,"description":"Type"
 ,"formulaOnly": true|false
 ,"joinOnly": true|false
}

// An aggregate view column
{ "sameAs":"someSchema.SomeTable.someColumnInt" 
 ,"name": "someColumnIntX"
 ,"description":"ColumnInt X blah blah blah"
 ,"aggregate": "SUM"|"AVG"|"MIN"|"FIRST"|"MAX"|"LAST"|"DEV"|"VAR"|"COUNT"|"ARRAY"
 ,"orderBy":["col1","col2"]
 ,"coalesce":'1111-11-11'
 ,"distinct": true|false
 ,"filter": "\"colA\" > 10"
}

// An expression view column
{ "sameAs":"someSchema.SomeTable.someColumnDateTime" // let's assume it's a date time column
 ,"name": "someColumnDateTimeX"
 ,"description":"ColumnDate X blah blah blah"
 ,"expression":"?::DATE" // convert to a DATE
 ,"type": "DATE" // new type
 ,"size": 2 // only for STRING type expressions
}

// A .* view column
{ "sameAs":"someSchema.SomeTable2.*"
 ,"prefix":"abc_"
 ,"exclude":["col1","col2"]
 ,"block":["colA","colB"]
}

The fields are:

  • name: the mandatory name of the column.

    • Tilda uses isomorphic names across all the programming models it works with (i.e., SQL, JSON, JavaScript, Java). As such the most common variable naming convention (starting with a lower case and use camel-case) is preferred. This often clashes with DB-oriented teams who prefer all lower-case identifiers with underscores. Naming conventions are hard! If your project is mainly application-driven, then a Java/JavaScript naming convention would be preferred, but if you use Tilda mostly for data warehousing or database-centric work, then the more common naming convention in those environment would obviously be preferred. In general, it's a god thing to define a naming convention at the schema-level at the very least and stick to it.
    • Column names are case-sensitive across most programming environments. However, within a single table, duplicates are checked in a case-insensitive way (for example, “helloWorld“and “helloworld” are treated as equivalent when checking for column definition duplicates. Some databases use case-insensitive names by default.
    • Column names must abide to identifier syntax rules, i.e., they must start with a letter or an underscore, and must only contain letters, digits or underscores (no spaces, no dash or other such characters).
  • type: A type for the column. Tilda uses a generic type system. The values can be:

    • STRING: a string value
      • String values must also define a “size” attribute with a value > 1. The framework typically implements a threshold depending on the database to decide whether to store a CHAR of VARCHAR column. In Postgres, that threshold is 8, i.e., STRINGs less than 8 characters are stores as char(n), whereas STRINGs 8 character or more are stored as VARCHAR(n). In SQLServer, that threshold is 20. At this time, this is not changeable.
    • JSON: a JSON data structure (which may be stored as a text field in databases that do not support native JSON storage
    • CHAR: a single character
    • INTEGER: a 32-bit integer
    • LONG: a 64-bit integer
    • FLOAT: a 32-bit floating point value
    • DOUBLE: a 64-bit floating point value
    • BOOLEAN: a Boolean value (which may be stored as an integer with the values 0/1 in databases that do not support Boolean values)
    • DATE: a date
    • DATETIME: a date and time, with timezone if supported by the DB
    • BINARY: a binary object
    • BITFIELD: up to 64 bit values

    Columns can also be defined as collections by appending to the type name “[]” for arrays/lists (sequential values), and “{}” for sets (i.e., unique values, non-sequential).

    • For databases that don’t support arrays, a comma-separated list of values in a text field is used and serialization/deserialization is done in the generated Java code. Any database-level logic (ETL or stored procedures and so on) must implement support appropriately by using stock Tilda functions.
    • For sets, as far as we know, no database support them natively as column types. As such, the “deduping” (removing duplicates), is performed in the generated application code at serialization/deserialization time. Any database-level logic (ETL or stored procedures and so on) must implement support appropriately by using stock Tilda functions.
  • nullable: optional, true by default, whether the column is nullable or not.

  • mode: optional, NORMAL by default, the mode for the column. The following values are allowed:

    • NORMAL: a normal column.
    • AUTO: a column that is wholly managed by the application code and cannot be set/get directly. This can be used for caching or hidden values.
    • CALCULATED: a column that only exists in the application space (doesn’t exist in the database), and can represent some client-side calculations.
  • protect: optional, ABSOLUTE by default, denotes the protection/security mode for the column. This is settable only for STRING columns. This is typically used for contents that is client-based and is re-rendered back on the client (i.e., the Browser). Allowed values are:

    • NONE: no protection. What goes in is what goes out.
    • SMART: will parse input data and eliminate injection opportunities in JavaScript.
    • ABSOLUTE: replaces all ‘<’ characters as ‘<’
  • invariant: whether the column can be updated after being initially set. This means the column can be set when a record is created, but cannot be updated afterwards, i.e., no setter is available in the application space.

  • description: a meaningful description of the column, which could contain plain HTML, and which will be output in the automatically generated documentation (JavaDocs as well as HTML SchemaDocs).

  • values: an optional set of name/value pairs for values allowed for the column. See the next section for details on this element.

  • default: a default value for a column. The value must match the type of the column and is generally used for non-nullable fields, although the framework won't scream at you if you use it for nullable fields too.

  • jsonSchema: an optional json definition of the schema for that field. See the next section for details on this element.

  • sameAs: A column can be defined in terms of another column. This is useful for a variety of things:

    • This can help make the database more consistent with related fields linked together across tables and schemas, for example, an addressLine1 field which would be a String of 255 characters across the board.
    • This is also the base construct to declare a foreign key by linking the definition of a column to that of the column for the primary key in a parent table.

    When declaring a sameAs, a number of fields are automatically inherited from the linked column:

    • type: cannot be changed
    • size: cannot be changed
    • nullable: can be overridden (i.e., it’s possible to have a column in the parent table be not-null, while a child table would have a nullable foreign key).
    • invariant: same as for nullable.
    • mode: cannot be changed
    • protect: cannot be changed
    • values: cannot be changed
    • description: can be changed to reflect the semantics of the column in the child table. For example, a primary key field may be described as “Surrogate key for the Patient table”, whereas in the child table, it would be “Foreign key to the Patient table”.
    • jsonSchema: cannot be changed

Values

A column can define a set of values that are allowed, i.e., an enumeration type. This is only setting constants and making it easier to generate client-side artifacts (e.g., drop downs), or help abstract code from constant values. No constraint is generated to limit values for this column in the database itself.

"values":[
   { "name": "Org" , "value": "O", "description": "Organization" }
  ,{ "name": "Fac" , "value": "F", "description": "Facility"
    ,"label":"Facility", "groupings":["Facilities"]
   },
  ,{ "name": "Clin", "value": "C", "description": "Clinics"
    ,"label":"Clinics", "groupings":["Facilities"]
   },
  ,{ "name": "Other", "value": "X", "description": "Other"
    ,"default":"CREATE"
   },
]

The fields are:

  • name: a mandatory partial name of the constant generated. If the column names is called “xxx” and the value is named “Abc”, the final variable generated will be called “xxxAbc”, which is the reason why contrary to column names, value names start with an upper case.
  • value: the mandatory value to be stored in the DB. The type of the values must be compatible with the type of the column, i.e., you cannot have a string value for an integer column.
    • For DATE and DATETIME columns, values of NOW (current date/timestamp), UNDEFINED (a system value of Jan 1st year 1 at 0h:0mn:0s:0ms).
    • For DATE, a an explicit timestamp of the format yyyy-MM-dd can be used.
  • description: a mandatory description for the value
  • label: an optional label for the value, to be used in front end ui for example in a drop down. If a label isn’t supplied, the value of “name” will be used.
  • groupings: a list of string that can be used in the UI to subset the list of values.
  • default: CREATE, UPDATE, NONE, ALWAYS, whether the value should be used as default at creation time, update time, or always. If unspecified, NONE is used. This allows to create columns with default values.

🎈 NOTE: Names, Values and Labels should be unique in the list.

🎈 NOTE: Only one value can be made as default CREATE, UPDATE or ALWAYS.

jsonSchema

A column that contains JSON data may also define a mini-schema. Code-side artifacts will be generated to access that data at runtime. The data can be extended easily, i.e., add new elements etc… However, on the code-side only fields defined here will be accessible unless the developer implements custom json logic. Typically, the json definition will be evolved naturally as part of the application design, without affecting the database table structure (i.e., schema-less), making incremental improvements a breeze without requiring complex redesigns or migrations (i.e., the promise of most NoSQL systems).

  • If the database targeted does not support JSON as a data type, a text field will be used, and only code-side will have json capabilities (i.e., database level artifacts such as external ETL processes, user defined functions or stored procedures will access the field as a plain text field and will need to implement their own json logic).
  • If the targeted database supports JSONB, that type will be preferred over text-based plain JSON.
  • PostgreSQL is a particularly great database to manage JSON payloads and we have implemented very large tables (billions of rows) with essentially a primary key, some fields related to the natural key of the object, and then a single column containing a flexible JSON payload.
// A JSON column with a JSON schema type definition.
{ "name":"data"
 ,"type":"JSON"
 ,"description":"Some JSON Data"
 ,"jsonSchema": {
     "typeName": "ClinicalRule"
    ,"descr": "blah..."
    ,"fields": [
	 {"name":"drgs" , "type":"STRING[]", "description":"DRG codes" }
	,{"name":"icd9s", "type":"STRING[]", "description":"ICD9 dx codes" }
      ]
    ,"validation": {
         "rule": [
            "   drgs   IS NOT NULL AND len(drgs  ) > 0"
           ,"OR icd9s  IS NOT NULL AND len(icd9s ) > 0"
          ]
        ,"descr": "At least one DRG or ICD9 value must be supplied"
      }
   }
}

The elements are as follows:

  • typeName: the name of top-level element for this field when a full JSON is exported.
  • descr: a description for the json data type
  • fields: the list of fields, similar to columns in an object definition, with the field's name, type, size, nullable, and description.
  • validation: the definition of rules (as an array of strings to make it easier to capture more complex multi-line rules) to validate whether the json data is ok or not.
    • rule: An expression that can be run at runtime in the generated code. This formula uses SQL with a few generalized functions for data processing. See the next chapter for details on the formula language.
    • descr: the description of what the formula does.
Clone this wiki locally