Tilda JSON Syntax: Object And View Output Maps
Tilda is used in web apps or other data integration environments and provides the ability to export data as JSON and/or CSV.
// Within an Object definition, you can define any number of maps (or none).
"outputMaps":[
// An output targeted at a model: 4 columns are output, normalized to DOUBLE, and
// provided in column form (i.e., flat output like NVP or JSON for example).
{ "name": "ModelX", "columns": ["genderM", "genderF", "age0to64","age65andolder"], "outTypes":["NVP"]
,"nvpSrc":"COLUMNS", "nvpValueType": "DOUBLE"
}
// An output map for a Web client for example using JSON and sync'ed. Assuming the object has columns
// "colA", "colB", "colC", "colD", "colE", "colF" and "colG", the "col*" below will capture all those
// columns automatically and will update automatically if new columns are added to the object. JSON being
// a permissive format, this is OK.
,{ "name": "Simple", "columns": ["refnum", "col*", "created", "lastUpdated", "deleted"]
,"outTypes":["JSON"], "sync":true
}
// An output map for a data dump as CSV. Here, we expect some consumer of the CSV and we want to fix the
// output format: adding new columns, or changing the column order, will almost certainly break a client
// parsing out the CSV file and expecting exact columns. So, contrary to the JSON definition, we will
// explicitly name the columns. Any updates will be a conscious effort for that mapping.
,{ "name": "DataFile", "columns": ["refnum", "colA", "colB", "colC", "colD", "colE", "created", "lastUpdated"]
,"outTypes":["CSV"]
}
]
The fields are:
- name: The name of the mapping which will be used to generate code methods for the application. Names must be unique for a given object.
- columns: A list of 1 or more columns to be managed by the mapping, supporting leading or trailing * such as "xyz_*" to capture all columns starting with "xyz_", or "*_dt" to capture all columns ending in "_dt".
- sync: A boolean on whether to generate output appropriate for sync protocols. Code generated will take a sync token (a timestamp since the last sync) and will mark all records with an additional column "__sync" as New, Updated, or Deleted, and will omit untouched records.
- outTypes: The format(s) of the output: JSON, CSV, or NVP. At least one must be specified.
- nvpSrc: The output format for the data: ROWS or COLUMNS. This is mandatory if the output type is NVP. Specifying "ROWS" dynamically unpivots the data, in which case only 2 columns should be provided in the column list.
- nvpValueType: A type (as per the standard Tilda Type System) to unify all the output columns if outTypes=NVP and nvpSrc=COLUMNS. All columns named must be compatible with that single type: for example, if a column type is INTEGER, it can be reliably converted to STRING, but the reverse is not true.
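To make the leading/trailing "*" behavior of the columns field concrete, here is a minimal Python sketch of how such patterns could be expanded against an object's column list. It is not Tilda's implementation: the function names `matches` and `expand_columns`, the output ordering, and the de-duplication rule are all assumptions made for illustration.

```python
def matches(pattern, column):
    """Match a column name against a pattern with an optional leading or trailing '*'."""
    if pattern.endswith("*"):
        return column.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return column.endswith(pattern[1:])
    return column == pattern


def expand_columns(patterns, object_columns):
    """Expand patterns in order; each column is kept once (assumed rule, not confirmed by Tilda)."""
    expanded = []
    for pattern in patterns:
        for column in object_columns:
            if matches(pattern, column) and column not in expanded:
                expanded.append(column)
    return expanded


# Hypothetical object columns, loosely modeled on the "Simple" map above.
object_columns = ["refnum", "colA", "colB", "colC", "start_dt", "end_dt",
                  "created", "lastUpdated", "deleted"]

simple = expand_columns(["refnum", "col*", "created", "lastUpdated", "deleted"],
                        object_columns)
dates = expand_columns(["*_dt"], object_columns)
```

In this sketch, adding a "colD" column to `object_columns` would make it appear in `simple` with no change to the pattern list, which is the behavior the "Simple" map relies on.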
The "CSV" and "JSON" output formats are standard, text-based, and behave as expected. The "NVP" output format, however, seems to be unique to Tilda and needs a more detailed explanation. We have used Tilda in Analytics environments and needed to standardize programmatically how to feed data into models. The decision was to feed either a vector (COLUMNS as List<Map<String,TildaType>>) or a list of rows (ROWS as List<Pair<String,TildaType>>) with the appropriate Java data structure. In general, analytical models will be fed a feature vector of data in one way or another, likely normalized as a number type (i.e., INTEGER or DOUBLE).
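The two NVP shapes can be illustrated with a small Python sketch. This is one reading of the description above, not Tilda's generated code: the function names and the "feature"/"value" column names are hypothetical, and `float` stands in for a DOUBLE nvpValueType.

```python
def nvp_from_columns(records, columns):
    """nvpSrc=COLUMNS (assumed reading): one name->value map per record, unified to float."""
    return [{name: float(record[name]) for name in columns} for record in records]


def nvp_from_rows(records, name_column, value_column):
    """nvpSrc=ROWS (assumed reading): each record supplies one (name, value) pair."""
    return [(record[name_column], float(record[value_column])) for record in records]


# Column form: each record already carries the features as separate columns.
wide = [{"genderM": 1, "genderF": 0, "age0to64": 1, "age65andolder": 0}]
vectors = nvp_from_columns(wide, ["genderM", "genderF", "age0to64", "age65andolder"])

# Row form: exactly two columns, one holding the name and one holding the value.
tall = [{"feature": "genderM", "value": 1}, {"feature": "age0to64", "value": 1}]
pairs = nvp_from_rows(tall, "feature", "value")
```

The conversion to a single type mirrors the role of nvpValueType: every listed column has to be convertible to the one type the model expects.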
🎈 NOTE: Only objects with OCC turned on (the default, i.e., with "created", "lastUpdated" and "deleted" timestamps) can have synced output.
🎈 NOTE: At this time, sync is only supported for CSV and JSON output formats.
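As a rough illustration of the sync behavior, the following Python sketch marks records relative to a sync token using the three OCC timestamps and omits untouched records. The function name `sync_output`, the exact "__sync" values, and the precedence between Deleted, New, and Updated are assumptions; the code Tilda generates may differ.

```python
from datetime import datetime


def sync_output(records, sync_token):
    """Return only records touched after sync_token, each tagged with "__sync"."""
    out = []
    for record in records:
        deleted = record["deleted"]
        if deleted is not None:
            if deleted <= sync_token:
                continue  # deleted before the last sync: assumed already known to the client
            status = "Deleted"
        elif record["created"] > sync_token:
            status = "New"
        elif record["lastUpdated"] > sync_token:
            status = "Updated"
        else:
            continue  # untouched since the last sync: omitted
        out.append({**record, "__sync": status})
    return out


token = datetime(2024, 1, 10)
records = [
    {"refnum": 1, "created": datetime(2024, 1, 1), "lastUpdated": datetime(2024, 1, 1), "deleted": None},
    {"refnum": 2, "created": datetime(2024, 1, 12), "lastUpdated": datetime(2024, 1, 12), "deleted": None},
    {"refnum": 3, "created": datetime(2024, 1, 1), "lastUpdated": datetime(2024, 1, 11), "deleted": None},
    {"refnum": 4, "created": datetime(2024, 1, 1), "lastUpdated": datetime(2024, 1, 15), "deleted": datetime(2024, 1, 15)},
]
changes = sync_output(records, token)
```

A client following this pattern would store the time of each successful sync and pass it back as the token on the next call.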
🎈 NOTE: The '*' feature for columns is both a blessing and a potential source of surprise. In many projects using strong naming conventions, it's useful to be able to mark a bunch of columns that way: it allows later updates from source tables/views (even those in components you might use indirectly) to flow automatically to outputs without any code change. That enables a key aspect of Tilda's models: acting as automated intermediaries between the database and a web client, for example. On the other hand, you may mistype a column name, which will generate no match. The Tilda compiler will only warn you if no match at all is generated by your column list, as a mapping requires at least one column to be included.
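Because a single mistyped entry can go unnoticed, a project could run its own per-pattern check outside of Tilda. The Python sketch below is one hypothetical way to do that; `matches` and `unmatched_patterns` are illustrative names and are not part of Tilda.

```python
def matches(pattern, column):
    """Match a column name against a pattern with an optional leading or trailing '*'."""
    if pattern.endswith("*"):
        return column.startswith(pattern[:-1])
    if pattern.startswith("*"):
        return column.endswith(pattern[1:])
    return column == pattern


def unmatched_patterns(patterns, object_columns):
    """Return every pattern that matches no column at all."""
    return [p for p in patterns
            if not any(matches(p, c) for c in object_columns)]


# Hypothetical object columns; "lastUpdate" is a deliberate typo for "lastUpdated".
object_columns = ["refnum", "colA", "colB", "created", "lastUpdated", "deleted"]
missing = unmatched_patterns(["refnum", "col*", "lastUpdate"], object_columns)
```

Here the mapping as a whole still matches several columns, so only a per-pattern check like this one would surface the typo.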
🎈 NOTE: When creating mappings, you need to consider your endpoints carefully.
- In most cases, you will absolutely want to include at least one of the object's identity columns (either PK or a unique index).
- Adding the sync protocol may be necessary for some offline applications or data feeds, but it adds to the general payload (the extra "__sync" attribute/column), takes moderately more processing power, and is more complex to handle programmatically, as you need to keep track of the "sync token".