Prototext and JSON PB support

@cueckoo cueckoo released this 03 Jul 11:13
· 2446 commits to master since this release

This release introduces a more comprehensive implementation of protobuf, supporting textproto and JSON protobuf mappings, and paving the way for binary protobuf support as well.

This release required some significant changes to the cue command logic.

A hallmark property of protobuf data formats is that they cannot be parsed without a schema (unlike JSON, for instance). The degree to which this holds varies by format, but it applies in some form to all of them. The CUE tooling had to change to accommodate this, and some of those changes are backwards incompatible.

Protobuf changes

Filetype *.textproto

CUE now supports interpreting and writing text proto files. Text proto files cannot be interpreted, or even parsed, without a schema. This means that .textproto files can only be read with an explicit schema, specified using the --schema/-d flag. Moreover, the schema must have @protobuf attributes for certain types, such as maps, to be interpreted properly.

Note that the available .textproto parsing libraries are incredibly buggy, so there will be some rough edges that are kind of out of CUE’s hands.

JSON conversion: package jsonpb and json+pb

The Protobuf documentation has a recommendation on how Proto messages should map to JSON. Package jsonpb now supports this mapping in both directions. It implements this by rewriting a CUE AST based on a given schema, a cue.Value. This allows the mapping to be combined with YAML as well, for instance.
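The idea of schema-directed rewriting can be sketched in plain Go. This is not the jsonpb package or its API: the schema type and the rewrite function are hypothetical stand-ins, a flat map from field name to proto type plays the role of the cue.Value schema, and a decoded JSON object plays the role of the CUE AST.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// schema is a hypothetical stand-in for a CUE schema: it maps a field
// name to the proto type declared for that field.
type schema map[string]string

// rewrite converts string-encoded integers back into numbers, but only
// for fields the schema declares as int64. All other fields are left as
// they are. Without the schema there is no way to tell the two apart.
func rewrite(s schema, msg map[string]any) error {
	for name, v := range msg {
		str, ok := v.(string)
		if !ok || s[name] != "int64" {
			continue
		}
		n, err := strconv.ParseInt(str, 10, 64)
		if err != nil {
			return fmt.Errorf("field %q: %w", name, err)
		}
		msg[name] = n
	}
	return nil
}

func main() {
	var msg map[string]any
	data := []byte(`{"id": "9007199254740993", "name": "7"}`)
	if err := json.Unmarshal(data, &msg); err != nil {
		panic(err)
	}
	if err := rewrite(schema{"id": "int64", "name": "string"}, msg); err != nil {
		panic(err)
	}
	fmt.Println(msg["id"], msg["name"])
}
```

Both fields arrive as JSON strings. Only the schema says that id should become an integer while name stays a string.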

On the command line this feature can be used through the pb “interpretation”, for instance as json+pb or yaml+pb. Both input and output are supported.

Note that interpretations to CUE require a schema. This can be an explicit schema, specified using the --schema/-d flag, or an implicit one, obtained by unifying the data with a schema value or by placing it within a schema using the placement flags (see cue help flags).

Backwards incompatible changes

@protobuf tag

The @protobuf attribute previously had only one required argument: the numeric tag. The type was optional and only included if its name differed from that of the CUE type. As it turned out, though, the CUE type does not always carry enough information to represent proto values. Most notably, integer values are encoded differently in JSON depending on the type of integer in the proto specification (this is not a typo!!).
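To make the integer point concrete: in the recommended Proto to JSON mapping, 64-bit integer types are written as JSON strings while 32-bit types are written as JSON numbers. The following is a minimal sketch of that rule. The helper jsonInt is a hypothetical name used for illustration and is not part of CUE, and it takes an int64 for simplicity, so it does not cover the full uint64 range.

```go
package main

import (
	"fmt"
	"strconv"
)

// jsonInt renders v the way the recommended Proto to JSON mapping does
// for the given proto scalar type: 64-bit types become JSON strings,
// everything else stays a JSON number.
// jsonInt is a hypothetical helper, not a CUE or protobuf API.
func jsonInt(protoType string, v int64) string {
	s := strconv.FormatInt(v, 10)
	switch protoType {
	case "int64", "sint64", "sfixed64", "uint64", "fixed64":
		return strconv.Quote(s)
	}
	return s
}

func main() {
	fmt.Println(jsonInt("int32", 42)) // prints 42
	fmt.Println(jsonInt("int64", 42)) // prints "42"
}
```

A CUE field of type int could correspond to either case, which is why the proto type has to be recorded in the attribute.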

The new format includes the type unconditionally as the second argument. CUE does its best to recognize old formats for backwards compatibility purposes, but the change may still cause issues.

@protobuf(<tag num>,<type>, ...options)

Protobuf options are still represented as string literals, allowing CUE options, such as alternative names (name=x), to be represented in the usual <key>=<value> format.

Another change is the representation of map types. Previously, the Protobuf map type was included verbatim (map<T,U>). This was somewhat inconvenient for parsing attributes, though: angle brackets, unlike (), [], and {}, are not matched, so the comma in the map type required escaping. To avoid this, maps are now represented as map[T]U. We contemplated using [T]:U, but opted for the map prefix for clarity and future extensibility.
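The parsing problem can be illustrated with a small argument splitter. This is a simplified stand-in written for this note, not CUE's actual attribute parser: the hypothetical splitArgs function splits on top-level commas and treats only (), [], and {} as matched pairs.

```go
package main

import "fmt"

// splitArgs splits an attribute body on top-level commas. Only (), []
// and {} are tracked as nesting pairs; angle brackets are not.
// splitArgs is a hypothetical helper, not CUE's attribute parser.
func splitArgs(body string) []string {
	var args []string
	depth, start := 0, 0
	for i, r := range body {
		switch r {
		case '(', '[', '{':
			depth++
		case ')', ']', '}':
			depth--
		case ',':
			if depth == 0 {
				args = append(args, body[start:i])
				start = i + 1
			}
		}
	}
	return append(args, body[start:])
}

func main() {
	// Old form: the comma inside <...> is seen as a separator.
	fmt.Printf("%q\n", splitArgs("2,map<string,int32>,name=x"))
	// New form: the type contains no top-level comma.
	fmt.Printf("%q\n", splitArgs("2,map[string]int32,name=x"))
}
```

With the old form the type is cut into two pieces, giving four arguments. With the new form the attribute splits cleanly into tag, type, and option.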

JSON mappings

An initial design decision for the Proto to CUE mapping was to have CUE follow the Proto to JSON mapping. In hindsight, though, this did not make much sense. Representing integers as strings is a poor fit for a language that supports expressions on such integers (unless we want to give up typing, perhaps).

The stance now is to use the representation that makes sense in CUE, and to have protobuf-specific converters between CUE and JSON. This is akin to the JSON schema and OpenAPI converters, which map CUE to some data format, using JSON or YAML, for instance, as a transport layer.

Luckily, the Protobuf to CUE mapping already deviated from the recommended mapping, always defaulting to int for integer types. So there are no changes there, other than that there is now support for following the recommended mappings for import and export.

Enum mappings

The most noteworthy backwards incompatible change is how enum types are mapped. Previously, CUE followed the recommended JSON mapping of using strings. This turned out to be a bad idea for various reasons. First, the source of truth is really the integer value, not the name. Second, there may be multiple names per number, making comparing names somewhat meaningless. Finally, most other languages use integers as the representation, so the string-based CUE representation did not interoperate well with them.
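The aliasing problem is easy to see in a small sketch. The enum below is hypothetical and mirrors a proto enum in which two names share one number. The names status, sameByName, and sameByNumber are invented for this illustration.

```go
package main

import "fmt"

// status mirrors a hypothetical proto enum in which STARTED and RUNNING
// are aliases for the same number.
var status = map[string]int32{
	"UNKNOWN": 0,
	"STARTED": 1,
	"RUNNING": 1,
}

// sameByName compares two enum values by their names.
func sameByName(a, b string) bool { return a == b }

// sameByNumber compares two enum values by the numbers behind the names.
func sameByNumber(a, b string) bool { return status[a] == status[b] }

func main() {
	fmt.Println(sameByName("STARTED", "RUNNING"))   // prints false
	fmt.Println(sameByNumber("STARTED", "RUNNING")) // prints true
}
```

Two values that are identical on the wire compare as different when only their names are considered, which is the case the integer-based mapping avoids.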

Although the old mapping is still supported, the integer-based mapping is now recommended.

When using cue import proto, enums are now represented as integers in CUE by default. The --proto_enum=json flag can be used to invoke the old conversion behavior.

The API will keep converting using the JSON format, but now has a Config.EnumMode option for selecting the alternative behavior.

The Protobuf to JSON interpretation (filetype json+pb or package jsonpb) supports converting back and forth in either format.

Changelog

aaf6e84 cmd/cue/cmd: compute schema before values
0b5084a cmd/cue/cmd: hook up protobuf encodings types
8e5eeab cmd/cue/cmd: parseArgs: split values and schemas early
f228236 cmd/cue/cmd: preparse command line path expressions
75d0180 cmd/cue/cmd: simplify parseArgs in preparation of proto support
dea3c5d cue/build: organized Instance fields
660b090 cue/build: remove support for file lists
38ad7c3 cue: add ReferencePath
af509c6 cue: eliminate context type
52db572 cue: get rid of internal index type
c0fe9ce cue: refactor index
362e3a5 encoding/protobuf/jsonpb: add encoder
4a288d5 encoding/protobuf/textproto: add decoder implementation
a0035de encoding/protobuf/textproto: add encoder
3bdfa5d encoding/protobuf: always include type as second argument
dcfff00 encoding/protobuf: support integer enums
8ba98ee internal/encoding: pass schema to config
80a0a6e internal: move DebugStr to new astinternal package
22abdad internal: replace internal.CoreValue with more type-safe variant