Skip to content

The striping and assembly algorithms from the Dremel paper

Dominik Moritz edited this page Oct 21, 2013 · 6 revisions

To understand the examples given in the Dremel paper it is useful to annotate the schema with the corresponding definition level and repetition level. See example figure 2 and 3 in the paper

Nested schema to columns

The paper uses the following schema as an example:

message Document {
  required int64 DocId;
  optional group Links {
    repeated int64 Backward;
    repeated int64 Forward; }
  repeated group Name {
    repeated group Language {
      required string Code;
      optional string Country; }
    optional string Url; }}

The schema can be seen as a tree whose leaves are primitive types. A column is created for each leaf. So for this example we end up with 6 columns:

DocId
Links.Backward
Links.Forward
Name.Language.Code
Name.Language.Country
Name.Url

As some of the fields are repeated fields, we need extra pieces of information to be stored along with the data to allow re-assembling the records together. That's what repetition and definition levels do.

Column striping algorithm:

You can picture serializing a record as (depth-first) traversing a tree. Starting from the root of the record, follow down the first child until you reach a leaf. Children are either a different field or a different instance of a repeated field. When a leaf is reached write out the value for the corresponding column with the maximum definition level for this column (meaning it is defined) and the current repetition level (initially 0 as we start at the root). When an empty field is reached, write no value along with the definition level for the last defined level for all leaf nodes of the type bellow this node (and the current repetition level as well). When going back up the tree to find the next branch to follow down, the level where the last repetition was is the new current repetition level.

When a field is not defined, the definition level will be smaller than the definition level of this field.

Record assembly

Let's say you want to reconstruct the record but need to access only one of the fields. Re-assembling the record is doing the same traversal of the tree over again. Pick the column for the field you need and reconstruct the record as follows: When a definition level is lower than the max stop going down at this level, once a leaf as been reached go down to the repetition level and start going down from there with the next value.

Repetition and definition levels

One important thing to remember to understand the examples is that not every level of the tree needs a new definition or repetition level. Only repeated fields increment the repetition level, only non-required fields increment the definition level. As those levels are very small bounded values they can be encoded efficiently using a few bits.

Required fields are always defined and do not need a definition level. Non repeated fields do not need a repetition level.

Here is the schema annotated with the actual levels to help follow along the example in the paper: Schema annotated

Because the repetition and definition levels are bound by the depth of the schema and are small integers. They can be stored in only a few bits. A flat data structure with all required fields uses no extra bits to store the levels which are always zero. An optional field requires one extra bit to store zero if it is NULL and one if it is defined. NULL values do not need to be stored as the definition level captures this information.

Dremel paper

Step by step:

R1

DocId: 10
Links
  Forward: 20
  Forward: 40
  Forward: 60
Name
  Language
    Code: 'en-us'
    Country: 'us'
  Language
    Code: 'en'
  Url: 'http://A'
Name
  Url: 'http://B'
Name
  Language
    Code: 'en-gb'
    Country: 'gb'

outputs:

R = 0 (current repetition level)
DocId: 10, R:0, D:0
Links.Backward: NULL, R:0, D:1 (no value defined so D < 2)
Links.Forward: 20, R:0, D:2
R = 1 (we are repeating 'Links.Forward' of level 1)
Links.Forward: 40, R:1, D:2
R = 1 (we are repeating 'Links.Forward' again of level 1)
Links.Forward: 60, R:1, D:2
Back to the root level: R=0
Name.Language.Code: en-us, R:0, D:2
Name.Language.Country: us, R:0, D:3
R = 2 (we are repeating 'Name.Language' of level 2)
Name.Language.Code: en, R:2, D:2
Name.Language.Country: NULL, R:2, D:2 (no value defined so D < 3)
Name.Url: http://A, R:0, D:2
R = 1 (we are repeating 'Name' of level 1)
Name.Language.Code: NULL, R:1, D:1 (Only Name is defined so D = 1)
Name.Language.Country: NULL, R:1, D:1
Name.Url: http://B, R:1, D:2
R = 1 (we are repeating 'Name' again of level 1)
Name.Language.Code: en-gb, R:1, D:2
Name.Language.Country: gb, R:1, D:3
Name.Url: NULL, R:1, D:1

R2

DocId: 20
Links
  Backward: 10
  Backward: 30
  Forward:  80
Name
  Url: 'http://C'

outputs:

DocId: 20, R:0, D:0
Links.Backward: 10, R:0, D:2
Links.Backward: 30, R:1, D:2
Links.Forward: 80, R:0, D:2
Name.Language.Code: NULL, R:0, D:1
Name.Language.Country: NULL, R:0, D:1
Name.Url: http://C, R:0, D:2
Clone this wiki locally