Commit

Improve protocol regarding tightBounds
andreaschat-db committed Jun 12, 2024
1 parent d678bae commit f37614e
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion PROTOCOL.md
@@ -1708,7 +1708,7 @@ numRecords | The number of records in this data file.
tightBounds | Whether per-column statistics are currently **tight** or **wide** (see below).

For any logical file where `deletionVector` is not `null`, the `numRecords` statistic *must* be present and accurate. That is, it must equal the number of records in the data file, not the valid records in the logical file.
- In the presence of [Deletion Vectors](#Deletion-Vectors) the statistics may be somewhat outdated, i.e. not reflecting deleted rows yet. The flag `stats.tightBounds` indicates whether we have **tight bounds** (i.e. the min/maxValue exists[^1] in the valid state of the file) or **wide bounds** (i.e. the minValue is <= all valid values in the file, and the maxValue >= all valid values in the file). These upper/lower bounds are sufficient information for data skipping.
+ In the presence of [Deletion Vectors](#Deletion-Vectors) the statistics may be somewhat outdated, i.e. not reflecting deleted rows yet. The flag `stats.tightBounds` indicates whether we have **tight bounds** (i.e. the min/maxValue exists[^1] in the valid state of the file) or **wide bounds** (i.e. the minValue is <= all valid values in the file, and the maxValue >= all valid values in the file). These upper/lower bounds are sufficient information for data skipping. Note, `stats.tightBounds` is evaluated to `true` when it is not present in the statistics.

Per-column statistics record information for each column in the file and they are encoded, mirroring the schema of the actual data.
For example, given the following data schema:

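A note on the sentence this commit adds: a reader of the statistics has to treat a missing `tightBounds` field as `true`. The sketch below shows one way a reader might apply that default. It is only an illustration, not code from the Delta implementation: the function name `has_tight_bounds` and the sample payloads are hypothetical, and it assumes the statistics arrive as a JSON-encoded string.

```python
import json


def has_tight_bounds(stats_json: str) -> bool:
    """Return the effective tightBounds flag for one file's statistics.

    Assumption: `stats_json` is a JSON-encoded statistics object.
    A missing `tightBounds` key is read as true, following the
    sentence added in this commit.
    """
    stats = json.loads(stats_json)
    # dict.get falls back to the default only when the key is absent.
    return bool(stats.get("tightBounds", True))


# Hypothetical statistics payloads, for illustration only.
stats_without_flag = '{"numRecords": 3}'
stats_with_wide_bounds = '{"numRecords": 3, "tightBounds": false}'

print(has_tight_bounds(stats_without_flag))      # True
print(has_tight_bounds(stats_with_wide_bounds))  # False
```

The added sentence covers only the case where the field is absent. An explicit `null` value is not addressed there, and this sketch reads it as `false`, which is an arbitrary choice.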