Describe the bug, including details regarding any error messages, version, and platform.
As part of adding Parquet encryption to arrow-rs (apache/arrow-rs#6637), @rok and I found that arrow-rs could not read the example files in parquet-testing due to invalid repetition levels. arrow-rs complains that:
Parquet error: first repetition level of batch must be 0
This is because the int64 list column data is written with the repetition levels flipped: 0 should indicate the start of a new list, but 1 is used instead:
arrow/cpp/src/parquet/encryption/test_encryption_util.cc, line 121 (commit b655852)
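To illustrate why flipped repetition levels trip the reader, here is a minimal Python sketch of list assembly from repetition levels. It is not the arrow-rs or Arrow C++ implementation: the function name `assemble_lists` is hypothetical, the sample values and level patterns are made up for illustration, and it assumes a single list nesting level with no nulls or empty lists, so definition levels are ignored.

```python
def assemble_lists(values, rep_levels):
    """Group flat leaf values into lists using Parquet-style repetition levels.

    Simplifying assumptions (not from the issue): one list nesting level
    (max repetition level 1), no nulls, no empty lists.
    """
    if len(values) != len(rep_levels):
        raise ValueError("values and rep_levels must have the same length")
    if rep_levels and rep_levels[0] != 0:
        # Same wording as the arrow-rs error quoted in the issue.
        raise ValueError("first repetition level of batch must be 0")

    rows = []
    for value, level in zip(values, rep_levels):
        if level == 0:
            rows.append([value])      # 0 starts a new list (a new row)
        elif level == 1:
            rows[-1].append(value)    # 1 continues the current list
        else:
            raise ValueError(f"unexpected repetition level {level}")
    return rows


# Two rows of two values each, with correct levels.
correct = assemble_lists([1, 2, 3, 4], [0, 1, 0, 1])

# The same data with the levels flipped (illustrative pattern only).
try:
    assemble_lists([1, 2, 3, 4], [1, 0, 1, 0])
    flipped_error = None
except ValueError as exc:
    flipped_error = str(exc)
```

With correct levels the values group into `[[1, 2], [3, 4]]`. With flipped levels the first value has no list to belong to, so a strict reader has to reject the batch. A lenient reader has to decide what to do with that orphaned value.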
Related to this, is it also a bug that Arrow reads these files without complaining? When I read one of these files into Arrow format with PyArrow, the first leaf value is skipped.
Component(s)
C++, Parquet