fix: Added support to read parquet files with empty row groups #6183

Merged

Conversation

malhotrashivam
Contributor

@malhotrashivam malhotrashivam commented Oct 8, 2024

Closes #6179, #5530

@malhotrashivam malhotrashivam added the parquet (Related to the Parquet integration), NoDocumentationNeeded, and ReleaseNotesNeeded (Release notes are needed) labels Oct 8, 2024
@malhotrashivam malhotrashivam added this to the 0.37.0 milestone Oct 8, 2024
@malhotrashivam malhotrashivam self-assigned this Oct 8, 2024
@malhotrashivam malhotrashivam changed the title from "fix: Added support to read parquet files with empty row groups at end" to "fix: Added support to read parquet files with empty row groups" Oct 8, 2024
@@ -197,6 +197,10 @@ private RowSet computeIndex() {

for (int rgi = 0; rgi < rowGroups.length; ++rgi) {
final long subRegionSize = rowGroups[rgi].getNum_rows();
if (subRegionSize == 0) {
Member

Does this address empty row groups in the middle of a file? It sort of looks like it does.

Contributor Author

Yes, it does. I verified this by generating such a file and added a test for it.
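
For readers following the thread, here is a minimal sketch of why a `subRegionSize == 0` check inside the loop covers empty row groups anywhere in the file, not only at the end. It is not the code from this PR: the class `RowGroupRanges`, the method `computeRanges`, and the contiguous row numbering are all assumptions made for illustration, and the real `computeIndex()` may address rows differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of deephaven-core.
final class RowGroupRanges {
    /**
     * Returns inclusive {first, last} row ranges, one per non-empty row group.
     * Rows are numbered contiguously from 0 (an assumption of this sketch).
     */
    static List<long[]> computeRanges(final long[] rowGroupSizes) {
        final List<long[]> ranges = new ArrayList<>();
        long nextRow = 0;
        for (final long size : rowGroupSizes) {
            if (size == 0) {
                // An empty row group would produce an invalid range (last < first),
                // so it is skipped wherever it appears in the file.
                continue;
            }
            ranges.add(new long[] {nextRow, nextRow + size - 1});
            nextRow += size;
        }
        return ranges;
    }
}
```

With row group sizes `{3, 0, 2, 0}`, this sketch yields two ranges, 0 to 2 and 3 to 4. The empty groups in the middle and at the end contribute nothing, because the check runs per iteration rather than only after the last group.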

malhotrashivam added a commit to malhotrashivam/deephaven-core that referenced this pull request Oct 9, 2024
@malhotrashivam
Contributor Author

Branch used for generating test data in this PR: https://github.com/malhotrashivam/deephaven-core/tree/sm-ref-branch

@malhotrashivam malhotrashivam merged commit 6612f68 into deephaven:main Oct 9, 2024
17 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Oct 9, 2024
Labels
NoDocumentationNeeded parquet Related to the Parquet integration ReleaseNotesNeeded Release notes are needed
Development

Successfully merging this pull request may close these issues.

Empty parquet file leads to Barrage issue
2 participants