Add support for parquet_writer_version session property #11151

Open: wants to merge 1 commit into base: main


svm1 (Collaborator) commented Oct 2, 2024

Allow the Presto session property parquet_writer_version, which Velox currently ignores, to select the Parquet writer's data page version (V1 or V2). The value can be set as a session property or provided in the Hive config. It defaults to V2.
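The lookup described above can be sketched as follows. This is a minimal, standalone illustration and not the code in this PR: the `DataPageVersion` enum, the `ConfigMap` alias, the Hive config key name, the choice to let the session value win over the Hive config, and the choice to ignore unrecognized values are all assumptions made for the example.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the writer's data page version; not the Velox type.
enum class DataPageVersion { V1, V2 };

// Hypothetical stand-in for a bag of string properties.
using ConfigMap = std::unordered_map<std::string, std::string>;

// The session key comes from the PR description; the Hive key is assumed.
const std::string kSessionKey = "parquet_writer_version";
const std::string kHiveConfigKey = "hive.parquet.writer.version";

// Maps a property string to a version; anything else is "not recognized".
std::optional<DataPageVersion> parseVersion(const std::string& value) {
  if (value == "V1") {
    return DataPageVersion::V1;
  }
  if (value == "V2") {
    return DataPageVersion::V2;
  }
  return std::nullopt;
}

// Returns the parsed value stored under `key`, if present and recognized.
std::optional<DataPageVersion> lookup(
    const ConfigMap& map,
    const std::string& key) {
  const auto it = map.find(key);
  if (it == map.end()) {
    return std::nullopt;
  }
  return parseVersion(it->second);
}

// Assumed precedence: session property, then Hive config, then the V2 default.
DataPageVersion resolveDataPageVersion(
    const ConfigMap& session,
    const ConfigMap& hiveConfig) {
  if (const auto version = lookup(session, kSessionKey)) {
    return *version;
  }
  if (const auto version = lookup(hiveConfig, kHiveConfigKey)) {
    return *version;
  }
  return DataPageVersion::V2;
}
```

Falling back silently on an unrecognized value keeps the sketch short; a real implementation might prefer to reject such values with an error instead.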

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 2, 2024

netlify bot commented Oct 2, 2024

Deploy Preview for meta-velox canceled.

🔨 Latest commit: 1807de1
🔍 Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/67945279a6e4e800087ab05d

velox/dwio/parquet/writer/Writer.h (outdated, resolved)
velox/dwio/parquet/writer/Writer.cpp (outdated, resolved)
velox/connectors/hive/HiveConnectorUtil.cpp (outdated, resolved)
velox/dwio/parquet/writer/Writer.h (outdated, resolved)
@svm1 svm1 force-pushed the parq_2 branch 2 times, most recently from 40a6c6c to 4982e1f Compare November 26, 2024 09:47
svm1 (Collaborator, Author) commented Nov 26, 2024

@yingsu00 @majetideepak Thank you for reviewing. I have made all the necessary changes, please take a look!

@@ -921,6 +921,21 @@ std::optional<std::string> getTimestampTimeZone(
  return std::nullopt;
}

parquet::WriterOptions::DataPageVersion getParquetDataPageVersion(
Collaborator

The changes in this file have to move into parquet::WriterOptions in parquet/writer/Writer.h.

Collaborator Author

Done.

velox/connectors/hive/HiveConnectorUtil.cpp (outdated, resolved)
// Set parquet datapage version and write data - then read to ensure the
// property took effect.
const auto testDataPageVersion =
[&](facebook::velox::parquet::WriterOptions::DataPageVersion
Collaborator

You can remove the facebook::velox:: qualifier here and in other places.
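For readers unfamiliar with why the shorter spelling works, here is a small standalone sketch. The declarations below are stand-ins written for this example, not the real Velox headers, and the `test` namespace is hypothetical. It assumes the code in question sits inside `namespace facebook::velox` (a using-directive for that namespace would have a similar effect).

```cpp
#include <type_traits>

// Stand-in declarations for illustration only; not the real Velox headers.
namespace facebook::velox::parquet {
struct WriterOptions {
  enum class DataPageVersion { V1, V2 };
};
} // namespace facebook::velox::parquet

// Hypothetical namespace nested inside facebook::velox.
namespace facebook::velox::test {

// Fully qualified spelling, as in the snippet under review.
using Verbose = facebook::velox::parquet::WriterOptions::DataPageVersion;

// Shortened spelling: unqualified lookup of `parquet` walks outward through
// the enclosing namespaces and finds facebook::velox::parquet.
using Short = parquet::WriterOptions::DataPageVersion;

static_assert(
    std::is_same_v<Verbose, Short>,
    "both spellings name the same type");

} // namespace facebook::velox::test
```

Both aliases name the same type, so dropping the prefix changes readability only, not behavior.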

velox/dwio/parquet/writer/Writer.h (outdated, resolved)
5 participants