[Spark] Remove dropped columns when running REORG PURGE #3371

@xzhseh xzhseh commented Jul 12, 2024

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

As requested in #3228, this PR adds support for finding and removing dropped columns when running REORG PURGE.

Closes #3228.
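
For readers unfamiliar with the problem, here is a conceptual sketch of what "finding dropped columns" can mean. It is not the code in this PR and does not use any Delta or Spark API. It assumes that dropping a column only changes table metadata, so older Parquet files may still carry the column's data. `DataFile` and `files_with_dropped_columns` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file and the columns stored in it."""
    path: str
    physical_columns: FrozenSet[str]


def files_with_dropped_columns(
    table_columns: Iterable[str], files: Iterable[DataFile]
) -> List[DataFile]:
    """Return the files that still store at least one column that is
    no longer part of the table schema (a purge would rewrite these)."""
    live = set(table_columns)
    # A file needs rewriting if its columns are not a subset of the live ones.
    return [f for f in files if not f.physical_columns <= live]


# Illustrative data: "age" was dropped from the table after part-0 was written.
files = [
    DataFile("part-0.parquet", frozenset({"id", "name", "age"})),
    DataFile("part-1.parquet", frozenset({"id", "name"})),
]
to_rewrite = files_with_dropped_columns(["id", "name"], files)
print([f.path for f in to_rewrite])
```

In this sketch only `part-0.parquet` is selected, because it is the only file that still contains the dropped `age` column.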

How was this patch tested?

Through unit testing in DeltaReorgSuite.scala.

Does this PR introduce any user-facing changes?

No.

@xzhseh
Contributor Author

xzhseh commented Jul 12, 2024

Hi @johanl-db, could you help review this PR? Thanks!

@xzhseh xzhseh requested a review from johanl-db July 15, 2024 20:52
@allisonport-db allisonport-db merged commit 0e45ad2 into delta-io:master Jul 18, 2024
10 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature Request][Spark] Remove dropped columns from Parquet files in REORG TABLE (PURGE)
3 participants