-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade Iceberg 1.7.0 #442
base: main
Are you sure you want to change the base?
Conversation
ce63adf
to
42fa2ad
Compare
propertiesWithS3CustomizedClientFactory.put( | ||
S3FileIOProperties.CLIENT_FACTORY, PolarisS3FileIOClientFactory.class.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No longer required because of this : https://github.com/apache/iceberg/pull/11259/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might have to mark this deprecated first since it's a public class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was only ever used for this purpose, so we're probably fine
42fa2ad
to
de8eed5
Compare
LGTM with a minor style issue:
|
@@ -277,6 +277,7 @@ commons-collections:commons-collections | |||
commons-io:commons-io | |||
commons-logging:commons-logging | |||
commons-net:commons-net | |||
dev.failsafe:failsafe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is it a new transitive dependency brought by iceberg 1.7?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, it's added as part of AWS S3InputStream retry handling via : https://github.com/apache/iceberg/pull/10433/files
checking test failure |
on debugging it, looks like we need to handle this change cleanly [1], since polaris supports more endpoint than default impl, [1] apache/iceberg@a2b8008#diff-86450612dbe323d6d06cbc3846aa1913f042eaedadc0ca027c36bfbe08d3a46c |
@@ -341,7 +341,7 @@ public void testIcebergListNamespacesNestedNotFound() throws IOException { | |||
sessionCatalog.listNamespaces( | |||
sessionContext, Namespace.of("top_level", "whoops"))) | |||
.isInstanceOf(NoSuchNamespaceException.class) | |||
.hasMessage("Namespace does not exist: top_level.whoops"); | |||
.hasMessage("Namespace does not exist: top_level%1Fwhoops"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found this change : apache/iceberg@5fc1413 looks like we changes the seperator,
pending to make it configurable apache/iceberg#10877
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no more intent to make the separator configurable. At least, that's the latest I know. There was a big discussion around this topic in August this year.
IMHO, making the separator configurable makes things overly complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the response in the error message now? I think the dot in the error is much more user friendly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting, as Iceberg still uses dot to join different level of namespaces, https://github.com/apache/iceberg/blob/09634857e4a1333f5dc742d1dca3921e9a9f62dd/api/src/main/java/org/apache/iceberg/catalog/Namespace.java#L97-L97.
In this case, Namespace.of("top_level", "whoops")
should be referred to top_level.whoops
, unless the rest util somehow squash two levels together with the "%1F".
@singhpk234 , could you take a look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops looks like it was a style miss : https://github.com/apache/polaris/actions/runs/11859822454/job/33055322598
Here are the local results :
error : org.apache.iceberg.exceptions.NoSuchNamespaceException: Namespace does not exist: ns1?ns1a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the revert PR (apache/iceberg#11574) is targeting Iceberg 1.7.1. Should we wait for the 1.7.1 release to avoid unnecessary back-and-forth?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I think it's fair to wait as we not are chasing a release time in Polaris and 1.7.1 should be fast, I saw an RC is cut already apache/iceberg#11593
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me see, when the pr get merged and we have nightly artifact published, I can try reverting my change to correct the test and run our polaris suite with that ? wdyt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can confirm its passing now, I have pointed to 1.8.0 artifact (can't find 1.7.1 in nightly) from nigthly, will change this 1.7.1 when it get released
@flyrain @singhpk234 should we update iceberg 1.7.0 dependencies for "getting start" examples as well? Most of them are using 1.5.2 iceberg runtime. I can take this if not already done. |
Note: Today I realized that there are two "special" changes in Iceberg/Java 1.7.0:
|
This is due to the new Object Storage v3 changes introduced in the 1.7 release. However, since Polaris doesn’t currently support the hash prefix before the table path, this update should not impact functionality as far as I understand. Let me know if I’m missing any nuances here!
#383 will resolve this issue, so let's prioritize it. In the short term, it should be manageable since most REST clients haven’t yet adopted the latest 1.7 release. |
For Polaris it's the amount of combinations that need to be supported: no object-storage prefix, the old and the new one. For GCS it's easier, because Google accepts regular expressions - S3 however does not.
|
+1, @MonkeyCanCode! Thanks for picking it up as a follow-up. |
Anytime. Will PR this once current PR is merged. Then i can build it locally for both client (spark) and server (polaris). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 Thanks @singhpk234 for working on it. Hi @dennishuo @collado-mike @eric-maynard, could you take a look as well?
I will merge this by EOD if there is no objection. |
Here is a PR for the 2 missing changes: #447 |
95c4b4f
to
7bfa42d
Compare
ccbaa3f
to
3bcf63d
Compare
@@ -19,7 +19,7 @@ | |||
|
|||
[versions] | |||
hadoop = "3.4.0" | |||
iceberg = "1.6.1" | |||
iceberg = "1.8.0-SNAPSHOT" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change to 1.7.1 when released please ref this discussion thread #442 (comment)
@@ -69,6 +69,7 @@ dependencyResolutionManagement { | |||
repositories { | |||
mavenCentral() | |||
gradlePluginPortal() | |||
maven(url = "https://repository.apache.org/content/repositories/snapshots") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this when 1.7.1 is release please ref this thread : #442 (comment)
Took this branch and ran with 1.7.1-RC1 of Iceberg. All tests passed |
Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.
Upgrade Apache Iceberg to 1.7.0
Additionally :
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Test Configuration:
Checklist:
Please delete options that are not relevant.