Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kernel] Add coordinated commits interfaces and table properties #3370

Merged
merged 1 commit into from
Jul 19, 2024

Conversation

EstherBear
Copy link
Contributor

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Add coordinated commits interfaces and related table properties in Kernel to prepare for Kernel read and write supported by coordinated commits.

How was this patch tested?

Unit tests

Does this PR introduce any user-facing changes?

Yes. User can define and register their own commit coordinator builder and get the corresponding commit coordinator client handler from engine.

@EstherBear EstherBear force-pushed the cc-interface branch 2 times, most recently from 37b3148 to 210a48f Compare July 12, 2024 23:04
build.sbt Outdated Show resolved Hide resolved
@@ -201,4 +260,35 @@ private void validate(String value) {
private static void addConfig(HashMap<String, TableConfig<?>> configs, TableConfig<?> config) {
configs.put(config.getKey().toLowerCase(Locale.ROOT), config);
}

private static Map<String, String> parseStringToJsonMap(String jsonString) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add some tests?

  1. A few happy path cases
  2. Handling of escaped characters
  3. Test the two areas where you throw exception
    I would recommend creating some complex static json with unicode characters using a standard serializer and then deserializing the static json in a test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, and this can be its own separate PR with the design decision doc attached to it.

/*
* This table property is used to track the table managed by commit-coordinator internally.
*/
public static final TableConfig<Map<String, String>> COORDINATED_COMMITS_TABLE_CONF =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vkorukanti should a new type be created here that encapsulates Map<String, String> or is this okay?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok.

/*
* This table property is used to track the table managed by commit-coordinator internally.
*/
public static final TableConfig<Map<String, String>> COORDINATED_COMMITS_TABLE_CONF =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok.

@@ -201,4 +260,35 @@ private void validate(String value) {
private static void addConfig(HashMap<String, TableConfig<?>> configs, TableConfig<?> config) {
configs.put(config.getKey().toLowerCase(Locale.ROOT), config);
}

private static Map<String, String> parseStringToJsonMap(String jsonString) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, and this can be its own separate PR with the design decision doc attached to it.

@EstherBear EstherBear force-pushed the cc-interface branch 2 times, most recently from d84b28c to 8c06239 Compare July 19, 2024 02:34
Copy link
Collaborator

@vkorukanti vkorukanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for just posting the interfaces. Easy to read and understand.

import io.delta.kernel.engine.coordinatedcommits.actions.AbstractProtocol;
import io.delta.kernel.utils.CloseableIterator;

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general comment to all APIs in this interface: what exception is thrown if the table is not managed by a coordinated commit?

Copy link
Contributor Author

@EstherBear EstherBear Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the commit coordinator assumes that the logPath passed as an argument represents a table managed by the commit coordinator. For example, the InMemoryCommitCoordinator in Delta Spark doesn't throw a specific exception for this case (will just throw a NullPointerException). So the user should ensure that they pass a correct logPath.
@dhruvarya-db Do I understand correctly?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that there has been any standardization around that. @prakharjain09 any idea?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No standardization for now.

Comment on lines 142 to 151
/**
* Determines whether this CommitCoordinatorClientHandler is semantically equal to another
* CommitCoordinatorClientHandler.
* <p>
* Semantic equality is determined by each CommitCoordinatorClientHandler implementation based
* on whether the two instances can be used interchangeably when invoking any of the
* CommitCoordinatorClientHandler APIs, such as {@link #commit}, {@link #getCommits}, etc. For
* example, both instances might be pointing to the same underlying endpoint.
*/
Boolean semanticEquals(CommitCoordinatorClientHandler other);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why Boolean and not boolean?

Copy link
Contributor Author

@EstherBear EstherBear Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think boolean should work.

Copy link
Contributor

@raveeram-db raveeram-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for refactoring these out, very clear!

@EstherBear EstherBear force-pushed the cc-interface branch 2 times, most recently from b7fa65a to 9ed9ced Compare July 19, 2024 10:45
Copy link
Collaborator

@vkorukanti vkorukanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@vkorukanti vkorukanti merged commit 589caba into delta-io:master Jul 19, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants