Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add data loader core key and column utils #1771

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

ypeckstadt
Copy link
Contributor

@ypeckstadt ypeckstadt commented May 26, 2024

Description

Later in the code for the export and import managers, we need a lot of utils that help us deal with working and converting ScalarDB keys and columns. This PR adds the basic version of the utils class, allowing us to convert keys based on CLI input in the following PR. More methods will be added to these utils later.

Related issues and/or PRs

This PR depends on the following PRs and should be reviewed once merged.

Changes made

  • Added KeyUtils class
  • Added ColumnUtils class
  • Added exceptions thrown by the utils
  • Added unit test for both utils
  • Added error messages to the ScalarDB CoreError

Checklist

The following is a best-effort checklist. If any items in this checklist do not apply to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

NA

Release notes

NA

DataType columnDataType = tableMetadata.getColumnDataType(columnName);
if (columnDataType == null) {
throw new KeyParsingException(
CoreError.DATA_LOADER_INVALID_COLUMN_KEY_PARSING_FAILED.buildMessage(columnName));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what if there are 3 tables that have name column and this exception is thrown for the second table. Users may want to know which table...? What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the method and now passing in the table name as well. I had to add it because the table name is not part of the TableMetadata.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ypeckstadt Thanks for improving the error message!

But, sorry, I should've noticed this when posting the previous comment, other error messages (e.g., DATA_LOADER_INVALID_COLUMN_NON_EXISTENT) have the same issue. Also, the namespace name should be added as well. I think we have 2 options as follows:

  1. Add namespace and table names to all the error messages if needed
  2. The class is a util class for keys, so it may be okay to include only key name in the error messages from KeyUtils. Instead, we'll catch exceptions from KeyUtils and wrap it with the namespace and table names like this:
  try {
    :
    // This exception doesn't contain the namespace or table name.
    Key key = KeyUtils.parseKeyValue(columnKeyValue, tableMetadata);
  } catch (KeyParsingException e) {
    // Add the namespace and table names information.
    throw new KeyParsingException(
       CoreError.DATA_LOADER_XXXX_FAILED.buildMessage(namespaceName, tableName), e);
  }

I think either is fine. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made changes based on the above feedback. I went for option 1 as it seems like the safest solution. Option 2 is good as well, but then it shifts the responsibility for proper exception handling and logging to all classes that use the Key Utils. So it creates the risk that some classes do it and some don't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvement!

It looks like there is room to improve the other error messages from the same perspective (i.e., missing information about namespace and table). What do you think?

Copy link
Contributor Author

@ypeckstadt ypeckstadt Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@komamitsu My apologies for the late reply. (had to focus on ScalarFlow more). I have updated the PR and ensured each exception message includes the namespace and table name.

To avoid having to pass in all 3 with all methods, I have added a wrapper class called ColumnInfo that includes all 3 fields.

@ypeckstadt ypeckstadt requested a review from komamitsu May 28, 2024 23:21
Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left several minor comments. Please take a look when you have time!

* @return ScalarDB column
* @throws Base64Exception if an error occurs while base64 decoding
*/
public static Column<?> createColumnFromValue(DataType dataType, String columnName, String value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding @Nullable for nullable parameters would be helpful:

Suggested change
public static Column<?> createColumnFromValue(DataType dataType, String columnName, String value)
public static Column<?> createColumnFromValue(DataType dataType, String columnName, Nullable String value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Comment on lines 66 to 72
} catch (NumberFormatException e) {
throw new NumberFormatException(
CoreError.DATA_LOADER_INVALID_NUMBER_FORMAT_FOR_COLUMN_VALUE.buildMessage(columnName));
} catch (IllegalArgumentException e) {
throw new Base64Exception(
CoreError.DATA_LOADER_INVALID_BASE64_ENCODING_FOR_COLUMN_VALUE.buildMessage(columnName));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should set the cause of the exceptions:

Suggested change
} catch (NumberFormatException e) {
throw new NumberFormatException(
CoreError.DATA_LOADER_INVALID_NUMBER_FORMAT_FOR_COLUMN_VALUE.buildMessage(columnName));
} catch (IllegalArgumentException e) {
throw new Base64Exception(
CoreError.DATA_LOADER_INVALID_BASE64_ENCODING_FOR_COLUMN_VALUE.buildMessage(columnName));
}
} catch (NumberFormatException e) {
throw new NumberFormatException(
CoreError.DATA_LOADER_INVALID_NUMBER_FORMAT_FOR_COLUMN_VALUE.buildMessage(columnName), e);
} catch (IllegalArgumentException e) {
throw new Base64Exception(
CoreError.DATA_LOADER_INVALID_BASE64_ENCODING_FOR_COLUMN_VALUE.buildMessage(columnName), e);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added. Although to fix this I had to revise the used exceptions a bit.

Comment on lines 46 to 48
throw new KeyParsingException(
CoreError.DATA_LOADER_INVALID_VALUE_KEY_PARSING_FAILED.buildMessage(
columnKeyValue.getColumnValue(), tableName, e.getMessage()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Suggested change
throw new KeyParsingException(
CoreError.DATA_LOADER_INVALID_VALUE_KEY_PARSING_FAILED.buildMessage(
columnKeyValue.getColumnValue(), tableName, e.getMessage()));
throw new KeyParsingException(
CoreError.DATA_LOADER_INVALID_VALUE_KEY_PARSING_FAILED.buildMessage(
columnKeyValue.getColumnValue(), tableName, e.getMessage()), e);

Copy link
Contributor

@Torch3333 Torch3333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@ypeckstadt ypeckstadt requested a review from brfrn169 June 10, 2024 05:51
"Invalid key: Column %s does not exist in the table %s in namespace %s.",
"",
""),
DATA_LOADER_INVALID_VALUE_KEY_PARSING_FAILED(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. It has become unused due to refactoring the error handling based on Toshi's feedback. I have removed it now.

Copy link
Collaborator

@brfrn169 brfrn169 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left several very minor comments. Other than that, LGTM! Thank you!

@@ -1,9 +1,11 @@
subprojects {
group = "scalardb.dataloader"


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary empty line?

@@ -23,5 +25,12 @@ subprojects {
testImplementation "org.mockito:mockito-core:${mockitoVersion}"
testImplementation "org.mockito:mockito-inline:${mockitoVersion}"
testImplementation "org.mockito:mockito-junit-jupiter:${mockitoVersion}"

Copy link
Collaborator

@brfrn169 brfrn169 Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. Unnecessary empty line?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants