Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Support for e notation using existing parse_decimal in string to decimal conversion #6905

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

himadripal
Copy link
Contributor

@himadripal himadripal commented Dec 19, 2024

Which issue does this PR close?

Closes #. apache/datafusion#10315

Rationale for this change

What changes are included in this PR?

Completed :

  • replaced string to decimal conversion with existing parse_decimal method
  • added rounding logic to existing parse_decimal method
  • fixed the cast error message to precision and scale in the decimal cast error instead of default precision and scale.
  • add rounding logic in parse_e_notation method.Existing function does not round based on scale for eNotations.

Pending :

  • port over parse_string_to_decimal_native and remove this fn

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 19, 2024
@himadripal himadripal changed the title Fix string to decimal e notation Fix: Support for e notation using existing parse_decimal in string to decimal conversion Dec 19, 2024
@himadripal
Copy link
Contributor Author

himadripal commented Dec 19, 2024

@andygrove @viirya @tustvold please take a first look. The one failing test will be fixed once I add the rounding logic in parse-e-notation function.

.and_then(|v| T::validate_decimal_precision(v, precision).map(|_| v))
parse_decimal::<T>(v, precision, scale).map_err(|_| {
ArrowError::CastError(format!(
"Cannot cast string '{}' to decimal type of precision {} and scale {}",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T:DATA_TYPE shows default Decimal(38,10) or Decimal256(76,..) in the error message, hiding the precision and scale provided for cast.

@@ -230,6 +231,7 @@ where
)?))
}

#[allow(dead_code)]
Copy link
Contributor Author

@himadripal himadripal Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails in clippy, hence added #[allow(dead_code)], there is no use, if required we can remove it and cover existing tests with parse_decimal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remove this and port the tests, to ensure we aren't losing test coverage / accidentally changing behaviour

Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be a breaking API change, as it changes the rounding behaviour of parse_decimal?

@himadripal
Copy link
Contributor Author

I think this might be a breaking API change, as it changes the rounding behaviour of parse_decimal?

Clippy did not complain and tests are passing, except one which I'm working on - rounding for e-notation. Would any others build task catch it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants