[FEA] Report all unsupported operations for a query in cudf.polars #16960

Matt711
Contributor

@Matt711 Matt711 commented Oct 1, 2024

Description

Closes #16690. The purpose of this PR is to list all of the unique operations that are unsupported by cudf.polars when running a query.

  1. Question: How to traverse the tree to report the error nodes? Should this be done upstream in Polars?
  2. Instead of traversing the query afterwards, we should probably catch each unsupported feature as we translate the IR.
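
One way to read point 2 is that the translator records each unsupported node as it goes and returns a placeholder, so translation continues and every failure is visible at the end. The sketch below uses a toy tuple-based plan; `SUPPORTED`, `Node`, `ErrorNode`, `translate`, and `unique_messages` are illustrative names chosen for this example and are not the cudf_polars implementation.

```python
from dataclasses import dataclass, field

# Hypothetical set of operations the toy backend can handle.
SUPPORTED = {"scan", "select", "filter"}


@dataclass
class Node:
    op: str
    children: list = field(default_factory=list)


@dataclass
class ErrorNode(Node):
    # Placeholder returned in place of a node that could not be translated.
    message: str = ""


def translate(plan, errors):
    """Translate a toy plan of the form (op, [children]).

    Unsupported operations are appended to ``errors`` instead of being raised,
    so the whole plan is visited even when some nodes fail.
    """
    op, children = plan
    kids = [translate(child, errors) for child in children]
    if op not in SUPPORTED:
        err = NotImplementedError(f"unsupported operation: {op}")
        errors.append(err)
        return ErrorNode(op, kids, str(err))
    return Node(op, kids)


def unique_messages(errors):
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(str(e) for e in errors))


plan = (
    "select",
    [
        ("rolling", [("scan", [])]),
        ("pivot", [("rolling", [("scan", [])])]),
    ],
)
errors = []
tree = translate(plan, errors)
print(unique_messages(errors))
```

Here `rolling` appears twice in the plan but is reported once, which matches the stated goal of listing unique unsupported operations.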

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 added the feature request (New feature or request), 5 - DO NOT MERGE (Hold off on merging; see PR for details), and non-breaking (Non-breaking change) labels Oct 1, 2024
@Matt711 Matt711 self-assigned this Oct 1, 2024
@github-actions github-actions bot added the Python (Affects Python cuDF API) and cudf.polars (Issues specific to cudf.polars) labels Oct 1, 2024
@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from 0310f26 to 3175a7e Compare October 9, 2024 03:46
@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from 3175a7e to 054b271 Compare October 9, 2024 15:28
Contributor

@wence- wence- left a comment

Thanks, this is looking really nice. Some smaller suggestions and a few small logic fixes.

Contributor

@wence- wence- left a comment

Tiny suggestions, debug_mode is now gone.

Contributor Author

@Matt711 Matt711 left a comment

I changed the logic of assert_ir_translation_raises to integrate with the changes made in this PR. I made the changes because now that we're returning ErrorNodes and ErrorExprs instead of raising exceptions during translation, assert_ir_translation_raises(q, NotImplementedError) fails in a lot of places.

To solve this, I checked that the exception(s) being asserted in q.collect(...) are inside Translation.errors and treated any other exception raised during translation as a case where assert_ir_translation_raises fails. This required me to hard-code a few cases where translation could fail.

Are there other cases I missed where translation could fail? WDYT of the changes @wence-?
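
For readers following along, the check described above can be sketched roughly as follows. This is a simplified stand-in, not the code in `cudf_polars/testing/asserts.py`: `ToyTranslator` and `assert_translation_records` are hypothetical names, and the toy translator only distinguishes "recorded" errors (unsupported operations) from "raised" errors (malformed input).

```python
class ToyTranslator:
    """Hypothetical translator: records unsupported ops, raises on malformed input."""

    def __init__(self):
        self.errors = []

    def translate(self, ops):
        for op in ops:
            if not isinstance(op, str):
                # Hard failure: translation itself cannot proceed.
                raise TypeError(f"malformed op: {op!r}")
            if op.startswith("unsupported"):
                # Soft failure: remember it and keep going.
                self.errors.append(NotImplementedError(op))
        return ops


def assert_translation_records(ops, expected_type):
    """Pass only if translation finishes and recorded an error of expected_type."""
    translator = ToyTranslator()
    try:
        translator.translate(ops)
    except Exception as e:
        # Anything raised during translation counts as a failed assertion.
        raise AssertionError(f"translation raised instead of recording: {e!r}") from e
    if not any(isinstance(e, expected_type) for e in translator.errors):
        raise AssertionError(f"no {expected_type.__name__} recorded during translation")


# Passes: the unsupported op is recorded rather than raised.
assert_translation_records(["scan", "unsupported_rolling"], NotImplementedError)
```

The design choice this illustrates is that the helper inspects the recorded error list rather than relying on an exception propagating out of translation.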

@wence-
Contributor

wence- commented Oct 30, 2024

@Matt711 This stalled a bit, I think it's a good one to get into 24.12, do you need some help with some bits?

@Matt711
Contributor Author

Matt711 commented Oct 31, 2024

> @Matt711 This stalled a bit, I think it's a good one to get into 24.12, do you need some help with some bits?

Hey @wence-, that should be doable. The last thing I needed to do with this PR is get test coverage to 100%. I'll address merge conflicts tomorrow, and try to do that too. I'll check in offline if I need help.

@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from ff7f2e1 to 9551c1f Compare October 31, 2024 15:02

copy-pr-bot bot commented Oct 31, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor

@wence- wence- left a comment

Looks good @Matt711, I had one suggestion that may help your segfault issue

    schema = {k: dtypes.from_polars(v) for k, v in polars_schema.items()}
except Exception as e:
    self.errors.append(e)
    return ir.ErrorNode({}, str(e))
Contributor

Hmm, a tricky one, we want to put schema here, but we didn't manage to make it.

@@ -45,6 +45,7 @@ def pytest_configure(config: pytest.Config) -> None:


EXPECTED_FAILURES: Mapping[str, str] = {
    "tests/unit/dataframe/test_df.py::test_extension": "AssertionError",
Contributor

No idea, sorry

@Matt711
Contributor Author

Matt711 commented Nov 7, 2024

/ok to test

@Matt711
Contributor Author

Matt711 commented Nov 7, 2024

/ok to test

@Matt711
Contributor Author

Matt711 commented Nov 7, 2024

/ok to test

@Matt711
Contributor Author

Matt711 commented Nov 8, 2024

/ok to test

Contributor

@vyasr vyasr left a comment

The changeset looks large, but am I correct that ultimately all that happened is that free functions translate_ir and translate_expr were made methods of the new Translator class so that this class could maintain a list of errors raised in any of those calls during a single traversal? If so, then I've grokked this PR and generally everything looks fine to me and I don't need to review again.
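
As a rough picture of the refactor summarized above, the sketch below puts two mutually dependent translation methods on one class so that both append to the same error list during a single traversal. The tuple-based query format, the `SUPPORTED_IR` and `SUPPORTED_EXPR` sets, and the placeholder tuples are assumptions made for this example; only the names `Translator`, `translate_ir`, and `translate_expr` come from the discussion.

```python
class Translator:
    """Toy translator: both methods append to the same ``errors`` list."""

    # Hypothetical supported sets, chosen only for this example.
    SUPPORTED_IR = {"Scan", "Select"}
    SUPPORTED_EXPR = {"col", "add"}

    def __init__(self):
        self.errors = []

    def translate_expr(self, expr):
        # An expression is (name, [argument expressions]).
        name, args = expr
        if name not in self.SUPPORTED_EXPR:
            self.errors.append(NotImplementedError(f"expr {name}"))
            return ("ErrorExpr", name)
        return (name, [self.translate_expr(a) for a in args])

    def translate_ir(self, node):
        # An IR node is (kind, [expressions], [child nodes]).
        kind, exprs, children = node
        inputs = [self.translate_ir(c) for c in children]
        translated = [self.translate_expr(e) for e in exprs]
        if kind not in self.SUPPORTED_IR:
            self.errors.append(NotImplementedError(f"ir {kind}"))
            return ("ErrorNode", kind)
        return (kind, translated, inputs)


query = (
    "Select",
    [("add", [("col", []), ("cumsum", [])])],
    [("Sort", [], [("Scan", [], [])])],
)
t = Translator()
result = t.translate_ir(query)
print([str(e) for e in t.errors])
```

Because the state lives on the instance, an unsupported expression and an unsupported IR node from the same query end up in one list without threading an extra argument through every call.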

@@ -45,6 +45,7 @@ def pytest_configure(config: pytest.Config) -> None:


EXPECTED_FAILURES: Mapping[str, str] = {
    "tests/unit/dataframe/test_df.py::test_extension": "AssertionError",
Contributor

Is this still failing? Not sure what the current state is.

@wence-
Contributor

wence- commented Nov 8, 2024

> The changeset looks large, but am I correct that ultimately all that happened is that free functions translate_ir and translate_expr were made methods of the new Translator class so that this class could maintain a list of errors raised in any of those calls during a single traversal? If so, then I've grokked this PR and generally everything looks fine to me and I don't need to review again.

Basically yes

@Matt711
Contributor Author

Matt711 commented Nov 8, 2024

/ok to test

@Matt711 Matt711 requested a review from a team as a code owner November 9, 2024 00:34
@Matt711
Contributor Author

Matt711 commented Nov 9, 2024

/ok to test

@Matt711
Contributor Author

Matt711 commented Nov 11, 2024

/ok to test

@Matt711 Matt711 requested a review from wence- November 11, 2024 22:54
Contributor

@wence- wence- left a comment

Thanks @Matt711, nice work!

@wence-
Contributor

wence- commented Nov 12, 2024

/merge

@rapids-bot rapids-bot bot merged commit 043bcbd into rapidsai:branch-24.12 Nov 12, 2024
102 checks passed
Labels
cudf.polars (Issues specific to cudf.polars); feature request (New feature or request); non-breaking (Non-breaking change); Python (Affects Python cuDF API)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[FEA] Report all unsupported operations for a query in cudf-polars
4 participants