Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cubesql) Implement format / col_description #8947

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

pauldheinrichs
Copy link
Contributor

@pauldheinrichs pauldheinrichs commented Nov 12, 2024

Semi-resolves: #8926

Hoping to unblock metabase failing field sync on it's latest version. This pull requests implements

  • format macro

It also noops

  • col_description
  • RegClass (for schema.table identifiers)

I don't have a tonne of bandwidth to fully implement the missing macro but this at least helps unblock the current request metabase is making to cube and failing when attempting to sync-fields.

It's not entirely clear if there's already an inherent way to lookup the table OID to get that result from the RegClass result to be leveraged in col_description implementation

  • Tests has been run in packages where changes made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

Issue Reference this PR resolves

[For example #12]

Description of Changes Made (if issue reference is not provided)

[Description goes here]

@pauldheinrichs pauldheinrichs requested a review from a team as a code owner November 12, 2024 18:15
Copy link

vercel bot commented Nov 12, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

8 Skipped Deployments
Name Status Preview Comments Updated (UTC)
examples-angular-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-d3 ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-dashboard ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-data-table ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-highcharts ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-material-ui ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-react-pivot-table ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm
examples-vue-query-builder ⬜️ Ignored (Inspect) Visit Preview Nov 18, 2024 4:12pm

@github-actions github-actions bot added the pr:community Contribution from Cube.js community members. label Nov 12, 2024
@pauldheinrichs
Copy link
Contributor Author

pauldheinrichs commented Nov 13, 2024

2024-11-13 13:29:54,605 ERROR [Cube SQL Error] {"error": "Error during processing PostgreSQL message: CubeError: error in reqwest: builder error", "appName": "NULL", "protocol": "postgres", "apiType": "sql"}
thread 'main' panicked at cubesql/e2e/tests/postgres.rs:316:25:
unexpected return, simple query must be cancelled, actual: Ok(())

doesn't appear related to my changes, all specs i've added have passed 🤔

@droidraja
Copy link

droidraja commented Nov 15, 2024

2024-11-13 13:29:54,605 ERROR [Cube SQL Error] {"error": "Error during processing PostgreSQL message: CubeError: error in reqwest: builder error", "appName": "NULL", "protocol": "postgres", "apiType": "sql"}
thread 'main' panicked at cubesql/e2e/tests/postgres.rs:316:25:
unexpected return, simple query must be cancelled, actual: Ok(())

doesn't appear related to my changes, all specs i've added have passed 🤔

@pauldheinrichs My PR is also failing with the same set of tests. I was trying to find a PR for which this GHA was succeeding.
#8902
Will be watching this thread for any updates.

@ovr
Copy link
Member

ovr commented Nov 18, 2024

I am not sure, but I prepared a potential fix for the CI issue: b75a37a

@pauldheinrichs Could you please rebase your PR? Thanks!

@ovr ovr self-assigned this Nov 18, 2024
@pauldheinrichs
Copy link
Contributor Author

pauldheinrichs commented Nov 18, 2024

I am not sure, but I prepared a potential fix for the CI issue: b75a37a

@pauldheinrichs Could you please rebase your PR? Thanks!

@ovr All set! Rebased!

Copy link

codecov bot commented Nov 18, 2024

Codecov Report

Attention: Patch coverage is 86.95652% with 33 lines in your changes missing coverage. Please review.

Project coverage is 79.76%. Comparing base (b75a37a) to head (99db541).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
...t/cubesql/cubesql/src/compile/engine/udf/common.rs 73.80% 33 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8947      +/-   ##
==========================================
- Coverage   82.66%   79.76%   -2.90%     
==========================================
  Files         221      221              
  Lines       78303    78545     +242     
==========================================
- Hits        64727    62654    -2073     
- Misses      13576    15891    +2315     
Flag Coverage Δ
cubesql 79.76% <86.95%> (-2.90%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@pauldheinrichs
Copy link
Contributor Author

@ovr Is there anything further you'd like to see in this PR? I'd love to merge this so we can update our metabase instance

Copy link
Member

@mcheshkov mcheshkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for a delay!
Here's what I've noticed. Also you need to update it to fresh master.
Other than that I think it's good to land.

@@ -3148,6 +3155,155 @@ pub fn create_pg_get_serial_sequence_udf() -> ScalarUDF {
)
}

// Return a NOOP for this so metabase can sync without failing
// TODO: Implement this
pub fn create_col_description_udf() -> ScalarUDF {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please attach links to respective parts of PostgreSQL documentation?

Like https://www.postgresql.org/docs/17/functions-info.html#FUNCTIONS-INFO-COMMENT here, and https://www.postgresql.org/docs/17/functions-string.html#FUNCTIONS-STRING-FORMAT for format, maybe other important references.


ScalarUDF::new(
"col_description",
&Signature::one_of(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of this Signature::one_of? It matches single type signature, why not use Signature::exact directly?

ScalarUDF::new(
"col_description",
&Signature::one_of(
vec![TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8])],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

col_description is documented as ( table oid, column integer ) → text,
so I expect that argument types here would be ints, not strings.

You can check against pg_description system table provider for this
https://github.com/cube-js/cube/blob/master/rust/cubesql/cubesql/src/compile/engine/information_schema/postgres/pg_description.rs

First argument would be used to filter objoid, and second - objsubid

as_str
)))
// If the type name contains a dot, it's a schema-qualified name
// and we should return return the approprate RegClass to be converted to OID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: return return

)))
// If the type name contains a dot, it's a schema-qualified name
// and we should return return the approprate RegClass to be converted to OID
// For now, we'll return 0 so metabase can sync without failing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add TODO here along the lines of "implement something"

"col_description",
&Signature::one_of(
vec![TypeSignature::Exact(vec![DataType::Utf8, DataType::Utf8])],
Volatility::Immutable,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

col_description is not immutable, as it can change values for same inputs in a different queries: column comment can change between queries.
But it cannot change values for a single query (I think, assuming PostgreSQL isolation), so it should be Volatility::Stable.


ScalarUDF::new(
"format",
&Signature::variadic(vec![DataType::Utf8], Volatility::Immutable),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format is documented like this
format(formatstr text [, formatarg "any" [, ...] ])

Signature::variadic([Utf8], ...) will accept any number of arguments, but each argument must be one of [Utf8]

Unfortunately, there's no proper way to represent it in TypeSignature right now. But current variant would just cast everything to Utf8 before calling format, which is ok for now, I think. More recent DataFusion versions have extended TypeSignature to support this case, so we can deal with this later.

Please comment this part: about casting and TODO about incorrect signature

// Quote identifier if necessary
let needs_quoting = !value
.chars()
.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit wrong, because identifiers 1 and 1a need quoting, but a1 - does not.

+----------------+
| quoted_keyword |
+----------------+
| select |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL actually wraps keywords with quotes, and accepts keywords unwrapped only in assigning column alias position, but does not accept those in subquery alias or in column expression.

Maybe just wrap any %I argument with quotes? I can't imagine right now what could go wrong, in a sense that format argument can already be quoted in result, so any consumer should handle it as well. That would differ from PostgreSQL behavior, of course. But current implementation also differs, and "quote everything" approach have a benefit of something like "any %I substitution would make a proper SQL".

@pheinrichs
Copy link

Sorry for a delay! Here's what I've noticed. Also you need to update it to fresh master. Other than that I think it's good to land.

@mcheshkov
Sorry - I no longer have access to this stuff so won’t be able to make the requested changes.

@mcheshkov
Copy link
Member

@pauldheinrichs, hope you are doing well!
Do you mind if I pick it up from your branch?

@pheinrichs
Copy link

@pauldheinrichs, hope you are doing well! Do you mind if I pick it up from your branch?

@mcheshkov of course! Feel free

mcheshkov added a commit that referenced this pull request Jan 16, 2025
* Adds `format` function, only `%I` placeholder implemented
* Adds `col_description` stub implementation, always return empty string
*Add stub for regclass casting of schema qualified type names, always returns 0

Picked up from #8947, addresses #8926

---------

Co-authored-by: Paul Heinrichs <paul.heinrichs@vidyard.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr:community Contribution from Cube.js community members.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Metabase] Unable to sync fields on latest version on metabase v0.51.2
5 participants