Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc gen: Migrate aggregate functions doc to attribute based. #13646

Merged
merged 5 commits into from
Dec 5, 2024

Conversation

comphead
Copy link
Contributor

@comphead comphead commented Dec 4, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

  • Migrate aggregate functions doc to attribute based
  • With migration some documentation bugs fixed

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added documentation Improvements or additions to documentation core Core DataFusion crate functions labels Dec 4, 2024
@@ -281,11 +281,11 @@ first_value(expression [ORDER BY expression])
#### Example

```sql
> SELECT first_value(column_name ORDER BY other_column) FROM table_name;
> SELECT last_value(column_name ORDER BY other_column) FROM table_name;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the documentation fixed now.
There is an existing bug when documentation lock with the same name covers 2 or more doc functions in the same file. In this scenario documentation will pick the first available doc description which is unexpected. The migration to macros fixes the problem as it uses separate lock per function

//
// let file_path = format!("{}.txt", name);
// if std::path::Path::new(&file_path).exists() {
// std::fs::remove_file(&file_path).unwrap();
Copy link
Contributor Author

@comphead comphead Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of leaving it for now as this part generates text files with attribute text so its easier to do a migration for remaning parts(windows/builtin udfs), semi automatic way instead of manual

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just add a small comment noting this is useful for migration purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree a comment explaining what this is used for would be most helpful

@comphead
Copy link
Contributor Author

comphead commented Dec 4, 2024

@Omega359 I just moved aggregate functions to attribute based approach which also fixes an existing bug. Please take a look.

@alamb I'm planning to create a small epic to migrate other functions to attribute based with details

@Omega359
Copy link
Contributor

Omega359 commented Dec 4, 2024

I'll do my best to look at this tonight or tomorrow but I'm juggling a lot atm. I am on vacation next week and I hope to be able to work on my backlog a bit then which currently stands at:

  • arrow bulk delete issue - Object store failing during bulk delete in S3 arrow-rs#6483 - write reproducer, look into fix
  • Another attempt for adding SessionConfig to ScalarFunctionArgs
    • timestamp functions use DF timezone (various tickets)
  • regexp_match - fix for utf8view (arrow-rs)
  • file issue to explain how to generate code coverage
  • union_by_name - finish or redo impl - current test impl unlikely to work
  • sqllogictests - augment with sqllite (currently working on this)
  • sqllogictests - see if some of the files can be refactored to be 'cleaner' (single area of test per file)
  • sqllogictests - see if it's possible to incorporate duckdb tests (first check license)
  • sqlancer - extend for unions, see if it handles different string types (utf8, largeutf8, utf8view), push down filtering. See what it would take to impl a DQP oracle
  • look at trying stringzilla for contains in arrow-rs to see if it improves perf enough to be worth it
  • review doc changes for aggregate functions

Copy link
Contributor

@Omega359 Omega359 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me - thx for spending the time to push through this improvement!

//
// let file_path = format!("{}.txt", name);
// if std::path::Path::new(&file_path).exists() {
// std::fs::remove_file(&file_path).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just add a small comment noting this is useful for migration purposes.

.unwrap_or_default(),
);

let st_arg_token = " expression to operate on. Can be a constant, column, or function, and any combination of operators.";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be extracted to a constant and that used instead of having text copies of the string in 2 places.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @comphead and @Omega359

I also took a good look at this and think it looks quite nice 👌

//
// let file_path = format!("{}.txt", name);
// if std::path::Path::new(&file_path).exists() {
// std::fs::remove_file(&file_path).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree a comment explaining what this is used for would be most helpful

@@ -63,6 +63,88 @@ impl Documentation {
) -> DocumentationBuilder {
DocumentationBuilder::new(doc_section, description, syntax_example)
}

/// Output the `Documentation` struct in form of custom Rust documentation attributes
pub fn to_doc_attribute(&self) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would also help helpful here to note this is to help migration to doc comment attributes

@comphead
Copy link
Contributor Author

comphead commented Dec 5, 2024

Thanks @alamb and @Omega359
I will create 2 more tickets to cover other functions taking this PR as an example

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate documentation Improvements or additions to documentation functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants