Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Specific builder for typed builder trait and easier to use for lists #6863

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

rluvaton
Copy link
Contributor

@rluvaton rluvaton commented Dec 10, 2024

Which issue does this PR close?

This doesn't close #6846 but it help get there

Rationale for this change

It's really hard to build nested lists, hopefully this makes it easier

What changes are included in this PR?

Created SpecificArrayBuilder and implement it for some of the builders, I was not able to implement it for all the builders

I implement for the following:

  1. boolean_builder
  2. fixed_size_binary_builder
  3. fixed_size_list_builder
  4. generic_bytes_builder
  5. generic_bytes_view_builder
  6. generic_list_builder
  7. null_builder
  8. primitive_builder

Are there any user-facing changes?

Yes, before documentation I would like to get your feedback

Possible breaking changes as now there are 2 functions with the same name so users will be prompted to select the trait to use (I think)

TODOs

  • Remove todos
  • Add documentation
  • Implement for more before merging?
  • Add more tests

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 10, 2024
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this, a lot of this functionality already exists

I wonder if you've given thought to a TypedListArray abstraction much like we have for DictionaryArray, ultimately the issue boils down to it being fiddly to iterate a ListArray, it isn't really a limitation of the builders

@@ -352,6 +352,18 @@ impl ArrayAccessor for &BooleanArray {
}
}

impl ArrayAccessor for BooleanArray {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be necessary as this is already implemented for &BooleanArray

@@ -219,6 +219,49 @@ impl ArrayBuilder for BooleanBuilder {
}
}


impl SpecificArrayBuilder for BooleanBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality of this trait seems to overlap with the existing extend functionality, I'm not sure what it is adding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing, the only thing it add is for the list builder to use those functions as it expect a SpecificArrayBuilder trait

if output.is_null(i) {
self.append_null();
} else {
let current_value = output.value(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance of this will be very suboptimal, it would be better to iterate the offsets directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I did not try to make it fast now, just to make it work as a proof of concept

@rluvaton
Copy link
Contributor Author

I'm not sure about this, a lot of this functionality already exists

I wonder if you've given thought to a TypedListArray abstraction much like we have for DictionaryArray, ultimately the issue boils down to it being fiddly to iterate a ListArray, it isn't really a limitation of the builders

I think typed list array would be a great solution, my codebase knowledge is very limited

@rluvaton
Copy link
Contributor Author

@tustvold I was unable to add FromIterator for TypedListArray or ListArray (FYI TypedDictionaryArray also does not have this function, DictionaryArray has this function but only for string keys which does not help me)

Also my solution does not work when creating the builder using make_builder as I can't implement the SpecificArrayBuilder for Box<dyn ...>

@tustvold
Copy link
Contributor

ListBuilder already implements Extend, and adding FromIterator should be a trivial extension.

The issue from your example concerns being able to extract an iterator of values from a list array, this will require adding a TypedListArray abstraction that can downcast the type-erased ListArray. I can try to find some time to write this up more properly over the next week or so

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants