Expose the null buffer of every builder that has one #5754

Merged
merged 1 commit into from
May 13, 2024


HadrienG2
Contributor

Which issue does this PR close?

Closes #5749.

Rationale for this change

This makes the interface of different builder types more consistent, and enables exposing builder contents as a slice in more cases in my prototype for #5700.

What changes are included in this PR?

My general process was to look at where append_null()/append(false) pushes the null, and to expose the corresponding null buffer.

  • For each builder that has an inner null buffer builder, that null buffer is exposed in the same way as BooleanBuilder and PrimitiveBuilder already do.
  • For builders that use a dictionary-like (key, value) layout, the null buffer of keys is treated as the overall null buffer of the builder, following the example of append_null().
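The two cases above can be illustrated with a small standalone model. This is only a sketch: `ToyNullBufferBuilder`, `ToyPrimitiveBuilder` and `ToyDictionaryBuilder` are hypothetical stand-ins written for this example, not the arrow-rs types, and the lazy allocation of the bitmap is an assumption of the model.

```rust
/// Hypothetical stand-in for a null buffer builder (not the arrow-rs type).
/// Assumption of this model: the bitmap is allocated on the first null.
#[derive(Default)]
struct ToyNullBufferBuilder {
    bitmap: Option<Vec<u8>>, // LSB-first validity bits, 1 = valid
    len: usize,
}

impl ToyNullBufferBuilder {
    fn append(&mut self, valid: bool) {
        if !valid && self.bitmap.is_none() {
            // First null: materialize a bitmap marking all earlier elements valid.
            let mut bytes = vec![0u8; (self.len + 7) / 8];
            for i in 0..self.len {
                bytes[i / 8] |= 1 << (i % 8);
            }
            self.bitmap = Some(bytes);
        }
        if let Some(bytes) = self.bitmap.as_mut() {
            if self.len / 8 == bytes.len() {
                bytes.push(0); // start a new byte when the previous one is full
            }
            if valid {
                bytes[self.len / 8] |= 1 << (self.len % 8);
            }
        }
        self.len += 1;
    }

    fn as_slice(&self) -> Option<&[u8]> {
        self.bitmap.as_deref()
    }
}

/// Case 1: a builder with an inner null buffer builder exposes it directly.
#[derive(Default)]
struct ToyPrimitiveBuilder {
    values: Vec<i32>,
    nulls: ToyNullBufferBuilder,
}

impl ToyPrimitiveBuilder {
    fn append_value(&mut self, v: i32) {
        self.values.push(v);
        self.nulls.append(true);
    }
    fn append_null(&mut self) {
        self.values.push(0); // placeholder slot for the null element
        self.nulls.append(false);
    }
    fn validity_slice(&self) -> Option<&[u8]> {
        self.nulls.as_slice()
    }
}

/// Case 2: a (key, value) builder pushes nulls into its keys,
/// so the keys' null buffer acts as the builder-wide one.
#[derive(Default)]
struct ToyDictionaryBuilder {
    keys: ToyPrimitiveBuilder,
    values: Vec<String>,
}

impl ToyDictionaryBuilder {
    fn append_value(&mut self, v: &str) {
        let key = match self.values.iter().position(|x| x.as_str() == v) {
            Some(k) => k,
            None => {
                self.values.push(v.to_string());
                self.values.len() - 1
            }
        };
        self.keys.append_value(key as i32);
    }
    fn append_null(&mut self) {
        self.keys.append_null();
    }
    fn validity_slice(&self) -> Option<&[u8]> {
        self.keys.validity_slice()
    }
}
```

In this model, the dictionary-like builder needs no null tracking of its own: `validity_slice()` simply forwards to the keys builder, mirroring what `append_null()` does.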

This only leaves the following builders without a null buffer accessor after this PR:

  • XyzRunBuilder: For these builder types, the null buffer is per-run, not per-element. It is not yet clear to me if/how that should be exposed, so I'm leaving this for a future PR.
  • UnionBuilder: For this builder type, each variant has its own null buffer, so there is no notion of builder-wide null buffer. This would be best handled by exposing some sort of access to the storage of individual variants, but again, this requires more design work, so I'm leaving it for a future PR.
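To illustrate why a per-run null buffer does not map directly onto a per-element slice, here is a rough sketch that expands per-run validity into per-element validity. The function name `expand_run_validity` and its `bool`-based representation are hypothetical and chosen for readability; it assumes run ends are cumulative logical end indices, as in Arrow's run-end encoded layout.

```rust
/// Expands per-run validity into per-element validity (illustrative only).
/// `run_ends[k]` is the cumulative, exclusive logical end index of run `k`;
/// `run_valid[k]` says whether the value of run `k` is non-null.
fn expand_run_validity(run_ends: &[usize], run_valid: &[bool]) -> Vec<bool> {
    let mut out = Vec::new();
    let mut start = 0;
    for (&end, &valid) in run_ends.iter().zip(run_valid) {
        // Every element in the run shares the run's validity.
        out.extend(std::iter::repeat(valid).take(end - start));
        start = end;
    }
    out
}
```

A per-element view therefore has to be computed rather than borrowed from existing storage, which is one reason a simple slice accessor is not an obvious fit for these builders.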

Are there any user-facing changes?

More builders expose a validity_slice(&self) -> Option<&[u8]> method. Since the semantics are identical to pre-existing methods from other builders with this name, the documentation was copy-pasted from there.
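As a rough sketch of how a caller might interpret the returned `Option<&[u8]>`: the helpers below are hypothetical and not part of arrow-rs. They assume Arrow's validity bitmap convention (least-significant bit first, 1 means valid) and that `None` means no null buffer has been materialized, so every element appended so far is valid.

```rust
/// Returns whether element `i` is valid, given an optional validity bitmap.
/// Assumption: `None` means no null has been appended so far.
fn is_valid(validity: Option<&[u8]>, i: usize) -> bool {
    match validity {
        None => true,
        // Bit `i` lives in byte `i / 8`, at position `i % 8` (LSB first).
        Some(bits) => (bits[i / 8] >> (i % 8)) & 1 == 1,
    }
}

/// Counts nulls among the first `len` elements.
fn null_count(validity: Option<&[u8]>, len: usize) -> usize {
    (0..len).filter(|&i| !is_valid(validity, i)).count()
}
```

The caller must supply the element count separately, since the slice is byte-granular and its trailing bits carry no length information.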

@github-actions github-actions bot added the arrow Changes to the arrow crate label May 11, 2024
@tustvold tustvold merged commit 7d465b8 into apache:master May 13, 2024
24 of 25 checks passed
@HadrienG2 HadrienG2 deleted the expose-null-buffers branch May 16, 2024 05:10
Development

Successfully merging this pull request may close these issues.

Why are null buffers / validity slices exposed by some builders, but not others?