Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Make AsyncArrowWriter accepts AsyncFileWriter #5753

Merged
merged 1 commit into from
May 14, 2024

Conversation

Xuanwo
Copy link
Member

@Xuanwo Xuanwo commented May 11, 2024

Which issue does this PR close?

Closes #5738

Rationale for this change

This makes AsyncArrowWriter accepts AsyncFileWriter so users can save an extra copy if they have a writer that can accept Bytes directly.

What changes are included in this PR?

A bit change from the proposal but mainly the same.

Are there any user-facing changes?

AsyncArrowWriter now accepts AsyncFileWriter instead of AsyncWrite. But AsyncFileWriter has been implemented for all AsyncWrite, so existing code should be working as before. For example, all testing code is not touched 💌

Maybe we can catch up the arrow 52's train? #5688

@github-actions github-actions bot added the parquet Changes to the parquet crate label May 11, 2024
@Xuanwo Xuanwo force-pushed the add-async-file-writer branch from a680933 to 4d0fe42 Compare May 11, 2024 11:31
///
/// The underlying writer CAN decide to buffer the data or write it immediately.
/// This design allows the writer implementer to control the buffering and I/O scheduling.
fn write(&mut self, bs: Bytes) -> BoxFuture<'_, Result<()>>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should document atomicity requirements for this method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find current ArrowWriter's design is also not safe to retry. I will update the behavior notes of write to make it more clear and just take the Vec<u8> here to make this PR more simple.

@Xuanwo Xuanwo force-pushed the add-async-file-writer branch from 4d0fe42 to 723515d Compare May 14, 2024 03:38
@Xuanwo
Copy link
Member Author

Xuanwo commented May 14, 2024

Hi @tustvold, PTAL. Thanks!

Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo Xuanwo force-pushed the add-async-file-writer branch from 723515d to 2076737 Compare May 14, 2024 03:40
@tustvold tustvold merged commit d17b206 into apache:master May 14, 2024
16 checks passed
@tustvold
Copy link
Contributor

Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

proposal: Make AsyncArrowWriter accepts AsyncFileWriter trait instead
2 participants