feat(object_store): Add put API for buffered::BufWriter #5835

Merged
alamb merged 6 commits into apache:master from add-buffered-uploader on Jun 13, 2024

Conversation

Xuanwo
Member

@Xuanwo Xuanwo commented Jun 3, 2024

Which issue does this PR close?

Closes #5834

Rationale for this change

Add buffered::BufUploader to provide buffered uploads through the native PutPayload API.

Add put API for buffered::BufWriter

What changes are included in this PR?

New struct: buffered::BufUploader

buffered::BufWriter now has a new async API: put(Bytes).

Are there any user-facing changes?

New API: buffered::BufUploader

New API: buffered::BufWriter::put

Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo
Member Author

Xuanwo commented Jun 6, 2024

Kindly cc @alamb and @tustvold for a look, thanks!

@alamb
Contributor

alamb commented Jun 7, 2024

I will try and review this PR later today or this weekend. Thanks @Xuanwo

@Xuanwo
Member Author

Xuanwo commented Jun 7, 2024

> I will try and review this PR later today or this weekend. Thanks @Xuanwo

Thanks a lot! There is no hurry for this PR to get merged since we won't release soon. I just want to know about your plan first.

@tustvold
Contributor

tustvold commented Jun 7, 2024

Hi, sorry, I had a few too many things going on this week. I'll review this on Monday. This seems to have a lot in common with the existing BufWriter, so I wonder if we can combine them somehow.

Contributor

@alamb alamb left a comment

Thanks @Xuanwo -- I reviewed this PR and I think it looks very good to me. I have a few small comments on API design, naming and tests but I don't think they are critical

Even if some of this logic is duplicated (as @tustvold mentions in #5835 (comment)), I think that as long as we have a good API and a good test, we can refactor to reduce the duplication in a follow-on PR if we have time.

Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo
Member Author

Xuanwo commented Jun 11, 2024

Hi, @alamb, I have addressed all comments. PTAL, thanks!

@tustvold
Contributor

Hi apologies I have been ill the last few days, looking again at this I think we could simply achieve this by adding BufWriter::put with the same signature as BufUploader::put.

In my mind this would have a few key benefits:

  • Would avoid additional cognitive overhead for users wondering which abstraction to use
  • Is consistent with WriteMultipart which has both write and put methods
  • Allows mixing the write and put APIs, e.g. use write for thrift encoding and put for writing parquet pages
  • Already supports all the attributes, tags, etc. (see "Add attributes and tags support for BufUploader", #5867)
  • Avoids code duplication

What do you think?
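
To make the "mix write and put" point concrete, here is a minimal, self-contained sketch. It is a toy model and not the object_store implementation: `ToyWriter`, `Chunk`, `seal` and `finish` are hypothetical names, and `Arc<[u8]>` is only a stand-in for `Bytes`. It assumes that `write` copies small slices into a pending buffer, while `put` takes ownership of an already-built chunk and stores it without copying.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a cheaply clonable byte chunk (the role `Bytes` plays).
type Chunk = Arc<[u8]>;

/// Toy writer for illustration only; it just collects chunks in order.
#[derive(Default)]
struct ToyWriter {
    /// Chunks that are complete and will not be modified again.
    completed: Vec<Chunk>,
    /// Bytes copied in through `write` that are not yet sealed into a chunk.
    in_progress: Vec<u8>,
}

impl ToyWriter {
    /// Copying path: suited to many small writes, such as encoded metadata.
    fn write(&mut self, data: &[u8]) {
        self.in_progress.extend_from_slice(data);
    }

    /// Owned path: stores the chunk as-is, without copying its bytes.
    fn put(&mut self, chunk: Chunk) {
        // Seal pending copied bytes first so the output order is preserved.
        self.seal();
        self.completed.push(chunk);
    }

    /// Turn any pending copied bytes into a completed chunk.
    fn seal(&mut self) {
        if !self.in_progress.is_empty() {
            let sealed: Chunk = std::mem::take(&mut self.in_progress).into();
            self.completed.push(sealed);
        }
    }

    /// Consume the writer and return all chunks in write order.
    fn finish(mut self) -> Vec<Chunk> {
        self.seal();
        self.completed
    }
}
```

In this model, calling `write`, then `put`, then `write` yields three chunks in that order, and the chunk passed to `put` is the same allocation the caller handed in.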

@Xuanwo
Member Author

Xuanwo commented Jun 12, 2024

> Hi apologies I have been ill the last few days,

Wishing you a speedy recovery. Best wishes.

> looking again at this I think we could simply achieve this by adding BufWriter::put with the same signature as BufUploader::put.

As long as we can avoid the extra Box::pin(fut), I'm ok with this design. I will take a look.

Xuanwo added 3 commits June 12, 2024 18:53
Signed-off-by: Xuanwo <github@xuanwo.io>
Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo
Member Author

Xuanwo commented Jun 12, 2024

Hi, @tustvold, I have added put for BufWriter directly, please review again.

@Xuanwo Xuanwo changed the title from "feat(object_store): Add buffered::BufUploader" to "feat(object_store): Add put API for buffered::BufWriter" Jun 12, 2024
Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo Xuanwo requested a review from alamb June 12, 2024 11:30
Contributor

@alamb alamb left a comment

Thanks @Xuanwo -- other than the panic this looks good to me

        Ok(())
    }
    BufWriterState::Write(None) | BufWriterState::Flush(_) => {
        panic!("Already shut down")
Contributor

Since this function returns a Result is there any reason to panic here rather than returning Err?

Member Author

This follows the same behavior as poll_write. put should never reach this state.

I'm fine if we want to return error in both cases.
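
As a rough illustration of the alternative raised above, the sketch below returns an error instead of panicking when a write arrives after shutdown. Everything here is hypothetical and simplified: `SketchWriter`, `SketchState` and `SketchError` are invented for this example, and the real crate has its own state enum and error type.

```rust
use std::fmt;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum SketchError {
    AlreadyShutDown,
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::AlreadyShutDown => write!(f, "writer already shut down"),
        }
    }
}

impl std::error::Error for SketchError {}

/// Simplified stand-in for the writer state; the names are illustrative.
enum SketchState {
    Open(Vec<u8>),
    ShutDown,
}

struct SketchWriter {
    state: SketchState,
}

impl SketchWriter {
    fn new() -> Self {
        Self { state: SketchState::Open(Vec::new()) }
    }

    /// Appends data while open; reports misuse as an error after shutdown.
    fn put(&mut self, data: &[u8]) -> Result<(), SketchError> {
        match &mut self.state {
            SketchState::Open(buf) => {
                buf.extend_from_slice(data);
                Ok(())
            }
            // The caller decides how to handle misuse; nothing panics here.
            SketchState::ShutDown => Err(SketchError::AlreadyShutDown),
        }
    }

    fn shutdown(&mut self) {
        self.state = SketchState::ShutDown;
    }
}
```

The trade-off is that a panic surfaces a programming error immediately, while an `Err` lets the caller recover or propagate it with `?`.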

        self.state = BufWriterState::Write(f.await?.into());
        continue;
    }
    BufWriterState::Buffer(path, b) => {
Contributor

I took a look at trying to combine this code with the very similar code in poll_write, but I couldn't come up with anything that was particularly good.

Member Author

Yes, it's a bit clumsy to refactor the same code for both poll-based and async-await systems just for 10 lines.

Member Author

@Xuanwo Xuanwo Jun 12, 2024

The problem here is I want to avoid the extra cost of Box::pin(fut) by awaiting it in place.
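
The difference being described can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's code: `open_upload`, `PollStyle`, `await_style` and `block_on` are hypothetical names. A poll-based implementation has to keep a pending future in a field between polls, which usually means boxing it, whereas an `async fn` can await the future where it is created and let the compiler store it inline.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical async setup step, standing in for "start an upload".
async fn open_upload(id: u32) -> u32 {
    id + 1
}

/// Poll-based style: the pending future lives in a struct field between
/// polls, so it needs a nameable type; boxing is the usual way to get one.
struct PollStyle {
    pending: Pin<Box<dyn Future<Output = u32>>>,
}

impl Future for PollStyle {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        self.pending.as_mut().poll(cx)
    }
}

/// async/await style: the future is awaited in place and stored inline in
/// this function's own state, with no `Box` allocation.
async fn await_style(id: u32) -> u32 {
    open_upload(id).await
}

/// Waker that does nothing; enough for futures that are ready immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only good enough for this demo.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Both styles produce the same result; the first pays for one heap allocation per stored future, which is the cost the comment above wants to avoid.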

Contributor

@alamb alamb left a comment

Thanks @Xuanwo

@Xuanwo
Member Author

Xuanwo commented Jun 12, 2024

> Thanks @Xuanwo

Thanks a lot for the quick review and nice design from @tustvold

@alamb
Contributor

alamb commented Jun 12, 2024

> > Thanks @Xuanwo
>
> Thanks a lot for the quick review and nice design from @tustvold

I don't know if a week of reviews qualifies as quick -- I hope to do better ;)

@alamb alamb merged commit 601a722 into apache:master Jun 13, 2024
13 checks passed
@alamb
Contributor

alamb commented Jun 13, 2024

Thanks again @Xuanwo

@Xuanwo Xuanwo deleted the add-buffered-uploader branch June 13, 2024 13:32
Labels
object-store Object Store Interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add BufUploader to implement same feature upon WriteMultipart like BufWriter
3 participants