ORC-1613: Zstd decompression supports direct buffer #1789

Closed

@cxzl25 cxzl25 commented Feb 7, 2024

What changes were proposed in this pull request?

ZstdCodec implements the DirectDecompressionCodec interface.

Why are the changes needed?

zstd-jni can decompress directly between direct buffers, which can reduce buffer copying.
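
To illustrate the copy being avoided, here is a minimal sketch of the direct-versus-heap dispatch. It is not the code from this PR: zstd-jni is not part of the JDK, so `java.util.zip.Inflater` (which accepts `ByteBuffer` arguments since Java 11) stands in for the Zstd codec, and `DirectDecompressSketch` with all of its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: Inflater is a stand-in for a Zstd codec.
final class DirectDecompressSketch {

  // Decompresses in -> out and reports which path was taken.
  // Assumes `out` starts at position 0, so flip() exposes only the new bytes.
  static String decompress(ByteBuffer in, ByteBuffer out) throws DataFormatException {
    Inflater inflater = new Inflater();
    try {
      if (in.isDirect() && out.isDirect()) {
        // Direct path: the inflater works on the off-heap buffers themselves.
        inflater.setInput(in);
        while (!inflater.finished() && out.hasRemaining()) {
          if (inflater.inflate(out) == 0 && inflater.needsInput()) {
            break; // truncated input, stop instead of spinning
          }
        }
        out.flip();
        return "direct";
      }
      // Fallback path: stage both sides in heap arrays (two extra copies).
      byte[] src = new byte[in.remaining()];
      in.get(src);
      byte[] dst = new byte[out.remaining()];
      inflater.setInput(src);
      int n = 0;
      while (!inflater.finished() && n < dst.length) {
        int got = inflater.inflate(dst, n, dst.length - n);
        if (got == 0 && inflater.needsInput()) {
          break; // truncated input
        }
        n += got;
      }
      out.put(dst, 0, n);
      out.flip();
      return "heap";
    } finally {
      inflater.end();
    }
  }

  // Compresses a byte array with Deflater so there is something to decompress.
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] chunk = new byte[256];
    while (!deflater.finished()) {
      bos.write(chunk, 0, deflater.deflate(chunk));
    }
    deflater.end();
    return bos.toByteArray();
  }

  // Round-trips `text` through direct or heap buffers; returns "<path>:<text>".
  static String roundTrip(String text, boolean direct) {
    byte[] raw = text.getBytes(StandardCharsets.UTF_8);
    byte[] packed = compress(raw);
    ByteBuffer in = direct
        ? ByteBuffer.allocateDirect(packed.length) : ByteBuffer.allocate(packed.length);
    in.put(packed);
    in.flip();
    ByteBuffer out = direct
        ? ByteBuffer.allocateDirect(raw.length) : ByteBuffer.allocate(raw.length);
    try {
      String path = decompress(in, out);
      byte[] result = new byte[out.remaining()];
      out.get(result);
      return path + ":" + new String(result, StandardCharsets.UTF_8);
    } catch (DataFormatException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("hello hello hello", true));
    System.out.println(roundTrip("hello hello hello", false));
  }
}
```

The two branches produce the same bytes; the difference is that the fallback branch has to copy the input onto the heap and the output back off it.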

How was this patch tested?

Added a unit test.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the JAVA label Feb 7, 2024
@@ -214,6 +214,11 @@ public boolean compress(ByteBuffer in, ByteBuffer out,

  @Override
  public void decompress(ByteBuffer in, ByteBuffer out) throws IOException {
    if (in.isDirect() && out.isDirect()) {
Member

Just a question: when is this used in the Apache ORC code base?

Contributor Author

With zero-copy reads enabled (`orc.use.zerocopy=true`), the reader may use direct buffers. If the codec doesn't implement `DirectDecompressionCodec`, the reader falls back to the original read path.

static HadoopShims.ZeroCopyReaderShim createZeroCopyShim(FSDataInputStream file,
    CompressionCodec codec, ByteBufferAllocatorPool pool) throws IOException {
  if ((codec == null || ((codec instanceof DirectDecompressionCodec) &&
      ((DirectDecompressionCodec) codec).isAvailable()))) {
    /* no codec, or the codec supports direct decompression and is available */
    return SHIMS.getZeroCopyReader(file, pool);
  }
  return null;
}
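
The condition in the snippet above can be exercised in isolation. The following is a rough sketch with hypothetical stand-in types (`Codec`, `DirectCodec`, `FakeDirectCodec`, `ZeroCopyGate`); they are not the ORC interfaces and only mirror the shape of the check: zero-copy is chosen when there is no codec, or when the codec is direct-capable and reports itself available.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the zero-copy gate; none of these types are ORC's.
final class ZeroCopyGate {

  // True when zero-copy reading would be selected for this codec.
  static boolean canUseZeroCopy(Codec codec) {
    return codec == null
        || (codec instanceof DirectCodec && ((DirectCodec) codec).isAvailable());
  }

  // Test double: a direct-capable codec whose availability is configurable.
  static final class FakeDirectCodec implements DirectCodec {
    private final boolean available;

    FakeDirectCodec(boolean available) {
      this.available = available;
    }

    @Override
    public void decompress(ByteBuffer in, ByteBuffer out) {
      out.put(in); // identity "decompression", enough for the sketch
    }

    @Override
    public boolean isAvailable() {
      return available;
    }
  }

  // A codec that does not implement the direct interface at all.
  static Codec heapOnlyCodec() {
    return (in, out) -> out.put(in);
  }

  public static void main(String[] args) {
    System.out.println("no codec:           " + canUseZeroCopy(null));
    System.out.println("direct, available:  " + canUseZeroCopy(new FakeDirectCodec(true)));
    System.out.println("direct, unavailable: " + canUseZeroCopy(new FakeDirectCodec(false)));
    System.out.println("heap-only codec:    " + canUseZeroCopy(heapOnlyCodec()));
  }
}

// Stand-in for a generic compression codec.
interface Codec {
  void decompress(ByteBuffer in, ByteBuffer out);
}

// Stand-in for a codec that can decompress between direct buffers.
interface DirectCodec extends Codec {
  boolean isAvailable();
}
```

Under these assumptions, a codec that lacks the direct interface always takes the fallback path, which is the case this PR removes for Zstd.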

Member

Got it.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM (with one question).

@dongjoon-hyun dongjoon-hyun added this to the 2.0.0 milestone Feb 7, 2024
dongjoon-hyun pushed a commit that referenced this pull request Feb 8, 2024
Closes #1789 from cxzl25/ORC-1613.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 25beaa2)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun
Member

Merged to main/2.0. Thank you, @cxzl25.
