ORC-1602: [C++] limit compression block size
### What changes were proposed in this pull request?
Limit the compression block size on the C++ side.

### Why are the changes needed?
To fix #1727.

### How was this patch tested?
Unit tests passed.

### Was this patch authored or co-authored using generative AI tooling?
NO

Closes #1779 from ffacs/branch-1.8.

Authored-by: ffacs <ffacs520@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
ffacs authored and dongjoon-hyun committed Feb 3, 2024
1 parent 739476c commit 3c20534
Showing 3 changed files with 16 additions and 1 deletion.
2 changes: 2 additions & 0 deletions c++/include/orc/Writer.hh
@@ -73,6 +73,8 @@ namespace orc {

/**
* Set the data compression block size.
* Should be less than 1 << 23 bytes (8M), a limit imposed by the
* 3-byte compression block header (1 bit for isOriginal and 23 bits for length).
*/
WriterOptions& setCompressionBlockSize(uint64_t size);

3 changes: 3 additions & 0 deletions c++/src/Writer.cc
@@ -110,6 +110,9 @@ namespace orc {
}

WriterOptions& WriterOptions::setCompressionBlockSize(uint64_t size) {
if (size >= (1 << 23)) {
throw std::invalid_argument("Compression block size cannot be greater than or equal to 8M");
}
privateBits->compressionBlockSize = size;
return *this;
}
12 changes: 11 additions & 1 deletion c++/test/TestWriter.cc
@@ -2122,5 +2122,15 @@ namespace orc {
}
}

TEST_P(WriterTest, testValidateOptions) {
WriterOptions options;
constexpr uint64_t compressionBlockSizeThreshold = (1 << 23) - 1;
EXPECT_NO_THROW(options.setCompressionBlockSize(compressionBlockSizeThreshold));
EXPECT_THROW(options.setCompressionBlockSize(compressionBlockSizeThreshold + 1),
std::invalid_argument);
EXPECT_THROW(options.setCompressionBlockSize(compressionBlockSizeThreshold + 2),
std::invalid_argument);
}

INSTANTIATE_TEST_CASE_P(OrcTest, WriterTest, Values(FileVersion::v_0_11(), FileVersion::v_0_12(), FileVersion::UNSTABLE_PRE_2_0()));
}
} // namespace orc
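
The 8M limit follows from the header layout mentioned in the `Writer.hh` comment: 3 bytes hold 24 bits, of which 1 bit is `isOriginal` and 23 bits are the length. The following is a minimal sketch of that packing, assuming the little-endian `(length << 1) | isOriginal` layout described in the ORC specification. The names `encodeBlockHeader`, `decodeBlockHeader`, `BlockHeader`, and `kMaxBlockLength` are hypothetical and are not part of the ORC C++ API or of this commit.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>

// Hypothetical names for illustration only; not part of the ORC C++ library.
constexpr uint64_t kMaxBlockLength = (1ULL << 23) - 1;  // largest value that fits in 23 bits

struct BlockHeader {
  uint64_t length;
  bool isOriginal;
};

// Pack (length, isOriginal) into 3 little-endian bytes as (length << 1) | isOriginal.
std::array<uint8_t, 3> encodeBlockHeader(uint64_t length, bool isOriginal) {
  if (length > kMaxBlockLength) {
    // A length of 1 << 23 or more would overflow the 23-bit field.
    throw std::invalid_argument("block length does not fit in 23 bits");
  }
  const uint32_t value = (static_cast<uint32_t>(length) << 1) | (isOriginal ? 1u : 0u);
  return {static_cast<uint8_t>(value & 0xff),
          static_cast<uint8_t>((value >> 8) & 0xff),
          static_cast<uint8_t>((value >> 16) & 0xff)};
}

// Reverse the packing: the low bit is isOriginal, the remaining 23 bits are the length.
BlockHeader decodeBlockHeader(const std::array<uint8_t, 3>& bytes) {
  const uint32_t value = static_cast<uint32_t>(bytes[0]) |
                         (static_cast<uint32_t>(bytes[1]) << 8) |
                         (static_cast<uint32_t>(bytes[2]) << 16);
  return {value >> 1, (value & 1u) != 0};
}
```

Under this layout a block length of `(1 << 23) - 1` with `isOriginal` set fills all 24 bits, which is consistent with `setCompressionBlockSize` rejecting any size of `1 << 23` or more.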
