fix: Make TPCH dbgen text buffer size consistent with Presto Java (#12169)

Summary:
Change the text buffer size of Velox's dbgen to 300 MB so that it matches the text buffer size of the Java Presto TPCH dbgen. The text buffer size is used when randomly generating the offset and length that select a chunk of the overall text for each row, so a different buffer size selects different text. This fixes the mismatch in the comment columns of the TPCH tables.
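The dependence on the buffer size can be illustrated with a minimal sketch. It is not the dbgen code: `Lcg`, `Chunk`, and `pickChunk` are hypothetical names, the generator is a plain Park-Miller LCG chosen only for determinism, and the assumption is that the offset is drawn from a range bounded by the pool size.

```cpp
#include <cstdint>

// Hypothetical stand-in generator (Park-Miller "minimal standard" LCG).
// Illustration only; this is not the generator used by dbgen.
struct Lcg {
  static constexpr std::uint64_t kModulus = 2147483647ULL;
  std::uint64_t state; // Must be in [1, kModulus - 1].

  std::uint64_t next() {
    state = (state * 16807ULL) % kModulus;
    return state;
  }

  // Scales the next raw value into [low, high], inclusive.
  std::int64_t uniform(std::int64_t low, std::int64_t high) {
    const std::uint64_t range = static_cast<std::uint64_t>(high - low + 1);
    return low + static_cast<std::int64_t>((next() * range) / kModulus);
  }
};

struct Chunk {
  std::int64_t offset;
  std::int64_t length;
};

// Picks the chunk of the text pool that would become one row's comment.
// Assumption: the offset range is bounded by poolSize, so the same seed
// selects a different chunk when the pool size changes.
Chunk pickChunk(
    Lcg& rng,
    std::int64_t poolSize,
    std::int64_t minLength,
    std::int64_t maxLength) {
  const std::int64_t offset = rng.uniform(0, poolSize - maxLength);
  const std::int64_t length = rng.uniform(minLength, maxLength);
  return {offset, length};
}
```

In this sketch, two generators seeded identically (for example `Lcg{42}`) return the same length but different offsets when `pickChunk` is called with pool sizes of 10 MB and 300 MB, which is the kind of divergence that showed up in the comment columns.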

Java:
https://github.com/trinodb/tpch/blob/master/src/main/java/io/trino/tpch/TextPool.java#L35
```
private static final int DEFAULT_TEXT_POOL_SIZE = 300 * 1024 * 1024;
```

C++ (before this change):
https://github.com/facebookincubator/velox/blob/main/velox/tpch/gen/DBGenIterator.cpp#L40
```
load_dists(
        10 * 1024 * 1024, &dbgenCtx); // 10 MB buffer size for text generation.
```
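As a quick check of the arithmetic behind the two sizes, the sketch below evaluates both expressions at compile time. The constant names are hypothetical and are not taken from either code base; the only claim is about the numeric values, including that the 300 MB value still fits in a signed 32-bit integer.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical names for the two buffer sizes quoted above.
constexpr std::int64_t kOldTextPoolBytes = 10LL * 1024 * 1024;
constexpr std::int64_t kNewTextPoolBytes = 300LL * 1024 * 1024;

static_assert(kOldTextPoolBytes == 10485760, "10 MB in bytes");
static_assert(kNewTextPoolBytes == 314572800, "300 MB in bytes");

// The larger value is still below 2^31 - 1, so writing it as the int
// expression 300 * 1024 * 1024 does not overflow a 32-bit int.
static_assert(
    kNewTextPoolBytes <= std::numeric_limits<std::int32_t>::max(),
    "300 MB fits in a signed 32-bit integer");

// The new pool is 30 times the size of the old one.
static_assert(kNewTextPoolBytes / kOldTextPoolBytes == 30, "size ratio");
```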

Resolves: prestodb/presto#24011

Pull Request resolved: #12169

Reviewed By: amitkdutta

Differential Revision: D68653706

Pulled By: xiaoxmeng

fbshipit-source-id: 635cc572bc79c33662e26124589992bcf6962830
minhancao authored and facebook-github-bot committed Jan 25, 2025
1 parent cac8000 commit dea4758
Showing 2 changed files with 7 additions and 6 deletions.
10 changes: 5 additions & 5 deletions velox/connectors/tpch/tests/TpchConnectorTest.cpp
@@ -97,11 +97,11 @@ TEST_F(TpchConnectorTest, simple) {
 makeFlatVector<int64_t>({0, 1, 1, 1, 4}),
 // n_comment
 makeFlatVector<StringView>({
-"furiously regular requests. platelets affix furious",
-"instructions wake quickly. final deposits haggle. final, silent theodolites ",
-"asymptotes use fluffily quickly bold instructions. slyly bold dependencies sleep carefully pending accounts",
-"ss deposits wake across the pending foxes. packages after the carefully bold requests integrate caref",
-"usly ironic, pending foxes. even, special instructions nag. sly, final foxes detect slyly fluffily ",
+" haggle. carefully final deposits detect slyly agai",
+"al foxes promise slyly according to the regular accounts. bold requests alon",
+"y alongside of the pending deposits. carefully special packages are about the ironic forges. slyly special ",
+"eas hang ironic, silent packages. slyly regular packages are furiously over the tithes. fluffily bold",
+"y above the carefully unusual theodolites. final dugouts are quickly across the furiously regular d",
 }),
 });
 test::assertEqualVectors(expected, output);
3 changes: 2 additions & 1 deletion velox/tpch/gen/DBGenIterator.cpp
@@ -37,7 +37,8 @@ class DBGenBackend {
 // structures required by dbgen are populated.
 DBGenContext dbgenCtx;
 load_dists(
-10 * 1024 * 1024, &dbgenCtx); // 10 MB buffer size for text generation.
+300 * 1024 * 1024,
+&dbgenCtx); // 300 MB buffer size for text generation.
 }
 ~DBGenBackend() {
 cleanup_dists();
