-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrated Boot into Proto-X #31
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only comments are minor thoughts as I read it. Nice work independently integrating everything.
experiments/protox_tpch_sf10/main.sh
Outdated
INTENDED_PGDATA_HARDWARE=ssd | ||
PGDATA_PARENT_DPATH=/mnt/nvme1n1/phw2/dbgym_tmp/ | ||
|
||
# space for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is temporary? Not sure if you wanted to remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to run specific commands within main.sh
. It's inconvenient to copy them directly because they rely on ENV_VARS, so this is my way of "copying" them. I'll add a comment better explaining this, but I imagine it'll be permanent.
scripts/pat_test.sh
Outdated
INTENDED_PGDATA_HARDWARE=ssd | ||
PGDATA_PARENT_DPATH=/mnt/nvme1n1/phw2/dbgym_tmp/ | ||
|
||
# space for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment, but doesn't matter much.
@@ -208,12 +220,59 @@ def start_with_changes( | |||
"Waiting for postgres to bootup but it is not..." | |||
) | |||
|
|||
# Move the temporary over since we know the temporary can load. | |||
# Set up Boot if we're told to do so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This leaks abstractions between modules somewhat, (proto-x knowing about Boot), but from briefly thinking about it, it doesn't seem like fixing the abstraction leak is worth it.
I think in an ideal world, the connection handling logic would be independent of the tuner -- then stuff like Boot doesn't need to get pushed down into the tuner.
However, given that the tuner currently manages its own connections, this is fine. As long as there aren't more than 3-5 things that leak, it probably won't be too bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that's something I didn't realize. I think there is an argument for why Proto-X should know about Boot though, because when you use Proto-X to tune, you need to indicate whether Boot was used during tuning or not to know whether the numbers are real runtimes or estimated. We can revisit this later though.
Summary: Integrated Boot to accelerate the HPO and tuning loops of Proto-X
Demo:
On TPC-H SF10, using Boot (image 1), Proto-X finds a configuration with a "Best Seen Metric" of 69.67s after ~1hr of tuning. Without Boot (image 2), Proto-X only finds a configuration with a "Best Seen Metric" of 103.10s after ~1hr of tuning. Note that "Best Seen Metric" refers to the total runtime of the last run of all 22 queries TPC-H. "Best Metric" is set to "Best Seen Metric" whenever a run has no timed-out queries. Both runs used the same hyperparameters and the same per-query timeout.
Details:
postgres build
.--enable-boot-during-[hpo|tune]
. Boot is enabled separately for HPO and for tuning.