Fit preprocessor just once with tune_bayes
#955
Comments
This is an excellent point. Once a candidate is created by the Gaussian process model, we pass that to We would have to make substantial changes to We’ll have to consider this to see if there is a less invasive approach than the one described above (I don't think we can do that).
If there is nothing to tune in the preprocessor, could it be 'baked' prior to starting the tuning process altogether? Then the workflow gets modified to use the baked data and no preprocessor, the tuning is conducted, and then everything gets repackaged at the end? Maybe this is too much work for too little gain in a special case.
Unless you are using a validation set, we would not want to fit the preprocessor on the entire training set and then fit the model on a potentially different data set (i.e., one that was a resample).
I'm thinking that if we are passing in resamples, we could bake the preprocessor on each resample in advance, or something like that.
Ok, I've tried hacking this together and I see why it won't work. The workflow expects the data in each resample to look similar (same columns, etc.), but if we preprocess and glue the resamples back together, each resample could have different columns, and that breaks things down the line. FWIW, I brought all this up because I have a workflow that involves
Feature
Currently, it appears that tune_bayes recomputes the entire preprocessor during every iteration, even if the preprocessor has nothing to tune. This can lead to a substantial amount of unnecessary computation, as the preprocessor should only need to be executed once and could be reused for all iterations.
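
To make the cost difference concrete, here is a minimal, language-agnostic sketch in plain Python. It is not how tune_bayes is implemented; `fit_preprocessor`, `tune`, and the toy data are hypothetical names invented for illustration. It assumes the preprocessor has no tuning parameters, so a fit on a given resample's analysis set can be cached and reused for every candidate. The preprocessor is still fitted separately per resample, never on the full training set.

```python
from statistics import mean, pstdev


def fit_preprocessor(rows, counter):
    """Hypothetical stand-in for an expensive preprocessor fit (center and scale)."""
    counter["fits"] += 1
    mu = mean(rows)
    sd = pstdev(rows) or 1.0  # guard against zero variance
    return lambda xs: [(x - mu) / sd for x in xs]


def tune(resamples, candidates, cache_preprocessor):
    """Loop over candidates and resamples; return how many preprocessor fits ran."""
    counter = {"fits": 0}
    cache = {}  # fold index -> fitted preprocessor
    for _candidate in candidates:  # one candidate per search iteration
        for fold_id, (analysis, assessment) in enumerate(resamples):
            if cache_preprocessor:
                if fold_id not in cache:
                    cache[fold_id] = fit_preprocessor(analysis, counter)
                transform = cache[fold_id]
            else:
                transform = fit_preprocessor(analysis, counter)
            transform(assessment)  # model fitting and scoring omitted
    return counter["fits"]


# Toy data: three resamples, each an (analysis, assessment) pair.
resamples = [
    ([1.0, 2.0, 3.0], [4.0]),
    ([2.0, 3.0, 4.0], [1.0]),
    ([1.0, 3.0, 4.0], [2.0]),
]
candidates = [0.1, 0.2, 0.3, 0.4, 0.5]

print(tune(resamples, candidates, cache_preprocessor=False))  # 15 fits
print(tune(resamples, candidates, cache_preprocessor=True))   # 3 fits
```

With three resamples and five candidates, the uncached loop fits the preprocessor 15 times and the cached loop fits it 3 times, once per resample.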