
Is there a way to save the preprocessing objects for inference? (OneHotEncoder, Scaler) #76

Closed
kkristacia opened this issue May 21, 2024 · 5 comments · Fixed by #79
Labels: enhancement (New feature or request)

@kkristacia

Hi, thank you for developing this package! I want to be able to load an already saved model and then use it for inference, as in production. How can I make the inference dataset go through the same preprocessing steps, e.g. one-hot encoding of categorical variables and scaling?

@akashsaravanan-georgian (Member) commented May 22, 2024

Hi @kkristacia,
To load the model, run the same steps you used to create it. The only difference is that when calling model = AutoModelWithTabular.from_pretrained(...), make sure to set the first argument, pretrained_model_name_or_path, to the path where you saved your model.
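
Roughly, loading looks like the sketch below (the save directory is a placeholder path, not something from this thread):

```python
from transformers import AutoConfig, AutoTokenizer
from multimodal_transformers.model import AutoModelWithTabular

model_path = './saved_model'  # placeholder: the directory you saved the model to

# Tokenizer and config are restored from the same save directory
tokenizer = AutoTokenizer.from_pretrained(model_path)
config = AutoConfig.from_pretrained(model_path)
model = AutoModelWithTabular.from_pretrained(model_path, config=config)
model.eval()  # switch to inference mode
```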

Similarly, to preprocess the inference dataset, I would recommend running the load_data_from_folder function with the same parameters you used during training. Use the same training data to reconstruct the encoders, and replace the test data with your inference data. I know this isn't optimal, so we'll definitely change it in a future version.
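
A sketch of that workaround, assuming hypothetical column names (text, label, cat_feature, num_feature); pass exactly the arguments you used at training time:

```python
from multimodal_transformers.data import load_data_from_folder

# The folder keeps the original train.csv and val.csv; test.csv is replaced
# with the inference data, so the encoder and scaler are refit on exactly
# the same training data as before.
train_dataset, val_dataset, inference_dataset = load_data_from_folder(
    './data',                      # placeholder folder path
    text_cols=['text'],            # hypothetical column names
    tokenizer=tokenizer,           # the tokenizer loaded with the model
    label_col='label',
    categorical_cols=['cat_feature'],
    numerical_cols=['num_feature'],
    categorical_encode_type='ohe',                   # same settings as training
    numerical_transformer_method='quantile_normal',
)
```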

Please let me know if you run into any other issues and I'll help you solve them! :)

akashsaravanan-georgian added the question (Further information is requested) label on May 22, 2024
@kkristacia (Author)

Hi Akash, thanks for the clarification. Yeah, I was hoping for a way to avoid using the training data during inference. It would definitely be great if a future version added this functionality!

@dsunart commented May 28, 2024

Hi Akash. Just to second this: it would be great if the preprocessing objects were saved so they can be used for inference in production. Loading my whole training dataset into the production environment would take up space unnecessarily. Love the toolkit, and looking forward to a future update!

akashsaravanan-georgian added the enhancement (New feature or request) label and removed the question (Further information is requested) label on May 28, 2024
@akashsaravanan-georgian (Member)

Thanks @dsunart! I'm reopening this issue as a feature request. It should be added as part of our next release!

@akashsaravanan-georgian (Member)

Hey @kkristacia and @dsunart, happy to note that this is now part of the toolkit. You can see this in action in this example.
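
(For anyone who lands here before reading the example: the underlying pattern, shown as a generic sketch below rather than the toolkit's exact API, is to persist the fitted preprocessing objects next to the model, e.g. with joblib, and load them at inference time.)

```python
import joblib
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# At training time: fit on the training data, then persist alongside the model.
# train_df, inference_df, categorical_cols, numerical_cols are assumed names.
ohe = OneHotEncoder(handle_unknown='ignore').fit(train_df[categorical_cols])
scaler = StandardScaler().fit(train_df[numerical_cols])
joblib.dump(ohe, './saved_model/ohe.joblib')
joblib.dump(scaler, './saved_model/scaler.joblib')

# At inference time: load and transform; no training data is needed.
ohe = joblib.load('./saved_model/ohe.joblib')
scaler = joblib.load('./saved_model/scaler.joblib')
cat_feats = ohe.transform(inference_df[categorical_cols])
num_feats = scaler.transform(inference_df[numerical_cols])
```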
