Excessive memory consumption in hyperopt scan #2150

Open
scarlehoff opened this issue Sep 9, 2024 · 3 comments
Labels
bug, hyperoptimization

Comments

@scarlehoff
Member

scarlehoff commented Sep 9, 2024

When running a hyperparameter scan, memory usage grows in proportion to the number of trials.

This smells like a memory leak, since virtually all information can be safely dropped from one trial to the next. In principle there is a call to clear_backend between hyperopt trials, but clearly not all memory is freed. This should be fixed.
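One way to check whether Python-level state survives between trials is to record traced memory after each one. The sketch below is not taken from the codebase: `run_trial` and `leak_store` are hypothetical stand-ins that deliberately keep about 1 MB alive per trial. Note that `tracemalloc` only sees allocations made through Python's allocator, so it would not show GPU memory or native TensorFlow buffers; it can only help find Python objects that still hold references.

```python
import gc
import tracemalloc


def run_trial(trial_index, leak_store):
    """Hypothetical stand-in for one hyperopt trial.

    It appends to `leak_store` on purpose to mimic state that survives
    between trials; a real trial would build and fit the model here.
    """
    leak_store.append(bytearray(1_000_000))  # ~1 MB kept alive per trial


def measure_growth(n_trials):
    """Return the traced Python memory (in bytes) after each trial."""
    leak_store = []
    usage = []
    tracemalloc.start()
    try:
        for i in range(n_trials):
            run_trial(i, leak_store)
            gc.collect()  # free anything that is merely unreachable
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
    finally:
        tracemalloc.stop()
    return usage


usage = measure_growth(5)
# Difference between consecutive trials: roughly constant and positive
# if something is retained per trial, close to zero otherwise.
growth_per_trial = [b - a for a, b in zip(usage, usage[1:])]
print(growth_per_trial)
```

If the per-trial growth stays near zero here while the process memory still climbs, the retained memory is more likely held on the native or GPU side than by Python objects.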

@Cmurilochem @goord I understand you don't have more time in the project to debug this issue, so I'll do it at some point, but have you by any chance already looked into this (and maybe know where the memory leak is coming from)? It would save me some time.

edit: some more info. I can get away with a parallel run of ~60-100 replicas with ~20 GB of RAM. Running a hyperparameter scan should, in the worst-case scenario, take nfolds × ~20 GB. Even that should be reduced, since there's no reason to keep the NN weights and gradients around after the fold has finished.
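The bound quoted above can be written as a quick check. This is only a sketch of the arithmetic: the function names, the value `nfolds = 4` and the observed figure of 150 GB are all hypothetical and not taken from an actual run.

```python
def worst_case_gb(nfolds, per_fit_gb=20.0):
    """Upper bound from the estimate above: one ~20 GB parallel fit per fold."""
    return nfolds * per_fit_gb


def exceeds_bound(observed_gb, nfolds, per_fit_gb=20.0):
    """True if the observed usage is above the worst-case estimate."""
    return observed_gb > worst_case_gb(nfolds, per_fit_gb)


# nfolds = 4 and 150 GB are hypothetical values for illustration only.
print(worst_case_gb(4))          # 80.0
print(exceeds_bound(150.0, 4))   # True
```

Since the bound does not depend on the number of trials, usage that keeps growing with the trial count will eventually exceed it for any fixed nfolds.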

scarlehoff added the bug and hyperoptimization labels Sep 9, 2024
@Cmurilochem
Collaborator

I remember we discussed memory leaks in connection with TensorFlow versions a while ago. But I guess this is a different thing, and honestly I have never looked into this issue.

@goord
Collaborator

goord commented Sep 12, 2024

Are you running on GPU or CPU, @scarlehoff?

@scarlehoff
Member Author

GPU.

3 participants