[WIP] SamTov measure memory scaling #476
base: main
Conversation
I would have some questions:
What do you mean by that?
I did this for MLSuite with
They do have 7 GB of memory for Linux / Windows and I don't think we can change that.
CI/integration_tests/calculators/test_radial_distribution_function.py
memory_scaling_test: bool = False
memory_fraction: float = 0.5
Maybe we could expand the config to be config.memory.scaling_test = True
instead of config.memory_scaling_test = True
with additional dataclasses. This way it could be more structured.
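A minimal sketch of the nested-config idea using standard-library dataclasses; the class and attribute names here are hypothetical and mirror the attributes discussed in this thread, not MDSuite's actual configuration classes:

```python
from dataclasses import dataclass, field


@dataclass
class MemoryConfig:
    # Groups the memory-related options proposed in this PR.
    scaling_test: bool = False
    fraction: float = 0.5


@dataclass
class Config:
    # Nesting via default_factory avoids sharing one mutable
    # MemoryConfig instance between Config objects.
    memory: MemoryConfig = field(default_factory=MemoryConfig)


config = Config()
# config.memory.scaling_test instead of config.memory_scaling_test
config.memory.scaling_test = True
```

Further option groups (e.g. for I/O or parallelism) could be added as sibling dataclasses on `Config` in the same way.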
Yeah, that's an interesting idea. How configuration is managed in general is worth discussing, as it can get quite involved. I think having dataclasses for different things, like you mention here, would be very nice.
So what I want to do, and what I have started in the test module, is to generate several hdf5 databases with data of different exact sizes, e.g. 1 MB, 10 MB, and so on. In this case, rather than generating a numpy array, saving it to a readable file, and then reading it with MDSuite, I want to make an hdf5 database, create an experiment, and then add the database as data to that experiment. In practice, this would be the equivalent of saving your simulation data into the H5MD database format and then using it in MDSuite.
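One way the exact-size part could be done (a sketch; `array_for_size` is a hypothetical helper, and the hdf5 write step is only indicated in a comment since the dataset layout MDSuite expects is not shown here):

```python
import numpy as np


def array_for_size(target_bytes: int, dtype=np.float64) -> np.ndarray:
    """Return a 1-D zero array whose buffer is exactly target_bytes.

    target_bytes must be a multiple of the dtype's item size.
    """
    itemsize = np.dtype(dtype).itemsize
    assert target_bytes % itemsize == 0, "size not divisible by item size"
    return np.zeros(target_bytes // itemsize, dtype=dtype)


one_mb = array_for_size(1 * 1024**2)    # exactly 1 MiB of float64 data
ten_mb = array_for_size(10 * 1024**2)   # exactly 10 MiB

# The arrays could then be written to an hdf5 file, e.g. with h5py
# (not run here), and that file added to an MDSuite experiment:
#   with h5py.File("test_1mb.h5", "w") as f:
#       f.create_dataset("positions", data=one_mb)
```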
I thought somewhere you can set a memory limit, but maybe I am mistaken. In that case, we can set up local runners and just keep it as a test for, say, releases that we perform locally.
Please work
# Conflicts:
#   CI/integration_tests/calculators/test_einstein_helfand_ionic_conductivity.py
…nto SamTov_Measure_Memory_Scaling
Test this PR with 2 GB on
Fix #464 and fix the memory safety of MDSuite, which is currently not robust enough for small-memory machines (< 8 GB or so) working on large data sets (> 10 GB).
Memory scaling measurement methodology
It seems one can use the pytest-monitoring plugin for pytest to measure the memory usage of specific tests. The measurements are stored in an SQLite database that can then be queried for information about the specific test that was run. The workflow is outlined below:
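As a sketch of the query step: the snippet below builds a toy SQLite database that mimics a per-test metrics table and pulls out memory usage per test. The table and column names (`TEST_METRICS`, `ITEM`, `MEM_USAGE`) and the values are assumptions for illustration and should be checked against the file the plugin actually writes.

```python
import sqlite3

# Toy database standing in for the plugin's results file; in practice
# one would connect to the SQLite file the plugin produces instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST_METRICS (ITEM TEXT, MEM_USAGE REAL)")
conn.executemany(
    "INSERT INTO TEST_METRICS VALUES (?, ?)",
    [
        ("test_rdf_1mb", 120.5),     # MB, illustrative values only
        ("test_rdf_10mb", 410.0),
        ("test_rdf_100mb", 2050.0),
    ],
)

# Peak memory per test, largest first: this is the kind of query the
# memory-scaling measurement would run against the real results.
rows = conn.execute(
    "SELECT ITEM, MAX(MEM_USAGE) FROM TEST_METRICS "
    "GROUP BY ITEM ORDER BY MAX(MEM_USAGE) DESC"
).fetchall()
for item, mem in rows:
    print(f"{item}: {mem:.1f} MB")
```

Plotting `MEM_USAGE` against the input sizes from the differently sized test databases would then give the scaling curve this PR is after.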
This requires the person running the experiment to install pytest-monitoring locally. It could theoretically be added to the CI on GitHub and run every time, but we would then have to make the SQL reading more robust.
Tasks
Reviewer notes
Any assistance with this is welcome, as it is not the only important PR on MDSuite at the moment: #475 and this one still need to be fixed before any release.