I see the same behavior on my system, and for different types of calculations. In fact, when monitoring my jobs, I see that memory usage increases linearly with time, which for large calculations ultimately leads to OOM kills.
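For anyone who wants to put a number on that growth, here is a minimal sketch, assuming a Linux node where `/proc/<pid>/status` is readable. It samples the resident set size of one process, fits a straight line through the samples, and extrapolates when a given limit would be reached. The function names (`read_rss_bytes`, `fit_growth`, `seconds_until_limit`) are hypothetical helpers and not part of OpenQP or Slurm. The scheduler's own accounting may count memory differently from a single process's RSS, so treat the result as a rough estimate.

```python
import os


def read_rss_bytes(pid):
    """Return the resident set size of `pid` in bytes (Linux /proc only)."""
    with open(f"/proc/{pid}/status") as fh:
        for line in fh:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) * 1024  # the kernel reports kB
    raise RuntimeError(f"VmRSS not found for pid {pid}")


def fit_growth(samples):
    """Least-squares line through (seconds, bytes) samples.

    Returns (slope_in_bytes_per_second, intercept_in_bytes).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one time point")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    return slope, mean_m - slope * mean_t


def seconds_until_limit(samples, limit_bytes):
    """Seconds after the last sample until the fitted line reaches the limit.

    Returns None when the fitted memory usage is flat or shrinking.
    """
    slope, intercept = fit_growth(samples)
    if slope <= 0:
        return None
    t_hit = (limit_bytes - intercept) / slope
    return max(0.0, t_hit - samples[-1][0])
```

To use it, one could call `read_rss_bytes(pid)` every few minutes, collect `(elapsed_seconds, rss_bytes)` pairs, and pass them to `seconds_until_limit` together with the memory requested for the job. A clearly positive slope that stays constant across optimization steps would support the linear-growth observation.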
Dear OpenQP team,
During package testing, several of our optimization jobs were killed prematurely by the scheduler (Slurm) for exceeding the RAM limit. One example is attached. For that job, we requested 16 cores and 48 GB of memory. We also tried increasing the memory to 4 GB/core for another molecule of similar size, with the same result in the end; the same problem appeared with 8-core/24 GB submissions. In every case we have seen, the job was killed within 24 hours of starting.
Our configuration: Linux 5.14.0-427.37.1.el9_4.x86_64, gcc 11.3.0, Intel MKL, SLURM 22.05.9
We can provide more info if needed.
The batch log file, input file, starting xyz, and a cropped version of the log file are attached in OOM.zip (the full log is too big but is available from Drive, link below).
https://drive.google.com/file/d/1KUrlGrdIFAUvsHWa4Mshl5WRU97EKrrA/view?usp=sharing