
job scripts for candide #13

Open
lbaumo opened this issue Jun 30, 2023 · 7 comments


lbaumo commented Jun 30, 2023

create a sample script to submit to the batch queue and a script version of the code


lbaumo commented Aug 29, 2023

There are still memory problems. I'm getting this error for the Gaussian process regressor:
Traceback (most recent call last):
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/site-packages/joblib/externals/loky/backend/resource_tracker.py", line 278, in main
    registry[rtype][name] -= 1
KeyError: '/dev/shm/joblib_memmapping_folder_28332_8b7b16870bc540a78fa449e97d0c2b55_4cb00576fda144e5bb3b654cf5effdde/28332-140497104158792-9f8cc414dcd74eef9ce42f5cf4cead4a.pkl'


lbaumo commented Aug 29, 2023

The solution appears to be here; I will give it a try.
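
For context (the linked fix isn't preserved in this copy, so this is an assumption): a commonly suggested workaround for this loky `resource_tracker` error is to move joblib's memmapping folder off the small `/dev/shm` tmpfs before the workers spin up, along these lines:

```python
import os

# Hypothetical scratch location; substitute whatever scratch space
# the candide cluster provides. Must be set before joblib creates memmaps.
scratch = os.path.join("/tmp", os.environ.get("USER", "user"), "joblib_tmp")
os.makedirs(scratch, exist_ok=True)

# joblib honours this environment variable for its temporary memmap folder.
os.environ["JOBLIB_TEMP_FOLDER"] = scratch
```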


lbaumo commented Aug 29, 2023

That solution works, but now there is a different issue:

Traceback (most recent call last):
  File "/home/baumont/software/shear-pipe-peaks/example/constraints_CFIS-P3.py", line 111, in <module>
    with Pool() as pool:
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/context.py", line 119, in Pool
    context=self.get_context())
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/pool.py", line 174, in __init__
    self._repopulate_pool()
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/pool.py", line 239, in _repopulate_pool
    w.start()
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/home/baumont/.conda/envs/sp-peaks/lib/python3.6/multiprocessing/popen_fork.py", line 66, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

martinkilbinger commented

Are you running it on a compute node?


lbaumo commented Aug 29, 2023

I can run it on my laptop and in a notebook, so I tried the script on the login node just to see whether the chains get started. I am now trying on a compute node and have increased the memory allocation:
```
JOBID: 283632.cmaster
USER: baumont
GROUP: baumont
JOBNAME: peaks-mcmc
SESSIONID: 231638
RESOURCESLIST: mem=10gb,neednodes=1:ppn=8,nodes=1:ppn=8,walltime=10:00:00
RESOURCESUSED: cput=00:19:58,mem=5583580kb,vmem=91802740kb,walltime=00:00:55
QUEUE: batch
JOB EXIT STATUS: 1
```
Hmm, it's weird that Python gives the error before the job script output.


lbaumo commented Aug 29, 2023

Increasing the memory gives the same error, but the usage does not really get close to the maximum:

```
JOBID: 283633.cmaster
USER: baumont
GROUP: baumont
JOBNAME: peaks-mcmc
SESSIONID: 232911
RESOURCESLIST: mem=32gb,neednodes=1:ppn=24,nodes=1:ppn=24,walltime=10:00:00
RESOURCESUSED: cput=00:19:23,mem=4153168kb,vmem=90186128kb,walltime=00:00:53
QUEUE: batch
JOB EXIT STATUS: 1
```


lbaumo commented Aug 29, 2023

Ah, so apparently when using a multiprocessing.Pool, the default way to start processes is fork. The issue with fork is that the entire parent process is duplicated for each worker, and the script was using a default number of processes equal to the number of cores on the node, even when I did not allocate the entire node in the batch queue script. I made the number of processes a user input (sketched below), and now the job is running. Hopefully it works.
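
A minimal sketch of that change, assuming the pool size is exposed as an `--nproc` command-line flag (the flag name and the `run_chain` placeholder are illustrative, not the actual `constraints_CFIS-P3.py` code):

```python
import argparse
from multiprocessing import Pool


def run_chain(seed):
    """Placeholder for one MCMC chain; the real work lives in the script."""
    return seed ** 2


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--nproc", type=int, default=4,
        help="number of worker processes; match the ppn request instead of "
             "defaulting to every core on the node",
    )
    args = parser.parse_args()

    # Pool() with no argument forks os.cpu_count() workers, each duplicating
    # the parent's address space -- that is what exhausted the allocation.
    with Pool(processes=args.nproc) as pool:
        results = pool.map(run_chain, range(8))
    print(results)
```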
