Early requeue of jobs #57

Open
aeantipov opened this issue Dec 11, 2020 · 1 comment
Comments

@aeantipov

(copy-paste from a chat)
Consider a case of 100 jobs, each taking 30 minutes.
When adaptive-scheduler submits these jobs, it creates a separate Slurm job for each one.
That requires allocating a node per job.
But when the jobs are not very long, the earlier nodes can already be idle by the time the 50th node becomes available.
So you end up with 98 allocated nodes and 98% of the calculation finished, only to wait for that last node to boot up.

Would checking more frequently and requeuing jobs onto already allocated nodes be useful for adaptive-scheduler, or is it too much of a hassle?
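
To put rough numbers on the scenario above, here is a small Python sketch comparing the two strategies. It is not adaptive-scheduler code and uses none of its API. The function names and the `boot_gap_minutes` parameter (an assumed fixed delay between consecutive nodes becoming available) are hypothetical and only serve the illustration.

```python
import heapq


def makespan_one_node_per_job(n_jobs, job_minutes, boot_gap_minutes):
    """Total wall time when job i must wait for its own node.

    Assumption: node i becomes available at i * boot_gap_minutes.
    """
    # The last job starts when the last node boots, then runs to completion.
    return (n_jobs - 1) * boot_gap_minutes + job_minutes


def makespan_with_requeue(n_jobs, job_minutes, boot_gap_minutes):
    """Total wall time when any free node picks up the next pending job.

    Same assumed boot schedule as above; a node that finishes a job is
    immediately reused instead of sitting idle.
    """
    # Min-heap of the times at which each node is next free.
    free_at = [i * boot_gap_minutes for i in range(n_jobs)]
    heapq.heapify(free_at)

    finish = 0
    for _ in range(n_jobs):
        start = heapq.heappop(free_at)   # earliest node that is free
        end = start + job_minutes
        finish = max(finish, end)
        heapq.heappush(free_at, end)     # node is reusable once the job ends
    return finish
```

Calling both functions with `n_jobs=100`, `job_minutes=30` and some guessed `boot_gap_minutes` shows how much of the wait comes from the late nodes. The larger the boot gap relative to the job length, the bigger the difference between the two strategies.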

@basnijholt
Owner

This is certainly a good idea and not too hard to implement (I think).

Let's see if we can make this a priority.
