Cordon at large scale #89

Open
dbenque opened this issue Sep 4, 2020 · 1 comment

Comments

@dbenque
Contributor

dbenque commented Sep 4, 2020

We hit a case where a large number of nodes (>75% of the cluster) received a condition that triggers draino. draino can delay/schedule the drain activity so as not to create an outage, but the cordon is always applied immediately.
This left us with too many nodes cordoned at the same time in the cluster, which made some operations (rollout of a huge application) impossible.

In our case, let's say that cluster capacity is limited to C nodes and/or that a given application cannot be allocated more than A nodes (quota). Imagine that a large percentage of the nodes are cordoned and some drains are scheduled by draino. At that moment an application is rolled out. If all the application's current nodes are cordoned, the system will try to provision (with the cluster autoscaler, for example) more nodes to host the application's new pods. Sometimes this provisioning is impossible because it would break either the C or the A limit.

In that case the system is blocked: the application cannot be rolled out. This is critical because we may not want to wait for all the ongoing drain activities to complete (which could take hours with hundreds of nodes) before being able to roll out.

Possible solutions (that can be combined):

A- limit the number of nodes that can be cordoned in the cluster simultaneously: draino would not cordon once this limit is reached, and would wait for a cordon slot to free up. New flag: --max-simultaneous-cordon, value format (int | int%), default -1 meaning no limit.

example:

--max-simultaneous-cordon=150
--max-simultaneous-cordon=10%
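
Rough sketch of what the check for A could look like (illustrative only, not actual draino code; the maxCordonReached helper and the flag parsing are assumptions, using client-go):

```go
package cordonlimit

import (
	"context"
	"fmt"
	"strconv"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// maxCordonReached reports whether the --max-simultaneous-cordon limit is
// already reached. The limit is either an absolute count ("150") or a
// percentage of the cluster size ("10%"); a negative value means no limit.
func maxCordonReached(ctx context.Context, client kubernetes.Interface, limit string) (bool, error) {
	nodes, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return false, err
	}
	cordoned := 0
	for _, n := range nodes.Items {
		if n.Spec.Unschedulable {
			cordoned++
		}
	}
	var max int
	if strings.HasSuffix(limit, "%") {
		pct, err := strconv.Atoi(strings.TrimSuffix(limit, "%"))
		if err != nil {
			return false, fmt.Errorf("invalid limit %q: %w", limit, err)
		}
		max = len(nodes.Items) * pct / 100
	} else {
		v, err := strconv.Atoi(limit)
		if err != nil {
			return false, fmt.Errorf("invalid limit %q: %w", limit, err)
		}
		if v < 0 {
			return false, nil // default -1: no limit
		}
		max = v
	}
	return cordoned >= max, nil
}
```

draino would call such a check right before cordoning and simply skip/requeue the node when it returns true.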

B- limit the number of nodes that can be cordoned simultaneously for a given set of label keys: draino would not cordon once this limit is reached, and would wait for a cordon slot to free up. New flag: --max-simultaneous-cordon-for-labels, value format: (int | int%),labelKey[,labelKey...]

--max-simultaneous-cordon-for-labels=3,app         
--max-simultaneous-cordon-for-labels=3,app,shard
--max-simultaneous-cordon-for-labels=10%,app,shard
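
Sketch of the per-label-group counting that B implies (again just an illustration; cordonedPerGroup is a hypothetical helper):

```go
package cordonlimit

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// cordonedPerGroup counts cordoned nodes for each distinct combination of
// values of the given label keys (e.g. ["app"] or ["app", "shard"]).
// Before cordoning a node, draino would look up the count for that node's
// own group and refuse to cordon once the configured limit is reached.
func cordonedPerGroup(nodes []corev1.Node, labelKeys []string) map[string]int {
	counts := map[string]int{}
	for _, n := range nodes {
		if !n.Spec.Unschedulable {
			continue
		}
		vals := make([]string, 0, len(labelKeys))
		for _, k := range labelKeys {
			vals = append(vals, n.Labels[k])
		}
		counts[strings.Join(vals, "|")]++
	}
	return counts
}
```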

C- same as B but using taint keys instead of label keys (an analogous flag keyed on taints).

D- at first, instead of cordoning, apply a PreferNoSchedule taint. Then cordon only just before the start of the drain activity (according to the current schedule). New flag: --use-preferred-no-schedule-taint
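
For D, the idea would be something like this (sketch only; the taint key "draino/drain-candidate" is made up for the example):

```go
package cordonlimit

import (
	corev1 "k8s.io/api/core/v1"
)

// drainCandidateTaint marks a node the scheduler should avoid if possible,
// without hard-blocking it the way spec.unschedulable does.
var drainCandidateTaint = corev1.Taint{
	Key:    "draino/drain-candidate",
	Effect: corev1.TaintEffectPreferNoSchedule,
}

// markDrainCandidate adds the PreferNoSchedule taint to the node object
// (the caller still has to persist the change through the API). The real
// cordon (spec.unschedulable=true) would only be set right before the
// drain actually starts, according to the current schedule.
func markDrainCandidate(node *corev1.Node) {
	for _, t := range node.Spec.Taints {
		if t.Key == drainCandidateTaint.Key && t.Effect == drainCandidateTaint.Effect {
			return // taint already present
		}
	}
	node.Spec.Taints = append(node.Spec.Taints, drainCandidateTaint)
}
```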

What do you think?

I can start by implementing A, which is simple and would already help protect the system.

@dbenque dbenque mentioned this issue Sep 7, 2020
@tamilhce

Can we get an update on this thread? Could you please provide a reason for not considering the cordon limiter? Please update the thread, and I'll be happy to take it up from there.
