Cordon at large scale #89

Open
dbenque opened this issue Sep 4, 2020 · 1 comment

Comments

@dbenque
Contributor

dbenque commented Sep 4, 2020

We hit a case where a large number of nodes (>75% of the cluster) received a condition that triggers draino. draino can delay/schedule the drain activity so as not to create an outage, but the cordon is always applied immediately.
This left us with too many nodes cordoned at the same time in the cluster, which made some operations (rollout of a huge application) impossible.

In our case, let's say that cluster capacity is limited to C nodes and/or that a given application cannot be allocated more than A nodes (quota). Imagine that a large percentage of the nodes are cordoned and some drains are scheduled by draino. At that moment an application is rolled out. If all the application's current nodes are cordoned, the system will try to provision (with the cluster autoscaler, for example) more nodes to host the application's new pods. Sometimes this provisioning is impossible because it would break either the C or the A limit.

In that case the system is blocked: the application cannot be rolled out. This is critical because we may not want to wait for all the ongoing drain activities to complete (which could take hours with hundreds of nodes) before being able to roll out.

Possible solutions (that can be combined):

A- limit the number of nodes that can be cordoned in the cluster simultaneously: draino would not cordon once this limit is reached, and would wait for a cordon slot to free up. New flag: --max-simultaneous-cordon, value format (int | int%), default -1 meaning no limit.

example:

--max-simultaneous-cordon=150
--max-simultaneous-cordon=10%
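
Rough sketch of what the check for A could look like (illustrative only, not actual draino code; the maxCordonReached helper and the flag parsing are assumptions, using client-go):

```go
package cordonlimit

import (
	"context"
	"fmt"
	"strconv"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// maxCordonReached reports whether the --max-simultaneous-cordon limit is
// already reached. The limit is either an absolute count ("150") or a
// percentage of the cluster size ("10%"); a negative value means no limit.
func maxCordonReached(ctx context.Context, client kubernetes.Interface, limit string) (bool, error) {
	nodes, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		return false, err
	}
	cordoned := 0
	for _, n := range nodes.Items {
		if n.Spec.Unschedulable {
			cordoned++
		}
	}
	var max int
	if strings.HasSuffix(limit, "%") {
		pct, err := strconv.Atoi(strings.TrimSuffix(limit, "%"))
		if err != nil {
			return false, fmt.Errorf("invalid limit %q: %w", limit, err)
		}
		max = len(nodes.Items) * pct / 100
	} else {
		v, err := strconv.Atoi(limit)
		if err != nil {
			return false, fmt.Errorf("invalid limit %q: %w", limit, err)
		}
		if v < 0 {
			return false, nil // default -1: no limit
		}
		max = v
	}
	return cordoned >= max, nil
}
```

draino would call such a check right before cordoning and simply skip/requeue the node when it returns true.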

B- limit the number of nodes that can be cordoned simultaneously for a given set of label keys: draino would not cordon once this limit is reached, and would wait for a cordon slot to free up. New flag: --max-simultaneous-cordon-for-labels, value format: (int | int%),labelKey[,labelKey...]

--max-simultaneous-cordon-for-labels=3,app         
--max-simultaneous-cordon-for-labels=3,app,shard
--max-simultaneous-cordon-for-labels=10%,app,shard
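
Sketch of the per-label-group counting that B implies (again just an illustration; cordonedPerGroup is a hypothetical helper):

```go
package cordonlimit

import (
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// cordonedPerGroup counts cordoned nodes for each distinct combination of
// values of the given label keys (e.g. ["app"] or ["app", "shard"]).
// Before cordoning a node, draino would look up the count for that node's
// own group and refuse to cordon once the configured limit is reached.
func cordonedPerGroup(nodes []corev1.Node, labelKeys []string) map[string]int {
	counts := map[string]int{}
	for _, n := range nodes {
		if !n.Spec.Unschedulable {
			continue
		}
		vals := make([]string, 0, len(labelKeys))
		for _, k := range labelKeys {
			vals = append(vals, n.Labels[k])
		}
		counts[strings.Join(vals, "|")]++
	}
	return counts
}
```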

C- same as B but using taint keys instead of label keys (an analogous flag keyed on taints).

D- at first, instead of cordoning, apply a PreferNoSchedule taint. Then cordon only just before the start of the drain activity (according to the current schedule). New flag: --use-preferred-no-schedule-taint
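
For D, the idea would be something like this (sketch only; the taint key "draino/drain-candidate" is made up for the example):

```go
package cordonlimit

import (
	corev1 "k8s.io/api/core/v1"
)

// drainCandidateTaint marks a node the scheduler should avoid if possible,
// without hard-blocking it the way spec.unschedulable does.
var drainCandidateTaint = corev1.Taint{
	Key:    "draino/drain-candidate",
	Effect: corev1.TaintEffectPreferNoSchedule,
}

// markDrainCandidate adds the PreferNoSchedule taint to the node object
// (the caller still has to persist the change through the API). The real
// cordon (spec.unschedulable=true) would only be set right before the
// drain actually starts, according to the current schedule.
func markDrainCandidate(node *corev1.Node) {
	for _, t := range node.Spec.Taints {
		if t.Key == drainCandidateTaint.Key && t.Effect == drainCandidateTaint.Effect {
			return // taint already present
		}
	}
	node.Spec.Taints = append(node.Spec.Taints, drainCandidateTaint)
}
```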

What do you think?

I can start by implementing A, which is simple and would already help protect the system.

@dbenque dbenque mentioned this issue Sep 7, 2020
@tamilhce

Can we get an update on this thread? Could you please provide a reason for not considering the cordon limiter? Please update the thread, and I'll be happy to take it up from there.
