Control plane should scale up serially, not in parallel #2016
Labels
area/control-plane
Issues or PRs related to control-plane lifecycle management
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
We recently implemented control plane scale up. If the desired number of control plane machines exceeds the actual number, and at least one control plane machine exists, the controller will create multiple machines in parallel. (Once created, each machine runs `kubeadm join --control-plane`.)
I think we should scale up control planes serially. Before creating an additional control plane machine, we should verify that every etcd member has started. We could also verify that the etcd cluster has quorum. If it does not have quorum, creating a new machine might be a waste of time and resources. On the other hand, even if it does have quorum, it might lose it after we create the machine.
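To make the proposal concrete, here is a rough sketch of the serial scale-up check described above. It is not the controller's actual code: the `ControlPlaneMachine` type, its `etcd_started` and `etcd_healthy` fields, and the `machines_to_create` function are all hypothetical names made up for illustration, and how the controller would actually learn etcd member status is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ControlPlaneMachine:
    # Hypothetical model: one control plane machine and the state of the
    # etcd member it hosts. These fields are assumptions for illustration.
    name: str
    etcd_started: bool  # the machine's etcd member has joined and started
    etcd_healthy: bool  # the member currently responds to health checks


def quorum_size(member_count: int) -> int:
    """Majority needed by an etcd cluster with member_count voting members."""
    return member_count // 2 + 1


def has_quorum(machines: List[ControlPlaneMachine]) -> bool:
    """True if a majority of the existing members are healthy."""
    healthy = sum(1 for m in machines if m.etcd_healthy)
    return healthy >= quorum_size(len(machines))


def machines_to_create(desired: int, machines: List[ControlPlaneMachine]) -> int:
    """Return 0 or 1: at most one new machine per reconcile pass."""
    if not machines:
        return 0  # the first machine is not a scale-up; handled elsewhere
    if len(machines) >= desired:
        return 0  # nothing to scale up
    if not all(m.etcd_started for m in machines):
        return 0  # wait until every existing etcd member has started
    if not has_quorum(machines):
        return 0  # a join would likely fail without quorum; do not waste a machine
    return 1  # safe to add exactly one machine, then re-evaluate
```

With this shape, a request to go from one to three machines takes at least two reconcile passes: the second machine is only created once the first member is up, and the third only once the second member has started. Note that `quorum_size(1)` is 1 but `quorum_size(2)` is 2, which is why a cluster that has quorum before a member is added can be without it until the new member starts.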
Today, etcd still recommends that the cluster be scaled up or down one member at a time. Moreover, there are known issues with running kubeadm join --control-plane in parallel.
In the future, we will likely be able to scale up in parallel by using etcd non-voting members (learners). Kubeadm is already exploring this idea.
/cc @detiber @randomvariable @chuckha