Install Descheduler, fix startup readywait #4363
Rebased on master, addressed all review comments.
@deitch updated PR description to add context.
Just some questions and possible suggestions.
I cannot say I understood all the changes and reviewed the PR properly. But I hope @deitch did =D So, approving.
Also left several comments for better understanding.
In general, I would say we can merge this PR as soon as @andrewd-zededa says he is satisfied with the comments and has addressed them.
Descheduler will be used for eve-app rebalancing during cluster node reboots/upgrades in an upcoming PR.
Wait for longhorn daemonsets to be ready, before upcoming PR to snapshot single-node /var/lib kube db.
Resolve an occasional failure to import external-boot-image.
Wait for containerd before importing.
Tighter error checking on import.

Signed-off-by: Andrew Durbin <andrewd@zededa.com>
@OhmSpectator I've tried to address all your review requests, thank you for reviewing.
These are a few changes to the cluster-init.sh install/boot path of HV=kubevirt eve, as a base for upcoming cluster work.
Descheduler will be used for eve-app rebalancing during cluster node reboots/upgrades in an upcoming PR. After a node has encountered an outage and recovered, the descheduler is used to evict pods whose current node does not match their preferred affinity node. The native Kubernetes scheduler is then allowed to run again and place each pod back where it has requested placement.
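To illustrate the eviction step, here is a minimal sketch of a descheduler policy that targets pods violating a preferred node affinity. It assumes the upstream descheduler `v1alpha2` policy format and its `RemovePodsViolatingNodeAffinity` plugin; the policy actually shipped by this PR is not shown in this conversation and may differ. The helper name `write_descheduler_policy`, the profile name, and the file path are hypothetical.

```shell
#!/bin/sh
# Hypothetical location for the policy file; not taken from the PR.
POLICY_FILE="${POLICY_FILE:-/tmp/descheduler-policy.yaml}"

# Write a policy that evicts pods running on a node that does not match
# their preferredDuringSchedulingIgnoredDuringExecution node affinity.
# (Assumed policy content; the PR's real policy may use other settings.)
write_descheduler_policy() {
    cat > "$1" <<'EOF'
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: eve-app-rebalance
    pluginConfig:
      - name: "RemovePodsViolatingNodeAffinity"
        args:
          nodeAffinityType:
            - "preferredDuringSchedulingIgnoredDuringExecution"
    plugins:
      deschedule:
        enabled:
          - "RemovePodsViolatingNodeAffinity"
EOF
}

write_descheduler_policy "$POLICY_FILE"
```

Once such a pod is evicted, the regular scheduler sees the preferred affinity again and can place the pod on its preferred node if that node is available.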
Longhorn daemonsets take some time to become ready (~5-10 minutes on some systems) after the initial install request with 'kubectl apply'. It is important to wait at install time and block all_components_initialized until all Longhorn daemonsets are ready. This is the foundation for an upcoming PR that snapshots the single-node /var/lib sqlite k3s db. This db snapshot is used to facilitate converting a cluster node back to a single-node system.
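The wait described above could be sketched roughly as follows. This is not the code from cluster-init.sh: the function names `daemonsets_ready` and `wait_for_daemonsets`, the `KUBECTL` override, and the timeout values are assumptions for illustration. It compares the `desiredNumberScheduled` and `numberReady` status fields of each daemonset in a namespace.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are not taken from cluster-init.sh.
KUBECTL="${KUBECTL:-kubectl}"

# Return 0 only if the namespace has at least one daemonset and every
# daemonset reports numberReady equal to desiredNumberScheduled.
daemonsets_ready() {
    ns="$1"
    out=$($KUBECTL get daemonsets -n "$ns" \
        -o jsonpath='{range .items[*]}{.metadata.name} {.status.desiredNumberScheduled} {.status.numberReady}{"\n"}{end}') || return 1
    [ -n "$out" ] || return 1
    echo "$out" | while read -r name desired ready; do
        [ -n "$name" ] || continue
        # A missing numberReady is treated as 0 ready pods.
        [ "${ready:-0}" -eq "$desired" ] || exit 1
    done
}

# Poll until all daemonsets are ready or the timeout (seconds) expires.
wait_for_daemonsets() {
    ns="$1"
    timeout="$2"
    interval="${3:-10}"
    waited=0
    until daemonsets_ready "$ns"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

A caller might run `wait_for_daemonsets longhorn-system 900 || exit 1` before marking all components initialized; the `longhorn-system` namespace and the 900-second budget are assumptions here, chosen to cover the ~5-10 minute startup mentioned above.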
Fix: Resolve a small window which led to a failure to import external-boot-image: