Install Descheduler, fix startup readywait #4363

@andrewd-zededa andrewd-zededa commented Oct 16, 2024

This PR makes a few changes to the cluster-init.sh install/boot path of HV=kubevirt EVE, as a base for upcoming cluster work.

Descheduler will be used for eve-app rebalancing during cluster node reboots/upgrades in an upcoming PR. After a node has encountered an outage and recovered, the descheduler evicts pods whose current node does not match their preferred affinity node. The native Kubernetes scheduler then runs again and places each pod back where it requested placement.
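To illustrate the eviction behavior described above, here is a minimal sketch of a descheduler policy that targets preferred node affinity. The contents of this PR's descheduler-policy-configmap.yaml are not shown in this conversation, so this is not the PR's policy: the function name `render_descheduler_policy` and the profile name `eve-rebalance` are hypothetical, and the `descheduler/v1alpha2` schema with the `RemovePodsViolatingNodeAffinity` plugin is assumed from a recent upstream descheduler release.

```shell
#!/bin/sh
# Hypothetical helper: print a descheduler policy that evicts pods whose
# current node no longer matches their *preferred* node affinity.
# Assumptions: descheduler/v1alpha2 policy schema; made-up profile name.
render_descheduler_policy() {
    cat <<'EOF'
apiVersion: "descheduler/v1alpha2"
kind: "DeschedulerPolicy"
profiles:
  - name: eve-rebalance
    pluginConfig:
      - name: "RemovePodsViolatingNodeAffinity"
        args:
          nodeAffinityType:
            - "preferredDuringSchedulingIgnoredDuringExecution"
    plugins:
      deschedule:
        enabled:
          - "RemovePodsViolatingNodeAffinity"
EOF
}
```

The descheduler only evicts; it is the regular scheduler that then re-places the evicted pod according to its affinity, which matches the two-step flow described in the PR text.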

Longhorn daemonsets take some time to become ready (~5-10 minutes on some systems) after the initial install request with 'kubectl apply'. It is important to wait at install time and block all_components_initialized until all longhorn daemonsets are ready. This is the foundation for an upcoming PR that snapshots the single-node k3s sqlite db under /var/lib. That db snapshot is used to convert a cluster node back to a single-node system.
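The readiness wait described above could be sketched roughly as follows. This is not the code from cluster-init.sh or longhorn-utils.sh, which is not shown in this conversation. The function names `all_daemonsets_ready` and `wait_for_daemonsets` and the `KUBECTL` override variable are hypothetical; `longhorn-system` in the usage note is assumed to be the namespace because it is Longhorn's default.

```shell
#!/bin/sh
# KUBECTL can be overridden, e.g. to point at a wrapper (assumption).
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: succeed only if the namespace has at least one
# DaemonSet and every one reports numberReady == desiredNumberScheduled.
all_daemonsets_ready() {
    ns="$1"
    out=$($KUBECTL -n "$ns" get daemonsets \
        -o jsonpath='{range .items[*]}{.metadata.name} {.status.desiredNumberScheduled} {.status.numberReady}{"\n"}{end}') || return 1
    # Nothing listed yet: the install has not created the daemonsets.
    [ -n "$out" ] || return 1
    # A line with missing status fields (NF != 3) counts as not ready.
    printf '%s\n' "$out" | awk 'NF != 3 || $2 != $3 { bad = 1 } END { exit bad + 0 }'
}

# Hypothetical helper: poll until ready or until roughly $2 seconds of
# sleep time have elapsed; $3 is the poll interval (default 10).
wait_for_daemonsets() {
    ns="$1"
    timeout="$2"
    interval="${3:-10}"
    elapsed=0
    until all_daemonsets_ready "$ns"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

A call such as `wait_for_daemonsets longhorn-system 900 10` would block for up to about 15 minutes, and a caller could mark all components initialized only when it returns 0.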

Fix: Close a small timing window that could cause the import of external-boot-image to fail:

  • Wait for containerd before importing.
  • Tighter error checking on import.
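The two bullets above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the function names `wait_for_containerd` and `import_image`, the `CTR` override variable, and the `k8s.io` containerd namespace are all assumptions, and the actual script may use a different socket, namespace, or retry policy.

```shell
#!/bin/sh
# CTR can be overridden, e.g. to add a socket address flag (assumption).
CTR="${CTR:-ctr}"

# Hypothetical helper: poll until containerd answers, at most $1 attempts,
# sleeping $2 seconds between attempts (default 2).
wait_for_containerd() {
    tries="$1"
    interval="${2:-2}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # `ctr version` queries the daemon, so it fails until containerd is up.
        if $CTR version >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Hypothetical helper: import an image archive, checking every step.
# Usage: import_image <archive> [tries] [interval]
import_image() {
    archive="$1"
    if [ ! -s "$archive" ]; then
        echo "image archive missing or empty: $archive" >&2
        return 1
    fi
    if ! wait_for_containerd "${2:-30}" "${3:-2}"; then
        echo "containerd did not become ready" >&2
        return 1
    fi
    if ! $CTR -n k8s.io images import "$archive"; then
        echo "image import failed: $archive" >&2
        return 1
    fi
    return 0
}
```

Returning a non-zero status at each step lets the caller retry on the next pass instead of silently continuing without the image.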

@andrewd-zededa

Rebased on master, addressed all review comments.

@andrewd-zededa

@deitch I updated the PR description to add context.


@deitch deitch left a comment

Just some questions and possible suggestions.

@andrewd-zededa andrewd-zededa force-pushed the andrewd-external-boot-image-import branch 8 times, most recently from d5098b7 to 87f6a27 Compare October 23, 2024 15:06

@OhmSpectator OhmSpectator left a comment

I cannot say I understood all the changes or reviewed the PR properly, but I hope @deitch did =D So, approving.
I also left several comments for better understanding.


@OhmSpectator OhmSpectator left a comment

In general, I would say we can merge this PR as soon as @andrewd-zededa says he is satisfied that the comments have been addressed.

Descheduler will be used for eve-app rebalancing during
cluster node reboots/upgrades in an upcoming PR.
Wait for longhorn daemonsets to be ready, before upcoming PR
to snapshot single-node /var/lib kube db.
Resolve intermittent failure to import external-boot-image:
	Wait for containerd before importing.
	Tighter error checking on import.

Signed-off-by: Andrew Durbin <andrewd@zededa.com>
@andrewd-zededa andrewd-zededa force-pushed the andrewd-external-boot-image-import branch from 87f6a27 to b81c066 Compare November 1, 2024 15:07
@andrewd-zededa

@OhmSpectator I've tried to address all your review requests. Thank you for reviewing.

@OhmSpectator OhmSpectator merged commit b487e9a into lf-edge:master Nov 1, 2024
38 checks passed