-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kube-up.sh containerd and Daemonset support #1366
kube-up.sh containerd and Daemonset support #1366
Conversation
* scaleout tp server enable daemonset * rp kubelet add tenant-server-kubeconfig information * remove daemonset from RP server
…a#1273) * add install-docker and install-containerd for ubuntu * change default runtime to contanerd * remove unused LOG_DUMP_SYSTEMD_SERVICES from config
ffb21a5
to
e35ca94
Compare
@@ -129,6 +129,8 @@ spec: | |||
- name: config-volume | |||
mountPath: /etc/coredns | |||
readOnly: true | |||
- name: tmp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what file does coredns use for here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the individual commit, i can assume this is for the logs.
this seem a bit hack especially this is for master code.
normally the pod app should be able to get its log out. we should leverage, or use the termination log if this for pod crash csaes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cherry-pick from community Kubernetes. kubernetes/kubernetes#82128
we do have this kind of error before this fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info.
This seems to be temp workaround for an issue in klog in k8s release 1.14 and 1.15 as the change only short-lived in those two releases only, in 1.16 k8s didn't take this change.
we probably should not take it instead check how eventually the fix was addressed.
is this change helping us? the root issue is the coredns cannot access the api server and hence the crash-restart, if the root issue is addressed, we won't need this fix -- i.e. we can afford even a short restart of the Dns service for NOW.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at least for now, it help us. once the root cause of cordons addressed, not sure what will happened once we remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we discussed, let's add some comment on this change for future update to coredns in arktos if resource is available.
|
||
# Override to latest versions of containerd and runc | ||
systemctl stop containerd | ||
if [[ ! -z "${UBUNTU_INSTALL_CONTAINERD_VERSION:-}" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i recall we have an internal version of containerd for mizar CNI. can you double check with Hongwei on this if we need the internal version ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks good. the way to get the coredns log seems not meeting the master bar. please investigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm.
please annotate the coredns workaround so we can get a chance to track it and fix.
For commit: “add workaround for klog issue causing CoreDNS to crash”, did the investigation below:
|
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: yb01, zmn223 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
What this PR does / why we need it:
part of merge from poc-2022-01-30 to master, including:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Test Scenarios:
Does this PR introduce a user-facing change?: