
[Scale-out] kube-up.sh failed to start Kube-flannel-ds with CrashLoopBackOff #1317

Open
sonyafenge opened this issue Jan 27, 2022 · 0 comments


What happened:
Started kube-up.sh with "SCALEOUT_CLUSTER=true"; the kube-flannel-ds pods failed with CrashLoopBackOff.

$ kubectl --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/cluster/kubeconfig-proxy get pods -A -o wide | grep flannel
kube-system   kube-flannel-ds-ck9nx                                   467230830100771799    0/1     CrashLoopBackOff    5          6m19s   10.40.0.4   sonyaperf2-012722-rp-1-minion-group-819s   <none>           <none>
kube-system   kube-flannel-ds-hdv2v                                   4889372435816231445   0/1     CrashLoopBackOff    6          12m     10.40.0.2   sonyaperf2-012722-tp-1-master              <none>           <none>
kube-system   kube-flannel-ds-thf24                                   6761946374700889854   0/1     CrashLoopBackOff    4          6m19s   10.40.0.3   sonyaperf2-012722-rp-1-master              <none>           <none>
kube-system   kube-flannel-ds-wcmz9                                   6355461902304453693   0/1     CrashLoopBackOff    5          6m19s   10.40.0.5   sonyaperf2-012722-rp-1-minion-group-97v8   <none>           <none>
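To narrow down the crash cause, the logs and events of one of the failing pods can be pulled through the same proxy kubeconfig (a minimal diagnostic sketch; the pod name is taken from the listing above, and the -c kube-flannel container name is an assumption based on the stock flannel DaemonSet manifest):

$ KUBECONFIG=/home/sonyali/go/src/k8s.io/arktos/cluster/kubeconfig-proxy
# Logs from the previous (crashed) container instance
$ kubectl --kubeconfig=$KUBECONFIG -n kube-system logs kube-flannel-ds-ck9nx -c kube-flannel --previous
# Events and last state (exit code, restart reason) for the pod
$ kubectl --kubeconfig=$KUBECONFIG -n kube-system describe pod kube-flannel-ds-ck9nx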

What you expected to happen:
The kube-flannel-ds pods should start successfully.
How to reproduce it (as minimally and precisely as possible):

$ unset KUBE_GCE_MASTER_PROJECT KUBE_GCE_NODE_PROJECT KUBE_GCI_VERSION  KUBE_GCE_MASTER_IMAGE  KUBE_GCE_NODE_IMAGE KUBE_CONTAINER_RUNTIME NETWORK_PROVIDER
$ export KUBEMARK_NUM_NODES=100 NUM_NODES=2 SCALEOUT_CLUSTER=true SCALEOUT_TP_COUNT=1 SCALEOUT_RP_COUNT=1 RUN_PREFIX=sonyaperf2-012722
$ export MASTER_DISK_SIZE=500GB MASTER_ROOT_DISK_SIZE=500GB KUBE_GCE_ZONE=us-west2-b MASTER_SIZE=n1-highmem-32 NODE_SIZE=n1-highmem-16 NODE_DISK_SIZE=500GB GOPATH=$HOME/go KUBE_GCE_ENABLE_IP_ALIASES=true KUBE_GCE_PRIVATE_CLUSTER=true CREATE_CUSTOM_NETWORK=true KUBE_GCE_INSTANCE_PREFIX=${RUN_PREFIX} KUBE_GCE_NETWORK=${RUN_PREFIX} ENABLE_KCM_LEADER_ELECT=false ENABLE_SCHEDULER_LEADER_ELECT=false ETCD_QUOTA_BACKEND_BYTES=8589934592 SHARE_PARTITIONSERVER=false LOGROTATE_FILES_MAX_COUNT=200 LOGROTATE_MAX_SIZE=200M KUBE_ENABLE_APISERVER_INSECURE_PORT=true KUBE_ENABLE_PROMETHEUS_DEBUG=true KUBE_ENABLE_PPROF_DEBUG=true TEST_CLUSTER_LOG_LEVEL=--v=2 HOLLOW_KUBELET_TEST_LOG_LEVEL=--v=2 GCE_REGION=us-west2-b
./cluster/kube-up.sh
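Once kube-up.sh completes, the DaemonSet and pod status can be re-checked through the proxy kubeconfig before digging into logs (a small verification sketch; the app=flannel label selector is an assumption from the upstream flannel manifest, and the kubeconfig path matches the listing above):

$ kubectl --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/cluster/kubeconfig-proxy -n kube-system get ds kube-flannel-ds
$ kubectl --kubeconfig=/home/sonyali/go/src/k8s.io/arktos/cluster/kubeconfig-proxy -n kube-system get pods -l app=flannel -o wide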

Anything else we need to know?:

Environment:

  • Arktos version (use kubectl version):
$ git log --oneline
adcf6ae442f (HEAD, upstream/poc-2022-01-30) In get or create service function, get needs to happen before create. (#1312)
7dd0e3ba549 Slight refactor and essential UT (#1308)
a5e701e8ad6 Grant necessary permission to mizar service controller (#1306)
33b7a581795 Start proxy in worker
ea1908021b2 Assign pod to default mizar network only when Mizar VPC/subnet annotation is not present in pod (#1296)
770f87c479b Remove json templates in constructing Openstack request (#1299)
c30f971d433 Bug fix: add pod creation event back to queue when network is not ready (#1295)
84db6887873 start nodeipam controller at RP of kube-up scale-out cluster (#1292)
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others: