on ubuntu 20.04, create VM pod failed while container pod created successfully, likely OS related ? #1324

Open
yb01 opened this issue Feb 1, 2022 · 3 comments

Collaborator

yb01 commented Feb 1, 2022

What happened:

root@ip-172-31-27-86:~/go/src/k8s.io/arktos# cat vmdefault.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: vmdefault
  namespace: kube-system
  tenant: system
  annotations:
    VirtletCPUModel: "host-model"
    VirtletSSHKeys: |
     ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCaJEcFDXEK2ZbX0ZLS1EIYFZRbDAcRfuVjpstSc0De8+sV1aiu+dePxdkuDRwqFtCyk6dEZkssjOkBXtri00MECLkir6FcH3kKOJtbJ6vy3uaJc9w1ERo+wyl6SkAh/+JTJkp7QRXj8oylW5E20LsbnA/dIwWzAF51PPwF7A7FtNg9DnwPqMkxFo1Th/buOMKbP5ZA1mmNNtmzbMpMfJATvVyiv3ccsSJKOiyQr6UG+j7sc/7jMVz5Xk34Vd0l8GwcB0334MchHckmqDB142h/NCWTr8oLakDNvkfC1YneAfAO41hDkUbxPtVBG5M/o7P4fxoqiHEX+ZLfRxDtHB53 me@localhost
spec:
  virtualMachine:
          #publicKey: "ssh-rsa AAA"
    keyPairName: "foobar"
    name: vm
    image: download.cirros-cloud.net/0.5.1/cirros-0.5.1-x86_64-disk.img
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        cpu: "1"
        memory: "1Gi"
      requests:
        cpu: "1"
        memory: "1Gi"
root@ip-172-31-27-86:~/go/src/k8s.io/arktos# cat netpod1-1.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: netpod1-1
  namespace: kube-system
  tenant: system
  labels:
    app: netpod
spec:
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 10
  containers:
  - name: netctr
    image: mizarnet/testpod
    ports:
    - containerPort: 9001
      protocol: TCP
    - containerPort: 5001
      protocol: UDP
    - containerPort: 7000
      protocol: TCP
root@ip-172-31-27-86:~/go/src/k8s.io/arktos# 
root@ip-172-31-27-86:~/go/src/k8s.io/arktos# 
root@ip-172-31-27-86:~/go/src/k8s.io/arktos# kubectl get pods -AT
TENANT   NAMESPACE     NAME                               HASHKEY               READY   STATUS             RESTARTS   AGE
system   default       mizar-daemon-64dzf                 2073677356856065344   1/1     Running            0          174m
system   default       mizar-operator-5c97f7478d-gcndc    331524117027475215    1/1     Running            0          174m
system   default       netpod1                            1403058461880678111   1/1     Running            0          105m
system   kube-system   coredns-default-59d8b85bdf-7n6n4   8408488878527592580   0/1     Running            42         174m
system   kube-system   kube-dns-554c5866fc-hmdtv          2483681518505561867   0/3     CrashLoopBackOff   110        174m
system   kube-system   netpod1-1                          8262144681471406475   1/1     Running            0          95m
system   kube-system   virtlet-kfzfm                      231680915987572178    3/3     Running            0          174m
system   kube-system   vmdefault                          3449575720634038347   0/1     Pending            0          4m7s
root@ip-172-31-27-86:~/go/src/k8s.io/arktos# 


============= kubelet log ===========================
I0201 00:15:17.418140  197541 kuberuntime_manager.go:948] computePodActions got {KillPod:true CreateSandbox:true SandboxID: Attempt:0 NextInitContainerToStart:nil ContainersToStart:[0] ContainersToKill:map[] ContainersToUpdate:map[] ContainersToRestart:[] Hotplugs:{NICsToAttach:[] NICsToDetach:[]}} for pod "vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)"
I0201 00:15:17.418446  197541 event.go:278] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-172-31-27-86", UID:"ip-172-31-27-86", APIVersion:"", ResourceVersion:"", FieldPath:"", Tenant:""}): type: 'Warning' reason: 'GetingClusterDNS' For verification - ClusterDNS IP : "10.0.0.141"
I0201 00:15:17.418477  197541 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"vmdefault", UID:"a2faac73-9ae1-42e1-bf16-f524ee91f1a7", APIVersion:"v1", ResourceVersion:"3398", FieldPath:"", Tenant:"system"}): type: 'Warning' reason: 'GettingClusterDNS' pod: "vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)". For verification - ClusterDNS IP : "10.0.0.141"
E0201 00:15:17.436475  197541 remote_runtime.go:107] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = Error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: server returned error: error getting fd: error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: reexec caused error: exit status 127
E0201 00:15:17.436520  197541 kuberuntime_sandbox.go:86] CreatePodSandbox for pod "vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)" failed: rpc error: code = Unknown desc = Error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: server returned error: error getting fd: error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: reexec caused error: exit status 127
E0201 00:15:17.436540  197541 kuberuntime_manager.go:1024] createPodSandbox for pod "vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)" failed: rpc error: code = Unknown desc = Error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: server returned error: error getting fd: error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: reexec caused error: exit status 127
E0201 00:15:17.436611  197541 pod_workers.go:196] Error syncing pod a2faac73-9ae1-42e1-bf16-f524ee91f1a7 ("vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)"), skipping: failed to "CreatePodSandbox" for "vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)" with CreatePodSandboxError: "CreatePodSandbox for pod \"vmdefault_kube-system_system(a2faac73-9ae1-42e1-bf16-f524ee91f1a7)\" failed: rpc error: code = Unknown desc = Error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: server returned error: error getting fd: error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: reexec caused error: exit status 127"
I0201 00:15:17.436773  197541 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"vmdefault", UID:"a2faac73-9ae1-42e1-bf16-f524ee91f1a7", APIVersion:"v1", ResourceVersion:"3398", FieldPath:"", Tenant:"system"}): type: 'Warning' reason: 'FailedCreatePodSandBox' Failed create pod sandbox: rpc error: code = Unknown desc = Error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: server returned error: error getting fd: error adding pod vmdefault (a2faac73-9ae1-42e1-bf16-f524ee91f1a7) to CNI network: reexec caused error: exit status 127

What you expected to happen:
The VM pod should be created successfully, just as the container pod was.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Arktos version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
yb01 changed the title from "create VM pod failed with Mizar CNI while container pod created successfully" to "create VM pod failed while container pod created successfully, likely OS related ?" on Feb 1, 2022
Collaborator Author

yb01 commented Feb 1, 2022

This is NOT Mizar-specific; it is likely due to a setting in the new environment I just set up for the demo.
I tried the same VM pod that works fine on my dev machine, and the very same VM YAML file fails the same way here even without Mizar as the CNI.
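
To confirm that two environments fail the same way, it can help to reduce the kubelet log to the distinct exit codes reported by the failed CNI reexec. A minimal sketch, assuming the kubelet log is available as plain text; the function name `sandbox_exit_codes` and the log path in the usage comment are hypothetical, not part of Arktos:

```shell
# sandbox_exit_codes: read kubelet log lines on stdin and print each distinct
# exit code reported as "reexec caused error: exit status N", one per line.
sandbox_exit_codes() {
  grep -o 'reexec caused error: exit status [0-9]*' \
    | awk '{print $NF}' \
    | sort -u
}

# Usage (the path is a placeholder; use wherever your kubelet writes its log):
#   sandbox_exit_codes < /path/to/kubelet.log
```

Run against the kubelet log lines quoted in this issue, it prints a single value, `127`.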

yb01 changed the title from "create VM pod failed while container pod created successfully, likely OS related ?" to "on ubuntu 20.04, create VM pod failed while container pod created successfully, likely OS related ?" on Feb 3, 2022
Collaborator Author

yb01 commented Feb 7, 2022

This is most likely due to an incompatible GNU C library (glibc), per the error in the log. The Arktos VM runtime is built on Ubuntu 18.04 with glibc 2.27, while Ubuntu 20.04 uses glibc 2.32.

Will try to rebuild the Arktos VM runtime with matching glibc libs.
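
One way to check this hypothesis before rebuilding is to compare the glibc version on the host with the newest `GLIBC_x.y` symbol version a binary asks for. The sketch below is illustrative only: the helper names (`version_le`, `host_glibc`, `required_glibc`, `check_binary`) and the binary path in the usage comment are hypothetical and not taken from Arktos, and it assumes GNU coreutils (`sort -V`), `getconf`, and `objdump` from binutils are installed. For context, the glibc dynamic loader exits with status 127 when it cannot load a program's shared libraries, which would be consistent with the `exit status 127` in the log, although nothing in this thread confirms that as the cause.

```shell
# version_le A B: succeed when version A <= version B (GNU sort -V ordering).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# host_glibc: print the host glibc version, e.g. "2.27".
host_glibc() {
  getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# required_glibc FILE: print the highest GLIBC_x.y symbol version FILE references.
required_glibc() {
  objdump -T "$1" 2>/dev/null \
    | grep -o 'GLIBC_[0-9][0-9.]*' \
    | sed 's/^GLIBC_//' \
    | sort -uV \
    | tail -n1
}

# check_binary FILE: report whether the host glibc is new enough for FILE.
check_binary() {
  local need have
  need="$(required_glibc "$1")"
  have="$(host_glibc)"
  if [ -z "$need" ]; then
    echo "no GLIBC version info found for $1"
    return 2
  fi
  if version_le "$need" "$have"; then
    echo "ok: $1 needs glibc >= $need, host has $have"
  else
    echo "mismatch: $1 needs glibc >= $need, host has $have"
    return 1
  fi
}

# Usage (placeholder path; point it at the VM runtime binary under test):
#   check_binary /path/to/vm-runtime-binary
```

This only covers versioned glibc symbols. If the runtime executes inside a container image, the same check would need to run against the libraries visible inside that image rather than on the host.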

Collaborator Author

yb01 commented Feb 16, 2022

post 130.

yb01 added and removed the rel-130 label on Feb 16, 2022