pod creation stuck in ContainerCreating state

6/7/2018

I have created a k8s cluster on RHEL7 using the Kubernetes packages with GitVersion:"v1.8.1". I'm trying to deploy WordPress on this custom cluster, but pod creation is always stuck in the ContainerCreating state.

[phani@k8s-master]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                                        READY     STATUS              RESTARTS   AGE
default       wordpress-766d75457d-zlvdn                                  0/1       ContainerCreating   0          11m
kube-system   etcd-k8s-master                                             1/1       Running             0          1h
kube-system   kube-apiserver-k8s-master                                   1/1       Running             0          1h
kube-system   kube-controller-manager-k8s-master                          1/1       Running             0          1h
kube-system   kube-dns-545bc4bfd4-bb8js                                   3/3       Running             0          1h
kube-system   kube-proxy-bf4zr                                            1/1       Running             0          1h
kube-system   kube-proxy-d7zvg                                            1/1       Running             0          34m
kube-system   kube-scheduler-k8s-master                                   1/1       Running             0          1h
kube-system   weave-net-92zf9                                             2/2       Running             0          34m
kube-system   weave-net-sh7qk                                             2/2       Running             0          1h

Docker version: 1.13.1

Pod events from the kubectl describe command:
      Normal   Scheduled               18m                default-scheduler                           Successfully assigned wordpress-766d75457d-zlvdn to worker1
      Normal   SuccessfulMountVolume   18m                kubelet, worker1                            MountVolume.SetUp succeeded for volume "default-token-tmpcm"
      Warning  DNSSearchForming        18m                kubelet, worker1                            Search Line limits were exceeded, some dns names have been omitted, the applied search line is: default.svc.cluster.local svc.cluster.local cluster.local 
      Warning  FailedCreatePodSandBox  14m                kubelet, worker1                            Failed create pod sandbox.
      Warning  FailedSync              25s (x8 over 14m)  kubelet, worker1                            Error syncing pod
      Normal   SandboxChanged          24s (x8 over 14m)  kubelet, worker1                            Pod sandbox changed, it will be killed and re-created.

In the kubelet log on the worker I observed the following error:

error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

However, the kubelet itself is stable and I see no other problems on the worker.

How do I solve this problem?

I checked for a CNI failure but couldn't find anything.

~]# ls /opt/cni/bin
bridge  cnitool  dhcp  flannel  host-local  ipvlan  loopback  macvlan  noop  ptp  tuning  weave-ipam  weave-net  weave-plugin-2.3.0

The messages below appear repeatedly in the journal logs. It seems the kubelet keeps retrying to create the container.

Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421184   14339 remote_runtime.go:115] StopPodSandbox "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304" from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421212   14339 kuberuntime_manager.go:780] Failed to stop sandbox {"docker" "47da29873230d830f0ee21adfdd3b06ed0c653a0001c29289fe78446d27d2304"}
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421247   14339 kuberuntime_manager.go:580] killPodWithSyncResult failed: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Jun 08 11:25:22 worker1 kubelet[14339]: E0608 11:25:22.421262   14339 pod_workers.go:182] Error syncing pod 7f1c6bf1-6af3-11e8-856b-fa163e3d1891 ("wordpress-766d75457d-spdrb_default(7f1c6bf1-6af3-11e8-856b-fa163e3d1891)"), skipping: failed to "KillPodSandbox" for "7f1c6bf1-6af3-11e8-856b-fa163e3d1891" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"
-- phanikumar ch
docker
kubernetes

3 Answers

6/8/2018

As Matthew said, it's most likely a CNI failure.

First, find the node this pod is running on:

kubectl get po wordpress-766d75457d-zlvdn -o wide 

Next, on the node where the pod is located, check /etc/cni/net.d. If it contains more than one .conf file, you can delete the extra one and restart the node.

source: https://github.com/kubernetes/kubeadm/issues/578.

Note that this is only one of the possible solutions.
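
A minimal sketch of that check, to be run on the worker node. The function names are made up for illustration, and counting .conflist files alongside .conf files is my assumption, not something taken from the linked issue:

```shell
#!/bin/sh
# Hypothetical helpers: count CNI config files in a directory.
# Default directory is /etc/cni/net.d, as mentioned above.
count_cni_confs() {
    dir="${1:-/etc/cni/net.d}"
    # One line per matching file; .conflist is included as an assumption.
    find "$dir" -maxdepth 1 -type f \( -name '*.conf' -o -name '*.conflist' \) 2>/dev/null \
        | wc -l | tr -d ' '
}

# Print a warning and return non-zero if more than one config file is found.
check_cni_confs() {
    dir="${1:-/etc/cni/net.d}"
    n=$(count_cni_confs "$dir")
    if [ "$n" -gt 1 ]; then
        echo "WARNING: $n CNI config files in $dir, the kubelet may pick the wrong one"
        return 1
    fi
    echo "OK: $n CNI config file(s) in $dir"
}

# Usage on the node:
#   check_cni_confs            # checks /etc/cni/net.d
#   check_cni_confs /some/dir  # checks another directory
```

If it warns, look at the files by hand before deleting anything; which one is the leftover depends on which network plugin you actually installed.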

-- elia
Source: StackOverflow

6/8/2018

Failed create pod sandbox.

... is almost always a CNI failure. I would check on the node that all the weave containers are happy, and that /opt/cni/bin is present (or its weave equivalent).

You may have to check both journalctl -u kubelet.service and the docker logs of any containers running on the node to discover the full scope of the error.
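
Something along these lines can cut the kubelet journal down to the relevant lines. It is only a sketch: the function name is made up, and the pattern assumes the glog-style prefix seen in the question's log (an "E" followed by the month and day for error-level lines):

```shell
#!/bin/sh
# Hypothetical filter: read kubelet journal lines on stdin and keep only
# error-level lines that mention the pod sandbox or CNI.
filter_sandbox_errors() {
    grep -E 'kubelet\[[0-9]+\]: E[0-9]{4} ' | grep -iE 'sandbox|cni'
}

# Usage on the node (assumes the kubelet runs as the systemd unit kubelet.service):
#   journalctl -u kubelet.service --no-pager --since "15 min ago" | filter_sandbox_errors
#
# For the weave containers, list them and then read their logs one by one:
#   docker ps -a --filter name=weave --format '{{.ID}} {{.Names}}'
#   docker logs <container-id>
```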

-- mdaniel
Source: StackOverflow

6/11/2018

It seems to work after removing $KUBELET_NETWORK_ARGS from /etc/systemd/system/kubelet.service.d/10-kubeadm.conf.

I removed $KUBELET_NETWORK_ARGS and restarted the worker node; after that the pods were deployed successfully.
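
For anyone who wants to script the same change, here is a rough sketch. The function name is made up, and it assumes GNU sed (as on RHEL7) and that the variable is referenced on the ExecStart= line of the drop-in. In kubeadm drop-ins of that era this variable typically carries the kubelet's CNI flags, so removing it probably means the node stops using the CNI plugin; treat it as a workaround rather than a proper fix.

```shell
#!/bin/sh
# Hypothetical helper: remove the $KUBELET_NETWORK_ARGS reference from the
# ExecStart line of a kubelet systemd drop-in, keeping a .bak copy.
strip_network_args() {
    conf="$1"
    cp "$conf" "$conf.bak"
    # Only ExecStart= lines are touched; any Environment= definition of the
    # variable is left in place so the change is easy to revert.
    sed -i '/^ExecStart=/ s/ *\$KUBELET_NETWORK_ARGS//' "$conf"
}

# Usage on the worker node (as root):
#   strip_network_args /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
#   systemctl daemon-reload
#   systemctl restart kubelet
# Reloading and restarting the kubelet is my assumption; I restarted the whole node.
```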

-- phanikumar ch
Source: StackOverflow