k8s DaemonSet: rejected pods are re-created too fast to get any log or event

6/6/2018

I am having the exact same issue as stated here: https://github.com/kubernetes/kubernetes/issues/51884

The calico-node pod is created and then immediately deleted again:

Jun 06 04:38:28 [..] I0606 08:38:28.170339       1 event.go:218] Event(v1.ObjectReference{Kind:"DaemonSet", Namespace:"kube-system", Name:"calico-node", UID:"10999117-6962-11e8-a147-0800278037c3", APIVersion:"extensions", ResourceVersion:"29749", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: calico-node-rm2gl
Jun 06 04:38:28 [..] I0606 08:38:28.732348       1 event.go:218] Event(v1.ObjectReference{Kind:"DaemonSet", Namespace:"kube-system", Name:"calico-node", UID:"10999117-6962-11e8-a147-0800278037c3", APIVersion:"extensions", ResourceVersion:"29749", FieldPath:""}): type: 'Warning' reason: 'FailedDaemonPod' Found failed daemon pod kube-system/calico-node-rm2gl on node localhost.localdomain, will try to kill it
Jun 06 04:38:28 [..] E0606 08:38:28.755021       1 daemon_controller.go:265] kube-system/calico-node failed with : deleted 1 failed pods of DaemonSet kube-system/calico-node
Jun 06 04:38:28 [..] I0606 08:38:28.757246       1 event.go:218] Event(v1.ObjectReference{Kind:"DaemonSet", Namespace:"kube-system", Name:"calico-node", UID:"10999117-6962-11e8-a147-0800278037c3", APIVersion:"extensions", ResourceVersion:"29749", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: calico-node-rm2gl

How can I troubleshoot this? journalctl does not show why the pod fails, and the pod is deleted before I can inspect it or read any of its logs.

-- user1945826
kubernetes
project-calico

1 Answer

6/7/2018

One solution I found: get the YAML of the pod, modify it, and deploy the modified copy yourself. You need to be quick to retrieve it, because the pods are deleted almost immediately. That is why I use journalctl to get the pod's actual name:

kubectl -n kube-system get pod $(sudo journalctl -e | grep 'Created pod: calico-node-' | tail -n 1 | awk '{print $NF}') -o yaml > calico_pod.yaml

You could also pass the pod name directly:

`kubectl -n kube-system get pod calico-node-<unique-hash> -o yaml > calico_pod.yaml`

But you need to be very fast!
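Because the window is well under a second in the log above, it may be easier to let a loop do the waiting. The following is only a sketch: the function name `capture_pod_yaml`, its defaults, and the 0.2 s poll interval are my own assumptions, not part of kubectl. It uses nothing beyond `kubectl get pods -o name` and `kubectl get pod <name> -o yaml`.

```shell
# Sketch: poll for a pod whose name starts with a given prefix and save its
# YAML as soon as one appears. Name and defaults are illustrative only.
capture_pod_yaml() {
  ns="${1:-kube-system}"          # namespace to watch
  prefix="${2:-calico-node-}"     # pod name prefix to look for
  out="${3:-calico_pod.yaml}"     # file the manifest is written to
  tries="${4:-300}"               # give up after this many polls

  i=0
  while [ "$i" -lt "$tries" ]; do
    # "-o name" prints one "<resource>/<name>" per line; strip the part
    # before the slash and keep the first name that matches the prefix.
    name=$(kubectl -n "$ns" get pods -o name 2>/dev/null \
             | sed 's|^.*/||' | grep "^$prefix" | head -n 1)
    # Write to a temp file first so a failed fetch never clobbers an
    # earlier, successful capture.
    if [ -n "$name" ] &&
       kubectl -n "$ns" get pod "$name" -o yaml > "$out.tmp" 2>/dev/null &&
       [ -s "$out.tmp" ]; then
      mv "$out.tmp" "$out"
      echo "saved $name to $out"
      return 0
    fi
    i=$((i + 1))
    sleep 0.2   # fractional sleep assumes GNU coreutils or similar
  done
  rm -f "$out.tmp"
  echo "no pod matching $prefix seen after $tries polls" >&2
  return 1
}
```

Start it with `capture_pod_yaml kube-system calico-node- calico_pod.yaml` before the DaemonSet controller makes its next attempt. The pod can still disappear between the two kubectl calls, in which case the loop simply tries again.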

Then you can make small changes to calico_pod.yaml, re-create the pod from it, and figure out what the issue is:

kubectl create -f calico_pod.yaml

In my case, the issue was that a CPU request was defined:

resources:
  requests:
    cpu: 250m

And the node had no free CPU left to satisfy that request.
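The saved YAML may already name the cause. As far as I know, a pod that the kubelet rejects carries the reason in its `status` block (for a CPU shortage I would expect something like `reason: OutOfcpu`), but treat that as an assumption and check your own file. A small sketch, with `show_pod_status` being a made-up helper name:

```shell
# Sketch: print the phase, reason and message fields from the top-level
# "status:" block of a pod manifest saved with "kubectl get pod -o yaml".
show_pod_status() {
  file="${1:-calico_pod.yaml}"
  # Take everything from the top-level "status:" key to the end of the file,
  # then keep only the three fields that sit directly under it.
  sed -n '/^status:/,$p' "$file" | grep -E '^  (phase|reason|message):'
}
```

Run `show_pod_status calico_pod.yaml`. If a long message is wrapped across several lines in the YAML, only its first line is printed. To compare the request against what the node has already handed out, `kubectl describe node <node-name>` includes an "Allocated resources" section.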

-- user1945826
Source: StackOverflow