kubelet logs flooding even after pods deleted

1/15/2019
Kubernetes version : v1.6.7
Network plugin : weave

I recently noticed that my entire 3-node cluster had gone down. Initial troubleshooting revealed that /var was 100% full on all nodes.

Digging further showed that the logs were being flooded by kubelet with errors like these:

Jan 15 19:09:43 test-master kubelet[1220]: E0115 19:09:43.636001    1220 kuberuntime_gc.go:138] Failed to stop sandbox "fea8c54ca834a339e8fd476e1cfba44ae47188bbbbb7140e550d055a63487211" before removing: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "<TROUBLING_POD>-1545236220-ds0v1_kube-system" network: CNI failed to retrieve network namespace path: Error: No such container: fea8c54ca834a339e8fd476e1cfba44ae47188bbbbb7140e550d055a63487211
Jan 15 19:09:43 test-master kubelet[1220]: E0115 19:09:43.637690    1220 docker_sandbox.go:205] Failed to stop sandbox "fea94c9f46923806c177e4a158ffe3494fe17638198f30498a024c3e8237f648": Error response from daemon: {"message":"No such container: fea94c9f46923806c177e4a158ffe3494fe17638198f30498a024c3e8237f648"}
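To gauge whether the flood comes from a handful of stale sandboxes or from many, the sandbox IDs can be pulled out of the log and counted. This is only a minimal sketch in Python, assuming the messages keep the `Failed to stop sandbox "<id>"` form shown above; `count_sandbox_ids` is a hypothetical helper name, not part of any Kubernetes tooling.

```python
import re
from collections import Counter

# Matches the hex sandbox ID in kubelet's 'Failed to stop sandbox "<id>"'
# messages. The message format is assumed from the log lines quoted above.
SANDBOX_RE = re.compile(r'Failed to stop sandbox "([0-9a-f]+)"')


def count_sandbox_ids(lines):
    """Return a Counter mapping sandbox ID -> number of log lines mentioning it."""
    counts = Counter()
    for line in lines:
        match = SANDBOX_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The function accepts any iterable of log lines, for example an open log file. A small number of distinct IDs repeated many times would suggest that kubelet is retrying the same leftover sandboxes rather than new pods being created.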

The <TROUBLING_POD>-1545236220-ds0v1 pod was created by a cronjob. Because of some misconfigurations, those pods kept failing and more pods kept being spun up.

So I deleted all the jobs and their related pods. The cluster now has no jobs or pods related to my cronjob, yet the same ERROR messages are still flooding the logs.

I tried the following:

1) Restarted docker and kubelet on all nodes.

2) Restarted the entire control plane.

3) Rebooted all nodes.

But the logs are still being flooded with the same error messages, even though no such pods are being spun up anymore.

So I don't know how to stop kubelet from throwing these errors.

Is there a way to reset the network plugin I am using? Or is there something else I can do?

--
docker
kubernetes
weave

1 Answer

1/16/2019

Check whether the pod directory still exists under /var/lib/kubelet.
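One way to do that check is to compare the per-pod directories on the node against the pod UIDs the API server still knows about. The sketch below is in Python and assumes kubelet's default layout, where each pod gets a directory named after its UID under /var/lib/kubelet/pods; `find_leftover_pod_dirs` is a hypothetical helper name, and it only reports directories, it does not delete anything.

```python
import os


def find_leftover_pod_dirs(pods_root, live_uids):
    """Return the names of directories under pods_root that are not in live_uids.

    pods_root is assumed to be kubelet's pods directory
    (/var/lib/kubelet/pods by default); live_uids is an iterable of
    pod UIDs that are still known to the cluster.
    """
    live = set(live_uids)
    leftovers = []
    for name in sorted(os.listdir(pods_root)):
        # Only directories count; stray files are ignored.
        if os.path.isdir(os.path.join(pods_root, name)) and name not in live:
            leftovers.append(name)
    return leftovers
```

The list of live UIDs could come from something like `kubectl get pods --all-namespaces -o jsonpath='{.items[*].metadata.uid}'`. Any directory the function returns belongs to a pod that no longer exists in the API and is worth inspecting by hand.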

You're on a very old version of Kubernetes; upgrading will fix this issue.

-- jaxxstorm
Source: StackOverflow