Kubernetes pod / container in a deployment unexpectedly starts to error out

4/17/2018

One of the pods in my deployment unexpectedly started giving errors after a very minor change. Running "kubectl describe" on the failed pod gives the following error:

Warning  Failed   14s               kubelet, ip-10-166-30-232.ec2.internal  Error: failed to start container "": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:295: setting oom score for ready process caused \"write /proc/11890/oom_score_adj: invalid argument\""
Warning  BackOff  9s (x2 over 13s)  kubelet, ip-10-166-30-232.ec2.internal  Back-off restarting failed container
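For reference, this is the kind of command that produces the events above (pod and namespace names here are placeholders, not from my cluster):

    # List pods to find the failing one (placeholder namespace)
    kubectl get pods -n my-namespace

    # Show status and recent events for the failing pod (placeholder name)
    kubectl describe pod my-failing-pod -n my-namespace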

-- Bryji
containers
docker
kubernetes
kubernetes-helm

1 Answer

4/17/2018

A bit of googling turned up the following result: https://bugzilla.redhat.com/show_bug.cgi?id=1460097 - in short, binary data in a container's environment variables can cause Docker to fail with this error.
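One way to check whether a secret is the source of such binary data (a sketch; the secret name, namespace and key are placeholders, not from my setup) is to decode the suspect value and inspect what comes out:

    # Decode a suspect secret value and check whether it is printable text
    # (secret name, namespace and key are placeholders)
    kubectl get secret my-secret -n my-namespace -o jsonpath='{.data.MY_KEY}' | base64 --decode | file -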

My problem was that I had added a secret to the Kubernetes namespace but had forgotten that secret values need to be base64 encoded. When the secret was decoded in the pod environment, it therefore came out as binary data that Docker didn't like.
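For illustration, a sketch of how the secret should have looked (the names and values are placeholders, not from my chart): values under data must be base64 encoded, while stringData accepts plain text and is encoded by the API server:

    # Placeholder Secret manifest - values under 'data' must be base64 encoded
    apiVersion: v1
    kind: Secret
    metadata:
      name: my-secret
      namespace: my-namespace
    type: Opaque
    data:
      API_TOKEN: c3VwZXItc2VjcmV0    # base64 of "super-secret"
    # Alternatively, 'stringData' takes plain text and is encoded server-side:
    # stringData:
    #   API_TOKEN: super-secret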

A second prong to this was that, although I tried to undo the reference to the secret by removing it from the Helm chart describing the deployment, the reference was not actually removed from the target Deployment. There seems to be a 'merge' strategy in which items added from your Helm source are kept but never removed. I therefore had to delete the reference to the secret manually using kubectl (https://github.com/kubernetes/helm/issues/1966).
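A sketch of that manual cleanup (the deployment name, namespace, and the container/env indices are placeholders): either edit the live Deployment interactively, or remove the stale entry with a JSON patch:

    # Open the live Deployment and delete the stale env/envFrom entry by hand
    kubectl edit deployment my-deployment -n my-namespace

    # Or remove it non-interactively; the container and env indices (0 here) are placeholders
    kubectl patch deployment my-deployment -n my-namespace --type=json \
      -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/env/0"}]'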

-- Bryji
Source: StackOverflow