pod deployment fails with no clear message in logs

8/9/2019

I have a Kubernetes cluster deployed locally to a node prepped by kubeadm. I am experimenting with one of the pods. This pod fails to deploy, but I can't locate the cause. I have guesses as to what the problem is, but I'd like to see something related in the Kubernetes logs.

Here's what I have tried:

$ kubectl logs nmnode-0-0 -c hadoop -n test
Error from server (NotFound): pods "nmnode-0-0" not found
$ kubectl get event -n test | grep nmnode
(empty results here)
$ journalctl -m | grep nmnode

and I get a bunch of repeated entries like the following. They mention killing the pod but give no reason whatsoever for it:

Aug 08 23:10:15 jeff-u16-3 kubelet[146562]: E0808 23:10:15.901051  146562 event.go:240] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"nmnode-0-0.15b92c3ff860aed6", GenerateName:"", Namespace:"test", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"test", Name:"nmnode-0-0", UID:"743d2876-69cf-43bc-9227-aca603590147", APIVersion:"v1", ResourceVersion:"38152", FieldPath:"spec.containers{hadoop}"}, Reason:"Killing", Message:"Stopping container hadoop", Source:v1.EventSource{Component:"kubelet", Host:"jeff-u16-3"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "nmnode-0-0.15b92c3ff860aed6" is forbidden: unable to create new content in namespace test because it is being terminated' (will not retry!)

The shortened version of the above message is this:

Reason:"Killing", Message:"Stopping container hadoop",

The cluster is still running. Do you know how I can get to the bottom of this?

-- Jeff Saremi
kubernetes

2 Answers

8/9/2019

Try this command to get some hints:

kubectl describe pod nmnode-0-0 -n test

Share the output of:

kubectl get po -n test
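
If describe also reports the pod as not found, you could still list the recent events for the whole namespace, sorted by time, to look for hints (just a supplementary suggestion, it may not show more than your grep did):

kubectl get events -n test --sort-by=.metadata.creationTimestamp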
-- P Ekambaram
Source: StackOverflow

8/9/2019

Try executing the command below:

$ kubectl get pods --all-namespaces

Check whether your pod was created in a different namespace.
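
For example (using the pod name from your question), you can filter the list across all namespaces:

$ kubectl get pods --all-namespaces | grep nmnode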

The most common reasons for pod failures:

1. The container was never created because it failed to pull the image.

2. The container never existed in the runtime, and the error reason is not in the "special error list", so the containerStatus was never set and was kept as "no state".

3. The container was then treated as "Unknown" and the pod was reported as Pending without any reason. The containerStatus stayed "no state" after each syncPod(), so the status manager could never delete the pod even though the DeletionTimestamp was set (a quick way to inspect this is sketched below).
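
If any container status was recorded at all, a quick way to look at it directly (assuming the pod object still exists under that name) is:

$ kubectl get pod nmnode-0-0 -n test -o jsonpath='{.status.containerStatuses[*].state}'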

Useful article: pod-failure.

-- MaggieO
Source: StackOverflow