I have a Kubernetes cluster deployed locally to a node prepped by kubeadm. I am experimenting with one of the pods. The pod fails to deploy, but I can't locate the cause of the failure. I have guesses as to what the problem is, but I'd like to see something related in the Kubernetes logs.
Here's what I have tried:
$ kubectl logs nmnode-0-0 -c hadoop -n test
Error from server (NotFound): pods "nmnode-0-0" not found
$ kubectl get event -n test | grep nmnode
(empty results here)
$ journalctl -m | grep nmnode
and I get a bunch of repeated entries like the following. It talks about killing the pod, but it gives no reason whatsoever for doing so:
Aug 08 23:10:15 jeff-u16-3 kubelet[146562]: E0808 23:10:15.901051 146562 event.go:240] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"nmnode-0-0.15b92c3ff860aed6", GenerateName:"", Namespace:"test", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"test", Name:"nmnode-0-0", UID:"743d2876-69cf-43bc-9227-aca603590147", APIVersion:"v1", ResourceVersion:"38152", FieldPath:"spec.containers{hadoop}"}, Reason:"Killing", Message:"Stopping container hadoop", Source:v1.EventSource{Component:"kubelet", Host:"jeff-u16-3"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf4b616dacae12d6, ext:2812562895486, loc:(*time.Location)(0x781e740)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "nmnode-0-0.15b92c3ff860aed6" is forbidden: unable to create new content in namespace test because it is being terminated' (will not retry!)
The shortened version of the above message is this:
Reason:"Killing", Message:"Stopping container hadoop",
The cluster is still running. Do you know how I can get to the bottom of this?
Try this command to get some hints:
kubectl describe pod nmnode-0-0 -n test
and share the output from:
kubectl get po -n test
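If the pod shows up there, the Events section at the bottom of the describe output and each container's state/last state are usually where the reason for a kill appears. A minimal sketch for pulling just those fields, assuming the pod and namespace names from your question:

# Events recorded for the pod (same data kubectl describe prints at the bottom)
$ kubectl describe pod nmnode-0-0 -n test | grep -A 20 Events

# Current and last state of every container in the pod
$ kubectl get pod nmnode-0-0 -n test -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.state}{" last: "}{.lastState}{"\n"}{end}'

# Wider listing for the namespace, including node and restart counts
$ kubectl get po -n test -o wide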
Try to execute the command below:
$ kubectl get pods --all-namespaces
Check whether your pod was created in a different namespace.
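If the listing is long, you can filter it by the pod name directly (a sketch using the name from your question):

$ kubectl get pods --all-namespaces --field-selector metadata.name=nmnode-0-0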
The most common reasons for pod failures:
1. The container was never created because it failed to pull the image.
2. The container never existed in the runtime, and the error reason is not in the "special error list", so the containerStatus was never set and remained "no state".
3. The container was then treated as "Unknown" and the pod was reported as Pending without any reason. Because the containerStatus was still "no state" after each syncPod(), the status manager could never delete the pod even though the DeletionTimestamp was set.
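To see which of these cases applies, you can inspect the raw container status and the deletion timestamp directly. A minimal sketch, using the pod and namespace from your question:

# A waiting/terminated reason here (e.g. ImagePullBackOff) points to case 1 or 2; empty output means "no state"
$ kubectl get pod nmnode-0-0 -n test -o jsonpath='{.status.containerStatuses}{"\n"}'

# A non-empty value here means the pod is being torn down
$ kubectl get pod nmnode-0-0 -n test -o jsonpath='{.metadata.deletionTimestamp}{"\n"}'

# The kubelet event in your log says namespace "test" is being terminated; check its phase
$ kubectl get namespace test -o jsonpath='{.status.phase}{"\n"}'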
Useful article: pod-failure.