When scheduling a Kubernetes Job and Pod, if the Pod can't be placed, the explanation available from kubectl describe pods PODNAME looks like:
Warning FailedScheduling <unknown> default-scheduler 0/172 nodes are available:
1 Insufficient pods, 1 node(s) were unschedulable, 11 Insufficient memory,
30 Insufficient cpu, 32 node(s) didn't match node selector, 97 Insufficient nvidia.com/gpu.
That's useful but a little too vague. I'd like more detail than that.
Specifically, can I list all nodes together with the reason why the Pod wasn't scheduled to each particular node?
I was recently changing labels and the node selector, and I want to determine whether I made a mistake somewhere in that process or whether the nodes I need really are just busy.
You can find more details related to problems with scheduling a particular Pod in the kube-scheduler logs. If you set up your cluster with the kubeadm tool, kube-scheduler, like the other key components of the cluster, is deployed as a system Pod. You can list such Pods with the following command:
kubectl get pods -n kube-system
which will show you, among others, your kube-scheduler Pod:
NAME READY STATUS RESTARTS AGE
kube-scheduler-master-ubuntu-18-04 1/1 Running 0 2m37s
Then you can check its logs. In my example the command will look as follows:
kubectl logs kube-scheduler-master-ubuntu-18-04 -n kube-system
You should find there the information you need.
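If the log is long, it may help to filter it for entries mentioning the affected Pod. A minimal sketch (the Pod name pi-zxp4p is just taken from the example below):
kubectl logs kube-scheduler-master-ubuntu-18-04 -n kube-system | grep pi-zxp4p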
One more thing...
If you've already verified it, just ignore this tip.
Let's start from the beginning...
I've just created a simple job from the example you can find here:
kubectl apply -f https://k8s.io/examples/controllers/job.yaml
job.batch/pi created
If I run:
kubectl get jobs
it shows me:
NAME COMPLETIONS DURATION AGE
pi 0/1 17m 17m
Hmm... completions 0/1? Something definitely went wrong. Let's check it.
kubectl describe job pi
tells me basically nothing. In its events I can see only:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 18m job-controller Created pod: pi-zxp4p
as if everything went well... but we already know it didn't. So let's investigate further. As you probably know, the job-controller creates Pods that run to completion to perform a certain task. From the perspective of the job-controller everything went well (we've just seen it in its events):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 23m job-controller Created pod: pi-zxp4p
It did its part of the task and reported that everything went fine. But that's just part of the whole task. It passed the actual Pod scheduling task on to kube-scheduler: being just a job-controller, it isn't responsible for (and doesn't even have enough privileges for) scheduling the actual Pod on a particular node. If we run:
kubectl get pods
we can see one Pod in a Pending state:
NAME READY STATUS RESTARTS AGE
pi-zxp4p 0/1 Pending 0 30m
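While the Pod remains Pending, no node has been bound to it yet. We can confirm that with a simple jsonpath query (just one way to check it):
kubectl get pod pi-zxp4p -o jsonpath='{.spec.nodeName}'
Empty output means the scheduler hasn't assigned the Pod to any node.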
Let's describe it:
kubectl describe pod pi-zxp4p
In its events we can see some very important and specific information:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 20s (x24 over 33m) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
so now we know the actual reason why our Pod couldn't be scheduled.
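In this case the natural next step is to compare the taints on our node(s) with the Pod's tolerations. A minimal sketch (the node name master-ubuntu-18-04 is assumed from the scheduler Pod name above):
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl describe node master-ubuntu-18-04 | grep Taints
On a fresh single-node kubeadm cluster you would typically see the standard node-role.kubernetes.io/master:NoSchedule taint, which ordinary Pods don't tolerate.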
Pay attention to the different fields of the event:
From: default-scheduler - it means that the message originated from our kube-scheduler.
Type: Warning, which isn't as important as Critical or Error, so chances are it may not appear in the kube-scheduler logs if the scheduler was started with the default level of log verbosity.
You can read here that:
As per the comments, the practical default level is V(2). Developers and QE environments may wish to run at V(3) or V(4). If you wish to change the log level, you can pass in -v=X where X is the desired maximum level to log.
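So if you want more verbose scheduler logs on a kubeadm cluster, you can edit the kube-scheduler static Pod manifest on the control-plane node; the kubelet will then recreate the Pod automatically. A rough sketch (the path is the kubeadm default, the verbosity value is just an example):
sudo vi /etc/kubernetes/manifests/kube-scheduler.yaml
and add the desired level to the container's command list, e.g.:
- --v=4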