I have a Kubernetes cluster on GKE that's managed by Helm, and recently deploys started failing because pods would not leave the Pending state. Inspecting the pending pods, I see:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 50m default-scheduler Successfully assigned default/my-pod-6cbfb94cb-4pzl9 to gke-staging-pool-43f2e11c-nzjz
Warning FailedValidation 50m (x6 over 50m) kubelet, gke-staging-pool-43f2e11c-nzjz Error validating pod my-pod-6cbfb94cb-4pzl9_default(8e4dab93-75a7-11e9-80e1-42010a960181) from api, ignoring: spec.priority: Forbidden: Pod priority is disabled by feature-gate
And specifically, this warning seems relevant: Error validating pod my-pod-6cbfb94cb-4pzl9_default(8e4dab93-75a7-11e9-80e1-42010a960181) from api, ignoring: spec.priority: Forbidden: Pod priority is disabled by feature-gate
Comparing the new Pending pods with currently running pods, the only difference I see (apart from timestamps, etc.) is:
$ kubectl get pod my-pod-6cbfb94cb-4pzl9 -o yaml > /tmp/pending-pod.yaml
$ kubectl get pod my-pod-7958cc964-64wsd -o yaml > /tmp/running-pod.yaml
$ diff /tmp/running-pod.yaml /tmp/pending-pod.yaml
…
@@ -58,7 +58,8 @@
name: default-token-wnhwl
readOnly: true
dnsPolicy: ClusterFirst
- nodeName: gke-staging-pool-43f2e11c-r4f9
+ nodeName: gke-staging-pool-43f2e11c-nzjz
+ priority: 0 # <-- notice that the `priority: 0` field has been added
restartPolicy: Always
schedulerName: default-scheduler
…
It appears that this started happening sometime between May 1st, 2019 and May 6th, 2019.
The cluster I've used as an example here is a staging cluster, but I've noticed the same behavior on two similarly configured production clusters, which leads me to believe there was a change on the GKE side.
Pods are deployed by Helm through cloudbuild.yaml, and nothing in that setup has changed (neither the Helm version nor the cloudbuild file) between May 1st and May 6th, when the regression seems to have been introduced.
Helm version:
Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
If you look at the docs for Pod Priority and Preemption, the feature is alpha in Kubernetes <= 1.10 (disabled by default, and enabled by a feature gate that GKE doesn't set on the control plane, afaik), and it became beta in >= 1.11 (enabled by default).
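A quick way to confirm which versions you actually have is to compare the control plane version with the kubelet version on the nodes. Roughly something like this (CLUSTER_NAME and ZONE are placeholders for your own values):
$ kubectl version --short        # client and server (control plane) versions
$ kubectl get nodes -o wide      # the VERSION column shows each node's kubelet version
$ gcloud container clusters describe CLUSTER_NAME --zone ZONE --format='value(currentMasterVersion,currentNodeVersion)'
If the server/master version is 1.11+ while the nodes report 1.10.x, you're in the situation described below.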
It could be either of these, or a combination of both:
You have a GKE control plane >= 1.11 and the node where your Pod is trying to start (as scheduled by the kube-scheduler) is running a kubelet <= 1.10. Somebody could have upgraded the control plane without upgrading the nodes (or the instance group / node pool); if so, upgrading the node pool should fix it (see the command at the end of this answer).
Someone upgraded the control plane to 1.11 or later, where the Priority admission controller is enabled by default, and it's preventing Pods with the spec.priority field set from starting (or restarting). If you look at the API docs (priority field), they say that when the Priority admission controller is enabled you cannot set that field yourself; it can only be populated by the admission controller from a PriorityClass via priorityClassName.