How to diagnose a stuck Kubernetes rollout / deployment?

6/1/2018

It seems a deployment has gotten stuck. How can I diagnose this further?

kubectl rollout status deployment/wordpress
Waiting for rollout to finish: 2 out of 3 new replicas have been updated...

It has been stuck at this point for a long time, and it is not terminating the two older pods:

kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
nfs-server-r6g6w             1/1       Running   0          2h
redis-679c597dd-67rgw        1/1       Running   0          2h
wordpress-64c944d9bd-dvnwh   4/4       Running   3          3h
wordpress-64c944d9bd-vmrdd   4/4       Running   3          3h
wordpress-f59c459fd-qkfrt    0/4       Pending   0          22m
wordpress-f59c459fd-w8c65    0/4       Pending   0          22m

And the events:

kubectl get events --all-namespaces
NAMESPACE   LAST SEEN   FIRST SEEN   COUNT     NAME                                          KIND         SUBOBJECT   TYPE      REASON              SOURCE                    MESSAGE
default     25m         2h           333       wordpress-686ccd47b4-4pbfk.153408cdba627f50   Pod                      Warning   FailedScheduling    default-scheduler         No nodes are available that match all of the predicates: Insufficient cpu (1), Insufficient memory (2), MatchInterPodAffinity (1).
default     25m         2h           337       wordpress-686ccd47b4-vv9dk.153408cc8661c49d   Pod                      Warning   FailedScheduling    default-scheduler         No nodes are available that match all of the predicates: Insufficient cpu (1), Insufficient memory (2), MatchInterPodAffinity (1).
default     22m         22m          1         wordpress-686ccd47b4.15340e5036ef7d1c         ReplicaSet               Normal    SuccessfulDelete    replicaset-controller     Deleted pod: wordpress-686ccd47b4-4pbfk
default     22m         22m          1         wordpress-686ccd47b4.15340e5036f2fec1         ReplicaSet               Normal    SuccessfulDelete    replicaset-controller     Deleted pod: wordpress-686ccd47b4-vv9dk
default     2m          22m          72        wordpress-f59c459fd-qkfrt.15340e503bd4988c    Pod                      Warning   FailedScheduling    default-scheduler         No nodes are available that match all of the predicates: Insufficient cpu (1), Insufficient memory (2), MatchInterPodAffinity (1).
default     2m          22m          72        wordpress-f59c459fd-w8c65.15340e50399a8a5a    Pod                      Warning   FailedScheduling    default-scheduler         No nodes are available that match all of the predicates: Insufficient cpu (1), Insufficient memory (2), MatchInterPodAffinity (1).
default     22m         22m          1         wordpress-f59c459fd.15340e5039d6c622          ReplicaSet               Normal    SuccessfulCreate    replicaset-controller     Created pod: wordpress-f59c459fd-w8c65
default     22m         22m          1         wordpress-f59c459fd.15340e503bf844db          ReplicaSet               Normal    SuccessfulCreate    replicaset-controller     Created pod: wordpress-f59c459fd-qkfrt
default     3m          23h          177       wordpress.1533c22c7bf657bd                    Ingress                  Normal    Service             loadbalancer-controller   no user specified default backend, using system default
default     22m         22m          1         wordpress.15340e50356eaa6a                    Deployment               Normal    ScalingReplicaSet   deployment-controller     Scaled down replica set wordpress-686ccd47b4 to 0
default     22m         22m          1         wordpress.15340e5037c04da6                    Deployment               Normal    ScalingReplicaSet   deployment-controller     Scaled up replica set wordpress-f59c459fd to 2
-- Chris Stryczynski
kubernetes

2 Answers

6/4/2018

The new deployment had a replica count of 3 while the previous one had 2. I assumed I could set a high replica count and Kubernetes would deploy as many replicas as it could before reaching its resource capacity. However, this does not seem to be the case...
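
One plausible reading of why the two older pods were never terminated, sketched in Python. This assumes the Deployment uses the default RollingUpdate strategy (maxSurge and maxUnavailable both 25%); the question does not show the actual strategy, and the function name `rolling_update_bounds` is made up for illustration.

```python
import math

def rolling_update_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Pod-count limits the Deployment controller keeps during a rolling update.

    Assumes percentage-based settings (the defaults are 25% each).
    """
    # A maxSurge percentage is rounded up, a maxUnavailable percentage is rounded down.
    max_surge = math.ceil(replicas * max_surge_pct / 100)
    max_unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return {
        "max_total_pods": replicas + max_surge,
        "min_available_pods": replicas - max_unavailable,
    }

bounds = rolling_update_bounds(3)
print(bounds)
```

Under those assumptions, 3 replicas allow at most 4 pods in total and require at least 3 available ones. The pod listing shows 2 running and 2 pending pods, so the total is already at the limit and only 2 pods are available. The controller can therefore neither create a third new pod nor remove an old one until a pending pod gets scheduled.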

-- Chris Stryczynski
Source: StackOverflow

6/1/2018

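You can run kubectl describe po wordpress-f59c459fd-qkfrt for more detail, but the event messages already show that the pods cannot be scheduled on any node: the scheduler reports insufficient CPU, insufficient memory, and an unsatisfied inter-pod affinity rule.

Provide more capacity, for example by adding a node, so that the pods can be scheduled.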
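
To see at a glance which constraint rules out how many nodes, the FailedScheduling message can be split into per-reason counts. A minimal Python sketch, assuming the message format shown in the question's events (other Kubernetes versions may word it differently); the function name `parse_failed_scheduling` is hypothetical.

```python
import re

def parse_failed_scheduling(message):
    """Return {reason: node_count} parsed from a FailedScheduling event message."""
    # The reasons follow the first colon, each written as "Reason text (N)".
    _, _, reasons = message.partition(":")
    return {
        reason.strip(): int(count)
        for reason, count in re.findall(r"([^,()]+)\((\d+)\)", reasons)
    }

# Message copied from the events in the question.
message = (
    "No nodes are available that match all of the predicates: "
    "Insufficient cpu (1), Insufficient memory (2), MatchInterPodAffinity (1)."
)
print(parse_failed_scheduling(message))
```

Here the counts show that memory is short on two nodes and CPU on one, which points to adding capacity or lowering the pods' resource requests.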

-- iomv
Source: StackOverflow