Kubernetes Rolling Update not obeying 'maxUnavailable' replicas when redeployed in autoscaled conditions

12/19/2018

In a nutshell, most of our apps are configured with the following strategy in the Deployment -

  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate

The Horizonatal Pod Autoscaler is configured as so

spec:
  maxReplicas: 10
  minReplicas: 2

Now when our application was redeployed, instead of running a rolling update, it instantly terminated 8 of our pods and dropped the number of pods to 2 which is the min number of replicas available. This happened in a fraction of a second as you can see here.

enter image description here

Here is the output of kubectl get hpa -

enter image description here

As maxUnavailable is 25%, shouldn't only about 2-3 pods go down at max ? Why did so many pods crash at once ? It seems as though rolling update is useless if it works this way.

What am I missing ?

-- bholagabbar
autoscaling
deployment
kubernetes

2 Answers

3/7/2020

I want to point out the minReadySeconds property.

The minReadySeconds property that specifies how long a newly created pod should be ready before the pod is treated as available. Actually the redeploying that without minReadySeconds's property has been done successfully in a very short time. But after short time readiness probe started to failing for any reason and the pods start scale down.

maxUnavailable property is only taken care about while RollingUpdate. After RollingUpdate event this property ignored.

Note from Kubernetes In Action's book : If you only define the readiness probe without setting minReadySeconds properly, new pods are considered available immediately when the first invocation of the readiness probe succeeds. If the readiness probe starts failing shortly after, the bad version is rolled out across all pods. Therefore, you should set minReadySeconds appropriately.

-- fgul
Source: StackOverflow

12/26/2018

After looking at this question, I decided to try this with test Environment where I wanted to check If it doesn't work.

I have setup the metrics-server to fetch the metrics server and set a HPA. I have followed the following steps to setup the HPA and deployment:

How to Enable KubeAPI server for HPA Autoscaling Metrics

Once, I have working HPA and max 10 pods running on system, I have updated the images using:

[root@ip-10-0-1-176 ~]# kubectl get hpa
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   49%/50%   1         10        10         87m

[root@ip-10-0-1-176 ~]# kubectl get pods
NAME                              READY   STATUS    RESTARTS   AGE
load-generator-557649ddcd-6jlnl   1/1     Running   0          61m
php-apache-75bf8f859d-22xvv       1/1     Running   0          91s
php-apache-75bf8f859d-dv5xg       1/1     Running   0          106s
php-apache-75bf8f859d-g4zgb       1/1     Running   0          106s
php-apache-75bf8f859d-hv2xk       1/1     Running   0          2m16s
php-apache-75bf8f859d-jkctt       1/1     Running   0          2m46s
php-apache-75bf8f859d-nlrzs       1/1     Running   0          2m46s
php-apache-75bf8f859d-ptg5k       1/1     Running   0          106s
php-apache-75bf8f859d-sbctw       1/1     Running   0          91s
php-apache-75bf8f859d-tkjhb       1/1     Running   0          55m
php-apache-75bf8f859d-wv5nc       1/1     Running   0          106s
[root@ip-10-0-1-176 ~]# kubectl set image deployment php-apache php-apache=hpa-example:v1 --record
deployment.extensions/php-apache image updated

[root@ip-10-0-1-176 ~]# kubectl get pods
NAME                              READY   STATUS              RESTARTS   AGE
load-generator-557649ddcd-6jlnl   1/1     Running             0          62m
php-apache-75bf8f859d-dv5xg       1/1     Terminating         0          2m40s
php-apache-75bf8f859d-g4zgb       1/1     Terminating         0          2m40s
php-apache-75bf8f859d-hv2xk       1/1     Terminating         0          3m10s
php-apache-75bf8f859d-jkctt       1/1     Running             0          3m40s
php-apache-75bf8f859d-nlrzs       1/1     Running             0          3m40s
php-apache-75bf8f859d-ptg5k       1/1     Terminating         0          2m40s
php-apache-75bf8f859d-sbctw       0/1     Terminating         0          2m25s
php-apache-75bf8f859d-tkjhb       1/1     Running             0          56m
php-apache-75bf8f859d-wv5nc       1/1     Terminating         0          2m40s
php-apache-847c8ff9f4-7cbds       1/1     Running             0          6s
php-apache-847c8ff9f4-7vh69       1/1     Running             0          6s
php-apache-847c8ff9f4-9hdz4       1/1     Running             0          6s
php-apache-847c8ff9f4-dlltb       0/1     ContainerCreating   0          3s
php-apache-847c8ff9f4-nwcn6       1/1     Running             0          6s
php-apache-847c8ff9f4-p8c54       1/1     Running             0          6s
php-apache-847c8ff9f4-pg8h8       0/1     Pending             0          3s
php-apache-847c8ff9f4-pqzjw       0/1     Pending             0          2s
php-apache-847c8ff9f4-q8j4d       0/1     ContainerCreating   0          4s
php-apache-847c8ff9f4-xpbzl       0/1     Pending             0          1s

Also, I have kept job in background which pushed the kubectl get pods output every second in a file. At no time till all images are upgraded, number of pods never went below 8.

I believe you need to check how you're setting up your rolling upgrade. Are you using deployment or replicaset? I have kept the rolling update strategy same as you maxUnavailable: 25% and maxSurge: 25% with deployment and it is working well for me.

-- Prafull Ladha
Source: StackOverflow