I own a GKE Cluster on GCP, I have 1 node pool with 1 node (4 CPU/16Gb RAM).
Today I tried to scale one of my application to 10 replicas (We want to run lots of concurrent requests on it).
I first edited my
horizontalPodAutoscaler.yaml and changed
maxReplicas from 5 to 50 and
minReplicas from 1 to 10.
Then I edited
deployment.yaml and modified
spec.replicas from 3 to 10.
Now my deployment is stuck in a loop: It tries to deploy the 10 pods, and as soon as the 10 are ready, it kills 5 of them to go back to 5, in an infinite loop.
Here a the screenshots of the state of the Autoscaler during the loop, it's like it tries to apply 1 configuration and immeditalety the configuration get overwritten by the other.
Here are the config files I am using:
apiVersion: autoscaling/v2beta1 kind: HorizontalPodAutoscaler metadata: labels: app: my-app env: production name: my-app-hpa namespace: production spec: maxReplicas: 50 metrics: - resource: name: cpu targetAverageUtilization: 80 type: Resource minReplicas: 10 scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: my-app
apiVersion: apps/v1 kind: Deployment metadata: labels: app: my-app env: production name: my-app namespace: production spec: replicas: 10 selector: matchLabels: app: my-app env: production strategy: rollingUpdate: maxSurge: 25% maxUnavailable: 25% type: RollingUpdate template: metadata: labels: app: my-app env: production spec: nodeSelector: cloud.google.com/gke-nodepool: my-pool containers: - image: gcr.io/my_project_id/github.com/my_org/my-app imagePullPolicy: IfNotPresent name: my-app-1 resources: requests: cpu: "50m"