I have a GKE cluster on GCP with one node pool containing one node (4 CPU / 16 GB RAM).
Today I tried to scale one of my applications to 10 replicas (we want to run lots of concurrent requests against it).
I first edited my horizontalPodAutoscaler.yaml and changed maxReplicas from 5 to 50 and minReplicas from 1 to 10.
Then I edited deployment.yaml and modified spec.replicas from 3 to 10.
Now my deployment is stuck in a loop: it tries to deploy the 10 pods, and as soon as all 10 are ready, it kills 5 of them to go back to 5, over and over.
Here are screenshots of the state of the autoscaler during the loop. It looks as if it tries to apply one configuration and that configuration immediately gets overwritten by the other.
Here are the config files I am using:
horizontalPodAutoscaler.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: my-app
    env: production
  name: my-app-hpa
  namespace: production
spec:
  maxReplicas: 50
  metrics:
  - resource:
      name: cpu
      targetAverageUtilization: 80
    type: Resource
  minReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
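For reference, here is a rough sketch of how I understand the HPA picks a replica count for a single CPU utilization metric: it scales the current count by the ratio of observed to target utilization, rounds up, and clamps the result to minReplicas and maxReplicas. The function name and the utilization figures are made up for illustration, and the sketch ignores the tolerance band, pod readiness handling, and scale-down stabilization that the real controller applies.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Simplified HPA calculation for one CPU utilization metric (hypothetical helper)."""
    # Scale the current replica count by observed/target utilization, rounding up.
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Clamp to the bounds declared in the HPA spec.
    return max(min_replicas, min(max_replicas, raw))


# Bounds and target from the manifest above; the 20% and 160% utilization
# values are assumptions, not measurements from my cluster.
low_load = desired_replicas(10, 20, 80, min_replicas=10, max_replicas=50)
high_load = desired_replicas(10, 160, 80, min_replicas=10, max_replicas=50)

# Same low-load input with the previous bounds (minReplicas 1, maxReplicas 5).
old_bounds = desired_replicas(10, 20, 80, min_replicas=1, max_replicas=5)

print(low_load, high_load, old_bounds)
```

Under these assumptions, the new bounds should never produce fewer than 10 replicas, which is why the drop back to 5 surprises me.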
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-app
    env: production
  name: my-app
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: my-app
      env: production
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-app
        env: production
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: my-pool
      containers:
      - image: gcr.io/my_project_id/github.com/my_org/my-app
        imagePullPolicy: IfNotPresent
        name: my-app-1
        resources:
          requests:
            cpu: "50m"