I have a k8s cluster deployed on GKE. There is one "main" node pool containing 1 node, which hosts all the deployments, and one node pool containing 1 node for kube-ip.
On the main node pool, I would like to deploy 10 replicas of one of my applications (a Flask API). However, GKE is constantly killing my pods whenever the count exceeds 5, bringing the replica count back down to 5.
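For reference, this is roughly how I watch the replica count being forced back down (the deployment name and namespace are the ones from the manifests below):

    kubectl get deployment my-app -n production -w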
I tried to modify the values in my different YAML files (deployment.yaml and horizontalPodScheduler.yaml):
horizontalPodScheduler.yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  labels:
    app: my-app
    env: production
  name: my-app-hpa
  namespace: production
spec:
  maxReplicas: 20
  metrics:
  - resource:
      name: cpu
      targetAverageUtilization: 80
    type: Resource
  minReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-app
    env: production
  name: my-app
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: my-app
      env: production
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-app
        env: production
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: my-pool
      containers:
      - image: gcr.io/my_project_id/github.com/my_org/my-app
        imagePullPolicy: IfNotPresent
        name: my-app-1
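For completeness, both manifests are applied with something like the following (illustrative, the exact invocation may differ on my side):

    kubectl apply -f deployment.yaml -f horizontalPodScheduler.yaml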
Even when I set those values, GKE keeps overwriting them and scaling the deployment back down to 5 replicas.
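To see what the autoscaler itself reports, I also look at the HPA status and events with roughly this (hpa name and namespace taken from the manifest above):

    kubectl describe hpa my-app-hpa -n production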
Here is the resource summary for my main node; you can see there are plenty of resources left to deploy the replicas (it's a pretty simple API):
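(That summary comes from the GKE console; the rough kubectl equivalent I use to double-check node capacity is below, with the node name being a placeholder:)

    kubectl describe node <main-node-name>
    kubectl top nodes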
I also tried using the "Scale" button in the GKE UI, but the result is the same...
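As far as I understand, that button is roughly equivalent to running:

    kubectl scale deployment my-app -n production --replicas=10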