I've got a preemptible node pool that is clearly under-utilized. The node pool hosts a deployment with an HPA, set up as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      initContainers:
      - name: wait-for-database
        image: ### IMAGE ###
        command: ['bash', 'init.sh']
      containers:
      - name: backend
        image: ### IMAGE ###
        command: ["bash", "entrypoint.sh"]
        imagePullPolicy: Always
        resources:
          requests:
            memory: "200M"
            cpu: "50m"
        ports:
        - name: probe-port
          containerPort: 8080
          hostPort: 8080
        volumeMounts:
        - name: static-shared-data
          mountPath: /static
        readinessProbe:
          httpGet:
            path: /readiness/
            port: probe-port
          failureThreshold: 5
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
      - name: nginx
        image: nginx:alpine
        resources:
          requests:
            memory: "400M"
            cpu: "20m"
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-proxy-config
          mountPath: /etc/nginx/conf.d/default.conf
          subPath: app.conf
        - name: static-shared-data
          mountPath: /static
      volumes:
      - name: nginx-proxy-config
        configMap:
          name: backend-nginx
      - name: static-shared-data
        emptyDir: {}
      nodeSelector:
        cloud.google.com/gke-nodepool: app-dev
      tolerations:
      - effect: NoSchedule
        key: workload
        operator: Equal
        value: dev
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: backend
  namespace: default
spec:
  maxReplicas: 12
  minReplicas: 8
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: backend
  metrics:
  - resource:
      name: cpu
      targetAverageUtilization: 50
    type: Resource
---
The node pool also carries the matching taint for this toleration.
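For reference, the taint can be confirmed on the pool's nodes with something like the following (the cluster name dev in the second command is an assumption):
kubectl describe nodes -l cloud.google.com/gke-nodepool=app-dev | grep Taints
gcloud container node-pools describe app-dev --cluster dev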
The HPA utilization shows this:
NAME              REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
backend-develop   Deployment/backend-develop   10%/50%   8         12        8          38d
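To double-check the raw usage behind that target, metrics-server can be queried directly (assuming the deployment runs in the default namespace):
kubectl top pods -l app=backend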
But the node pool has not scaled down for about a day, even though there is no heavy load on this deployment:
NAME                            STATUS   ROLES    AGE     VERSION
gke-dev-app-dev-fee1a901-fvw9   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-gls7   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lf3f   Ready    <none>   24h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lgw9   Ready    <none>   3d10h   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-qxkz   Ready    <none>   3h35m   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-s10l   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-sj4d   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-vdnw   Ready    <none>   27h     v1.14.10-gke.36
There are no affinity settings for this deployment or node pool. Some of the nodes easily pack several identical pods, while others hold just one pod for hours, and no scale-down happens.
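The uneven packing is easy to see with -o wide, which adds the node each pod is scheduled on:
kubectl get pods -l app=backend -o wide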
What could be wrong?
The issue was this line in the backend container's ports:
hostPort: 8080
It caused FailedScheduling errors ("node(s) didn't have free ports").
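The pending pods surface this in their events; either of these shows the scheduling failures (the pod name is a placeholder):
kubectl get events --field-selector reason=FailedScheduling
kubectl describe pod <pending-backend-pod>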
Since only one pod per node can bind a given hostPort, every replica needed a node of its own: the scheduler spread the pods one per node, and the cluster autoscaler could not consolidate them onto fewer machines. That's why the nodes were kept online.
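A minimal sketch of the fix: delete the hostPort line from the container's ports (containerPort alone is enough for the readiness probe) and, if the port must be reachable from outside the pod, expose it through a Service instead. The Service below is an assumed addition, not part of the original setup:
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
  ports:
  - name: http
    port: 8080
    targetPort: probe-port
With the hostPort gone, several backend pods can share a node, the replicas consolidate, and the cluster autoscaler is free to drain and remove the idle preemptible nodes.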