Why is GKE HPA not scaling down?

2/15/2021

I have a Kubernetes Deployment running a Go app on Kubernetes 1.17 on GKE. It has CPU and memory requests and limits, and 1 replica is specified in the Deployment.

Furthermore, I have this HPA (I define an autoscaling/v2beta2 resource in my Helm chart, but GKE apparently converts it to v2beta1; a rough sketch of the v2beta2 version is included after the dump below):

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    meta.helm.sh/release-name: servicename
    meta.helm.sh/release-namespace: namespace
  creationTimestamp: "2021-02-15T11:30:18Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: servicename-service
  namespace: namespace
  resourceVersion: "123"
  selfLink: link
  uid: uid
spec:
  maxReplicas: 10
  metrics:
  - resource:
      name: memory
      targetAverageUtilization: 80
    type: Resource
  - resource:
      name: cpu
      targetAverageUtilization: 80
    type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: servicename-service
status:
  conditions:
  - lastTransitionTime: "2021-02-15T11:30:33Z"
    message: recommended size matches current size
    reason: ReadyForNewScale
    status: "True"
    type: AbleToScale
  - lastTransitionTime: "2021-02-15T13:17:20Z"
    message: the HPA was able to successfully calculate a replica count from cpu resource
      utilization (percentage of request)
    reason: ValidMetricFound
    status: "True"
    type: ScalingActive
  - lastTransitionTime: "2021-02-15T13:17:36Z"
    message: the desired count is within the acceptable range
    reason: DesiredWithinRange
    status: "False"
    type: ScalingLimited
  currentMetrics:
  - resource:
      currentAverageUtilization: 14
      currentAverageValue: "9396224"
      name: memory
    type: Resource
  - resource:
      currentAverageUtilization: 33
      currentAverageValue: 84m
      name: cpu
    type: Resource
  currentReplicas: 3
  desiredReplicas: 3
  lastScaleTime: "2021-02-15T13:40:11Z"
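
For completeness, the autoscaling/v2beta2 resource in my Helm chart looks roughly like this (reconstructed from the chart template, so treat it as illustrative; the values mirror the v2beta1 object above):

# sketch of the chart's v2beta2 HPA; GKE stores it as the v2beta1 object shown above
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: servicename-service
  namespace: namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: servicename-service
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80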

Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "456"
    meta.helm.sh/release-name: servicename-service
    meta.helm.sh/release-namespace: services
  creationTimestamp: "2021-02-11T10:00:45Z"
  generation: 129
  labels:
    app: servicename
    app.kubernetes.io/managed-by: Helm
    chart: servicename
    heritage: Helm
    release: servicename-service
  name: servicename-service-servicename
  namespace: namespace
  resourceVersion: "123"
  selfLink: /apis/apps/v1/namespaces/namespace/deployments/servicename-service-servicename
  uid: b1fcc8c6-f3e6-4bbf-92a1-d7ae1e2bb188
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: servicename
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: servicename
        release: servicename-service
    spec:
      containers:
      - envFrom:
        - configMapRef:
            name: servicename-service-servicename
        image: image
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health/liveness
            port: 8888
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: servicename
        ports:
        - containerPort: 8888
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health/readiness
            port: 8888
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 500m
            memory: 256Mi
          requests:
            cpu: 150m
            memory: 64Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 3
  conditions:
  - lastTransitionTime: "2021-02-11T10:00:45Z"
    lastUpdateTime: "2021-02-16T14:10:29Z"
    message: ReplicaSet "servicename-service-servicename-5b6445fcb" has
      successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-02-20T16:19:51Z"
    lastUpdateTime: "2021-02-20T16:19:51Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 129
  readyReplicas: 3
  replicas: 3
  updatedReplicas: 3

Output of kubectl get hpa --all-namespaces

NAMESPACE   NAME                                  REFERENCE                                         TARGETS                        MINPODS   MAXPODS   REPLICAS   AGE
namespace   servicename-service                   Deployment/servicename-service                    9%/80%, 1%/80%                 1         10        2          6d
namespace   xyz-service                           Deployment/xyz-service                            18%/80%, 1%/80%                1         10        1          6d

I haven't changed any of the kube-controller-manager's default settings, such as --horizontal-pod-autoscaler-downscale-stabilization.
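
To see how the autoscaler justifies its current recommendation, I have been checking the HPA's conditions and events with standard kubectl commands like these:

# show conditions, current metrics and recent scaling events for the HPA
kubectl describe hpa servicename-service -n namespace

# watch the replica recommendation change over time
kubectl get hpa servicename-service -n namespace -w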

Question: Why is it not scaling down to 1 replica when the CPU's currentAverageUtilization is 33 and the target is 80? I have waited for more than an hour.
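
For reference, my understanding from the Kubernetes docs is that the HPA computes, per metric, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue) and then takes the largest result across metrics. My own back-of-the-envelope check with the status numbers above (3 current replicas, 33% CPU and 14% memory against an 80% target):

\[
\begin{aligned}
\text{cpu:} \quad & \lceil 3 \times 33 / 80 \rceil = \lceil 1.24 \rceil = 2 \\
\text{memory:} \quad & \lceil 3 \times 14 / 80 \rceil = \lceil 0.53 \rceil = 1 \\
\text{desired} \quad & = \max(2, 1) = 2
\end{aligned}
\]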

Any ideas?

-- nikos
google-kubernetes-engine
hpa
kubernetes

0 Answers