GCP GKE Kubernetes HPA: horizontal-pod-autoscaler failed to get memory utilization: missing request for memory. How to fix it?

2/21/2022

I have a cluster in Google Kubernetes Engine and want one of my deployments to autoscale based on memory.

After deploying, I check the HorizontalPodAutoscaler with the following command:

kubectl describe hpa -n my-namespace

With this result:

Name:                                                     myapi-api-deployment
Namespace:                                                my-namespace
Labels:                                                   <none>
Annotations:                                              <none>
CreationTimestamp:                                        Tue, 15 Feb 2022 12:21:44 +0100
Reference:                                                Deployment/myapi-api-deployment
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  <unknown> / 50%
Min replicas:                                             1
Max replicas:                                             5
Deployment pods:                                          1 current / 1 desired
Conditions:
  Type            Status  Reason                   Message
  ----            ------  ------                   -------
  AbleToScale     True    ReadyForNewScale         recommended size matches current size
  ScalingActive   False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get memory utilization: missing request for memory
  ScalingLimited  False   DesiredWithinRange       the desired count is within the acceptable range
Events:
  Type     Reason                   Age                    From                       Message
  ----     ------                   ----                   ----                       -------
  Warning  FailedGetResourceMetric  2m22s (x314 over 88m)  horizontal-pod-autoscaler  failed to get memory utilization: missing request for memory

When I use the kubectl top command I can see the memory and CPU usage. Here is my deployment, including the autoscaler:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api-deployment
  namespace: my-namespace
  annotations:
    reloader.stakater.com/auto: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-api
      version: v1
  template:
    metadata:
      labels:
        app: my-api
        version: v1
      annotations:
        sidecar.istio.io/rewriteAppHTTPProbers: "true"
    spec:
      serviceAccountName: my-api-sa
      containers:
        - name: esp
          image: gcr.io/endpoints-release/endpoints-runtime:2
          imagePullPolicy: Always
          args: [
              "--listener_port=9000",
              "--backend=127.0.0.1:8080",              
              "--service=myproject.company.ai"            
          ]
          ports:
            - containerPort: 9000
        - name: my-api
          image: gcr.io/myproject/my-api:24
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: "/healthcheck"
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: "/healthcheck"
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10         
          resources:
            limits:
              cpu: 500m
              memory: 2048Mi
            requests:
              cpu: 300m
              memory: 1024Mi
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-deployment
  namespace: my-namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: "Utilization"
          averageUtilization: 50
---

I'm using the autoscaling/v2beta2 API version recommended by the GKE documentation.

-- seldon851
google-cloud-platform
google-kubernetes-engine
hpa
kubernetes
memory

1 Answer

2/22/2022

When using the HPA with a memory or CPU utilization target, you need to set resource requests for that resource on every container in the target Pods. See How does a HorizontalPodAutoscaler work, specifically:

For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each Pod. If a target raw value is set, the raw metric values are used directly.
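
In other words, the controller needs a memory request on every container in the Pod to turn raw usage into a utilization percentage. As a rough sketch of the arithmetic (the 256Mi esp request and the 640Mi usage below are made-up numbers for illustration):

memory utilization = sum(container memory usage) / sum(container memory requests) * 100
                   = 640Mi / (1024Mi + 256Mi) * 100
                   = 50%

If any container in the Pod (here, esp) has no memory request, the denominator cannot be computed and the HPA reports "missing request for memory" instead of a percentage.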

Your HPA targets my-api-deployment, which has two containers. You have resource requests set for my-api but not for esp, so you just need to add a memory resource request to the esp container.
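
A minimal sketch of that change, with the esp container carrying its own request (the 256Mi / 512Mi values are placeholders; size them for your ESP workload):

        - name: esp
          image: gcr.io/endpoints-release/endpoints-runtime:2
          imagePullPolicy: Always
          args: [
              "--listener_port=9000",
              "--backend=127.0.0.1:8080",
              "--service=myproject.company.ai"
          ]
          ports:
            - containerPort: 9000
          resources:
            requests:
              memory: 256Mi   # placeholder; required so the HPA can compute memory utilization
            limits:
              memory: 512Mi   # optional, but a reasonable companion to the request

After applying it, re-run kubectl describe hpa -n my-namespace; once the metrics server reports usage for the recreated Pods, the memory target should change from <unknown> to an actual percentage.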

-- Gari Singh
Source: StackOverflow