Using Horizontal Pod Autoscaling along with resource requests and limits

5/18/2019

Say we have the following deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  replicas: 2
  template:
    spec:
      containers:
        - image: ...
          ...
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
            limits:
              cpu: 500m
              memory: 300Mi

We also create a HorizontalPodAutoscaler object that automatically scales the number of pods up or down based on average CPU utilization. I know that the HPA computes the number of pods based on the resource requests, but what if I want the containers to be able to consume more resources before scaling horizontally?

I have two questions:

1) Are resource limits even used by K8s when an HPA is defined?

2) Can I tell the HPA to scale based on resource limits rather than requests? Or as a means of implementing such a control, can I set the targetUtilization value to be more than 100%?

-- mittelmania
horizontal-scaling
kubernetes

2 Answers

5/18/2019

No, the HPA does not look at limits at all; it measures utilization against requests. You can set the target utilization to any value, including values above 100%.
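
As a rough sketch of what that could look like (not taken from the original posts), the manifest below targets 400% average CPU utilization. Utilization is measured against the request, so with the 100m request from the question this corresponds to roughly 400m of average usage per pod, still under the 500m limit. The names my-app and my-app-hpa, the replica bounds, and the autoscaling/v2 API version (older clusters may need autoscaling/v2beta2) are assumptions for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical; must match the Deployment's metadata.name
  minReplicas: 2            # assumed bounds, adjust to your workload
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          # Percent of the CPU request: 400% of 100m is about 400m per pod
          averageUtilization: 400
```

With these numbers, a target above 500% would likely never trigger, since CPU throttling keeps each container at or below its 500m limit, which is 500% of the request.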

-- Vasily Angapov
Source: StackOverflow

5/27/2019

Hi, in a deployment we have resource requests and limits. According to the documentation, these parameters take effect before the HPA steps in as the autoscaler:

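  1. When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each of the resource types: the amount of CPU and memory it can provide for Pods. The scheduler ensures that the sum of the resource requests of the scheduled Containers is less than the capacity of the node.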
  2. When the kubelet starts a Container of a Pod, it passes the CPU and memory limits to the container runtime.
  3. If a Container exceeds its memory limit, it might be terminated. If it is restartable, the kubelet will restart it, as with any other type of runtime failure.

If a Container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.

On the other hand:

The Horizontal Pod Autoscaler is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag (with a default value of 15 seconds). During each period, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition.

Note: If some of the pod’s containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric.

Hope this helps.

-- Hanx
Source: StackOverflow