Say we have the following deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  replicas: 2
  template:
    spec:
      containers:
      - image: ...
        ...
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
          limits:
            cpu: 500m
            memory: 300Mi
And we also create a HorizontalPodAutoscaler object which automatically scales the number of pods up or down based on average CPU utilization. I know that the HPA computes the number of pods based on the resource requests, but what if I want the containers to be able to use more resources before scaling horizontally?
I have two questions:
1) Are resource limits even used by K8s when an HPA is defined?
2) Can I tell the HPA to scale based on resource limits rather than requests? Or, as a means of implementing such a control, can I set the targetUtilization value to be more than 100%?
No, the HPA does not look at limits at all; utilization is always measured as a percentage of the resource request. You can set the target utilization to any value, including values higher than 100%.
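As a rough sketch of what that could look like for the deployment in the question: the manifest below assumes the cluster serves the autoscaling/v2 API, and the object names, the maxReplicas value and the 400% target are all hypothetical choices made for illustration. Because utilization is relative to the 100m request, a 400% target corresponds to an average of about 400m of CPU per pod, which still sits below the 500m limit.

```yaml
# Sketch only: names, maxReplicas and the 400% target are assumptions,
# not values taken from the question.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa                # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment       # hypothetical; must match the Deployment's name
  minReplicas: 2
  maxReplicas: 10                  # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        # Percentage of the CPU *request* (100m), so 400 means ~400m per pod.
        averageUtilization: 400
```

Picking a target a bit below limit/request (here 500%) leaves some headroom, since pods that are throttled at their limit can never report usage above it.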
Hi. In the deployment we have resource requests and limits. As per the documentation, these parameters take effect before the HPA comes into play as an autoscaler:
- When you create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each of the resource types: the amount of CPU and memory it can provide for Pods.
- When the kubelet starts a Container of a Pod, it passes the CPU and memory limits to the container runtime.
- If a Container exceeds its memory limit, it might be terminated. If it is restartable, the kubelet will restart it, as with any other type of runtime failure.
- If a Container exceeds its memory request, it is likely that its Pod will be evicted whenever the node runs out of memory.
On the other hand:
The Horizontal Pod Autoscaler is implemented as a control loop, with a period controlled by the controller manager’s --horizontal-pod-autoscaler-sync-period flag (with a default value of 15 seconds). During each period, the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition.
Note: If some of the pod’s containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric.
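To make the control loop a bit more concrete, here is the replica calculation described in the Kubernetes HPA documentation, worked through for the deployment in the question. The per-pod usage of 480m and the 400% target are assumed numbers chosen only for illustration.

```latex
% Scaling rule documented for the HPA controller
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times
  \frac{\text{currentUtilization}}{\text{targetUtilization}} \right\rceil

% Assumed: request = 100m, average usage = 480m per pod, target = 400%
\text{currentUtilization} = \frac{480\text{m}}{100\text{m}} \times 100\% = 480\%

% Two replicas are currently running
\text{desiredReplicas} = \left\lceil 2 \times \frac{480}{400} \right\rceil
  = \lceil 2.4 \rceil = 3
```

Only the request (100m) enters the calculation. The 500m limit plays no part in it and merely caps how much CPU each container can consume.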
Hope this helps.