So this is happening on EKS, Kubernetes v1.15 (you can see the API version in the describe output below). The per-pod CPU hovers between 80 and 120 millicores, which does not at ALL match the replica counts coming out of the HPA.
Here is the YAML:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: sequencer
  namespace: djin-content
spec:
  minReplicas: 1
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sequencer
  metrics:
  - type: Pods
    pods:
      metricName: cpu_usage
      targetAverageValue: 500
Here is the kubectl describe:
[root@ip-10-150-53-173 ~]# kubectl describe hpa -n djin-content
Name: sequencer
Namespace: djin-content
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"sequencer","namespace":"djin-con...
CreationTimestamp: Wed, 05 Aug 2020 20:40:37 +0000
Reference: Deployment/sequencer
Metrics: ( current / target )
"cpu_usage" on pods: 122m / 500
Min replicas: 1
Max replicas: 10
Deployment pods: 7 current / 7 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededRescale the HPA controller was able to update the target scale to 4
ScalingActive True ValidMetricFound the HPA was able to successfully calculate a replica count from pods metric cpu_usage
ScalingLimited False DesiredWithinRange the desired count is within the acceptable range
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulRescale 34m horizontal-pod-autoscaler New size: 10; reason: pods metric cpu_usage above target
Normal SuccessfulRescale 15m (x2 over 34m) horizontal-pod-autoscaler New size: 6; reason: pods metric cpu_usage above target
Normal SuccessfulRescale 10m horizontal-pod-autoscaler New size: 5; reason: All metrics below target
Normal SuccessfulRescale 9m51s (x2 over 23m) horizontal-pod-autoscaler New size: 3; reason: All metrics below target
Normal SuccessfulRescale 5m (x2 over 16m) horizontal-pod-autoscaler New size: 4; reason: pods metric cpu_usage above target
Normal SuccessfulRescale 4m45s (x2 over 15m) horizontal-pod-autoscaler New size: 5; reason: pods metric cpu_usage above target
Normal SuccessfulRescale 4m30s horizontal-pod-autoscaler New size: 7; reason: pods metric cpu_usage above target
The custom metrics API is populated correctly/frequently and running well, and the deployment targeting is working perfectly. I have gone through the entire k8s code base for this API and the replica calculation, and this makes NO sense...
I found the answer to this some time ago and forgot to post an update. It was debated at length in a well-known k8s project issue on this topic; it's essentially a design bug (feature?) in k8s HPA targeting: https://github.com/kubernetes/kubernetes/issues/78761#issuecomment-670815813
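If you hit the same thing, a quick way to see it in action is to compare the Deployment's selector with the pods it actually matches in the namespace (the app=sequencer label below is an assumption; substitute whatever labels your Deployment really uses):

# Show the label selector the HPA inherits via the Deployment's scale subresource
kubectl get deployment sequencer -n djin-content -o jsonpath='{.spec.selector.matchLabels}'

# List every pod in the namespace matching that selector; pods from other
# workloads that happen to share these labels get averaged into the HPA's metric
kubectl get pods -n djin-content -l app=sequencer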
It seems like the metrics don't match: you have 122m (millicores) vs. 500 of something raw.
"cpu_usage" on pods: 122m / 500
You didn't specify what's calculating your custom metrics; it could be that an extra 0 is being added to 122m, making it 1220 / 500 (I assume cpu_usage is the custom metric, since the regular metrics-server metric is just cpu), but you could try:
targetAverageValue: 500m
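For reference, the replica count the controller computes follows the documented HPA formula; plugging the numbers from the describe output into it (a sketch, assuming currentReplicas is 7 as shown) illustrates how a unit mixup flips the behavior:

desiredReplicas = ceil( currentReplicas * currentMetricValue / targetMetricValue )

# If 122m is parsed as the quantity 0.122 against a raw target of 500:
#   ceil(7 * 0.122 / 500) = ceil(0.0017) = 1   -> should scale down to minReplicas
# If an extra zero makes the raw value 1220 against a target of 500:
#   ceil(7 * 1220 / 500)  = ceil(17.08) = 18   -> capped at maxReplicas (10)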
The more common way to do HPA on CPU usage is to use the CPU utilization percentage from the metrics server:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
...
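One thing to keep in mind with this approach: the utilization percentage is computed against each container's CPU request, so the target Deployment has to set resources.requests.cpu or the HPA can't calculate anything. A minimal sketch (container name, image, and values are illustrative):

spec:
  containers:
  - name: php-apache           # illustrative container name
    image: php:7.4-apache      # illustrative image
    resources:
      requests:
        cpu: 200m              # a 50% target then scales around ~100m of actual usage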
The scaling activities are managed by the kube-controller-manager in your K8s control plane. If you have the EKS control plane logs enabled, you could also take a look there to find more information. 👀
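If the controller-manager logs aren't enabled yet, they can be switched on with the AWS CLI (cluster name and region below are placeholders); the logs then show up in CloudWatch Logs:

aws eks update-cluster-config \
  --name my-cluster \
  --region us-east-1 \
  --logging '{"clusterLogging":[{"types":["controllerManager"],"enabled":true}]}'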
✌️