I have a Kubernetes cluster hosted in Google Cloud. I created my deployment and added an HPA rule for scaling:
kubectl autoscale deployment MY_DEP --max 10 --min 6 --cpu-percent 60
After waiting a minute, I ran kubectl get hpa to verify my scaling rule. As expected, I have 6 pods running (matching the --min parameter).
$ kubectl get hpa
NAME     REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
MY_DEP   Deployment/MY_DEP   <unknown>/60%   6         10        6          1m
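As a side note, the full state of this HPA, including the conditions and events that usually explain an <unknown> target, can be inspected with (MY_DEP being the name used in the autoscale command above):

$ kubectl describe hpa MY_DEP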
Now, I want to change the min parameter:
kubectl patch hpa MY_DEP -p '{"spec":{"minReplicas": 1}}'
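To confirm the patch landed before waiting on the autoscaler, the field it targets can be read back directly (this simply echoes .spec.minReplicas, the same field the patch above sets):

$ kubectl get hpa MY_DEP -o jsonpath='{.spec.minReplicas}'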
I waited 30 minutes and ran the command:
$ kubectl get hpa
NAME     REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
MY_DEP   Deployment/MY_DEP   <unknown>/60%   1         10        6          30m
expected replicas: 1, actual replicas: 6
More information: the replicas count has not changed after the patch.
If I changed the MINPODS parameter to 1, why do I still have 6 pods? How can I make Kubernetes actually apply the new minimum to my deployment?
If I changed the MINPODS parameter to 1, why do I still have 6 pods?
I believe the answer is because of the <unknown>/60%
present in the output. The fine manual states:
Please note that if some of the pod's containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric
and one can see an example of 0% / 50% on the walkthrough page. Thus, I believe that since Kubernetes cannot determine what percentage of CPU is being consumed, neither above nor below the target, it takes no action for fear of making whatever the situation is worse.
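If the containers in MY_DEP really are missing a CPU request, adding one should give the autoscaler a defined utilization to compare against the 60% target. A minimal sketch, where 100m is only a placeholder value; without -c it applies to every container in the pod template, and it triggers a new rollout:

kubectl set resources deployment MY_DEP --requests=cpu=100m

After the rollout settles, the TARGETS column of kubectl get hpa should show a real percentage instead of <unknown>.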
As for why there is a <unknown>, I would hazard a guess that it's the dreaded Heapster-to-metrics-server cutover hiding that information from the Kubernetes API. Regrettably, I don't have first-hand experience testing that theory, so I can't offer concrete steps beyond "check whether your cluster is collecting metrics in a place that Kubernetes can see them."
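If you want to check that theory, the commands below are a rough starting point. They assume the metrics API is served by a metrics-server deployment in kube-system, which is the usual setup on GKE but may differ on your cluster:

$ kubectl top pods
$ kubectl get apiservice v1beta1.metrics.k8s.io
$ kubectl -n kube-system get deployment metrics-server

If kubectl top pods returns CPU numbers, the HPA should be able to see them too, and the missing resource requests described above become the more likely culprit.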