Fairly new, getting to grips with Kubernetes. Apologies if this is pretty basic.
I have exposed a service with the replica count set to 3. I have autoscaled the deployment with minPods=2 and maxPods=6, to scale when the CPU runs over 50%.
NAME     REFERENCE                       TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
my-api   Deployment/ewealth-stocks-api   <unknown> / 50%   2         6         3          2m
What does the above configuration do? How many pods will be available when the CPU is within the threshold and when it crosses it? Thanks.
As I understand it, you want to use the Horizontal Pod Autoscaler (HPA) based on CPU usage.
Basically, the HPA scales the number of pods based on CPU load. When the average CPU load is greater than 50%, it increases the number of replicas, up to a maximum of 6. When the load drops, it scales back down, but never below 2 replicas. The 3 replicas you set on the deployment are only the starting point; once the HPA has metrics, it manages the count.
Since the amount of load is not controlled in any way, the actual number of replicas at any moment can be anywhere between 2 and 6.
The HPA increases and decreases the number of replicas (via the deployment) to maintain an average CPU utilization of 50% across all pods. To check the current CPU usage and the number of pods, use the following commands:
kubectl get hpa
kubectl describe hpa HPA_NAME
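To make the scaling rule above more concrete, here is a rough bash sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the min/max range. The function name `desired_replicas` and its argument order are made up for illustration, and the sketch ignores the controller's tolerance band and scale-down stabilization, so treat it as an approximation rather than what the controller literally runs.

```shell
# Hypothetical helper (not part of kubectl): approximates the HPA replica formula.
# Usage: desired_replicas CURRENT_REPLICAS AVG_CPU_PERCENT TARGET_PERCENT MIN MAX
desired_replicas() {
  local current=$1 cpu=$2 target=$3 min=$4 max=$5
  # Integer ceiling of (current * cpu / target)
  local desired=$(( (current * cpu + target - 1) / target ))
  # Clamp to the configured MINPODS / MAXPODS bounds
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}
```

With your settings (target 50%, min 2, max 6) and 3 running pods, `desired_replicas 3 90 50 2 6` prints 6, while `desired_replicas 3 20 50 2 6` prints 2.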
I have attached a link with examples for further reading.