I have built an application that uses Celery to run tasks asynchronously. The Celery workers run in 2 Kubernetes pods hosted on an Amazon AWS m4.large instance.
I have also set up horizontal pod autoscaling so that the pods scale out when CPU utilization goes above 80%. I also have an AWS Auto Scaling group that scales out when instance CPU utilization exceeds 80%.
For about an hour, the cluster scales to 5 instances, all running at ~100% CPU utilization. After that, only one Celery pod keeps running at ~100% while all the others drop to ~2%. There are still many tasks in the queue, but it seems that only one of my pods is working at full capacity.
What is happening here and how can I evenly distribute tasks such that all 5 pods run at similar CPU utilization until all tasks are completed?
When I run kubectl top pods, one of my pods uses 1900 milliCPUs and the others use 1 milliCPU each.
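To narrow this down, it might help to compare how many tasks each worker is executing against how many it has reserved (prefetched) but not yet started. Below is a minimal sketch of that check, not something I have run on this cluster. It uses Celery's inspect API (`app.control.inspect()` with `active()` and `reserved()`); the helper names `tasks_per_worker` and `report` are hypothetical, and `app` is assumed to be the project's Celery application object.

```python
def tasks_per_worker(snapshot):
    """Count tasks per worker from an inspect() reply.

    `snapshot` maps worker name -> list of task dicts, which is the shape
    returned by inspect().active() and inspect().reserved(). It can be None
    when no worker replies, so that case is treated as "no workers".
    """
    return {worker: len(tasks) for worker, tasks in (snapshot or {}).items()}


def report(app):
    """Print active and reserved task counts for every worker of `app`.

    `app` is assumed to be a configured celery.Celery instance that can
    reach the same broker as the workers.
    """
    inspector = app.control.inspect()
    active = tasks_per_worker(inspector.active())      # currently executing
    reserved = tasks_per_worker(inspector.reserved())  # prefetched, waiting
    for worker in sorted(set(active) | set(reserved)):
        print(
            f"{worker}: active={active.get(worker, 0)} "
            f"reserved={reserved.get(worker, 0)}"
        )
```

Calling `report(app)` from a Python shell inside one of the pods should print one line per worker. If the busy worker shows a large reserved count while the others show zero, that would suggest the remaining tasks have already been claimed by that one worker rather than still being available in the broker queue.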