Kubernetes autoscaling not fast enough

5/7/2018

I am working on a project which is deployed on Kubernetes. The system consists of multiple microservices, one of which does CPU-intensive work for 4-5 seconds when it receives an HTTP request.

We are in a situation where users might invoke this endpoint many times concurrently over a short period of time (possibly within milliseconds). My concern is that the autoscaler cannot boot new pods fast enough to meet demand, and that multiple requests will land on the same pod, resulting in poor performance for each of those HTTP requests.

The application running in the container actually spawns a new process for every request, so it is able to take advantage of more than one core when processing concurrent requests.

So the question is

Is there any way to make the autoscaler lightning-fast, i.e. responding within milliseconds? How is this problem solved in other projects?

Thanks

-- Jakob Kristensen
architecture
autoscaling
kubernetes

1 Answer

5/8/2018

Pod autoscaling is based on metrics that are scraped from the running pods by a tool called Heapster. By default, this tool scrapes data every 60 seconds. Furthermore, the scrape itself takes a significant amount of time (seconds) to complete, and the more pods you have, the longer it takes.
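For reference, here is a minimal sketch of the kind of HorizontalPodAutoscaler this applies to (the Deployment name and thresholds are illustrative, not taken from your setup). Nothing in this object speeds up the reaction time: the controller only acts after Heapster has scraped fresh metrics and the kube-controller-manager's periodic sync loop has run (see its --horizontal-pod-autoscaler-sync-period flag), so the pipeline above sets the floor.

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: cpu-worker            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cpu-worker          # the Deployment running the CPU-heavy service
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilizationPercentage: 60   # illustrative threshold
```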

Given that, it should be clear that it is not feasible to make autoscaling react at the speed you are asking for (milliseconds).

The "solution" you have is allocating a number of PODs that will be able to sustain your traffic during a peak in a reasonable way. This is of course a waste of system resources when you are off-peak.

-- whites11
Source: StackOverflow