What to put for Kubernetes resource requests/limits?

5/15/2018

I've seen articles recommending that resource requests and limits be set. However, none that I've found discuss what numbers to fill in.

For example, consider a container that uses zero CPU while idle, 80% of a core under normal user requests, and 200% when hit by some rare requests:

If I put the maximum, 2000m, as the CPU request, then a core would sit idle most of the time
On the other hand, if I request 800m and several pods hit their CPU limit at the same time, the context-switch overhead will kick in
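
To make the two options concrete, here is a minimal Python sketch that builds the `resources` stanza of a container spec for each one. The helper names `resources_for` and `reserved_but_idle_m` are made up for illustration, and using 2000m as the limit in the second option is an assumption; only the 800m and 2000m figures come from the example above.

```python
import json


def resources_for(cpu_request_m: int, cpu_limit_m: int) -> dict:
    """Build a Kubernetes `resources` stanza with CPU given in millicores."""
    return {
        "requests": {"cpu": f"{cpu_request_m}m"},
        "limits": {"cpu": f"{cpu_limit_m}m"},
    }


def reserved_but_idle_m(cpu_request_m: int, typical_usage_m: int) -> int:
    """Millicores reserved by the request but unused at typical load."""
    return max(cpu_request_m - typical_usage_m, 0)


# Option 1: request the rare peak (2000m), so the scheduler reserves two cores.
request_peak = resources_for(2000, 2000)

# Option 2: request the normal load (800m) and allow bursting up to the peak.
# The 2000m limit here is an assumption, not something stated in the question.
request_typical = resources_for(800, 2000)

if __name__ == "__main__":
    print(json.dumps(request_peak))
    print(json.dumps(request_typical))
    # With a 2000m request and 800m of typical usage, 1200m sits reserved but idle.
    print(reserved_but_idle_m(2000, 800))
```

The first option wastes reserved capacity at normal load, while the second reserves nothing extra and relies on spare node capacity being available when the rare requests arrive.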

There are also cases like:

Internal tools that sit idle most of the time, then jump to 200% CPU on active use
Apps that have different peak times. For example, a SaaS that people use during working hours and a chatbot that starts getting load after people leave work. It'd be nice if they could share the unused capacity.

Ideally, the vertical pod autoscaler would solve these problems automatically, but it is still in alpha today.

-- willwill

kubernetes

1 Answer

5/15/2018

What I've been doing is to use telegraf to collect resource usage and set the request to the 95th percentile of that usage, while the limit is set to 1 CPU and twice the memory request.
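
As a rough sketch of that rule in Python: the function names `percentile` and `suggest_resources` are hypothetical, the samples are assumed to be already exported from the metrics store as plain lists (CPU in millicores, memory in MiB), and reading "the 95th percentile" as the request for both CPU and memory is my interpretation of the approach.

```python
def percentile(samples: list, pct: int) -> int:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. `pct` is an integer from 1 to 100."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def suggest_resources(cpu_samples_m: list, mem_samples_mib: list) -> dict:
    """Request = 95th percentile of observed usage;
    limit = 1 CPU and twice the memory request."""
    cpu_request = percentile(cpu_samples_m, 95)
    mem_request = percentile(mem_samples_mib, 95)
    return {
        "requests": {"cpu": f"{cpu_request}m", "memory": f"{mem_request}Mi"},
        "limits": {"cpu": "1000m", "memory": f"{2 * mem_request}Mi"},
    }


if __name__ == "__main__":
    # Made-up samples for illustration only: mostly quiet, one short spike.
    cpu = [100] * 19 + [900]
    mem = [256] * 20
    print(suggest_resources(cpu, mem))
```

Note that a single spike in 20 samples falls outside the 95th percentile, so the request tracks steady-state usage and the spike has to fit under the limit.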


The problems with this method are:

Apps that use multiple cores during startup, then stay under one core for the rest of their life, will take longer to start. I've observed a 2-minute Spring startup become 5 minutes.
Apps that are rarely used will have fewer resources reserved, and so have to rely on burst capacity when they get invoked. This could be a problem if such an app has a surge in popularity.

-- willwill

Source: StackOverflow
