What to put for Kubernetes resource requests/limits?

5/15/2018

I've seen articles recommending that resource requests and limits be set. However, none of the ones I've found discuss what numbers to fill in.

For example, consider a container that uses zero CPU while idle, 80% of a core under normal user requests, and 200% when hit by some rare requests:

  • If I put the maximum, 2000m, as the CPU request, then a core would sit idle most of the time
  • On the other hand, if I request 800m and several pods hit their CPU limit at the same time, the context-switch overhead will kick in
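
To make the trade-off above concrete, here is a rough sketch of the arithmetic. The 8-core node is a hypothetical figure chosen for illustration, and the sketch ignores CPU reserved for the system; the 800m and 2000m numbers are the normal and rare-request loads from the example.

```python
def packing(node_millicores, request_millicores, typical_millicores, peak_millicores):
    """Return how many pods fit by request, and demand as a fraction of the node."""
    # The scheduler places pods by their CPU request, not by actual usage.
    pods = node_millicores // request_millicores
    return {
        "pods": pods,
        "typical_load": pods * typical_millicores / node_millicores,
        "peak_load": pods * peak_millicores / node_millicores,
    }


NODE = 8000  # hypothetical 8-core node, in millicores

for request in (2000, 800):
    result = packing(NODE, request, typical_millicores=800, peak_millicores=2000)
    print(f"request={request}m -> {result}")
```

Under these assumptions, a 2000m request fits 4 pods and leaves the node at 40% under normal load, while an 800m request fits 10 pods, fills the node under normal load, and would need 2.5 times its capacity if every pod peaked at once.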

There are also cases like

  • Internal tools that sit idle most of the time, then jump to 200% during active use
  • Apps that have different peak times. For example, a SaaS that people use during working hours and a chatbot that starts getting load after people leave work. It'd be nice if they could share the unused capacity.
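
The capacity-sharing idea in the last bullet can be sketched by comparing the sum of each app's peak with the peak of their combined load. The hourly usage profiles below are made-up numbers for illustration only, not measurements.

```python
def sum_of_peaks(*profiles):
    """Capacity needed if every app reserves its own peak."""
    return sum(max(profile) for profile in profiles)


def peak_of_sum(*profiles):
    """Capacity needed if the apps share one pool, hour by hour."""
    return max(sum(hour) for hour in zip(*profiles))


# Hypothetical CPU usage in millicores, one value per hour of the day.
saas = [100] * 9 + [1500] * 9 + [100] * 6      # busy 09:00 to 18:00
chatbot = [100] * 18 + [1200] * 5 + [100] * 1  # busy 18:00 to 23:00

print("reserve each peak:", sum_of_peaks(saas, chatbot), "m")
print("share one pool:   ", peak_of_sum(saas, chatbot), "m")
```

With these invented profiles, reserving each peak separately takes 2700m, while a shared pool only ever needs 1600m. Requests set below the peak are what allow that sharing, at the cost of contention if the peaks ever overlap.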

Ideally, the Vertical Pod Autoscaler would probably solve these problems automatically, but it is still in alpha today.

-- willwill
kubernetes

1 Answer

5/15/2018

What I've been doing is to use Telegraf to collect resource usage and set the requests to the 95th percentile of the observed usage, while the limits are set to 1 CPU and twice the memory request.
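
A minimal sketch of that calculation, assuming the usage samples have already been exported from wherever Telegraf writes them into plain lists. The function names, the nearest-rank percentile, and the sample values are all assumptions for illustration; capping the CPU request at the 1 CPU limit is also my addition, since Kubernetes does not accept a request larger than its limit.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list, with p as an integer 1..100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_resources(cpu_millicores, memory_mib, p=95):
    """Requests at the p-th percentile; limits at 1 CPU and twice the memory request."""
    cpu_limit = 1000  # 1 CPU, in millicores
    # Assumption: cap the request so it never exceeds the fixed CPU limit.
    cpu_request = min(math.ceil(percentile(cpu_millicores, p)), cpu_limit)
    mem_request = math.ceil(percentile(memory_mib, p))
    return {
        "requests": {"cpu": f"{cpu_request}m", "memory": f"{mem_request}Mi"},
        "limits": {"cpu": f"{cpu_limit}m", "memory": f"{2 * mem_request}Mi"},
    }


# Hypothetical samples: CPU in millicores, memory in MiB.
cpu_samples = [10 * i for i in range(1, 21)]
memory_samples = [256] * 19 + [512]
print(suggest_resources(cpu_samples, memory_samples))
```

The returned dictionary mirrors the shape of the `resources` field in a container spec, so it can be dumped to YAML or pasted into a manifest by hand.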


The problems with this method are:

  • Apps that use multiple cores during startup, then stay under one core for the rest of their life, will take longer to start. I've observed a 2-minute Spring startup become 5 minutes.
  • Apps that are rarely used will have fewer resources reserved, and so have to rely on burst capacity when they get invoked. This could be a problem if such an app has a surge in popularity.
-- willwill
Source: StackOverflow