I've seen articles recommending that resource requests and limits be set. However, none that I've found discuss what numbers to fill in.
For example, consider a container that uses zero CPU while idle, 80% under normal user requests, and 200% CPU when hit by some rare requests:
There are also cases like
Ideally, the Vertical Pod Autoscaler would probably solve these problems automatically, but it is still in alpha today.
What I've been doing is using telegraf to collect resource usage and setting the request to the 95th percentile of that usage, while the limit is set to 1 CPU and twice the memory request.
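
To make the rule concrete, here is a minimal sketch of how I turn the collected samples into numbers. It assumes the usage samples have already been exported from telegraf as plain lists of integers (CPU in millicores, memory in MiB). The function names `percentile` and `suggest_resources` are just illustrative, and the nearest-rank percentile is one of several possible definitions.

```python
import math


def percentile(samples, pct):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def suggest_resources(cpu_millicores, memory_mib, pct=95):
    """Derive requests/limits from usage samples (assumed to be integers).

    Requests: the pct-th percentile of observed usage.
    Limits: a fixed 1 CPU, and twice the memory request.
    """
    cpu_request = percentile(cpu_millicores, pct)
    mem_request = percentile(memory_mib, pct)
    return {
        "requests": {"cpu": f"{cpu_request}m", "memory": f"{mem_request}Mi"},
        "limits": {"cpu": "1000m", "memory": f"{2 * mem_request}Mi"},
    }
```

The returned dictionary has the same shape as the `resources` field of a container spec, so it can be dumped to YAML and pasted into a manifest.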
The problem with this method is