What is the use case of setting memory request less than limit in . K8s

1/14/2019

I understand the use case of setting CPU request less than limit - it allows for CPU bursts in each container if the instances has free CPU hence resulting in max CPU utilization. However, I cannot really find the use case for doing the same with memory. Most application don't release memory after allocating it, so effectively applications will request up to 'limit' memory (which has the same effect as setting request = limit). The only exception is containers running on an instance that already has all its memory allocated. I don't really see any pros in this, and the cons are more nondeterministic behaviour that is hard to monitor (one container having higher latencies than the other due to heavy GC). Only use case I can think of is a shaded in memory cache, where you want to allow for a spike in memory usage. But even in this case one would be running the risk of of one of the nodes underperforming.

-- EugeneMi
kubernetes

2 Answers

1/15/2019

Actually the topic you mention is interesting and in the meantime complex, just as Linux memory management is. As we know when the process is using more memory than the limit it will quickly move up on the potential "to-kill" process "ladder". Going further, the purpose of limit is to tell the kernel when it should consider the process to be potentially killed. Requests on the other hand are a direct statement "my container will need this much memory", but other than that they provide valuable information to the Scheduler about where can the Pod be scheduled (based on available Node resources).

If there is no memory request and high limit, Kubernetes will default the request to the limit (this might result in scheduling fail, even if the pods real requirements are met).

If you set a request, but not limit - container will use the default limit for namespace (if there is none, it will be able to use the whole available Node memory)

Setting memory request which is lower than limit you will give your pods room to have activity bursts. Also you make sure that a memory which is available for Pod to consume during boost is actually a reasonable amount.

Setting memory limit == memory request is not desirable simply because activity spikes will put it on a highway to be OOM killed by Kernel. The memory limits in Kubernetes cannot be throttled, if there is a memory pressure that is the most probable scenario (lets also remember that there is no swap partition).

Quoting Will Tomlin and his interesting article on Requests vs Limits which I highly recommend:

You might be asking if there’s reason to set limits higher than requests. If your component has a stable memory footprint, you probably shouldn’t since when a container exceeds its requests, it’s more likely to be evicted if the worker node encounters a low memory condition.

To summarize - there is no straight and easy answer. You have to determine your memory requirements and use monitoring and alerting tools to have control and be ready to change/adjust the configuration accordingly to needs.

-- aurelius
Source: StackOverflow

1/15/2019

Maybe not a real answer, but a point of view on the subject.

The difference with the limit on CPU and Memory is what happens when the limit is reached. In case of the CPU, the container keeps running but the CPU usage is limited. If memory limit is reached, container gets killed and restarted.

In my use case, I often set the memory request to the amount of memory my application uses on average, and the limit to +25%. This allows me to avoid container killing most of the time (which is good) but of course it exposes me to memory overallocation (and this could be a problem as you mentioned).

-- whites11
Source: StackOverflow