Process inside Pod is OOMKilled even though Pod limits not reached

11/7/2019

We're fairly new to the whole Kubernetes world, but by now have a number of services running in GKE. Today, however, we saw some strange behaviour: one of the processes running inside one of our Pods was killed, even though the Pod had plenty of resources available and wasn't anywhere near its limits.

The limits are defined as such:

resources:
  requests:
    cpu: 100m
    memory: 500Mi
  limits:
    cpu: 1000m
    memory: 1500Mi

Inside the pod, a Celery (Python) worker is running, and this particular one is consuming some fairly long-running tasks.

While one of these tasks was running, the celery process was suddenly killed, seemingly by the OOM killer. The GKE Cluster Operations logs show the following:

Memory cgroup out of memory: Kill process 613560 (celery) score 1959 or sacrifice child
Killed process 613560 (celery) total-vm:1764532kB, anon-rss:1481176kB, file-rss:13436kB, shmem-rss:0kB
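
In case it helps, the same kernel message can also be read directly from the node's kernel log. A rough sketch (the node name and zone are just examples, not our actual setup):

# SSH to the node the Pod was scheduled on (node name/zone are placeholders)
gcloud compute ssh gke-cluster-default-pool-abc123 --zone europe-west1-b

# Search the kernel ring buffer for memory-cgroup OOM kills
sudo dmesg -T | grep -i "memory cgroup out of memory"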

The resource graph for the time period looks like the following:

[Graph: CPU and memory usage for the Pod]

As can be clearly seen, neither the CPU nor the memory usage was anywhere near the limits defined for the Pod, so we're baffled as to why any OOM killing occurred. We're also a bit baffled by the fact that the process itself was killed, and not the actual Pod.

Is this particular OOM actually happening inside the OS and not in Kubernetes? And if so, is there a way to get around this particular problem?

-- Tingyou
google-kubernetes-engine
kubernetes

1 Answer

11/7/2019

About your statement:

We're also a bit baffled by the fact that the process itself was killed, and not the actual Pod.

Compute Resources (CPU/Memory) are configured for Containers, not for Pods.

If a Pod's container is OOM killed, the Pod is not evicted. The underlying container is restarted by the kubelet based on its restartPolicy. The Pod will still exist on the same node, and its Restart Count will be incremented (unless you are using restartPolicy: Never, which is not your case).
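
As a rough sketch of where these fields live (the names and image are illustrative, not your actual manifest): resources sit under each entry in containers, while restartPolicy is a Pod-level field:

apiVersion: v1
kind: Pod
metadata:
  name: celery-worker        # illustrative name
spec:
  restartPolicy: Always      # Pod-level; the default, so a killed container is restarted
  containers:
  - name: worker             # illustrative name
    image: my-celery-image   # illustrative image
    resources:               # requests/limits apply to this container's cgroup
      requests:
        cpu: 100m
        memory: 500Mi
      limits:
        cpu: 1000m
        memory: 1500Mi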

If you run kubectl describe on your Pod, the newly spawned container will be in a Running state, but you can find the cause of the last restart under Last State, as well as how many times the container has been restarted:

State:          Running
  Started:      Wed, 27 Feb 2019 10:29:09 +0000
Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137
  Started:      Wed, 27 Feb 2019 06:27:39 +0000
  Finished:     Wed, 27 Feb 2019 10:29:08 +0000
Restart Count:  5
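
If you just want those fields without the full describe output, here is a small sketch using jsonpath (the pod name is a placeholder, and the container index assumes a single-container Pod):

# Reason and exit code of the last termination of the first container
kubectl get pod <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'

# Restart count for the same container
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].restartCount}{"\n"}'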

The Resource Graph visualization may deviate from the actual memory usage. Since it samples at a 1-minute interval (mean), a sudden spike above the memory limit can get the container killed and restarted before the peak is ever plotted on the graph. If your Python container has short, intermittent bursts of high memory usage, it is prone to being restarted even though those values never appear in the graph.
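
One way to catch those short spikes is the cgroup's own high-water mark, which persists between samples. A sketch, assuming the node uses cgroup v1 (the default on GKE at the time) and placeholder pod/container names:

# Peak memory (bytes) recorded by the container's memory cgroup since the container started
kubectl exec <pod-name> -c <container-name> -- \
  cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes

# The enforced limit, for comparison (1500Mi = 1572864000 bytes)
kubectl exec <pod-name> -c <container-name> -- \
  cat /sys/fs/cgroup/memory/memory.limit_in_bytes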

With kubectl top you can view the last memory usage registered for the Pod. While that is more precise for seeing memory usage at a specific point in time, keep in mind that it fetches the values from metrics-server, which has a --metric-resolution flag:

The interval at which metrics will be scraped from Kubelets (defaults to 60s).

If your container makes "spiky" use of memory, you may still see it being restarted without ever seeing the high memory usage in kubectl top.
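
For completeness, a sketch of checking the per-container numbers with kubectl top (the pod name is a placeholder):

# Per-container CPU/memory as last scraped by metrics-server
kubectl top pod <pod-name> --containers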

-- Eduardo Baitello
Source: StackOverflow