I am serving Jupyter notebooks from a Kubernetes cluster, and I've set resources.limits
to prevent a single user from draining all of the host server's memory.
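For reference, the limit is set on the container spec roughly like this (the values here are placeholders, not my actual configuration):

```yaml
resources:
  requests:
    memory: "1Gi"
  limits:
    memory: "2Gi"
```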
The problem is that when a kernel exceeds the memory limit, the container is killed and automatically restarted without ever raising an OOM error, which leaves the user very confused.
So how can I make the Jupyter notebook raise an OOM error when running under Kubernetes?
Have a look at the solution in the post "Jupyter notebook: memory usage for each notebook" for a way to kill a notebook once it exceeds a certain amount of memory.
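A minimal sketch of that idea, adapted for a container: read the cgroup memory limit and use `resource.setrlimit` to cap the kernel's address space just below it, so big allocations fail inside Python with a `MemoryError` instead of the kernel OOM killer silently terminating the process. The cgroup paths assume the container sees its own cgroup at the root (typical for Kubernetes pods); the 0.9 headroom and the helper names are my own assumptions, not from the linked post:

```python
import resource

# Assumed standard cgroup locations for the container memory limit:
# cgroup v2 exposes memory.max, cgroup v1 exposes memory.limit_in_bytes.
CGROUP_V2_LIMIT = "/sys/fs/cgroup/memory.max"
CGROUP_V1_LIMIT = "/sys/fs/cgroup/memory/memory.limit_in_bytes"

def read_cgroup_limit():
    """Return the container memory limit in bytes, or None if unknown/unlimited.

    Note: on cgroup v1 an unconstrained container reports a huge sentinel
    value rather than "max", so treat very large results with suspicion.
    """
    for path in (CGROUP_V2_LIMIT, CGROUP_V1_LIMIT):
        try:
            raw = open(path).read().strip()
        except OSError:
            continue  # file absent on this cgroup version / host
        if raw != "max":
            return int(raw)
    return None

def soft_limit(limit_bytes, headroom=0.9):
    """Pick a cap slightly below the cgroup limit (headroom is an assumption)."""
    return int(limit_bytes * headroom)

def apply_limit():
    """Cap the address space so allocations raise MemoryError before the
    container hits its cgroup limit and gets OOM-killed."""
    limit = read_cgroup_limit()
    if limit is not None:
        cap = soft_limit(limit)
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
```

You could run this from a file in `~/.ipython/profile_default/startup/` so every kernel applies the cap on launch. Note that `RLIMIT_AS` caps virtual address space, which is only a rough proxy for the resident-memory accounting the cgroup limit uses, so leave generous headroom.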
nbtop also offers a nice top-like interface for monitoring notebooks.
If you only need to watch one specific pod, you can monitor its events and logs, as in:
kubectl get events --watch
kubectl logs -f podname
That being said, not all events in a pod's lifecycle are reported reliably, as discussed in kubernetes/kubernetes
issue 38532 and the (abandoned) PR 45682.
But you should still see OOMKilled: true
when running docker inspect
on the container on its node.