Create a JVM heap dump when a K8s health check restarts the pod - no OOM occurs

8/2/2020

I have a situation where a really long GC pause occurs all of a sudden, and I need to find the source of the sudden memory allocation. The long GC pause (around 30 seconds) causes the pod to fail several K8s health checks in a row, and the pod gets restarted without an OOM actually happening. I want to create a heap dump before K8s actually restarts the pod. I realise that the dump should be written to some external persistent mount.

The only idea I have for triggering the heap dump is to use the preStop hook. The question is, whether the preStop hook is fired when the pod is restarted because of health check failure?

Maybe there is a more elegant solution to this?

-- Kikosha
heap-dump
java
jvm
kubernetes
kubernetes-health-check

1 Answer

8/2/2020

The question is, whether the preStop hook is fired when the pod is restarted because of health check failure?

Yes. As per the definition, the preStop hook runs immediately before a container is terminated due to an API request or a management event such as liveness probe failure, preemption, resource contention and others.


Should I use preStop hook to capture Java Heap Dump before pod termination?

Yes, but you need to be careful: a call to the preStop hook fails if the container is already in a terminated or completed state. When the pod is terminated, Kubernetes waits for the grace period (30 seconds by default, plus a one-off extension of 2 seconds if the preStop hook has not completed) before sending the KILL signal. If the preStop hook needs longer to complete than the default grace period allows, you must increase terminationGracePeriodSeconds to suit this.
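
As a rough sketch of the preStop approach, a pod spec along the following lines might work. The pod name, image, PVC name, and the 120-second grace period are all assumptions for illustration. The sketch also assumes that the image ships a JDK (so that jcmd is available) and that the JVM runs as PID 1 in the container. Note that jcmd has to attach to the JVM, so if the JVM is still stuck in the long GC pause the dump will only start once the pause ends, which is another reason to be generous with the grace period.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: yourapp                        # hypothetical pod name
spec:
  # Assumed value: must be long enough for the heap dump to finish.
  terminationGracePeriodSeconds: 120
  containers:
    - name: app
      image: yourapp:latest            # hypothetical image; must contain a JDK for jcmd
      command: ["java", "-jar", "yourapp.jar"]
      lifecycle:
        preStop:
          exec:
            # Assumes the JVM is PID 1; the timestamp avoids clashing
            # with a dump file left over from an earlier restart.
            command:
              - "sh"
              - "-c"
              - "jcmd 1 GC.heap_dump /dumps/prestop-$(date +%s).hprof"
      volumeMounts:
        - name: dumps
          mountPath: /dumps
  volumes:
    - name: dumps
      persistentVolumeClaim:
        claimName: heap-dumps          # hypothetical, pre-existing PVC
```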


Any more elegant solution to this?

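Not that I am aware of. I guess that adding an emptyDir volume to the pod and configuring the JVM to write its heap dumps to that directory, with command: ["java", "-XX:+HeapDumpOnOutOfMemoryError", "-XX:HeapDumpPath=/dumps/oom.bin", "-jar", "yourapp.jar"], should work. Keep in mind that -XX:+HeapDumpOnOutOfMemoryError only writes a dump when an OutOfMemoryError is actually thrown, so in the no-OOM case the dump itself still has to be triggered from the preStop hook, and the emptyDir volume is simply the place to write it.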

Why will the above solution work?

When Kubernetes kills your container because it is not responding to the health check, it just restarts the container; it does not reschedule the pod, so the pod is not moved to another node. The emptyDir volume is therefore not deleted until the pod is removed from the node. When the container is restarted, the new container mounts the same emptyDir, which still contains the heap dump from the previous run, so you can kubectl cp those files at any time after the event. There might be other challenges in copying the heap dump files, but they are solvable. Check this for more info.
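
To make the emptyDir idea concrete, here is a minimal sketch of a pod spec that mounts an emptyDir volume at /dumps and passes the JVM flags quoted above. The pod name, container name, and image are placeholders, not anything from the original setup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: yourapp                        # hypothetical pod name
spec:
  containers:
    - name: app
      image: yourapp:latest            # hypothetical image
      command:
        - "java"
        - "-XX:+HeapDumpOnOutOfMemoryError"   # fires only on an actual OutOfMemoryError
        - "-XX:HeapDumpPath=/dumps/oom.bin"
        - "-jar"
        - "yourapp.jar"
      volumeMounts:
        - name: dumps
          mountPath: /dumps            # dumps written here survive container restarts
  volumes:
    - name: dumps
      emptyDir: {}                     # lives as long as the pod stays on this node
```

With a spec like this, something like kubectl cp yourapp:/dumps/oom.bin ./oom.bin should retrieve the file after the container has been restarted, as long as the pod itself has not been deleted or evicted.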

-- mchawre
Source: StackOverflow