Persist heap dump in case of OOM in kubernetes pod?

2/16/2022

I need to persist the heap dump when the Java process runs out of memory (OOM) and the pod is restarted.

I have added the following to the JVM args:

-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dumps

...and an emptyDir volume is mounted on the same path.

The problem is that if the pod is restarted and gets scheduled on a different node, we lose the heap dump. How do I persist the heap dump even if the pod is scheduled on a different node?

We are using AWS EKS and run more than one replica of the pod.

Could anyone help with this, please?

-- Baitanik
amazon-eks
amazon-web-services
heap-dump
kubernetes
persistent-volumes

2 Answers

2/18/2022

As writing to EFS is too slow in your case, there is another option for AWS EKS - awsElasticBlockStore.

The contents of an EBS volume are persisted and the volume is unmounted when a pod is removed. This means that an EBS volume can be pre-populated with data, and that data can be shared between pods.

Note: You must create an EBS volume by using aws ec2 create-volume or the AWS API before you can use it.

There are some restrictions when using an awsElasticBlockStore volume:

  • the nodes on which pods are running must be AWS EC2 instances
  • those instances need to be in the same region and availability zone as the EBS volume
  • EBS only supports a single EC2 instance mounting a volume
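
A minimal sketch of a pod that mounts such a volume at the heap dump path from the question. The pod name, image, and volume ID are placeholders, not values from the question. Note that the in-tree awsElasticBlockStore volume type has been deprecated in favor of the EBS CSI driver and removed in newer Kubernetes releases, so check that your cluster version still supports it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-app                      # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/java-app:latest   # placeholder image
      volumeMounts:
        - name: dumps
          mountPath: /opt/dumps       # same path as -XX:HeapDumpPath
  volumes:
    - name: dumps
      awsElasticBlockStore:
        volumeID: "<volume id>"       # ID returned by aws ec2 create-volume
        fsType: ext4
```

Because an EBS volume can be attached to only one instance at a time, a setup with several replicas would likely need one volume per pod, for example a StatefulSet with volumeClaimTemplates.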

Please check the official Kubernetes documentation page on this topic, as well as "How to use persistent storage in EKS".

-- mozello
Source: StackOverflow

2/16/2022

You will have to persist the heap dumps on a network location shared between the pods. To achieve this, you will need to provide persistent volume claims; in EKS, this can be done with an Elastic File System mounted in different availability zones. You can start learning about it by reading this guide about EFS-based PVCs.
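
A rough sketch of what an EFS-backed claim and its mount could look like, assuming the EFS CSI driver is installed and a StorageClass for it already exists. The names `efs-sc`, `heap-dumps`, and `java-app`, and the image, are hypothetical. The `subPathExpr` mount is an optional addition that gives each pod its own subdirectory, so dumps from different replicas do not collide on the shared file system.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: heap-dumps                    # hypothetical claim name
spec:
  accessModes:
    - ReadWriteMany                   # shared by pods on different nodes
  storageClassName: efs-sc            # assumed EFS-backed StorageClass
  resources:
    requests:
      storage: 5Gi                    # required field; arbitrary value here
---
apiVersion: v1
kind: Pod
metadata:
  name: java-app                      # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/java-app:latest   # placeholder image
      env:
        - name: POD_NAME              # pod name via the downward API
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: JAVA_TOOL_OPTIONS     # JVM args from the question
          value: "-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/dumps"
      volumeMounts:
        - name: dumps
          mountPath: /opt/dumps
          subPathExpr: $(POD_NAME)    # one subdirectory per pod
  volumes:
    - name: dumps
      persistentVolumeClaim:
        claimName: heap-dumps
```

With more than one replica, the same volume and mount definitions would go into the pod template of the Deployment instead of a standalone Pod.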

-- Allan Chua
Source: StackOverflow