How to do a Scala heap dump in Kubernetes on Azure

2/8/2019

I've got a problem getting an automatic heap dump written to a mounted persistent volume in Microsoft Azure AKS (Kubernetes).

So the situation looks like this:

  • Running the program with -Xmx200m causes an out-of-memory exception
  • After building, pushing, and deploying the Docker image to AKS, the pod is killed and restarted after a few seconds
  • I get the message in hello.txt on the mounted volume, but no dump file is created

What could be the reason for this behaviour?

My test program looks like this:

import java.io._

object Main {

  def main(args: Array[String]): Unit = {

    println("Before printing test info to file")
    val pw = new PrintWriter(new File("/borsuk_data/hello.txt"))
    pw.write("Hello, world")
    pw.close
    println("Before allocating to big Array for current memory settings")
    // ~50M doubles ≈ 400 MB as an array, far beyond the 200 MB heap
    val vectorOfDouble = Range(0, 50 * 1000 * 1000).map(x => 666.0).toArray
    println("After creating to big Array")
  }

}

My entrypoint.sh:

#!/bin/sh
java -jar /root/scala-heap-dump.jar -Xmx200m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/scala-heap-dump.bin

My Dockerfile:

FROM openjdk:jdk-alpine

WORKDIR /root
ADD target/scala-2.12/scala-heap-dump.jar  /root/scala-heap-dump.jar
ADD etc/entrypoint.sh /root/entrypoint.sh
ENTRYPOINT ["/bin/sh","/root/entrypoint.sh"]

My deployment yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: scala-heap-dump
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: scala-heap-dump
    spec:
      containers:
        - name: scala-heap-dump-container
          image: PRIVATE_REPO_ADDRESS/scala-heap-dump:latest
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 500m
              memory: "1Gi"
            limits:
              cpu: 500m
              memory: "1Gi"
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: dynamic-persistence-volume-claim
      dnsPolicy: ClusterFirst
      hostNetwork: false
      imagePullSecrets:
        - name: regsecret

UPDATE: As lawrencegripper pointed out, the first issue was that the pod was OOM killed due to the memory limits in the YAML. After raising the memory to 2560Mi or higher (I've even tried values as generous as cpu: 1000m and memory: 5Gi), I no longer get the OOMKilled reason. However, no dump file is created, and a different message appears under lastState terminated: the reason is simply Error. Unfortunately this isn't very helpful. If anybody knows how to narrow it down, please help.
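For reference, the pod events often say more than the one-word reason; I've been checking them like this (the pod name below is a placeholder):

```shell
# The Events section at the bottom of the output usually explains the
# termination better than the bare "Error" reason; pod name is a placeholder
kubectl describe pod scala-heap-dump-xxxxx
```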

UPDATE 2: I've added some printlns to the code to get a better understanding of what's going on. The logs for the killed pod are:

Before printing test info to file
Before allocating to big Array for current memory settings
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at scala.reflect.ManifestFactory$DoubleManifest.newArray(Manifest.scala:153)
        at scala.reflect.ManifestFactory$DoubleManifest.newArray(Manifest.scala:151)
        at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:285)
        at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:283)
        at scala.collection.AbstractTraversable.toArray(Traversable.scala:104)
        at Main$.main(Main.scala:12)
        at Main.main(Main.scala)

So as you can see, the program never reaches println("After creating to big Array").

-- CodeDog
azure
azure-aks
docker
kubernetes
scala

2 Answers

2/13/2019

I think the problem is the entrypoint.sh command.

> java --help
Usage: java [options] <mainclass> [args...]
       (to execute a class)
   or  java [options] -jar <jarfile> [args...]
       (to execute a jar file)

Note that anything after -jar <jarfile> is passed as arguments to your application, not to the JVM.

Try:

java -Xmx200m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/scala-heap-dump.bin -jar /root/scala-heap-dump.jar 
-- DanLebrero
Source: StackOverflow

2/8/2019

It's a long shot, but one possibility is that Kubernetes is killing the pod because it breaches the memory limit set in the YAML while it's building the dump, before it can write it to disk.

Use kubectl get pod <yourPodNameHere> --output=yaml to get the pod information back, and look under lastState for Reason: OOMKilled.

https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
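If you only want that one field rather than the whole YAML, a jsonpath query works too (pod name is a placeholder):

```shell
# Prints OOMKilled if the container was killed for exceeding its memory limit
kubectl get pod scala-heap-dump-xxxxx \
  --output=jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```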

-- lawrencegripper
Source: StackOverflow