Difference between requested and allocated memory - Spark on Kubernetes

3/16/2020

I am running a Spark job in a Kubernetes cluster using the spark-submit command below:

bin/spark-submit \
    --master k8s://https://api-server-host:443 \
    --deploy-mode cluster \
    --name spark-job-name \
    --conf spark.kubernetes.namespace=spark \
    --conf spark.kubernetes.container.image=docker-repo/pyspark:55 \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-submit \
    --conf spark.kubernetes.pyspark.pythonVersion=3 \
    --conf spark.executor.memory=4G \
    --files local:///mnt/conf.json \
    local:///mnt/ingest.py

When I check the request and limit for the executor pod, it shows the following. Almost 1700 MB of extra memory has been allocated for the pod.

Limits:
  memory:  5734Mi
Requests:
  cpu:     4
  memory:  5734Mi

Why is that?

-- karthikeayan
apache-spark
kubernetes

2 Answers

3/16/2020

In addition to @CptDolphin's answer, be aware that Spark always allocates spark.executor.memoryOverhead extra memory (the max of 10% of spark.executor.memory or 384MB, unless explicitly configured), and may allocate additional spark.executor.pyspark.memory if you have defined that in your configuration.
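
For example, if you want to pin the overhead yourself instead of relying on the default factor, you can set it explicitly on the same spark-submit command; this is a minimal sketch based on the command in the question, and the 512m value is only illustrative, not a recommendation:

bin/spark-submit \
    --master k8s://https://api-server-host:443 \
    --deploy-mode cluster \
    --name spark-job-name \
    --conf spark.kubernetes.namespace=spark \
    --conf spark.executor.memory=4G \
    --conf spark.executor.memoryOverhead=512m \
    local:///mnt/ingest.py

With the defaults, the overhead for this job would be max(0.10 x 4096 MiB, 384 MiB) = 410 MiB. The 5734Mi shown in the question is 4096 MiB x 1.4, which is consistent with the larger default overhead factor Spark on Kubernetes applies to PySpark jobs unless you override it.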

-- mazaneicha
Source: StackOverflow

3/16/2020

What you define the pod (as an individual system) can use is one thing; what you define Spark, Java, or any other app running inside that system (pod) can use is another. Think of it as a normal computer with its limits, and then your application with its own limits.
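
As a hedged illustration of that distinction (the executor pod name below is hypothetical, and this assumes ps is available in the container image), you can compare the limit Kubernetes enforces on the pod with the heap size the executor JVM was actually started with:

# Pod-level memory limit enforced by Kubernetes (executor memory + overhead)
kubectl -n spark get pod spark-job-name-exec-1 \
    -o jsonpath='{.spec.containers[0].resources.limits.memory}'

# Heap size the executor JVM was launched with (spark.executor.memory)
kubectl -n spark exec spark-job-name-exec-1 -- \
    sh -c 'ps -ef | grep -o "[-]Xmx[^ ]*" | head -1'

The first command should print 5734Mi, while the JVM's -Xmx corresponds to the 4G you asked for; the gap between them is the pod-level headroom described in the other answer.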

-- CptDolphin
Source: StackOverflow