I am running a Spark job in a Kubernetes cluster using the spark-submit command below:
bin/spark-submit \
--master k8s://https://api-server-host:443 \
--deploy-mode cluster \
--name spark-job-name \
--conf spark.kubernetes.namespace=spark \
--conf spark.kubernetes.container.image=docker-repo/pyspark:55 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-submit \
--conf spark.kubernetes.pyspark.pythonVersion=3 \
--conf spark.executor.memory=4G \
--files local:///mnt/conf.json \
local:///mnt/ingest.py

When I check the request and limit for the executor pod, it shows the following. Almost 1700 MiB extra has been allocated for the pod:
Limits:
memory: 5734Mi
Requests:
cpu: 4
memory: 5734Mi

Why is that?
In addition to @CptDolphin's answer, be aware that Spark always allocates spark.executor.memoryOverhead on top of spark.executor.memory (the larger of 10% of spark.executor.memory or 384 MiB, unless explicitly configured), and may allocate additional spark.executor.pyspark.memory if you set that in your configuration. On Kubernetes, non-JVM jobs such as PySpark use a higher default overhead factor of 0.4 (spark.kubernetes.memoryOverheadFactor), which is why your 4096 MiB executor becomes a 4096 × 1.4 ≈ 5734 MiB pod.
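If you want the pod size to be predictable, you can pin the overhead explicitly at submit time instead of relying on the factor; the values below are illustrative, not recommendations:

```shell
# Fix the per-executor overhead to an absolute amount...
--conf spark.executor.memoryOverhead=512m \
# ...or lower the Kubernetes overhead factor (defaults to 0.4 for non-JVM jobs)
--conf spark.kubernetes.memoryOverheadFactor=0.1 \
```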
What you define the pod (as an individual system) can use is one thing; what you define Spark, Java, or any other app running inside that system (the pod) can use is another. Think of it as a normal computer with limits, and then your application with its own limits.
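The arithmetic can be sketched as below, assuming Spark's documented behavior (overhead is the factor times executor memory, truncated, with a 384 MiB floor); the function name is mine, not a Spark API:

```python
def pod_memory_mib(executor_memory_mib, overhead_factor=0.4, overhead_min_mib=384):
    """Approximate the executor pod memory request Spark asks Kubernetes for."""
    # Spark truncates the factor-based overhead and enforces a 384 MiB minimum
    overhead = max(int(executor_memory_mib * overhead_factor), overhead_min_mib)
    return executor_memory_mib + overhead

# 4 GiB executor with the 0.4 non-JVM (PySpark) factor matches the pod spec above
print(pod_memory_mib(4096))        # 5734
# The same executor with the 0.1 JVM factor
print(pod_memory_mib(4096, 0.1))   # 4505
```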