I am running a Spark job on a Kubernetes cluster using the spark-submit command below:
bin/spark-submit \
--master k8s://https://api-server-host:443 \
--deploy-mode cluster \
--name spark-job-name \
--conf spark.kubernetes.namespace=spark \
--conf spark.kubernetes.container.image=docker-repo/pyspark:55 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-submit \
--conf spark.kubernetes.pyspark.pythonVersion=3 \
--conf spark.executor.memory=4G \
--files local:///mnt/conf.json \
local:///mnt/ingest.py
and when I check the requests and limits for the executor pod, it shows the following. Almost 1700 MB of extra memory has been allocated to the pod.
Limits:
  memory:  5734Mi
Requests:
  cpu:     4
  memory:  5734Mi
Why is that?
In addition to @CptDolphin's answer, be aware that Spark always allocates spark.executor.memoryOverhead extra memory (the larger of 10% of spark.executor.memory and 384 MB, unless explicitly configured), and may allocate additional spark.executor.pyspark.memory if you have defined it in your configuration.
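As a rough sketch (the overhead and PySpark values below are illustrative, not taken from the question), you could pin these settings explicitly on the spark-submit command line:

--conf spark.executor.memory=4G \
--conf spark.executor.memoryOverhead=512M \
--conf spark.executor.pyspark.memory=1G \

With those illustrative values, the executor pod would request roughly 4096 MiB + 512 MiB + 1024 MiB = 5632 MiB, and Spark sets the pod's memory request and limit to that same total.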
What you define the pod (as an individual system) can use is one thing; what you allow Spark, Java, or any other application running inside that system (pod) to use is another. Think of it as a normal computer with its limits, and then your application running on it with its own limits.
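To make that distinction concrete, this is roughly how the two levels line up (the 5734Mi figure is the one from the question; the breakdown of the Spark-level settings is a simplification):

Spark level (inside the container):
  spark.executor.memory           JVM heap for the executor
  spark.executor.memoryOverhead   off-heap, native, and container overhead
  spark.executor.pyspark.memory   Python worker memory, if configured

Kubernetes level (what the pod spec enforces):
  resources:
    requests:
      memory: 5734Mi   (sum of the Spark-level pieces)
    limits:
      memory: 5734Mi

Kubernetes only sees and enforces the pod-level numbers; the Spark settings decide how that total is carved up inside the container.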