Build a Kubernetes cluster with a Spark master and Spark workers

9/6/2021

I've built a custom-spark docker image with the following dependencies:

  • Python 3.6.9
  • Pip 1.18
  • Java OpenJDK 64-Bit Server VM, 1.8.0_212
  • Hadoop 3.2
  • Scala 2.13.0
  • Spark 3.0.3

I pushed the image to Docker Hub: https://hub.docker.com/r/redaer7/custom-spark

The Dockerfile, spark-master, and spark-worker files are stored at: https://github.com/redaER7/Custom-Spark

I verified that /spark-master and /spark-worker work well when I create a container from that image:

docker run -it -d --name spark_1 redaer7/custom-spark:1.0 bash
docker exec -it spark_1 /bin/bash

My issue arises when I try to build a K8s cluster from that image, using the following YAML file for the Spark master pod:

kubectl create namespace sparkspace
kubectl -n sparkspace create -f ./spark-master-deployment.yaml

# spark-master-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: spark-master
spec:
  replicas: 1 # should always be one
  selector:
    matchLabels:
      component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
        - name: spark-master
          image: redaer7/custom-spark:1.0
          imagePullPolicy: IfNotPresent
          command: ["/spark-master"]
          ports:
          - containerPort: 7077
          - containerPort: 8080
          resources:
          #  limits:
          #    cpu: 1
          #    memory: 1G
            requests:
              cpu: 1 #100m
              memory: 1G

The pod shows CrashLoopBackOff when I list it with kubectl -n sparkspace get pods

This is what I see when inspecting it with kubectl -n sparkspace describe pod $Pod_Name:

[screenshot of the describe output]

Any clue about the first warning? Thank you.

-- Reda E.
apache-spark
docker
kubernetes

1 Answer

9/9/2021

I solved it by forcing the image to be re-pulled:

imagePullPolicy: Always

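This was needed because I had edited the Docker image locally but had left the following unchanged in the config file:

imagePullPolicy: IfNotPresent

With IfNotPresent, an image that is already cached on the node is reused rather than pulled again. I then pushed the edited image to Docker Hub for later deployment.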
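
The difference between the two policies can be sketched as a small model. This is only a simplified illustration of the documented imagePullPolicy rules, not kubelet code; the function name kubelet_contacts_registry and its parameters are hypothetical, and the example assumes the image was already cached on the node when the pod was created.

```python
def kubelet_contacts_registry(pull_policy: str, image_cached_on_node: bool) -> bool:
    """Return True if the kubelet would go to the registry before starting the container.

    Simplified model of imagePullPolicy; names here are hypothetical.
    """
    if pull_policy == "Always":
        # The registry is queried on every container start to resolve the tag,
        # so an image re-pushed under the same tag is picked up.
        return True
    if pull_policy == "IfNotPresent":
        # A copy cached on the node is reused as-is, even if the tag
        # now points to a newer image in the registry.
        return not image_cached_on_node
    if pull_policy == "Never":
        # The registry is never contacted; only a local copy can be used.
        return False
    raise ValueError(f"unknown imagePullPolicy: {pull_policy!r}")


# Assumed situation: redaer7/custom-spark:1.0 is already cached on the node.
print(kubelet_contacts_registry("IfNotPresent", image_cached_on_node=True))  # False
print(kubelet_contacts_registry("Always", image_cached_on_node=True))        # True
```

Another option that does not depend on the pull policy would be to push each rebuilt image under a new tag (for example 1.1, a hypothetical tag) and reference that tag in the Deployment.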

-- Reda E.
Source: StackOverflow