CrashLoopBackOff in Spark cluster on Kubernetes: nohup: can't execute '--': No such file or directory

6/20/2017

Dockerfile:

FROM openjdk:8-alpine

RUN apk update && \
        apk add curl bash procps

ENV SPARK_VER 2.1.1
ENV HADOOP_VER 2.7
ENV SPARK_HOME /opt/spark

# Get Spark from US Apache mirror.
RUN mkdir -p /opt && \
    cd /opt && \
    curl http://www.us.apache.org/dist/spark/spark-${SPARK_VER}/spark-${SPARK_VER}-bin-hadoop${HADOOP_VER}.tgz | \
        tar -zx && \
    ln -s spark-${SPARK_VER}-bin-hadoop${HADOOP_VER} spark && \
    echo Spark ${SPARK_VER} installed in /opt

ADD start-common.sh start-worker.sh start-master.sh /
RUN chmod +x /start-common.sh /start-master.sh /start-worker.sh
ENV PATH $PATH:/opt/spark/bin

WORKDIR $SPARK_HOME
EXPOSE 4040 6066 7077 8080

CMD ["spark-shell", "--master", "local[2]"]

spark-master-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: spark-master
  labels:
    name: spark-master
spec:
  type: NodePort
  ports:
    # the port that this service should serve on
  - name: webui
    port: 8080
    targetPort: 8080
  - name: spark
    port: 7077
    targetPort: 7077
  - name: rest
    port: 6066
    targetPort: 6066
  selector:
    name: spark-master

spark-master.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    name: spark-master
  name: spark-master
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: spark-master
    spec:
      containers:
      - name: spark-master
        imagePullPolicy: "IfNotPresent"
        image: spark-2.1.1-bin-hadoop2.7
        ports:
        - containerPort: 8080
        - containerPort: 7077
        - containerPort: 6066
        command: ["/start-master.sh"]

Error:

Back-off restarting failed docker container
Error syncing pod, skipping: failed to "StartContainer" for "spark-master" with CrashLoopBackOff: "Back-off 10s restarting failed container=spark-master pod=spark-master-286530801-7qv4l_default(34fecb5e-55eb-11e7-994e-525400f3f8c2)"

Any idea? Thanks

UPDATE

2017-06-20T19:43:56.300935235Z starting org.apache.spark.deploy.master.Master, logging to /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-spark-master-1682838347-9927h.out
2017-06-20T19:44:03.414011228Z failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.master.Master --host spark-master-1682838347-9927h --port 7077 --webui-port 8080 --ip spark-master --port 7077
2017-06-20T19:44:03.418640516Z   nohup: can't execute '--': No such file or directory
2017-06-20T19:44:03.419814788Z full log in /opt/spark/logs/spark--org.apache.spark.deploy.master.Master-1-spark-master-1682838347-9927h.out

2017-06-20T19:43:50.343251857Z starting org.apache.spark.deploy.worker.Worker, logging to /opt/spark/logs/spark--org.apache.spark.deploy.worker.Worker-1-spark-worker-243125562-0lh9k.out
2017-06-20T19:43:57.450929613Z failed to launch: nice -n 0 /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://spark-master:7077
2017-06-20T19:43:57.465409083Z   nohup: can't execute '--': No such file or directory
2017-06-20T19:43:57.466372593Z full log in /opt/spark/logs/spark--org.apache.spark.deploy.worker.Worker-1-spark-worker-243125562-0lh9k.out
-- BAE
apache-spark
docker
dockerfile
kubernetes

2 Answers

6/20/2017

This is just an idea; I haven't looked into it very much.

I imagine start-master.sh might be looking for start-common.sh, since normally both would be on the PATH, but in this Dockerfile they are added to /. Perhaps you could try

ENV PATH $PATH:/:/opt/spark/bin

or just add these scripts into /opt/spark/bin instead.
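
To test this theory inside the container before rebuilding, something like the following could be run from a shell in the image. It is only a sketch: missing_on_path is a helper name made up for this example, and the three script names are the ones added by the Dockerfile in the question.

```shell
# Print every given name that cannot be resolved through PATH.
# Returns non-zero if at least one name is missing.
missing_on_path() {
  rc=0
  for name in "$@"; do
    if ! command -v "$name" >/dev/null 2>&1; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

if missing="$(missing_on_path start-common.sh start-master.sh start-worker.sh)"; then
  echo "all start scripts are on PATH"
else
  echo "not on PATH:" $missing
fi
```

If any script shows up as missing, adding / to PATH (or moving the scripts) as suggested above should make it resolvable.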

-- Janos Lenart
Source: StackOverflow

6/21/2017

The version of nohup that ships with Alpine does not support '--'. You need to install the GNU version of nohup through the coreutils Alpine package in your Dockerfile, like this:

RUN apk --update add coreutils
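
To confirm whether a given image is affected, the same invocation form can be tried with a harmless command. This is a rough sketch: nohup_accepts_dashdash is a name made up here, and the check only assumes that an affected nohup fails when '--' precedes the command, as in the error above.

```shell
# Succeeds only if the given nohup (default: the one on PATH) treats "--"
# as an end-of-options marker instead of as the command to execute.
nohup_accepts_dashdash() {
  "${1:-nohup}" -- true >/dev/null 2>&1
}

if nohup_accepts_dashdash; then
  echo "nohup handles '--'"
else
  echo "nohup does not handle '--'; consider installing coreutils"
fi
```

Running it before and after adding coreutils should show the result flip if this is indeed the cause.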

Alternatively, create your own start script that runs the class directly, and run that instead:

/usr/spark/bin/spark-submit --class org.apache.spark.deploy.master.Master $SPARK_MASTER_INSTANCE --port $SPARK_MASTER_PORT --webui-port $SPARK_WEBUI_PORT
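
As a fuller sketch of such a script, the version below calls spark-class with the same class names and arguments that appear in the "failed to launch" lines of the question's log, so nohup is never involved. The file name /start-spark.sh, the run_role function, the DRY_RUN switch and the SPARK_MASTER_URL variable are all made up for this example, and the /opt/spark default comes from the question's Dockerfile rather than from the path used above.

```shell
#!/bin/bash
# Hypothetical /start-spark.sh: run a Spark master or worker in the foreground.
# All defaults are assumptions taken from the question's Dockerfile and logs.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"

run_role() {
  role="$1"
  case "$role" in
    master)
      set -- org.apache.spark.deploy.master.Master \
        --host "$(hostname)" \
        --port "${SPARK_MASTER_PORT:-7077}" \
        --webui-port "${SPARK_MASTER_WEBUI_PORT:-8080}"
      ;;
    worker)
      set -- org.apache.spark.deploy.worker.Worker \
        --webui-port "${SPARK_WORKER_WEBUI_PORT:-8081}" \
        "${SPARK_MASTER_URL:-spark://spark-master:7077}"
      ;;
    *)
      echo "unknown role: $role (expected master or worker)" >&2
      return 1
      ;;
  esac

  if [ -n "${DRY_RUN:-}" ]; then
    # Only print the command that would be executed.
    echo "$SPARK_HOME/bin/spark-class" "$@"
  else
    # Replace the shell so the JVM becomes the container's main process.
    exec "$SPARK_HOME/bin/spark-class" "$@"
  fi
}

# Launch only when a role is passed, e.g. /start-spark.sh master
if [ "$#" -gt 0 ]; then
  run_role "$1"
fi
```

In spark-master.yaml this would be wired in with command: ["/start-spark.sh", "master"]. Setting DRY_RUN=1 prints the command instead of running it, which is handy for checking the arguments first.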

-- Jonathan Wickens
Source: StackOverflow