Could not find file in CRD definition for Spark application

1/14/2019

I am using the Spark Operator with Kubernetes. I need a file for the application to run. However, when I define it next to my jar file in the files section of the custom object definition, I get an error which suggests that the file is not found:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at example.SparkExample$.main(SparkExample.scala:37)
    at example.SparkExample.main(SparkExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:846)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:194)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:921)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:932)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

This exception points to SparkExample.scala:37, where I read the application arguments:

  val dataFromFile = readFile(spark.sparkContext, args(0))
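For context, here is a minimal sketch of the surrounding code (readFile is simplified here to a thin wrapper around textFile):

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object SparkExample {

  // Simplified readFile: a thin wrapper around textFile.
  def readFile(sc: SparkContext, path: String): RDD[String] = sc.textFile(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("spark-example").getOrCreate()
    // Line 37: args(0) is expected to hold the data file location. If no
    // arguments reach main, args is empty and args(0) throws
    // java.lang.ArrayIndexOutOfBoundsException: 0.
    val dataFromFile = readFile(spark.sparkContext, args(0))
    println(dataFromFile.count())
    spark.stop()
  }
}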

What is wrong with the custom object definition or with how the arguments are set up?

Here is what the custom object definition looks like:

apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkApplication
metadata:
  name: spark-example
  namespace: default
spec:
  type: Scala
  image: gcr.io/ynli-k8s/spark:v2.4.0-SNAPSHOT
  mainClass: example.SparkExample
  mainApplicationFile: http://ip:8089/spark_k8s_airflow.jar
  mode: cluster
  deps:
    files:
      - http://ip:8089/jar_test_data.txt
  driver:
    coreLimit: 1000m
    cores: 0.1
    labels:
      version: 2.4.0
    memory: 1024m
    serviceAccount: default
  executor:
    cores: 1
    instances: 1
    labels:
      version: 2.4.0
    memory: 1024m
  imagePullPolicy: Always
-- Cassie
apache-spark
kubernetes
scala

1 Answer

1/17/2019

The answer is in this issue. I just added arguments to the application and changed the way I load the file into the Spark application.
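In short: the args array was empty because the spec never passed any arguments, so args(0) failed before the file was even looked up. Adding an arguments list to the spec fixes that; a minimal sketch of the relevant part (the bare file name is an assumption based on the deps.files entry above):

spec:
  # (rest of the spec as above, unchanged)
  arguments:
    - jar_test_data.txt   # passed to main() as args(0); name assumed from deps.files
  deps:
    files:
      - http://ip:8089/jar_test_data.txt

Files listed under deps.files are distributed through spark.files, so inside the application the downloaded local copy should be resolved with SparkFiles.get rather than the original URL. A hedged sketch of the changed loading code (driver-side read; the original readFile is not shown above):

import scala.io.Source
import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession

object SparkExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("spark-example").getOrCreate()
    // args(0) now holds the bare file name passed via `arguments` in the spec.
    // SparkFiles.get resolves it to the local path where Spark placed the
    // downloaded copy of the file listed under deps.files.
    val localPath = SparkFiles.get(args(0))
    // Read on the driver and parallelize, so we do not depend on the file
    // being at the same path on every executor.
    val dataFromFile = spark.sparkContext.parallelize(
      Source.fromFile(localPath).getLines().toSeq)
    println(dataFromFile.count())
    spark.stop()
  }
}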

-- Cassie
Source: StackOverflow