I am using the Spark operator with Kubernetes. I need a file for the application execution. However, when I list it next to my jar file in the files section of the custom object definition, I get an error which seems to mean that the file is not found:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at example.SparkExample$.main(SparkExample.scala:37)
at example.SparkExample.main(SparkExample.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:846)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:194)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$anon$2.doSubmit(SparkSubmit.scala:921)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:932)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
This exception points to SparkExample.scala:37, where I read the application arguments:
val dataFromFile = readFile(spark.sparkContext, args(0))
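For context, `args(0)` throws `ArrayIndexOutOfBoundsException: 0` whenever the application is started with no arguments at all, which suggests that the `args` array is empty rather than that the file is missing. A minimal sketch of a guard that makes this visible; the object name `ArgsGuard` and the usage text are hypothetical and not part of the original application:

```scala
object ArgsGuard {
  // Returns the first argument, or a usage message when none was passed.
  def inputFile(args: Array[String]): Either[String, String] =
    args.headOption.toRight("usage: SparkExample <input-file>")

  def main(args: Array[String]): Unit =
    inputFile(args) match {
      case Right(path) =>
        println(s"reading $path")
      case Left(usage) =>
        // Fail with a readable message instead of an index error.
        System.err.println(usage)
        sys.exit(1)
    }
}
```

With a check like this, a missing argument is reported as a usage error instead of surfacing as an index error at the first `args(0)` access.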
What is wrong with the custom object definition or arguments setup?
Here is what the custom object definition looks like:
apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkApplication
metadata:
  name: spark-example
  namespace: default
spec:
  type: Scala
  image: gcr.io/ynli-k8s/spark:v2.4.0-SNAPSHOT
  mainClass: example.SparkExample
  mainApplicationFile: http://ip:8089/spark_k8s_airflow.jar
  mode: cluster
  deps:
    files:
      - http://ip:8089/jar_test_data.txt
  driver:
    coreLimit: 1000m
    cores: 0.1
    labels:
      version: 2.4.0
    memory: 1024m
    serviceAccount: default
  executor:
    cores: 1
    instances: 1
    labels:
      version: 2.4.0
    memory: 1024m
  imagePullPolicy: Always
The answer is in this issue. I just added arguments to the application and changed the way I load the file into the Spark application.
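A rough sketch of what such a change might look like, not necessarily the exact fix from the issue. It assumes that the file name (for example `jar_test_data.txt`) is passed through the `arguments` list of the SparkApplication spec, and that the file listed under `deps.files` is resolved on the driver with `SparkFiles.get`. The object name `SparkExampleSketch` and the helper `readLines` are hypothetical:

```scala
import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession
import scala.io.Source

object SparkExampleSketch {
  // Hypothetical helper: reads all lines of a local file and closes it.
  def readLines(path: String): List[String] = {
    val source = Source.fromFile(path)
    try source.getLines().toList
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the spec passes the bare file name as the first argument.
    require(args.nonEmpty, "expected the file name as the first argument")

    val spark = SparkSession.builder().appName("spark-example").getOrCreate()

    // SparkFiles.get maps a distributed file's name to its local path;
    // it must be called after the SparkContext has been created.
    val localPath = SparkFiles.get(args(0))

    // Read on the driver, then distribute the lines as an RDD.
    val data = spark.sparkContext.parallelize(readLines(localPath))
    println(s"read ${data.count()} lines from $localPath")

    spark.stop()
  }
}
```

Reading the file on the driver and calling `parallelize` keeps the sketch independent of where each executor stores its own copy. This only suits small files.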