Unable to see output or error messages for Spark on Kubernetes

5/21/2019

I'm trying to run a simple Spark application using the Kubernetes master, but I don't get the intended output/processing, nor do I see any error messages. The final pod phase is 'Failed' and the exit code is 101. The pod logs show the usual log4j warnings, but nothing else.

I'm running minikube v1.0.1 on Windows (amd64) on my office laptop using Hyper-V, and have already increased the minikube VM's CPUs to 3 and memory to 4 GB as recommended.
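For reference, the resource bump amounts to something like this (a sketch; exact flags depend on the minikube version):

>>> minikube config set cpus 3
>>> minikube config set memory 4096
>>> minikube delete
>>> minikube start --vm-driver hyperv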

Made sure that the applications run fine with Spark Standalone. The first application, 'Hello', is supposed to print a 'Hello' message. The second application, 'Calculate Monthly Revenue', is supposed to read data from Teradata over JDBC, aggregate it, and write the result back to a Teradata table over JDBC.
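A standalone run looks roughly like this (master host elided; a sketch, not the exact command used):

>>> spark-submit --master spark://<master-host>:7077 --class Hello hello_2.12-0.1.0-SNAPSHOT.jar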

Also made sure that 'hello minikube' works fine.
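That smoke test is the standard one from the minikube quick start, roughly the following (the echoserver image tag may differ):

>>> kubectl run hello-minikube --image=k8s.gcr.io/echoserver:1.10 --port=8080
>>> kubectl expose deployment hello-minikube --type=NodePort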

In all the snippets below, '...' indicates portions omitted for brevity and '>>>' indicates the command prompt.

>>> spark-submit --master k8s://https://153.65.225.219:8443 --deploy-mode cluster --name Hello --class Hello --conf spark.executor.instances=1 --conf spark.kubernetes.container.image=rahulvkulkarni/default:spark-td-run --conf spark.kubernetes.container.image.pullSecrets=regcred local://hello_2.12-0.1.0-SNAPSHOT.jar
log4j:WARN No appenders could be found for logger (io.fabric8.kubernetes.client.Config).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/20 16:59:09 INFO LoggingPodStatusWatcherImpl: State changed, new state:
         pod name: hello-1558351748442-driver
...
         phase: Pending
         status: []
...
19/05/20 16:59:13 INFO LoggingPodStatusWatcherImpl: State changed, new state:
         pod name: hello-1558351748442-driver
...
         phase: Failed
         status: [ContainerStatus(containerID=docker://464c9c0e23d543f20954d373218c9cefefc31107711cbd2ada4d93bb31ce4d80, image=rahulvkulkarni/default:spark-td-run, imageID=docker-pullable://rahulvkulkarni/default@sha256:1de9951c4ac9f0b5f26efa3949e1effa779b0605066f2043738402ce20e8179b, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=false, restartCount=0, state=ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://464c9c0e23d543f20954d373218c9cefefc31107711cbd2ada4d93bb31ce4d80, exitCode=101, finishedAt=2019-05-17T18:26:41Z, message=null, reason=Error, signal=null, startedAt=2019-05-17T18:26:40Z, additionalProperties={}), waiting=null, additionalProperties={}), additionalProperties={})]
19/05/20 16:59:13 INFO LoggingPodStatusWatcherImpl: Container final statuses:


         Container name: spark-kubernetes-driver
         Container image: rahulvkulkarni/default:spark-td-run
         Container state: Terminated
         Exit code: 101
19/05/20 16:59:13 INFO Client: Application Hello finished.
...


>>> kubectl logs hello-1558351748442-driver
++ id -u
...
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /sbin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.17.0.5 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class Hello spark-internal
19/05/17 18:26:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
log4j:WARN No appenders could be found for logger (org.apache.spark.deploy.SparkSubmit$$anon$2).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

What does exit code 101 mean? How do I find the actual error?
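One way to dig further (a sketch; the pod name comes from the output above) is to ask Kubernetes for the pod's events and termination details:

>>> kubectl describe pod hello-1558351748442-driver
>>> kubectl get pod hello-1558351748442-driver -o jsonpath="{.status.containerStatuses[0].state.terminated}"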

Then I tried to configure log4j for detailed logging as described in 'How to stop INFO messages displaying on spark console?'. I renamed and used the log4j.properties template provided in the conf directory. But spark-submit is not able to find the log4j.properties file, even though I included it in the Docker build.
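For reference, the renamed template (conf/log4j.properties.template in the Spark distribution) starts roughly like this; raising the root level from INFO to DEBUG is the usual tweak for more detail:

log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n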

>>> spark-submit --master k8s://https://153.65.225.219:8443 --deploy-mode cluster --files /opt/spark/conf/log4j.properties --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/opt/spark/conf/log4j.properties" --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/opt/spark/conf/log4j.properties" --name "Calculate Monthly Revenue" --class mthRev --conf spark.executor.instances=1 --conf spark.kubernetes.container.image=rahulvkulkarni/default:spark-td-run --conf spark.kubernetes.container.image.pullSecrets=regcred local://mthrev_2.10-0.1-SNAPSHOT.jar <username> <password> <server name>
19/05/20 20:02:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/05/20 20:02:52 INFO LoggingPodStatusWatcherImpl: State changed, new state:
         pod name: calculate-monthly-revenue-1558362771110-driver
...
         Container name: spark-kubernetes-driver
         Container image: rahulvkulkarni/default:spark-td-run
         Container state: Terminated
         Exit code: 1


>>> kubectl logs -c spark-kubernetes-driver calculate-monthly-revenue-1558362771110-driver
++ id -u
...
log4j:ERROR Could not read configuration file from URL [file:/opt/spark/conf/log4j.properties].
java.io.FileNotFoundException: /opt/spark/conf/log4j.properties (No such file or directory)
...
log4j:ERROR Ignoring configuration file [file:/opt/spark/conf/log4j.properties].
19/05/17 21:30:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected scheme-specific part at index 2: C:
        at org.apache.hadoop.fs.Path.initialize(Path.java:205)
        at org.apache.hadoop.fs.Path.<init>(Path.java:171)
        at org.apache.hadoop.fs.Path.<init>(Path.java:93)
        at org.apache.hadoop.fs.Globber.glob(Globber.java:211)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1657)
        at org.apache.spark.deploy.DependencyUtils$.org$apache$spark$deploy$DependencyUtils$$resolveGlobPath(DependencyUtils.scala:192)
        at org.apache.spark.deploy.DependencyUtils$$anonfun$resolveGlobPaths$2.apply(DependencyUtils.scala:147)
        at org.apache.spark.deploy.DependencyUtils$$anonfun$resolveGlobPaths$2.apply(DependencyUtils.scala:145)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
        at org.apache.spark.deploy.DependencyUtils$.resolveGlobPaths(DependencyUtils.scala:145)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$4.apply(SparkSubmit.scala:355)
        at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$4.apply(SparkSubmit.scala:355)
        at scala.Option.map(Option.scala:146)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:355)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.URISyntaxException: Expected scheme-specific part at index 2: C:
        at java.net.URI$Parser.fail(URI.java:2848)
        at java.net.URI$Parser.failExpecting(URI.java:2854)
        at java.net.URI$Parser.parse(URI.java:3057)
        at java.net.URI.<init>(URI.java:746)
        at org.apache.hadoop.fs.Path.initialize(Path.java:202)
        ... 23 more
[INFO  tini (1)] Main child exited normally (with status '1')

I tried several variations of specifying the log4j.properties file: as a local file on my Windows laptop (file:///C$/Users//spark-2.4.3-bin-hadoop2.7/conf/log4j.properties and file:///C:/Users//spark-2.4.3-bin-hadoop2.7/conf/log4j.properties), and as a local file in the Linux container (file:///opt/spark/conf/log4j.properties). But I keep getting the message:

log4j:ERROR Could not read configuration file from URL [file:/C$/Users/<my-username>/spark-2.4.3-bin-hadoop2.7/conf/log4j.properties].

The IllegalArgumentException went away when I tried the path without the colon (C:), i.e. either the Linux path or the Windows path with C$. Presumably java.net.URI treats the text before the first colon as the URI scheme, so a bare 'C:' parses as scheme 'C' with an empty scheme-specific part, which is exactly the URISyntaxException above.

But I still don't get the desired output of my program, and I don't know whether there is an error or what it is!

-- RahulK
apache-spark
kubernetes

2 Answers

7/21/2019

I solved this by uploading the config file to Azure Blob Storage (https://$(container_name).blob.core.windows.net/jars/log4jconfig1)

and passing its URL to spark-submit:

--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=https://<container_name>.blob.core.windows.net/jars/log4jconfig1" \

--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=https://<container_name>.blob.core.windows.net/jars/log4jconfig1" \
-- Тимофей Смирнов
Source: StackOverflow

5/22/2019

There was a typo in the spark-submit command, in the specification of the application jar: I was using only two forward slashes, local://hello_2.12-0.1.0-SNAPSHOT.jar. With two slashes, the jar name is parsed as the URI's authority rather than as a path, so Spark was not able to locate it and (I think) was silently ignoring it and then had no work to do. Hence there was no message. I'd expect it to give at least a warning.

Changed it to three slashes and it moved ahead: local:///hello_2.12-0.1.0-SNAPSHOT.jar

I now have another issue related to Kubernetes RBAC, which I will solve separately. The log4j issue still remains, but is not a concern for me now.
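For anyone who hits the same RBAC error: the usual fix, per the Spark-on-Kubernetes documentation, is to run the driver under a service account that is allowed to manage pods, along these lines:

>>> kubectl create serviceaccount spark
>>> kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default

and then add --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark to the spark-submit command.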

-- RahulK
Source: StackOverflow