PySpark does not work when Istio sidecar is injected

2/20/2019

We are running PySpark in a pod where Spark is launched in standalone mode. The driver is not able to connect to the executor when the Istio sidecar is injected; when we disable sidecar injection, everything works fine.
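
For context, the PySpark session is created roughly as sketched below; the standalone master URL and app name are placeholders, not our actual values:

    from pyspark.sql import SparkSession

    # Placeholder standalone master URL; the real master runs inside the same pod.
    spark = (
        SparkSession.builder
        .master("spark://localhost:7077")
        .appName("istio-sidecar-test")
        .getOrCreate()
    )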

Does Istio block inter-pod communication as well? If so, how can we work around it? Is there a way to whitelist inter-pod communication (see the sketch below for what we have in mind)?
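
To make the question concrete: if specific ports can be exempted from the sidecar, we could pin the driver's normally random ports to fixed values, roughly as in the sketch below (the port numbers are arbitrary examples, and the idea that an Istio pod annotation such as traffic.sidecar.istio.io/excludeInboundPorts could then reference them is our assumption, not something we have confirmed):

    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    # Pin the normally random driver ports to fixed values (example numbers only),
    # so they could be listed in an Istio port exclusion/whitelist if one exists.
    conf = (
        SparkConf()
        .set("spark.driver.port", "40000")        # driver RPC endpoint
        .set("spark.blockManager.port", "40001")  # block manager traffic
    )

    spark = SparkSession.builder.config(conf=conf).getOrCreate()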

Error when the sidecar is injected and a PySpark task is executed:

      at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:1070)
      at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
      at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
      at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
      at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
      at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
      at java.lang.Thread.run(Thread.java:748)
    Caused by: java.nio.channels.ClosedChannelException
      at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)

    Driver stacktrace:
      at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1887)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1875)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1874)
      at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
      at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
      at scala.Option.foreach(Option.scala:257)
      at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2108)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2057)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2046)
      at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
      at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
      at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:166)
      ... 32 more
    Caused by: java.io.IOException: Failed to send RPC /jars/com.fasterxml.jackson.core_jackson-databind-2.6.6.jar to default-scheduler-59cb49d65f-ht7j6/192.168.1.197:35688: java.nio.channels.ClosedChannelException
      at org.apache.spark.network.client.TransportClient$2.handleFailure(TransportClient.java:163)
      at org.apache.spark.network.client.TransportClient$StdChannelListener.operationComplete(TransportClient.java:334)
      at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
      at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
      at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
      at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:122)

default-scheduler-59cb49d65f-ht7j6 is the pod name.

-- enator
istio
kubernetes
pyspark

0 Answers