Is running Spark on Kubernetes still experimental?

10/5/2018

We would like to test Spark job submission on a Kubernetes cluster; however, the official documentation is somewhat ambiguous:

"Spark can run on clusters managed by Kubernetes. This feature makes use of native Kubernetes scheduler that has been added to Spark.

The Kubernetes scheduler is currently experimental. In future versions, there may be behavioral changes around configuration, container images and entrypoints."

Does this mean that the Kubernetes scheduler itself is experimental, or only Spark's integration with it?

Does it make sense to run Spark on Kubernetes in production-grade environments?

-- pkaramol
apache-spark
kubernetes

1 Answer

10/5/2018
  1. Yes, it's experimental if you are using Spark's native Kubernetes scheduler backend, the one you quoted from the docs. It is Spark's integration that is experimental, not the Kubernetes scheduler itself. Use it at your own risk. (A sketch of such a submission follows this list.)

  2. Not really, if you run a standalone Spark cluster inside Kubernetes without the native Kubernetes scheduler. That means creating a master in a Kubernetes pod, allocating a number of worker pods that talk to that master, and then submitting your jobs with the good old spark-submit using the usual --master spark:// URL rather than --master k8s://. The downside is that your Spark cluster in Kubernetes is static: it does not grow or shrink per job. (A sketch of this setup also follows below.)
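For illustration, a submission in the experimental native mode (point 1) looks roughly like this. This is a minimal sketch following the Spark 2.3/2.4 documentation; the API server address, image name, and example jar path are placeholders to replace with your own:

    # Submit SparkPi directly to the Kubernetes API server (cluster mode).
    # <k8s-apiserver-host>, <port> and <spark-image> are placeholders.
    spark-submit \
      --master k8s://https://<k8s-apiserver-host>:<port> \
      --deploy-mode cluster \
      --name spark-pi \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.executor.instances=2 \
      --conf spark.kubernetes.container.image=<spark-image> \
      local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar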
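And a minimal sketch of the static standalone setup from point 2, assuming a <spark-image> that ships a Spark distribution under /opt/spark (the image name, labels, and replica count are illustrative, not prescribed by Spark):

    # Standalone master: one pod plus a Service so workers and clients can find it.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: spark-master
    spec:
      replicas: 1
      selector:
        matchLabels: {app: spark-master}
      template:
        metadata:
          labels: {app: spark-master}
        spec:
          containers:
          - name: master
            image: <spark-image>
            # spark-class runs in the foreground, which suits containers.
            # Some Spark versions need --host (or SPARK_MASTER_HOST) set so the
            # advertised master URL matches the address workers connect to.
            command: ["/opt/spark/bin/spark-class",
                      "org.apache.spark.deploy.master.Master"]
            ports:
            - containerPort: 7077   # cluster port used by workers and spark-submit
            - containerPort: 8080   # master web UI
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: spark-master
    spec:
      selector: {app: spark-master}
      ports:
      - {name: cluster, port: 7077}
      - {name: ui, port: 8080}
    ---
    # Workers: a fixed-size pool pointing at the master Service above.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: spark-worker
    spec:
      replicas: 3                   # static: scaled by hand, not per job
      selector:
        matchLabels: {app: spark-worker}
      template:
        metadata:
          labels: {app: spark-worker}
        spec:
          containers:
          - name: worker
            image: <spark-image>
            command: ["/opt/spark/bin/spark-class",
                      "org.apache.spark.deploy.worker.Worker",
                      "spark://spark-master:7077"]

Jobs are then submitted with the usual --master spark://spark-master:7077, and resizing the cluster means editing the worker Deployment's replica count by hand.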

-- Rico
Source: StackOverflow