How to create a Spark or TensorFlow cluster based on containers with Mesos or Kubernetes?

6/3/2017

After reading the discussions about the differences between Mesos and Kubernetes and kubernetes-vs.-mesos-vs.-swarm, I am still confused about how to create a Spark and TensorFlow cluster with Docker containers across some bare-metal hosts and an AWS-like private cloud (OpenNebula).

Currently, I am able to build a static TensorFlow cluster with Docker containers manually distributed to different hosts. I only run standalone Spark on a bare-metal host. The way to manually set up a Mesos cluster for containers can be found here.
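
For context, my current static TensorFlow setup is essentially a hard-coded cluster spec per container, roughly like the sketch below (host names, ports, and task indices are placeholders, not my actual configuration):

```python
import tensorflow as tf

# Static cluster definition; every container gets the same spec and is
# manually pinned to a host. Host names and ports are placeholders.
cluster = tf.train.ClusterSpec({
    "ps":     ["host-a:2222"],
    "worker": ["host-b:2222", "host-c:2222"],
})

# Each container starts one server with its own job name and task index;
# this example would run as worker 0.
server = tf.train.Server(cluster, job_name="worker", task_index=0)
server.join()
```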

Since my resources are limited, I would like to find a way to deploy Docker containers onto the current mixed infrastructure to build either a TensorFlow or a Spark cluster, so that I can do data analysis with either framework on the same resources.

Is it possible to quickly create/run/undeploy a Spark or TensorFlow cluster with Docker containers on a mixed infrastructure using Mesos or Kubernetes? How can I do that?

Any comments and hints are welcome.

-- Yingding Wang
apache-spark
containers
kubernetes
mesos
tensorflow

1 Answer

6/4/2017

Given that you have limited resources, I suggest you have a look at the Spark Helm chart, which gives you:

  • 1 x Spark Master with port 8080 exposed on an external LoadBalancer
  • 3 x Spark Workers with a HorizontalPodAutoscaler that scales up to a maximum of 10 pods when CPU usage reaches 50% of the 100m request
  • 1 x Zeppelin with port 8080 exposed on an external LoadBalancer
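
Once the chart is up, a quick way to sanity-check it is to point a driver at the master service, something like the sketch below (the service name `spark-master` and port 7077 are assumptions; substitute the actual service name or LoadBalancer address of your release):

```python
from pyspark.sql import SparkSession

# Connect a driver to the Spark master deployed by the chart.
# "spark-master" and port 7077 are placeholders for your release's
# master service address.
spark = (SparkSession.builder
         .master("spark://spark-master:7077")
         .appName("cluster-smoke-test")
         .getOrCreate())

# Trivial job to confirm the workers are reachable.
print(spark.sparkContext.parallelize(range(1000)).sum())

spark.stop()
```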

If this configuration doesn't work, you can build your own Docker images and deploy those; take a look at this blog series. There is work underway to make Spark more Kubernetes-friendly. This issue also gives some insight.

I haven't looked into TensorFlow, but I suggest you look at this blog.

-- Dan Murphy
Source: StackOverflow