BLAS library support for Apache Spark on Kubernetes

6/21/2018

While BLAS/LAPACK libraries can yield large speed-ups in local execution, and also speed up Apache Spark in cluster mode (to a lesser extent, for various reasons), I am wondering whether using BLAS libraries for Spark on top of a Kubernetes cluster makes sense at all.

One reason for my doubt is that, for BLAS to provide a significant speed boost, it has to be compiled for the specific CPU it runs on, which might not be easy to do in this setting.
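To make the CPU concern concrete, here is a minimal sketch (Linux only) that lists which SIMD instruction-set flags a node advertises in `/proc/cpuinfo`. Running it on each cluster node would show whether the nodes are uniform enough for a single CPU-specific BLAS build. The names `SIMD_FLAGS` and `simd_flags` are made up for this illustration, and the flag list is an assumption, not an exhaustive set of what any particular BLAS implementation uses.

```python
# Sketch: list the SIMD-related CPU flags a Linux node advertises.
from pathlib import Path

# Illustrative (not exhaustive) x86-64 flags that optimized BLAS kernels
# commonly depend on. This selection is an assumption for the example.
SIMD_FLAGS = ("sse4_2", "avx", "avx2", "fma", "avx512f")


def simd_flags(cpuinfo_text: str) -> list:
    """Return the entries of SIMD_FLAGS found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            # Preserve the order of SIMD_FLAGS so results compare easily.
            return [flag for flag in SIMD_FLAGS if flag in present]
    # No "flags" line found (e.g. a non-x86 CPU or unexpected format).
    return []


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(simd_flags(cpuinfo.read_text()))
```

If the output differs between nodes, a library built for the most capable CPU may fail or fall back on the others, since a pod can be scheduled onto any matching node.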

I am aware of a similar question about including BLAS/LAPACK libraries in Docker images more generally. It would be interesting to hear whether anyone already has experience with this. Since Spark has officially supported Kubernetes only since around March 2018, there seems to be little to no documentation on this, and popular Docker base images like Alpine seem to offer only limited support for the necessary dependencies such as glibc.

To formulate a precise question: Does using BLAS/LAPACK with Spark on Kubernetes make sense at all? And what would be the smartest way to include these libraries?

-- dennlinger
apache-spark
docker
kubernetes
lapack

0 Answers