I'm trying to run Spark on Kubernetes and struggling a bit with data locality. I'm using the native Spark support, but I just watched https://databricks.com/session/hdfs-on-kubernetes-lessons-learned and followed the steps there to set up my HDFS cluster (namenode on the first Kubernetes node, using host networking). Does anyone know whether the fix to the Spark driver presented in that talk has been merged into mainline Spark?
I ask as I still see ANY locality in places I'd expect NODE_LOCAL.
The fix has been included since release v2.2.0-kubernetes-0.4.0.
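
If you want to confirm which locality levels your tasks actually get, one option is to tally them per stage. Below is a minimal sketch, assuming you have already fetched the task records yourself (for example from the `taskList` endpoint of the Spark monitoring REST API, whose entries carry a `taskLocality` field). The function name `tally_locality` and the sample records are hypothetical and only there for illustration.

```python
from collections import Counter


def tally_locality(tasks):
    """Count tasks per locality level.

    `tasks` is a list of dicts, each assumed to have a "taskLocality" key
    (e.g. "NODE_LOCAL", "ANY"). Records without it are counted as "UNKNOWN".
    """
    return Counter(t.get("taskLocality", "UNKNOWN") for t in tasks)


# Hypothetical sample data, shaped like parsed task records.
sample_tasks = [
    {"taskId": 0, "host": "node-1", "taskLocality": "NODE_LOCAL"},
    {"taskId": 1, "host": "node-2", "taskLocality": "ANY"},
    {"taskId": 2, "host": "node-3", "taskLocality": "ANY"},
]

counts = tally_locality(sample_tasks)
for level, n in counts.most_common():
    print(level, n)
```

A stage that reads from HDFS and shows mostly `ANY` here would match the symptom described in the question.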