Why is running distributed TensorFlow on Kubernetes much slower than running locally?

8/27/2018

I want to run a CIFAR-10 model on Kubernetes. Here is my cifar10_cluster_train.py (https://github.com/icloud-ecnu/k8s_primary). I then built a Docker image from it. Here are my Service and ReplicaSet definitions.

[Image: Service and ReplicaSet definitions]

There are 4 Services for 4 Pods. It runs normally, but it is very slow: a run takes about half an hour. When I create four containers locally to run the same job, it takes less than 5 minutes. Why is it so slow? I am new to k8s and would sincerely appreciate your advice. Thank you~

-- Chris Qi
distributed
kubernetes
resources
tensorflow

0 Answers