I am trying to understand Kubernetes GPU practices better, and I am implementing a small K8s GPU cluster that is supposed to work as described below.
This is going to be a somewhat long explanation, but I hope that having several questions in one place will help others understand GPU practices in Kubernetes better.
If no message is present in the queue and none of the GPU-based pods is executing a program (i.e. not using the GPU), then the GPU node pool should scale down to 0.
Create a GPU node pool. Each node contains N GPUs, where N >= 1. Assign one model-trainer pod to each GPU, i.e. a 1:1 mapping of pods to GPUs. I tried assigning 2 pods to a 2-GPU machine, where each pod is supposed to run an MNIST program.
What I noticed is:
One pod got allocated, executed the program, and later went into a crash loop. Maybe I am making a mistake here, since my Docker image is supposed to run the program only once; I was just doing a feasibility test of running 2 pods simultaneously on the 2 GPUs of the same node. Below are the pod events:
Message                                      Reason    First Seen                Last Seen                 Count
Back-off restarting failed container         BackOff   Jun 21, 2018, 3:18:15 PM  Jun 21, 2018, 4:16:42 PM  143
pulling image "nkumar15/mnist"               Pulling   Jun 21, 2018, 3:11:33 PM  Jun 21, 2018, 3:24:52 PM  5
Successfully pulled image "nkumar15/mnist"   Pulled    Jun 21, 2018, 3:12:46 PM  Jun 21, 2018, 3:24:52 PM  5
Created container                            Created   Jun 21, 2018, 3:12:46 PM  Jun 21, 2018, 3:24:52 PM  5
Started container                            Started   Jun 21, 2018, 3:12:46 PM  Jun 21, 2018, 3:24:52 PM  5
The other pod didn't get assigned to a GPU at all. Below is the message from the pod events:
0/3 nodes are available: 3 Insufficient nvidia.com/gpu.
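For context, each trainer pod requests a single GPU so that the scheduler enforces the 1:1 pod-to-GPU mapping. A minimal sketch of the kind of spec I am using (the pod name is just a placeholder; the image is the MNIST image from the events above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mnist-trainer-1        # placeholder name
spec:
  containers:
  - name: trainer
    image: nkumar15/mnist      # same image as in the events above
    resources:
      limits:
        nvidia.com/gpu: 1      # exactly one GPU per pod -> 1:1 pod-to-GPU mapping
```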
Have multiple GPU machines in the GPU node pool, with each node having only 1 GPU.
K8s will assign each pod to an available GPU on a node, and hopefully there won't be any issue. I have yet to try this.
Again, apologies for such a long question, but I hope it will help others as well.
For question 2, I created a new cluster to reproduce the same issue I had faced multiple times before. I am not sure what changed this time, but the 2nd pod was successfully allocated a GPU. I think with this result I can confirm that a 1-GPU-to-1-pod mapping is allowed on a single multi-GPU node. However, restricting memory per GPU process is not feasible as of 1.10.
Both designs are supported in 1.10. I view design 2 as a special case of design 1: you don't necessarily need to have 1 GPU per node. If your pod needs more GPUs and memory, you have to have multiple GPUs per node, as you mentioned in question (4). I'd go with design 1 unless there's a reason not to.
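For example, a pod is always scheduled onto a single node, so if one pod asks for 2 GPUs, the node it lands on must have at least 2 GPUs. A minimal sketch (pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-gpu-trainer       # placeholder name
spec:
  containers:
  - name: trainer
    image: your-training-image  # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 2       # a pod cannot span nodes, so the node needs >= 2 GPUs
```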
I think the best practice would be to create a new cluster with no GPUs (a cluster has a default node pool), and then create a GPU node pool and attach it to the cluster. Your non-GPU workload can run in the default pool, and the GPU workload can run in the GPU pool. To support scaling down to 0 GPU nodes, you need to set --num-nodes and --min-nodes to 0 when creating the GPU node pool.
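A rough sketch of creating such a node pool with gcloud (the pool name, cluster name, zone, accelerator type, and maximum node count below are placeholders; see the docs linked below for details):

```bash
# Sketch: create an autoscaling GPU node pool that can scale down to 0 nodes.
gcloud container node-pools create gpu-pool \
  --cluster my-cluster \
  --zone us-central1-a \
  --accelerator type=nvidia-tesla-k80,count=1 \
  --num-nodes 0 \
  --enable-autoscaling --min-nodes 0 --max-nodes 3
```

With autoscaling enabled and a minimum of 0, the autoscaler can remove all GPU nodes when no pending pods request nvidia.com/gpu, which covers the scale-to-zero requirement above.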
Docs:
Create a cluster with no GPUs: https://cloud.google.com/kubernetes-engine/docs/how-to/creating-a-cluster#creating_a_cluster
Create a GPU node pool for an existing cluster: https://cloud.google.com/kubernetes-engine/docs/concepts/gpus#gpu_pool