I have a Kubernetes cluster with n nodes, where some of the nodes are fitted with multiple NVIDIA 1080Ti GPU cards.
I have two kinds of pods:
1. GPU-enabled: these need to be scheduled on GPU-fitted nodes, where each pod will use only one of the GPU cards present on that node.
2. CPU-only: these can be scheduled anywhere, preferably on CPU-only nodes.
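For reference, here is a minimal sketch of how I currently picture the two kinds of pods, using the standard device plugin resource; the accelerator node label and the pod/container names are just my own example naming:

# Kind 1: GPU-enabled pod - requests exactly one GPU card; which physical card it
# gets is currently decided by the kubelet/device plugin, not by me.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  nodeSelector:
    accelerator: nvidia-gtx-1080ti   # example label applied to the GPU-fitted nodes
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-base
      resources:
        limits:
          nvidia.com/gpu: 1          # one card out of the several on the node
---
# Kind 2: CPU-only pod - prefers (but does not require) nodes without the GPU label.
apiVersion: v1
kind: Pod
metadata:
  name: cpu-workload
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: accelerator
                operator: DoesNotExist
  containers:
    - name: app
      image: busybox
      command: ["sleep", "100000"]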
The scheduling problem itself is addressed clearly in this answer.
Issue: when scheduling a GPU-enabled pod on a GPU-fitted node, I want to be able to decide which of the multiple GPU cards my pod is going to use. Furthermore, I was thinking of a load balancer sitting transparently between the GPU hardware and the pods that would decide this mapping.
Any help around this architecture would be deeply appreciated. Thank you!
You have to use the official NVIDIA GPU device plugin rather than the one suggested by GCE. It makes it possible to schedule GPUs by attributes.
Pods can specify device selectors based on the attributes that are advertised on the node. These can be specified at the container level. For example:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:9.0-base
      command: ["sleep"]
      args: ["100000"]
      computeResourceRequests: ["nvidia-gpu"]
  computeResources:
    - name: "nvidia-gpu"
      resources:
        limits:
          nvidia.com/gpu: 1
      affinity:
        required:
          - key: "nvidia.com/gpu-memory"
            operator: "Gt"
            values: ["8000"] # change value to appropriate mem for GPU
Check the Kubernetes on NVIDIA GPUs Installation Guide for the full setup (to the best of my knowledge, the computeResources / computeResourceRequests fields in the pod example above are specific to the NVIDIA Kubernetes distribution described there, not upstream Kubernetes).
Hope this helps.