We have a Kubernetes cluster.
Now we want to expand it with GPU nodes (those would be the only nodes in the cluster that have GPUs).
We'd like to prevent Kubernetes from scheduling pods on those nodes unless the pods require GPUs.
Not all of our pipelines can use GPUs; the vast majority are still CPU-only.
The servers with GPUs can be very expensive (for example, an Nvidia DGX can cost as much as $150k per server).
If we just add DGX nodes to the Kubernetes cluster, Kubernetes would schedule non-GPU workloads there too, which would waste resources: GPU jobs scheduled later might find the non-GPU resources on those nodes (CPU, memory) exhausted, so they would have to wait for the non-GPU jobs/containers to finish.
Is there a way to customize GPU resource scheduling in Kubernetes so that pods are only scheduled on those expensive nodes if they require GPUs? If they don't, they should instead wait for CPU and memory to become available on the non-GPU servers.
Thanks.
Using labels and label selectors for your nodes is the right idea, but you also need node affinity on your pods.
Something like this:
apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/node-type
            operator: In
            values:
            - gpu
  containers:
  - name: your-gpu-workload
    image: mygpuimage
Also, attach the label to your GPU nodes:
$ kubectl label nodes <node-name> kubernetes.io/node-type=gpu
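Note that node affinity only steers GPU pods onto these nodes; it does nothing to keep ordinary pods off them. The usual complement is to taint the GPU nodes so that only pods with a matching toleration can land there. A sketch, reusing the node name placeholder from above and an assumed taint key `gpu`:

```yaml
# First, taint the GPU nodes (run once per node):
#   kubectl taint nodes <node-name> gpu=true:NoSchedule
# Then give GPU pods a matching toleration; all other pods are repelled:
apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
  tolerations:
  - key: gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
  containers:
  - name: your-gpu-workload
    image: mygpuimage
```

With the taint in place, pods without the toleration are never scheduled on the DGX nodes, which directly answers the question; the node affinity above is still useful to make sure GPU pods don't land on CPU-only nodes.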
You can use labels and label selectors for this. See the Kubernetes docs.
Update: example
apiVersion: v1
kind: Pod
metadata:
  name: with-gpu-anti-affinity
spec:
  affinity:
    podAntiAffinity:
      # weight is only valid under "preferred..."; "required..." takes bare terms
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: resources
              operator: In
              values:
              - cpu-only
          topologyKey: kubernetes.io/hostname
  containers:
  - name: your-gpu-workload
    image: mygpuimage
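For completeness: if the NVIDIA device plugin is deployed on the cluster, GPUs are advertised as the extended resource `nvidia.com/gpu`, and a pod can simply request one. A sketch, assuming the plugin is installed and reusing the image name from above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cuda-job
spec:
  containers:
  - name: your-gpu-workload
    image: mygpuimage
    resources:
      limits:
        # only nodes that advertise nvidia.com/gpu can satisfy this request
        nvidia.com/gpu: 1
```

This guarantees GPU pods land only on GPU nodes; combining it with a taint on those nodes also keeps non-GPU pods away.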