I have an in-house K8s cluster running on bare metal. On one of my worker nodes I have 4 GPUs and I want to configure K8s to recognise and use them. Following the official documentation I installed everything required, and now when I run:
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Tue Nov 12 09:20:20 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... On | 00000000:02:00.0 Off | N/A |
| 29% 25C P8 2W / 250W | 0MiB / 10989MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 208... On | 00000000:03:00.0 Off | N/A |
| 29% 25C P8 1W / 250W | 0MiB / 10989MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce RTX 208... On | 00000000:82:00.0 Off | N/A |
| 29% 26C P8 2W / 250W | 0MiB / 10989MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce RTX 208... On | 00000000:83:00.0 Off | N/A |
| 29% 26C P8 12W / 250W | 0MiB / 10989MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I know that I have to label the node so K8s recognises these GPUs, but I can't find the correct labels in the official documentation. In the docs I just see this:
# Label your nodes with the accelerator type they have.
kubectl label nodes <node-with-k80> accelerator=nvidia-tesla-k80
While in another tutorial (specific to Alibaba Cloud, judging by the aliyun prefix) I found this:
aliyun.accelerator/nvidia_count=1 #This field is important.
aliyun.accelerator/nvidia_mem=12209MiB
aliyun.accelerator/nvidia_name=Tesla-M40
So what is the proper way to label my node? Do I need to also label it with the number and memory size of GPUs?
I see you are trying to make sure that your pod gets scheduled on a node with GPUs.
The easiest way to do that would be to label a GPU node like this:
kubectl label node <node_name> has_gpu=true
and then, when creating your pod, add a nodeSelector field with has_gpu: true. This way the pod will be scheduled only on nodes with GPUs (see the sketch below). Read more in the k8s docs on assigning pods to nodes.
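A minimal sketch of such a pod spec, assuming the has_gpu=true label from above (the pod name, container name, and image are placeholders of my own choosing):

apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo                # hypothetical name
spec:
  nodeSelector:
    has_gpu: "true"             # must match the node label; label values are strings
  containers:
  - name: gpu-demo-ctr          # hypothetical name
    image: nvidia/cuda          # placeholder image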
The only problem with this approach is that the scheduler is not aware of how many GPUs the node has, so it can happily schedule more than 4 GPU pods onto a node that only has 4 GPUs.
A better option would be to use a node extended resource.
It works as follows. First, start a proxy to the API server:
kubectl proxy
Then patch the node's status to advertise the new resource:
curl --header "Content-Type: application/json-patch+json" \
--request PATCH \
--data '[{"op": "add", "path": "/status/capacity/example.com~1gpu", "value": "4"}]' \
http://localhost:8001/api/v1/nodes/<your-node-name>/status
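To check that the patch took effect, describe the node and look for the new resource under Capacity and Allocatable (assuming the patch above succeeded):

kubectl describe node <your-node-name>

You should see example.com/gpu: 4 listed in both sections.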
Finally, request the extended resource in your pod spec:
apiVersion: v1
kind: Pod
metadata:
  name: extended-resource-demo
spec:
  containers:
  - name: extended-resource-demo-ctr
    image: <your_image>          # a container image, not the pod name
    resources:
      requests:
        example.com/gpu: 1       # extended resources must be whole integers
      limits:
        example.com/gpu: 1
In this case the scheduler knows how many GPUs are available on the node and won't schedule more pods than it can satisfy requests for.
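And if you later want to stop advertising the resource, the same k8s docs show a matching remove patch; a sketch, reusing the placeholder node name from above:

curl --header "Content-Type: application/json-patch+json" \
--request PATCH \
--data '[{"op": "remove", "path": "/status/capacity/example.com~1gpu"}]' \
http://localhost:8001/api/v1/nodes/<your-node-name>/status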