GPU resource limit

10/9/2019

I’m having some trouble limiting my Pod’s access to the GPUs available on my cluster.

Here is my .yaml:

apiVersion: v1
kind: Pod
metadata:
  name: train-gpu
spec:
  containers:
  - name: train-gpu
    image: index.docker.io/myprivaterepository/train:latest
    command: ["sleep"]
    args: ["100000"]
    resources:
      limits:
        nvidia.com/gpu: 1 # requesting 1 GPU

When I run the nvidia-smi command inside of this pod, all of the GPUs show up rather than just the one I requested.
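For reference, this is roughly how I’m checking (pod name as in the spec above; the exact invocation is just an illustration):

kubectl exec train-gpu -- nvidia-smi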

Any advice would be much appreciated.

Some information that may be useful:

Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"d647ddbd755faf07169599a625faf302ffc34458", GitTreeState:"clean", BuildDate:"2019-10-07T14:30:40Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:50Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

Docker base image:
FROM nvidia/cuda:10.1-base-ubuntu18.04
-- Kareem Kudus
kubernetes

0 Answers