Kubernetes nvidia-gpu UnexpectedAdmissionError

3/26/2019

I have a k8s cluster with one master and one worker. The worker has GPUs, and I am able to schedule GPU pods on it.

I added a second worker, also with GPUs, and everything is properly installed on it. However, when I try to schedule a GPU busybox pod on that second worker, the pod ends up with the status UnexpectedAdmissionError.

With kubectl describe pod busybox I see this warning:

Warning  UnexpectedAdmissionError  7m  kubelet, wikiserver  Update plugin resources failed due to requested number of devices unavailable for nvidia.com/gpu. Requested: 1, Available: 0, which is unexpected.

This is odd, because kubectl describe nodes second-worker shows nvidia.com/gpu: 1 under both Capacity and Allocatable.
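
For reference, these are roughly the commands I am running to inspect the pod and the node (a sketch; the grep window is only there to trim the output):

kubectl describe pod busybox
kubectl describe nodes second-worker | grep -A 8 -E 'Capacity|Allocatable'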

I am able to successfully schedule the GPU busybox pod on the first worker, and a CPU-only busybox pod schedules fine on both workers.

This is the YAML config for the GPU busybox pod:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - image: busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
    resources:
      limits:
        nvidia.com/gpu: 1
  restartPolicy: Always
  nodeName: secondworker
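
I pin the pod to the node with nodeName, which bypasses the scheduler and hands the pod directly to the kubelet on that node. I create it with something like this (the file name is simply whatever I saved the spec as):

kubectl apply -f gpu-busybox.yaml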

The second worker (as well as the first worker and the master) is configured as described in the nvidia-device-plugin documentation.
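
Concretely, each GPU node follows the plugin's README: nvidia-docker2 is installed, the NVIDIA runtime is set as the default Docker runtime, and the device plugin DaemonSet is deployed once for the cluster. Roughly (sketched from memory, so the runtime path and the plugin version in the manifest URL may differ):

# /etc/docker/daemon.json - NVIDIA runtime as the default
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

# device plugin DaemonSet, deployed once on the cluster
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml

# check that a device plugin pod is running on every GPU node
kubectl get pods -n kube-system -o wide | grep nvidia-device-plugin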

-- Antonio Jurić
gpu
kubernetes
nvidia

0 Answers