My cluster has 2 nodes with GPUs and a small master node.
The cluster nodes are running:
OS Image:                   Container-Optimized OS from Google
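
For reference, the OS image line above comes from the node details and can be checked with:

kubectl get nodes -o wide                              # OS-IMAGE column
kubectl describe node <node-name> | grep "OS Image"    # <node-name> is a placeholder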
I've installed the GPU driver installer DaemonSet, and its pods are running:
 kubectl get pods -A
kube-system   nvidia-driver-installer-7rpff                                    1/1     Running   1          3d3h
kube-system   nvidia-driver-installer-97jg5                                    1/1     Running   1          3d3h

I have added taints so that the application runs on the GPU nodes, and this is working correctly.
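
For context, the standard driver installer DaemonSet for Container-Optimized OS nodes (which is roughly what I followed) is applied with:

kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml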
Each node runs a single pod of the application, which should be able to access the GPU.
I don't know why, but only one of the pods is actually able to access its GPU.
The other instance (which is running on its own dedicated node with a GPU) fails with this message:
RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.

LD_LIBRARY_PATH is set correctly (it is identical in both pods and the paths are valid). I can also see the GPU in the hardware listing (lshw) in both pods:
H/W path    Device  Class          Description
==============================================
                    system         Computer
/0                  bus            Motherboard
...
/0/100/4            display        TU104GL [Tesla T4]   <<<<<<<<<<<<<<<

My Helm values.yaml is:
replicaCount: 2
image:
  repository: ""
  pullPolicy: Always
  tag: ""
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
podAnnotations: {}
podSecurityContext: {}
  # fsGroup: 2000
securityContext: {}
  # capabilities:
  #   drop:
  #   - ALL
  # readOnlyRootFilesystem: true
  # runAsNonRoot: true
  # runAsUser: 1000
service:
  type: NodePort
  port: 3030
ingress:
  enabled: true
  className: ""
  annotations:
    kubernetes.io/ingress.global-static-ip-name: xxx
  hosts:
    - host: staging-xxxx.com
      paths:
        - path: /
          pathType: Prefix
resources:
  # We usually recommend not to specify default resources and to leave this as a conscious
  # choice for the user. This also increases chances charts run on environments with little
  # resources, such as Minikube. If you do want to specify resources, uncomment the following
  # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
  limits:
    cpu: 2
    memory: 16Gi
    nvidia.com/gpu: 1
  requests:
    cpu: 2
    memory: 8Gi
autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80
  # targetMemoryUtilizationPercentage: 80
nodeSelector:
  cloud.google.com/gke-nodepool: xxxx-staging-0-autoscale-0
tolerations: []
affinity: {}
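
In case it helps, these are the checks that can be run to compare the two nodes and pods (the node/pod names below are placeholders, and /usr/local/nvidia is where the GKE installer mounts the driver into GPU pods, as far as I understand):

# Check that both GPU nodes advertise an allocatable nvidia.com/gpu resource
kubectl describe node <gpu-node-name> | grep nvidia.com/gpu

# Compare the device files and driver mount inside the working and failing pods
kubectl exec <pod-name> -- sh -c 'ls -l /dev/nvidia* /usr/local/nvidia/lib64 | head'
kubectl exec <pod-name> -- /usr/local/nvidia/bin/nvidia-smi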