Our GKE container has a problem starting a Python app that uses CUDA/TensorFlow

11/14/2019

We have a node pool without GPU/CUDA support. Our application (based on the NVIDIA/CUDA/TensorFlow image nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04) worked before, but some time after deployment CUDA stops responding inside the container. FYI: we disable CUDA with CUDA_VISIBLE_DEVICES=-1, and the same setup works on a local Linux host. What could be happening? We get this message: tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
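Disabling CUDA this way can be sketched as follows. This is only an illustration, assuming the variable is set from Python before TensorFlow is first imported; `disable_cuda` is a hypothetical helper name, and the variable could equally be set in the pod spec or the Dockerfile.

```python
import os


def disable_cuda():
    """Hide all CUDA devices from this process (hypothetical helper)."""
    # Assumption: this runs before TensorFlow is imported for the first
    # time, so the variable is already set when the CUDA driver is probed.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]


disable_cuda()
# import tensorflow as tf  # import only after the variable has been set
```

With this ordering TensorFlow should fall back to the CPU, which is what we see on the local host.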

Can anybody help me understand what could have happened?

Thanks!

-- Andrey Shishka
google-kubernetes-engine
python

0 Answers