We have a node pool without GPU/CUDA support. Our application, built on the NVIDIA CUDA/TensorFlow image nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04, worked at first, but some time after deployment CUDA stopped responding inside the container. FYI: we disable CUDA with CUDA_VISIBLE_DEVICES=-1, and the same setup works on a local Linux host. We get this message:
tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
Can anybody help? What could have happened?
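For reference, here is a minimal sketch of how the variable can be set and checked from Python, assuming it is applied in code rather than in the pod spec. The helper names `hide_gpus` and `cuda_hidden` are hypothetical and only illustrate the idea. The variable has to be set before TensorFlow is imported.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from this process.

    Hypothetical helper: call it before importing TensorFlow,
    otherwise the setting may be read too late to take effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return os.environ["CUDA_VISIBLE_DEVICES"]


def cuda_hidden(environ=None):
    """Return True if CUDA_VISIBLE_DEVICES is set to -1 in the given environment."""
    env = os.environ if environ is None else environ
    return env.get("CUDA_VISIBLE_DEVICES") == "-1"


# Intended order of operations inside the container:
hide_gpus()
print("CUDA hidden:", cuda_hidden())
# import tensorflow as tf  # import only after the variable is set
```

Running the check inside the deployed container should show whether the variable actually reaches the process there, as it does on the local host.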
Thanks!