Google cloud out of memory handling large tensors, docker limit?

3/21/2017

I am training and evaluating large neural network models (TensorFlow) on Google Cloud.

I got the following error when evaluating my model:

W  Resource exhausted: OOM when allocating tensor with shape[38633472,17] 
W  Ran out of memory trying to allocate 2.45GiB.  See logs for memory state. 

I suspect this has to do with the container's memory limit.
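For reference, the reported shape alone accounts for the 2.45 GiB. A quick back-of-the-envelope sketch, assuming float32 tensors (4 bytes per element):

```python
# Estimate the memory footprint of a tensor with the shape from the log,
# assuming 4-byte float32 elements (an assumption; the log does not state dtype).
shape = (38633472, 17)
bytes_per_elem = 4  # float32

n_elems = shape[0] * shape[1]
total_bytes = n_elems * bytes_per_elem
print(f"{total_bytes / 2**30:.2f} GiB")  # prints "2.45 GiB"
```

So a single allocation of that tensor needs ~2.45 GiB on top of everything else already resident, which can exceed the memory available to the worker.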

Any help on that?

-- guyov
google-cloud-ml
google-kubernetes-engine
tensorflow

1 Answer

3/22/2017

Which scale tier and machine type were you using? In case of OOM, you can try using a larger machine in the CUSTOM tier.
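As a sketch of what that might look like, assuming a Cloud ML Engine training job submitted with a config file (the machine type name `large_model` is the higher-memory option; check the current machine-type list for your region):

```yaml
# config.yaml -- request a single higher-memory machine via the CUSTOM tier
trainingInput:
  scaleTier: CUSTOM
  masterType: large_model
```

You would then pass it at submission time, e.g. `gcloud ml-engine jobs submit training my_job --config config.yaml ...` (job name and remaining flags are placeholders for your own values).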

If you still have the issue, please send your project number and job ID to cloudml-feedback@google.com so that we can take a closer look.

-- Guoqing Xu
Source: StackOverflow