Numpy mmap and kubernetes RAM limits

11/14/2018

I'm trying to load a large (~40 GB) file with np.load(path, mmap_mode='r') on a Kubernetes cluster that allows very high short-term memory use but much lower sustained use. I have 24 GB of "requested" memory with a much higher limit. np.load() sees that memory is available, so it effectively pulls the entire file into RAM (my understanding is that numpy uses the memory mapping only to the extent it needs to, given available memory). Because the pod dramatically exceeds its requested memory, it gets killed when other pods request memory. If I set the memory limit to something low, the pod is OOM-killed instead, because mmap again doesn't detect the limitation and plows right past the ceiling. Is there any way around this without a lot of memory micromanagement?
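
For context, here is a minimal sketch of the access pattern I mean. The path and chunk size are placeholders (a small demo file stands in for the real ~40 GB one); np.load with mmap_mode='r' returns an np.memmap whose pages are faulted in lazily as they are touched, but once touched they sit in the kernel page cache, which (as I understand it) is charged against the container's memory accounting:

```python
import numpy as np

# Demo stand-in: write a small .npy file so the sketch is runnable;
# in the real case this would be the existing ~40 GB file.
path = "demo_array.npy"
np.save(path, np.arange(1_000_000, dtype=np.float64).reshape(1000, 1000))

# mmap_mode='r' returns an np.memmap backed by the file; np.load does
# not eagerly read the data -- pages are faulted in on access.
arr = np.load(path, mmap_mode="r")

# Process in bounded row chunks so only a small window is ever
# materialized as a real in-process array. The kernel page cache may
# still retain pages it has read, which is where the pod's memory
# accounting blows past the request.
chunk_rows = 100
total = 0.0
for start in range(0, arr.shape[0], chunk_rows):
    chunk = np.asarray(arr[start:start + chunk_rows])  # explicit bounded copy
    total += float(chunk.sum())

print(total)
```

Even with chunked access like this, the resident set plus page cache grows toward the file size unless pages are dropped, which is the behavior I'm asking how to bound.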

-- Hugh Runyan
kubernetes
machine-learning
numpy
python

0 Answers