I'm testing Kubernetes with the intention of being able to run batch jobs in a queue. I've created a resourcequota with
$ kubectl create quota memoryquota --hard=memory=450Mi
which limits the total memory of all containers in the namespace to 450Mi. I also have a script, run-memhog.sh,
that creates a memhog job with a memory limit of X Mi that uses Y MB of memory:
kubectl run memhog-$(cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 8 | head -n 1) \
  --replicas=1 --restart=OnFailure --limits=memory=${1}Mi,cpu=100m --record \
  --image=derekwaynecarr/memhog --command -- memhog -r100 ${2}m
Running $ for i in {1..4}; do ./run-memhog.sh 200 100; done
correctly creates four jobs: two complete in around 20 seconds, and the other two, as expected, get a FailedCreate
warning with the message
Error creating: pods "memhog-plgxke9m-" is forbidden: exceeded quota: memoryquota, requested: memory=200Mi, used: memory=400Mi, limited: memory=450Mi
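To spell out the arithmetic behind that message, here is a minimal sketch of the admission check as I understand it. It is not Kubernetes code, just the same numbers in plain shell. The variable names are made up for illustration, and it assumes the quota is charged with the 200Mi each pod declares rather than the 100m that memhog actually touches, which is what the "requested: memory=200Mi" part of the error suggests.

```shell
#!/bin/sh
# Illustration only: mimic the quota check with the numbers from this question.
quota_mi=450   # from --hard=memory=450Mi
limit_mi=200   # per-pod memory limit passed to run-memhog.sh (assumed to be what is charged)

admitted=0
used_mi=0
for i in 1 2 3 4; do
  # A pod is admitted only if its declared memory still fits under the quota.
  if [ $((used_mi + limit_mi)) -le "$quota_mi" ]; then
    used_mi=$((used_mi + limit_mi))
    admitted=$((admitted + 1))
  else
    echo "pod $i rejected: requested=${limit_mi}Mi used=${used_mi}Mi limited=${quota_mi}Mi"
  fi
done
echo "admitted=$admitted used=${used_mi}Mi"
```

With these numbers only two pods fit (400Mi of 450Mi), which matches the two jobs that ran and the two that were rejected.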
Running $ kubectl get jobs
shows an expected outcome:
NAME              DESIRED   SUCCESSFUL   AGE
memhog-2covdiww   1         0            35s
memhog-6bg0b6g6   1         1            35s
memhog-plgxke9m   1         0            35s
memhog-w2ujbg1b   1         1            35s
Everything's OK so far, and I expect the two uncompleted jobs to start running as soon as resources become available (that is, after the previous pods/containers are cleared). However, the jobs stay pending for an indeterminate time: when I checked after two hours they still hadn't started, so I left the server running overnight, and the jobs completed at some point during the night.
My question is: what is causing the jobs to stay pending for such a long time? Is there any way I can poll for resource availability more frequently? I searched through both the kubectl reference and the Kubernetes docs, but didn't find any mention of a fix or setting for this.
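For anyone trying to reproduce this, a minimal sketch of how the quota could be watched from the outside while the jobs are pending. The function names quota_used and watch_quota are hypothetical helpers, not part of kubectl. It assumes the quota is called memoryquota as above and that the used amount is exposed under .status.used.memory of the ResourceQuota object. This only observes the quota; it does not make the job controller retry any sooner.

```shell
# Print the memory currently charged against the quota, as reported by the API.
# $1: quota name (defaults to the memoryquota created above).
quota_used() {
  kubectl get resourcequota "${1:-memoryquota}" -o jsonpath='{.status.used.memory}'
}

# Hypothetical helper: print a timestamped reading every N seconds until interrupted.
# $1: quota name, $2: interval in seconds (default 5).
watch_quota() {
  while true; do
    echo "$(date +%T) used=$(quota_used "$1")"
    sleep "${2:-5}"
  done
}
```

Running watch_quota memoryquota 5 next to kubectl get jobs should show whether the used value drops once the first two pods are cleaned up, which would narrow down whether the delay is in releasing the quota or in retrying the pod creation.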