Scheduling more pods than there are resources available

4/6/2017

Say I have 1000 identical pods I'd like to be run eventually, but node resources only allow for 10 pods to be run in parallel.

Each pod removes its own RC (ReplicationController) once it exits cleanly, so given enough time, all pods should get to run.

If I schedule all 1000 pods at the same time, though, 990 of them will be pending initially. The scheduler will keep retrying all 990 pending pods in a busy loop, and any given pod will only be scheduled after one of the 10 running pods is removed.

This busy loop is far from ideal in my situation, as it'll likely take up all of the scheduler's available resources. Is there an alternative that Kubernetes provides natively? It seems clear that scheduling far more pods than the cluster can run at once isn't something Kubernetes optimises for.

-- Gabriel
kubernetes

1 Answer

4/9/2017

This type of workload is better suited for the Job resource.

Since you have a fixed number of pods to run, the easiest way to do this would be to create a Job with .spec.completions set to 1000.

You can then control the number of pods running concurrently through .spec.parallelism. By default this is set to 1, which means only 1 pod will run at a time, but you can set it to a higher value to have the Job finish faster (e.g. 10, since that is the limit that your nodes can handle). The Job controller only creates up to that many pods at once, so you don't end up with hundreds of pending pods waiting on the scheduler.
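As a rough sketch of what such a Job could look like, the snippet below builds a batch/v1 Job manifest with the two fields mentioned above and prints it as JSON. The Job name, the image, and the restartPolicy are placeholders I've assumed for illustration; nothing about them comes from the question, so substitute your own.

```python
import json


def build_job_manifest(name, image, completions, parallelism):
    """Return a batch/v1 Job manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Total number of pods that must finish successfully.
            "completions": completions,
            # Upper bound on pods running at the same time.
            "parallelism": parallelism,
            "template": {
                "spec": {
                    "containers": [{"name": name, "image": image}],
                    # A Job's pod template must use Never or OnFailure;
                    # OnFailure is an assumed choice here.
                    "restartPolicy": "OnFailure",
                }
            },
        },
    }


# Hypothetical name and image, used only for illustration.
manifest = build_job_manifest(
    name="batch-work",
    image="example.com/worker:latest",
    completions=1000,
    parallelism=10,
)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

If you save this as, say, make_job.py (a hypothetical filename), you should be able to submit the result with `python make_job.py | kubectl apply -f -`, since kubectl accepts JSON manifests on stdin. Writing the same structure by hand as a YAML file works just as well.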

-- Pixel Elephant
Source: StackOverflow