distribute docker containers evenly with kubectl

10/8/2015

If I create 3 nodes in a cluster, how do I distribute the docker containers evenly across the nodes? For example, if I create a cluster of 3 nodes with 8 CPUs on each node, I've determined through performance profiling that I get the best performance when I run one container per CPU.

gcloud container clusters create mycluster --num-nodes 3 --machine-type n1-standard-8

kubectl run myapp --image=gcr.io/myproject/myapp -r 24

When I ran kubectl run above, it put 11 containers on the first node, 10 on the second, and 3 on the third. How do I make it so that there are 8 on each node?

-- Cameron Taggart
docker
kubernetes

3 Answers

10/8/2015

In my case, I want to put a fixed number of containers on each node. I am able to do this by labeling each node and then using a nodeSelector in the ReplicationController config. Ignoring the fact that I mislabeled the 3rd node, here is my setup:

kubectl label nodes gke-n3c8-7d9f8163-node-dol5 node=1
kubectl label nodes gke-n3c8-7d9f8163-node-hmbh node=2
kubectl label nodes gke-n3c8-7d9f8163-node-kdc4 node=3

That can be automated with:

kubectl get nodes --no-headers | awk '{print NR " " $1}' | xargs -l bash -c 'kubectl label nodes $1 node=$0'
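
To sanity-check the result afterwards (just an optional verification step, not part of the original setup), the labels can be listed directly:

kubectl get nodes --show-labels
kubectl get nodes -L node     # show the "node" label as its own column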


apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 8
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        node: "1"
      containers:
        - name: nginx
          image: nginx
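
For completeness, a rough sketch of applying it (the filename is hypothetical, and you would need one such RC per labeled node):

kubectl create -f nginx-node1.yaml
kubectl get pods -o wide    # the NODE column shows where each pod landed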
-- Cameron Taggart
Source: StackOverflow

10/8/2015

If I'm not mistaken, what you see is expected behavior. If you want finer-grained control over pod placement, you probably want a custom scheduler.

-- jpapejr
Source: StackOverflow

10/8/2015

Both your and jpapejr's solutions seem like they'd work, but using a nodeSelector to force scheduling to a single node has the downside of requiring multiple RCs for a single application and making that application less resilient to a node failure. The idea of a custom scheduler is nice, but it has the downside of the work required to write and maintain that code.

I think another possible solution would be to set runtime constraints in your pod spec that might get you near to what you want. Based on this newly merged doc with examples of runtime constraints, I think you could set resources.requests.cpu in the pod spec part of the RC and get close to a CPU-per-pod:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: myregistry/myapp:v1
      resources:
        requests:
          cpu: "1000m"

That doc has other good examples of how requests and limits differ and interact. There may be a combination that gives you what you want and also keeps your application at proper capacity when an individual node fails.
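
Roughly speaking (the values below are only illustrative): requests are what the scheduler reserves when placing a pod, while limits cap what the running container may actually consume, so a container spec can carry both:

    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1
          resources:
            requests:
              cpu: "500m"     # reserved at scheduling time
            limits:
              cpu: "1000m"    # runtime ceiling for the container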

-- rwehner
Source: StackOverflow