If I create 3 nodes in a cluster, how do I distribute the docker containers evenly across the nodes? For example, with a cluster of 3 nodes with 8 cpus each, I've determined through performance profiling that I get the best performance when I run one container per cpu.
gcloud container clusters create mycluster --num-nodes 3 --machine-type n1-standard-8
kubectl run myapp --image=gcr.io/myproject/myapp -r 24
When I ran kubectl run above, it put 11 containers on the first node, 10 on the second, and 3 on the third. How do I make it so that there are 8 on each?
In my case, I want to put a fixed number of containers on each node. I am able to do this by labeling each node and then using a nodeSelector in the config. Ignore the fact that I mislabeled the 3rd node; here is my setup:
kubectl label nodes gke-n3c8-7d9f8163-node-dol5 node=1
kubectl label nodes gke-n3c8-7d9f8163-node-hmbh node=2
kubectl label nodes gke-n3c8-7d9f8163-node-kdc4 node=3
That can be automated by doing:
kubectl get nodes --no-headers | awk '{print NR " " $1}' | xargs -l bash -c 'kubectl label nodes $1 node=$0'
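(awk prefixes each node name with its row number; xargs -l then runs bash -c with that pair, so the row number lands in $0 and the node name in $1, giving each node a node=<row number> label.)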
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 8
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      nodeSelector:
        node: "1"
      containers:
      - name: nginx
        image: nginx
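With this approach you need one RC per node label, each with its own name and pod label so the RCs don't compete for the same pods. A rough sketch of generating them from the manifest above, assuming it is saved as nginx-rc.yaml (the file name is made up):

# Stamp out one RC per node label from the template above.
# The RC name, the app label/selector, and the nodeSelector are varied per node.
for i in 1 2 3; do
  sed -e "s/name: nginx/name: nginx-$i/" \
      -e "s/app: nginx/app: nginx-$i/" \
      -e "s/node: \"1\"/node: \"$i\"/" nginx-rc.yaml \
    | kubectl create -f -
done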
If I'm not mistaken, what you see is the expected behavior. If you want finer-grained control over pod placement, you probably want a custom scheduler.
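For what it's worth, on newer clusters a pod can opt into a custom scheduler by name; the default scheduler then leaves the pod alone and your scheduler is responsible for binding it to a node. A minimal sketch (the scheduler name here is made up):

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  schedulerName: my-custom-scheduler   # hypothetical; must match the name your scheduler registers
  containers:
  - name: myapp
    image: myregistry/myapp:v1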
Both your and jpapejr's solutions seem like they'd work, but using a nodeSelector to force scheduling to a single node has the downside of requiring multiple RCs for a single application and making that application less resilient to a node failure. The idea of a custom scheduler is nice, but it has the downside of the work required to write and maintain that code.
I think another possible solution would be to set runtime constraints in your pod spec that might get you near to what you want. Based on this newly merged doc with examples of runtime constraints, I think you could set resources.requests.cpu in the pod spec part of the RC and get close to a CPU per pod:
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: myapp
    image: myregistry/myapp:v1
    resources:
      requests:
        cpu: "1000m"
That doc has other good examples of how requests and limits differ and interact. There may be a combination that gives you what you want and also keeps your application at proper capacity when an individual node fails.
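For instance, a rough sketch of pairing the two in the container spec (the values are illustrative): the request reserves capacity at scheduling time, while the limit caps actual usage:

    resources:
      requests:
        cpu: "1000m"   # scheduler sets aside one CPU's worth of capacity for this container
      limits:
        cpu: "1000m"   # container is throttled if it tries to use more than one CPU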