Question about concept on Kubernetes pod assignment to nodes

4/15/2020

I am quite a beginner in Kubernetes and would like to ask about some concepts related to Kubernetes pod assignment.

Suppose there is a deployment to be made with a requirement of 3 replicas.

(1)

Assume that there are 4 nodes, each a different physical server with different CPU and memory.

When the deployment is made, how would Kubernetes assign the pods to the nodes? Could there be a scenario where it puts multiple pods on the same server while another server gets no pods (due to resource considerations)?

(2)

Assume there are 4 nodes (on 4 identical physical servers), and 1 pod is created on each of the 4 nodes.

Suppose that now one of the nodes goes down. How would Kubernetes handle this? Will it recreate the pod on one of the other 3 nodes, based on which one has more available resources?

Thank you for any advice in advance.

-- Patrick C.
kubernetes
kubernetes-pod

2 Answers

4/15/2020
  1. Kubernetes will try to spread pods across multiple nodes for better availability and resiliency, subject to the resources available on each node. So if a node does not have enough capacity to host a pod, it's possible that more than one replica is scheduled onto the same node.

  2. Kubernetes will reschedule pods from the failed node onto another available node that has enough capacity to host them. Again, if there are not enough nodes that can host the replicas, more than one replica may be scheduled on the same node.

You can read more on the scheduling algorithm here.

You can influence the scheduler with node affinity and with pod affinity and anti-affinity.
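As a rough illustration of pod anti-affinity, the Deployment below asks the scheduler to prefer placing the 3 replicas on different nodes. This is only a sketch: the name `web`, the `app: web` label, and the `nginx:1.25` image are placeholders, not anything from the question.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web                   # hypothetical label
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          # "preferred" is a soft rule: the scheduler tries to keep replicas
          # apart, but will still co-locate them if no other node fits.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                # One "topology domain" per node, i.e. spread across hosts.
                topologyKey: kubernetes.io/hostname
      containers:
        - name: web
          image: nginx:1.25      # placeholder image
```

Switching to `requiredDuringSchedulingIgnoredDuringExecution` would make the rule hard, in which case a replica that cannot be placed on its own node stays Pending instead of sharing one.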

-- Arghya Sadhu
Source: StackOverflow

4/15/2020

There's a brief discussion of the Kubernetes Scheduler in the Kubernetes documentation. Generally scheduling is fairly opaque, but you also tend to aim for fairly well-loaded nodes; the important thing from your application's point of view is to set appropriate resource requests in your pod specifications. As long as there's enough room on a node to meet the resource requests, it usually doesn't matter to you which node gets picked.
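For reference, resource requests live under each container in the pod spec. A minimal sketch, with made-up values (the pod name, image, and the 500m / 4Gi figures are assumptions for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example                  # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25          # placeholder image
      resources:
        # The scheduler only places the pod on a node whose unreserved
        # capacity covers these requests.
        requests:
          cpu: "500m"
          memory: "4Gi"
        # Limits cap usage at runtime; they do not drive node selection.
        limits:
          memory: "4Gi"
```

In a Deployment the same `resources` block goes under `spec.template.spec.containers[]`.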

In the scenario you describe, (1) it is possible that two replicas will be placed on the same node and so two nodes will go unused. That's especially true if the nodes aren't identical and they have resource constraints: if your pods require 4 GB of RAM, but you have some nodes that have less than that (after accounting for system pods and daemon set pods), the pods can't get scheduled there.

If a node fails (2), Kubernetes will automatically reschedule the pods running on that node if possible. "Fail" is a broad case, and can include a node being intentionally stopped to be upgraded or replaced. In this latter case you have some control over the cluster's behavior; see Disruptions in the documentation.
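The control over intentional node shutdowns mostly comes from a PodDisruptionBudget. A minimal sketch, assuming the pods carry a hypothetical `app: web` label and that you want at least 2 of the 3 replicas up during a drain:

```yaml
apiVersion: policy/v1            # older clusters used policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                  # hypothetical name
spec:
  # Voluntary evictions (e.g. kubectl drain) are refused if they would
  # leave fewer than this many matching pods available.
  minAvailable: 2
  selector:
    matchLabels:
      app: web                   # hypothetical label
```

This only applies to voluntary disruptions such as draining a node for an upgrade; it cannot prevent pods from being lost when a node crashes unexpectedly.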

Many environments will run a cluster autoscaler. This can cause nodes to come and go automatically: if you try to schedule a pod and it won't fit, the autoscaler will allocate a new node, and if a node's requested resources fall below a utilization threshold (50% by default) and its pods fit elsewhere, the node will be removed and its pods rescheduled. In your first scenario you might start with only one node, but when the pod replicas don't all fit, the autoscaler would create a new node, and once it's available the excess pods could be scheduled there.

-- David Maze
Source: StackOverflow