Kubernetes not load balancing across nodes in the cluster

7/27/2021

I set up a 4-node Kubernetes cluster by following the guide found here: https://www.tecmint.com/install-a-kubernetes-cluster-on-centos-8/

It has one master and 3 worker nodes.

I'm running a deployment called "hello-world" based on the bashofmann/rancher-demo image, with 20 replicas. I've also created a NodePort service called hello-world that maps node port 30213 to port 8080 on each respective pod.
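
For reference, the two manifests look roughly like this. I'm recreating them here for illustration; the app: hello-world label and the container name are guesses, not copied verbatim from my actual files:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 20
  selector:
    matchLabels:
      app: hello-world   # illustrative label; use whatever your deployment actually sets
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: rancher-demo       # container name is a placeholder
        image: bashofmann/rancher-demo
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: NodePort
  selector:
    app: hello-world
  ports:
  - port: 8080         # matches the 8080:30213/TCP shown in kubectl get all below
    targetPort: 8080
    nodePort: 30213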

See below for the basic details:

# kubectl get all
NAME                               READY   STATUS    RESTARTS   AGE
pod/hello-world-655b948488-22dq4   1/1     Running   0          112m
pod/hello-world-655b948488-2fd7f   1/1     Running   0          112m
pod/hello-world-655b948488-2hrtw   1/1     Running   0          112m
pod/hello-world-655b948488-5h4ns   1/1     Running   0          112m
pod/hello-world-655b948488-5zg9w   1/1     Running   0          112m
pod/hello-world-655b948488-7kcsp   1/1     Running   0          112m
pod/hello-world-655b948488-c5m67   1/1     Running   0          112m
pod/hello-world-655b948488-dswcv   1/1     Running   0          112m
pod/hello-world-655b948488-fbtx6   1/1     Running   0          112m
pod/hello-world-655b948488-g7bxp   1/1     Running   0          112m
pod/hello-world-655b948488-gfb4v   1/1     Running   0          112m
pod/hello-world-655b948488-j6lz9   1/1     Running   0          112m
pod/hello-world-655b948488-jthnq   1/1     Running   0          112m
pod/hello-world-655b948488-pm5b8   1/1     Running   0          112m
pod/hello-world-655b948488-qt7gs   1/1     Running   0          112m
pod/hello-world-655b948488-s2hjv   1/1     Running   0          112m
pod/hello-world-655b948488-vcjzz   1/1     Running   0          112m
pod/hello-world-655b948488-vprgn   1/1     Running   0          112m
pod/hello-world-655b948488-x4b9n   1/1     Running   0          112m
pod/hello-world-655b948488-ztfh7   1/1     Running   0          112m

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
service/hello-world   NodePort    10.110.212.243   <none>        8080:30213/TCP   114m
service/kubernetes    ClusterIP   10.96.0.1        <none>        443/TCP          2d2h

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hello-world   20/20   20           20          112m

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/hello-world-655b948488   20        20        20      112m

# kubectl get nodes -o wide
NAME          STATUS   ROLES                  AGE    VERSION   INTERNAL-IP       EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION                 CONTAINER-RUNTIME
k8s-master    Ready    control-plane,master   2d2h   v1.21.3   192.168.188.190   <none>        CentOS Linux 8   4.18.0-305.10.2.el8_4.x86_64   docker://20.10.7
k8s-worker1   Ready    <none>                 2d2h   v1.21.3   192.168.188.191   <none>        CentOS Linux 8   4.18.0-305.10.2.el8_4.x86_64   docker://20.10.7
k8s-worker2   Ready    <none>                 2d2h   v1.21.3   192.168.188.192   <none>        CentOS Linux 8   4.18.0-305.10.2.el8_4.x86_64   docker://20.10.7
k8s-worker3   Ready    <none>                 2d2h   v1.21.3   192.168.188.193   <none>        CentOS Linux 8   4.18.0-305.10.2.el8_4.x86_64   docker://20.10.7

I've discovered that the cluster is not load balancing across the three worker nodes. If I open my web browser and go to http://192.168.188.191:30213 then it'll load the website but only when served up by pods on k8s-worker1. Likewise, if I go to http://192.168.188.192:30213 then it'll load the website but only when served up by pods on k8s-worker2.

The neat thing about this particular container image is that it displays which pod is serving the request at any given time. The page refreshes and cycles through the available pods in the cluster. Any time the page successfully refreshes, it's only being served by a pod on k8s-worker1. It also displays how many replicas are present: I should be seeing 20, but I see at most 8.

It'll never load the website from pods on any of the other worker nodes. From the k8s-master I can issue "curl http://192.168.188.191:30213" and get a response about 33% of the time; the rest of the time it fails. I believe this is because the request is being load-balanced to pods on the other worker nodes, and those requests fail.
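
For anyone wanting to reproduce this, here's roughly the loop I've been using from the master to hit all three workers (the -m 2 timeout is arbitrary; 000 means the request timed out):

# for ip in 192.168.188.191 192.168.188.192 192.168.188.193; do curl -s -m 2 -o /dev/null -w "$ip => %{http_code}\n" "http://$ip:30213"; done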

Since I'm still pretty new to this stuff, I'm not sure what to look at. Is it possible there's something wrong with the replicaset?

Each worker node has the following firewall rules opened up:

# firewall-cmd --list-ports
6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp 6783/tcp 6783/udp 6784/udp 443/tcp

Are there more ports I need to open up? Am I supposed to open up the entire NodePort range, 30000-32767? That seems like a possible security risk.
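
If the whole NodePort range does need to be open, I assume I'd run something like this on each worker (30000-32767 is the Kubernetes default NodePort range):

# firewall-cmd --permanent --add-port=30000-32767/tcp
# firewall-cmd --reload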

-- dutsnekcirf
kubernetes
kubernetes-pod

1 Answer

7/27/2021

The behaviour you are facing is probably caused by the type of Service you created to load balance your pods: you used type NodePort.

According to the official Kubernetes documentation, a Service of type NodePort exposes the Service on each node's IP at a static port (in your case 30213). In this scenario, by making a request to a specific node, you consistently reach that node and see only the pods scheduled on it; in your example, 8 (note that this number may vary depending on how the pods are distributed).
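
You can check how the 20 replicas are actually distributed across the workers with something like:

# kubectl get pods -o wide --sort-by=.spec.nodeName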

If you want to load balance across all your pods, you should use either a Service of type LoadBalancer, or a ClusterIP Service plus an Ingress.
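
As a sketch, the LoadBalancer variant of your Service would look roughly like this (assuming your pods carry an app: hello-world label; adjust the selector to match your deployment):

apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  type: LoadBalancer
  selector:
    app: hello-world   # must match your pod labels
  ports:
  - port: 80           # port exposed on the external IP
    targetPort: 8080   # port the rancher-demo container listens on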

Please note that both of the options I mentioned require the cluster to be able to provision an external IP that is load balanced "outside" of Kubernetes' scope. If you are using a managed Kubernetes offering (e.g. GKE, EKS, AKS) you get this "for free". Since you are running a self-managed cluster, you can have a look at the MetalLB project for the load balancer part.
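
For illustration, a minimal MetalLB layer 2 configuration (in MetalLB's ConfigMap-based versions) just reserves a pool of free IPs on your LAN for it to assign to LoadBalancer Services. The address range below is only an example; choose addresses that are unused on your 192.168.188.0/24 network:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.188.200-192.168.188.210   # example range; must be unused IPs on your LAN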

-- Giovanni Patruno
Source: StackOverflow