I have been building a distributed load testing application using Kubernetes and Locust (similar to this).
I currently have a multi-node cluster running on bare-metal (running on an Ubuntu 18.04 server, set up using Kubeadm, and with Flannel as my pod networking addon).
The architecture of my cluster is as follows:
As of now, I don't believe that is happening. My cluster shows all of my deployments as healthy and running, but I am unable to access the logs of any of my slave instances running on nodes other than my master node. This leads me to believe that my pods are unable to communicate with each other across different nodes.
Is this an issue with my current networking or deployment setup (I followed the linked guides pretty much verbatim)? Where should I start in debugging this issue?
Based on your description of the problem, I would guess that you have a connectivity problem caused by a firewall or network misconfiguration.
From the network perspective, there are requirements stated in the Kubernetes documentation:
- all containers can communicate with all other containers without NAT
- all nodes can communicate with all containers (and vice-versa) without NAT
- the IP that a container sees itself as is the same IP that others see it as
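A quick way to check the first two requirements is to run a throwaway pod and ping another pod's IP across nodes. This is a sketch assuming `kubectl` is configured for your cluster and the `busybox` image is pullable; the pod name and the IP shown are placeholders:

```shell
# Start a throwaway busybox pod (the name test-a is arbitrary).
kubectl run test-a --image=busybox --restart=Never -- sleep 3600

# List pods with their IPs and the nodes they landed on.
kubectl get pods -o wide

# From test-a, ping the IP of one of your slave pods that is scheduled
# on a DIFFERENT node (replace 10.244.2.5 with the IP from the output above).
kubectl exec test-a -- ping -c 3 10.244.2.5

# Clean up.
kubectl delete pod test-a
```

If the ping to a pod on another node times out while pings to pods on the same node succeed, that points directly at the inter-node (Flannel/firewall) layer rather than at your deployments.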
From the firewall perspective, you need to ensure the cluster traffic can pass the firewall on the nodes.
Here is the list of ports you should have open on the nodes, as provided by the CoreOS website:

Master node inbound:
- TCP 443 from Worker Nodes, API Requests, and End-Users
- UDP 8285, 8472 from Master & Worker Nodes

Worker node inbound:
- TCP 10250 from Master Nodes
- TCP 10255 from Heapster
- TCP 30000-32767 from External Application Consumers
- TCP 1-32767 from Master & Worker Nodes
- TCP 179 from Worker Nodes
- UDP 8472 from Master & Worker Nodes
- UDP 179 from Worker Nodes

Etcd node inbound:
- TCP 2379-2380 from Master & Worker Nodes
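Since you are on Ubuntu 18.04, here is a sketch of opening the relevant ports with `ufw` (adjust if you manage iptables directly; note that a kubeadm-built cluster serves the API on 6443 by default, while the CoreOS list above assumes 443):

```shell
# On the master node:
sudo ufw allow 6443/tcp         # kube-apiserver (kubeadm default; 443 per the CoreOS list)
sudo ufw allow 2379:2380/tcp    # etcd (when it runs on the master, as with kubeadm)
sudo ufw allow 8472/udp         # Flannel VXLAN backend
sudo ufw allow 8285/udp         # Flannel UDP backend (only if you use it)

# On each worker node:
sudo ufw allow 10250/tcp        # kubelet API
sudo ufw allow 30000:32767/tcp  # NodePort services
sudo ufw allow 8472/udp         # Flannel VXLAN backend
```

With Flannel's default VXLAN backend, UDP 8472 being blocked between nodes is the most common cause of exactly the symptom you describe: pods healthy, but no cross-node pod traffic.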
Check how the slave instances try to join the master instance. You have to create a master Service (with labels) so the slaves can reach the master pod. Also, make sure your SDN is up and the master is reachable from the slave instances. You can test this by running telnet to the master pod IP from a slave instance.
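As a sketch, assuming your master Deployment is named `locust-master` (an assumption; substitute your actual names), you can expose it and test reachability like this. Port 5557 is Locust's default master/slave communication port:

```shell
# Expose the master deployment as a ClusterIP service so slaves can discover it.
kubectl expose deployment locust-master --name=locust-master \
  --port=5557 --target-port=5557

# Find the master pod IP and a slave pod name.
kubectl get pods -o wide

# From a slave pod, test that the master pod is reachable on 5557
# (replace the placeholders; use nc if telnet is absent in the image).
kubectl exec <slave-pod> -- telnet <master-pod-ip> 5557
```

If the service name resolves but the pod IP does not connect, the problem is pod networking; if neither works, check the service selector labels first.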
Check that IP forwarding is enabled on all the nodes:
# sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
If it is not, enable it like this and test again:
# echo 1 > /proc/sys/net/ipv4/ip_forward
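Note that the echo above does not survive a reboot. To apply the setting immediately and persist it, use sysctl configuration (the drop-in file name here is just a convention):

```shell
# Apply immediately.
sudo sysctl -w net.ipv4.ip_forward=1

# Persist across reboots via a sysctl drop-in file, then reload.
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf
sudo sysctl --system
```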