Here is my configuration:
Hosts/Nodes IP addresses: Primary=10.0.0.12, Worker1=10.0.0.16, Worker2=10.0.0.20
serviceAndDeployment.yaml file:
apiVersion: v1
kind: Service
metadata:
  name: my-webserver-service
  labels:
    app: nginx
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 80
      nodePort: 30080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: nginx
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
No problem with applying/deploying it:
kubectl apply -f serviceAndDeployment.yaml
kubectl get pods
Everything looks good so far; there is a pod on each node.
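(For reference, I'm confirming the spread with the wide output, which shows one nginx pod on Worker1 (10.0.0.16) and one on Worker2 (10.0.0.20), and none on the primary.)
kubectl get pods -o wide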
Now, I log onto one of the nodes (10.0.0.20):
ss -tln
LISTEN 0 128 *:30080 *:*
...just the way we like it.
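(To check which process is actually holding that socket, the same ss call with process info works; I assume it's kube-proxy, but I haven't verified whether that depends on the proxy mode.)
sudo ss -tlnp | grep 30080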
curl localhost:30080
Perfect, we get "Welcome to nginx!"
BUT now do the same thing again:
curl localhost:30080
And it's "curl: (7) Failed connect to localhost:30080; Network is unreachable"
Doing the same thing in rapid succession randomly (yes, randomly) gives one of three things: the response we want, "Network is unreachable", or a complete hang until Ctrl-C.
Doing the same thing from any other host on the network, against either worker, i.e.
curl 10.0.0.20:30080 OR curl 10.0.0.16:30080 FROM 10.0.0.XXX (any host, whether the primary or another)
gives the exact same result.
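To make the randomness easier to see, a small loop like this reproduces it (the 2-second timeout is arbitrary, just so the hangs don't stall the loop):
for i in $(seq 1 20); do
  # print only the HTTP status code, or 000 on failure/timeout
  curl -s -o /dev/null -w "%{http_code}\n" --max-time 2 10.0.0.20:30080
done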
sudo tcpdump -nm --number -i ens160 | grep 10.0.0.14 (run on 10.0.0.20; 10.0.0.14 is the client host making the requests)
On the hang, I see this:
09:47:04.844525 IP 10.0.0.14.52500 > 10.0.0.20.30080: Flags [S], seq 77004116, win 29200, options [mss 1460,sackOK,TS val 664329183 ecr 0,nop,wscale 7], length 0
On "Network unreachable", I see this:
09:56:00.608617 IP 10.0.0.14.52504 > 10.0.0.20.30080: Flags [S], seq 2968934283, win 29200, options [mss 1460,sackOK,TS val 664864943 ecr 0,nop,wscale 7], length 0
09:56:00.611607 IP 10.0.0.20 > 10.0.0.14: ICMP net 10.0.0.20 unreachable, length 36
But I can ping 10.0.0.14 from 10.0.0.20 just fine.
I don't see any other traffic between those two without making a request, except the occasional ARP request.
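For what it's worth, this is what I plan to check next on 10.0.0.20 to see where kube-proxy is sending the traffic (these assume kube-proxy in iptables mode; with IPVS the rules would look different):
sudo iptables-save -t nat | grep 30080                   # kube-proxy NodePort rules
sudo iptables-save -t nat | grep my-webserver-service    # Service/DNAT rules
sudo conntrack -L 2>/dev/null | grep 30080               # connection-tracking entries for the port
ip route get 10.0.0.14                                   # route back to the client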
I have disabled firewalld with:
sudo systemctl disable --now firewalld
but the behavior is unchanged.
Changing the externalTrafficPolicy to Local does "work", in that I get the response I want every time, but now there is no load balancing. I already have HAProxy running as a reverse proxy & load balancer on another host, but, for some reason, it doesn't work with this NodePort as the backend. That is, I can curl 10.0.0.20:30080 from the host that is running HAProxy successfully, but HAProxy doesn't see it as a valid backend.
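(For reference, the curl from the HAProxy host is just a plain probe, and the policy change can be made either in the YAML or with a patch like the one below; the Service name matches my manifest above.)
kubectl patch svc my-webserver-service -p '{"spec":{"externalTrafficPolicy":"Local"}}'
# run from the HAProxy host; roughly what a plain HTTP health check would do
curl -sv --max-time 2 http://10.0.0.20:30080/ -o /dev/null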
Why does the Service have port 30080 open and listening on my control node (which does not have an app pod running on it), if that primary node doesn't then act as a load balancer?
Why doesn't HAProxy work with the NodePort as the backend?
What is going on here?
Thank you.