K8s NodePort service is “unreachable by IP” only on 2/4 slaves in the cluster

8/9/2017

I created a K8s cluster of 5 VMs (1 master and 4 slaves, all running Ubuntu 16.04.3 LTS) using kubeadm, and used flannel to set up networking in the cluster. I was able to deploy an application successfully, and then exposed it via a NodePort service. From here things got complicated for me.

Before I started, I disabled the default firewalld service on the master and all the nodes.
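
For context, the cluster was brought up roughly as follows (a reconstruction from memory; the flannel manifest revision current at the time may have differed, but the 10.244.0.0/16 pod CIDR matches the pod IPs shown below):

# On the master: initialize with the pod CIDR that flannel uses by default
kubeadm init --pod-network-cidr=10.244.0.0/16

# Install flannel as the pod network add-on
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# On each slave: join the cluster using the token printed by kubeadm init
kubeadm join --token <token> <master-ip>:6443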

As I understand from the K8s Services doc, the NodePort type exposes the service on every node in the cluster. However, when I created it, the service was reachable on only 2 of the 4 nodes. I am guessing that's not the expected behavior (right?)
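
For reference, the service was created with something along these lines (a rough reconstruction; the deployment name is inferred from the pod name, and the describe output further down is what's authoritative):

# Expose the existing deployment as a NodePort service in the playground namespace
kubectl expose deployment springboot-helloworld --name=sb-hw-svc --type=NodePort --port=9000 -n playground

# The node port itself (30847 here) is auto-assigned from the 30000-32767 range.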

For troubleshooting, here are some resource specs:

root@vm-vivekse-003:~# kubectl get nodes
NAME              STATUS    AGE       VERSION
vm-deepejai-00b   Ready     5m        v1.7.3
vm-plashkar-006   Ready     4d        v1.7.3
vm-rosnthom-00f   Ready     4d        v1.7.3
vm-vivekse-003    Ready     4d        v1.7.3   //the master
vm-vivekse-004    Ready     16h       v1.7.3

root@vm-vivekse-003:~# kubectl get pods -o wide -n playground
NAME                                     READY     STATUS    RESTARTS   AGE       IP           NODE
kubernetes-bootcamp-2457653786-9qk80     1/1       Running   0          2d        10.244.3.6   vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc   1/1       Running   0          1d        10.244.3.7   vm-rosnthom-00f

root@vm-vivekse-003:~# kubectl get svc -o wide -n playground
NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE       SELECTOR
sb-hw-svc   10.101.180.19   <nodes>       9000:30847/TCP   5h        run=springboot-helloworld

root@vm-vivekse-003:~# kubectl describe svc sb-hw-svc -n playground
Name:               sb-hw-svc
Namespace:          playground
Labels:             <none>
Annotations:        <none>
Selector:           run=springboot-helloworld
Type:               NodePort
IP:                 10.101.180.19
Port:               <unset>   9000/TCP
NodePort:           <unset>   30847/TCP
Endpoints:          10.244.3.7:9000
Session Affinity:   None
Events:             <none>

root@vm-vivekse-003:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-08-09T06:28:06Z
  name: sb-hw-svc
  namespace: playground
  resourceVersion: "588958"
  selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
  uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
  - ip: 10.244.3.7
    nodeName: vm-rosnthom-00f
    targetRef:
      kind: Pod
      name: springboot-helloworld-2842952983-rw0gc
      namespace: playground
      resourceVersion: "473859"
      uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
  ports:
  - port: 9000
    protocol: TCP
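
Note that the Endpoints object has only a single address (10.244.3.7 on vm-rosnthom-00f), so a NodePort request arriving at any other node has to be forwarded across the flannel overlay to reach that pod. One quick check (assuming flannel was deployed as the usual kube-flannel DaemonSet in kube-system) is whether a flannel pod is Running on every node:

kubectl get pods -n kube-system -o wide | grep flannel
# expect one kube-flannel-ds-* pod in Running state per node, including the two faulty ones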

After some tinkering I realized that on those 2 "faulty" nodes, the service was not reachable even from within the hosts themselves.

Node01 (working):

root@vm-vivekse-004:~# curl 127.0.0.1:30847      //<localhost>:<nodeport>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.101.180.19:9000   //<cluster-ip>:<port>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.244.3.7:9000      //<pod-ip>:<port>
Hello Docker World!!

Node02 (working):

root@vm-rosnthom-00f:~# curl 127.0.0.1:30847
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.101.180.19:9000
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.244.3.7:9000
Hello Docker World!!

Node03 (not working):

root@vm-plashkar-006:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-plashkar-006:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-plashkar-006:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

Node04 (not working):

root@vm-deepejai-00b:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-deepejai-00b:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-deepejai-00b:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

I tried netstat and telnet on all 4 slaves. Here's the output:

Node01 (the working host):

root@vm-vivekse-004:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      27808/kube-proxy
root@vm-vivekse-004:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node02 (the working host):

root@vm-rosnthom-00f:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      11842/kube-proxy
root@vm-rosnthom-00f:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node03 (the not-working host):

root@vm-plashkar-006:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      7791/kube-proxy
root@vm-plashkar-006:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Node04 (the not-working host):

root@vm-deepejai-00b:/# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      689/kube-proxy
root@vm-deepejai-00b:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Additional info:

From the kubectl get pods output, I can see that the pod is deployed on the slave vm-rosnthom-00f. I am able to ping this host from all 5 VMs, and curl vm-rosnthom-00f:30847 also works from all of them.

Clearly the internal cluster networking is messed up, but I am unsure how to resolve it. The iptables -L output is identical on all the slaves, and the local loopback interface (ifconfig lo) is up and running on all of them. I'm completely clueless as to how to fix this!
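
In case it helps, here is the kind of side-by-side comparison that can be run on a working node versus a failing one; a difference in the kube-proxy NAT rules, the overlay route, or the forwarding sysctls should narrow things down (the chain and interface names assume kube-proxy in iptables mode and flannel's default vxlan backend):

# NodePort DNAT rules programmed by kube-proxy
iptables -t nat -L KUBE-NODEPORTS -n | grep 30847

# Route to the pod network and the flannel overlay interface
ip route | grep 10.244
ip addr show flannel.1

# Kernel settings that kube-proxy and flannel rely on
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables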

-- Vivek Sethi
flannel
kubernetes

1 Answer

8/10/2017

If you want to reach the service from any node in the cluster, you need to define the service type as ClusterIP. Since you defined the service type as NodePort, you can only connect from the node where the service's pod is running.


My answer above was not correct: based on the documentation, we should be able to connect via any NodeIP:NodePort, but it was not working in my cluster either.

https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types

NodePort: Exposes the service on each Node’s IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You’ll be able to contact the NodePort service, from outside the cluster, by requesting <NodeIP>:<NodePort>.

On one of my nodes, net.ipv4.ip_forward was not set. After enabling it, I was able to connect to my service using NodeIP:NodePort:

sysctl -w net.ipv4.ip_forward=1
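
To make the setting survive a reboot, it can also be persisted and reloaded (the file name here is just a suggestion), then the NodePort re-tested from every node:

echo "net.ipv4.ip_forward = 1" > /etc/sysctl.d/99-kubernetes.conf
sysctl --system
curl http://<node-ip>:30847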
-- sfgroups
Source: StackOverflow