k3s - can't access from one pod to another if pods on different master nodes (HighAvailability setup)

1/21/2020


Update:

I've narrowed the issue down: pods on the other master node can't communicate with pods on the original master.

Pods on rpi4-server1 (the original server) can communicate with pods on rpi-worker01 and rpi3-worker02.

Pods on rpi4-server2 are unable to communicate with the others.

I'm trying to run a High Availability cluster with the embedded DB, using flannel / vxlan.
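
A rough sketch of how the flannel / vxlan side could be inspected on each server. It assumes the usual k3s/flannel defaults, none of which were verified on this cluster: a per-node /24 carved out of 10.42.0.0/16, a vxlan device named flannel.1, and tunnel traffic on UDP port 8472. `RUN`, `IFACE` and `subnet_of` are hypothetical helpers introduced for illustration; by default the script only prints the commands, and running it with `RUN=` executes them.

```shell
#!/bin/sh
# Sketch only. Assumes k3s/flannel defaults that were NOT verified on this cluster:
# per-node /24 pod subnets, vxlan device flannel.1, tunnel traffic on UDP 8472.
RUN=${RUN-echo}        # hypothetical switch: prints commands by default, RUN= executes them
IFACE=${IFACE:-eth0}   # assumed name of the LAN interface on the Pi

# Map a pod IP to the per-node /24 it should have been allocated from.
subnet_of() {
    echo "${1%.*}.0/24"
}

subnet_of 10.42.0.50   # blah-server on rpi4-server1 -> 10.42.0.0/24
subnet_of 10.42.1.29   # test-apis on rpi4-server2   -> 10.42.1.0/24

# On each server: does the vxlan device exist, and which pod-subnet routes are installed?
$RUN ip -d link show flannel.1
$RUN ip route show root 10.42.0.0/16

# While curling across nodes: does encapsulated traffic actually reach this host?
$RUN sudo tcpdump -ni "$IFACE" udp port 8472
```

If the route to the other server's /24 is missing, or no UDP 8472 packets show up during a cross-node curl, that would point at the overlay rather than at the services themselves.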


I'm trying to set up a project with 5 services in k3s.

When all of the pods are contained on a single node, they work together fine.

As soon as I add other nodes into the system and pods are deployed to them, the links seem to break.

While troubleshooting, I exec'd into one of the pods and tried to curl another. When both pods are on the same node this works; when the second service is on another node it doesn't.
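
A minimal sketch of that kind of curl test, for anyone wanting to reproduce it. The pod name, pod IPs and ports are copied from the listings further down; that curl is present in the image, and that each container listens on the same port as its service, are assumptions. `RUN` and `probe` are hypothetical helpers: by default the script only prints the commands, and running it with `RUN=` executes them.

```shell
#!/bin/sh
# Sketch: curl from a pod on rpi4-server1 to a pod on the same node and to one on rpi4-server2.
# Assumptions: curl exists in the source image; target ports equal the service ports.
NS=test-demo
SRC_POD=blah-server-858ccd7788-mnf67   # runs on rpi4-server1 (10.42.0.50)
RUN=${RUN-echo}                        # hypothetical switch: prints by default, RUN= executes

probe() {
    # $1 = target URL; -m 5 gives up after 5 seconds instead of hanging
    $RUN sudo kubectl -n "$NS" exec "$SRC_POD" -- curl -sS -m 5 "$1"
}

probe http://10.42.0.32:7474   # n34 on rpi4-server1 (same node)
probe http://10.42.1.29:8075   # test-apis on rpi4-server2 (other master)
```

Using pod IPs directly, rather than service names, keeps DNS and kube-proxy out of the picture, so a failure here is about pod-to-pod routing between nodes.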

I'm sure this is something simple that I'm missing, but I can't work it out! Help appreciated.

Key details:

  • Using k3s and native traefik
  • Two rpi4s as servers (High Availability) and two rpi3s as worker nodes
  • metallb as loadbalancer
  • Two services, blah-interface and blah-svc, are configured as LoadBalancer to allow external access. The others (blah-server, n34 and test-apis) are NodePort to support debugging, but only really need internal access

Info on nodes, pods and services:

pi@rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get nodes --all-namespaces -o wide
NAME            STATUS                     ROLES    AGE   VERSION         INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
rpi4-server1    Ready                      master   11h   v1.17.0+k3s.1   192.168.0.140   <none>        Raspbian GNU/Linux 10 (buster)   4.19.75-v7l+     docker://19.3.5
rpi-worker01    Ready,SchedulingDisabled   <none>   10h   v1.17.0+k3s.1   192.168.0.41    <none>        Raspbian GNU/Linux 10 (buster)   4.19.66-v7+      containerd://1.3.0-k3s.5
rpi3-worker02   Ready,SchedulingDisabled   <none>   10h   v1.17.0+k3s.1   192.168.0.142   <none>        Raspbian GNU/Linux 10 (buster)   4.19.75-v7+      containerd://1.3.0-k3s.5
rpi4-server2    Ready                      master   10h   v1.17.0+k3s.1   192.168.0.143   <none>        Raspbian GNU/Linux 10 (buster)   4.19.75-v7l+     docker://19.3.5

pi@rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE        NAME                                        READY   STATUS      RESTARTS   AGE     IP              NODE            NOMINATED NODE   READINESS GATES
kube-system      helm-install-traefik-l2z6l                  0/1     Completed   2          11h     10.42.0.2       rpi4-server1    <none>           <none>
test-demo        n34-5c7b9475cb-zjlgl                        1/1     Running     1          4h30m   10.42.0.32      rpi4-server1    <none>           <none>
kube-system      metrics-server-6d684c7b5-5wgf9              1/1     Running     3          11h     10.42.0.26      rpi4-server1    <none>           <none>
metallb-system   speaker-62rkm                               0/1     Pending     0          99m     <none>          rpi-worker01    <none>           <none>
metallb-system   speaker-2shzq                               0/1     Pending     0          99m     <none>          rpi3-worker02   <none>           <none>
metallb-system   speaker-2mcnt                               1/1     Running     0          99m     192.168.0.143   rpi4-server2    <none>           <none>
metallb-system   speaker-v8j9g                               1/1     Running     0          99m     192.168.0.140   rpi4-server1    <none>           <none>
metallb-system   controller-65895b47d4-pgcs6                 1/1     Running     0          90m     10.42.0.49      rpi4-server1    <none>           <none>
test-demo        blah-server-858ccd7788-mnf67                1/1     Running     0          64m     10.42.0.50      rpi4-server1    <none>           <none>
default          nginx2-6f4f6f76fc-n2kbq                     1/1     Running     0          22m     10.42.0.52      rpi4-server1    <none>           <none>
test-demo        blah-interface-587fc66bf9-qftv6             1/1     Running     0          22m     10.42.0.53      rpi4-server1    <none>           <none>
test-demo        blah-svc-6f8f68f46-gqcbw                    1/1     Running     0          21m     10.42.0.54      rpi4-server1    <none>           <none>
kube-system      coredns-d798c9dd-hdwn5                      1/1     Running     1          11h     10.42.0.27      rpi4-server1    <none>           <none>
kube-system      local-path-provisioner-58fb86bdfd-tjh7r     1/1     Running     31         11h     10.42.0.28      rpi4-server1    <none>           <none>
kube-system      traefik-6787cddb4b-tgq6j                    1/1     Running     0          4h50m   10.42.1.23      rpi4-server2    <none>           <none>
default          testdemo2020-testchart-6f8d44b496-2hcfc     1/1     Running     1          6h31m   10.42.0.29      rpi4-server1    <none>           <none>
test-demo        test-apis-75bb68dcd7-d8rrp                  1/1     Running     0          7m13s   10.42.1.29      rpi4-server2    <none>           <none>

pi@rpi4-server1:~/Projects/test_demo_2020/test_kube_config/testchart/templates $ sudo kubectl get svc --all-namespaces -o wide
NAMESPACE     NAME                       TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                                        AGE     SELECTOR
default       kubernetes                 ClusterIP      10.43.0.1       <none>          443/TCP                                        11h     <none>
kube-system   kube-dns                   ClusterIP      10.43.0.10      <none>          53/UDP,53/TCP,9153/TCP                         11h     k8s-app=kube-dns
kube-system   metrics-server             ClusterIP      10.43.74.118    <none>          443/TCP                                        11h     k8s-app=metrics-server
kube-system   traefik-prometheus         ClusterIP      10.43.78.135    <none>          9100/TCP                                       11h     app=traefik,release=traefik
test-demo     blah-server                NodePort       10.43.224.128   <none>          5055:31211/TCP                                 10h     io.kompose.service=blah-server
default       testdemo2020-testchart     ClusterIP      10.43.91.7      <none>          80/TCP                                         10h     app.kubernetes.io/instance=testdemo2020,app.kubernetes.io/name=testchart
test-demo     traf-dashboard             NodePort       10.43.60.155    <none>          8080:30808/TCP                                 10h     io.kompose.service=traf-dashboard
test-demo     test-apis                  NodePort       10.43.248.59    <none>          8075:31423/TCP                                 7h11m   io.kompose.service=test-apis
kube-system   traefik                    LoadBalancer   10.43.168.18    192.168.0.240   80:30688/TCP,443:31263/TCP                     11h     app=traefik,release=traefik
default       nginx2                     LoadBalancer   10.43.249.123   192.168.0.241   80:30497/TCP                                   92m     app=nginx2
test-demo     n34                        NodePort       10.43.171.206   <none>          7474:30474/TCP,7687:32051/TCP                  72m     io.kompose.service=n34
test-demo     blah-interface             LoadBalancer   10.43.149.158   192.168.0.242   80:30634/TCP                                   66m     io.kompose.service=blah-interface
test-demo     blah-svc                   LoadBalancer   10.43.19.242    192.168.0.243   5005:30005/TCP,5006:31904/TCP,5002:30685/TCP   51m     io.kompose.service=blah-svc
-- ceharep
k3s
kubernetes
