I created a bare-metal, multi-master Kubernetes cluster on the 1.18 release:
kubectl get node
NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    master   9d    v1.18.0
master2   Ready    master   9d    v1.18.0
master3   Ready    master   9d    v1.18.0
worker1   Ready    <none>   9d    v1.18.0
worker2   Ready    <none>   9d    v1.18.0
worker3   Ready    <none>   9d    v1.18.0
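(For context: the cluster was set up with kubeadm, roughly along these lines; this is a from-memory sketch with placeholders rather than the exact values, and the pod network CIDR matches the Flannel default.)
# on the first master, behind a load balancer for the API server
kubeadm init --control-plane-endpoint "<LB_ADDRESS>:6443" --upload-certs --pod-network-cidr=10.244.0.0/16
# on each additional master
kubeadm join <LB_ADDRESS>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash> --control-plane --certificate-key <key>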
kubectl create deploy la --image=linuxacademycontent/kubeserve:v1
kubectl scale deploy la --replicas=6
kubectl get pod
NAME                  READY   STATUS    RESTARTS   AGE
curlpod               1/1     Running   0          55m
la-65776bc88f-4kp95   1/1     Running   0          42m
la-65776bc88f-7lb9f   1/1     Running   0          42m
la-65776bc88f-dgzsz   1/1     Running   0          42m
la-65776bc88f-l2hs2   1/1     Running   0          43m
la-65776bc88f-vkcts   1/1     Running   0          42m
la-65776bc88f-wkm2k   1/1     Running   0          42m
kubectl expose deploy la --port=80 --target-port=80
kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP   75m
la           ClusterIP   10.103.199.187   <none>        80/TCP    46m
kubectl get ep
NAME         ENDPOINTS                                                AGE
kubernetes   172.31.16.67:6443                                        78m
la           10.244.1.6:80,10.244.1.7:80,10.244.2.5:80 + 3 more...    49m
I can verify everything is working by curling the service from a test pod, curlpod:
kubectl exec -it curlpod -- bash
root@curlpod:/app# curl la
This is v1 pod la-65776bc88f-dgzsz
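(curlpod is just a long-running pod with curl in it; to make the repro self-contained, something like the following would do, though the image I actually used differs, with bash and an /app directory:)
kubectl run curlpod --image=curlimages/curl --restart=Never --command -- sleep 3600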
The problem: if I now delete the deployment (or delete all the pods) and recreate them, the service won't route traffic to the new pods.
Here is how to reproduce the problem:
kubectl delete deploy la
kubectl get pod
NAME                  READY   STATUS        RESTARTS   AGE
curlpod               1/1     Running       0          71m
la-65776bc88f-4kp95   1/1     Terminating   0          58m
la-65776bc88f-7lb9f   1/1     Terminating   0          58m
la-65776bc88f-dgzsz   1/1     Terminating   0          58m
la-65776bc88f-l2hs2   1/1     Terminating   0          59m
la-65776bc88f-vkcts   1/1     Terminating   0          58m
la-65776bc88f-wkm2k   1/1     Terminating   0          58m
kubectl create deploy la --image=linuxacademycontent/kubeserve:v1
kubectl scale deploy la --replicas=6
kubectl get pod
NAME                  READY   STATUS    RESTARTS   AGE
curlpod               1/1     Running   0          73m
la-65776bc88f-5q7d8   1/1     Running   0          6s
la-65776bc88f-b69z2   1/1     Running   0          10s
la-65776bc88f-j2t4t   1/1     Running   0          6s
la-65776bc88f-ndxs5   1/1     Running   0          6s
la-65776bc88f-ns2gt   1/1     Running   0          6s
la-65776bc88f-vjp8x   1/1     Running   0          6s
Everything looks fine. However, the service won't route any traffic to the new pods:
kubectl exec -it curlpod -- bash
root@curlpod:/app# curl la
curl: (7) Failed to connect to la port 80: No route to host
root@curlpod:/app# curl 10.103.199.187 #name resolution seems fine
curl: (7) Failed to connect to la port 80: No route to host
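(To rule DNS in or out explicitly, the service name can also be resolved on its own; a hypothetical check, assuming the pod image ships nslookup:)
kubectl exec curlpod -- nslookup la.default.svc.cluster.local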
I can still reach the pods individually. For example, la-65776bc88f-5q7d8 has IP 10.244.5.13:
root@test-pod:/app# curl 10.244.5.13
This is v1 pod la-65776bc88f-5q7d8
So I thought it was the service that wasn't routing the traffic, but the service is unchanged and all of the new pods are listed as its endpoints:
kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP   93m
la           ClusterIP   10.103.199.187   <none>        80/TCP    64m
kubectl get ep
NAME         ENDPOINTS                                                AGE
kubernetes   172.31.16.67:6443                                        93m
la           10.244.5.13:80,10.244.1.9:80,10.244.2.7:80 + 3 more...   64m
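(To cross-check that those endpoints really are the new pods, the endpoint IPs can be compared with the pod IPs; app=la is the label kubectl create deploy sets automatically:)
kubectl get pod -l app=la -o wide   # the IP column should match the endpoint list above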
Something, somewhere, is blocking the traffic from the service to the new pods.
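To help diagnose, one could look at what kube-proxy actually programmed on a node; a sketch, assuming shell access to a worker (the IPVS variant matches the proxy mode suggested by the kube-proxy errors at the end; the iptables variant covers the default mode):
# IPVS mode: list the real servers behind the service's ClusterIP
sudo ipvsadm -Ln | grep -A 6 10.103.199.187
# iptables mode: dump the rules that reference the ClusterIP
sudo iptables-save | grep 10.103.199.187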
Now, if I recreate the service, the problem goes away!
kubectl delete svc la
service "la" deleted
kubectl expose deploy la --port=80 --target-port=80
service/la exposed
kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   108m
la           ClusterIP   10.96.124.200   <none>        80/TCP    71s
kubectl exec -it curlpod -- bash
root@curlpod:/app# curl la
This is v1 pod la-65776bc88f-z24qn
I tried a few alternatives, such as using YAML to create the deployment and the service: same problem. I also tried using kubectl scale to scale down and back up. However I create new pods in the deployment, traffic from the service to them is always blocked.
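For reference, the YAML variant I tried was, give or take details, equivalent to the imperative commands above; a minimal sketch:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: la
spec:
  replicas: 6
  selector:
    matchLabels:
      app: la
  template:
    metadata:
      labels:
        app: la
    spec:
      containers:
      - name: kubeserve
        image: linuxacademycontent/kubeserve:v1
---
apiVersion: v1
kind: Service
metadata:
  name: la
spec:
  selector:
    app: la
  ports:
  - port: 80
    targetPort: 80
EOF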
Any help or suggestion welcome!
Edit: adding the Flannel log:
kubectl logs --namespace kube-system kube-flannel-ds-amd64-tbglf
I0423 15:15:52.311421 1 main.go:518] Determining IP address of default interface
I0423 15:15:52.312644 1 main.go:531] Using interface with name eth0 and address 10.30.193.13
I0423 15:15:52.312669 1 main.go:548] Defaulting external address to interface address (10.30.193.13)
W0423 15:15:52.312689 1 client_config.go:517] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0423 15:15:52.323517 1 kube.go:119] Waiting 10m0s for node controller to sync
I0423 15:15:52.324491 1 kube.go:306] Starting kube subnet manager
I0423 15:15:53.323919 1 kube.go:126] Node controller sync successful
I0423 15:15:53.323976 1 main.go:246] Created subnet manager: Kubernetes Subnet Manager - uk02sybk8smp3
I0423 15:15:53.323983 1 main.go:249] Installing signal handlers
I0423 15:15:53.324084 1 main.go:390] Found network config - Backend type: vxlan
I0423 15:15:53.324176 1 vxlan.go:121] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I0423 15:15:53.365832 1 main.go:355] Current network or subnet (10.244.0.0/16, 10.244.2.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I0423 15:15:53.367815 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0423 15:15:53.409089 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0423 15:15:53.410118 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0423 15:15:53.411017 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j MASQUERADE --random-fully
I0423 15:15:53.412076 1 main.go:305] Setting up masking rules
I0423 15:15:53.412916 1 main.go:313] Changing default FORWARD chain policy to ACCEPT
I0423 15:15:53.412999 1 main.go:321] Wrote subnet file to /run/flannel/subnet.env
I0423 15:15:53.413009 1 main.go:325] Running backend.
I0423 15:15:53.413017 1 main.go:343] Waiting for all goroutines to exit
I0423 15:15:53.413038 1 vxlan_network.go:60] watching for new subnet leases
I0423 15:15:53.510136 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0423 15:15:53.510158 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0423 15:15:53.510293 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0423 15:15:53.510301 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0423 15:15:53.511096 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0423 15:15:53.512091 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -j RETURN
I0423 15:15:53.512206 1 iptables.go:167] Deleting iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0423 15:15:53.512998 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
I0423 15:15:53.513970 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0423 15:15:53.609595 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0423 15:15:53.612312 1 iptables.go:155] Adding iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0423 15:15:53.613132 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0423 15:15:53.710527 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.2.0/24 -j RETURN
I0423 15:15:53.712938 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
kube-proxy is flooded with errors like these:
E0504 08:25:36.251542 1 proxier.go:1533] Failed to sync endpoint for service: 10.30.218.46:30080/TCP, err: parseIP Error ip=[10 244 5 3 0 0 0 0 0 0 0 0 0 0 0 0]
E0504 08:25:36.251632 1 proxier.go:1950] Failed to list IPVS destinations, error: parseIP Error ip=[10 244 5 3 0 0 0 0 0 0 0 0 0 0 0 0]
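In case it matters, the proxy mode and the kube-proxy image version can be confirmed like this (hypothetical checks, not from the session above):
kubectl -n kube-system get configmap kube-proxy -o yaml | grep -w mode
kubectl -n kube-system get ds kube-proxy -o jsonpath='{.spec.template.spec.containers[0].image}'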