This is my cluster environment:
service cluster IP CIDR: 10.254.0.0/16
pod IP CIDR: 172.30.0.0/16
kube-proxy: ipvs mode
flannel: vxlan
kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:13:03Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.4", GitCommit:"5ca598b4ba5abb89bb773071ce452e33fb66339d", GitTreeState:"clean", BuildDate:"2018-06-06T08:00:59Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
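(Not shown above, but for reference: the proxy mode and the flannel backend can be double-checked on each host with standard commands like the following; the exact flags depend on how the binaries were started.)

ps -ef | grep [k]ube-proxy              # shows the kube-proxy process and its --proxy-mode flag
lsmod | grep -e ip_vs -e nf_conntrack   # confirms the IPVS kernel modules are loaded
ip -d link show flannel.1               # shows the vxlan details of the flannel device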
The result I expected: I can access the SVC IP normally from the Kubernetes master. kube-proxy and flannel are deployed on the master host.
All components of my Kubernetes cluster are deployed from the release binaries.
The cluster itself was deployed normally, and in most cases it serves traffic externally without problems. I have deployed an nginx server. My screenshot (service and pod):
Service IP: 10.254.126.228
Pod IP: 172.30.21.3
When I access the pod IP (172.30.21.3) directly from the Kubernetes master host, it responds normally.
My screenshot: successful HTTP request to the pod IP
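The request in the screenshot is essentially the following, shown here with curl for illustration:

curl -I http://172.30.21.3    # returns the nginx response headers as expected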
But when I access the service IP, the request times out.
My screenshot: request to the SVC IP times out
This only happens when the SVC IP is requested from the host, outside any container. From inside a container, I can access the SVC IP normally.
My screenshot: HTTP request from inside the container
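To be concrete, the two cases look roughly like this (the pod name is a placeholder, and the container needs curl available):

curl -m 5 http://10.254.126.228                                   # from the master host: hangs and times out
kubectl exec -it <some-pod> -- curl -m 5 http://10.254.126.228    # from inside a container: works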
This is my list of IPVS rules (screenshot: ipvs rule list).
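The relevant entry can be pulled out of the IPVS table with ipvsadm (assuming the standard ipvsadm tool is installed):

ipvsadm -Ln | grep -A 2 10.254.126.228    # shows the virtual service entry and the pod real server behind it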
I used tcpdump to capture traffic on the flannel.1 and docker0 NICs of the node host that runs the nginx pod, and found the following:
When I access the pod IP directly from the Kubernetes master host, the flannel.1 NIC on the master sends the packet to the flannel.1 NIC on the node host, which forwards it to docker0 and finally to the pod. The pod then responds to my request and everything works. But when I access the SVC IP from the Kubernetes master host, the packet is sent out through eth0 instead of the flannel.1 NIC, so the flannel.1 NIC on the node host simply drops the request and never forwards it on to docker0.
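These are roughly the captures I mean (interface names as in the routing table below; the filter just matches the two IPs involved):

# on the master host, to see which interface the outgoing packets use
tcpdump -ni eth0 host 10.254.126.228 or host 172.30.21.3
tcpdump -ni flannel.1 host 172.30.21.3
# on the node host that runs the nginx pod
tcpdump -ni flannel.1 host 172.30.21.3
tcpdump -ni docker0 host 172.30.21.3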
I realize that this may be a policy-based routing problem. If I route requests destined for 10.254.0.0/16 to the correct gateway, will that solve the problem?
But there should be a more standard solution. Can anyone help me?
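For example, the kind of workaround I have in mind is a manual route on the master host, something like this (I am not sure this is correct or standard, which is why I am asking):

ip route add 10.254.0.0/16 dev flannel.1    # force traffic for the service CIDR onto the flannel device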
(I am sorry that my English is not very good...)
Here is my routing table:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.125.232.1    0.0.0.0         UG    0      0        0 eth0
10.125.232.0    0.0.0.0         255.255.252.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
172.30.1.0      172.30.1.0      255.255.255.0   UG    0      0        0 flannel.1
172.30.3.0      172.30.3.0      255.255.255.0   UG    0      0        0 flannel.1
172.30.21.0     172.30.21.0     255.255.255.0   UG    0      0        0 flannel.1
172.30.81.0     172.30.81.0     255.255.255.0   UG    0      0        0 flannel.1
172.30.92.0     0.0.0.0         255.255.255.0   U     0      0        0 docker0
172.30.94.0     172.30.94.0     255.255.255.0   UG    0      0        0 flannel.1
@白栋天
It seems your svc CIDR overlaps with the host CIDR.