I am currently following this tutorial (except that I am on AWS, and I can do nothing about that).
I am currently at step 10 and seem to be having trouble reaching pods on one worker from another worker.
Here is a log from two workers which will help underline the problem:
worker-0:
root@worker-0:/home/admin# ip addr show eth0 | grep 'inet '
inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
root@worker-0:/home/admin# traceroute 10.200.1.10 -n -i cnio0 -I -m 5
traceroute to 10.200.1.10 (10.200.1.10), 5 hops max, 60 byte packets
1 10.200.1.10 0.135 ms 0.079 ms 0.073 ms
root@worker-0:/home/admin# ping 10.240.1.232
PING 10.240.1.232 (10.240.1.232) 56(84) bytes of data.
64 bytes from 10.240.1.232: icmp_seq=1 ttl=64 time=0.151 ms
^C
--- 10.240.1.232 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
root@worker-0:/home/admin# traceroute 10.200.3.5 -g 10.240.1.232 -n -i eth0 -I -m 5
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 72 byte packets
1 * * *
2 * * *
3 * * *
4 * * *
5 * * *
root@worker-0:/home/admin#
worker-2:
root@worker-2:/home/admin# ip addr show eth0 | grep 'inet '
inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
root@worker-2:/home/admin# traceroute 10.200.3.5 -n -i cnio0 -I -m 5
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 60 byte packets
1 10.200.3.5 0.140 ms 0.077 ms 0.072 ms
root@worker-2:/home/admin# ping 10.200.3.5
PING 10.200.3.5 (10.200.3.5) 56(84) bytes of data.
64 bytes from 10.200.3.5: icmp_seq=1 ttl=64 time=0.059 ms
64 bytes from 10.200.3.5: icmp_seq=2 ttl=64 time=0.047 ms
^C
--- 10.200.3.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.047/0.053/0.059/0.006 ms
root@worker-2:/home/admin#
The pods deploy correctly (I have tried spawning 11 instances of busybox); here is the result:
admin@ip-10-240-1-250:~$ kubectl get pods
busybox-68654f944b-vjs2s 1/1 Running 69 2d
busybox0-7665ddff5d-2856g 1/1 Running 69 2d
busybox1-f9585ffdb-tg2lj 1/1 Running 68 2d
busybox2-78c5d7bdb6-fhfdc 1/1 Running 68 2d
busybox3-74fd4b4f98-pp4kz 1/1 Running 69 2d
busybox4-55d568f8c4-q9hk9 1/1 Running 68 2d
busybox5-69f77b4fdb-d7jf2 1/1 Running 68 2d
busybox6-b5b869f4-2vnkz 1/1 Running 69 2d
busybox7-7df7958c4b-4bxzx 0/1 Completed 68 2d
busybox8-6d78f4f5d6-cvfx7 1/1 Running 69 2d
busybox9-86d49fdf4-75ddn 1/1 Running 68 2d
Thank you for your insights.
EDIT: Adding info for the workers.
worker-0:
root@worker-0:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:2b:ed:df:b7:58 brd ff:ff:ff:ff:ff:ff
inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::2b:edff:fedf:b758/64 scope link
valid_lft forever preferred_lft forever
root@worker-0:/home/admin# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.240.1.1 0.0.0.0 UG 0 0 0 eth0
10.200.1.0 0.0.0.0 255.255.255.0 U 0 0 0 cnio0
10.240.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
worker-2:
root@worker-2:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:b0:2b:67:73:9e brd ff:ff:ff:ff:ff:ff
inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::b0:2bff:fe67:739e/64 scope link
valid_lft forever preferred_lft forever
root@worker-2:/home/admin# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.240.1.1 0.0.0.0 UG 0 0 0 eth0
10.200.3.0 0.0.0.0 255.255.255.0 U 0 0 0 cnio0
10.240.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
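For what it's worth, `ip route get` shows which route the kernel would pick for a given destination; based on the tables above, a pod IP on the other worker should fall back to the default gateway (commands only, I have not pasted the output):

# on worker-0, check the route that would be used for a pod on worker-2
ip route get 10.200.3.5
# on worker-2, check the route that would be used for a pod on worker-0
ip route get 10.200.1.10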
Thanks @VAS, it was helpful. Here is what I did:
# edit /etc/hosts
192.168.2.150 master master.localdomain
192.168.2.151 node1 node1.localdomain
192.168.2.152 node2 node2.localdomain
...
# then add routes
$ route add -net 10.244.1.0/24 gw node1
$ route add -net 10.244.2.0/24 gw node2
...
That's because "…flannel gives each host an IP subnet (/24 by default)…".
Your nodes are missing routes to the other nodes' pod subnets.
To get it working, you need to either add static routes on the worker nodes or add routes to all pod subnets on the default gateway 10.240.1.1.
The first case:
Run on the worker-0 node:
route add -net 10.200.3.0/24 netmask 255.255.255.0 gw 10.240.1.232
Run on the worker-2 node:
route add -net 10.200.1.0/24 netmask 255.255.255.0 gw 10.240.1.230
In this case, traffic goes directly from one worker node to the other, but as your cluster grows you have to update the route table on every worker accordingly. These subnets will also not be reachable from other VPC hosts unless you add the routes to the cloud router.
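The same thing with the iproute2 syntax, as a sketch (the dev argument is optional, and these routes do not survive a reboot, so you would also want to persist them in your network configuration):

# on worker-0: route worker-2's pod subnet via worker-2's eth0 address
ip route add 10.200.3.0/24 via 10.240.1.232 dev eth0
# on worker-2: route worker-0's pod subnet via worker-0's eth0 address
ip route add 10.200.1.0/24 via 10.240.1.230 dev eth0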
The second case:
On the default router (10.240.1.1):
route add -net 10.200.3.0/24 netmask 255.255.255.0 gw 10.240.1.232
route add -net 10.200.1.0/24 netmask 255.255.255.0 gw 10.240.1.230
In this case, traffic is routed by the default router, and if you add new nodes to your cluster, you only need to update a single route table on the default router.
This solution is used in the Routes part of "Kubernetes The Hard Way".
This article may be helpful for creating the routes with the AWS CLI.
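For example, something along these lines with the AWS CLI (a sketch only; the route table ID and instance IDs below are placeholders you would have to replace with your own):

# add a route for each worker's pod subnet, pointing at that worker's instance
aws ec2 create-route --route-table-id rtb-PLACEHOLDER \
    --destination-cidr-block 10.200.1.0/24 --instance-id <worker-0-instance-id>
aws ec2 create-route --route-table-id rtb-PLACEHOLDER \
    --destination-cidr-block 10.200.3.0/24 --instance-id <worker-2-instance-id>
# the workers also need source/destination checking disabled to forward pod traffic
aws ec2 modify-instance-attribute --instance-id <worker-0-instance-id> --no-source-dest-check
aws ec2 modify-instance-attribute --instance-id <worker-2-instance-id> --no-source-dest-check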