Kubernetes: Can't ping pods across nodes

7/30/2018

I am following this tutorial (except that I am on AWS, and I can do nothing about that).
I am at the 10th step and seem to be having problems reaching pods on one worker from another.

Here is a log from two workers that illustrates the problem:

worker-0:

root@worker-0:/home/admin# ip addr show eth0 | grep 'inet '                                                                                                                                                        
inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
root@worker-0:/home/admin# traceroute 10.200.1.10 -n -i cnio0 -I -m 5                                                                                                                                              
traceroute to 10.200.1.10 (10.200.1.10), 5 hops max, 60 byte packets
 1  10.200.1.10  0.135 ms  0.079 ms  0.073 ms
root@worker-0:/home/admin# ping 10.240.1.232
PING 10.240.1.232 (10.240.1.232) 56(84) bytes of data.
64 bytes from 10.240.1.232: icmp_seq=1 ttl=64 time=0.151 ms
^C
--- 10.240.1.232 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
root@worker-0:/home/admin# traceroute 10.200.3.5 -g 10.240.1.232 -n -i eth0 -I -m 5                                                                                                                                
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 72 byte packets
 1  * * *
 2  * * *
 3  * * *
 4  * * *
 5  * * *
root@worker-0:/home/admin#

worker-2:

root@worker-2:/home/admin# ip addr show eth0 | grep 'inet '
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
root@worker-2:/home/admin# traceroute 10.200.3.5 -n -i cnio0 -I -m 5                                                                                                                                                
traceroute to 10.200.3.5 (10.200.3.5), 5 hops max, 60 byte packets
 1  10.200.3.5  0.140 ms  0.077 ms  0.072 ms
root@worker-2:/home/admin# ping 10.200.3.5
PING 10.200.3.5 (10.200.3.5) 56(84) bytes of data.
64 bytes from 10.200.3.5: icmp_seq=1 ttl=64 time=0.059 ms
64 bytes from 10.200.3.5: icmp_seq=2 ttl=64 time=0.047 ms
^C
--- 10.200.3.5 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.047/0.053/0.059/0.006 ms
root@worker-2:/home/admin#

The pods deploy correctly. I have tried spawning 11 instances of busybox, and here is the result:

admin@ip-10-240-1-250:~$ kubectl get pods
NAME                        READY     STATUS      RESTARTS   AGE
busybox-68654f944b-vjs2s    1/1       Running     69         2d
busybox0-7665ddff5d-2856g   1/1       Running     69         2d
busybox1-f9585ffdb-tg2lj    1/1       Running     68         2d
busybox2-78c5d7bdb6-fhfdc   1/1       Running     68         2d
busybox3-74fd4b4f98-pp4kz   1/1       Running     69         2d
busybox4-55d568f8c4-q9hk9   1/1       Running     68         2d
busybox5-69f77b4fdb-d7jf2   1/1       Running     68         2d
busybox6-b5b869f4-2vnkz     1/1       Running     69         2d
busybox7-7df7958c4b-4bxzx   0/1       Completed   68         2d
busybox8-6d78f4f5d6-cvfx7   1/1       Running     69         2d
busybox9-86d49fdf4-75ddn    1/1       Running     68         2d

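For completeness, the same cross-node failure can be reproduced from inside a pod with kubectl exec (a sketch only, assuming the busybox pod below is scheduled on a different node than the target pod at 10.200.3.5):

# Ping a pod on another node from inside one of the busybox pods
kubectl exec busybox-68654f944b-vjs2s -- ping -c 1 10.200.3.5
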
Thank you for your insights.

EDIT: Adding info for the workers

worker-0:

root@worker-0:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:2b:ed:df:b7:58 brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.230/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::2b:edff:fedf:b758/64 scope link
       valid_lft forever preferred_lft forever
root@worker-0:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0

worker-2:

root@worker-2:/home/admin# ip addr show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    link/ether 02:b0:2b:67:73:9e brd ff:ff:ff:ff:ff:ff
    inet 10.240.1.232/24 brd 10.240.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b0:2bff:fe67:739e/64 scope link
       valid_lft forever preferred_lft forever
root@worker-2:/home/admin# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.240.1.1      0.0.0.0         UG    0      0        0 eth0
10.200.3.0      0.0.0.0         255.255.255.0   U     0      0        0 cnio0
10.240.1.0      0.0.0.0         255.255.255.0   U     0      0        0 eth0
-- Talanor
amazon-ec2
kubernetes
networking
routing

2 Answers

1/26/2020

Thanks @VAS, that was helpful.

On the Kubernetes master:

# edit /etc/hosts

192.168.2.150 master master.localdomain
192.168.2.151 node1 node1.localdomain
192.168.2.152 node2 node2.localdomain
...

# then add routes
$ route add -net 10.244.1.0/24 gw node1
$ route add -net 10.244.2.0/24 gw node2
...

That's because:

"...flannel gives each host an IP subnet (/24 by default)..."

Flannel: A Network Fabric for Containers

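For reference, you can check which pod subnet was assigned to each node with kubectl (a quick verification sketch; it assumes the per-node subnets are recorded in the node spec, as they are with a kubeadm/flannel setup and with the tutorial's setup):

# Show the pod CIDR assigned to each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,POD_CIDR:.spec.podCIDR
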
-- melvinrmc
Source: StackOverflow

8/2/2018

Your nodes are missing routes to the other nodes' pod subnets.

To get it working, you need to either add static routes on the worker nodes or add routes for all the pod subnets on the default gateway (10.240.1.1).

The first case:

Run on the worker-0 node:

route add -net 10.200.3.0/24 gw 10.240.1.232

Run on the worker-2 node:

route add -net 10.200.1.0/24 gw 10.240.1.230

In this case, traffic will go directly from one worker node to another, but as your cluster grows you will have to update the route tables on all workers accordingly. These pod subnets will also not be reachable from other VPC hosts without adding routes to the cloud router.
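
The same routes can be expressed with iproute2, which most current distributions ship (a sketch using the worker IPs from the question; the routes are not persistent across reboots unless added to your network configuration):

# On worker-0: send worker-2's pod subnet via worker-2's eth0 address
ip route add 10.200.3.0/24 via 10.240.1.232

# On worker-2: send worker-0's pod subnet via worker-0's eth0 address
ip route add 10.200.1.0/24 via 10.240.1.230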

The second case:

On the default router (10.240.1.1):

route add -net 10.200.3.0/24 gw 10.240.1.232
route add -net 10.200.1.0/24 gw 10.240.1.230

In this case, traffic will be routed by the default router, and when you add new nodes to your cluster, you will only need to update the one route table on the default router.
This is the approach used in the Routes part of “Kubernetes The Hard Way”.

This article would be helpful for creating the routes with the AWS CLI.

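For illustration, the equivalent VPC route table entries might look like this with the AWS CLI (a sketch only; the route table ID and instance IDs are placeholders for your own values):

# Route each worker's pod CIDR to that worker's EC2 instance (IDs are placeholders)
aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 10.200.1.0/24 --instance-id i-worker0
aws ec2 create-route --route-table-id rtb-xxxxxxxx --destination-cidr-block 10.200.3.0/24 --instance-id i-worker2

# Pod traffic uses IPs the instances do not own, so source/destination checking
# usually has to be disabled on each worker instance as well
aws ec2 modify-instance-attribute --instance-id i-worker0 --no-source-dest-check
aws ec2 modify-instance-attribute --instance-id i-worker2 --no-source-dest-check
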
-- VAS
Source: StackOverflow