The symptom
Requests to applications sporadically return an HTTP 504 or take a long time to complete (a multiple of 12s).
We see the problem on pod-to-pod communication where the two pods are on the same Kubernetes node, for example:
- from an nginx ingress pod to an application pod on the same node
- from an application pod to another application pod on the same node
- from an application pod to a RabbitMQ eventbus pod on the same node
Our infrastructure
EKS with classic ELBs (both internal and external, not Network Load Balancers) in front of the nginx ingress Service. The LoadBalancer Services have externalTrafficPolicy: Local. EKS 1.13 with node version 1.13.8 (EKS-optimized AMI).
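As a quick way to confirm that traffic policy setting across all LoadBalancer Services (just a verification sketch, no cluster-specific names assumed):

# List all LoadBalancer Services with their externalTrafficPolicy.
kubectl get svc --all-namespaces \
  -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,TYPE:.spec.type,TRAFFIC_POLICY:.spec.externalTrafficPolicy \
  | grep LoadBalancer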
TCPDUMP
The following is a useful tcpdump output from an application pod trying to connect to an eventbus, which fails. It succeeds after a couple of retries most of the time (usually after 12s):
13:44:46.744764 IP customer-reports-service-5b4d8c48b-vj4db.35196 > eventbus-rabbitmq.kube-system.svc.cluster.local.5672: Flags [S], seq 1434468571, win 26883, options [mss 8961,sackOK,TS val 4064032250 ecr 0,nop,wscale 7], length 0
13:44:46.751000 IP ip-10-0-161-173.eu-west-1.compute.internal > customer-reports-service-5b4d8c48b-vj4db: ICMP time exceeded in-transit, length 68
Notes on this tcpdump:
1. The application pod makes a request to the eventbus pod on the same node.
2. The node sends an ICMP "time exceeded" message back to the application pod. The request probably never reaches the eventbus.
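For reference, a capture like the one above can be taken on the worker node itself. With the AWS VPC CNI the pod veth/eni interface names vary, so capturing on all interfaces is the simplest option; this is just how such a trace can be reproduced, not part of the fix:

# Run on the worker node: capture RabbitMQ traffic (5672) and any ICMP errors
# on all interfaces, since the veth names assigned by the VPC CNI vary.
sudo tcpdump -i any -nn 'tcp port 5672 or icmp'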
Possible workaround
Use pod anti-affinity to make sure that each eventbus pod, each nginx ingress pod, and each API gateway pod runs on a different node than our application services (a sketch is shown below).
But I'm looking for an actual solution to the problem.
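As a rough illustration of that workaround, the patch below adds a required pod anti-affinity rule to one application Deployment. The Deployment name is taken from the pod name in the tcpdump, and the app labels are assumptions that would need to match your own manifests:

# Hypothetical labels: keep this application's pods off any node that already
# runs an eventbus, nginx ingress, or api gateway pod.
cat > anti-affinity-patch.yaml <<'EOF'
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values: ["eventbus-rabbitmq", "nginx-ingress", "api-gateway"]
            topologyKey: kubernetes.io/hostname
EOF
kubectl patch deployment customer-reports-service --patch "$(cat anti-affinity-patch.yaml)"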
Other related reference
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#a-pod-cannot-reach-itself-via-service-ip
Hairpin mode in my EKS setup is hairpin-veth. That page includes the instruction to ensure the kubelet has permission to operate in /sys on the node, but I'm not sure how to do that: on EKS the cbr0 interface is not there, the VPC CNI uses eni interfaces instead.
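For what it's worth, the hairpin mode the kubelet is actually running with can be inspected on a worker node (over SSH or SSM). This only confirms the setting, it is not a fix, and the /sys check from the linked page typically does not apply here because the VPC CNI uses no bridge:

# Show the hairpin mode the kubelet was started with, if set explicitly as a flag.
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- '--hairpin-mode'

# The per-port hairpin flag in /sys only exists for veths attached to a bridge
# (e.g. cbr0); with the AWS VPC CNI there is no bridge, so this usually prints nothing.
grep -H . /sys/class/net/*/brport/hairpin_mode 2>/dev/null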
The solution
OK, right after posting the question, AWS provided me with a solution to the problem:
Issue: https://github.com/aws/amazon-vpc-cni-k8s/issues/641
Downgrade the VPC CNI plugin to v1.5.3 until v1.5.5 is released: update the DaemonSet and restart all nodes. A sketch of the commands is below.
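Roughly, the downgrade looks like this. The registry account and region in the image name match eu-west-1 (the region seen in the tcpdump) and may differ in other setups, so treat them as assumptions:

# Pin the aws-node DaemonSet (the VPC CNI plugin) to v1.5.3.
# Registry account/region are for eu-west-1; adjust for your region.
kubectl -n kube-system set image daemonset/aws-node \
  aws-node=602401143452.dkr.ecr.eu-west-1.amazonaws.com/amazon-k8s-cni:v1.5.3

# Watch the rollout, then recycle the worker nodes so every pod gets its
# networking set up by the downgraded CNI.
kubectl -n kube-system rollout status daemonset/aws-node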