getaddrinfo: Temporary failure in name resolution kubernetes + coredns

5/17/2019

We have a service that sends large numbers of events in bulk. It basically opens multiple HTTP POST connections.

Since we moved the service to Kubernetes, we're getting "getaddrinfo: Temporary failure in name resolution" errors from time to time (most calls work, but some fail, which is weird).

Can anyone explain why this happens and how to fix it?

Thanks!

-- refaelos
coredns
kubernetes

1 Answer

8/12/2019

Check the Tinder post; they had a similar problem:

https://medium.com/tinder-engineering/tinders-move-to-kubernetes-cda2a6372f44

and the source for their DNS info:

https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts

TL;DR: check the ARP table cache gc_* parameters on your hosts, try disabling AAAA queries in the containers' /etc/gai.conf, move DNS to a DaemonSet, and inject the host IP as the DNS server for the pods.
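
To see whether any of these changes actually helps, it is useful to measure how often resolution fails from inside a pod before and after. Below is a minimal sketch in Python, not something taken from either linked post. The function name probe_resolution, the stats keys, and the 1-second slow_after threshold are all made up for illustration. It only relies on socket.getaddrinfo raising socket.gaierror with errno set to socket.EAI_AGAIN, which is the code behind the "Temporary failure in name resolution" message.

```python
import socket
import time


def probe_resolution(host, attempts=100, resolver=socket.getaddrinfo, slow_after=1.0):
    """Resolve `host` repeatedly and tally the outcomes.

    `resolver` is injectable so the function can be exercised without a network.
    `slow_after` (seconds) is an arbitrary threshold chosen for this sketch.
    """
    stats = {"ok": 0, "temporary_failure": 0, "other_failure": 0, "slow": 0}
    for _ in range(attempts):
        started = time.monotonic()
        try:
            resolver(host, 80, proto=socket.IPPROTO_TCP)
            stats["ok"] += 1
        except socket.gaierror as exc:
            # EAI_AGAIN corresponds to "Temporary failure in name resolution".
            if exc.errno == socket.EAI_AGAIN:
                stats["temporary_failure"] += 1
            else:
                stats["other_failure"] += 1
        # Count lookups that took suspiciously long, whether or not they succeeded.
        if time.monotonic() - started >= slow_after:
            stats["slow"] += 1
    return stats
```

Running something like probe_resolution("my-service.my-namespace.svc.cluster.local", attempts=500) from a pod (the hostname here is a placeholder) and comparing the temporary_failure and slow counts before and after a change gives a rough idea of whether that change made a difference.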

-- higuita
Source: StackOverflow