Kubenet response time degraded, solved using hostNetwork: true, with unicorn app

9/25/2018

I’m trying to debug an issue that is solved by using hostNetwork: true. The k8s installation is using kubenet, and the k8s version is 1.9.8.

The installation is done with kops on AWS, using m4.xlarge and c4.xlarge instances.

The problem is the following:

When we migrated this application to kubernetes, the response time (percentile 95) for a certain endpoint increased about 20-30%.

This issue is solved, though, when using hostNetwork: true in the yaml. The performance is then the same as it was on VMs for this endpoint, i.e. the 95th percentile of the response time for this endpoint is back to the same value.
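For context, the workaround is just one field in the pod spec. A minimal sketch (the name and image below are placeholders, not our real manifest):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unicorn-app                    # placeholder name
spec:
  hostNetwork: true                    # pod shares the node's network namespace instead of a kubenet-managed one
  containers:
  - name: app
    image: example/unicorn-app:latest  # placeholder image; the real pod also has the nginx and log collector containers
```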

I’ve asked this in the kubernetes office hours on July 18th (yeah, a while ago!) and the hostNetwork: true workaround came up there.

Please note that all the kube-proxy stuff can be discarded, as this increased response time is seen when measuring in the app itself. I mean, the ruby app measures the time it takes and sends it to the log collector. This time, measured from when the app starts processing the request until it finishes, already shows the degraded performance. So kube-proxy and the like are out of the equation.

The pod has 3 containers:

  • Nginx
  • A log collector
  • The app (a ruby app running with unicorn)

The same three processes also run together in the VM setup.

What I tried:

  • Found a way to reproduce using ab (Apache Benchmark)
    • ab -c 1 -n 1000 'https://...
    • The same happens with http, instead of https
  • I tried removing the nginx container, but it didn’t change anything. The log collector is used over localhost, and the very same setup is used on the VMs, which do not exhibit the problem
  • I tried using unix sockets between nginx and the app, instead of localhost, but it didn’t change anything either.
  • Tried using the same instance types (m4.xlarge) with EKS: the same happens, although the performance cost of not using hostNetwork: true is smaller, about 10%. Please note that EKS does not use kubenet and uses its own networking plugin, based on an open source project
  • Tried using another endpoint that just returns a string (puts “Ok”) and the issue does not happen
  • Tried using an endpoint that returns a few MBs (like "Die" * 10 * 1024 * 1024), and the issue does not happen either
  • Tried the same endpoint that has the issue with different query string params, so the response is either big (9 MB) or small (130 KB), and both reliably reproduce the issue
  • Tried a Node.js application that returns similar JSONs from similar sources, and the issue is not present (neither with short nor long responses)

What I might do next:

So, I’m trying to debug this issue to understand what it is and, hopefully, stop using hostNetwork: true. There seem to be a few paths to dig further:

  • Try other CNIs (EKS showed less performance degradation) to see if the performance changes

  • See what this endpoint does and how it interacts with unicorn and the whole stack. One big difference is that unicorn handles one request per process (synchronously), while Node.js does not.

  • Try newer instance types (m5/c5) to see if they mitigate the performance hit. But, as this issue is not present when the current instance types are used as plain VMs, it seems that even if this helps, it would only hide the problem

The endpoint that has the perf problem is a ruby endpoint that reads from a database and returns a JSON. The database, the host, the network: all seem fine (monitoring CPU, disk I/O, swap, etc. with vmstat, our regular tools, the AWS console, and checking kern.log, syslog and that kind of thing).

Have you had a similar experience, by any chance? Or do you have any other ideas on how to continue debugging this issue?

Any ideas or any kind of help is more than welcome!

Rodrigo

-- rata
amazon-eks
kubernetes

2 Answers

9/25/2018

Sounds like the overhead you're experiencing is due to Docker's NAT.
hostNetwork: true exposes the host's network to the pod/container(s), instead of going through NAT, which provides better performance... but reduces security.

Hope this helps!

-- samhain1138
Source: StackOverflow

9/26/2018

The problem seems to be https://github.com/kubernetes/kubernetes/issues/56903

The workarounds mentioned there (like dnsPolicy: Default) solve the issue for me.
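For reference, this is all the change amounts to in the pod spec (a sketch; the name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unicorn-app                    # placeholder name
spec:
  dnsPolicy: Default                   # use the node's resolv.conf instead of the kube-dns service
  containers:
  - name: app
    image: example/unicorn-app:latest  # placeholder image
```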

These two posts explain the problem in detail: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts and https://blog.quentin-machu.fr/2018/06/24/5-15s-dns-lookups-on-kubernetes/

And also provide some workarounds.

Long story short: there is a race condition in netfilter's conntrack that affects connectionless protocols (like UDP) when doing DNAT/SNAT. The Weave folks have sent a patch upstream that fixes most of the races. To work around it you can either use an external DNS server (i.e. not kube-dns, as it is exposed via a service and therefore uses DNAT), set resolver flags for glibc (they don't work for musl), add a minimal delay with tc, etc.
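As an illustration of the glibc route, the resolv.conf option can be injected via the pod's dnsConfig. This is only a sketch: on 1.9 the dnsConfig field may still be behind a feature gate, the names below are placeholders, and the option is honoured by glibc but ignored by musl:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: unicorn-app                    # placeholder name
spec:
  dnsConfig:
    options:
    - name: single-request-reopen      # glibc resolv.conf option often used to work around this conntrack race
  containers:
  - name: app
    image: example/unicorn-app:latest  # placeholder image
```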

Note: Using dnsPolicy: Default does the trick because the pod then uses the node's DNS configuration, i.e. a DNS server that is not hosted in Kubernetes and not accessed via a service, so no DNAT is involved.

I'll test the glibc flags for my cluster. Although dnsPolicy: Default does solve the issue for me, we rely on Kubernetes DNS service resolution in some apps, so we can't switch everything to it.

-- rata
Source: StackOverflow