How to handle ephemeral port exhaustion in Envoy

5/29/2018

One of the problems with reverse proxies handling many requests on behalf of clients is that, after a while under heavy load, the outgoing connections from the Envoy node to the backend nodes will exhaust the available ephemeral ports.
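To see why this happens: each outgoing connection is identified by the 4-tuple (source IP, source port, destination IP, destination port), so for a single source IP talking to a single backend IP:port, concurrency is capped by the ephemeral port range (32768–60999 by default on Linux). A rough capacity sketch, with hypothetical inputs:

```python
# Rough upper-bound estimate for concurrent upstream connections before
# ephemeral port exhaustion. Each concurrent connection consumes one
# ephemeral port per (source IP, destination IP, destination port) tuple.

def max_concurrent_connections(source_ips, backends, port_range=(32768, 60999)):
    """Upper bound on concurrent connections given the Linux default range."""
    ports_per_tuple = port_range[1] - port_range[0] + 1  # 28232 by default
    return source_ips * backends * ports_per_tuple

# One source IP, one backend IP:port: about 28k concurrent connections max.
print(max_concurrent_connections(1, 1))   # 28232
# A second source IP on the Envoy node doubles that ceiling.
print(max_concurrent_connections(2, 1))   # 56464
```

This is why adding source IPs (as the question proposes) raises the ceiling linearly, while connection reuse (as the answer proposes) attacks the problem by needing far fewer connections in the first place.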

Assuming that we have assigned multiple IP addresses/hostnames to the Envoy node, is there a way to tell Envoy to use these IP addresses/hostnames in a round-robin fashion when making connections to backends?
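For reference, Envoy can at least pin the upstream source address per cluster via the cluster-level `upstream_bind_config`; rotating across several addresses from one cluster is not obviously supported, so one cluster per source address may be needed. A sketch (field names per the Envoy cluster API of that era; the addresses and names are hypothetical):

```yaml
# Hypothetical cluster snippet: bind upstream connections for this
# cluster to one specific local source address on the Envoy node.
clusters:
  - name: backend
    connect_timeout: 1s
    type: STRICT_DNS
    upstream_bind_config:
      source_address:
        address: 10.0.0.5   # hypothetical secondary IP on the Envoy node
        port_value: 0       # 0 = let the kernel pick the ephemeral port
    hosts:
      - socket_address: { address: backend.internal, port_value: 8080 }
```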

References:

  1. https://blog.box.com/blog/ephemeral-port-exhaustion-and-web-services-at-scale/
  2. https://making.pusher.com/ephemeral-port-exhaustion-and-how-to-avoid-it/
  3. https://www.nginx.com/blog/overcoming-ephemeral-port-exhaustion-nginx-plus/
  4. https://github.com/kubernetes/kubernetes/issues/27398
-- Ashok Koyi
envoyproxy
kubernetes
port
reverse-proxy

1 Answer

5/29/2018

The most promising option is to find a way to enable TCP multiplexing between your proxy/LB and backend servers.

What is TCP Multiplexing?

TCP multiplexing is a technique used primarily by load balancers and application delivery controllers (but also by some stand-alone web application acceleration solutions) that enables the device to "reuse" existing TCP connections. This is similar to the way in which persistent HTTP 1.1 connections work in that a single HTTP connection can be used to retrieve multiple objects, thus reducing the impact of TCP overhead on application performance.

TCP multiplexing allows the same thing to happen for TCP-based applications (usually HTTP / web) except that instead of the reuse being limited to only 1 client, the connections can be reused over many clients, resulting in much greater efficiency of web servers and faster performing applications.

Another good explanation about TCP multiplexing can be found here.
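In Envoy terms, the closest equivalent is reusing upstream connections: keep-alive pooling is the default for HTTP/1.1 upstreams, and enabling HTTP/2 toward the backends lets many concurrent requests share a few TCP connections. A sketch (cluster API of that era; backend names are hypothetical, and the backends must actually speak HTTP/2):

```yaml
# Hypothetical cluster snippet: speak HTTP/2 to the backends so many
# concurrent requests are multiplexed over a few long-lived TCP
# connections, instead of one ephemeral port per in-flight request.
clusters:
  - name: backend
    connect_timeout: 1s
    type: STRICT_DNS
    http2_protocol_options: {}   # requires HTTP/2-capable backends
    hosts:
      - socket_address: { address: backend.internal, port_value: 8080 }
```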

Another option is adding more proxy instances to the pool behind the L4 network load balancer and setting the connection limit for each instance to a reasonable value. Each proxy would then carry a bounded share of the load without a problem. If you need to handle periodic bursts in load, you may want to apply an auto-scaling policy to the proxy pool.
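The per-instance cap can be expressed with Envoy's per-cluster circuit breaker thresholds; a sketch (the threshold value and names are hypothetical):

```yaml
# Hypothetical cluster snippet: cap upstream connections so one proxy
# instance cannot run through its ephemeral port range; excess requests
# fail fast and can land on other instances behind the L4 load balancer.
clusters:
  - name: backend
    connect_timeout: 1s
    type: STRICT_DNS
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 10000   # well below the ~28k default port range
    hosts:
      - socket_address: { address: backend.internal, port_value: 8080 }
```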

-- VAS
Source: StackOverflow