istio-proxy closing long running TCP connection after 1 hour

9/11/2020

TL;DR: How can we configure istio sidecar injection/istio-proxy/envoy-proxy/istio egressgateway to allow long living (>3 hours), possibly idle, TCP connections?

Some details:

We're trying to perform a database migration to PostgreSQL which is being triggered by one application which has Spring Boot + Flyway configured, this migration is expected to last ~3 hours.

Our application is deployed inside our kubernetes cluster, which has configured istio sidecar injection. After exactly one hour of running the migration, the connection is always getting closed.

We're sure it's istio-proxy closing the connection as we attempted the migration from a pod without istio sidecar injection and it was running for longer than one hour, however this is not an option going forward as this may imply some downtime in production which we can't consider.

We suspect this should be configurable in istio proxy setting the parameter idle_timeout - which was implemented here. However this isn't working, or we are not configuring it properly, we're trying to configure this during istio installation by adding --set gateways.istio-ingressgateway.env.ISTIO_META_IDLE_TIMEOUT=5s to our helm template.

-- Yayotrón
envoyproxy
istio
kubernetes
tcp

2 Answers

12/14/2020

If you use istio version higher than 1.7 you might try use envoy filter to make it work. There is answer and example on github provided by @ryant1986.

We ran into the same problem on 1.7, but we noticed that the ISTIO_META_IDLE_TIMEOUT setting was only getting picked up on the OUTBOUND side of things, not the INBOUND. By adding an additional filter that applied to the INBOUND side of the request, we were able to successfully increase the timeout (we used 24 hours)

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: listener-timeout-tcp
  namespace: istio-system
spec:
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.filters.network.tcp_proxy
    patch:
      operation: MERGE
      value:
        name: envoy.filters.network.tcp_proxy
        typed_config:
          '@type': type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy
          idle_timeout: 24h

We also created a similar filter to apply to the passthrough cluster (so that timeouts still apply to external traffic that we don't have service entries for), since the config wasn't being picked up there either.

-- Jakub
Source: StackOverflow

5/18/2021

for ingress gateway, we use env.ISTIO_META_IDLE_TIMEOUT to set the idle-timeout for TCP or HTTP protocol. for sidecar, you can use the similar envoyfilter (listener-timeout-tcp) to configure INBOUND direction or OUTBOUND direction.

-- HobbyTan
Source: StackOverflow