Istio sidecar upstream pod reconnection issue

3/19/2019

I'm running Istio 1.0.6 (no mTLS) on our EKS cluster and am seeing connection issues between two pods that I can't figure out.

I have a gRPC client (abc-client) and a gRPC server (abc-server), both written in Go. The cluster has automatic sidecar injection enabled, so both pods run the sidecar.
Everything works fine when I first start the two services, but whenever the server restarts, the client is not able to reconnect to it. From the restart onwards I continuously get the following error message on the client:

{"level":"error","error":"rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers","time":"2019-03-19T12:07:52Z","message":"abc-server service unhealthy"}

Without the Istio sidecar, reconnection works immediately. When I look at the sidecar logs of abc-client, I see many messages like the following for the health checks the client is attempting:

[2019-03-19T12:09:41.527Z] "POST /grpc.health.v1.Health/Check HTTP/2" 200 UF 5 0 1002 - "-" "grpc-go/1.15.0" "6a874d67-9c43-4b67-ad24-7337a26cac8d" "abc-server.abc-namespace:8086" "172.31.18.194:8086" outbound|8086||abc-server.abc-namespace.svc.cluster.local - 172.31.18.194:8086 172.31.22.131:44994

My takeaway from these logs is that the proxy fails to connect to the upstream server, which is no longer there, hence the UF flag (upstream connection failure), yet it still returns a 200 status code to the client. It seems to me that this behaviour doesn't trigger a reconnection.

I have not set up any VirtualServices or DestinationRules; it's just a simple Istio cluster with the sidecars enabled.

What am I missing? Thanks in advance for your help.

-- Memorex42
eks
grpc
istio
kubernetes

1 Answer

3/29/2019

The problem is connecting to headless services. My service.yaml looked like this:

apiVersion: v1
kind: Service
metadata:
  name: grpc-test-server
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: grpc-test-server
    role: grpc-test-server
  ports:
    - name: grpc
      port: 9999
      targetPort: grpc

As per issues #10659 and #7495, Istio does not yet work with headless services.

So, if you don't need your service to be headless, you can just remove clusterIP: None from the yaml and everything should work as expected.
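
For reference, here is a sketch of the same manifest with only the clusterIP: None line removed, so that Kubernetes assigns a regular cluster IP. The names and ports are taken from the manifest above, and the comments are added for illustration. I'm assuming the pod spec defines a container port named grpc, which the named targetPort relies on.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grpc-test-server
spec:
  # Regular (non-headless) service: with no clusterIP field set,
  # Kubernetes allocates a virtual IP for it.
  type: ClusterIP
  selector:
    app: grpc-test-server
    role: grpc-test-server
  ports:
    - name: grpc
      port: 9999
      # Named target port; assumes the pod declares a container port called "grpc".
      targetPort: grpc
```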

I still don't know, though, how to make StatefulSets work, since they require headless services.

-- Memorex42
Source: StackOverflow