gRPC socket closed on Kubernetes with ingress

4/25/2019

I have a gRPC server that works fine on my local machine. I can send gRPC requests from a Python app and get back the right responses.
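
For reference, the client side is nothing unusual. It does roughly this (the module, service, and method names here are placeholders, not my real proto):

import grpc

# Placeholder names standing in for my generated proto modules.
import rev79_pb2 as pb2
import rev79_pb2_grpc as pb2_grpc

# Locally the server listens in plaintext, so an insecure channel works.
channel = grpc.insecure_channel("localhost:50051")  # placeholder address
stub = pb2_grpc.SandboxStub(channel)

# Unary call: one request in, one response back.
response = stub.GetWidget(pb2.GetWidgetRequest(id="123"))
print(response)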

I put the server into a GKE cluster (with only one node) and had a normal TCP load balancer in front of the cluster. In this setup my local client got the correct responses to some requests, but not others. I think it was the gRPC streaming that didn't work.
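
That is, the unary calls seemed fine, while the calls consumed like this seemed to be the ones that failed (same placeholder stub as in the sketch above):

# Server-streaming call: iterating the returned object pulls messages
# off the stream one at a time.
for widget in stub.StreamWidgets(pb2.StreamWidgetsRequest()):
    print(widget)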

I assumed this was because streaming requires an HTTP/2 connection, which requires SSL.

The standard load balancer I got in GKE didn't seem to support SSL, so I followed the docs to set up an ingress load balancer, which does. I'm using a Let's Encrypt certificate with it.
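
With the ingress in place the client connects over TLS instead, roughly:

import grpc

# Default credentials use publicly trusted CA roots, which Let's Encrypt chains to.
creds = grpc.ssl_channel_credentials()
channel = grpc.secure_channel("sub-domain.domain.app:443", creds)
stub = pb2_grpc.SandboxStub(channel)  # same placeholder stub as above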

Now all gRPC requests return

status = StatusCode.UNAVAILABLE

details = "Socket closed"

debug_error_string = "{"created":"@1556172211.931158414","description":"Error received from peer ipv4:ip.of.ingress.service:443", "file":"src/core/lib/surface/call.cc", "file_line":1041,"grpc_message":"Socket closed","grpc_status":14}"
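
On the client side this surfaces as a grpc.RpcError on every call, which is where the status and details above come from:

try:
    stub.GetWidget(pb2.GetWidgetRequest(id="123"))
except grpc.RpcError as err:
    print(err.code())     # StatusCode.UNAVAILABLE
    print(err.details())  # "Socket closed"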

The IP address is the external IP address of my ingress service. The ingress YAML looks like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: rev79-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "rev79-ip"
    ingress.gcp.kubernetes.io/pre-shared-cert: "lets-encrypt-rev79"
    kubernetes.io/ingress.allow-http: "false" # disable HTTP
spec:
  rules:
  - host: sub-domain.domain.app
    http:
      paths:
      - path: /*
        backend:
          serviceName: sandbox-nodes
          servicePort: 60000

The subdomain and domain of the request from my Python app match the host in the ingress rule.

It connects to a NodePort service that looks like this:

apiVersion: v1
kind: Service
metadata:
  name: sandbox-nodes
spec:
  type: NodePort
  selector:
    app: rev79
    environment: sandbox
  ports:
  - protocol: TCP
    port: 60000
    targetPort: 9000

The pod itself has two containers, and its deployment looks like this:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rev79-sandbox
  labels:
    app: rev79
    environment: sandbox
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: rev79
        environment: sandbox
    spec:
      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1.31
        args: [
          "--http2_port=9000",
          "--service=rev79.endpoints.rev79-232812.cloud.goog",
          "--rollout_strategy=managed",
          "--backend=grpc://0.0.0.0:3011"
        ]
        ports:
        - containerPort: 9000
      - name: rev79-uac-sandbox
        image: gcr.io/rev79-232812/uac:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 3011
        env:
        - name: RAILS_MASTER_KEY
          valueFrom:
            secretKeyRef:
              name: rev79-secrets
              key: rails-master-key

The target of the NodePort service is the ESP container, which connects to the Endpoints service deployed in the cloud and proxies to the backend, a Rails app that implements the API. This Rails app isn't running the Rails server, but a specialised gRPC server that comes with the grpc_for_rails gem.

The grpc_server in the Rails app doesn't record any action in the logs, so I don't think the request gets that far.

kubectl get ingress reports this:

NAME            HOSTS                   ADDRESS            PORTS   AGE
rev79-ingress   sub-domain.domain.app   my.static.ip.addr   80      7h

showing port 80, even though it's set up with SSL. That seems to be a bug. When I check with curl -kv https://sub-domain.domain.app, the ingress handles the request fine and uses HTTP/2. It returns an HTML-formatted server error, but I'm not sure what generates that.

The API requires an API key, which the python client inserts into the metadata of each request.
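
The key is attached as per-call metadata, something like this (I believe ESP reads it from the x-api-key header, though that header name is from memory and may be off):

API_KEY = "..."  # elided

# gRPC metadata keys must be lowercase.
metadata = [("x-api-key", API_KEY)]
response = stub.GetWidget(pb2.GetWidgetRequest(id="123"), metadata=metadata)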

When I go to the endpoints page of my GCP console I see that the API hasn't registered any requests since I put in the ingress load balancer, so it looks like the requests are not reaching the ESP container.

So why am I getting "socket closed" errors with gRPC?

-- Toby 1 Kenobi
grpc
kubernetes
kubernetes-ingress
sockets

0 Answers