GKE basic-ingress intermittently returns 502 when backend returns 404/422

11/7/2019

I have an ingress providing routing for two microservices running on GKE. Intermittently, when a microservice returns a 404/422, the ingress returns a 502 instead.

Here is my ingress definition:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: develop-static-ip
    ingress.gcp.kubernetes.io/pre-shared-cert: dev-ssl-cert
spec:
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: srv
          servicePort: 80
      - path: /c/*
        backend:
          serviceName: collection
          servicePort: 80
      - path: /w/*
        backend:
          serviceName: collection
          servicePort: 80

I run tests that hit the srv back-end where I expect a 404 or 422 response. I have verified that when I hit the srv back-end directly (bypassing the ingress), the service responds correctly with the 404/422.

When I issue the same requests through the ingress, the ingress will intermittently respond with a 502 instead of the 404/422 coming from the back-end.

How can I have the ingress just return the 404/422 response from the back-end?

Here's some example code to demonstrate the behavior I'm seeing (the expected status is 404):

>>> for i in range(10):
...     resp = requests.get('https://<server>/a/v0.11/accounts/junk', cookies=<token>)
...     print(resp.status_code)
...

502
502
404
502
502
404
404
502
404
404

And here are the same requests issued from a Python prompt within the pod, i.e. bypassing the ingress:

>>> for i in range(10):
...     resp = requests.get('http://0.0.0.0/a/v0.11/accounts/junk', cookies=<token>)
...     print(resp.status_code)
...
404
404
404
404
404
404
404
404
404
404

Here's the output of the kubectl commands demonstrating that the load balancer is set up correctly (I never get a 502 when the microservice returns a 2xx/3xx response):

$ kubectl get pods -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP          NODE                                     NOMINATED NODE   READINESS GATES
srv-799976fbcb-4dxs7          2/2     Running   0          19m   10.24.3.8   gke-develop-default-pool-ea507abc-43h7   <none>           <none>
srv-799976fbcb-5lh9m          2/2     Running   0          19m   10.24.1.7   gke-develop-default-pool-ea507abc-q0j3   <none>           <none>
srv-799976fbcb-5zvmv          2/2     Running   0          19m   10.24.2.9   gke-develop-default-pool-ea507abc-jjzg   <none>           <none>
collection-5d9f8586d8-4zngz   2/2     Running   0          19m   10.24.1.6   gke-develop-default-pool-ea507abc-q0j3   <none>           <none>
collection-5d9f8586d8-cxvgb   2/2     Running   0          19m   10.24.2.7   gke-develop-default-pool-ea507abc-jjzg   <none>           <none>
collection-5d9f8586d8-tzwjc   2/2     Running   0          19m   10.24.2.8   gke-develop-default-pool-ea507abc-jjzg   <none>           <none>
parser-7df86f57bb-9qzpn       1/1     Running   0          19m   10.24.0.8   gke-develop-parser-pool-5931b06f-6mcq    <none>           <none>
parser-7df86f57bb-g6d4q       1/1     Running   0          19m   10.24.5.5   gke-develop-parser-pool-5931b06f-9xd5    <none>           <none>
parser-7df86f57bb-jchjv       1/1     Running   0          19m   10.24.0.9   gke-develop-parser-pool-5931b06f-6mcq    <none>           <none>

$ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
srv          NodePort    10.0.2.110   <none>        80:30141/TCP   129d
collection   NodePort    10.0.4.237   <none>        80:30270/TCP   129d
kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP        130d

$ kubectl get endpoints
NAME         ENDPOINTS                                AGE
srv          10.24.1.7:80,10.24.2.9:80,10.24.3.8:80   129d
collection   10.24.1.6:80,10.24.2.7:80,10.24.2.8:80   129d
kubernetes   35.237.239.186:443                       130d
-- RAY
google-cloud-platform
google-kubernetes-engine
http-response-codes
kubernetes
kubernetes-ingress

3 Answers

11/10/2019

502 is a tricky status code: it can mean a context cancelled by the client or simply a bad gateway from the server you are trying to reach. In Kubernetes, a 502 usually means you cannot reach the service, so I would start by debugging your services and deployments (see the Kubernetes documentation on debugging services).

Use kubectl get pods -o wide to list your srv pods and note their pod IPs. Then make sure the service is load-balancing across the srv deployment: run kubectl get svc and look for the srv service, then run kubectl get endpoints and check that the IPs listed for the srv endpoints match the pod IPs you noted. If this all checks out, then you are correctly load-balancing to your backend.
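
As a rough illustration of that check, here is a small Python sketch that compares the srv pod IPs against the IPs behind the srv endpoints. It assumes the srv pods carry an app=srv label and that kubectl is pointed at the right cluster; adjust the label selector to whatever the deployment actually uses.

import json
import subprocess

def kubectl_json(*args):
    # Shell out to kubectl and parse its JSON output.
    out = subprocess.check_output(['kubectl', *args, '-o', 'json'])
    return json.loads(out)

# Assumption: the srv deployment's pods are labelled app=srv.
pods = kubectl_json('get', 'pods', '-l', 'app=srv')
pod_ips = {p['status']['podIP'] for p in pods['items']}

endpoints = kubectl_json('get', 'endpoints', 'srv')
endpoint_ips = {addr['ip']
                for subset in endpoints.get('subsets', [])
                for addr in subset.get('addresses', [])}

print('pod IPs:     ', sorted(pod_ips))
print('endpoint IPs:', sorted(endpoint_ips))
print('endpoints cover all pods:', pod_ips == endpoint_ips)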

-- Rodrigo Loza
Source: StackOverflow

11/22/2019

tl;dr: GCP LoadBalancer/GKE Ingress will 502 if 404/422s from the back-ends don't have response bodies.

Looking at the LoadBalancer logs, I would see the following errors:

502: backend_connection_closed_before_data_sent_to_client
404: backend_connection_closed_after_partial_response_sent

Since everything was configured correctly (even the LoadBalancer said the backends were healthy: the backend was working as expected and there were no failed health checks), I experimented with a few things and noticed that all of my 404 responses had empty bodies.

So I added a body to my 404 and 422 responses and, lo and behold, no more 502s!
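
In other words, the fix amounts to making sure 404/422 responses carry a body. The thread doesn't say what framework srv is written in, so the following is only a minimal sketch of the idea, assuming a hypothetical Flask handler for the accounts route:

from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory store standing in for the real account lookup.
ACCOUNTS = {'abc123': {'id': 'abc123', 'name': 'example'}}

@app.route('/a/v0.11/accounts/<account_id>')
def get_account(account_id):
    account = ACCOUNTS.get(account_id)
    if account is None:
        # Return a JSON body along with the 404; an empty 404/422 response
        # is what the load balancer was turning into a 502.
        return jsonify(error='account not found'), 404
    return jsonify(account)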

-- RAY
Source: StackOverflow

11/13/2019

502 errors are expected when your backend service is returning 4xx errors to the Load Balancer's health checks. If the backend returns 4xx on the health check path, the health checks will fail, and if all backends are failing their health checks, the Load Balancer will not have an available backend to send the traffic to and will return a 502.
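
To illustrate that reasoning (a generic sketch, not the asker's actual service): if the health check hits a path the application answers with a 4xx, the backend gets marked unhealthy, so a common pattern is a dedicated health endpoint that always returns 200 while application routes remain free to return 404/422 to clients.

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/healthz')
def healthz():
    # Always answer 200 here so the load balancer's health check passes
    # regardless of what the application routes return.
    return 'ok', 200

@app.route('/a/v0.11/accounts/<account_id>')
def get_account(account_id):
    # Application routes can still return 4xx to clients without
    # affecting backend health.
    return jsonify(error='account not found'), 404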

For any 502 error returned from the Load Balancer, I strongly recommend checking the Stackdriver logs for the HTTP Load Balancer. Every 502 will include a message logged along with the 502 response, and that message should clarify why the 502 was returned (there are a number of possible reasons).

In your case, the 502 log entry should mention "failed_to_pick_backend" or "failed_to_connect_to_backend", or something along those lines. If you are using an nginx ingress, you can see similar behavior, but the 502 error message may say something different.

-- Patrick W
Source: StackOverflow