nginx ingress controller 0.26.1 returns 504 (timeout while connecting to upstream) on GKE v1.14

11/7/2019

I recently upgraded my GKE cluster to 1.14.x and the nginx ingress controller to the latest version, 0.26.1. At some point my ingresses stopped working.

For instance, when trying to access Nexus with curl INGRESS_IP -H "host:nexus.myorg.com", these are the ingress controller logs:

2019/11/07 08:35:49 [error] 350#350: *2664 upstream timed out (110: Connection timed out) while connecting to upstream, client: 82.81.2.76, server: nexus.myorg.com, request: "GET / HTTP/1.1", upstream: "http://10.8.25.3:8081/", host: "nexus.myorg.com"
2019/11/07 08:35:54 [error] 350#350: *2664 upstream timed out (110: Connection timed out) while connecting to upstream, client: 82.81.2.76, server: nexus.myorg.com, request: "GET / HTTP/1.1", upstream: "http://10.8.25.3:8081/", host: "nexus.myorg.com"
2019/11/07 08:35:59 [error] 350#350: *2664 upstream timed out (110: Connection timed out) while connecting to upstream, client: 82.81.2.76, server: nexus.myorg.com, request: "GET / HTTP/1.1", upstream: "http://10.8.25.3:8081/", host: "nexus.myorg.com"
82.81.2.76 - - [07/Nov/2019:08:35:59 +0000] "GET / HTTP/1.1" 504 173 "-" "curl/7.64.1" 79 15.003 [some-namespace-nexus-service-8081] [] 10.8.25.3:8081, 10.8.25.3:8081, 10.8.25.3:8081 0, 0, 0 5.001, 5.001, 5.001 504, 504, 504 a03f13a3bfc943e44f2df3d82a6ecaa4

As you can see, it retries the connection to 10.8.25.3:8081 (the pod IP) three times, timing out on each attempt.
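The trailing fields of that access-log line tell the same story. Assuming the default ingress-nginx log format for 0.26.x, after the second bracketed field (`[upstream-name] [alternative-upstream]`) come the upstream addresses tried, the bytes received from each attempt, the per-attempt upstream response times, and the per-attempt upstream status codes. Here each of the three attempts took 5.001 s, which matches the default proxy-connect-timeout of 5 s, and the total request time is 15.003 s. A small sketch pulling those fields out of the pasted line (field positions are an assumption based on the default log format, not something stated in the post):

```shell
#!/bin/sh
# Decode the per-attempt fields of an ingress-nginx access-log line.
# Field layout assumed from the default 0.26.x log format; verify it
# against your controller's configmap before relying on it.
line='82.81.2.76 - - [07/Nov/2019:08:35:59 +0000] "GET / HTTP/1.1" 504 173 "-" "curl/7.64.1" 79 15.003 [some-namespace-nexus-service-8081] [] 10.8.25.3:8081, 10.8.25.3:8081, 10.8.25.3:8081 0, 0, 0 5.001, 5.001, 5.001 504, 504, 504 a03f13a3bfc943e44f2df3d82a6ecaa4'

# Everything after the last "] " holds the upstream attempt data:
# addresses, bytes received, response times, status codes, request id.
attempts=${line##*\] }
echo "upstream attempts: $attempts"

# The total request time is the last field before the first "[".
total=$(printf '%s\n' "$line" | awk -F'\\[' '{n = split($2, a, " "); print a[n]}')
echo "total request time: ${total}s"
```

Running it prints the three attempt groups (each showing 0 bytes, 5.001 s, status 504) and the 15.003 s total, confirming the controller never got a byte back from the pod.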

I've sh'ed into another pod and accessed the Nexus pod using that same IP with no problem: curl 10.8.25.3:8081. So the pod itself is up and reachable from inside the cluster.

This is my Ingress config:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  namespace: some-namespace
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/add-base-url: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 30M
spec:
  rules:
  - host: nexus.myorg.com
    http:
      paths:
      - backend:
          serviceName: nexus-service
          servicePort: 8081

Any idea how to troubleshoot or fix this?

-- codependent
google-kubernetes-engine
kubernetes
kubernetes-ingress
nginx-ingress

1 Answer

11/7/2019

The problem had to do with network policies. We have policies that deny access to pods from other namespaces and allow it only from the ingress namespace:

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-namespace
  namespace: some-namespace
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          type: ingress
  podSelector: {}
  policyTypes:
  - Ingress

apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: some-namespace
spec:
  ingress:
  - from:
    - podSelector: {}
  podSelector: {}
  policyTypes:
  - Ingress

With the upgrade, the ingress-nginx namespace lost the label that the policy matches on (type=ingress). Simply re-adding it fixed the problem:

kubectl label namespaces ingress-nginx type=ingress
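To confirm the label is actually in place, you can list the namespace with its labels (kubectl get namespace ingress-nginx --show-labels) and check for the value the namespaceSelector expects. A minimal sketch of that check, using a hard-coded sample line in place of live kubectl output (the sample is illustrative, not from the original post):

```shell
#!/bin/sh
# Check whether a namespace listing carries the label the NetworkPolicy
# matches on. The sample line stands in for real output of:
#   kubectl get namespace ingress-nginx --show-labels
sample='ingress-nginx   Active   42d   type=ingress'

if printf '%s\n' "$sample" | grep -q 'type=ingress'; then
  echo "label present: allow-from-ingress-namespace can match this namespace"
else
  echo "label missing: run 'kubectl label namespace ingress-nginx type=ingress'"
fi
```

The same grep works on the real kubectl output; if the label is missing, the allow policy never matches and the deny policy silently drops the controller's connections, which is exactly what the upstream timeouts looked like.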

-- codependent
Source: StackOverflow