Application on GKE not reachable when configured with internal load balancer

11/11/2021

The title may sound a little cryptic, but here is the thing.

I have a Kubernetes cluster on GKE (v1.20.10-gke.1600) which hosts a bunch of small applications.

Some of them need to be available only through an internal IP. We use a VPN to act as if we were inside the network, and it works flawlessly for everything.

I'm fairly sure these applications were reachable when I first deployed them, but it's been two months already and they are rarely used, so when I got a call today telling me that users were getting an error, I was baffled.

So, when I set it up I followed this guide: https://cloud.google.com/kubernetes-engine/docs/how-to/internal-load-balance-ingress

These are the YAML configurations; hopefully I've redacted all the sensitive data:

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo-app
  labels:
    app: "foo"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: "foo"
  template:
    metadata:
      labels:
        app: "foo"
    spec:
      containers:
      - name: foo-container
        image: eu.gcr.io/project-12345/foo:--DEPLOYMENT-NUMBER--
        imagePullPolicy: Always
        resources:
          limits:
            memory: 300Mi
            cpu: 300m
          requests:
            memory: 100Mi
            cpu: 100m
        ports:
        - containerPort: 8080
          name: tcp
      nodeSelector:
        bha.preemptible: "false"
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: foo-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 95
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Service and Ingress

apiVersion: v1
kind: Service
metadata:
  name: foo-nodeport
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
  labels:
    app: "foo"
spec:
  selector:
    app: "foo"
  type: NodePort
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: foo-ingress
  annotations:
    kubernetes.io/ingress.class: gce-internal
    kubernetes.io/ingress.regional-static-ip-name: foo-ingress
spec:
  rules:
  - host: foo.domain.net
    http:
      paths:
      - path: /*
        pathType: ImplementationSpecific
        backend:
          service:
            name: foo-nodeport
            port: 
              number: 80

Running kubectl apply on those two YAMLs works without issues.
I took note of the warning about the firewall rules, ran the proper update, and went through the troubleshooting steps, but everything reports as healthy.
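
For reference, the firewall update was along these lines (just a sketch: the rule name and the 10.129.0.0/23 source range are placeholders, the real source range being my proxy-only subnet):

# allow the internal HTTP(S) LB proxies to reach the pods on the container port
gcloud compute firewall-rules create allow-lb-proxies \
    --network=network-shared-vpc \
    --direction=INGRESS \
    --allow=tcp:8080 \
    --source-ranges=10.129.0.0/23  # placeholder: the proxy-only subnet range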

I created the internal static IP, 10.11.0.3, prior to the deployment, and I can see it being used in the load balancer frontend. The load balancer backend is correctly pointing at the right node IP, 10.101.1.2.
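
This is how I verified it (foo-ingress being the reserved address name from the annotation above, europe-west2 my region):

# shows the reserved internal address, 10.11.0.3, and its status
gcloud compute addresses describe foo-ingress --region=europe-west2

# the ADDRESS column should report the same IP once the LB is programmed
kubectl get ingress foo-ingress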

However, if I try to open 10.11.0.3 or foo.domain.net from the browser, I get an upstream request timeout. If I open 10.101.1.2, the application loads.
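
For completeness, this is the equivalent test from a shell on the VPN (a sketch of what the browser does; the Host header is needed because the ingress routes on the hostname):

# times out against the ILB frontend...
curl -v -H "Host: foo.domain.net" http://10.11.0.3/
# ...while the backend answers directly
curl -v http://10.101.1.2/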

I checked whether the container had problems, but everything seems fine: the nginx process is running and there are no errors in the logs. It seems to me that the issue is somewhere in the load balancer configuration, but I can't figure out what is wrong.
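
In case it's useful, these are checks that should surface a backend problem (the k8s1-… name is the backend service GKE generated for my NodePort service; it also shows up in the log entry further down):

# the cloud.google.com/neg-status annotation shows which NEG the service is wired to
kubectl describe svc foo-nodeport

# health of the endpoints behind the generated regional backend service
gcloud compute backend-services get-health \
    k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42 \
    --region=europe-west2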

Does anyone know at least where to look to better understand where the problem is?

Edit:

Following a suggestion from @cesar, I checked Logs Explorer for any occurrence of the domain name, and the only things that come up are warnings with payloads like this one:

{
  "insertId": "ut0lskdjtr8",
  "jsonPayload": {
    "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
  },
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "http://foo.domain.net/",
    "requestSize": "517",
    "status": 504,
    "responseSize": "118",
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36",
    "remoteIp": "10.13.209.144:52409",
    "serverIp": "10.101.1.2:8080",
    "latency": "30.001717s",
    "protocol": "HTTP/1.1"
  },
  "resource": {
    "type": "internal_http_lb_rule",
    "labels": {
      "region": "europe-west2",
      "project_id": "project-12345",
      "backend_type": "NETWORK_ENDPOINT_GROUP",
      "backend_target_type": "BACKEND_SERVICE",
      "network_name": "network-shared-vpc",
      "backend_scope": "europe-west2-c",
      "matched_url_path_rule": "/",
      "backend_scope_type": "ZONE",
      "url_map_name": "k8s2-um-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc",
      "backend_name": "k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42",
      "backend_target_name": "k8s1-009b06c9-ns-frontend-apps-foo-nodeport-80-8d5eff42",
      "forwarding_rule_name": "k8s2-fr-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc",
      "target_proxy_name": "k8s2-tp-zeslpm63-ns-frontend-apps-foo-ingress-usro40mc"
    }
  },
  "timestamp": "2021-11-12T09:14:29.481099Z",
  "severity": "WARNING",
  "logName": "projects/project-12345/logs/loadbalancing.googleapis.com%2Frequests",
  "receiveTimestamp": "2021-11-12T09:15:07.271078019Z"
}
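
The latency of 30.001717s looks suspiciously like the proxy hitting the backend service's default 30-second timeout rather than the pod answering slowly. One thing I still want to double-check is the proxy-only subnet the proxies live in, with something like this (the two purpose values are the ones gcloud has used for proxy-only subnets):

# lists proxy-only subnets in the LB's network;
# in a Shared VPC setup this has to run against the host project
gcloud compute networks subnets list \
    --network=network-shared-vpc \
    --filter="purpose:(REGIONAL_MANAGED_PROXY INTERNAL_HTTPS_LOAD_BALANCER)"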
-- Claudio
google-cloud-platform
google-kubernetes-engine
kubernetes
networking

0 Answers