GCP load balancer backend status unknown

8/23/2018

I'm flabbergasted.

I have a staging and a production environment. Both environments have the same deployments, services, ingress, and firewall rules, and both serve a 200 on /.

However, after turning on the staging environment and provisioning the same ingress, the staging service fails with "Some backend services are in UNKNOWN state". Production is still live.

Both the frontend and backend pods are ready on GKE. I've manually tested the health checks and they pass when I visit /.

I see nothing in the logs or the GCP docs pointing in the right direction. What could I have possibly broken?
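
For reference, this is roughly how I have been poking at it (standard kubectl / gcloud commands; <BACKEND-SERVICE-NAME> stands in for whatever k8s-be-... backend service the controller created):

# How the GKE ingress controller reports the backends (see the "backends" annotation)
$ kubectl describe ingress fanout-ingress

# The health checks and backend services the controller created on the GCE side
$ gcloud compute health-checks list
$ gcloud compute backend-services list

# What GCE itself thinks about a given backend
$ gcloud compute backend-services get-health <BACKEND-SERVICE-NAME> --global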

ingress.yaml:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "STATIC-IP"
spec:
  backend:
    serviceName: frontend
    servicePort: 8080
  tls:
  - hosts:
    - <DOMAIN>
    secretName: staging-tls
  rules:
  - host: <DOMAIN>
    http:
      paths:
      - path: /*
        backend:
          serviceName: frontend
          servicePort: 8080
      - path: /backend/*
        backend:
          serviceName: backend
          servicePort: 8080

frontend.yaml:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: frontend
  name: frontend
  namespace: default
spec:
  ports:
  - nodePort: 30664
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: frontend
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  generation: 15
  labels:
    app: frontend
  name: frontend
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - image: <our-image>
        name: frontend
        ports:
        - containerPort: 8080
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
        livenessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
-- Mike
google-cloud-platform
google-kubernetes-engine
kubernetes

3 Answers

8/27/2018

Are you still experiencing this issue?

I tried to reproduce it by following the Google public documentation, Setting up HTTP Load Balancing with Ingress, to deploy a web app using the sample web application container image, which listens for HTTP on port 8080.
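
Roughly, the steps from that tutorial look like this (a sketch from memory; the exact sample image and resource names may differ):

# Deploy the sample app that serves HTTP on port 8080
$ kubectl run web --image=gcr.io/google-samples/hello-app:1.0 --port=8080

# Expose it as a NodePort service so the ingress can reach it
$ kubectl expose deployment web --target-port=8080 --type=NodePort

# Create the basic-ingress pointing at web:8080 (manifest from the tutorial)
$ kubectl apply -f basic-ingress.yaml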

However, it seems to be working now. So if you are still having this issue, please consider filing a public issue against kubernetes/ingress-gce using the Google issue-tracking tool. Include as much detail as possible, including steps to reproduce, so that the issue gets better visibility as well as more sampling.

Please note:

The Issue Tracker User Content and Conduct Policy details the types of information that are inappropriate to submit to Issue Tracker, which includes things like sensitive personal information and spam. Please do not submit inappropriate content in Issue Tracker.

Repro output: $ kubectl describe ing

sunny@test-dev:~$ kubectl describe ing basic-ingress
Name:             basic-ingress
Namespace:        default
Address:          xx.xxx.xxx.228
Default backend:  web:8080 (10.8.2.6:8080)
Rules:
  Host  Path  Backends
  ----  ----  --------
  *     *     web:8080 (10.8.2.6:8080)
Annotations:
  target-proxy:     k8s-tp-default-basic-ingress--f5636f071d87exxx
  url-map:          k8s-um-default-basic-ingress--f5636f071d87exxx
  backends:         {"k8s-be-31544--f5636f071d87exxx":"HEALTHY"}
  forwarding-rule:  k8s-fw-default-basic-ingress--f5636f071d87exxx
Events:
  Type    Reason   Age                From                     Message
  ----    ------   ----               ----                     -------
  Normal  Service  7m (x376 over 2d)  loadbalancer-controller  default backend set to web:31544
-- arp-sunny.
Source: StackOverflow

8/24/2018

Yesterday even this guide, https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer, didn't work. I don't know what happened, but even after waiting 30+ minutes the ingress was reporting UNKNOWN state for the backends.

After 24 hours, things seem to be much better. The L7 HTTP ingress works, but with a big delay in reporting healthy backends.
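
If anyone else is stuck waiting, the backends annotation on the ingress is where the state eventually flips from Unknown to HEALTHY (substitute your own ingress name):

# Poll this while waiting for the backends to report HEALTHY
$ kubectl describe ingress basic-ingress | grep backends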

-- Yannis
Source: StackOverflow

8/24/2018

I think this is a bug. I created a new cluster and couldn't reproduce it. If anyone hits this again, I would suggest trying a new cluster.
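
For completeness, a rough sketch of what that looked like (cluster name and zone below are placeholders):

# Create a fresh cluster and point kubectl at it
$ gcloud container clusters create staging-2 --zone us-central1-a
$ gcloud container clusters get-credentials staging-2 --zone us-central1-a

# Re-apply the same manifests from the question
$ kubectl apply -f frontend.yaml -f ingress.yaml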

-- Mike
Source: StackOverflow