I'm flabbergasted.
I have a staging and a production environment. Both environments have the same deployments, services, ingress, and firewall rules, and both serve a 200 on /.
However, after turning on the staging environment and provisioning the same ingress, the staging service fails with "Some backend services are in UNKNOWN state". Production is still live.
Both the frontend and backend pods are ready on GKE. I've manually tested the health checks and they pass when I visit /.
I see nothing in the logs or GCP docs pointing in the right direction. What could I have possibly broken?
ingress.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: fanout-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "STATIC-IP"
spec:
  backend:
    serviceName: frontend
    servicePort: 8080
  tls:
  - hosts:
    - <DOMAIN>
    secretName: staging-tls
  rules:
  - host: <DOMAIN>
    http:
      paths:
      - path: /*
        backend:
          serviceName: frontend
          servicePort: 8080
      - path: /backend/*
        backend:
          serviceName: backend
          servicePort: 8080
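For context on where that status string comes from: as far as I understand, the GKE ingress controller creates one GCE backend service per service/NodePort referenced above (so one for frontend and one for backend here), and the ingress just surfaces whatever health those backend services report. The provisioned load-balancer resources can be listed directly with standard gcloud commands (nothing project-specific assumed):
$ gcloud compute backend-services list
$ gcloud compute url-maps list
$ gcloud compute forwarding-rules list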
frontend.yaml:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: frontend
  name: frontend
  namespace: default
spec:
  ports:
  - nodePort: 30664
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: frontend
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  generation: 15
  labels:
    app: frontend
  name: frontend
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - image: <our-image>
        name: frontend
        ports:
        - containerPort: 8080
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
        livenessProbe:
          httpGet:
            path: /
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 3
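One more thing worth double-checking, since the load balancer's health check is what decides HEALTHY vs UNKNOWN: with this NodePort setup the check is sent to the nodes on port 30664, and as far as I understand the controller derives the request path from the readinessProbe above (GET / on 8080), which is the same path I tested by hand. What was actually provisioned can be inspected as below (the <HEALTH-CHECK-NAME> placeholder comes from the list output; if the first list is empty the controller may have created legacy checks, hence the second command):
$ gcloud compute health-checks list
$ gcloud compute http-health-checks list
$ gcloud compute health-checks describe <HEALTH-CHECK-NAME>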
Are you still experiencing this issue?
I tried to reproduce this by following the Google public documentation on Setting up HTTP Load Balancing with Ingress to deploy:
A web app using the sample web application container image that listens on an HTTP server on port 8080:
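Roughly the manifests used, as a minimal sketch based on that tutorial (not the exact files; the names web and basic-ingress match the describe output below, and the hello-app sample image serves HTTP on 8080):
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: gcr.io/google-samples/hello-app:1.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 8080
    targetPort: 8080
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  backend:
    serviceName: web
    servicePort: 8080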
However, it seems to be working now. So if you are still having this issue, please consider filing a public issue against kubernetes/ingress-gce using the Google issue-tracking tool. Include as many details as possible, including steps to reproduce, so that this issue gets better visibility as well as more sampling.
Please note:
The Issue Tracker User Content and Conduct Policy details the types of information that are inappropriate to submit to Issue Tracker, which include things like sensitive personal information and spam. Please do not submit inappropriate content to Issue Tracker.
Repro output of kubectl describe ing:
sunny@test-dev:~$ kubectl describe ing basic-ingress
Name:             basic-ingress
Namespace:        default
Address:          xx.xxx.xxx.228
Default backend:  web:8080 (10.8.2.6:8080)
Rules:
  Host  Path  Backends
  ----  ----  --------
  *     *     web:8080 (10.8.2.6:8080)
Annotations:
  target-proxy:     k8s-tp-default-basic-ingress--f5636f071d87exxx
  url-map:          k8s-um-default-basic-ingress--f5636f071d87exxx
  backends:         {"k8s-be-31544--f5636f071d87exxx":"HEALTHY"}
  forwarding-rule:  k8s-fw-default-basic-ingress--f5636f071d87exxx
Events:
  Type    Reason   Age                From                     Message
  ----    ------   ---                ----                     -------
  Normal  Service  7m (x376 over 2d)  loadbalancer-controller  default backend set to web:31544
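The backends annotation above mirrors what the GCE backend service itself reports, so the same health can be queried directly on the GCP side (backend name copied from the annotation, suffix redacted as in the output):
$ gcloud compute backend-services get-health k8s-be-31544--f5636f071d87exxx --global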
Yesterday even this guide https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer didn't work. I don't know what happened, but even after waiting 30+ minutes the ingress was still reporting UNKNOWN state for the backends.
After 24 hours, things seem to be much better. L7 HTTP ingress works, but with a big delay before the backends are reported healthy.
I think this is a bug. I created a new cluster and couldn't reproduce it. If anyone hits this again, I would suggest trying a new cluster.
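For anyone who wants to measure how long the flip to HEALTHY takes on a fresh ingress, polling the backends annotation is a simple option (a small sketch; basic-ingress is the example ingress from above, and the full annotation key on GKE is ingress.kubernetes.io/backends, as far as I recall):
$ watch -n 30 "kubectl get ingress basic-ingress -o jsonpath='{.metadata.annotations.ingress\.kubernetes\.io/backends}'"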