In our GKE cluster we have a service called php-services. It is defined like so:
apiVersion: v1
kind: Service
metadata:
  name: php-services
  labels:
    name: php-services
spec:
  type: NodePort
  ports:
  - port: 80
  selector:
    name: php-services
I can access this service from inside the cluster. If I run these commands on one of our pods (in the default namespace), I get the expected results:
bash-4.4$ nslookup 'php-services'
Name: php-services
Address 1: 10.15.250.136 php-services.default.svc.cluster.local
and
bash-4.4$ wget -q -O- 'php-services/health'
{"status":"ok"}
So the service is ready and responding correctly. I need to expose this service to external traffic. I'm trying to do it with an Ingress, using the following config:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-tls
  annotations:
    kubernetes.io/ingress.class: "gce"
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.global-static-ip-name: "kubernetes-ingress"
    kubernetes.io/ingress.allow-http: "false"
    external-dns.alpha.kubernetes.io/hostname: "gke-ingress.goout.net"
  namespace: default
spec:
  tls:
  - hosts:
    - php.service.goout.net
    secretName: router-tls
  rules:
  - host: php.service.goout.net
    http:
      paths:
      - backend:
          serviceName: php-services
          servicePort: 80
        path: /*
But then accessing http://php.service.goout.net/health gives a 502 error:
Error: Server Error The server encountered a temporary error and could
not complete your request.
Please try again in 30 seconds.
We also have other services with the same config which run OK and are accessible from outside.
I've found a similar question, but it doesn't offer a sufficient answer either.
I've also been following the Debug Service article, but that didn't help either, as the service itself is OK.
Any help with this issue is highly appreciated.
The GKE load balancer only accepts HTTP status 200, while Kubernetes health checks accept any code greater than or equal to 200 and less than 400.
Ok, so we've figured out what was wrong.
Take a look at the (shortened) YAML definition of the Deployment for the php-services service:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: php-services
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      name: php-services
  template:
    metadata:
      labels:
        name: php-services
    spec:
      containers:
      - name: php-services
        image: IMAGE_TAG
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 80
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 10
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health
            port: 80
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 10
        ports:
        - containerPort: 80
The Apache server inside the image was configured to redirect paths without a trailing slash to the same path with one. So when you requested /health, you actually got HTTP status 301 pointing to /health/, which then responded with 200.
In the scope of Kubernetes health checks this is OK as "Any code greater than or equal to 200 and less than 400 indicates success."
However, the problem lay in the GKE load balancer. It also has its own GKE health checks, derived from the checks in the Deployment definition. The important difference is that it only accepts HTTP status 200. And if the load balancer doesn't find a backend service healthy, it won't pass any external traffic to it.
Therefore we had two options to fix this:
- Make the server respond with 200 to both /health and /health/ (or, more precisely, just to /health).
- Change the readinessProbe and livenessProbe path definition to /health/.
We chose the latter and it fixed the problem.
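For illustration, the probe change might look roughly like the fragment below. This is a minimal sketch, not our exact manifest: it assumes every other probe setting (thresholds, delays, timeouts) stays as in the Deployment shown earlier, so those fields are left out here. Only the two path values differ.

```yaml
# Fragment of the php-services container spec (under spec.template.spec.containers).
# Both probes now request /health/, which answers 200 directly,
# instead of /health, which answers with a 301 redirect.
livenessProbe:
  httpGet:
    path: /health/   # was: /health
    port: 80
    scheme: HTTP
readinessProbe:
  httpGet:
    path: /health/   # was: /health
    port: 80
    scheme: HTTP
```

With this change the probe path returns 200 on the first request, which satisfies both the Kubernetes probes (any code from 200 up to, but not including, 400) and the stricter load balancer check (200 only).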