I am trying to set up an Ingress in GCE Kubernetes, but when I visit the IP address and path combination defined in the Ingress, I keep getting a 502 error.
Here is what I get when I run: kubectl describe ing --namespace dpl-staging
Name: dpl-identity
Namespace: dpl-staging
Address: 35.186.221.153
Default backend: default-http-backend:80 (10.0.8.5:8080)
TLS:
dpl-identity terminates
Rules:
Host Path Backends
---- ---- --------
*
/api/identity/* dpl-identity:4000 (<none>)
Annotations:
https-forwarding-rule: k8s-fws-dpl-staging-dpl-identity--5fc40252fadea594
https-target-proxy: k8s-tps-dpl-staging-dpl-identity--5fc40252fadea594
url-map: k8s-um-dpl-staging-dpl-identity--5fc40252fadea594
backends: {"k8s-be-31962--5fc40252fadea594":"HEALTHY","k8s-be-32396--5fc40252fadea594":"UNHEALTHY"}
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
15m 15m 1 {loadbalancer-controller } Normal ADD dpl-staging/dpl-identity
15m 15m 1 {loadbalancer-controller } Normal CREATE ip: 35.186.221.153
15m 6m 4 {loadbalancer-controller } Normal Service no user specified default backend, using system default
I think the problem is dpl-identity:4000 (<none>). Shouldn't I see the IP address of the dpl-identity service instead of <none>?
Here is my service description: kubectl describe svc --namespace dpl-staging
Name: dpl-identity
Namespace: dpl-staging
Labels: app=dpl-identity
Selector: app=dpl-identity
Type: NodePort
IP: 10.3.254.194
Port: http 4000/TCP
NodePort: http 32396/TCP
Endpoints: 10.0.2.29:8000,10.0.2.30:8000
Session Affinity: None
No events.
Also, here is the result of executing: kubectl describe ep -n dpl-staging dpl-identity
Name: dpl-identity
Namespace: dpl-staging
Labels: app=dpl-identity
Subsets:
Addresses: 10.0.2.29,10.0.2.30
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
http 8000 TCP
No events.
Here is my deployment.yaml:
apiVersion: v1
kind: Secret
metadata:
  namespace: dpl-staging
  name: dpl-identity
type: Opaque
data:
  tls.key: <base64 key>
  tls.crt: <base64 crt>
---
apiVersion: v1
kind: Service
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
spec:
  type: NodePort
  ports:
  - port: 4000
    targetPort: 8000
    protocol: TCP
    name: http
  selector:
    app: dpl-identity
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
  annotations:
    kubernetes.io/ingress.allow-http: "false"
spec:
  tls:
  - secretName: dpl-identity
  rules:
  - http:
      paths:
      - path: /api/identity/*
        backend:
          serviceName: dpl-identity
          servicePort: 4000
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: dpl-identity
    spec:
      containers:
      - image: gcr.io/munpat-container-engine/dpl/identity:0.4.9
        name: dpl-identity
        ports:
        - containerPort: 8000
          name: http
        volumeMounts:
        - name: dpl-identity
          mountPath: /data
      volumes:
      - name: dpl-identity
        secret:
          secretName: dpl-identity
The "Limitations" section of the Kubernetes documentation states:
All Kubernetes services must serve a 200 page on '/', or whatever custom value you've specified through GLBC's --health-check-path argument.
The issue is indeed the health check, and it seemed "random" for my apps, where I used name-based virtual hosts to reverse-proxy requests from the ingress via domains to two separate backend services. Both were secured using Let's Encrypt and kube-lego. My solution was to standardize the health-check path for all services sharing an ingress, and to declare the readinessProbe and livenessProbe configs in my deployment.yml file.
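For reference, a minimal sketch of what those probes could look like in the asker's container spec (the /healthz path and the timing values are assumptions; the endpoint just needs to return 200, and the GCE ingress controller can pick up the readinessProbe path when it creates the load balancer health check):

containers:
- image: gcr.io/munpat-container-engine/dpl/identity:0.4.9
  name: dpl-identity
  ports:
  - containerPort: 8000
    name: http
  readinessProbe:
    httpGet:
      path: /healthz      # assumed health endpoint; must return 200
      port: 8000
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:
    httpGet:
      path: /healthz      # same assumed endpoint as the readiness check
      port: 8000
    initialDelaySeconds: 15
    periodSeconds: 20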
I faced this issue with Google Cloud cluster node version 1.7.8 and found this issue that closely resembled what I experienced: https://github.com/jetstack/kube-lego/issues/27
I'm using the gce ingress class and kube-lego, and my backend service health checks were on / while kube-lego's is on /healthz. It appears that differing health-check paths with the gce ingress class might be the cause, so it may be worth updating the backend services to match the /healthz pattern so they all use the same path (or, as one commenter in the GitHub issue stated, updating kube-lego to pass on /).
The logs can be read from Stackdriver Logging; in my case it was a backend_timeout error. After increasing the default timeout (30s) via a BackendConfig, it stopped returning 502s even under load.
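A minimal sketch of such a BackendConfig (the name and the 120s value are placeholders; older GKE versions expose the CRD as cloud.google.com/v1beta1):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  namespace: dpl-staging
  name: dpl-identity-backendconfig
spec:
  timeoutSec: 120   # raise the backend service timeout from the 30s default

The NodePort Service then needs the annotation cloud.google.com/backend-config: '{"default": "dpl-identity-backendconfig"}' so the GCE ingress controller applies it to the corresponding backend.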
I solved the problem by re-applying the manifest:
kubectl apply -f ingress.yaml
Essentially, I followed Roy's advice and tried to turn it off and on again.
Your backend k8s-be-32396--5fc40252fadea594 is showing as "UNHEALTHY".
The Ingress will not forward traffic if the backend is UNHEALTHY; this results in the 502 error you are seeing.
It is being marked as UNHEALTHY because it is not passing its health check. Check the health check settings for k8s-be-32396--5fc40252fadea594 to see whether they are appropriate for your pod; the check may be polling a URI or port that is not returning a 200 response. You can find these settings under Compute Engine > Health Checks.
If they are correct, then there are many steps between your browser and the container that could be passing traffic incorrectly. You could try kubectl exec -it PODID -- bash (or ash if you are using Alpine) and then curl localhost to see whether the container is responding as expected. If it is, and the health checks are also configured correctly, that narrows the issue down to your service; you could then try changing the service from a NodePort type to a LoadBalancer and see if hitting the service IP directly from your browser works.
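For that last step, a sketch of the temporary change to the asker's Service (only the type changes; GCE then provisions an external IP you can hit directly, bypassing the Ingress):

apiVersion: v1
kind: Service
metadata:
  namespace: dpl-staging
  name: dpl-identity
  labels:
    app: dpl-identity
spec:
  type: LoadBalancer   # temporarily expose the service directly for testing
  ports:
  - port: 4000
    targetPort: 8000
    protocol: TCP
    name: http
  selector:
    app: dpl-identity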
I was having the same issue. It turns out I had to wait a few minutes before the ingress validated the service health. If anyone is going through the same and has done all the steps like readinessProbe and livenessProbe, just ensure your ingress is pointing to a NodePort service, and wait a few minutes until the yellow warning icon turns into a green one. Also, check the logs on Stackdriver to get a better idea of what's going on. My readinessProbe and livenessProbe are on /login, for the gce class, so I don't think they have to be on /healthz.
I had the same problem, and it persisted after I enabled livenessProbe as well as readinessProbe. It turned out this was to do with basic auth. I had added basic auth to the livenessProbe and the readinessProbe, but it turns out the GCE HTTP(S) load balancer doesn't have a configuration option for that.
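For context, a Kubernetes httpGet probe can send a basic-auth header itself, roughly like this (path, port, and credentials are placeholders); the GCE load balancer's own health check cannot be configured to send such a header, which is why the backend kept failing:

readinessProbe:
  httpGet:
    path: /healthz                          # placeholder health endpoint
    port: 8000
    httpHeaders:
    - name: Authorization
      value: Basic dXNlcjpwYXNzd29yZA==     # base64 of "user:password", example only
  periodSeconds: 10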
There seem to be a few other kinds of issues with this too; e.g., setting the container port to 8080 and the service port to 80 didn't work with the GKE ingress controller (yet I couldn't clearly determine what the problem was). Broadly, it looks to me like there is very little visibility, and running your own ingress controller is a better option in that respect.
I picked Traefik for my project; it worked out of the box, and I'd like to enable its Let's Encrypt integration. The only change I had to make to the Traefik manifests was tweaking the service object to disable access to the UI from outside the cluster and to expose my app through an external load balancer (GCE TCP LB). Also, Traefik is more native to Kubernetes. I tried Heptio Contour, but something didn't work out of the box (I'll give it a go next time when the new version comes out).