I want to deploy a gRPC + HTTP servers on GKE with HTTP/2 and mutual TLS. My deployment have both a readiness probe and liveness probe with custom path. I expose both the gRPC and HTTP servers via an Ingress.
deployment's probes and exposed ports:
livenessProbe:
failureThreshold: 3
httpGet:
path: /_ah/health
port: 8443
scheme: HTTPS
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
readinessProbe:
failureThreshold: 3
httpGet:
path: /_ah/health
port: 8443
scheme: HTTPS
name: grpc-gke
ports:
- containerPort: 8443
protocol: TCP
- containerPort: 50052
protocol: TCP
NodePort service:
apiVersion: v1
kind: Service
metadata:
name: grpc-gke-nodeport
labels:
app: grpc-gke
annotations:
cloud.google.com/app-protocols: '{"grpc":"HTTP2","http":"HTTP2"}'
service.alpha.kubernetes.io/app-protocols: '{"grpc":"HTTP2", "http": "HTTP2"}'
spec:
type: NodePort
ports:
- name: grpc
port: 50052
protocol: TCP
targetPort: 50052
- name: http
port: 443
protocol: TCP
targetPort: 8443
selector:
app: grpc-gke
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: grpc-gke-ingress
annotations:
kubernetes.io/ingress.allow-http: "false"
#kubernetes.io/ingress.global-static-ip-name: "grpc-gke-ip"
labels:
app: grpc-gke
spec:
rules:
- http:
paths:
- path: /_ah/*
backend:
serviceName: grpc-gke-nodeport
servicePort: 443
backend:
serviceName: grpc-gke-nodeport
servicePort: 50052
The pod does exist, and has a "green" status, before creating the liveness and readiness probes. I see regular logs on my server that both the /_ah/live
and /_ah/ready
are called by the kube-probe and the server responds with the 200
response.
I use a Google managed TLS certificate on the load balancer (LB). My HTTP server creates a self-signed certificate -- inspired by this blog.
I create the Ingress after I start seeing the probes' logs. After that it creates an LB with two backends, one for the HTTP and one for the gRPC. The HTTP backend's health checks are OK and the HTTP server is accessible from the Internet. The gRPC backend's health check fails thus the LB does not route the gRPC protocol and I receive the 502
error response.
This is with GKE master 1.12.7-gke.10. I also tried newer 1.13 and older 1.11 masters. The cluster has HTTP load balancing enabled and VPC-native enabled. There are firewall rules to allow access from LB to my pods (I even tried to allow all ports from all IP addresses). Delaying the probes does not help either.
Funny thing is that I deployed nearly the same setup, just the server's Docker image is different, couple of months ago and it is running without any issues. I can even deploy new Docker images of the server and everything is great. I cannot find any difference between these two.
There is one another issue, the Ingress is stuck on the "Creating Ingress" state for days. It never finishes and never sees the LB. The Ingress' LB never has a front-end and I always have to manually add an HTTP/2 front-end with a static IP and Google managed TLS certificate. This should be happening only for cluster which were created without "HTTP load balancing", but it happens in my case every time for all my "HTTP load balancing enabled" clusters. The working deployment is in this state for months already.
Any ideas why the gRPC backend's health check could be failing even though I see logs that the readiness and liveness endpoints are called by kube-probe?
EDIT:
describe svc grpc-gke-nodeport
Name: grpc-gke-nodeport
Namespace: default
Labels: app=grpc-gke
Annotations: cloud.google.com/app-protocols: {"grpc":"HTTP2","http":"HTTP2"}
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"cloud.google.com/app-protocols":"{\"grpc\":\"HTTP2\",\"http\":\"HTTP2\"}",...
service.alpha.kubernetes.io/app-protocols: {"grpc":"HTTP2", "http": "HTTP2"}
Selector: app=grpc-gke
Type: NodePort
IP: 10.4.8.188
Port: grpc 50052/TCP
TargetPort: 50052/TCP
NodePort: grpc 32148/TCP
Endpoints: 10.0.0.25:50052
Port: http 443/TCP
TargetPort: 8443/TCP
NodePort: http 30863/TCP
Endpoints: 10.0.0.25:8443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
and the health check for the gRPC backend is an HTTP/2 GET using path /
on port 32148
. Its description is "Default kubernetes L7 Loadbalancing health check." where as the description of the HTTP's back-end health check is "Kubernetes L7 health check generated with readiness probe settings.". Thus the health check for the gRPC back-end is not created from the readiness probe.
Editing the health check to point to port 30863
an changing the path to readiness probe fixes the issue.
GKE ingress just recently started supporting full gRPC support in beta (whereas HTTP2 ro HTTP1.1 conversion was used in the past). To use gRCP though, you need to add an annotation to the ingress "cloud.google.com/app-protocols: '{"http2-service":"HTTP2"}'". Refer to this how-to doc for more detais.
Editing the health check to point to the readiness probe's path and changed the port to the one of the HTTP back-end fixed this issue (look for the port in the HTTP back-end's health check. it is the NodePort's.). It runs know without any issues.
Using the same health check for the gRPC back-end as for the HTTP back-end did not work, it was reset back to its own health check. Even deleting the gRPC back-end's health check did not help, it was recreated. Only editing it to use a different port and path has helped.