Does the backend's readinessProbe path need to be declared in the Endpoint openapi.yaml file?

8/6/2019

I'm deploying a project that is a GKE container backend, with a Cloud Endpoints API on the front end. It goes through an Ingress and NodePort service. I've declared a readiness probe on the ESP container:

        readinessProbe:
          httpGet:
            path: /ping
            port: 8080

But I observed that the /ping path must be declared in openapi.yaml without any security. If it isn't declared, the probe gets a 404; if it is declared with a security definition, it gets a 401.

Is that expected? I don't see anything about that in the Endpoints Samples repo:

https://github.com/GoogleCloudPlatform/endpoints-samples/tree/master/k8s

-- jacob
google-cloud-endpoints
google-cloud-networking
google-cloud-platform
google-kubernetes-engine

4 Answers

8/8/2019

I'm assuming the term "backend" refers to the nodes of the cluster or the Pods run by GKE. In both cases, GKE is self-managed and the health of the nodes and the Pods is monitored by the master node, so they should always be available. More information on this can be found in Kubernetes Scheduler and GKE Cluster Architecture.

In this context, the health checks used for the GKE Ingress don't check the backend but the services running in the cluster. To check that these services are OK, readiness and liveness probes are used, as explained in this article.

-- Alfredo GH
Source: StackOverflow

3/13/2020

I was having the same problem, but it seems possible to have a healthcheck on the GCE sidecar AND the eventual backend (e.g. the api you have actually built).

The trick is, the healthcheck for your api has to be configured as a readinessProbe on the cloud endpoints container (not your api container). The endpoint that you check also has to have no security against it, so it is publicly accessible (although it seems like you could also configure the health check with an api key).

I followed previous suggestions and also added the "healthz" argument to the esp container. My API is running on port 80 with an unprotected status endpoint at /api/status.

spec:
  containers:
  - name: esp
    image: gcr.io/endpoints-release/endpoints-runtime:1
    args: [
      "--http_port=8081",
      "--backend=127.0.0.1:80",
      "--service=MY SERVICE",
      "--rollout_strategy=managed",
      "-z", "healthz",
    ]
    ports:
    - containerPort: 8081
    readinessProbe:
      httpGet:
        path: /api/status/
        port: 8081
  - name: api
    image: MY IMAGE
    ports:
      - containerPort: 80

This worked, so that check is now verifying that both containers are running with one readinessProbe. Interestingly, when I checked the LoadBalancer, it had also configured a health check against /healthz, when it hadn't before. That would be useful in diagnosing cases where the endpoints container is working but the api is not.

-- millsy
Source: StackOverflow

8/9/2019

If you want to health check your backend, you need to list the health check path in openapi config for ESP to forward it.
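As a rough sketch of what that could look like, assuming the /ping path from the question and an OpenAPI 2.0 document, the fragment below declares the path with an empty security list so that the probe is not rejected with a 401. The operationId and descriptions are placeholder names chosen for illustration, not taken from any sample.

```yaml
# Hypothetical fragment of openapi.yaml (OpenAPI 2.0).
# Only the paths section is shown; the rest of the document is assumed to exist.
paths:
  /ping:
    get:
      summary: Health check forwarded by ESP to the backend
      operationId: ping        # placeholder name; Endpoints expects an operationId per operation
      security: []             # empty list overrides any top-level security requirement
      responses:
        '200':
          description: Backend is healthy
```

With a declaration along these lines, a readinessProbe on the ESP port with path /ping should be forwarded to the backend instead of being answered with a 404.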

-- Wayne Zhang
Source: StackOverflow

8/7/2019

After more closely examining the documentation and code samples, I discovered that the ESP has a healthz argument for that:

    spec:
      containers:
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        args: [
          "-p", "8080",
          "-a", "127.0.0.1:8081",
          "-s", "SERVICE_NAME",
          "--rollout_strategy", "managed",
          "-z", "healthz",
        ]
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
        ports:
          - containerPort: 8080

(https://github.com/GoogleCloudPlatform/endpoints-samples/blob/master/k8s/esp_echo_gke_ingress.yaml)

When specifying the healthz argument for the ESP, it configures nginx to just return 200 for that path:

% if healthz:
    location = /${healthz} {
      return 200;
      access_log off;
    }
% endif

(https://github.com/cloudendpoints/esp/blob/5c9f586b14db58607d1780966cceac923809c150/start_esp/nginx-auto.conf.template#L123-L128)

So it looks like Google wants you to just do a health check on the ESP container, not your backend.

This answers my original question, although it still does lead to a second question, which is: Why not health check the backend?

-- jacob
Source: StackOverflow