What is the correct way to identify that an application is deployed successfully with all pods up on Kubernetes?

8/16/2021

After deploying your pods, how can one tell that all the pods are up and running? I have listed a few options that I think could be correct, but I want to understand what the standard way is to identify a successful deployment.

1. Connect to the application via its interface and use it to check whether all the pods (the cluster) are up (maybe good for stateful applications). For stateless applications, the pod being up should be enough.
2. Expose a RESTful API service which monitors the deployment and responds accordingly.
3. Use kubectl to connect to the pods and get the status of the pods and the containers running.

I think number 1 is the right way, but I wanted to understand the community's view on it.

-- Manish Khandelwal
kubernetes
kubernetes-pod

1 Answer

8/17/2021

All your approaches sound reasonable and will do the job, but why not just use the tools that Kubernetes gives us exactly for this purpose? ;)

There are two main health checks used by Kubernetes:

  • Liveness probe - to know if the container is running and working without issues (not hung, not in a deadlock state)
  • Readiness probe - to know if the container is able to accept requests

It is worth noting that there is also a "Startup probe", which protects slow-starting containers whose start time is difficult to estimate.
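
As a rough sketch of what a startup probe can look like, here is a container-level fragment (not a complete manifest). The endpoint path, the port, and the threshold values are assumptions chosen for illustration, not recommendations:

```yaml
# Fragment that goes under a container definition in a Pod spec.
# Path, port, and thresholds below are illustrative assumptions.
startupProbe:
  httpGet:
    path: /healthz        # hypothetical health endpoint
    port: 8080            # hypothetical container port
  periodSeconds: 10       # probe every 10 seconds
  failureThreshold: 30    # allow up to 30 failed attempts (about 300 s) before the container is restarted
```

While a startup probe is defined and has not yet succeeded, the liveness and readiness probes are not run, so a slow start does not trigger a restart.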

Liveness:

As mentioned earlier, the main goal of the liveness probe is to ensure that the container is not dead. If the probe keeps failing, Kubernetes (the kubelet) restarts the container, subject to the Pod's restartPolicy.

Readiness:

The main goal of the readiness probe is to check whether the container is able to handle traffic. In some cases the container may be running but unable to accept traffic. Readiness probes are defined the same way as liveness probes, but their goal is to check whether the application can currently answer requests within a reasonable time. If it cannot, Kubernetes stops sending traffic to the Pod (it is removed from the endpoints of matching Services) until the readiness probe passes again.
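
To make the difference between the two probes concrete, here is a minimal sketch of a Pod that declares both. The Pod name, image, port, and the /healthz and /ready paths are all placeholders assumed for illustration; your application has to serve whatever endpoints you configure:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probes-demo                            # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0   # placeholder image
      ports:
        - containerPort: 8080                  # assumed application port
      # Repeated failures here lead to a container restart.
      livenessProbe:
        httpGet:
          path: /healthz                       # assumed "am I alive" endpoint
          port: 8080
      # Failures here only stop traffic from being routed to the Pod.
      readinessProbe:
        httpGet:
          path: /ready                         # assumed "can I serve traffic" endpoint
          port: 8080
```

With probes like these in place, the READY column of kubectl get pods reflects the readiness probe result, and the RESTARTS column goes up when the liveness probe causes a restart.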

Implementation:

You have a few ways to implement probes:

  • run a command at a fixed interval and check that it completed successfully, meaning the return code is 0 (for example, running cat /tmp/healthy every few seconds).
  • send an HTTP GET request to the container at a fixed interval and check that it returns a success code (for example, Kubernetes sends an HTTP request to a /healthz endpoint defined in the container).
  • attempt to open a TCP socket to the container at a fixed interval and make sure that the connection is established (for example, Kubernetes connects to the container on port 8080).
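
The three mechanisms above could be written roughly as follows. These are container-level fragments, shown as separate YAML documents, not complete manifests. The command, path, and port come from the examples mentioned in the list; the periodSeconds value is an assumption:

```yaml
# 1. Command: succeeds when the command exits with return code 0.
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  periodSeconds: 5        # assumed interval
---
# 2. HTTP GET: succeeds when the response status is at least 200 and below 400.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5        # assumed interval
---
# 3. TCP socket: succeeds when a connection to the port can be opened.
livenessProbe:
  tcpSocket:
    port: 8080
  periodSeconds: 5        # assumed interval
```

The same three mechanisms can be used under readinessProbe and startupProbe as well.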

For both probes you can define a few arguments:

  • initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
  • periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
  • timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
  • successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup Probes. Minimum value is 1.
  • failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness probe means restarting the container. In case of readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
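
As an illustration of how these arguments fit together, here is a sketch of a readiness probe with every argument set explicitly. The endpoint, the port, and all values are assumptions picked to make the arithmetic easy to follow, not tuned recommendations:

```yaml
# Container-level fragment; endpoint, port, and values are illustrative assumptions.
readinessProbe:
  httpGet:
    path: /ready             # hypothetical readiness endpoint
    port: 8080               # hypothetical container port
  initialDelaySeconds: 5     # wait 5 s after the container starts before the first probe
  periodSeconds: 10          # then probe every 10 s
  timeoutSeconds: 2          # a probe with no answer within 2 s counts as a failure
  successThreshold: 2        # after a failure, 2 consecutive successes are needed to be Ready again
  failureThreshold: 3        # 3 consecutive failures (roughly 30 s) mark the Pod Unready
```

Setting successThreshold above 1 is only allowed for readiness probes, since it must be 1 for liveness and startup probes.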

Combining these two health checks helps make sure that the application has been deployed and is working correctly: the liveness probe ensures that a container is restarted when it stops working, and the readiness probe ensures that traffic does not reach a Pod whose container is not ready or is overloaded. For the probes to work properly you need to pick a suitable implementation method and suitable argument values, most often by trial and error. The Kubernetes documentation on configuring liveness, readiness and startup probes covers this in more detail.

-- Mikolaj S.
Source: StackOverflow