Liveness probe test in a Google Cloud clustered Kubernetes environment

11/23/2018

I want to test liveness probes in a Google Cloud clustered Kubernetes environment. How can I bring a pod or container down to test liveness probes?

The problem is that the ReplicaSets will automatically bring the pods back up if I delete any of them.

-- Subit Das
cluster-computing
gcloud
kubernetes

2 Answers

11/26/2018

The question is (quote) "...How can I bring a pod or container down to test liveness probes?". The type of probe isn't specified, but I'll assume it is HTTP GET or TCP socket.
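For reference, a minimal HTTP GET liveness probe in a pod spec might look like the hypothetical snippet below; the path, port, and timings are assumptions and should match your application:

    livenessProbe:
      httpGet:
        path: /healthz          # assumed health endpoint
        port: 8080              # assumed application port
      initialDelaySeconds: 5
      periodSeconds: 10
      failureThreshold: 3       # restart after 3 consecutive failures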

Assuming you have proper access to the node/host on which the pod is running:

  • Start a single pod.
  • Verify that the liveness probe passes, i.e. the application is up and working.
  • Find out on which node the pod is running. This, for example, will return the node's IP address:

    kubectl -n <namespace> get pod <pod-name> -o jsonpath={.status.hostIP}
  • Log onto the node.

  • Find the PID of the application process. For example, list all processes (ps aux) and look for the specific process or grep by (part of the) name: ps aux | grep -i <name>. Take the number in the second column. For example, the PID in this ps aux partial output is 13314:

    nobody   13314  0.0  0.6 145856 38644 ?        Ssl  13:24   0:00 /bin/prometheus --storage....
  • While on the node, suspend (pause/stop) the process by executing kill -STOP <PID>. For example, for the PID from above:

    kill -STOP 13314
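Put together, the sequence looks roughly like this (placeholders for namespace, pod name, process name, and PID; the first command runs from your workstation, the rest on the node):

    kubectl -n <namespace> get pod <pod-name> -o jsonpath={.status.hostIP}    # find the node
    ps aux | grep -i <name>                                                   # find the PID on the node
    kill -STOP <PID>                                                          # suspend the process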

At this point:

  • If there is no liveness probe defined, the pod should still be in Running status and should not be restarted, even though it won't be responding to connection attempts. To resume the stopped process, execute kill -CONT <PID>.

  • A properly configured HTTP GET or TCP socket liveness probe should fail because a connection to the application can't be established, and the kubelet should eventually restart the container.
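To observe the effect from your workstation while the process is suspended, you can watch the pod and inspect its events (the exact event messages vary), for example:

    # watch the pod; the RESTARTS counter should increase once the probe fails enough times
    kubectl -n <namespace> get pod <pod-name> -w

    # inspect recorded probe failures and the container restart
    kubectl -n <namespace> describe pod <pod-name>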

Note that this method may also work for "exec.command" probes, depending on what those commands do.
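For example, an exec probe like the hypothetical one below only checks for a file, so it may keep passing even while the main process is suspended, whereas an exec command that connects to the application itself would likely start failing:

    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy        # assumed marker file; this check never touches the suspended process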

Note also that most applications run as PID 1 in a (Docker) container. As the Docker docs explain: "...A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so". That is probably the reason why this approach won't work from inside the container.

-- apisim
Source: StackOverflow

11/25/2018

On Kubernetes, pods are mortal, and the number of live pods at any given time is guaranteed by the ReplicaSets (which are managed by Deployments). So, to take your pods down, you can scale the Deployment down to the number you need, or even to zero, like this:

kubectl scale deployment your-deployment-name --replicas=0
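To bring the pods back afterwards, scale the Deployment up again, for example:

kubectl scale deployment your-deployment-name --replicas=1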

However, if you are trying to test and verify that the Kubernetes Service resource is not sending traffic to a non-live or non-ready pod, here's what you can do: create another pod with the same labels as your real application pods, so that the label selector in the Service matches this new pod as well. Configure that pod with a liveness/readiness probe that always fails, so it will never be considered live/ready. Then hit your Service with requests to verify that it never routes to the new pod you created.
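A minimal sketch of such a decoy pod could look like this; the label, image, and probe port are placeholders, the label must match your Service's selector, and the readiness probe points at a port nothing listens on so that it always fails:

apiVersion: v1
kind: Pod
metadata:
  name: decoy
  labels:
    app: your-app              # must match the Service's label selector
spec:
  containers:
  - name: decoy
    image: nginx               # any image that keeps running will do
    readinessProbe:
      tcpSocket:
        port: 9999             # nothing listens here, so the probe always fails
      periodSeconds: 5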

-- Utku Ă–zdemir
Source: StackOverflow