I want to test liveness probes in a Google Cloud clustered Kubernetes environment. How can I bring a pod or container down to test liveness probes?
The problem is that the replica sets will automatically bring the pods back up if I delete any of them.
The question is (quote) "...How can I bring a pod or container down to test liveness probes?". The type of probe isn't specified, but I'll assume it is an HTTP GET or TCP socket probe.
Assuming you have proper access to the node/host on which the pod is running:
Find out on which node the pod is running. This, for example, will return the node's IP address:
kubectl -n <namespace> get pod <pod-name> -o jsonpath={.status.hostIP}
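The node name (rather than its IP) can also be read from the pod, for example:
kubectl -n <namespace> get pod <pod-name> -o jsonpath={.spec.nodeName}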
Log onto the node.
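Since this is Google Cloud, one way to do this for a GKE node (with <node-name> and <zone> as placeholders for your cluster) is:
gcloud compute ssh <node-name> --zone <zone>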
Find the PID of the application process. For example, list all processes (ps aux) and look for the specific process, or grep by (part of) the name: ps aux | grep -i <name>. Take the number in the second column. For example, the PID in this partial ps aux output is 13314:
nobody 13314 0.0 0.6 145856 38644 ? Ssl 13:24 0:00 /bin/prometheus --storage....
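If pgrep is available on the node, the same lookup can be done in one step; it prints the PIDs of processes whose command line matches the given (partial) name:
pgrep -f <name>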
While on the node, suspend (pause/stop) the process by executing kill -STOP <PID>. For example, for the PID from above:
kill -STOP 13314
At this point:
If there is no liveness probe defined, the pod should remain in Running status and not be restarted, even though it won't respond to connection attempts. To resume the stopped process, execute kill -CONT <PID>.
A properly configured HTTP GET or TCP socket liveness probe should fail because a connection to the application can't be established, and the container will be restarted; the commands below can be used to watch this happen.
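One way to observe the restart, assuming the same <namespace>/<pod-name> placeholders, is to follow the pod's RESTARTS counter and its events from another terminal:
kubectl -n <namespace> get pod <pod-name> -w
kubectl -n <namespace> describe pod <pod-name>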
Note that this method may also work for exec command probes, depending on what those commands do.
Note also that most applications run as PID 1 in a (Docker) container. As the Docker docs explain, "...A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so." That is probably why this approach won't work from inside the container.
On Kubernetes, pods are mortal, and the number of live pods at any given time is guaranteed by the ReplicaSets (which are in turn managed by Deployments). So, to take your pods down, you can scale the Deployment down to the number you need, or even to zero, like this:
kubectl scale deployment your-deployment-name --replicas=0
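To bring the pods back afterwards, scale the Deployment back up to its original replica count (3 here is just an example):
kubectl scale deployment your-deployment-name --replicas=3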
However, if you are trying to verify that the Kubernetes Service resource does not send traffic to a pod that is not live or not ready, here's what you can do: create another pod with the same labels as your real application pods, so that the label selector in the Service matches this new pod as well. Configure the pod with an invalid liveness/readiness probe, so it will never be considered live/ready. Then hit your Service with requests and verify that it never reaches the new pod you created. A sketch of such a decoy pod follows.
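Here is a minimal sketch of such a decoy pod, assuming (hypothetically) that your real pods carry the label app: my-app and that the Service selects on it. The readiness probe points at a port nothing listens on, so the pod never becomes Ready and the Service should never route to it:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: decoy
  labels:
    app: my-app            # must match the Service's label selector (hypothetical label)
spec:
  containers:
  - name: decoy
    image: nginx           # any long-running image works here
    readinessProbe:
      tcpSocket:
        port: 9999         # nothing listens on this port, so the probe always fails
      periodSeconds: 5
EOF
With the pod running but never Ready, requests sent to the Service should only ever reach the real application pods; kubectl get endpoints <service-name> should confirm that the decoy is absent from the endpoint list.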