I have my rook-ceph cluster running on AWS. It's loaded up with data. Is there any way to simulate a power failure so that I can test the behaviour of my cluster?
It depends on the purpose of your crash test. I see two options:
You want to test whether you deployed Kubernetes on AWS correctly: then I'd terminate the related AWS EC2 instance (or set of instances).
You want to test whether your end application is resilient to Kubernetes node failures: then I'd just check which Pods are running on the given node and kill them all suddenly with:
kubectl delete pods <pod> --grace-period=0 --force
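As a rough sketch, both options could be wrapped in small shell helpers. The function names terminate_instances and kill_pods_on_node and the DRY_RUN switch are my own invention, not part of any tool. With DRY_RUN=1 (the default here) the destructive commands are only printed, so you can check what would be run first.

```shell
#!/bin/bash
# Hypothetical helpers for the two crash-test options above.
# DRY_RUN=1 (default) only prints the destructive commands.
DRY_RUN="${DRY_RUN:-1}"

# Option 1: terminate one or more EC2 instances by instance ID.
terminate_instances() {
  if [ "$#" -eq 0 ]; then
    echo "usage: terminate_instances <instance-id>..." >&2
    return 1
  fi
  if [ "$DRY_RUN" = "1" ]; then
    echo "aws ec2 terminate-instances --instance-ids $*"
  else
    aws ec2 terminate-instances --instance-ids "$@"
  fi
}

# Option 2: force-delete every Pod scheduled on the given node.
kill_pods_on_node() {
  local node="$1"
  if [ -z "$node" ]; then
    echo "usage: kill_pods_on_node <node-name>" >&2
    return 1
  fi
  # Print "<namespace> <pod>" for each Pod on the node, one per line.
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=${node}" \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' |
  while read -r ns pod; do
    [ -n "$pod" ] || continue
    if [ "$DRY_RUN" = "1" ]; then
      echo "kubectl delete pod $pod -n $ns --grace-period=0 --force"
    else
      kubectl delete pod "$pod" -n "$ns" --grace-period=0 --force
    fi
  done
}
```

To actually run them, call for example DRY_RUN=0 kill_pods_on_node <node-name>. Keep in mind that Pods managed by a controller will be recreated after they are deleted.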
From Docker you can send the signal SIGPWR (power failure, System V) to a container:
docker kill --signal="SIGPWR" <container>
and from Kubernetes:
kubectl exec <pod> -- /killme.sh
where the script killme.sh is:
beginning of script-----
#!/bin/bash
# Find the PIDs of all running iperf processes
kiperf=$(pidof iperf)
# Send signal 30 (SIGPWR on x86 and ARM Linux) to every iperf process
kill -30 $kiperf
script end -------------
You can find signal 30 (SIGPWR) here.
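Since the number of SIGPWR is not the same on every Linux architecture, a slightly safer variant is to send the signal by name. A small sketch, assuming bash on Linux (the helper names signal_number and send_sigpwr are made up for this example):

```shell
#!/bin/bash
# Hypothetical helpers; assumes bash on Linux, whose builtin kill
# understands signal names such as PWR.

# Print the number the local system uses for a signal name, e.g. PWR.
signal_number() {
  kill -l "$1"
}

# Send SIGPWR by name to the given PIDs.
send_sigpwr() {
  if [ "$#" -eq 0 ]; then
    echo "usage: send_sigpwr <pid>..." >&2
    return 1
  fi
  kill -s PWR "$@"
}
```

Inside the container this could be used as send_sigpwr $(pidof iperf), and signal_number PWR shows which number the system maps the name to.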
Pods in a cluster do not disappear until someone (a person or a controller) destroys them, or there is an unavoidable hardware or system software error.
Developers call these unavoidable cases involuntary disruptions to an application. Examples are:
Developers call other cases voluntary disruptions. These include both actions initiated by the application owner and those initiated by a Cluster Administrator.
Typical application owner actions include:
You can find more information here: kubernetes-disruption, application-disruption.
You can set up Prometheus on your cluster and measure metrics during the failure.
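For example, here is a hedged sketch of querying Prometheus over its HTTP API while the failure is in progress. PROM_URL and prom_query are names I made up, and the default address assumes Prometheus is reachable on localhost:9090 (for instance through a port-forward). The metric up is the standard per-target health metric.

```shell
#!/bin/bash
# Hypothetical helper; PROM_URL is an assumed address, adjust it to your setup.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query and print the raw JSON response.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

Calling prom_query up every few seconds during the test gives a simple record of which scrape targets went down and when they came back.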