Question

How to simulate Power Failure In Kubernetes

7/1/2019

I have my rook-ceph cluster running on AWS. It's loaded up with data. Is there any way to simulate a power failure so that I can test the behaviour of my cluster?

-- Rajat Singh

amazon-web-services

google-kubernetes-engine

kubernetes

openshift

3 Answers

7/2/2019

It depends on the purpose of your crash test. I see two options:

You want to test whether you deployed Kubernetes correctly on AWS: then I'd terminate the related AWS EC2 instance (or set of instances).
You want to test whether your end application is resilient to Kubernetes node failures: then I'd just check which Pods are running on the given node and kill them all suddenly with:

kubectl delete pods <pod> --grace-period=0 --force
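
The second option can be sketched as a small script. This is only an illustration under a few assumptions: the function names (`build_delete_commands`, `list_pods_on_node`, `kill_pods_on_node`) and the `DRY_RUN` variable are hypothetical, `kubectl` is assumed to be configured for the target cluster, and the node name is whatever `kubectl get nodes` reports. By default it only prints the commands it would run.

```shell
#!/bin/bash
# Sketch: force-delete every pod scheduled on one node.
# DRY_RUN (hypothetical variable) defaults to 1: commands are printed, not run.

# Read "namespace pod" pairs on stdin and print one force-delete command per pair.
build_delete_commands() {
  local ns pod
  while read -r ns pod; do
    [ -n "$pod" ] || continue  # skip blank or malformed lines
    printf 'kubectl delete pod %s --namespace %s --grace-period=0 --force\n' "$pod" "$ns"
  done
}

# List the pods bound to the given node as "namespace pod" pairs.
list_pods_on_node() {
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$1" \
    -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name \
    --no-headers
}

# Print the delete commands for a node, or execute them when DRY_RUN=0.
kill_pods_on_node() {
  local node=$1
  if [ "${DRY_RUN:-1}" = "0" ]; then
    list_pods_on_node "$node" | build_delete_commands | sh
  else
    list_pods_on_node "$node" | build_delete_commands
  fi
}
```

After sourcing the file, `kill_pods_on_node <node>` prints the commands for review, and `DRY_RUN=0 kill_pods_on_node <node>` actually runs them.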

-- Rafał Leszko

Source: StackOverflow

7/1/2019

With Docker, you can send the "SIGPWR" signal (power failure, System V) to a container:

docker kill --signal="SIGPWR" <container>

and from Kubernetes:

kubectl exec <pod> -- /killme.sh

where the script killme.sh is:

beginning of script-----
#!/bin/bash
# Find the PIDs of all running iperf processes
kiperf=$(pidof iperf)
# Send signal 30 (SIGPWR on Linux x86/ARM) to all of them
kill -30 $kiperf
script end -------------
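
A variant of the script above could send the signal by name instead of by number, since the number assigned to SIGPWR differs between architectures. This is a minimal sketch, assuming a Linux system where the shell's `kill` knows the `PWR` signal name; the function names `find_pids` and `send_sigpwr` are hypothetical.

```shell
#!/bin/bash
# Sketch: send SIGPWR by name to every process with a given name.

# Print the PIDs of the named process (empty output if none are running).
find_pids() {
  pidof "$1" || true
}

# Send SIGPWR to every PID passed as an argument; fail if none were given.
send_sigpwr() {
  [ "$#" -gt 0 ] || { echo "no matching process" >&2; return 1; }
  kill -s PWR "$@"
}
```

Usage would look like `send_sigpwr $(find_pids iperf)`; the command substitution is left unquoted on purpose so that several PIDs become separate arguments.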

signal 30 you can find here

-- Soleil

Source: StackOverflow

7/3/2019

Cluster Pods do not disappear until someone (a person or a controller) destroys them, or there is an unavoidable hardware or system software error.

Developers call these unavoidable cases involuntary disruptions to an application. Examples are:

a hardware failure of the physical machine backing the node
cluster administrator deletes VM (instance) by mistake
cloud provider or hypervisor failure makes VM disappear
a kernel panic
the node disappears from the cluster due to cluster network partition
eviction of a pod due to the node being out-of-resources

Except for the out-of-resources condition, all these conditions should be familiar to most users; they are not specific to Kubernetes.

Developers call other cases voluntary disruptions. These include both actions initiated by the application owner and those initiated by a Cluster Administrator.

Typical application owner actions include:

deleting the deployment or other controller that manages the pod
updating a deployment’s pod template causing a restart
directly deleting a pod (e.g. by accident)

You can find more information here: kubernetes-disruption, application-disruption.

You can set up Prometheus on your cluster and measure metrics during the failure.

-- MaggieO

Source: StackOverflow
