Kubernetes container crash

11/24/2021

I have a Deployment YAML file with 4 different containers in a single Pod. When one of the containers crashes, I want the whole Pod to restart.

Right now, when a container crashes, only that container restarts, but I want all containers to restart.

-- Kufu
kubernetes
kubernetes-pod

2 Answers

11/24/2021

A Deployment can't do this, but you could abuse a Job with a restartPolicy of Never to achieve it. It's a hack, though, and definitely violates best practices. It requires each container to fail deliberately whenever any other container fails:

  1. Use a Job with a restartPolicy of Never instead of a Deployment
  2. Make each container regularly write some "I'm alive" message to some shared location, e.g. an emptyDir volume that's shared between all the containers in the Pod
  3. Make each container monitor the "I'm alive" messages of all other containers, and when one is missing (which means that the corresponding container has crashed), crash the monitoring container on purpose (e.g. exit 1)

The effect is that when one container crashes, all containers crash. Once all containers have crashed, the Pod is marked as Failed and the Job controller replaces the entire Pod with a new one.

Note, however, that each Pod failure counts against the backoffLimit of the Job, so once this limit is reached, the Job is marked as failed and no more Pods are created. Also note that this works only if the Pod template in the Job has a restartPolicy of Never: with OnFailure (see docs), a failed container is restarted in place, so only that container restarts rather than the whole Pod, and each such container restart still counts against the Job's backoffLimit (see docs).
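The three steps and the restartPolicy/backoffLimit settings described above could be sketched roughly as the following Job manifest. This is only a minimal illustration, not a tested production setup: the names (all-or-nothing, app-a, app-b, heartbeats), the busybox image, the `sleep 3600` placeholder workload, and the 5 second / 30 second timings are all assumptions made for the example, and only two of the four containers are shown. The watchdog script treats any exit of the workload as a failure, which fits a long-running service but not a task that is expected to finish.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: all-or-nothing               # hypothetical name
spec:
  backoffLimit: 6                    # every failed Pod counts against this limit
  template:
    spec:
      restartPolicy: Never           # containers are never restarted in place
      volumes:
        - name: heartbeats
          emptyDir: {}               # shared by all containers, fresh for each new Pod
      containers:
        - name: app-a
          image: busybox:1.36        # placeholder image
          env:
            - {name: ME, value: app-a}
            - {name: PEERS, value: "app-b"}   # space-separated list of the other containers
          volumeMounts:
            - {name: heartbeats, mountPath: /heartbeats}
          command: ["sh", "-c"]
          args: &watchdog            # YAML anchor so the script is written only once
            - |
              sleep 3600 &           # placeholder workload; replace with the real process
              WORK=$!
              START=$(date +%s)
              while kill -0 "$WORK" 2>/dev/null; do
                # Step 2: publish our own heartbeat (write + rename, so readers never see a partial file)
                date +%s > "/heartbeats/$ME.tmp" && mv "/heartbeats/$ME.tmp" "/heartbeats/$ME"
                NOW=$(date +%s)
                # Step 3: exit on purpose if any peer's heartbeat is older than 30 seconds
                for P in $PEERS; do
                  # A peer that has not written yet is measured from our own start time
                  LAST=$(cat "/heartbeats/$P" 2>/dev/null || echo "$START")
                  if [ $((NOW - LAST)) -gt 30 ]; then exit 1; fi
                done
                sleep 5
              done
              exit 1                 # the workload itself exited: report failure
        - name: app-b
          image: busybox:1.36
          env:
            - {name: ME, value: app-b}
            - {name: PEERS, value: "app-a"}
          volumeMounts:
            - {name: heartbeats, mountPath: /heartbeats}
          command: ["sh", "-c"]
          args: *watchdog            # same script as app-a
```

The 30 second threshold also has to cover the time a peer needs to pull its image and start, so it would likely need tuning for real images.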

As mentioned, this is an abuse of what a Job is supposed to do, so I would not recommend using it in production or for any serious workloads. But it would probably let you do what you want for playing around.

-- weibeld
Source: StackOverflow

11/24/2021

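You can use a preStop hook on each container to send a message to the other containers telling them to shut down, which will cause those containers to be restarted. It is not the same as restarting the Pod, but it might help with your use case. Keep in mind that a preStop hook runs only when Kubernetes itself terminates the container (for example after a failed liveness probe), not when the process exits on its own.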

-- Reza Nasiri
Source: StackOverflow