In my deployment, a pod can get into a state where it needs to be recreated. In that state it can still process traffic, but it should be recreated as soon as possible.
So I am thinking about having a livenessProbe that reports failure when the pod needs to be restarted. The readiness probe would still report OK.
I know that eventually Kubernetes will recreate all pods and the system will be fine again.
My question is: can this be done without an outage? Let's assume all pods of a ReplicaSet report that they are not alive at the same time. Will Kubernetes kill them all and then replace them, or will it act in a rolling-update fashion, where it starts a new pod, waits for it to be ready, then kills one not-alive pod, and continues this way until all are replaced?
Is this the default behaviour of Kubernetes? If not, can it be configured to behave like this?
Kubernetes will not do a rolling replacement of pods that fail because of a probe or for any other reason. A failed liveness probe makes the kubelet restart that pod's container on its own, independently of the other replicas.
As for the probes: when the liveness probe first runs and how often it runs is specified in the probe itself. Because all replicas are managed by a single ReplicaSet, these values are the same for every replica. So yes, by default they would all be restarted at about the same time.
But if you want to do this without an outage, you can create two ReplicaSets that manage two separate sets of the same pod, with different values for the liveness probe parameters below:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed.
failureThreshold: When a Pod starts and the probe fails, Kubernetes will try failureThreshold times before giving up.
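As a rough illustration of where these parameters live, the probe section of a container spec might look like the fragment below. The endpoint paths, the port, and the numbers are assumptions made up for the example, not values from the question. A second ReplicaSet would use the same template with, say, a larger periodSeconds or failureThreshold, so that its pods are restarted later than those of the first.

```yaml
# Hypothetical fragment of a container spec; paths, port and values are illustrative only.
livenessProbe:
  httpGet:
    path: /healthz        # assumed endpoint that fails once a restart is needed
    port: 8080
  initialDelaySeconds: 15 # wait before the first probe
  periodSeconds: 10       # probe every 10 seconds
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 3     # restart after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /ready          # assumed endpoint that keeps reporting OK
    port: 8080
  periodSeconds: 5
```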
When a pod should be restarted, do not return failure from the livenessProbe immediately, but some time later (i.e. with a delay). If you use a different delay for each pod, you get a rolling restart. Even a random delay would reduce the probability of an outage.
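The delayed-failure idea could be sketched roughly as follows. This is only a minimal sketch: the class name DelayedLiveness, the method names, and the 300-second default are hypothetical and chosen for illustration, and the clock and random source are injectable only so the behaviour can be checked deterministically.

```python
import random
import time


class DelayedLiveness:
    """Reports 'not alive' only after a per-pod random delay has elapsed."""

    def __init__(self, max_delay_seconds=300.0, clock=time.monotonic, rng=random.random):
        # max_delay_seconds is an assumed upper bound for the random delay.
        self._max_delay = max_delay_seconds
        self._clock = clock
        self._rng = rng
        self._fail_at = None  # None means no restart has been requested

    def request_restart(self):
        """Mark this pod as needing a restart; repeated calls keep the first deadline."""
        if self._fail_at is None:
            self._fail_at = self._clock() + self._rng() * self._max_delay

    def is_alive(self):
        """Value the liveness endpoint should report right now."""
        return self._fail_at is None or self._clock() < self._fail_at
```

The handler behind the livenessProbe endpoint would return a success status while is_alive() is true and an error status afterwards. Since each pod draws its own delay, the pods are unlikely to start failing in the same probe period, although nothing here guarantees that they will not.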