I am running an Akka cluster on k8s with a downing strategy (let's say auto-downing), so when a node becomes unreachable, the container that became unreachable exits. The problem is that the node became unreachable because of a network issue or an issue with the platform provided by k8s, so the entire pod should be restarted and scheduled onto a new, healthy k8s node. Because scheduling can take some time, we only want to reschedule the container onto a new pod on a new node if unreachability is the cause of the failure. Is there any way to propagate failure messages to the parent in k8s, such as using an exit code, to decide when to restart the container and when to delete the pod?
Because scheduling can take some time we only want to reschedule the container onto a new pod on a new node if unreachability is the cause of the failure.
Kubernetes manages all scheduling and health checks for you.
Is there any way to propagate failure messages to the parent in Kubernetes like use an exit code
Kubernetes records Events for some state changes, or you can watch the API for changes to Pods.
to make the decision of when to restart the container and when to delete the pod.
Kubernetes manages restart, scheduling and eviction of Pods.
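If you do watch the API for Pod changes, the exit code of a terminated container is exposed in the Pod's `status.containerStatuses` (under `state.terminated.exitCode` or `lastState.terminated.exitCode`). Below is a minimal sketch of how a watcher could use it, assuming the Akka container is changed to exit with a dedicated code when it downs itself for unreachability. The value `42`, the function names, and the returned action strings are all hypothetical; the function only inspects a Pod object already decoded to a dict and does not call the API itself.

```python
from typing import Optional

# Hypothetical exit code the Akka container would use when it downs itself
# because it became unreachable; any other code is treated as an ordinary failure.
UNREACHABLE_EXIT_CODE = 42


def last_exit_code(container_status: dict) -> Optional[int]:
    """Return the exit code of the most recent termination, if there is one."""
    # Check the current state first, then the previous one (set after a restart).
    for key in ("state", "lastState"):
        terminated = (container_status.get(key) or {}).get("terminated")
        if terminated is not None:
            return terminated.get("exitCode")
    return None


def decide(pod: dict) -> str:
    """Return "delete_pod" if a container exited with the unreachable code,
    otherwise "leave_to_kubelet" (the normal in-place container restart)."""
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    for container_status in statuses:
        if last_exit_code(container_status) == UNREACHABLE_EXIT_CODE:
            return "delete_pod"
    return "leave_to_kubelet"
```

A watcher that receives `"delete_pod"` would then delete the Pod so that its controller (for example a Deployment or StatefulSet) creates a replacement, which goes through scheduling again. Note that the replacement may or may not land on a different node unless something else, such as a taint or cordon on the unhealthy node, prevents it.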