My Service dispatches to multiple replicas of a Deployment. Normally the load is balanced across them round-robin style (the K8s default).
But what happens if one of the backend instances is temporarily offline, i.e. it closes its port (80 in this case) for some time while the pod is still running? Will the Service automatically skip it for new requests and include it again once it is listening on its port again? Or will requests continue to go to this pod and fail?
I was unable to find the answer in the docs.
I think you're looking for readiness probes. If you configure a readiness probe for your pod, Kubernetes will periodically check the pod to figure out whether it's able to serve traffic. You can configure how it determines readiness (open a TCP connection to a port, run a command, make an HTTP request) as well as how often it checks. If the readiness probe fails, the pod is marked as Not Ready and removed from the Service's endpoints, so traffic won't be routed to it until it's Ready again. Without a readiness probe, a pod whose container is still running stays in rotation, and requests sent to it while the port is closed will fail.
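
For the port 80 case in the question, a TCP readiness probe could look roughly like the sketch below. The Deployment name, labels, image, and timing values are placeholders I made up for illustration, so adjust them to your setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-backend              # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-backend           # hypothetical label
  template:
    metadata:
      labels:
        app: my-backend
    spec:
      containers:
        - name: web
          image: example/my-backend:1.0   # placeholder image
          ports:
            - containerPort: 80
          readinessProbe:
            tcpSocket:
              port: 80          # pod counts as ready only while this port accepts connections
            periodSeconds: 5    # probe every 5 seconds (assumed value)
            failureThreshold: 3 # mark Not Ready after 3 consecutive failures (assumed value)
```

With these example values it can take up to roughly 15 seconds after the port closes before the pod is taken out of rotation, so some requests may still fail in that window. A shorter `periodSeconds` or a lower `failureThreshold` narrows it, at the cost of more frequent probing and more sensitivity to brief hiccups.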