I've got an application with 10 pods, and traffic is load balanced across all of them. There was an issue that caused transactions to queue up, and a few pods could not recover properly or took a long time to process the queue once the issue was fixed. The new traffic was still too much for some of the pods.
I'm wondering if I can block new traffic to particular pod(s) in a ReplicaSet, let them process the queue, and then let new traffic come in again once the queue is processed?
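You can use a probe to handle this scenario.
A readiness probe is one way to do it.
A probe is a check that the kubelet runs against the container at a configured interval to see whether the process inside is up and able to serve. While a readiness probe is failing, the Pod is removed from the Service's endpoints, so it stops receiving new traffic.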
Example:
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
You can also create an endpoint in the application that K8s checks automatically. If the endpoint returns a success status (any code from 200 to 399), K8s marks the Pod as Ready to handle traffic; otherwise it marks the Pod as not Ready and stops sending traffic to it.
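An endpoint-based check like that could be configured roughly as in the sketch below. The path `/ready` and port `8080` are assumptions made up for illustration, not something from your application, and the timing values are only a starting point.

```yaml
readinessProbe:
  httpGet:
    path: /ready        # hypothetical endpoint exposed by the application
    port: 8080          # hypothetical container port
  periodSeconds: 5      # run the check every 5 seconds
  failureThreshold: 3   # mark the Pod not Ready after 3 consecutive failures
  successThreshold: 1   # mark it Ready again after 1 success
```

With something like this in place, the application could return a failure status (for example 503) from the endpoint while its queue backlog is too large, and go back to 200 once the queue has drained.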
Note:
Readiness and liveness probes can be used in parallel for the same container. Using both can ensure that traffic does not reach a container that is not ready for it, and that containers are restarted when they fail.
A failing readiness probe won't restart your Pod, while a failing liveness probe (for an HTTP check, any response outside the 200 to 399 range, such as a 400) will cause the kubelet to restart the container.
In your scenario, it's better to use the readiness probe, so the process keeps running and is not restarted while it works through the queue. Once the application is ready to handle traffic again, K8s will get a 200 response from the endpoint and route traffic back to the Pod.
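As a rough sketch of using both probes on the same container, the fragment below is one possible layout. The container name, image, paths, and port are all hypothetical, and the thresholds are assumptions you would tune for your own workload.

```yaml
containers:
- name: app                   # hypothetical container name
  image: example/app:1.0      # hypothetical image
  readinessProbe:             # failing: Pod stops receiving Service traffic
    httpGet:
      path: /ready            # hypothetical readiness endpoint
      port: 8080              # hypothetical port
    periodSeconds: 5
    failureThreshold: 3
  livenessProbe:              # failing: kubelet restarts the container
    httpGet:
      path: /healthz          # hypothetical liveness endpoint
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 10
    failureThreshold: 6       # kept more lenient so a busy Pod is not restarted
```

Keeping the liveness check separate from the readiness check, and more tolerant, means a Pod that is merely busy draining its queue is taken out of rotation rather than restarted.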