On k8s, when I send requests to pods and all pods are in use (not ready), the requests time out immediately. I want the request to be held until a pod is ready and then sent to that pod.
Is there some kind of timeout duration setting for load balancing? I couldn't find any relevant documentation on this, so am I fundamentally misunderstanding something?
PS: I use a readiness probe. The case I mean is the one where the readiness probes of all pods return false, so all pods are in use.
Try the kubectl wait command:
$ kubectl wait ([-f FILENAME] | resource.group/resource.name | resource.group [(-l label | --all)]) [--for=delete|--for condition=available]
$ kubectl wait --for=condition=Ready pod/<pod-name>
$ kubectl wait --for=condition=Ready pod -l app=<label> --timeout=60s # select pods by label instead of by name
$ kubectl create -f example.json -o name | xargs kubectl wait --for=condition=Ready # pass the created resource name via stdin
$ kubectl wait --for=jsonpath='{.status.phase}'=Running pod/<pod-name> # jsonpath form, newer kubectl versions only
You could also define custom conditions in plugins or config files.
Take a look: kubernetes-conditions-api.
Please use readiness and liveness probes. Kubernetes relies on the readiness probe to decide whether a pod should receive traffic.
Once the probes succeed, Kubernetes marks the pod as ready and routes requests to it; pods that are ready and live are able to process the request.
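Since a Service only routes to pods that are ready and does not queue requests while none are, one way to approximate "holding" a request is to retry on the client side. Below is a minimal sketch of that idea, not a built-in Kubernetes feature: retry_until_ready is a hypothetical helper name, and the attempt count and delay are arbitrary values you would tune.

```shell
#!/bin/sh
# retry_until_ready MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of kubectl): runs COMMAND repeatedly until it
# succeeds or MAX_ATTEMPTS is reached, sleeping DELAY_SECONDS between tries.
retry_until_ready() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Success: stop retrying and report success to the caller.
    "$@" && return 0
    # Out of attempts: give up with a non-zero status.
    [ "$attempt" -ge "$max_attempts" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, assuming a hypothetical Service URL, the following retries for up to about a minute until some pod answers:
$ retry_until_ready 30 2 curl --fail --silent http://my-service.default.svc.cluster.local/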