For one of my projects, I am deploying an application on Kubernetes.
In short, the app (on my dev machine) is deployed on 8 pods (one per CPU core) inside a single node. Each pod can process one request at a time (I enforce the limit with a thread pool; CPU is the bottleneck here).
I would like to force the Kubernetes load balancer to send an incoming request to the next pod when a pod is "busy".
Does anyone know how to achieve this?
You can run kube-proxy in IPVS mode, which lets Services be load-balanced using one of several scheduling algorithms. You can find more in the documentation here, and this blog. From the blog:
--proxy-mode=ipvs
Then add the --ipvs-scheduler flag to choose one of these options:
rr: round-robin
lc: least connection
dh: destination hashing
sh: source hashing
sed: shortest expected delay
nq: never queue
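For the one-request-per-pod case in the question, the practical difference between rr and lc can be sketched with a toy model. This is only an illustration under simplifying assumptions, not the IPVS kernel code: the pod names, the `simulate` helper, and the idea that a busy pod holds exactly one open connection are all hypothetical.

```python
from itertools import cycle


def round_robin(backends):
    """Return a picker that cycles through backends regardless of load."""
    ring = cycle(backends)

    def pick(active):
        return next(ring)

    return pick


def least_connection(backends):
    """Return a picker that chooses the backend with the fewest open connections."""

    def pick(active):
        # min() keeps the first backend in list order when counts are tied.
        return min(backends, key=lambda b: active[b])

    return pick


def simulate(make_picker, backends, busy):
    """Dispatch one request per idle pod and return the chosen targets.

    busy: set of pods assumed to already hold one long-running connection.
    """
    active = {b: (1 if b in busy else 0) for b in backends}
    pick = make_picker(backends)
    chosen = []
    for _ in range(len(backends) - len(busy)):
        target = pick(active)
        active[target] += 1  # the new request stays open on that pod
        chosen.append(target)
    return chosen


pods = ["pod-0", "pod-1", "pod-2", "pod-3"]
busy = {"pod-0"}
print("rr:", simulate(round_robin, pods, busy))       # first request hits busy pod-0
print("lc:", simulate(least_connection, pods, busy))  # busy pod-0 is skipped
```

In this model round-robin sends the first request to the pod that is already working, while least-connection fills the idle pods first. Note that lc counts connections rather than requests, so it only approximates "busy" if each request uses its own connection.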
As a prerequisite, the nodes need the following kernel modules loaded:
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4 (nf_conntrack on Linux kernel 4.19 and later)
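A quick way to check a node against that list is to compare it with the contents of /proc/modules. The helper below is a minimal sketch: `missing_modules` is a hypothetical name, the required list is copied from the answer as written (adjust the conntrack name for newer kernels), and modules compiled directly into the kernel do not appear in /proc/modules, so a "missing" result is a hint rather than proof.

```python
# Module names taken from the list above; adjust for your kernel version.
REQUIRED = ["ip_vs", "ip_vs_rr", "ip_vs_wrr", "ip_vs_sh", "nf_conntrack_ipv4"]


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required modules not listed in the given /proc/modules text."""
    # Each non-empty line of /proc/modules starts with the module name.
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


# Sample input in the /proc/modules format, for illustration only.
sample = "ip_vs 151552 2 ip_vs_rr, Live 0x0\nip_vs_rr 16384 1 - Live 0x0\n"
print(missing_modules(sample))
```

On a real node you would pass in the text read from /proc/modules, and load anything reported as missing with modprobe.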
If you want more sophisticated routing than this, I would look into a service mesh.