I have one master service and multiple slave services. The master service continuously polls a topic using a subscriber from Google Pub/Sub. The slave services are REST APIs. Once the master service receives a message, it delegates the message to a slave service. Currently I'm using a ClusterIP service in Kubernetes. Some of my requests are long running and some are pretty short.
I have observed that sometimes, if a short running request arrives while a long running request is in progress, it has to wait until the long running request finishes, even though many pods are available and not serving any traffic. I think this is due to round robin load balancing. I have been trying to find a solution and looked into approaches like setting up an external HTTP load balancer with ingress, and an internal HTTP load balancer. But I'm really confused about the difference between these two and which one applies to my use case. Can you suggest which of these approaches would solve my use case?
Weighted routing is a bit beyond what ClusterIP can do. As you said yourself, it's time for a new player to enter the game: an ingress controller. This is a k8s abstraction for a load balancer - a powerful server sitting in front of your app and routing the traffic between the ClusterIPs.

Install an ingress controller on your GCP cluster. Once you have it installed and running, use its canary feature to perform weighted routing. Say you want 20% of the traffic to go to service x and the remaining 80% to service y: create 2 ingress files, one for each of the 2 targets, with the same host name; the only difference is that one of them carries the following annotations:

nginx.ingress.kubernetes.io/canary: "true"        # tell the controller not to create a new vhost
nginx.ingress.kubernetes.io/canary-weight: "20"   # route 20% of the traffic from the existing vhost here

A complete canary ingress looks like this:
```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: http-svc
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "20"
spec:
  rules:
  - host: echo.com
    http:
      paths:
      - backend:
          serviceName: http-svc
          servicePort: 80
```
the full guide is in the nginx ingress controller's documentation on canary annotations.
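For completeness, here is a sketch of the second ingress - the one without the canary annotations - that receives the remaining 80% of the traffic. The names `echo-main` and `main-svc` are placeholders I've made up for illustration, not names from your setup:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echo-main
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: echo.com            # same host as the canary ingress
    http:
      paths:
      - backend:
          serviceName: main-svc   # hypothetical: the service getting the other 80%
          servicePort: 80
```

Both ingresses declare the same host; the controller merges them and splits traffic according to the canary weight.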
(this is the relevant definition from the Google Cloud docs, but the concept is similar among other cloud providers)
GCP's load balancers can be divided into external and internal load balancers. External load balancers distribute traffic coming from the internet to your GCP network. Internal load balancers distribute traffic within your GCP network.
https://cloud.google.com/load-balancing/docs/load-balancing-overview
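To make the distinction concrete in Kubernetes terms: on GKE you can provision an internal (VPC-only) load balancer simply by annotating a `Service` of type `LoadBalancer`. This is a minimal sketch under assumed names - `slave-svc`, the `app: slave` label, and the ports are placeholders, not from your setup:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: slave-svc             # hypothetical name
  annotations:
    cloud.google.com/load-balancer-type: "Internal"   # GKE: internal LB, reachable only inside the VPC
spec:
  type: LoadBalancer
  selector:
    app: slave                # hypothetical pod label
  ports:
  - port: 80
    targetPort: 8080
```

Without that annotation, a `LoadBalancer` service on GKE gets an external load balancer with a public IP instead. Since your slaves are only called by the master inside the cluster, an internal load balancer (or the ingress approach above) is the direction that fits your case.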