I am currently running a Solr cluster on Kubernetes as a StatefulSet. The cluster has 39 pods, with one pod per physical node. It has a single collection divided into 3 shards; each shard has 13 nodes (pods), of which 3 are TLOG replicas and 10 are PULL replicas.
The problem I want to discuss: I want to autoscale my Solr cluster. Based on some condition, I want to scale my PULL replica nodes (pods) down to a minimum to reduce unnecessary resource consumption. I know I can use the HPA in Kubernetes to autoscale, but when scaling down I don't want to stop my TLOG nodes (pods). Similarly, when scaling up I want to add only PULL replicas to the cluster.
Can anyone please help me with this problem?
You can use a separate Deployment for each pod type, e.g. one Deployment for TLOG pods and another for PULL pods. Then you can define a fixed number of replicas for the TLOG pods and an HPA for the PULL pods. This lets you add or remove PULL pods only, without any impact on the TLOG pods.
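
As a rough sketch of the HPA half of this setup, the Python snippet below builds an `autoscaling/v2` HorizontalPodAutoscaler manifest that targets only the PULL workload and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The names `solr-pull` and `solr-pull-hpa`, the CPU metric, the 70% target, and the minimum of 3 pods are assumptions for illustration, not something from your setup; the maximum of 30 is just 3 shards x 10 PULL replicas from the question. The TLOG Deployment would simply keep a fixed `spec.replicas` (9 in your case) and have no HPA pointing at it.

```python
import json


def pull_hpa_manifest(min_replicas, max_replicas, cpu_target_percent):
    """Build an autoscaling/v2 HPA that scales only the PULL workload.

    "solr-pull" and "solr-pull-hpa" are hypothetical names; replace them
    with whatever your PULL Deployment is actually called.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": "solr-pull-hpa"},
        "spec": {
            # The HPA references the PULL Deployment only, so the TLOG
            # Deployment is never scaled by it.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": "solr-pull",
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Assumed scaling condition: average CPU utilization.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_target_percent,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # 30 = 3 shards x 10 PULL replicas (from the question); 3 and 70 are assumed.
    manifest = pull_hpa_manifest(min_replicas=3, max_replicas=30, cpu_target_percent=70)
    print(json.dumps(manifest, indent=2))
```

If you save this as, say, `gen_hpa.py`, you could pipe the output into `kubectl apply -f -`. Keep in mind that the HPA only adds or removes pods; the replica type itself is decided on the Solr side, so new pods would still need PULL replicas placed on them (for example through the Collections API `ADDREPLICA` action with `type=pull`).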