My requirement is to scale up pods based on custom metrics: when the number of pending messages in a queue increases, the number of pods has to increase to process the jobs. In Kubernetes, scale-up is working fine with the Prometheus adapter and the Prometheus Operator.
I have long-running processes in the pods, but the HPA checks the custom metrics and tries to scale down. Because of this, processes are killed in the middle of an operation and the message being processed is lost. How can I make the HPA kill only idle pods, where no process is running?
At present the HPA cannot be configured to accommodate workloads of this nature. The HPA simply sets the replica count on the deployment to a desired value according to the scaling algorithm, and the deployment chooses one or more pods to terminate.
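To make the scale-down behaviour concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The function name, parameter names, and default bounds are my own for illustration, and the sketch ignores the tolerance band, stabilization windows, and pod readiness handling that the real controller applies. The point is that the result is only a number; nothing in it says which pods are busy.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA calculation (hypothetical helper, not a Kubernetes API).

    current_metric is the observed per-pod average (e.g. pending messages
    per pod) and target_metric is the value configured in the HPA spec.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as the HPA does with minReplicas/maxReplicas.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Queue is backing up: 4 pods, 10 pending per pod, target of 5 per pod.
    print(desired_replicas(4, 10, 5))
    # Queue has drained: 8 pods, 1 pending per pod, target of 5 per pod.
    print(desired_replicas(8, 1, 5))
```

In the second call the count drops from 8 to 2 even if all 8 pods are still in the middle of a long job, which is exactly the problem described in the question.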
There is a lot of discussion on this topic in this Kubernetes issue that may be of interest to you. It is not solved by the HPA, and may never be. There may need to be a different kind of autoscaler for this type of workload. Some suggestions are given in the link that may help you in defining one of these.
If I were to take this on myself, I would create a new controller, with a corresponding CRD containing a job definition and the scaling requirements. Instead of scaling deployments, I would have it launch jobs. I would have the jobs do their work (process the queue) until they became idle (no items in the queue) and then exit. The controller would only scale up, by adding jobs, never down. The jobs themselves would scale down by exiting when the queue is empty.
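The scale-up-only decision such a controller would make on each reconcile can be sketched roughly as follows. Everything here is hypothetical: the function name, the `messages_per_job` ratio, and the `max_jobs` cap stand in for whatever your CRD would define, and reading the metric and creating the Job objects are left out entirely.

```python
import math


def jobs_to_launch(pending_messages: int,
                   active_jobs: int,
                   messages_per_job: int = 10,
                   max_jobs: int = 20) -> int:
    """Return how many new jobs to start; never a negative number.

    The controller only ever adds jobs. Running jobs are left alone and
    remove themselves by exiting once the queue is empty.
    """
    if messages_per_job <= 0:
        raise ValueError("messages_per_job must be positive")
    wanted = math.ceil(pending_messages / messages_per_job)
    wanted = min(wanted, max_jobs)  # respect the configured ceiling
    return max(0, wanted - active_jobs)  # scale up only, never down


if __name__ == "__main__":
    print(jobs_to_launch(pending_messages=35, active_jobs=1))
    print(jobs_to_launch(pending_messages=0, active_jobs=5))
```

Because the result is clamped at zero, an empty queue with five jobs still running produces no action, so no in-flight work is interrupted.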
This would require that your jobs be able to detect when they become idle, by checking the queue and exiting if there is nothing there. If your queue read blocks forever, this would not work and you would need a different solution.
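As a rough illustration of a worker that detects idleness, the sketch below uses a read with a timeout instead of a read that blocks forever. Python's in-process `queue.Queue` is only a stand-in for your real message broker, and `run_worker`, `handle`, and `idle_timeout` are names I made up; the same pattern applies to any client library that supports a receive timeout.

```python
import queue
from typing import Any, Callable


def run_worker(work_queue: "queue.Queue[Any]",
               handle: Callable[[Any], None],
               idle_timeout: float = 5.0) -> int:
    """Process items until none arrives within idle_timeout seconds.

    Returns the number of items processed, after which the job's process
    can exit normally and the pod completes.
    """
    processed = 0
    while True:
        try:
            # Bounded wait: raises queue.Empty if nothing arrives in time.
            item = work_queue.get(timeout=idle_timeout)
        except queue.Empty:
            return processed  # idle, so stop instead of waiting forever
        handle(item)
        work_queue.task_done()
        processed += 1


if __name__ == "__main__":
    q: "queue.Queue[str]" = queue.Queue()
    for msg in ("a", "b", "c"):
        q.put(msg)
    count = run_worker(q, lambda m: print("processing", m), idle_timeout=0.1)
    print("exiting after", count, "messages")
```

The worker finishes the message it is holding before it ever checks for idleness, so it never exits in the middle of an operation.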
The kubebuilder project has an excellent example of a job controller. I would start with that and extend it with the ability to check your published metrics and start the jobs accordingly.
Also see Fine Parallel Processing Using a Work Queue in the Kubernetes documentation.