I am trying to horizontally autoscale a workload not only by custom metrics but also by an algorithm that differs from the algorithm described here.
1/ Is that possible?
2/ If it is not, and assuming I don't mind creating a container that does the autoscaling for me instead of HPA, what API should I call to do the equivalent of kubectl scale deployments/<name> --replicas=<newDesired>?
Here's the use case:
1/ The workload consumes a single message from a queue, handles it, removes the handled item when done, and then consumes the next message.
2/ When there are more than 0 messages ready, I'd like to scale up to the number of messages ready (or to the max scale if that number is larger). When there are 0 messages ready and 0 being processed, I'd like to scale down to 0.
Getting the "messages ready" and "messages being processed" counts to the metrics server is not an issue.
Getting HPA to scale by "messages ready" is not an issue either.
But...
The HPA algorithm scales gradually: when I place 10 items in the queue, it first scales to 4, then to 8, then to 10.
It also scales down gradually, and when it scales down it can terminate a pod that was still processing a message, which increases the "ready" count and causes a scale-up.
Here is the Node.js code that I would have run (instead of HPA) had I known which API to call:
let desiredToSet = 0;
if (!readyMessages && !processingMessages) {
  // If we have nothing in the queue and all workers completed their work, we can scale down to the minimum.
  // We like this better than reducing slowly, as this way we are not risking killing a worker that's working.
  desiredToSet = config.minDesired;
} else {
  // Messages are ready in the queue: increase the number of workers up to the max allowed.
  desiredToSet = Math.max(Math.min(readyMessages + processingMessages, config.maxDesired), currentDeploymentReplicas);
}
// No point in sending a request to change if nothing changed.
if (desiredToSet !== currentDeploymentReplicas) {
  <api to set desiredToSet of deployment to come here>;
}
1) I don't think it's possible. The HPA controller is built into Kubernetes, and I don't think its algorithm can be extended or replaced.
2) Yes, you can create a custom controller that does the job of the HPA with your own algorithm. To scale the Deployment up and down through the Kubernetes API, you manipulate the Scale subresource of the Deployment.
Concretely, to scale the Deployment to a new number of replicas, you make the following request:
PUT /apis/apps/v1/namespaces/{namespace}/deployments/{name}/scale
with a Scale resource (containing the desired replica count) as the request body, as described in the API reference.
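As a rough illustration, here is a minimal Node.js sketch of how such a controller could issue that request from inside a pod. It is not a definitive implementation. The function names (computeDesiredReplicas, buildScaleRequest, scaleDeployment) are made up for this example. It assumes the pod runs with a service account that is allowed to update deployments/scale, and it relies on the standard in-cluster token and CA files plus the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables.

```javascript
// Sketch only: function names are hypothetical, not part of any library.
const fs = require('fs');
const https = require('https');

// Standard mount point of the pod's service account credentials.
const SA_DIR = '/var/run/secrets/kubernetes.io/serviceaccount';

// Same decision rule as in the question, written as a pure function.
function computeDesiredReplicas(ready, processing, current, min, max) {
  if (!ready && !processing) {
    return min; // queue empty and no worker busy: safe to drop to the minimum
  }
  // Never go below the current count while work is in flight.
  return Math.max(Math.min(ready + processing, max), current);
}

// Builds the PUT request for the Deployment's Scale subresource.
function buildScaleRequest(namespace, name, replicas) {
  return {
    method: 'PUT',
    path: `/apis/apps/v1/namespaces/${namespace}/deployments/${name}/scale`,
    body: {
      apiVersion: 'autoscaling/v1',
      kind: 'Scale',
      metadata: { name, namespace },
      spec: { replicas },
    },
  };
}

// Sends the request to the in-cluster API server; resolves with the parsed response.
function scaleDeployment(namespace, name, replicas) {
  const { method, path, body } = buildScaleRequest(namespace, name, replicas);
  const payload = JSON.stringify(body);
  const token = fs.readFileSync(`${SA_DIR}/token`, 'utf8').trim();
  const options = {
    host: process.env.KUBERNETES_SERVICE_HOST,
    port: process.env.KUBERNETES_SERVICE_PORT,
    method,
    path,
    ca: fs.readFileSync(`${SA_DIR}/ca.crt`),
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(payload),
    },
  };
  return new Promise((resolve, reject) => {
    const req = https.request(options, (res) => {
      let data = '';
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => {
        if (res.statusCode >= 200 && res.statusCode < 300) {
          resolve(JSON.parse(data));
        } else {
          reject(new Error(`scale failed: HTTP ${res.statusCode}: ${data}`));
        }
      });
    });
    req.on('error', reject);
    req.end(payload);
  });
}
```

In the question's code, the placeholder line would then become something like scaleDeployment('default', 'my-worker', desiredToSet), where the namespace and Deployment name are placeholders for your own values. Keeping the decision rule and the request builder as pure functions makes them easy to check without a cluster.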