Is it possible to have the HPA scale based on the number of available running pods?
I have set up a readiness probe that cuts out a pod based it's internal state (idle, working, busy). When a pod is 'busy', it no longer receives new requests. But the cpu, and memory demands are low.
I don't want to scale based on cpu, mem, or other metrics.
Seeing as the readiness probe removes it from active service, can I scale based on the average number of active (not busy) pods? When that number drops below a certain point more pods are scaled.
TIA for any suggestions.
You can create custom metrics, a number of busy-pods
for HPA. That is, the application should emit a metric value when it is busy. And use that metric to create HorizontalPodAutoscaler.
Something like this:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: custom-metric-sd
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: custom-metric-sd
minReplicas: 1
maxReplicas: 20
metrics:
- type: Pods
pods:
metricName: busy-pods
targetAverageValue: 4
Here is another reference for HPA with custom metrics.