I am using a Horizontal Pod Autoscaler in Kubernetes, as shown below, to scale a service between 4 and 40 replicas. Unfortunately, because of the upscale delay, it would take roughly an hour to scale from 4 to 40 replicas. Is there a way to specify something like a min/max surge for upscaling, so that each step adds at least 2 or 4 replicas?
My API object (Helm):
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: {{ required "A valid service.name entry required!" .Values.service.name }}
  labels:
    app: {{ .Values.service.name }}
    version: {{ .Values.image.tag | quote }}
    chart: {{ template "nodejs.chart" . }}
    release: "{{ .Release.Name }}-{{ .Values.image.tag }}"
    heritage: {{ .Release.Service }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: {{ required "A valid service.name entry required!" .Values.service.name }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageValue: {{ required "A valid autoscaling.cpuTargetValue entry is required" .Values.autoscaling.cpuTargetValue }}
    - type: Resource
      resource:
        name: memory
        targetAverageValue: {{ required "A valid autoscaling.memoryTargetValue entry is required" .Values.autoscaling.memoryTargetValue }}
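For context, a values file feeding this template might look roughly like the sketch below. Only the 4 and 40 replica bounds come from the question; the service name, image tag, and the CPU and memory targets are hypothetical placeholders chosen for illustration.

```yaml
# Hypothetical values.yaml for the template above.
service:
  name: my-service          # assumed name, not from the question
image:
  tag: "1.0.0"              # assumed tag, not from the question
autoscaling:
  minReplicas: 4            # lower bound stated in the question
  maxReplicas: 40           # upper bound stated in the question
  cpuTargetValue: 500m      # assumed average CPU per pod (a resource quantity)
  memoryTargetValue: 512Mi  # assumed average memory per pod (a resource quantity)
```

Because the template uses targetAverageValue, both targets are absolute per-pod quantities rather than utilization percentages.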
Not really. It sounds like you are concerned about thrashing, but there is no way to define an upscale step size combined with a cool-off period.
Up to and including Kubernetes 1.11, you can set the --horizontal-pod-autoscaler-upscale-delay flag on the kube-controller-manager; it defaults to 3 minutes. This may not be enough, so I created this issue.
Starting with Kubernetes 1.12, that option has been removed in favor of a better scaling algorithm.
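For clusters still on Kubernetes 1.11 or earlier, the upscale delay flag mentioned above could be set roughly as in this sketch. It assumes the controller manager runs as a static Pod (as kubeadm sets it up); the 1m0s value is only an example, and the fragment omits the image and all other flags, which should stay as they are in your cluster.

```yaml
# Fragment of a kube-controller-manager static Pod manifest (Kubernetes <= 1.11 only).
# Assumption: the control plane runs the controller manager as a static Pod.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
    - name: kube-controller-manager
      command:
        - kube-controller-manager
        # Example value: shorten the upscale delay from the 3m0s default.
        - --horizontal-pod-autoscaler-upscale-delay=1m0s
        # ...keep every other existing flag unchanged
```

A shorter delay makes the autoscaler react faster but also makes thrashing more likely, and the flag no longer exists from 1.12 onward, so this is only relevant to older clusters.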