I have several Java projects running in Docker containers managed with Kubernetes. I want to enable the Horizontal Pod Autoscaling(HPA) based on CPU provided by Kubernetes, but I find it hard to deal with the initial CPU spikes caused by the JVM when initialising the container.
I currently have not set a cpu limit in the Kubernetes yaml files for any of the projects which basically means that I let the pods take as much CPU from the environment as they can (I know its a bad practice, but it lets me boot JVM pods in less than 30 seconds).
The problem this creates is that during the pod creation in the first 3-4 minutes the CPU usage will spike so much that If I have an autoscale rule set it will trigger it. Autoscaled pod will spin up and cause the same spike and re-trigger the autoscale until the maximum amount of pods are reached and things settle down.
I tried setting a cpu limit in the kubernetes yaml file but the amount if cpu that my projects need is not that big so by setting this to an non-overkill amount makes my pods spin up in more than 5min which is unacceptable.
I could also increase the autoscale delay to more than 10 minutes but its a global rule that will also affect deployments which I need to scale very fast, so that is also not a viable option for me.
This is an example cpu and memory configuration for one of my pods
env:
resources:
requests:
memory: "1300Mi"
cpu: "250m"
limits:
memory: "1536Mi"
I also migrated to Java 10 recently which is supposed to be optimised for containerisation. Any advice or comment will be much appreciated. Thanks in advance.
Edit:
I could also set up hpa based on custom prometheus metrics like http_requests, but that option will be harder to maintain since there lots of variables that can affect the amount of requests the pod can handle.
Depends on your K8 version.
< 1.12
:
In this version you have, as you are explaining, only the --horizontal-pod-autoscaler-upscale-delay
flag for the Kube-Controller or the custom metrics in HPAv2. https://v1-11.docs.kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
=>1.12
:
Here we have gotten a new HPA algorithm, which discards unReady
pods in its calculation leading to fewer auto correcting.
https://github.com/kubernetes/kubernetes/pull/68068
Change CPU sample sanitization in HPA. Ignore samples if:
- Pod is beeing initalized - 5 minutes from start defined by flag
- pod is unready
- pod is ready but full window of metric hasn't been colected since transition
- Pod is initialized - 5 minutes from start defined by flag:
- Pod has never been ready after initial readiness period.
This should help you here.