Issue:
When my pods consume all the resources on the available EKS nodes, a new worker/managed node is created. However, that takes up to 2 minutes, and my pods sit in Pending the whole time. During a scaling event I need pods schedulable as quickly as possible.
My main idea (though I'm not confident it would work):
I'm trying to find a way to manually set the Quality of Service class. The plan: create a ReplicaSet or Deployment with the same resource requests as my scaling application, but set the QoS of these "dummy pods" to BestEffort, while the scaling application itself runs as Guaranteed. Once node resources are exhausted, Kubernetes should in theory evict the dummy pods, letting my application scale immediately. The evicted dummy pods then go Pending, which triggers a cluster scale-up, so by the time my application needs to scale again a new node has already been added and is available.
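(One variation I've been toying with, completely untested: skip QoS entirely and give the dummy pods a negative-priority PriorityClass, so the scheduler preempts them whenever my real pods, which default to priority 0, need the room. All the names below are placeholders I made up:)

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: headroom-low-priority   # placeholder name
value: -10                      # below the default 0, so normal pods can preempt these
globalDefault: false
description: "Dummy headroom pods that real workloads may preempt"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: headroom
spec:
  replicas: 3
  selector:
    matchLabels:
      app: headroom
  template:
    metadata:
      labels:
        app: headroom
    spec:
      priorityClassName: headroom-low-priority
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9   # does nothing, just holds the reservation
        resources:
          requests:
            memory: "3500Mi"   # mirror the real app's requests
            cpu: "1500m"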
So far, Kubernetes rejects the QoS field on a ReplicaSet. It will accept it on a single pod, but I would like to have 3 of these "dummy pods" and want them to respawn on new nodes, so they continually stay ahead of the scaling.
Is this the best way to do this? Any other ideas? Would a Deployment or StatefulSet allow a QoS class to be set manually?
Example of the YAML I tried that Kubernetes didn't like:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx
  labels:
    tier: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          limits:
            memory: "3500Mi"
            cpu: "1500m"
          requests:
            memory: "3500Mi"
            cpu: "1500m"
status:
  qosClass: BestEffort   # <- this is the part Kubernetes rejects
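If I'm reading the docs right, qosClass is something Kubernetes derives from the requests/limits rather than a field you can set yourself, which would explain the rejection. The only spec I can see that would actually come out as BestEffort is one with no resources block at all, which of course reserves nothing:

apiVersion: v1
kind: Pod
metadata:
  name: besteffort-test   # throwaway name
spec:
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
    # no resources block -> Kubernetes assigns qosClass: BestEffort on its own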