We plan to use AWS EKS to run a stateless application.
The goal is to keep costs down by preferring spot instances over on-demand ones.
Per AWS recommendations, we plan to have two Managed Node Groups, one with on-demand instances and one with spot instances, plus Cluster Autoscaler to adjust the groups' sizes.
Now, the problem to solve is achieving two somewhat conflicting requirements:
After some research I found the following possible approaches to solving it:
Approach A: Using preferredDuringSchedulingIgnoredDuringExecution with weights based on the Node Group capacity type label. For example, one preferredDuringSchedulingIgnoredDuringExecution rule with weight 90 would prefer nodes with capacity type spot, and another rule with weight 1 would prefer on-demand ones:
# Note: EKS managed node groups set this label to SPOT or ON_DEMAND (uppercase),
# and label values are case-sensitive.
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 90
  preference:
    matchExpressions:
    - key: eks.amazonaws.com/capacityType
      operator: In
      values:
      - SPOT
- weight: 1
  preference:
    matchExpressions:
    - key: eks.amazonaws.com/capacityType
      operator: NotIn
      values:
      - SPOT
The downside, as I understand it, is that pods are not guaranteed to run on the less preferred group, as those are just weights added to a node's score, not some sort of exact distribution.
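For reference, here is a minimal sketch of where that fragment sits in a full manifest, namely under affinity.nodeAffinity in the pod template. The Deployment name, the app label, the replica count and the image are made up for illustration, and the label values assume the uppercase SPOT / ON_DEMAND that EKS managed node groups put on their nodes.

```yaml
# Sketch only: name, labels, replica count and image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 10
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 90   # strong preference for spot nodes
            preference:
              matchExpressions:
              - key: eks.amazonaws.com/capacityType
                operator: In
                values:
                - SPOT
          - weight: 1    # weak preference for anything that is not spot
            preference:
              matchExpressions:
              - key: eks.amazonaws.com/capacityType
                operator: NotIn
                values:
                - SPOT
      containers:
      - name: my-app
        image: registry.example.com/my-app:1.0
```

As far as I understand the scoring, each node gets the sum of the weights of the preference terms it matches (90 for a spot node, 1 for an on-demand node here), and that sum is then blended with the scheduler's other scoring plugins. That is why the outcome is a tendency towards spot rather than a fixed ratio.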
Another approach, which in theory could be combined with the one above (?), is to also use topologySpreadConstraints, e.g.:
spec:
  topologySpreadConstraints:
  - maxSkew: 20
    topologyKey: eks.amazonaws.com/capacityType
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        foo: bar
This would distribute pods across nodes with different capacity types, while allowing a skew of, say, 20 pods between them, and probably should (?) be combined with preferredDuringSchedulingIgnoredDuringExecution to achieve the desired effect.
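A rough sketch of what the combination could look like in a pod template is below. It reuses the foo: bar label and the maxSkew of 20 from the example above; the SPOT value assumes the uppercase label values that EKS managed node groups set. Treat it as an illustration of the syntax, not as a configuration known to produce a particular split.

```yaml
# Pod template fragment, sketch only.
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 90
        preference:
          matchExpressions:
          - key: eks.amazonaws.com/capacityType
            operator: In
            values:
            - SPOT
  topologySpreadConstraints:
  - maxSkew: 20
    topologyKey: eks.amazonaws.com/capacityType
    whenUnsatisfiable: ScheduleAnyway   # soft constraint: influences scoring only
    labelSelector:
      matchLabels:
        foo: bar
```

With ScheduleAnyway the spread constraint is only another scoring input next to the affinity weights, so neither rule is enforced. Switching to DoNotSchedule would make the skew limit hard, but the limit is symmetric: it caps the difference between the two capacity types without saying which one should have more pods.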
How feasible is the approach above? Are those the right tools to achieve the goals? I would very much appreciate any advice on the case!
This is not something the Kubernetes scheduler supports. Weights in affinities act more like score multipliers, and maxSkew is a very general cap on how out of balance things can get; it says nothing about the direction of that imbalance.
You would have to write something custom AFAIK, or at least I hadn't seen anything for this when I last went looking. Check out the scheduler extender webhook system for a somewhat easy way to implement it.
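To make the extender idea a bit more concrete, this is roughly what the scheduler side could look like. It is a sketch under several assumptions: the scheduler name, the service URL and the verb are hypothetical, the webhook itself (the part that would look at the capacity type labels and return a score per node) still has to be written, and since, as far as I know, EKS does not let you change the managed control plane's scheduler configuration, it assumes you run a second kube-scheduler inside the cluster with this config.

```yaml
# Sketch only: schedulerName, urlPrefix and the verb are hypothetical.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: capacity-aware-scheduler
extenders:
- urlPrefix: "http://capacity-extender.kube-system.svc:8080"
  prioritizeVerb: "prioritize"   # candidates are POSTed to <urlPrefix>/prioritize
  weight: 5                      # multiplier applied to the scores the extender returns
  nodeCacheCapable: false        # send full node objects so the extender can read labels
  enableHTTPS: false
  ignorable: true                # keep scheduling if the extender is unreachable
```

Pods would then opt in by setting schedulerName: capacity-aware-scheduler in their spec; everything else keeps using the default scheduler.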