I have a bunch of pods running in the same cluster. Sometimes there are not enough resources and some pods need to terminate.
That's OK, but how do I set the priority of which pods are killed first?
It usually kills my most important service first :\
Thanks!
I suggest you take a look at resource QoS.
Have your important pods (including monitoring) specify limits equal to requests, which in turn will land them in the Guaranteed QoS class.
Specifically,
The system computes pod level requests and limits by summing up per-resource requests and limits across all containers. When request == limit, the resources are guaranteed (...)
Also, overstepping a CPU limit only results in throttling, whereas a container that oversteps its memory limit gets killed, so it's more important to get memory limits (per container) right.
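
To make the request == limit rule concrete, here is a rough sketch of how a pod ends up in a QoS class. It is a simplified illustration, not the actual kubelet code: `qos_class` is a hypothetical helper, each container is a plain dict, and quantities are assumed to be already normalized to integers (CPU in millicores, memory in MiB) instead of Kubernetes quantity strings such as `500m` or `256Mi`.

```python
# Simplified sketch of QoS classification (not the real kubelet logic).
# Assumption: quantities are plain ints (cpu in millicores, memory in MiB).
RESOURCES = ("cpu", "memory")


def qos_class(containers):
    """Return "Guaranteed", "Burstable" or "BestEffort" for a list of containers."""
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A request that is left out defaults to the corresponding limit.
        requests = {**limits, **container.get("requests", {})}
        if requests or limits:
            any_set = True
        for resource in RESOURCES:
            # Guaranteed needs a limit for cpu AND memory, each equal to its request.
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


important = [{"requests": {"cpu": 500, "memory": 256},
              "limits": {"cpu": 500, "memory": 256}}]
batch_job = [{"requests": {"cpu": 100, "memory": 64},
              "limits": {"cpu": 500, "memory": 256}}]
unspecified = [{}]

print(qos_class(important))    # Guaranteed
print(qos_class(batch_job))    # Burstable
print(qos_class(unspecified))  # BestEffort
```

Note that every container in the pod has to satisfy the rule: one container without limits is enough to drop the whole pod out of the Guaranteed class.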