In the past I observed node unresponsiveness: the master didn't seem to communicate with those nodes under heavy load, when my pods were overloaded. So I set the CPU limits for those pods equal to their CPU requests. Since then the nodes stay healthy even when the pods are overloaded and waiting to scale out.
However, CPU usage is not even across pods, so it would be useful to set CPU limits higher than CPU requests. Say a pod has 4 worker processes with a CPU request and limit of 1 (I know a request and limit of 4 would be ideal for 4 processes, but that means 4x the cost). If I raise the CPU limit to 2 or more, the pods can burst, but I fear unstable nodes again.
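For reference, this is roughly the kind of spec I mean (the names and image are placeholders); with the limit above the request, the pod falls into the Burstable QoS class instead of Guaranteed:

```yaml
# Placeholder pod: requests 1 CPU but may burst up to 2.
# Because limits != requests, the QoS class is Burstable.
apiVersion: v1
kind: Pod
metadata:
  name: worker
spec:
  containers:
  - name: app
    image: my-app:latest   # placeholder image
    resources:
      requests:
        cpu: "1"
      limits:
        cpu: "2"           # burst headroom beyond the request
```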
If I could set a node-level CPU limit shared by all pods on a node, I would be happy. I know about node CPU allocatable and capacity, but I think even capacity doesn't cap the total CPU usage of the pods. Or am I wrong about this?
If you can run the relevant pods in a separate namespace altogether, you could use a ResourceQuota.
ResourceQuota provides constraints that limit aggregate resource consumption per namespace. It can limit the quantity of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.
The ResourceQuota parameter limits.cpu might be just what you are looking for. Across all pods in a non-terminal state in the given namespace, the sum of CPU limits cannot exceed this value.
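As a minimal sketch (the namespace name and the 16-core cap are assumptions for illustration), a quota like this caps the sum of CPU limits declared by all pods in the namespace:

```yaml
# Hypothetical quota: the CPU limits of all pods in the
# my-workers namespace may not add up to more than 16 cores.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cpu-quota
  namespace: my-workers   # assumed namespace for the relevant pods
spec:
  hard:
    limits.cpu: "16"      # cap on the sum of declared CPU limits
```

You would apply it with kubectl apply -f cpu-quota.yaml and inspect current usage with kubectl describe resourcequota cpu-quota -n my-workers. One caveat: once a limits.cpu quota is active in a namespace, every new pod there must declare a CPU limit (or get one from a LimitRange default), or admission will reject it.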
You can read more about ResourceQuota here: https://kubernetes.io/docs/concepts/policy/resource-quotas/