How exactly does k8s reserve resources for a namespace?

3/27/2019

I have the following questions regarding request/limit quotas for a namespace:

Consider the following namespace resource setup:

  • request: 1 core/1GiB
  • limit: 2 core/2GiB
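Such a setup would typically be expressed as a ResourceQuota object; a minimal sketch, assuming a namespace named `my-app` (the name is hypothetical):

```yaml
# Hypothetical namespace name; quota values taken from the setup above.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-app
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
```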

  1. Does it mean a namespace is guaranteed to have 1 core/1GiB? How is this achieved physically on the cluster nodes? Does k8s somehow strictly reserve these values for a namespace (at the time it is created)? At which point in time does the reservation take place?

  2. The limit of 2 core/2GiB: does it mean it is not guaranteed for a namespace and depends on the cluster's current state? For example, if the cluster currently has only 100MiB of free RAM, but at runtime a pod needs 200MiB above its resource request, will the pod be restarted? Where does k8s take this resource from if a pod needs to go above its request?

  3. Regarding namespace granularity and k8s horizontal autoscaling: say we have 2 applications and 2 namespaces, one per app. We set both namespace quotas so that there is a free buffer for 2 extra pods, and configure horizontal autoscaling up to 2 pods at a certain CPU threshold. Is there really a point in such a setup? My concern is that if a namespace reserves its resources and no other namespace can utilize them, we could just as well create 2 extra pods in each namespace's replica set with no autoscaling and use those pods constantly. I can see a point in autoscaling if we have more than one application in one namespace, so that these apps could share the same resource buffer for scaling. Is this assumption correct?

  4. Do you think it is good practice to have one namespace per app? Why?

P.S. I know what resource requests/limits are and the difference between them; most sources give only a very high-level explanation of the concept.

Thanks in advance.

-- Jan Lobau
kubernetes

1 Answer

3/27/2019

The docs clearly state the following:

In the case where the total capacity of the cluster is less than the sum of the quotas of the namespaces, there may be contention for resources. This is handled on a first-come-first-served basis.

and

ResourceQuotas are independent of the cluster capacity. They are expressed in absolute units. So, if you add nodes to your cluster, this does not automatically give each namespace the ability to consume more resources.

and

resource quota divides up aggregate cluster resources, but it creates no restrictions around nodes: pods from several namespaces may run on the same node

A ResourceQuota is a constraint set on a namespace and does not reserve capacity; it just sets a limit on the resources that can be consumed by that namespace.

To effectively "reserve" capacity, you have to set restrictions on all namespaces, so that the other namespaces cannot consume more resources than your cluster can provide. This way you have a stronger guarantee that a namespace will have capacity available to run its load.
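As a sketch of that idea: assuming a cluster with roughly 4 cores/4GiB of allocatable capacity and two namespaces `team-a` and `team-b` (names and numbers are hypothetical), quotas whose request totals do not exceed the cluster capacity mean each namespace's requests can always be satisfied:

```yaml
# Hypothetical: ~4 cores / 4GiB of allocatable capacity split between two
# namespaces, so the sum of the quotas never exceeds what the cluster can
# provide (real clusters also need headroom for system components).
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota
  namespace: team-b
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
```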

The docs suggest:

  • Proportionally divide total cluster resources among several teams (namespaces).
  • Allow each team to grow resource usage as needed, but have a generous limit to prevent accidental resource exhaustion.
  • Detect demand from one namespace, add nodes, and increase quota.

Given that, the answers to your questions are:

  1. It is not reserved capacity; the reservation happens at resource (pod) creation time.

  2. Running resources are not affected after reservation. New resources are rejected if creating them would overcommit the quota (limits).
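The counting happens against the requests/limits declared on each pod's containers. A minimal sketch (pod name, image, and namespace are assumptions): when a ResourceQuota covers cpu/memory, every container in the namespace must declare requests and limits (or receive defaults from a LimitRange), and admission rejects a new pod whose declared values would push the namespace total over the quota.

```yaml
# Hypothetical pod; its declared requests/limits are what the quota
# admission check adds to the namespace's running total.
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: my-app
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 512Mi
```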

  3. As stated in the docs, if the sum of the quotas is higher than the cluster capacity, reservation happens on a first-come-first-served basis.

  4. This could be its own question on SO; in simple terms: for resource isolation and management.

-- Diego Mendes
Source: StackOverflow