How does Kubernetes Admission Controller handle multiple simultaneous admission requests?

9/30/2021

Assume that only one admission controller Pod is running, and that the admission controller has a webhook that is triggered by Pod deletion events.

Example Scenario

There are 2 Pods (Pod A and Pod B) within a namespace. 2 different users (Alice and Bob) delete Pods at the exact same time:

1. Alice deletes Pod A
2. Bob deletes Pod B

In this specific scenario, will the admission controller handle the two admission requests serially or in parallel? In other words, will it handle the admission request for Pod A before that of Pod B (or vice versa), or will it handle both admission requests at the same time?

General Scenario

The admission requests are sent from the API Server to the admission controller. Generally speaking, will it be possible that multiple admission requests are sent to the admission controller at the exact same time?

And if so, will the admission controller handle them in parallel via some built-in parallelism mechanism, or will the admission controller queue them and process them serially?

-- jtee
kubernetes
multithreading

1 Answer

10/7/2021

Since the kube-apiserver options include the --max-mutating-requests-inflight and --max-requests-inflight flags, which determine the server's total concurrency limit, I think that admission controllers also support multi-threading. Otherwise they would be a bottleneck for processing API requests.
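To make the idea of an in-flight limit concrete, here is a minimal Python sketch. It is a toy model, not kube-apiserver code: the `InflightLimiter` class, its `handle` method and the `sleep` standing in for request work are all hypothetical names chosen for illustration. It assumes that requests run concurrently up to the limit and that a request arriving when the limit is reached is rejected with HTTP 429 rather than queued, which is how I understand the behavior without APF.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InflightLimiter:
    """Toy model (hypothetical): admit at most `limit` concurrent requests."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.current = 0  # requests being handled right now
        self.peak = 0     # highest concurrency observed so far

    def handle(self, work_seconds: float = 0.05) -> int:
        # Take a slot without waiting; when none is free, reject (assumed 429).
        if not self._slots.acquire(blocking=False):
            return 429
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        try:
            time.sleep(work_seconds)  # stand-in for the real request work
            return 200
        finally:
            with self._lock:
                self.current -= 1
            self._slots.release()


if __name__ == "__main__":
    limiter = InflightLimiter(limit=2)
    # Five "simultaneous" requests against a limit of two.
    with ThreadPoolExecutor(max_workers=5) as pool:
        codes = list(pool.map(lambda _: limiter.handle(), range(5)))
    print(codes, "peak concurrency:", limiter.peak)
```

With a limit of 2 and five requests submitted at once, the observed peak never exceeds 2. How many requests get 429 depends on timing, so the sketch does not assume a fixed split.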

This holds as long as we are using the default configuration.

On the other hand, we can customize our environment and specify how requests should be processed. API Priority and Fairness (APF) can be used for that purpose. APF classifies and isolates requests in a more fine-grained way than --max-mutating-requests-inflight and --max-requests-inflight do.

Without APF enabled, overall concurrency in the API server is limited by the kube-apiserver flags --max-requests-inflight and --max-mutating-requests-inflight. With APF enabled, the concurrency limits defined by these flags are summed and then the sum is divided up among a configurable set of priority levels. Each incoming request is assigned to a single priority level, and each priority level will only dispatch as many concurrent requests as its configuration allows.

The default configuration, for example, includes separate priority levels for leader-election requests, requests from built-in controllers, and requests from Pods.
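The "summed and then divided" step can be sketched as simple arithmetic. The snippet below assumes the proportional rule from the APF documentation as I understand it: each priority level gets the server limit multiplied by its concurrency shares, divided by the total shares, rounded up. The function name `divide_concurrency`, the level names and the share values are hypothetical; 400 and 200 are used as the flag values because, as far as I know, those are the kube-apiserver defaults.

```python
from typing import Dict


def divide_concurrency(server_limit: int, shares: Dict[str, int]) -> Dict[str, int]:
    """Split a server-wide limit among priority levels in proportion to shares.

    Each level gets ceil(server_limit * share / total_shares), computed with
    integer arithmetic. Names and values here are illustrative only.
    """
    total = sum(shares.values())
    return {
        level: (server_limit * share + total - 1) // total  # integer ceiling
        for level, share in shares.items()
    }


# Assumed flag values: --max-requests-inflight=400, --max-mutating-requests-inflight=200
server_limit = 400 + 200

# Hypothetical shares for three priority levels like those mentioned above.
shares = {"leader-election": 10, "controllers": 40, "pods": 100}

limits = divide_concurrency(server_limit, shares)
print(limits)  # {'leader-election': 40, 'controllers': 160, 'pods': 400}
```

Under these assumed numbers, requests classified into the "pods" level could run up to 400 at a time while leader-election traffic keeps its own 40 slots, so one class of requests cannot starve the other.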

So it's quite a broad question. It all depends on which admission controller we are using (built-in or custom), which controls are in place (the --max-mutating-requests-inflight and --max-requests-inflight command-line flags), and the APF configuration.

-- Andrew Skorkin
Source: StackOverflow