Timeouts in metrics-server right after installing ingress in AKS

6/25/2020

Prerequisites:

  • A new Kubernetes cluster (Azure, v. 1.14.8) is set up
  • Metrics-server (v. 0.3.5) is set up automatically by AKS

Steps:

  • Install ingress into the cluster with: helm install ingress stable/nginx-ingress --namespace ingress --create-namespace --set controller.replicaCount=1
  • Wait a few minutes

After 3 to 8 minutes, errors appear in metrics-server and it falls into a loop with a FailedDiscoveryCheck error: Failed to make webhook authorized request: Post https://...azmk8s.io:443/apis/authorization.k8s.io/v1beta1/subjectaccessreviews: read tcp %IP%: read: connection timed out.

Errors in the NGINX Ingress controller pod:

E0625 12:18:49.622522       6 leaderelection.go:320] error retrieving resource lock ingress/ingress-controller-leader-nginx: Get "https://10.0.0.1:443/api/v1/namespaces/ingress/configmaps/ingress-controller-leader-nginx": context deadline exceeded
I0625 12:18:49.622561       6 leaderelection.go:277] failed to renew lease ingress/ingress-controller-leader-nginx: timed out waiting for the condition
I0625 12:18:49.626143       6 leaderelection.go:242] attempting to acquire leader lease  ingress/ingress-controller-leader-nginx...
E0625 12:34:13.890642       6 leaderelection.go:320] error retrieving resource lock ingress/ingress-controller-leader-nginx: Get "https://10.0.0.1:443/api/v1/namespaces/ingress/configmaps/ingress-controller-leader-nginx": read tcp 10.244.0.53:55144->10.0.0.1:443: read: connection timed out
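
To see how long the controller was stalled between the two lock errors, the timestamps in the log above can be compared. The following is only a minimal sketch: it assumes the usual klog header layout (severity letter, MMDD, time with microseconds, thread id, file:line), and the names KLOG_RE and parse_klog as well as the year 2020 (taken from the post date) are my own assumptions, since the log lines carry no year.

```python
import re
from datetime import datetime

# Assumed klog header: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
KLOG_RE = re.compile(
    r"^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ ([^\]]+)\] (.*)$"
)

def parse_klog(line, year=2020):
    """Return (severity, timestamp, source, message), or None if no match."""
    m = KLOG_RE.match(line)
    if m is None:
        return None
    severity, mmdd, clock, source, message = m.groups()
    # The year is not part of the log line, so it is supplied by the caller.
    ts = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
    return severity, ts, source, message

# The four controller log lines quoted above, unchanged.
LOG = '''
E0625 12:18:49.622522       6 leaderelection.go:320] error retrieving resource lock ingress/ingress-controller-leader-nginx: Get "https://10.0.0.1:443/api/v1/namespaces/ingress/configmaps/ingress-controller-leader-nginx": context deadline exceeded
I0625 12:18:49.622561       6 leaderelection.go:277] failed to renew lease ingress/ingress-controller-leader-nginx: timed out waiting for the condition
I0625 12:18:49.626143       6 leaderelection.go:242] attempting to acquire leader lease  ingress/ingress-controller-leader-nginx...
E0625 12:34:13.890642       6 leaderelection.go:320] error retrieving resource lock ingress/ingress-controller-leader-nginx: Get "https://10.0.0.1:443/api/v1/namespaces/ingress/configmaps/ingress-controller-leader-nginx": read tcp 10.244.0.53:55144->10.0.0.1:443: read: connection timed out
'''

parsed = [parse_klog(line) for line in LOG.strip().splitlines()]
# Keep only error-level entries (severity "E").
errors = [p for p in parsed if p is not None and p[0] == "E"]
gap = (errors[-1][1] - errors[0][1]).total_seconds()
print(f"{gap:.1f} s between the first and last lock error")
```

For these lines the gap comes out at a little over 15 minutes, during which the controller logged nothing else about the lease.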

The metrics-server does not work until it is restarted; after the restart no issues are observed. Adding liveness/readiness probes to the metrics-server deployment makes the restart happen sooner, but does not fix the root cause.

Why does the metrics-server stop working only a few minutes after ingress is installed? How does installing ingress affect the cluster? The issue reproduces reliably: if you delete ingress and install it again, the problem occurs again.

Sometimes, metrics-server fails with error:

Message: endpoints for service/metrics-server in "kube-system" have no addresses
Reason: MissingEndpoints
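
The Message/Reason pair above is what the Available condition of the metrics APIService reports. A small helper like the one below can pull that condition out of the object's JSON so the two failure modes (FailedDiscoveryCheck and MissingEndpoints) can be told apart at a glance. This is a sketch, not a definitive tool: the function name availability is hypothetical, and the sample object is hand-built from the message in this post, with the "False" status and the APIService name v1beta1.metrics.k8s.io being my assumptions.

```python
def availability(apiservice):
    """Return (status, reason, message) of the Available condition, or None."""
    conditions = apiservice.get("status", {}).get("conditions", [])
    for cond in conditions:
        if cond.get("type") == "Available":
            return cond.get("status"), cond.get("reason"), cond.get("message")
    return None

# Hand-built sample; only reason and message come from the post,
# the name and the "False" status are assumptions.
sample = {
    "metadata": {"name": "v1beta1.metrics.k8s.io"},
    "status": {
        "conditions": [
            {
                "type": "Available",
                "status": "False",
                "reason": "MissingEndpoints",
                "message": 'endpoints for service/metrics-server in '
                           '"kube-system" have no addresses',
            }
        ]
    },
}

status, reason, message = availability(sample)
print(f"Available={status} reason={reason}: {message}")
```

On a live cluster the input would instead be the output of kubectl get apiservice v1beta1.metrics.k8s.io -o json, parsed with json.load.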

The same behavior is also observed for another pod: if you install kubernetes-dashboard, it stops working after ingress is installed and returns error 500 context deadline exceeded.

I would like to understand and fix the root cause.

-- Efim Konstantinov
azure-aks
kubernetes
kubernetes-ingress
