Installing nginx ingress controller into AKS cluster - can't pull image from Azure Container Registry - 401 Unauthorized

8/27/2021

I'm trying to install an nginx ingress controller into an Azure Kubernetes Service (AKS) cluster using helm, following this Microsoft guide. The helm install fails because it needs to pull a "kube-webhook-certgen" image from a 'local' Azure Container Registry (which I created and linked to the cluster). The Kubernetes pod that's initially scheduled in the cluster fails to pull the image, and kubectl describe pod [pod_name] shows the following error:

failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]

This section of the guide describes using helm to create an ingress controller.

The guide describes creating an Azure Container Registry and linking it to a Kubernetes cluster, which I've done successfully using:

az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acr-name>

I then imported the required third-party repositories successfully into my 'local' Azure Container Registry, as detailed in the guide. I checked that the cluster has access to the Azure Container Registry using:

az aks check-acr --name MyAKSCluster --resource-group myResourceGroup --acr letsencryptdemoacr.azurecr.io

I also used the Azure Portal to check permissions on the Azure Container Registry and the specific repository that has the issue. It shows that both the cluster and the repository have the ACR_PULL permission.

When I run the helm script to create the ingress controller, it fails at the point where it's trying to create a kubernetes pod named nginx-ingress-ingress-nginx-admission-create in the ingress-basic namespace that I created. When I use kubectl describe pod [pod_name_here], it shows the following error, which prevents creation of the ingress controller from continuing:

Failed to pull image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen:v1.5.1@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": [rpc error: code = NotFound desc = failed to pull and unpack image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068: not found, rpc error: code = Unknown desc = failed to pull and unpack image "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to resolve reference "letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]

This is the helm script that I run in a linux terminal:

helm install nginx-ingress ingress-nginx/ingress-nginx \
--namespace ingress-basic \
--set controller.replicaCount=1 \
--set controller.nodeSelector."kubernetes\.io/os"=linux \
--set controller.image.registry=$ACR_URL \
--set controller.image.image=$CONTROLLER_IMAGE \
--set controller.image.tag=$CONTROLLER_TAG \
--set controller.image.digest="" \
--set controller.admissionWebhooks.patch.nodeSelector."kubernetes\.io/os"=linux \
--set controller.admissionWebhooks.patch.image.registry=$ACR_URL \
--set controller.admissionWebhooks.patch.image.image=$PATCH_IMAGE \
--set controller.admissionWebhooks.patch.image.tag=$PATCH_TAG \
--set defaultBackend.nodeSelector."kubernetes\.io/os"=linux \
--set defaultBackend.image.registry=$ACR_URL \
--set defaultBackend.image.image=$DEFAULTBACKEND_IMAGE \
--set defaultBackend.image.tag=$DEFAULTBACKEND_TAG \
--set controller.service.loadBalancerIP=$STATIC_IP \
--set controller.service.annotations."service\.beta\.kubernetes\.io/azure-dns-label-name"=$DNS_LABEL

I'm using the following relevant environment variables:

ACR_URL=letsencryptdemoacr.azurecr.io
PATCH_IMAGE=jettech/kube-webhook-certgen
PATCH_TAG=v1.5.1

How do I fix the authorization?

-- Chris Halcrow
azure
azure-container-registry
kubernetes

3 Answers

8/30/2021

Make sure you set all the digests to empty:

--set controller.image.digest=""
--set controller.admissionWebhooks.patch.image.digest=""
--set defaultBackend.image.digest=""

Basically, this will pull the image <your-registry>.azurecr.io/ingress-nginx/controller:<version> by tag only, without the trailing @sha256:<digest>.
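
To illustrate why clearing the digest changes what gets pulled, here is a minimal shell sketch of how an image reference is put together from a registry, image, tag and optional digest. The image_ref function is hypothetical and only mimics the shape of the reference seen in the error message in the question; it is not the chart's actual template.

```shell
# Sketch: compose an image reference from its parts.
# Assumption: this mirrors the "registry/image:tag@digest" shape seen in
# the pull error; the real chart template may differ in detail.
image_ref() {
  # $1 = registry, $2 = image, $3 = tag, $4 = digest (may be empty)
  if [ -n "$4" ]; then
    printf '%s/%s:%s@%s\n' "$1" "$2" "$3" "$4"
  else
    printf '%s/%s:%s\n' "$1" "$2" "$3"
  fi
}

# With a digest set, the pull is pinned to that exact manifest.
image_ref letsencryptdemoacr.azurecr.io jettech/kube-webhook-certgen v1.5.1 \
  sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068

# With an empty digest, only the tag is used to find the image.
image_ref letsencryptdemoacr.azurecr.io jettech/kube-webhook-certgen v1.5.1 ""
```

If the pinned digest does not exist in your registry, the first form cannot resolve even when the tag itself is present, which is why emptying the digest values lets the pull fall back to the tag.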

The other problem: if you use the latest chart version, the controller pod ends up in CrashLoopBackOff status. Checking the live log of the pod, you will see a problem with flags, e.g. Unknown flag --controller-class. To resolve this, you can pass the --version flag to the helm install command to use chart version 3.36.0. That should resolve all the deployment problems.
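
If you want to confirm that this is the failure you are hitting, one option is to filter the pod's logs for the flag error. The sketch below assumes the log line contains the words "unknown flag" (the exact wording may differ between versions); has_unknown_flag is a hypothetical helper name, and the pod name in the usage comment is a placeholder.

```shell
# Sketch: read log text on stdin and print any lines that mention an
# unknown flag. Exits non-zero when no such line is found.
# Assumption: the controller reports the problem with the words
# "unknown flag" somewhere in the line (matched case-insensitively).
has_unknown_flag() {
  grep -i 'unknown flag'
}

# Usage (placeholder pod name, namespace taken from the question):
#   kubectl logs --namespace ingress-basic <controller-pod> | has_unknown_flag
```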

-- Le Silencieux
Source: StackOverflow

8/29/2021

It seems like the issue is caused by the new ingress-nginx/ingress-nginx helm chart release. I have fixed it by using version 3.36.0 instead of the latest (4.0.1).

helm upgrade -i nginx-ingress ingress-nginx/ingress-nginx \
--version 3.36.0 \
...

-- David Truong
Source: StackOverflow

8/30/2021

Azure support identified and provided a solution to this, and essentially confirmed that the documentation in the Microsoft tutorial is, at best, now outdated relative to the current Helm release for the ingress controller.

The full error message I was getting was similar to the one below. It shows that the first error encountered is actually that the image is NotFound; the message about Unauthorized is misleading. The issue appears to be that the install references 'digests' for a couple of the images it requires (a digest is a unique identifier for an image). The install appears to have been using the digests of the Docker images from their original location, not the digests of the copies that I imported into my Azure Container Registry. This doesn't work, because the digests the install tries to pull don't match the digests of the images in my Azure Container Registry.

Failed to pull image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen:v1.5.1@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': [rpc error: code = NotFound desc = failed to pull and unpack image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to resolve reference 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068: not found, rpc error: code = Unknown desc = failed to pull and unpack image 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to resolve reference 'letsencryptdemoacr.azurecr.io/jettech/kube-webhook-certgen@sha256:f3b6b39a6062328c095337b4cadcefd1612348fdd5190b1dcbcb9b9e90bd8068': failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]
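
The order of the errors is easier to see if you pull just the gRPC status codes out of the message. This is a small sketch that assumes the event text has the "code = <Name>" form shown above; pull_error_codes is a hypothetical helper name.

```shell
# Sketch: read a pull error on stdin and print the gRPC status codes in
# the order they appear, one per line, so the first failure stands out.
pull_error_codes() {
  grep -o 'code = [A-Za-z]*' | sed 's/^code = //'
}

# Usage (placeholder pod name, namespace taken from the question):
#   kubectl describe pod <pod_name> --namespace ingress-basic | pull_error_codes
```

For the message above this lists NotFound before Unknown, which matches the reading that the missing digest is the primary failure and the 401 comes from a later attempt.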

The digests of the images that I'd imported into my local Azure Container Registry needed to be specified as additional arguments to the helm install:

--set controller.image.digest="sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899" \
--set controller.admissionWebhooks.patch.image.digest="sha256:950833e19ade18cd389d647efb88992a7cc077abedef343fa59e012d376d79b7" \
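
One way to look these digests up from the command line is sketched below. It assumes the Azure CLI is installed and logged in, and that az acr repository show returns a digest field for a tagged image; acr_digest and digest_flag are hypothetical helper names, and the arguments in the usage comment are placeholders for your own values.

```shell
# Sketch: print the digest of an image as it is stored in the registry.
# $1 = registry name, $2 = repository:tag
# Assumption: the JSON returned for a tagged image has a "digest" field.
acr_digest() {
  az acr repository show --name "$1" --image "$2" --query digest --output tsv
}

# Sketch: print a helm --set argument for a chart value path.
# $1 = value path, $2 = registry name, $3 = repository:tag
digest_flag() {
  printf '%s %s=%s\n' --set "$1" "$(acr_digest "$2" "$3")"
}

# Usage (placeholders; substitute your registry and imported image):
#   digest_flag controller.admissionWebhooks.patch.image.digest \
#     <acr-name> "$PATCH_IMAGE:$PATCH_TAG"
```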

I then had a second issue where I was getting CrashLoopBackOff for the ingress controller pod. I fixed this by re-importing a different version of the ingress controller image than the one referenced in the tutorial, as follows:

First, set the environment variable that identifies the tag to pull for the ingress controller image:

CONTROLLER_TAG=v1.0.0

Then delete the ingress repository from the Azure Container Registry (I did this manually via the portal) and re-import it using the following (the values of the other variables should be as specified in the Microsoft tutorial):

az acr import --name $REGISTRY_NAME --source $CONTROLLER_REGISTRY/$CONTROLLER_IMAGE:$CONTROLLER_TAG --image $CONTROLLER_IMAGE:$CONTROLLER_TAG
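
The delete and re-import can also be done entirely from the command line. The following is a sketch under the assumption that deleting the whole repository is acceptable in your registry; reimport_image is a hypothetical helper, and its arguments correspond to the tutorial's variables used above.

```shell
# Sketch: delete a repository from the registry, then import the image
# again under the requested tag. The import only runs if the delete
# succeeds.
# $1 = registry name, $2 = source registry, $3 = image, $4 = tag
# Caution: the delete removes every tag in that repository.
reimport_image() {
  az acr repository delete --name "$1" --repository "$3" --yes &&
    az acr import --name "$1" --source "$2/$3:$4" --image "$3:$4"
}

# Usage:
#   reimport_image "$REGISTRY_NAME" "$CONTROLLER_REGISTRY" \
#     "$CONTROLLER_IMAGE" "$CONTROLLER_TAG"
```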
-- Chris Halcrow
Source: StackOverflow