I'm suddenly having issues pulling the latest image from Azure Container Registry with AKS (which previously worked fine).
If I run
kubectl describe pod <podid> I get:
Failed to pull image <image>: rpc error: code = Unknown desc = Error response from daemon: Get <image>: unauthorized: authentication required
I've tried logging into the ACR manually and it's all working correctly - the new images have pushed correctly and I can pull them manually.
Moreover I've tried:
az aks update -g MyResourceGroup -n MyManagedCluster --attach-acr acrName
This succeeds (no errors, and the output reports the propagation as successful), but the image pull still fails.
I've tried updating the credentials with:
az aks update-credentials --resource-group <group> --name <aks name> --reset-service-principal --service-principal <sp id> --client-secret <client-secret>
Which spits out a rather weird message:
Deployment failed. Correlation ID: 6e84754a-821d-4a39-a0df-7ab9ba21973f.
Unable to get log analytics workspace info. Resource ID:
/subscriptions/<subscription id>/resourcegroups/defaultresourcegroup-
weu/providers/microsoft.operationalinsights/workspaces/defaultworkspace-
d259e6ea-8230-4cb0-a7a8-7f0df6c7ef18-weu. Details: autorest/azure: Service
returned an error. Status=404 Code="ResourceGroupNotFound"
Message="Resource group 'defaultresourcegroup-weu' could not be found.". For
more details about how to create and use log analytics workspace, please
refer to: https://aka.ms/new-log-analytics
I tried creating a new log analytics workspace and the error above persisted.
I've also tried steps from:
Besides the posts above, I've gone through many tutorials and Microsoft pages to try to fix the problem.
I've tried creating a new service principal and assigning it the appropriate roles but the error still persists. I've also dabbled with creating new secrets and had no success.
My pods that don't need new images are all running as expected. If I look at my app registrations (under Azure Active Directory), they were all created a year ago, so I'm concerned that something has expired and I don't know how to fix it.
Got this working by disabling the Log Analytics addon using:
az aks disable-addons -a monitoring -n <AKSName> -g <ResourceGroupName>
As one of the error messages I posted suggests, Log Analytics seemed to be what caused the update to fail (I'm not sure why), so I disabled it for the time being and was then able to update the credentials with:
az aks update-credentials --resource-group <group> --name <aks name> --reset-service-principal --service-principal <sp id> --client-secret <client-secret>
There are two ways to get this sorted. The first is to grant the cluster's service principal access to the registry:
CLIENT_ID=$(az aks show --resource-group $AKS_RESOURCE_GROUP --name $AKS_CLUSTER_NAME --subscription $SUBSCRIPTION_ID --query "servicePrincipalProfile.clientId" --output tsv)
ACR_ID=$(az acr show --name $ACR_NAME --resource-group $ACR_RESOURCE_GROUP --subscription $SUBSCRIPTION_ID --query "id" --output tsv)
az role assignment create --assignee $CLIENT_ID --role Reader --scope $ACR_ID
The other way is to add an image pull secret, which will be of type kubernetes.io/dockerconfigjson.
It can be created with:
kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
You can then reference this secret under imagePullSecrets in your deployment files, and the authentication error should go away.
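As a rough illustration of where the secret goes, here is a minimal Deployment sketch. The name my-app, the labels, and the image path are hypothetical placeholders and not taken from the question; only the secret name regcred comes from the kubectl command above. The secret has to live in the same namespace as the pods that use it.

```yaml
# Hypothetical Deployment: "my-app" and the image path are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          # Replace with your registry server, image, and tag.
          image: <your-registry-server>/<image>:<tag>
      # Must match the name given to "kubectl create secret docker-registry".
      imagePullSecrets:
        - name: regcred
```

Note that imagePullSecrets sits at the pod spec level, alongside containers, not inside an individual container entry.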