Suddenly an entire Kubernetes cluster (Azure's AKS-solution) became unresponsive.
When running kubectl commands, the result is "kubectl x509 certificate has expired or is not yet valid".
Nothing in Azure Portal indicates an unhealthy state.
The quick solution:
az aks rotate-certs -g $RESOURCE_GROUP_NAME -n $CLUSTER_NAME
When certificates have been rotated, you can use kubectl again.
Be ready to wait 30 minutes before the cluster fully recovers.
Full explanation can be found in this article:
https://docs.microsoft.com/en-us/azure/aks/certificate-rotation
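
Since nothing in the portal flagged the problem, a scheduled check of the certificates embedded in the kubeconfig can warn before they expire. The script below is an added example, not part of the original post: the function names and the constructed sample kubeconfig are hypothetical, and it assumes the kubeconfig stores certificates as base64 in client-certificate-data and certificate-authority-data.

```shell
#!/bin/sh
# Added example: warn before the certificates embedded in a kubeconfig expire.
# Needs awk, base64 and openssl. All sample data is constructed.

# Print "<field> <base64>" for each embedded certificate in kubeconfig $1.
kubeconfig_cert_fields() {
  awk '$1 ~ /^(certificate-authority-data|client-certificate-data):$/ {
         sub(/:$/, "", $1); print $1, $2 }' "$1"
}

# Return 0 if every certificate in kubeconfig $1 is still valid $2 days from
# now (default 30); return 1 if any is expired or expires inside that window.
check_kubeconfig_certs() {
  warn_days=${2:-30}
  rc=0
  while read -r field b64; do
    [ -n "$b64" ] || continue
    pem=$(printf '%s' "$b64" | base64 -d) || return 2
    end=$(printf '%s\n' "$pem" | openssl x509 -noout -enddate | cut -d= -f2)
    if printf '%s\n' "$pem" |
       openssl x509 -noout -checkend $((warn_days * 86400)) >/dev/null; then
      echo "OK   $field notAfter=$end"
    else
      echo "WARN $field notAfter=$end (expires within ${warn_days} days)"
      rc=1
    fi
  done <<EOF
$(kubeconfig_cert_fields "$1")
EOF
  return $rc
}

# Constructed sample input: write a kubeconfig to $1 whose client certificate
# is a throwaway self-signed certificate valid for $2 days.
make_constructed_kubeconfig() {
  tmp=$(mktemp -d) || return 1
  openssl req -x509 -newkey rsa:2048 -nodes -days "$2" \
    -subj "/CN=constructed-client" \
    -keyout "$tmp/key.pem" -out "$tmp/crt.pem" 2>/dev/null || return 1
  b64=$(base64 < "$tmp/crt.pem" | tr -d '\n')
  rm -rf "$tmp"
  printf 'apiVersion: v1\nkind: Config\nusers:\n- name: constructed-user\n  user:\n    client-certificate-data: %s\n' "$b64" > "$1"
}
```

Called as check_kubeconfig_certs "$HOME/.kube/config" 30 from a scheduled job, a non-zero exit status is the signal to rotate before kubectl starts failing.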