I've gone through the Azure Cats&Dogs tutorial described here and I am getting an error in the final step, where the apps are launched in AKS. Kubernetes reports that I have insufficient pods, but I'm not sure why that would be. I ran through this same tutorial a few weeks ago without problems.
$ kubectl apply -f azure-vote-all-in-one-redis.yaml
deployment.apps/azure-vote-back created
service/azure-vote-back created
deployment.apps/azure-vote-front created
service/azure-vote-front created
$ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
azure-vote-back-655476c7f7-mntrt    0/1     Pending   0          6s
azure-vote-front-7c7d7f6778-mvflj   0/1     Pending   0          6s
$ kubectl get events
LAST SEEN   TYPE      REASON                 KIND         MESSAGE
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
84s         Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
70s         Warning   FailedScheduling       Pod          skip schedule deleting pod: default/azure-vote-back-655476c7f7-l5j28
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-kjld6
99s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-l5j28
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-back-655476c7f7-mntrt
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
99s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-back-655476c7f7 to 1
9s          Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
3m36s       Warning   FailedScheduling       Pod          0/1 nodes are available: 1 Insufficient pods.
53m         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-rmbqb
24s         Normal    SuccessfulCreate       ReplicaSet   Created pod: azure-vote-front-7c7d7f6778-mvflj
53m         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1
53m         Normal    EnsuringLoadBalancer   Service      Ensuring load balancer
52m         Normal    EnsuredLoadBalancer    Service      Ensured load balancer
46s         Normal    DeletingLoadBalancer   Service      Deleting load balancer
24s         Normal    ScalingReplicaSet      Deployment   Scaled up replica set azure-vote-front-7c7d7f6778 to 1
$ kubectl get nodes
NAME                       STATUS   ROLES   AGE    VERSION
aks-nodepool1-27217108-0   Ready    agent   7d4h   v1.9.9
The only thing I can think of that has changed is that I now have other (larger) clusters running as well, and the main reason I went through this Cats&Dogs tutorial again is that I hit this same problem today with my other clusters. Is this a resource limit issue with my Azure account?
Update 10-20/3:15 PST: Notice how these three clusters all show that they use the same nodepool, even though they were created in different resource groups. Also note how the "get-credentials" call for gem2-cluster reports an error. I did have an earlier cluster called gem2-cluster which I deleted and recreated using the same name (in fact I deleted the whole resource group). What's the correct process for doing this?
$ az aks get-credentials --name gem1-cluster --resource-group gem1-rg
Merged "gem1-cluster" as current context in /home/psteele/.kube/config
$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE     VERSION
aks-nodepool1-27217108-0   Ready    agent   3h26m   v1.9.11
$ az aks get-credentials --name gem2-cluster --resource-group gem2-rg
A different object named gem2-cluster already exists in clusters
$ az aks get-credentials --name gem3-cluster --resource-group gem3-rg
Merged "gem3-cluster" as current context in /home/psteele/.kube/config
$ kubectl get nodes -n gem1
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11
$ kubectl get nodes -n gem2
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11
$ kubectl get nodes -n gem3
NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-14202150-0   Ready    agent   26m   v1.9.11
Check to make sure you are not hitting core limits for your subscription.
az vm list-usage --location "<location>" -o table
If you are, you can request more quota: https://docs.microsoft.com/en-us/azure/azure-supportability/resource-manager-core-quotas-request
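For example, you can narrow that table down to just the core/vCPU quotas (the location below is only a placeholder, substitute your own region):

# Filter the usage table to the core/vCPU quota lines for a region
# ("eastus" is just an example location)
$ az vm list-usage --location "eastus" -o table | grep -iE "cores|vcpus"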
What is your max-pods set to? This is a normal error when you've reached the limit of pods per node.
You can check your current maximum number of pods per node with:
$ kubectl get nodes -o yaml | grep pods
pods: "30"
pods: "30"
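On AKS the per-node pod limit is fixed when the cluster (or node pool) is created via --max-pods, so you can also read it back from the cluster configuration. A rough sketch, where the resource group and cluster names are placeholders for your own:

# Show the configured max pods per node for each node pool
# (myResourceGroup / myAKSCluster are placeholders)
$ az aks show --resource-group myResourceGroup --name myAKSCluster --query "agentPoolProfiles[].maxPods" -o tsv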
And your current pod count with:
$ kubectl get pods --all-namespaces | grep Running | wc -l
18
I hit this because I had exceeded the max pods per node. I found out how many pods my cluster could handle with:
$ kubectl get nodes -o json | jq -r .items[].status.allocatable.pods | paste -sd+ - | bc
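If jq isn't installed, roughly the same total can be computed with kubectl's jsonpath output and awk (just a sketch, one of several ways to do it):

# Sum allocatable pods across all nodes without jq
$ kubectl get nodes -o jsonpath='{range .items[*]}{.status.allocatable.pods}{"\n"}{end}' | awk '{sum+=$1} END {print sum}'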