I created an AKS cluster with the az aks create command, using the kubenet network plugin and 2 nodes. Because of a permissions issue with the AD account, the NSG had to be temporarily disabled before running az aks create; after the cluster was created successfully, the NSG was reapplied.
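For reference, the create command was roughly of this shape (the resource group, cluster name and subnet ID below are placeholders, not the actual values used):

az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 2 \
  --network-plugin kubenet \
  --vnet-subnet-id <subnet-resource-id> \
  --generate-ssh-keys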
To check the health of the newly created cluster, I run:
kubectl get nodes
but no nodes are returned. However, when I look at the corresponding vNet in the Azure portal, I can see 2 VMSS instances created with IPs from the subnet range. When I run:
kubectl get pods --all-namespaces
all pods are in the Pending state:
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE
kube-system   coredns-xxxxdxxxxx-xxxxx               0/1     Pending   0          5h
kube-system   coredns-autoscaler-xxdxxxxxxxx-xxxx    0/1     Pending   0          5h
kube-system   kubernetes-dashboard-xxdxxxxxx-xxxxx   0/1     Pending   0          5h
kube-system   metrics-server-xxxxxxxdxx-xxxx         0/1     Pending   0          5h
kube-system   omsagent-rs-xxxxxxxxdx-xxxxx           0/1     Pending   0          5h
kube-system   tiller-deploy-xxxxxxxdxxx-xxxx         0/1     Pending   0          34m
kube-system   tunnelfront-xxxxxxxdx-xxxxx            0/1     Pending   0          5h
I then did a describe on the coredns pod:
kubectl describe pod coredns-xxxxxxxxxx-xxxx -n kube-system
Warning FailedScheduling 2m40s (x2242 over 2d5h) default-scheduler no nodes available to schedule pods
I need to deploy some containers using Helm/Tiller, and when I run the installation commands I get the error:
Error: could not find a ready tiller pod
I know this is not directly related to the Helm/Tiller installation; the root cause probably lies a bit deeper.
I am new to Kubernetes, so any thoughts on how to diagnose the issue would be much appreciated.
Start by checking the Tiller pod's logs:
kubectl logs --namespace kube-system tiller-deploy-xxxxxxxdxxx-xxxx
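Note that while a pod is still Pending there are no container logs to show yet, so the scheduling events are usually more telling (generic commands, not specific to this cluster):

kubectl describe pod --namespace kube-system tiller-deploy-xxxxxxxdxxx-xxxx
kubectl get events --namespace kube-system --sort-by='.lastTimestamp'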
As stated in the comments below, there are no nodes and all the pods are in the Pending state according to your output, so as recommended there, you need to delete the cluster and recreate it.
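Roughly, with placeholder resource group and cluster name (and with the NSG/permissions issue sorted out first, so the nodes can register this time):

az aks delete --resource-group myResourceGroup --name myAKSCluster --yes
az aks create --resource-group myResourceGroup --name myAKSCluster --node-count 2 --network-plugin kubenet --generate-ssh-keys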
If no nodes are returned from kubectl get nodes, I'd suggest recreating the cluster, since with no nodes no pods can ever run on it. Alternatively, you could try upgrading the cluster to a newer Kubernetes version (which effectively redeploys the nodes); that might help.
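For example, with placeholder names again (pick a target version from the get-upgrades output):

az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --output table
az aks upgrade --resource-group myResourceGroup --name myAKSCluster --kubernetes-version <version>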