kube-system: Pod Warning FailedScheduling default-scheduler no nodes available to schedule pods

11/19/2018

Why am I getting:

kube-system 1m 1h 245 kube-dns-fcd468cb-8fhg2.156899dbda62d287 Pod Warning FailedScheduling default-scheduler no nodes available to schedule pods

UPDATE - I've now migrated the entire cluster to us-west-2 rather than eu-west-1 so I can run the example code out of the box and avoid introducing any errors of my own. The tfstate file shows that the correct EKS AMI is being referenced.

E.g.

720: "image_id": "ami-00c3b2d35bddd4f5c",

FWIW, I'm following along to https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html and using the code it links to in Github - i.e. https://github.com/terraform-providers/terraform-provider-aws/tree/master/examples/eks-getting-started

Note: looking in EC2 Instances, I can see 2 EKS nodes running with the correct AMI IDs.
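The same check can be scripted from the CLI (this assumes the AWS CLI is configured for us-west-2 and simply lists all running instances, not just the EKS workers):

$ aws ec2 describe-instances --region us-west-2 \
    --filters "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].[InstanceId,ImageId,PrivateIpAddress]" \
    --output table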

==== UPDATES

Checking nodes:

kubectl get nodes
No resources found.
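To rule out kubectl still pointing at the old eu-west-1 cluster, I can also compare the current context against the cluster endpoint (terraform-eks-demo is the default cluster name in the example, so adjust if needed):

$ kubectl config current-context
$ kubectl cluster-info
$ aws eks describe-cluster --name terraform-eks-demo --region us-west-2 --query cluster.endpoint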

SSHing into one of the nodes and running journalctl shows:

Nov 21 12:28:25 ip-10-0-0-247.us-west-2.compute.internal kubelet[4417]: E1121 12:28:25.419465    4417 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:461: Failed to list *v1.Node: Unauthorized
Nov 21 12:28:25 ip-10-0-0-247.us-west-2.compute.internal kubelet[4417]: E1121 12:28:25.735882    4417 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:452: Failed to list *v1.Service: Unauthorized
Nov 21 12:28:26 ip-10-0-0-247.us-west-2.compute.internal kubelet[4417]: E1121 12:28:26.237953    4417 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Unauthorized
Nov 21 12:28:26 ip-10-0-0-247.us-west-2.compute.internal kubelet[4417]: W1121 12:28:26.418327    4417 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 21 12:28:26 ip-10-0-0-247.us-west-2.compute.internal kubelet[4417]: E1121 12:28:26.418477    4417 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: n
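The Unauthorized errors suggest the API server is rejecting the kubelet's IAM credentials. From the node itself I can see which role the instance is actually using (this is just the standard EC2 instance metadata endpoint):

$ curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

That role should be the worker node role that the cluster's aws-auth ConfigMap maps into the system:bootstrappers and system:nodes groups.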

Given that auth may be the issue, I checked the Terraform code, which seems to be correct. E.g.:

https://github.com/terraform-providers/terraform-provider-aws/blob/master/examples/eks-getting-started/outputs.tf#L9-L20
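If that aws-auth ConfigMap never made it onto the cluster, the workers would show exactly this kind of Unauthorized failure, so the step from the getting-started guide worth re-running is (the output name config_map_aws_auth comes from the linked outputs.tf):

$ terraform output config_map_aws_auth > config_map_aws_auth.yaml
$ kubectl apply -f config_map_aws_auth.yaml
$ kubectl get configmap aws-auth -n kube-system -o yaml

The role ARN in that ConfigMap should match the role reported by the instance metadata above.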

Any way I can test this in a bit more detail? Or any further suggestions?

-- Snowcrash
amazon-eks
kubernetes

1 Answer

11/19/2018

I'm guessing you don't have any nodes registered with your cluster. Just because the EC2 nodes are up doesn't mean that your cluster is able to use them. You can check with:

$ kubectl get nodes

Another possibility is that your nodes are available but they don't have enough resources (which is unlikely).
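You can get a rough view of how much is already requested on each node with (the grep window is approximate; the full describe output has the details):

$ kubectl describe nodes | grep -A 5 "Allocated resources"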

Another possibility is that your nodes are tainted with something like this:

$ kubectl taint node node1 key=value:NoSchedule

You can check and remove it:

$ kubectl describe node node1
$ kubectl taint node node1 key:NoSchedule-
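To scan all nodes for taints in one go, something like this should work:

$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'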

Another possibility is that you have a nodeSelector in your pod spec but your nodes aren't labeled to match it. Check with:

$ kubectl get nodes --show-labels
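For example, you can see what the pod is asking for and then label a node to match (the disktype=ssd label here is just a placeholder):

$ kubectl get pod kube-dns-fcd468cb-8fhg2 -n kube-system -o jsonpath='{.spec.nodeSelector}'
$ kubectl label node node1 disktype=ssd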
-- Rico
Source: StackOverflow