calico-kube-controllers and coredns pods are in Pending state

7/15/2019

I am trying to deploy a Kubernetes cluster. My master node is up and running, but some pods are stuck in the Pending state. Below is the output of kubectl get pods.

NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE   IP       NODE                NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-65b4876956-29tj9    0/1     Pending   0          9h    <none>   <none>              <none>           <none>
kube-system   calico-node-bf25l                           2/2     Running   2          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   coredns-7d6cf57b54-b55zw                    0/1     Pending   0          9h    <none>   <none>              <none>           <none>
kube-system   coredns-7d6cf57b54-bk6j5                    0/1     Pending   0          12m   <none>   <none>              <none>           <none>
kube-system   kube-apiserver-master-0-eccdtest            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-controller-manager-master-0-eccdtest   1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-proxy-jhfjj                            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   kube-scheduler-master-0-eccdtest            1/1     Running   1          9h    <none>   master-0-eccdtest   <none>           <none>
kube-system   openstack-cloud-controller-manager-tlp4m    1/1     CrashLoopBackOff   114        9h    <none>   master-0-eccdtest   <none>           <none>

When I try to check the pod logs, I get the error below.

Error from server: no preferred addresses found; known addresses: []

kubectl get events shows a lot of warnings.

NAMESPACE     LAST SEEN   TYPE      REASON                    KIND   MESSAGE
default       23m         Normal    Starting                  Node   Starting kubelet.
default       23m         Normal    NodeHasSufficientMemory   Node   Node master-0-eccdtest status is now: NodeHasSufficientMemory
default       23m         Normal    NodeHasNoDiskPressure     Node   Node master-0-eccdtest status is now: NodeHasNoDiskPressure
default       23m         Normal    NodeHasSufficientPID      Node   Node master-0-eccdtest status is now: NodeHasSufficientPID
default       23m         Normal    NodeAllocatableEnforced   Node   Updated Node Allocatable limit across pods
default       23m         Normal    Starting                  Node   Starting kube-proxy.
default       23m         Normal    RegisteredNode            Node   Node master-0-eccdtest event: Registered Node master-0-eccdtest in Controller
kube-system   26m         Warning   FailedScheduling          Pod    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   3m15s       Warning   FailedScheduling          Pod    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   25m         Warning   DNSConfigForming          Pod    Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m         Normal    SandboxChanged            Pod    Pod sandbox changed, it will be killed and re-created.
kube-system   23m         Normal    Pulled                    Pod    Container image "registry.eccd.local:5000/node:v3.6.1-26684321" already present on machine
kube-system   23m         Normal    Created                   Pod    Created container
kube-system   23m         Normal    Started                   Pod    Started container
kube-system   23m         Normal    Pulled                    Pod    Container image "registry.eccd.local:5000/cni:v3.6.1-26684321" already present on machine
kube-system   23m         Normal    Created                   Pod    Created container
kube-system   23m         Normal    Started                   Pod    Started container
kube-system   23m         Warning   Unhealthy                 Pod    Readiness probe failed: Threshold time for bird readiness check:  30s
calico/node is not ready: felix is not ready: Get http://localhost:9099/readiness: dial tcp [::1]:9099: connect: connection refused
kube-system   23m     Warning   Unhealthy          Pod          Liveness probe failed: Get http://localhost:9099/liveness: dial tcp [::1]:9099: connect: connection refused
kube-system   26m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   3m15s   Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   105s    Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   26m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   22m     Warning   FailedScheduling   Pod          0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
kube-system   21m     Warning   FailedScheduling   Pod          skip schedule deleting pod: kube-system/coredns-7d6cf57b54-w95g4
kube-system   21m     Normal    SuccessfulCreate   ReplicaSet   Created pod: coredns-7d6cf57b54-bk6j5
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-apiserver:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-controller-manager:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   23m     Normal    LeaderElection     Endpoints    master-0-eccdtest_ed8f0ece-a6cd-11e9-9dd7-fa163e182aab became leader
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-proxy:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   23m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/kube-scheduler:v1.13.5-1-80cc0db3" already present on machine
kube-system   23m     Normal    Created            Pod          Created container
kube-system   23m     Normal    Started            Pod          Started container
kube-system   23m     Normal    LeaderElection     Endpoints    master-0-eccdtest_ee2520c1-a6cd-11e9-96a3-fa163e182aab became leader
kube-system   26m     Warning   DNSConfigForming   Pod          Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 10.96.0.10 10.51.40.100 10.51.40.103
kube-system   36m     Warning   BackOff            Pod          Back-off restarting failed container
kube-system   23m     Normal    SandboxChanged     Pod          Pod sandbox changed, it will be killed and re-created.
kube-system   20m     Normal    Pulled             Pod          Container image "registry.eccd.local:5000/openstack-cloud-controller-manager:v1.14.0-1-11023d82" already present on machine
kube-system   20m     Normal    Created            Pod          Created container
kube-system   20m     Normal    Started            Pod          Started container
kube-system   3m20s   Warning   BackOff            Pod          Back-off restarting failed container

The only nameserver in resolv.conf is

nameserver 10.96.0.10

I have searched Google extensively for these problems but did not find a working solution. Any suggestions would be appreciated.

TIA

-- vidyadhar reddy
kubernetes

2 Answers

9/16/2019

I have figured out the issue. I did not have access to my cloud controller's FQDN from my master nodes. I added a DNS entry to /etc/resolv.conf on the master and it worked.
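For reference, a minimal sketch of that kind of change (the 10.51.40.100 address is taken from the DNSConfigForming events above and the FQDN is a placeholder; adjust both to your environment):

# Append a nameserver that can resolve the cloud controller FQDN
echo "nameserver 10.51.40.100" | sudo tee -a /etc/resolv.conf

# Verify the FQDN now resolves from the master (placeholder hostname)
nslookup <cloud-controller-fqdn>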

-- vidyadhar reddy
Source: StackOverflow

7/15/2019

Your main issue here is the 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate warning message. You are getting it because of the node-role.kubernetes.io/master:NoSchedule and node.kubernetes.io/not-ready:NoSchedule taints.

These taints prevent pods from being scheduled on the current node.
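You can confirm which taints are currently set on the node with standard kubectl commands, for example:

kubectl describe node master-0-eccdtest | grep -i taints
# or, as structured output
kubectl get node master-0-eccdtest -o jsonpath='{.spec.taints}'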

If you want to be able to schedule pods on the control-plane node, e.g. for a single-machine Kubernetes cluster for development, run:

kubectl taint nodes instance-1 node-role.kubernetes.io/master-
kubectl taint nodes instance-1 node.kubernetes.io/not-ready:NoSchedule-
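In your cluster the node name is master-0-eccdtest, so the equivalent commands would be as below (the trailing minus removes the taint; the not-ready taint is normally cleared automatically once the CNI is healthy and the node reports Ready):

kubectl taint nodes master-0-eccdtest node-role.kubernetes.io/master-
kubectl taint nodes master-0-eccdtest node.kubernetes.io/not-ready:NoSchedule-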

But from my point of view it is better to:

- initiate the cluster using kubeadm

- apply a CNI

- add a new worker node

- let all your new pods be scheduled on the worker node.

sudo kubeadm init --pod-network-cidr=192.168.0.0/16
[init] Using Kubernetes version: v1.15.0
...

Your Kubernetes control-plane has initialized successfully!

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config


$ kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created


- Add the worker node by running the kubeadm join command (printed at the end of kubeadm init) on the worker node; a sketch is shown below.
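If you no longer have the join string from kubeadm init, it can be regenerated on the master. A sketch with placeholder values:

# On the master: print a fresh join command
sudo kubeadm token create --print-join-command

# On the worker: run the printed command, which looks roughly like
sudo kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>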

$ kubectl get nodes
NAME         STATUS   ROLES    AGE   VERSION
instance-1   Ready    master   21m   v1.15.0
instance-2   Ready    <none>   34s   v1.15.0

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE         NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-658558ddf8-v2rqx   1/1     Running   0          11m    192.168.23.129   instance-1   <none>           <none>
kube-system   calico-node-c2tkt                          1/1     Running   0          11m    10.132.0.36      instance-1   <none>           <none>
kube-system   calico-node-dhc66                          1/1     Running   0          107s   10.132.0.38      instance-2   <none>           <none>
kube-system   coredns-5c98db65d4-dqjm7                   1/1     Running   0          22m    192.168.23.130   instance-1   <none>           <none>
kube-system   coredns-5c98db65d4-hh7vd                   1/1     Running   0          22m    192.168.23.131   instance-1   <none>           <none>
kube-system   etcd-instance-1                            1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-apiserver-instance-1                  1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-controller-manager-instance-1         1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-proxy-qwvkq                           1/1     Running   0          107s   10.132.0.38      instance-2   <none>           <none>
kube-system   kube-proxy-s9gng                           1/1     Running   0          22m    10.132.0.36      instance-1   <none>           <none>
kube-system   kube-scheduler-instance-1                  1/1     Running   0          21m    10.132.0.36      instance-1   <none>           <none>
-- VKR
Source: StackOverflow