Kubernetes cluster master node NotReady

9/12/2017

I do not know why my master node is in NotReady status while all pods on the cluster run normally. I am using Kubernetes v1.7.5 with the Calico network plugin, and the OS version is "CentOS 7.2.1511".

# kubectl get nodes
NAME        STATUS     AGE       VERSION
k8s-node1   Ready      1h        v1.7.5
k8s-node2   NotReady   1h        v1.7.5




# kubectl get all --all-namespaces
NAMESPACE     NAME                                           READY     STATUS    RESTARTS   AGE
kube-system   po/calico-node-11kvm                           2/2       Running   0          33m
kube-system   po/calico-policy-controller-1906845835-1nqjj   1/1       Running   0          33m
kube-system   po/calicoctl                                   1/1       Running   0          33m
kube-system   po/etcd-k8s-node2                              1/1       Running   1          15m
kube-system   po/kube-apiserver-k8s-node2                    1/1       Running   1          15m
kube-system   po/kube-controller-manager-k8s-node2           1/1       Running   2          15m
kube-system   po/kube-dns-2425271678-2mh46                   3/3       Running   0          1h
kube-system   po/kube-proxy-qlmbx                            1/1       Running   1          1h
kube-system   po/kube-proxy-vwh6l                            1/1       Running   0          1h
kube-system   po/kube-scheduler-k8s-node2                    1/1       Running   2          15m

NAMESPACE     NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       svc/kubernetes   10.96.0.1    <none>        443/TCP         1h
kube-system   svc/kube-dns     10.96.0.10   <none>        53/UDP,53/TCP   1h

NAMESPACE     NAME                              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   deploy/calico-policy-controller   1         1         1            1           33m
kube-system   deploy/kube-dns                   1         1         1            1           1h

NAMESPACE     NAME                                     DESIRED   CURRENT   READY     AGE
kube-system   rs/calico-policy-controller-1906845835   1         1         1         33m
kube-system   rs/kube-dns-2425271678                   1         1         1         1h

Update

It seems the master node cannot recognize the Calico network plugin. I used kubeadm to install the cluster, and kubeadm starts etcd listening on 127.0.0.1:2379 on the master node, so Calico on the other nodes cannot talk to etcd. I therefore modified etcd.yaml as follows, and now all the Calico pods run fine. I am not very familiar with Calico, so how do I fix this properly?

apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --listen-client-urls=http://127.0.0.1:2379,http://10.161.233.80:2379
    - --advertise-client-urls=http://10.161.233.80:2379
    - --data-dir=/var/lib/etcd
    image: gcr.io/google_containers/etcd-amd64:3.0.17
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2379
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: certs
    - mountPath: /var/lib/etcd
      name: etcd
    - mountPath: /etc/kubernetes
      name: k8s
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/ssl/certs
    name: certs
  - hostPath:
      path: /var/lib/etcd
    name: etcd
  - hostPath:
      path: /etc/kubernetes
    name: k8s
status: {}
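With etcd now listening on the node IP as well, one way to check that the change actually helps (a sketch; 10.161.233.80 is the master's InternalIP shown in the node description below) is to query the same /health endpoint the liveness probe uses, but from one of the other nodes:

# curl http://10.161.233.80:2379/health

If etcd is reachable it should report itself as healthy; if the connection is refused, Calico on the other nodes still cannot reach it.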

[root@k8s-node2 calico]# kubectl describe node k8s-node2
Name:                   k8s-node2
Role:
Labels:                 beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/os=linux
                        kubernetes.io/hostname=k8s-node2
                        node-role.kubernetes.io/master=
Annotations:            node.alpha.kubernetes.io/ttl=0
                        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:                 node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:      Tue, 12 Sep 2017 15:20:57 +0800
Conditions:
  Type                  Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----                  ------  -----------------                       ------------------                      ------                          -------
  OutOfDisk             False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  MemoryPressure        False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure          False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletHasNoDiskPressure        kubelet has no disk pressure
  Ready                 False   Wed, 13 Sep 2017 10:25:58 +0800         Tue, 12 Sep 2017 15:20:57 +0800         KubeletNotReady                 runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:   10.161.233.80
  Hostname:     k8s-node2
Capacity:
 cpu:           2
 memory:        3618520Ki
 pods:          110
Allocatable:
 cpu:           2
 memory:        3516120Ki
 pods:          110
System Info:
 Machine ID:                    3c6ff97c6fbe4598b53fd04e08937468
 System UUID:                   C6238BF8-8E60-4331-AEEA-6D0BA9106344
 Boot ID:                       84397607-908f-4ff8-8bdc-ff86c364dd32
 Kernel Version:                3.10.0-514.6.2.el7.x86_64
 OS Image:                      CentOS Linux 7 (Core)
 Operating System:              linux
 Architecture:                  amd64
 Container Runtime Version:     docker://1.12.6
 Kubelet Version:               v1.7.5
 Kube-Proxy Version:            v1.7.5
PodCIDR:                        10.68.0.0/24
ExternalID:                     k8s-node2
Non-terminated Pods:            (5 in total)
  Namespace                     Name                                            CPU Requests    CPU Limits      Memory Requests Memory Limits
  ---------                     ----                                            ------------    ----------      --------------- -------------
  kube-system                   etcd-k8s-node2                                  0 (0%)          0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-apiserver-k8s-node2                        250m (12%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-controller-manager-k8s-node2               200m (10%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-proxy-qlmbx                                0 (0%)          0 (0%)          0 (0%)          0 (0%)
  kube-system                   kube-scheduler-k8s-node2                        100m (5%)       0 (0%)          0 (0%)          0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  550m (27%)    0 (0%)          0 (0%)          0 (0%)
Events:         <none>
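The Ready condition above reports "cni config uninitialized". The kubelet looks for a CNI configuration in /etc/cni/net.d (assuming the default CNI directory), so a quick check on the master is:

# ls /etc/cni/net.d/

If no Calico pod is running on the master, this directory stays empty and the kubelet keeps reporting the node as NotReady.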
-- user1208081
kubernetes

3 Answers

9/13/2017

I think you may need to add tolerations and update the annotations for calico-node in the manifest you are using so that it can run on a master created by kubeadm. Kubeadm taints the master so that pods cannot run on it unless they have a toleration for that taint.

I believe you are using the https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/calico.yaml manifest, which has the annotations (including tolerations) for K8s v1.5. You should check https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml instead, which has the toleration syntax for K8s v1.6+.

Here is a snippet from the above manifest with the annotations and tolerations:

metadata:
  labels:
    k8s-app: calico-node
  annotations:
    # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
    # reserves resources for critical add-on pods so that they can be rescheduled after
    # a failure.  This annotation works in tandem with the toleration below.
    scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
  hostNetwork: true
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
  # This, along with the annotation above marks this pod as a critical add-on.
  - key: CriticalAddonsOnly
    operator: Exists
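To confirm the taint that is blocking calico-node on the master, and to switch to the kubeadm-specific manifest, something like the following should work (a sketch; substitute your own node name):

kubectl describe node k8s-node2 | grep Taints
kubectl apply -f https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml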
-- Erik Stidham
Source: StackOverflow

9/12/2017

It's good practice to run a describe command in order to see what's wrong with your node:

kubectl describe nodes <NODE_NAME>

e.g. kubectl describe nodes k8s-node2

You should be able to start your investigation from there and add more info to this question if needed.
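For example, to pull out just the Ready condition message without scanning the whole output, a jsonpath query like this (a sketch) also works:

kubectl get node k8s-node2 -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'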

-- AR1
Source: StackOverflow

9/14/2018

You need to install a network policy provider; one of the supported providers is Weave Net for NetworkPolicy. Command line to install it:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

After a few seconds, a Weave Net pod should be running on each Node and any further pods you create will be automatically attached to the Weave network.
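To verify, you can list the Weave Net pods (assuming the name=weave-net label used by the Weave Net manifest) and then re-check the node status:

kubectl get pods -n kube-system -l name=weave-net -o wide
kubectl get nodes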

-- Alex
Source: StackOverflow