Kubeadm Failed to create SubnetManager: error retrieving pod spec for kube-system

10/2/2019

No matter what I do it seems I cannot get rid of this problem. I have installed Kubernetes using kubeadm many times quite successfully however adding a v1.16.0 node is giving me a heck of a headache.

O/S: Ubuntu 18.04.3 LTS
Kubernetes version: v1.16.0
Kubeadm version: Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-18T14:34:01Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"

A query of the cluster shows:

NAME                  STATUS                     ROLES    AGE     VERSION
kube-apiserver-1      Ready                      master   110d    v1.15.0
kube-apiserver-2      Ready                      master   110d    v1.15.0
kube-apiserver-3      Ready                      master   110d    v1.15.0
kube-node-1           Ready                      <none>   110d    v1.15.0
kube-node-2           Ready                      <none>   110d    v1.15.0
kube-node-3           Ready                      <none>   110d    v1.15.0
kube-node-4           Ready                      <none>   110d    v1.16.0
kube-node-5           Ready,SchedulingDisabled   <none>   3m28s   v1.16.0
kube-node-databases   Ready                      <none>   110d    v1.15.0
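(The Ready,SchedulingDisabled status on kube-node-5 above is what cordoning a node produces; a sketch, assuming the node name from the listing:)

```shell
# Mark the node unschedulable so no new pods are placed on it
# (existing pods, including DaemonSet pods like flannel, keep running).
kubectl cordon kube-node-5

# To re-enable scheduling later:
kubectl uncordon kube-node-5
```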

I have temporarily disabled scheduling to the node until I can fix this problem. A query of the pod status in the kube-system namespace shows the problem:

$ kubectl -n kube-system get pods

NAME                                       READY   STATUS             RESTARTS   AGE
coredns-fb8b8dccf-55zjs                    1/1     Running            128        21d
coredns-fb8b8dccf-kzrpc                    1/1     Running            144        21d
kube-flannel-ds-amd64-29xp2                1/1     Running            11         110d
kube-flannel-ds-amd64-hp7nq                1/1     Running            14         110d
kube-flannel-ds-amd64-hvdpf                0/1     CrashLoopBackOff   5          8m28s
kube-flannel-ds-amd64-jhhlk                1/1     Running            11         110d
kube-flannel-ds-amd64-k6dzc                1/1     Running            2          110d
kube-flannel-ds-amd64-lccxl                1/1     Running            21         110d
kube-flannel-ds-amd64-nnn7g                1/1     Running            14         110d
kube-flannel-ds-amd64-shss5                1/1     Running            7          110d

$ kubectl -n kube-system logs -f kube-flannel-ds-amd64-hvdpf

I1002 01:13:22.136379       1 main.go:514] Determining IP address of default interface
I1002 01:13:22.136823       1 main.go:527] Using interface with name ens3 and address 192.168.5.46
I1002 01:13:22.136849       1 main.go:544] Defaulting external address to interface address (192.168.5.46)
E1002 01:13:52.231471       1 main.go:241] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-hvdpf': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-hvdpf: dial tcp 10.96.0.1:443: i/o timeout

Although my searches turned up a few hits on iptables issues and kernel routing, I don't understand why previous versions installed without a hitch while this version is giving me so much trouble.

I have installed and destroyed this node quite a few times, yet the result is always the same.

Anyone else having this issue or has a solution?

-- Daniel Maldonado
flannel
kubeadm
kubernetes

2 Answers

10/7/2019

According to the documentation on the version skew policy:

kubelet

kubelet must not be newer than kube-apiserver, and may be up to two minor versions older.

Example:

  • kube-apiserver is at 1.13
  • kubelet is supported at 1.13, 1.12, and 1.11

That means a worker node running v1.16.0 is not supported with master nodes running v1.15.0.

To fix this issue I recommend reinstalling the node with version v1.15.0 to match the rest of the cluster.
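A minimal sketch of reinstalling the node at the matching version; the endpoint, token, and CA hash below are placeholders you would get from running kubeadm token create --print-join-command on a master:

```shell
# From a machine with cluster access: evict workloads and remove the node.
kubectl drain kube-node-5 --ignore-daemonsets --delete-local-data
kubectl delete node kube-node-5

# On the node itself: wipe kubeadm state.
sudo kubeadm reset

# Pin kubelet/kubeadm/kubectl to the cluster's version before rejoining.
sudo apt-get install -y --allow-downgrades \
    kubeadm=1.15.0-00 kubelet=1.15.0-00 kubectl=1.15.0-00

# Rejoin using the join command printed on a master
# (endpoint, token, and hash are placeholders).
sudo kubeadm join <master-endpoint>:6443 \
    --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>
```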

Alternatively, you can upgrade the whole cluster to v1.16.1; however, there are currently some known problems running flannel as the network plugin on that version. Please review this guide from the documentation before proceeding.

-- Piotr Malec
Source: StackOverflow

2/6/2020

This occurs when the pod is not able to look up the API server host. Add the following environment variables after the name: POD_NAMESPACE entry:

- name: KUBERNETES_SERVICE_HOST
  value: "10.220.64.186" # IP address of the host where the kube-apiserver is running
- name: KUBERNETES_SERVICE_PORT
  value: "6443"
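For context, in the stock kube-flannel.yml manifest the container's env section already defines POD_NAME and POD_NAMESPACE via the downward API; the two added variables slot in right after them. This is a sketch of the resulting section, with placeholder address and port for your own apiserver:

```yaml
env:
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: POD_NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
# Added so flannel reaches the apiserver directly instead of via the
# 10.96.0.1 ClusterIP (placeholder address and port below).
- name: KUBERNETES_SERVICE_HOST
  value: "192.168.5.1"   # placeholder: your kube-apiserver address
- name: KUBERNETES_SERVICE_PORT
  value: "6443"
```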
-- Ren Roz
Source: StackOverflow