Can't install third kubernetes master node: Kubelet TLS bootstrapping timeout in kubeadm join

10/5/2018

When trying to set up an HA cluster in Kubernetes 1.12 with external etcd I experienced a timeout when using the following command:

kubeadm join <load balancer>:443 --token <token> --discovery-token-ca-cert-hash sha256:3dfa042fcc28a26da9335c14802718bbc36b82bb71b4e5dfaa70c004454932da --experimental-control-plane

Output:

[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "<load balancer>:443"
[discovery] Created cluster-info discovery client, requesting info from "https://<load balancer>:443"
[discovery] Requesting info from "https://<load balancer>:443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "<load balancer>:443"
[discovery] Successfully established connection with API Server "<load balancer>:443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I1005 12:48:29.896403    8131 join.go:334] [join] running pre-flight checks before initializing the new control plane instance
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[certificates] Using the existing apiserver certificate and key.
[certificates] Using the existing apiserver-kubelet-client certificate and key.
[certificates] Using the existing front-proxy-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Using the existing sa key.
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'
timed out waiting for the condition

Two master nodes were installed successfully before I hit this error. I used this as my installation guide: https://kubernetes.io/docs/setup/independent/high-availability/#set-up-the-cluster

My Load Balancer is running on the same node that I'm trying to install the cluster on, but I don't see why it might be an issue (maybe it is?).

kubelet logs don't show me anything critical:

   kubelet[26132]: I1005 09:34:32.667360   26132 server.go:408] Version: v1.12.0
   kubelet[26132]: I1005 09:34:32.667520   26132 plugins.go:99] No cloud provider specified.
   kubelet[26132]: W1005 09:34:32.667553   26132 server.go:553] standalone mode, no API client
   kubelet[26132]: W1005 09:34:32.745120   26132 server.go:465] No api server defined - no events will be sent to API server.
   kubelet[26132]: I1005 09:34:32.745178   26132 server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
   kubelet[26132]: I1005 09:34:32.745944   26132 container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
   kubelet[26132]: I1005 09:34:32.745974   26132 container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: En
   kubelet[26132]: I1005 09:34:32.746237   26132 container_manager_linux.go:271] Creating device plugin manager: true
   kubelet[26132]: I1005 09:34:32.746368   26132 state_mem.go:36] [cpumanager] initializing new in-memory state store
   kubelet[26132]: I1005 09:34:32.747800   26132 kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
   kubelet[26132]: I1005 09:34:32.752107   26132 client.go:75] Connecting to docker on unix:///var/run/docker.sock
   kubelet[26132]: I1005 09:34:32.752172   26132 client.go:104] Start docker client with request timeout=2m0s
   kubelet[26132]: W1005 09:34:32.754889   26132 docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
   kubelet[26132]: I1005 09:34:32.754954   26132 docker_service.go:236] Hairpin mode set to "hairpin-veth"
   kubelet[26132]: W1005 09:34:32.755195   26132 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
   kubelet[26132]: W1005 09:34:32.759325   26132 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
   kubelet[26132]: I1005 09:34:32.762094   26132 docker_service.go:251] Docker cri networking managed by kubernetes.io/no-op
   kubelet[26132]: I1005 09:34:32.789329   26132 docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan nul
   kubelet[26132]: I1005 09:34:32.789503   26132 docker_service.go:269] Setting cgroupDriver to cgroupfs
   kubelet[26132]: I1005 09:34:32.820067   26132 kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
   kubelet[26132]: I1005 09:34:32.822547   26132 server.go:1013] Started kubelet
   kubelet[26132]: W1005 09:34:32.822599   26132 kubelet.go:1387] No api server defined - no node status update will be sent.
   kubelet[26132]: E1005 09:34:32.822622   26132 kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
   kubelet[26132]: I1005 09:34:32.822624   26132 server.go:133] Starting to listen on 127.0.0.1:10250
   kubelet[26132]: I1005 09:34:32.823855   26132 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
   kubelet[26132]: I1005 09:34:32.823900   26132 status_manager.go:148] Kubernetes client is nil, not starting status manager.
   kubelet[26132]: I1005 09:34:32.823919   26132 kubelet.go:1804] Starting kubelet main sync loop.
   kubelet[26132]: I1005 09:34:32.823971   26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
   kubelet[26132]: I1005 09:34:32.824016   26132 volume_manager.go:248] Starting Kubelet Volume Manager
   kubelet[26132]: I1005 09:34:32.824094   26132 desired_state_of_world_populator.go:130] Desired state populator starts to run
   kubelet[26132]: I1005 09:34:32.824656   26132 server.go:318] Adding debug handlers to kubelet server.
   kubelet[26132]: I1005 09:34:32.924253   26132 kubelet.go:1821] skipping pod synchronization - [container runtime is down]
   kubelet[26132]: I1005 09:34:33.072557   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.077937   26132 cpu_manager.go:155] [cpumanager] starting with none policy
   kubelet[26132]: I1005 09:34:33.077967   26132 cpu_manager.go:156] [cpumanager] reconciling every 10s
   kubelet[26132]: I1005 09:34:33.077976   26132 policy_none.go:42] [cpumanager] none policy: Start
   kubelet[26132]: W1005 09:34:33.078616   26132 manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
   kubelet[26132]: I1005 09:34:33.078989   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.124726   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.130955   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.136320   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.136580   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.142780   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.143667   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.224945   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-ca-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.225058   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etcd-certs-0" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etcd-certs-0") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.225200   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-etc-pki") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.325745   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flexvolume-dir" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-flexvolume-dir") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.325834   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "etc-pki" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-etc-pki") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.325890   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-kubeconfig") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326047   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kubeconfig" (UniqueName: "kubernetes.io/host-path/dd3b0cd7d636afb2b116453dc6524f26-kubeconfig") pod "kube-scheduler-" (UID: "dd3b0cd7d636afb2b116453dc6524f26")
   kubelet[26132]: I1005 09:34:33.326393   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/c01ca7b14938930625aacf5a32476dd0-k8s-certs") pod "kube-apiserver-" (UID: "c01ca7b14938930625aacf5a32476dd0")
   kubelet[26132]: I1005 09:34:33.326524   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "k8s-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-k8s-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326645   26132 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "ca-certs" (UniqueName: "kubernetes.io/host-path/92f250670b6bc27fc8b90703d1196aa3-ca-certs") pod "kube-controller-manager-" (UID: "92f250670b6bc27fc8b90703d1196aa3")
   kubelet[26132]: I1005 09:34:33.326693   26132 reconciler.go:154] Reconciler: start to sync state
   dockerd[24966]: time="2018-10-05T09:34:33.789690025+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 40806fa9041d3a65d39fdc1a68e2415f0d77f84e0c4f8c163d3bd48fec0d763f"
   kubelet[26132]: W1005 09:34:33.792727   26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/92f250670b6bc27fc8b90703d1196aa3/kube-controller-manager/0.log"
   dockerd[24966]: time="2018-10-05T09:34:33.820145872+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 19328df83a640d71faf86310d1a4052f3af42e75513d9745a2775532803ba122"
   kubelet[26132]: W1005 09:34:33.822612   26132 docker_container.go:202] Deleted previously existing symlink file: "/var/log/pods/dd3b0cd7d636afb2b116453dc6524f26/kube-scheduler/0.log"
   dockerd[24966]: time="2018-10-05T09:34:33.836511632+02:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 6b9e3036a5027b42a4340ad0779be6030593d1a10df4367c0a0ca54ff1345f16"
   kubelet[26132]: I1005 09:34:33.851661   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.865408   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:33.874766   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: W1005 09:34:34.841803   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fc349d-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
   kubelet[26132]: W1005 09:34:34.841888   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/7c7d1db45cb11bf12de2eac803da8b77/volumes" does not exist
   kubelet[26132]: W1005 09:34:34.841935   26132 kubelet_getters.go:264] Path "/var/lib/kubelet/pods/43fbcf1b-c86e-11e8-a0aa-001018759bc8/volumes" does not exist
   kubelet[26132]: I1005 09:34:34.880168   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:34.880564   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:34.880645   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:43.121992   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:34:53.165661   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   sshd[26621]: Connection closed by 172.29.2.56 port 50080 [preauth]
   kubelet[26132]: I1005 09:35:03.210021   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:35:13.252179   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
   kubelet[26132]: I1005 09:35:23.295605   26132 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach

Any ideas?

EDIT:

When comparing the kubelets on the nodes, I discovered that kubelet was started like this on the other two nodes:

kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

After the TLS timeout, I used this command on the third node which led to:

  I1005  .008343  server.go:408] Version: v1.12.0
  I1005  .008857  plugins.go:99] No cloud provider specified.
  I1005  .045644  certificate_store.go:131] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
  I1005  .134861  server.go:667] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
  I1005  .135501  container_manager_linux.go:247] container manager verified user specified cgroup-root exists: []
  I1005  .135551  container_manager_linux.go:252] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:docker CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerReconcilePeriod:10s ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms}
  I1005  .135777  container_manager_linux.go:271] Creating device plugin manager: true
  I1005  .135829  state_mem.go:36] [cpumanager] initializing new in-memory state store
  I1005  .136055  state_mem.go:84] [cpumanager] updated default cpuset: ""
  I1005  .136084  state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
  I1005  .136410  kubelet.go:279] Adding pod path: /etc/kubernetes/manifests
  I1005  .136461  kubelet.go:304] Watching apiserver
  I1005  .141009  client.go:75] Connecting to docker on unix:///var/run/docker.sock
  I1005  .141054  client.go:104] Start docker client with request timeout=2m0s
  W1005  .143351  docker_service.go:540] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
  I1005  .143395  docker_service.go:236] Hairpin mode set to "hairpin-veth"
  W1005  .143618  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  W1005  .147722  hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
  W1005  .147880  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  I1005  .147944  docker_service.go:251] Docker cri networking managed by cni
  I1005  .177322  docker_service.go:256] Docker Info: &{ID:LJUT:6WWB:WNW2:UJGM:R5HT:4POO:QL2M:PFOI:OKZN:OBP2:ODQS:SSJU Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStopped:0 Images:19 Driver:overlay2 DriverStatus:[[Backing Filesystem xfs] [Supports d_type true] [Native Overlay Diff true]] SystemStatus:[] Plugins:{Volume:[local] Network:[bridge host macvlan null overlay] Authorization:[] Log:[awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog]} MemoryLimit:true SwapLimit:true KernelMemory:true CPUCfsPeriod:true CPUCfsQuota:true CPUShares:true CPUSet:true IPv4Forwarding:true BridgeNfIptables:true BridgeNfIP6tables:true Debug:false NFd:18 OomKillDisable:true NGoroutines:27 SystemTime:2018-10-05T .158551524+02:00 LoggingDriver:json-file CgroupDriver:cgroupfs NEventsListener:0 KernelVersion:4.18.5-1.el7.elrepo.x86_64 OperatingSystem:CentOS Linux 7 (Core) OSType:linux Architecture:x86_64 IndexServerAddress:https://index.docker.io/v1/ RegistryConfig:0xc4201e65b0 NCPU:40 MemTotal:134664974336 GenericResources:[] DockerRootDir:/export/data/docker HTTPProxy: HTTPSProxy: NoProxy: Name:dax Labels:[] ExperimentalBuild:false ServerVersion:17.06.2-ce ClusterStore: ClusterAdvertise: Runtimes:map[runc:{Path:docker-runc Args:[]}] DefaultRuntime:runc Swarm:{NodeID: NodeAddr: LocalNodeState:inactive ControlAvailable:false Error: RemoteManagers:[] Nodes:0 Managers:0 Cluster:<nil>} LiveRestoreEnabled:false Isolation: InitBinary:docker-init ContainerdCommit:{ID:6e23458c129b551d5c9871e5174f6b1b7f6d1170 Expected:6e23458c129b551d5c9871e5174f6b1b7f6d1170} RuncCommit:{ID:810190ceaa507aa2727d7ae6f4790c76ec150bd2 Expected:810190ceaa507aa2727d7ae6f4790c76ec150bd2} InitCommit:{ID:949e6fa Expected:949e6fa} SecurityOptions:[name=seccomp,profile=default]}
  I1005  .177565  docker_service.go:269] Setting cgroupDriver to cgroupfs
  I1005  .211074  kuberuntime_manager.go:197] Container runtime docker initialized, version: 17.06.2-ce, apiVersion: 1.30.0
  I1005  .213560  server.go:1013] Started kubelet
  E1005  .213611  kubelet.go:1287] Image garbage collection failed once. Stats initialization may not have completed yet: failed to get imageFs info: unable to find data in memory cache
  I1005  .213712  server.go:133] Starting to listen on 0.0.0.0:10250
  I1005  .216143  fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
  I1005  .216334  status_manager.go:152] Starting to sync pod status with apiserver
  I1005  .216447  kubelet.go:1804] Starting kubelet main sync loop.
  I1005  .216962  kubelet.go:1821] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 2562047h47m16.854775807s ago; threshold is 3m0s]
  I1005  .218285  volume_manager.go:248] Starting Kubelet Volume Manager
  I1005  .218904  desired_state_of_world_populator.go:130] Desired state populator starts to run
  I1005  .220387  server.go:318] Adding debug handlers to kubelet server.
  W1005  .221605  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  E1005  .221954  kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
  E1005  .317227  kubelet.go:2236] node "dax" not found
  I1005  .317229  kubelet.go:1821] skipping pod synchronization - [container runtime is down]
  I1005  .318558  kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
  I1005  .323926  kubelet_node_status.go:70] Attempting to register node dax
  I1005  .332022  kubelet_node_status.go:73] Successfully registered node dax
  I1005  .417546  kuberuntime_manager.go:910] updating runtime config through cri with podcidr 10.244.3.0/24
  I1005  .418060  docker_service.go:345] docker cri received runtime config &RuntimeConfig{NetworkConfig:&NetworkConfig{PodCidr:10.244.3.0/24,},}
  I1005  .418505  kubelet_network.go:75] Setting Pod CIDR: -> 10.244.3.0/24
  I1005  .465985  cpu_manager.go:155] [cpumanager] starting with none policy
  I1005  .466004  cpu_manager.go:156] [cpumanager] reconciling every 10s
  I1005  .466012  policy_none.go:42] [cpumanager] none policy: Start
  W1005  .466606  manager.go:527] Failed to retrieve checkpoint for "kubelet_internal_checkpoint": checkpoint is not found
  W1005  .467018  container_manager_linux.go:803] CPUAccounting not enabled for pid: 
  W1005  .467029  container_manager_linux.go:806] MemoryAccounting not enabled for pid: 
  W1005  .467770  cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
  E1005  .467952  kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
  I1005  .520111  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-lib-modules") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520186  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-run-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-run-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520296  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "run" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-run") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520485  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-net-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-net-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520581  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy" (UniqueName: "kubernetes.io/configmap/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .520641  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-lib-modules") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .520697  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "var-lib-calico" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-var-lib-calico") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520755  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "flannel-cfg" (UniqueName: "kubernetes.io/configmap/dde7c5af-c893-11e8-a0aa-001018759bc8-flannel-cfg") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520855  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "cni-bin-dir" (UniqueName: "kubernetes.io/host-path/dde7c5af-c893-11e8-a0aa-001018759bc8-cni-bin-dir") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .520952  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "canal-token-nsdwz" (UniqueName: "kubernetes.io/secret/dde7c5af-c893-11e8-a0aa-001018759bc8-canal-token-nsdwz") pod "canal-tmm28" (UID: "dde7c5af-c893-11e8-a0aa-001018759bc8")
  I1005  .521094  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/dde74f33-c893-11e8-a0aa-001018759bc8-xtables-lock") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .521160  reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "kube-proxy-token-zjtdh" (UniqueName: "kubernetes.io/secret/dde74f33-c893-11e8-a0aa-001018759bc8-kube-proxy-token-zjtdh") pod "kube-proxy-qbkzh" (UID: "dde74f33-c893-11e8-a0aa-001018759bc8")
  I1005  .521232  reconciler.go:154] Reconciler: start to sync state
  E1005  .537905  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .574965  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .613275  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
  E1005  .656607  summary_sys_containers.go:45] Failed to get system container stats for "/user.slice/user-0.slice/session-145.scope": failed to get cgroup stats for "/user.slice/user-0.slice/session-145.scope": failed to get container info for "/user.slice/user-0.slice/session-145.scope": unknown container "/user.slice/user-0.slice/session-145.scope"
-- albrr
kubelet
kubernetes

2 Answers

1/21/2019

Thanks so much for adding your solution! This is what I did in my case:

  1. Uninstall and purge kubelet, kubeadm and kubectl.
  2. Clear /etc/systemd/system/kubelet.service.d
  3. Reinstall and retry.

On Ubuntu:

apt-get remove --purge kubelet kubeadm kubectl
rm -rf /etc/systemd/system/kubelet.service.d
apt-get install kubelet kubeadm kubectl
kubeadm join ...
-- Traz
Source: StackOverflow

10/5/2018

I found the solution myself: a config file in /etc/systemd/system/kubelet.service.d used the wrong startup parameters. Changing them resolved my problem.

The file 20-etcd-service-manager.conf containing the values

ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true

caused my problem. I changed it to

ExecStart=/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

because these were the parameters used on my other nodes. It might be even better to just delete the file so that it doesn't override any other settings.
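
For anyone hitting the same symptom, here is a rough sketch of how one might spot such a drop-in. It lists the .conf files in the kubelet drop-in directory and prints any whose ExecStart= line launches kubelet without referring to a kubeconfig at all, which would leave the kubelet in standalone mode (compare the "standalone mode, no API client" line in the logs from the question). The function name find_standalone_dropins is made up for illustration, and the check is only a heuristic; it is not something kubeadm provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubeadm): print kubelet drop-in files
# whose ExecStart= line starts kubelet without mentioning a kubeconfig.
# The default directory is the one named in the answer above; pass another
# directory as the first argument to check it instead.
find_standalone_dropins() {
    dir="${1:-/etc/systemd/system/kubelet.service.d}"
    for f in "$dir"/*.conf; do
        # Skip the unexpanded glob when the directory holds no .conf files.
        [ -f "$f" ] || continue
        # Consider only ExecStart= lines that actually invoke kubelet (an
        # empty "ExecStart=" reset line is ignored). The match is
        # case-insensitive so that variables such as
        # $KUBELET_KUBECONFIG_ARGS count as a kubeconfig reference too.
        if grep -E '^ExecStart=.*kubelet' "$f" | grep -qiv 'kubeconfig'; then
            echo "$f"
        fi
    done
}
```

Calling find_standalone_dropins with no argument checks the default directory. After editing or deleting a flagged file, systemd only picks up the change once you run systemctl daemon-reload, followed by systemctl restart kubelet.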

-- albrr
Source: StackOverflow