Couldn't initialize a Kubernetes cluster with Vagrant

11/5/2018

I want to create a K8s cluster (1 master node and 2 slave nodes) with Vagrant on Windows 10.

I have a problem when starting my master node.

I do a sudo kubeadm init to start my master node, but the command fails.

"/etc/kubernetes/manifests/etcd.yaml" [init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests" [init] this might take a minute or longer if the control plane images have to be pulled

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime. To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker. Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'

couldn't initialize a Kubernetes cluster

I check with systemctl status kubelet that the kubelet is running:

kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Mon 2018-11-05 13:55:48 UTC; 36min ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 24683 (kubelet)
    Tasks: 18 (limit: 1135)
   CGroup: /system.slice/kubelet.service
           └─24683 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-dr

Nov 05 14:32:07 master-node kubelet[24683]: E1105 14:32:07.605330   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:07 master-node kubelet[24683]: E1105 14:32:07.710945   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:07 master-node kubelet[24683]: W1105 14:32:07.801125   24683 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Nov 05 14:32:07 master-node kubelet[24683]: E1105 14:32:07.804756   24683 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docke
Nov 05 14:32:07 master-node kubelet[24683]: E1105 14:32:07.813349   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:07 master-node kubelet[24683]: E1105 14:32:07.916319   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:08 master-node kubelet[24683]: E1105 14:32:08.030146   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:08 master-node kubelet[24683]: E1105 14:32:08.136622   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:08 master-node kubelet[24683]: E1105 14:32:08.238376   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:08 master-node kubelet[24683]: E1105 14:32:08.340852   24683 kubelet.go:2236] node "master-node" not found

and then I check the logs with journalctl -xeu kubelet:

Nov 05 14:32:39 master-node kubelet[24683]: E1105 14:32:39.328035   24683 kubelet.go:2236] node "master-node" not found
Nov 05 14:32:39 master-node kubelet[24683]: E1105 14:32:39.632382   24683 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Get https://10.0.2.15:6
Nov 05 14:32:39 master-node kubelet[24683]: E1105 14:32:39.657289   24683 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.2.
Nov 05 14:32:39 master-node kubelet[24683]: E1105 14:32:39.752441   24683 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://10.0.2.15:6443
Nov 05 14:32:39 master-node kubelet[24683]: I1105 14:32:39.804026   24683 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
Nov 05 14:32:39 master-node kubelet[24683]: I1105 14:32:39.835423   24683 kubelet_node_status.go:70] Attempting to register node master-node
Nov 05 14:32:41 master-node kubelet[24683]: I1105 14:32:41.859955   24683 kubelet_node_status.go:276] Setting node annotation to enable volume controller attach/detach
Nov 05 14:32:41 master-node kubelet[24683]: E1105 14:32:41.881897   24683 pod_workers.go:186] Error syncing pod e808f2bea99d167c3e91a819362a586b ("kube-apiserver-master-node_kube-system(e80

I don't understand the error. Should I deploy a CNI plugin (like Weave) before starting my master node?
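
For context, my plan was to apply the CNI only after kubeadm init succeeds, using something like the command from the Weave docs (I haven't actually run it yet, since init never finishes):

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"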

You can find my Vagrantfile below; maybe I forgot something:

Vagrant.configure("2") do |config|   config.vm.box = "bento/ubuntu-18.04"   config.vm.box_check_update = true   config.vm.network "public_network"   config.vm.hostname = "master-node"   config.vm.provider :virtualbox do |vb|
        vb.name = "master-node"
    end

  config.vm.provision "shell", inline: <<-SHELL

     echo "UPDATE"
     apt-get -y update

     echo "INSTALL PREREQUIER"
     apt-get install -y apt-transport-https ca-certificates curl software-properties-common

     echo "START INSTALL DOCKER"
     curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
     add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
     apt-get -y update
     apt-get install -y docker-ce
     systemctl start docker
     systemctl enable docker
     usermod -aG docker vagrant
     curl -L "https://github.com/docker/compose/releases/download/1.22.0/docker-compose-$(uname
-s)-$(uname -m)" -o /usr/local/bin/docker-compose
     chmod +x /usr/local/bin/docker-compose
     chown vagrant /var/run/docker.sock
     docker-compose --version
     docker --version
     echo "END INSTALL DOCKER"

     echo "START INSTALL KUBENETES"
     curl -s "https://packages.cloud.google.com/apt/doc/apt-key.gpg" | apt-key add -
     echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" >> /etc/apt/sources.list.d/kubernetes.list
     apt-get -y update
     swapoff -a
     sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
     apt-get install -y kubelet kubeadm kubectl
     systemctl enable kubelet
     systemctl start kubelet
     echo "END INSTALL KUBENETES"
     kubeadm config images pull # pre-download the control-plane images (FOR MASTER ONLY)

     IPADDR=`hostname -I`
     echo "This VM has IP address $IPADDR"
     SHELL
  end
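
For context, the workflow I follow with this Vagrantfile is roughly the following, nothing more exotic than that:

vagrant up          # create and provision the VM
vagrant ssh         # open a shell on the master node
sudo kubeadm init   # the command that then times out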

If I do a docker ps -a after the error, I can see two kube-apiserver containers: one is up and the other has exited.

CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS                        PORTS               NAMES
befac2364452        51a9c329b7c5           "kube-apiserver --au…"   45 seconds ago      Up 42 seconds                                     k8s_kube-apiserver_kube-apiserver-master-node_kube-system_de7285496ca374bf069328c290f65db8_2
dab8889cada8        51a9c329b7c5           "kube-apiserver --au…"   3 minutes ago       Exited (137) 46 seconds ago                       k8s_kube-apiserver_kube-apiserver-master-node_kube-system_de7285496ca374bf069328c290f65db8_1
87d74bdeb62b        3cab8e1b9802           "etcd --advertise-cl…"   5 minutes ago       Up 5 minutes                                      k8s_etcd_etcd-master-node_kube-system_2dba96180d17235a902e739497ef2f50_0
4d869d0be44f        15548c720a70           "kube-controller-man…"   5 minutes ago       Up 5 minutes                                      k8s_kube-controller-manager_kube-controller-manager-master-node_kube-system_7c81d10c743d19c292e161476cf2b945_0
1f72b9b636b4        d6d57c76136c           "kube-scheduler --ad…"   5 minutes ago       Up 5 minutes                                      k8s_kube-scheduler_kube-scheduler-master-node_kube-system_ee7b1077c61516320f4273309e9b4690_0
6116a35a7ec7        k8s.gcr.io/pause:3.1   "/pause"                 5 minutes ago       Up 5 minutes                                      k8s_POD_etcd-master-node_kube-system_2dba96180d17235a902e739497ef2f50_0
5de762296ece        k8s.gcr.io/pause:3.1   "/pause"                 5 minutes ago       Up 5 minutes                                      k8s_POD_kube-controller-manager-master-node_kube-system_7c81d10c743d19c292e161476cf2b945_0
156544886f28        k8s.gcr.io/pause:3.1   "/pause"                 5 minutes ago       Up 5 minutes                                      k8s_POD_kube-scheduler-master-node_kube-system_ee7b1077c61516320f4273309e9b4690_0
1f6c396fc6e0        k8s.gcr.io/pause:3.1   "/pause"                 5 minutes ago       Up 5 minutes                                      k8s_POD_kube-apiserver-master-node_kube-system_de7285496ca374bf069328c290f65db8_0

EDIT: If I check the logs of the k8s_kube-apiserver container that has exited, I see this:

Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I1107 10:35:23.236063       1 server.go:681] external host was not specified, using 192.168.1.49
I1107 10:35:23.237046       1 server.go:152] Version: v1.12.2
I1107 10:35:42.690715       1 plugins.go:158] Loaded 8 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,Priority,DefaultTolerationSeconds,DefaultStorageClass,MutatingAdmissionWebhook.
I1107 10:35:42.691369       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
I1107 10:35:42.705302       1 plugins.go:158] Loaded 8 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,Priority,DefaultTolerationSeconds,DefaultStorageClass,MutatingAdmissionWebhook.
I1107 10:35:42.709912       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
I1107 10:35:59.955297       1 master.go:240] Using reconciler: lease
W1107 10:36:31.566656       1 genericapiserver.go:325] Skipping API batch/v2alpha1 because it has no resources.
W1107 10:36:41.454087       1 genericapiserver.go:325] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
W1107 10:36:41.655602       1 genericapiserver.go:325] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
W1107 10:36:42.148577       1 genericapiserver.go:325] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
W1107 10:36:59.451535       1 genericapiserver.go:325] Skipping API admissionregistration.k8s.io/v1alpha1 because it has no resources.
[restful] 2018/11/07 10:37:00 log.go:33: [restful/swagger] listing is available at https://192.168.1.49:6443/swaggerapi
[restful] 2018/11/07 10:37:00 log.go:33: [restful/swagger] https://192.168.1.49:6443/swaggerui/ is mapped to folder /swagger-ui/
[restful] 2018/11/07 10:37:37 log.go:33: [restful/swagger] listing is available at https://192.168.1.49:6443/swaggerapi
[restful] 2018/11/07 10:37:37 log.go:33: [restful/swagger] https://192.168.1.49:6443/swaggerui/ is mapped to folder /swagger-ui/
I1107 10:37:38.920238       1 plugins.go:158] Loaded 8 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,Priority,DefaultTolerationSeconds,DefaultStorageClass,MutatingAdmissionWebhook.
I1107 10:37:38.920985       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.

I also notice that the k8s_kube-apiserver containers start and exit in a loop.
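
(One way to watch this restart loop, purely as an illustration for anyone reproducing it, is to poll docker every couple of seconds:)

watch -n 2 'docker ps -a | grep kube-apiserver | grep -v pause'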

Thanks a lot!

-- zlacheman
kubernetes
vagrant

1 Answer

11/5/2018

Your kubelet is running, but it looks like it can't talk to the API server.

I would check on the VM:

docker ps | grep apiserver

You should get something like this:

$ docker ps | grep api
2f15a11f65f4        dcb029b5e3ad           "kube-apiserver --au…"   2 weeks ago         Up 2 weeks                              k8s_kube-apiserver_kube-apiserver-xxxx.internal_kube-system_acd8011fdf93688f6391aaca470a1fe8_2
8a1a5ce855aa        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-apiserver-xxxx.internal_kube-system_acd8011fdf93688f6391aaca470a1fe8_2

Then look at the logs to see if you see any failures:

$ docker logs 2f15a11f65f4

If you don't see the kube-apiserver containers in plain docker ps, try docker ps -a; if they show up there as exited, it means the API server crashed at some point.
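
For example, something along these lines (the container ID is just a placeholder you'd replace with the real one):

$ docker ps -a | grep kube-apiserver | grep -v pause
$ docker logs <exited-container-id>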

Hope it helps.

-- Rico
Source: StackOverflow