Unable to create Kubernetes with HA etc on Ubuntu 20.04

1/6/2021

For two days I have been fighting with a Kubernetes setup on Ubuntu 20.04. I created a so-called template VM on vSphere and cloned three VMs from it.

I have the following configuration on each master node:

/etc/hosts

127.0.0.1 localhost
127.0.1.1 kubernetes-master1 

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
192.168.255.200 kubernetes-cluster.homelab01.local
192.168.255.201 kubernetes-master1.homelab01.local
192.168.255.202 kubernetes-master2.homelab01.local
192.168.255.203 kubernetes-master3.homelab01.local
192.168.255.204 kubernetes-worker1.homelab01.local
192.168.255.205 kubernetes-worker2.homelab01.local
192.168.255.206 kubernetes-worker3.homelab01.local

The 127.0.1.1 line reads 127.0.1.1 kubernetes-master1 on the first master, 127.0.1.1 kubernetes-master2 on the second one and 127.0.1.1 kubernetes-master3 on the third one.
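As a sanity check on the hosts file, here is a minimal sketch that looks up each cluster name in a hosts-format file and compares it with the addresses listed above. The function names hosts_ip and check_cluster_hosts are made up for this illustration; it only reads the file, it does not go through the system resolver.

```shell
#!/bin/sh
# Print the IP that a hosts-format file maps NAME to (first match wins).
# Usage: hosts_ip NAME [FILE]   (FILE defaults to /etc/hosts)
hosts_ip() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # ignore trailing comments
                if ($i == name) { print $1; exit }
            }
        }
    ' "${2:-/etc/hosts}"
}

# Compare every cluster name from the question with its expected address.
# Returns non-zero and reports on stderr if any name is missing or differs.
check_cluster_hosts() {
    file="${1:-/etc/hosts}"
    rc=0
    while read -r ip name; do
        got=$(hosts_ip "$name" "$file")
        if [ "$got" != "$ip" ]; then
            echo "MISMATCH: $name -> '$got' (expected $ip)" >&2
            rc=1
        fi
    done <<EOF
192.168.255.200 kubernetes-cluster.homelab01.local
192.168.255.201 kubernetes-master1.homelab01.local
192.168.255.202 kubernetes-master2.homelab01.local
192.168.255.203 kubernetes-master3.homelab01.local
192.168.255.204 kubernetes-worker1.homelab01.local
192.168.255.205 kubernetes-worker2.homelab01.local
192.168.255.206 kubernetes-worker3.homelab01.local
EOF
    return $rc
}
```

Running check_cluster_hosts on each node should print nothing. Note that hosts_ip kubernetes-master1 (the short name) returns 127.0.1.1 with the file shown above, because only the fully qualified names are mapped to the 192.168.255.x addresses.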

I am using Docker 19.03.11, which according to the documentation is the latest version supported by Kubernetes.

Docker

Client: Docker Engine - Community
 Version:           19.03.11
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        42e35e61f3
 Built:             Mon Jun  1 09:12:34 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.11
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       42e35e61f3
  Built:            Mon Jun  1 09:11:07 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

I used the following command to install Docker:

sudo apt-get update && sudo apt-get install -y \
  containerd.io=1.2.13-2 \
  docker-ce=5:19.03.11~3-0~ubuntu-$(lsb_release -cs) \
  docker-ce-cli=5:19.03.11~3-0~ubuntu-$(lsb_release -cs)

I put all the necessary packages on hold.

sudo apt-mark hold kubelet kubeadm kubectl docker-ce containerd.io docker-ce-cli
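To confirm that the hold took effect on every node, the output of apt-mark showhold can be compared with the same package list. A minimal sketch; missing_holds is a hypothetical helper name, not part of apt:

```shell
#!/bin/sh
# Read a hold list (one package per line, as printed by `apt-mark showhold`)
# on stdin and print every required package that is NOT in it.
missing_holds() {
    held=$(cat)
    for pkg in kubelet kubeadm kubectl docker-ce containerd.io docker-ce-cli; do
        # -x: match the whole line, -F: treat the name as a fixed string
        printf '%s\n' "$held" | grep -qxF "$pkg" || echo "$pkg"
    done
}

# On a node:
#   apt-mark showhold | missing_holds
# No output means all six packages are held.
```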

Some details about the VMs.

Master1

sudo cat /sys/class/dmi/id/product_uuid
f09c3242-c8f7-c97e-bc6a-b2065c286ea9
IP: 192.168.255.201

Master2

sudo cat /sys/class/dmi/id/product_uuid
b4fe3242-ba37-a533-c12f-b30b735cbe9f
IP: 192.168.255.202

Master3

sudo cat /sys/class/dmi/id/product_uuid
c3cc3242-4115-8c38-8e46-166190620249
IP: 192.168.255.203
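Since the VMs are clones of one template, it is worth confirming that no two of them share a product_uuid (the kubeadm installation guide asks for a unique product_uuid and MAC address on every node). A small sketch that flags duplicates; duplicate_uuids is a name invented for this example:

```shell
#!/bin/sh
# Read "node-name uuid" pairs on stdin and print every UUID that occurs
# more than once (compared case-insensitively). No output means all unique.
duplicate_uuids() {
    awk '
        NF >= 2 { count[tolower($2)]++ }
        END     { for (u in count) if (count[u] > 1) print u }
    '
}

# Possible way to collect the pairs (assumes ssh access to each node):
#   for h in kubernetes-master1 kubernetes-master2 kubernetes-master3; do
#       printf '%s ' "$h"; ssh "$h" sudo cat /sys/class/dmi/id/product_uuid
#   done | duplicate_uuids
```

For the three values shown above the function prints nothing, so the clones did receive distinct UUIDs.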

IP addresses and name resolution work flawlessly on all hosts:

192.168.255.200 kubernetes-cluster.homelab01.local
192.168.255.201 kubernetes-master1.homelab01.local
192.168.255.202 kubernetes-master2.homelab01.local
192.168.255.203 kubernetes-master3.homelab01.local
192.168.255.204 kubernetes-worker1.homelab01.local
192.168.255.205 kubernetes-worker2.homelab01.local
192.168.255.206 kubernetes-worker3.homelab01.local

Keepalived.conf

The configuration below is from master1. On master2 the state is BACKUP with priority 100; on master3 the state is BACKUP with priority 89.

! /etc/keepalived/keepalived.conf
! Configuration File for keepalived

$STATE=MASTER
$INTERFACE=ens160
$ROUTER_ID=51
$PRIORITY=255
$AUTH_PASS=Kub3rn3t3S!
$APISERVER_VIP=192.168.255.200/24

global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state $STATE
    interface $INTERFACE
    virtual_router_id $ROUTER_ID
    priority $PRIORITY
    authentication {
        auth_type PASS
        auth_pass $AUTH_PASS
    }
    virtual_ipaddress {
        $APISERVER_VIP
    }
    track_script {
        check_apiserver
    }
}

check_apiserver.sh

/etc/keepalived/check_apiserver.sh
#!/bin/sh
APISERVER_VIP=192.168.255.200
APISERVER_DEST_PORT=6443

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi
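A side note on the VIP test in the script above: grep treats the dots in the address as wildcards and matches substrings, so it can report a hit for an address that merely contains the VIP. A stricter variant, as a sketch only, compares the address field of `ip -o -4 addr show` exactly. The helper name has_vip is an assumption made for this example.

```shell
#!/bin/sh
# Read `ip -o -4 addr show` output on stdin and succeed only if one of the
# addresses (field 4, written as ADDRESS/PREFIX) equals the given VIP exactly.
has_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# On a node:
#   ip -o -4 addr show | has_vip 192.168.255.200 && echo "this node holds the VIP"
```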

Keepalived service

sudo service keepalived status
keepalived.service - Keepalive Daemon (LVS and VRRP)
     Loaded: loaded (/lib/systemd/system/keepalived.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2021-01-06 16:41:38 CET; 1min 26s ago
   Main PID: 804 (keepalived)
      Tasks: 2 (limit: 4620)
     Memory: 4.7M
     CGroup: /system.slice/keepalived.service
             ├─804 /usr/sbin/keepalived --dont-fork
             └─840 /usr/sbin/keepalived --dont-fork

Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: Registering Kernel netlink reflector
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: Registering Kernel netlink command channel
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: Opening file '/etc/keepalived/keepalived.conf>
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: WARNING - default user 'keepalived_script' fo>
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: (Line 29) Truncating auth_pass to 8 characters
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: SECURITY VIOLATION - scripts are being execut>
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: (VI_1) ignoring tracked script check_apiserve>
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: Warning - script check_apiserver is not used
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: Registering gratuitous ARP shared channel
Jan 06 16:41:38 kubernetes-master1 Keepalived_vrrp[840]: (VI_1) Entering MASTER STATE

haproxy.cfg

# /etc/haproxy/haproxy.cfg
#
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxys to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:8443
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server kubernetes-master1 192.168.255.201:6443 check
        server kubernetes-master2 192.168.255.202:6443 check
        server kubernetes-master3 192.168.255.203:6443 check
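To see which API servers haproxy should be able to reach, the server lines of the backend can be pulled out of the config and probed one by one. A minimal sketch; backend_servers is a hypothetical helper, and the curl loop in the comment assumes /healthz answers without credentials, which is the default on a kubeadm cluster but may differ on a hardened one.

```shell
#!/bin/sh
# Print "name address:port" for every `server` line of one backend section.
# Usage: backend_servers BACKEND [FILE]  (FILE defaults to /etc/haproxy/haproxy.cfg)
backend_servers() {
    awk -v be="$1" '
        # A new section starts: remember whether it is the backend we want.
        $1 == "global" || $1 == "defaults" || $1 == "frontend" ||
        $1 == "listen" || $1 == "backend" {
            in_be = ($1 == "backend" && $2 == be)
            next
        }
        in_be && $1 == "server" { print $2, $3 }
    ' "${2:-/etc/haproxy/haproxy.cfg}"
}

# Probe each backend directly, bypassing haproxy and the VIP:
#   backend_servers apiserver | while read -r name addr; do
#       if curl --silent --insecure --max-time 2 -o /dev/null "https://$addr/healthz"; then
#           echo "$name ($addr) answers"
#       else
#           echo "$name ($addr) does not answer"
#       fi
#   done
```

Before kubeadm init has run on any node, all three probes are expected to fail, which matches the "no server available" messages haproxy logs at that stage.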

haproxy service status

sudo service haproxy status
● haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2021-01-06 16:41:38 CET; 3min 12s ago
       Docs: man:haproxy(1)
             file:/usr/share/doc/haproxy/configuration.txt.gz
    Process: 847 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUC>
   Main PID: 849 (haproxy)
      Tasks: 3 (limit: 4620)
     Memory: 4.7M
     CGroup: /system.slice/haproxy.service
             ├─849 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/hapro>
             └─856 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/hapro>

Jan 06 16:41:38 kubernetes-master1 haproxy[856]: Server apiserver/kubernetes-master1 is DOWN, reason: >
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: [WARNING] 005/164139 (856) : Server apiserver/kuberne>
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: Server apiserver/kubernetes-master2 is DOWN, reason: >
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: Server apiserver/kubernetes-master2 is DOWN, reason: >
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: [WARNING] 005/164139 (856) : Server apiserver/kuberne>
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: [ALERT] 005/164139 (856) : backend 'apiserver' has no>
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: Server apiserver/kubernetes-master3 is DOWN, reason: >
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: Server apiserver/kubernetes-master3 is DOWN, reason: >
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: backend apiserver has no server available!
Jan 06 16:41:39 kubernetes-master1 haproxy[856]: backend apiserver has no server available!

I create the first Kubernetes node with the following command:

sudo kubeadm init --control-plane-endpoint kubernetes-cluster.homelab01.local:8443 --upload-certs

This works well, and I then apply the Calico CNI plugin with this command:

kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

After that I attempt to join from master2.

Keepalived works fine: I tested it on all three nodes by stopping the service and observing the failover to the other nodes. Once I had initialized Kubernetes on master1, haproxy reported that the backend was available.

Kubernetes cluster bootstrap process

sudo kubeadm init --control-plane-endpoint kubernetes-cluster.homelab01.local:8443 --upload-certs
[init] Using Kubernetes version: v1.20.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes-cluster.homelab01.local kubernetes-master1 kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.255.201]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes-master1 localhost] and IPs [192.168.255.201 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes-master1 localhost] and IPs [192.168.255.201 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 18.539325 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
57abea9f00357a4459c852249ac0170633c9a0f2327cde191e529a1689ea158b
[mark-control-plane] Marking the node kubernetes-master1 as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node kubernetes-master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 2cu336.rjxs8i0svtna27ke
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join kubernetes-cluster.homelab01.local:8443 --token 2cu336.rjxs8i0svtna27ke \
    --discovery-token-ca-cert-hash sha256:eb0668ca16acec622e4a97d69e0d4c42e64b1a61ffea13a3787956817021ca54 \
    --control-plane --certificate-key 57abea9f00357a4459c852249ac0170633c9a0f2327cde191e529a1689ea158b

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join kubernetes-cluster.homelab01.local:8443 --token 2cu336.rjxs8i0svtna27ke \
    --discovery-token-ca-cert-hash sha256:eb0668ca16acec622e4a97d69e0d4c42e64b1a61ffea13a3787956817021ca54

Everything is up and running on master1:

NAMESPACE     NAME                                             READY   STATUS    RESTARTS   AGE
kube-system   pod/calico-kube-controllers-744cfdf676-mks4d     1/1     Running   0          36s
kube-system   pod/calico-node-bnvmz                            1/1     Running   0          37s
kube-system   pod/coredns-74ff55c5b-skdzk                      1/1     Running   0          3m11s
kube-system   pod/coredns-74ff55c5b-tctl9                      1/1     Running   0          3m11s
kube-system   pod/etcd-kubernetes-master1                      1/1     Running   0          3m4s
kube-system   pod/kube-apiserver-kubernetes-master1            1/1     Running   0          3m4s
kube-system   pod/kube-controller-manager-kubernetes-master1   1/1     Running   0          3m4s
kube-system   pod/kube-proxy-smmmx                             1/1     Running   0          3m11s
kube-system   pod/kube-scheduler-kubernetes-master1            1/1     Running   0          3m4s

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  3m17s
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   3m11s

NAMESPACE     NAME                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node   1         1         1       1            1           kubernetes.io/os=linux   38s
kube-system   daemonset.apps/kube-proxy    1         1         1       1            1           kubernetes.io/os=linux   3m11s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-kube-controllers   1/1     1            1           38s
kube-system   deployment.apps/coredns                   2/2     2            2           3m11s

NAMESPACE     NAME                                                 DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/calico-kube-controllers-744cfdf676   1         1         1       37s
kube-system   replicaset.apps/coredns-74ff55c5b                    2         2         2       3m11s

Immediately after I attempt to join master2 to the cluster, Kubernetes on master1 dies.

wojcieh@kubernetes-master2:~$ sudo kubeadm join kubernetes-cluster.homelab01.local:8443 --token 2cu336.rjxs8i0svtna27ke \
>     --discovery-token-ca-cert-hash sha256:eb0668ca16acec622e4a97d69e0d4c42e64b1a61ffea13a3787956817021ca54 \
>     --control-plane --certificate-key 57abea9f00357a4459c852249ac0170633c9a0f2327cde191e529a1689ea158b
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes-cluster.homelab01.local kubernetes-master2 kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.255.202]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes-master2 localhost] and IPs [192.168.255.202 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes-master2 localhost] and IPs [192.168.255.202 127.0.0.1 ::1]
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.

Broadcast message from systemd-journald@kubernetes-master2 (Wed 2021-01-06 16:53:04 CET):

haproxy[870]: backend apiserver has no server available!


Broadcast message from systemd-journald@kubernetes-master2 (Wed 2021-01-06 16:53:04 CET):

haproxy[870]: backend apiserver has no server available!

^C
wojcieh@kubernetes-master2:~$
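One thing worth ruling out here (an assumption, not something the output above confirms): etcd needs a majority of its members to be healthy. When kubeadm prints "Announced new etcd member joining to the existing etcd cluster", the member count goes from 1 to 2 and the required majority from 1 to 2, so if etcd on master2 never comes up, etcd on master1 can no longer commit writes and the API server in front of it stops answering. The arithmetic as a small sketch, with function names made up for illustration:

```shell
#!/bin/sh
# Members that must be healthy for an etcd cluster of N members (majority).
etcd_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Member failures a cluster of N members can survive.
etcd_fault_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# Example: print a small table for 1 to 3 members.
#   for n in 1 2 3; do
#       echo "$n members: quorum $(etcd_quorum "$n"), tolerates $(etcd_fault_tolerance "$n")"
#   done
```

With two members the tolerance is zero, which is why the window between announcing the second member and that member actually starting is fragile.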

Here are some logs that might be relevant:

Logs from master1 https://pastebin.com/Y1zcwfWt

Logs from master2 https://pastebin.com/rBELgK1Y

-- wojcieh
calico
docker
kubernetes
ubuntu-20.04
