How to set up a 2-node Kubernetes cluster in a custom environment

12/17/2018

Env

I've set up a 2-node Kubernetes cluster in a custom environment (not Google Cloud, AWS, or Azure), although it is backed by Amazon EC2 instances. I have two c1xlarge machines (4 CPU, 8GB RAM, CentOS 7.4 v18.01) in the US West region.

Problem

When I ssh to the kubernetes master machine a day after the setup, I see this:

[administrator@d4191051 ~]$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Could someone please review my setup and suggest what I could be doing wrong? I have been reading documentation and other people's setups for 3 weeks now but have not managed to get this cluster up and running in a stable way.

Setup

The cluster does work right after the setup described below (unless noted otherwise, the commands were run on both the master and the worker node):

ssh administrator@10.40.50.60 # master
ssh administrator@10.40.50.61 # worker

[administrator@d4191051 ~]$ sudo vi /etc/hosts # master
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1     localhost localhost.localdomain localhost6 localhost6.localdomain6
10.40.50.61       56fa67ff
10.40.50.60       d4191051

[administrator@56fa67ff ~]$ sudo vi /etc/hosts # worker
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1     localhost localhost.localdomain localhost6 localhost6.localdomain6
10.40.50.60       d4191051
10.40.50.61       56fa67ff
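
The two /etc/hosts edits above could also be applied without an editor. This is a minimal sketch, assuming a hypothetical helper named add_host_entry (not a standard tool) and that each hostname should be mapped only once per file:

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Hypothetical helper: appends "IP NAME" to FILE unless a non-comment line
# already maps NAME. Run it with sudo when FILE is /etc/hosts.
add_host_entry() {
    file=$1 ip=$2 name=$3
    # Look for NAME as a whole whitespace-delimited field after the address.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}
```

For example, `add_host_entry /etc/hosts 10.40.50.60 d4191051` run as root on the worker would add the master entry once and do nothing on later runs.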

# disable SELinux
cat /etc/sysconfig/selinux
sudo setenforce 0
sudo sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux

sudo su root

# disable swap memory
swapoff -a
vi /etc/fstab # comment out the swap line
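
The manual fstab edit above could be scripted roughly as follows. This is only a sketch: comment_out_swap is a hypothetical helper, and it assumes swap entries are the lines whose mount point or type field is the word "swap".

```shell
#!/bin/sh
# comment_out_swap FILE
# Hypothetical helper: prefixes every active fstab line that contains a
# standalone "swap" field with "#". Already-commented lines are left alone.
comment_out_swap() {
    # -i.bak edits in place and keeps a backup copy as FILE.bak.
    sed -i.bak -E '/^[[:space:]]*#/!{/[[:space:]]swap[[:space:]]/s/^/#/;}' "$1"
}
```

As root this would be called as `comment_out_swap /etc/fstab`, after `swapoff -a`.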

yum update -y

# install docker on CentOS 7
yum install yum-utils device-mapper-persistent-data lvm2 -y
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install docker-ce-18.06.1.ce-3.el7.x86_64 -y

# configure Kubernetes repository
vi /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
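
The repository file above can also be written with a here-document instead of vi. A minimal sketch, where write_k8s_repo is a hypothetical helper and the file content is copied unchanged from the listing above:

```shell
#!/bin/sh
# write_k8s_repo FILE
# Hypothetical helper: writes the repository definition shown above to FILE.
# As root, FILE would be /etc/yum.repos.d/kubernetes.repo.
write_k8s_repo() {
    cat > "$1" <<'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
       https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
}
```

The quoted 'EOF' keeps the shell from expanding anything inside the file body, and the function makes it easy to write the same file on both nodes.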

# install kubernetes
yum install kubelet kubeadm kubectl -y

reboot
ssh administrator@10.40.50.60 # master
ssh administrator@10.40.50.61 # worker
sudo su root

# start docker service
systemctl start docker && systemctl enable docker

systemctl start firewalld

# open Kubernetes ports in firewall on master
firewall-cmd --permanent --add-port=6443/tcp        # Kubernetes API Server
firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd server client API
firewall-cmd --permanent --add-port=10250/tcp       # Kubelet API
firewall-cmd --permanent --add-port=10251/tcp       # kube-scheduler
firewall-cmd --permanent --add-port=10252/tcp       # kube-controller-manager
firewall-cmd --permanent --add-port=10255/tcp       # Read-Only Kubelet API

# open Kubernetes ports in firewall on worker
firewall-cmd --permanent --add-port=10250/tcp       # Kubelet API
firewall-cmd --permanent --add-port=10255/tcp       # Read-Only Kubelet API
firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort Services
firewall-cmd --permanent --add-port=6783/tcp        # Allows the node to join the overlay network that allows service discovery among nodes on a Docker Cloud account
#firewall-cmd --permanent --add-port=6783/udp        # Allows the node to join the overlay network that allows service discovery among nodes on a Docker Cloud account

firewall-cmd --reload
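
The per-role port lists above could be kept in one place and turned into commands by a loop. This is a sketch only: firewall_commands is a hypothetical helper, the port lists are copied from the commands above, and it prints the commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# Port lists copied from the firewall-cmd lines above.
MASTER_PORTS="6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp"
WORKER_PORTS="10250/tcp 10255/tcp 30000-32767/tcp 6783/tcp"

# firewall_commands ROLE
# Hypothetical helper: prints the firewall-cmd invocations for ROLE
# ("master" or "worker") without executing them.
firewall_commands() {
    case $1 in
        master) ports=$MASTER_PORTS ;;
        worker) ports=$WORKER_PORTS ;;
        *) echo "usage: firewall_commands master|worker" >&2; return 2 ;;
    esac
    for port in $ports; do
        echo "firewall-cmd --permanent --add-port=$port"
    done
    echo "firewall-cmd --reload"
}
```

Once the output looks right, `firewall_commands master | sudo sh` would apply it. One thing that may be worth checking: the flannel manifest applied later uses the VXLAN backend, which by default carries pod traffic over UDP port 8472, and that port is not in either list.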

Ctrl-D

# enable the br_netfilter kernel module
cat /proc/sys/net/bridge/bridge-nf-call-iptables
sudo modprobe br_netfilter
echo '1' | sudo tee /proc/sys/net/bridge/bridge-nf-call-iptables
sudo sysctl net.bridge.bridge-nf-call-iptables=1
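
The modprobe and sysctl commands above only change the running kernel, so the settings are lost on reboot. A possible way to persist them is with drop-in files. This is a sketch under assumptions: persist_bridge_settings is a hypothetical helper and the file name k8s.conf is an arbitrary choice.

```shell
#!/bin/sh
# persist_bridge_settings [ROOT]
# Hypothetical helper: writes the module name and sysctl value from the
# commands above into drop-in files so they are reapplied at boot.
# ROOT defaults to "/" (needs root); pass another directory to preview.
persist_bridge_settings() {
    root=${1:-/}
    mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d"
    # Modules listed in /etc/modules-load.d/*.conf are loaded at boot.
    echo "br_netfilter" > "$root/etc/modules-load.d/k8s.conf"
    # Files in /etc/sysctl.d/*.conf are applied at boot and by "sysctl --system".
    echo "net.bridge.bridge-nf-call-iptables = 1" > "$root/etc/sysctl.d/k8s.conf"
}
```

After running it as root, `sysctl --system` should apply the value immediately without a reboot.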

# initialize Kubernetes cluster on master
[administrator@d4191051 ~]$ sudo kubeadm init --apiserver-advertise-address=10.40.50.60 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
  [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
  [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [d4191051 localhost] and IPs [10.40.50.60 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [d4191051 localhost] and IPs [10.40.50.60 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [d4191051 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.40.50.60]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 22.502676 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "d4191051" as an annotation
[mark-control-plane] Marking the node d4191051 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node d4191051 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: hx910e.xh7kl0zbcjqsktdv
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.40.50.60:6443 --token hx910e.xh7kl0zbcjqsktdv --discovery-token-ca-cert-hash sha256:39ee4baaf600d1872ef2482cfa2a895e21dacee92a831e3e3f0af2f0278db2d3

# configure the cluster for a non-root user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
echo "export KUBECONFIG=$HOME/admin.conf" | tee -a ~/.bashrc

sudo su root

# start kubernetes service
systemctl start kubelet && systemctl enable kubelet

# use the cgroupfs driver
docker info | grep -i cgroup
sed -i 's/cgroup-driver=systemd/cgroup-driver=cgroupfs/g' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# add --cgroup-driver=cgroupfs to KUBELET_CGROUP_ARGS=
vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet
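
Before editing the kubelet drop-in by hand, it may help to compare the two cgroup drivers directly. The following is a rough sketch with two hypothetical helpers that only parse text; it assumes `docker info` prints a "Cgroup Driver:" line and that the kubelet flags contain `--cgroup-driver=<name>`.

```shell
#!/bin/sh
# docker_cgroup_driver
# Reads "docker info" text on stdin and prints the cgroup driver name.
docker_cgroup_driver() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p'
}

# kubelet_cgroup_driver
# Reads kubelet flags on stdin and prints the value of --cgroup-driver.
kubelet_cgroup_driver() {
    sed -n 's/.*--cgroup-driver=\([a-z]*\).*/\1/p'
}
```

On the node this could be used as `docker info 2>/dev/null | docker_cgroup_driver` and `kubelet_cgroup_driver < /etc/systemd/system/kubelet.service.d/10-kubeadm.conf`. If the second command prints nothing, the flag is not set in that file.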

Ctrl-D

# deploy a pod network to the cluster (this article suggests using the flannel network: https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/ )
[administrator@d4191051 ~]$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created

# check the cluster nodes (wait until it's ready)
[administrator@d4191051 ~]$ kubectl get nodes
NAME           STATUS   ROLES    AGE     VERSION
d4191051   Ready    master   3m56s   v1.13.0

# join the Kubernetes pod network on worker
[administrator@56fa67ff ~]$ sudo kubeadm join 10.40.50.60:6443 --token hx910e.xh7kl0zbcjqsktdv --discovery-token-ca-cert-hash sha256:39ee4baaf600d1872ef2482cfa2a895e21dacee92a831e3e3f0af2f0278db2d3
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "10.40.50.60:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.40.50.60:6443"
[discovery] Requesting info from "https://10.40.50.60:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.40.50.60:6443"
[discovery] Successfully established connection with API Server "10.40.50.60:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "56fa67ff" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
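
A note on reusing the join command above: bootstrap tokens created by kubeadm init expire after 24 hours by default, so a later join would need a fresh one from `kubeadm token create --print-join-command` on the master. As a small sanity check when copying tokens around, here is a sketch of a format test. The helper name is_bootstrap_token is hypothetical.

```shell
#!/bin/sh
# is_bootstrap_token TOKEN
# Returns 0 when TOKEN has the bootstrap-token shape: six characters,
# a dot, then sixteen characters, all lowercase letters or digits.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eqx '[a-z0-9]{6}\.[a-z0-9]{16}'
}
```

For example, `is_bootstrap_token hx910e.xh7kl0zbcjqsktdv && echo valid` checks the token used in this setup. It only validates the format, not whether the token still exists on the cluster.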

# monitor kubernetes on master
[administrator@d4191051 ~]$ kubectl get nodes
NAME           STATUS   ROLES    AGE     VERSION
56fa67ff   Ready    <none>   30s     v1.13.0
d4191051   Ready    master   6m42s   v1.13.0

[administrator@cfe9680b ~]$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY   STATUS              RESTARTS   AGE
kube-system   coredns-86c58d9df4-7d5xz               0/1     ContainerCreating   0          7m4s
kube-system   coredns-86c58d9df4-nm2nw               0/1     ContainerCreating   0          7m4s
kube-system   etcd-cfe9680b                      1/1     Running             0          6m17s
kube-system   kube-apiserver-cfe9680b            1/1     Running             0          6m2s
kube-system   kube-controller-manager-cfe9680b   1/1     Running             0          6m20s
kube-system   kube-flannel-ds-amd64-2p77k            1/1     Running             1          2m48s
kube-system   kube-flannel-ds-amd64-vvvbx            1/1     Running             0          4m14s
kube-system   kube-proxy-67sdh                       1/1     Running             0          2m48s
kube-system   kube-proxy-ptdpv                       1/1     Running             0          7m4s
kube-system   kube-scheduler-cfe9680b            1/1     Running             0          6m25s
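
To pick out pods such as the two coredns pods above that are not fully ready, the listing can be filtered. A minimal sketch, assuming the column layout shown above (NAMESPACE, NAME, READY, STATUS, ...) and a hypothetical helper named not_ready_pods:

```shell
#!/bin/sh
# not_ready_pods
# Reads "kubectl get pods --all-namespaces" output on stdin and prints
# "namespace/name status" for each pod whose READY column is not n/n.
not_ready_pods() {
    awk 'NR > 1 {
        split($3, ready, "/")
        if (ready[1] != ready[2]) print $1 "/" $2, $4
    }'
}
```

It would be used as `kubectl get pods --all-namespaces | not_ready_pods`. Pods that have finished normally (for example from Jobs) would also be listed, since their READY count is below the total.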
-- Robin
centos7
kubectl
kubernetes

1 Answer

12/17/2018

To access your cluster as a non-root user, you are running the following steps (I am assuming you run them as the non-root user):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

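You then run the following command, which is not correct: it points KUBECONFIG at $HOME/admin.conf, but the file was copied to $HOME/.kube/config, not to $HOME/admin.conf: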

echo "export KUBECONFIG=$HOME/admin.conf" | tee -a ~/.bashrc

And it should be:

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" | tee -a ~/.bashrc
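
When KUBECONFIG names a file that does not exist, kubectl ends up with an empty configuration and falls back to localhost:8080, which matches the error in the question. A quick way to see which file would be used is sketched below. The helper names are hypothetical, and the sketch assumes KUBECONFIG holds a single path rather than a colon-separated list.

```shell
#!/bin/sh
# kubeconfig_path
# Prints the file kubectl would read: KUBECONFIG if set, otherwise the
# default location under the home directory.
kubeconfig_path() {
    printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

# check_kubeconfig
# Reports whether that file exists and is readable by the current user.
check_kubeconfig() {
    cfg=$(kubeconfig_path)
    if [ -r "$cfg" ]; then
        echo "ok: $cfg"
    else
        echo "missing or unreadable: $cfg"
        return 1
    fi
}
```

If /etc/kubernetes/admin.conf turns out to be readable only by root on your machines, an alternative is to point KUBECONFIG at the copy in $HOME/.kube/config, or to leave the variable unset so the default is used.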

Now check the config file in your .kube folder; the listing should look like this:

[centos@ip-10-0-1-91 ~]$ ls -al $HOME/.kube
drwxr-xr-x.  3 centos centos   23 Dec 17 11:42 cache
-rw-------.  1 centos centos 5573 Dec 17 11:42 config
drwxrwxr-x.  3 centos centos 4096 Dec 17 11:42 http-cache

The owner should be your non-root user. If the owner is root, run the first three commands again as the non-root user and it will work.
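
The ownership check can also be done without reading the listing by eye. A small sketch, with owned_by_me as a hypothetical helper built on the shell's `-O` file test:

```shell
#!/bin/sh
# owned_by_me FILE
# Returns 0 when FILE exists and is owned by the effective user
# running this shell, and 1 otherwise.
owned_by_me() {
    [ -O "$1" ]
}
```

For example: `owned_by_me "$HOME/.kube/config" || sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"` would repair the ownership only when needed.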

Hope this helps.

-- Prafull Ladha
Source: StackOverflow