How do I properly bring up a Kubernetes cluster that was previously brought down?

9/19/2018

I have created a local 3-node Kubernetes cluster in GNOME-Boxes, using the CentOS minimal ISO. This is for testing a custom install on client-provisioned machines. Everything went very smoothly, and I even had things working well for a few days. However, I needed to restart my server, so I brought the k8s cluster down with it by running shutdown now on each node in the cluster. When I brought everything back up, the cluster did not come back up as expected. The logs tell me there was a problem bringing up the apiserver and etcd containers. The docker logs for the apiserver show this:

Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I0919 03:05:10.238042       1 server.go:703] external host was not specified, using 192.168.122.2
I0919 03:05:10.238160       1 server.go:145] Version: v1.11.3
Error: unable to load server certificate: open /etc/kubernetes/pki/apiserver.crt: permission denied
...[cli params for kube-apiserver]
    error: unable to load server certificate: open /etc/kubernetes/pki/apiserver.crt: permission denied

When I check the permissions, the file is set to 644 and it is definitely there. My real question is: why does everything work when I initialize the cluster with kubeadm, but then fail to come back up properly after a restart?
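
A sketch of the checks that seem relevant here (assuming the default kubeadm layout under /etc/kubernetes):

ls -lZ /etc/kubernetes/pki/apiserver.crt        # -Z also shows the SELinux context, not just the mode
getenforce                                      # confirm SELinux is still permissive after the reboot
docker ps -a | grep kube-apiserver              # find the exited apiserver container
journalctl -u kubelet --since "10 minutes ago"  # kubelet activity around the failed restart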

Here are the steps I am using to init my cluster:

# NOTE: this file needs to be run as root
#  1: install kubelet, kubeadm, kubectl, and docker
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF
yum install -y kubelet kubeadm kubectl docker --disableexcludes=kubernetes
systemctl enable --now kubelet
systemctl enable --now docker

# 2: disable enforcement of SELinux policies (k8s has its own policies)
sed -i 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/sysconfig/selinux
setenforce 0

# 3: make sure the network can function properly
sysctl net.bridge.bridge-nf-call-iptables=1
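#    (sketch, not part of my original steps) the sysctl above only applies to the current
#    boot; since the problem shows up after a reboot, persisting it seems worth doing.
#    The file name k8s.conf is just an arbitrary choice.
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system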

# 4: load all necessary IPVS kernel modules
modprobe --all ip_vs_wrr ip_vs_sh ip_vs ip_vs_rr
cat <<EOF > /etc/modules-load.d/ifvs.conf
ip_vs_wrr
ip_vs_sh
ip_vs
ip_vs_rr
EOF
systemctl disable --now firewalld
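#    (alternative sketch, not what I actually ran) instead of disabling firewalld entirely,
#    the ports documented for kubeadm could be opened instead; 8472/udp is flannel's VXLAN port.
# firewall-cmd --permanent --add-port=6443/tcp --add-port=2379-2380/tcp
# firewall-cmd --permanent --add-port=10250-10252/tcp --add-port=8472/udp
# firewall-cmd --reload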

# 5: initialize the cluster. This should happen only on the master node. It will print out instructions and a join command that should be run on each of the other nodes.
kubeadm init --pod-network-cidr=10.244.0.0/16 

# 6: run the kubeadm join command from result of step 5 on all the other nodes
kubeadm join 192.168.122.91:6443 --token jvr7dh.ymoahxxhu3nig8kl --discovery-token-ca-cert-hash sha256:7cc1211aa882c535f371e2cf6706072600f2cc47b7da18b1d242945c2d8cab65
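#    note (not part of my original steps): the bootstrap token above expires after 24 hours
#    by default; a fresh join command can be printed on the master later with:
# kubeadm token create --print-join-command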

#################################
# the cluster is all set up to be accessed via the API. Use kubectl on your local machine from here on out!
# to access the cluster via kubectl, you need to merge the contents of <master_node>:/etc/kubernetes/admin.conf with your local ~/.kube/config
#################################

# 7: to allow the master to run pods: 
kubectl taint nodes --all node-role.kubernetes.io/master-

# 8: install the pod network add-on (flannel):
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
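#    (sketch) sanity check: the nodes should report Ready once the flannel pods are running
kubectl get pods -n kube-system
kubectl get nodes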

# 9: set up the dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

# 10: create the admin user (for the dashboard)
kubectl apply -f deploy/admin-user.yaml
# grab the bearer token for the admin-user service account (it gets pasted into the dashboard login)
TOKEN=$(kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') | grep token:)
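#    (sketch) the grep above keeps the "token:" label in $TOKEN; if only the raw value is
#    wanted, awk can strip it:
# TOKEN=$(kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') | grep token: | awk '{print $2}')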

# start a proxy from your local machine to the cluster
kubectl proxy &

# go to the dashboard in your browser
open http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

# paste the token into the login:
echo $TOKEN
-- Skaman Sam
docker
kubernetes

2 Answers

9/19/2018

I think I may have found a solution. I had to grant write permissions to the pki directory on the master node.

chmod a+rw -R /etc/kubernetes/pki

I still don't understand why this is necessary, but it appears to work repeatably.
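
If the blanket chmod feels too broad, a narrower variant may be worth trying first (untested on my side; it assumes the standard kubeadm file layout and that the SELinux labels may have been lost across the reboot):

chmod 644 /etc/kubernetes/pki/*.crt    # certificates can stay world-readable
chmod 600 /etc/kubernetes/pki/*.key    # keep private keys readable by root only
restorecon -Rv /etc/kubernetes/pki     # restore SELinux labels in case they changed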

-- Skaman Sam
Source: StackOverflow

9/19/2018

I ran into the exact same issue on a CentOS 7 VirtualBox VM after creating my single-master Kubernetes cluster with kubeadm, and I ended up filing an issue against kubeadm.

You might want to follow some or all of the steps described there by me and the person who helped me debug the issue. To summarize, what worked for me was setting the hostname to localhost (or something along those lines) and creating the cluster again with kubeadm init. (See my last comment on that issue for the exact steps that resolved the problem.) After this change I have been able to run my Kubernetes cluster and also join other nodes to it successfully. Good luck!
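
For reference, a rough outline of that approach (the hostname value is just an example, and kubeadm reset wipes the existing control-plane state, so only do this on a cluster you can afford to rebuild):

hostnamectl set-hostname localhost               # or another name that resolves consistently
kubeadm reset                                    # tear down the broken control plane
kubeadm init --pod-network-cidr=10.244.0.0/16    # re-create it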

-- fatcook
Source: StackOverflow