I have followed this tutorial and this tutorial and this one but am facing the same issue for last 3 days.
I am able to set up the master node correctly with the following steps:
kubeadm init
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export kubever=$(kubectl version | base64 | tr -d ‘\’)
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$kubever"
and everything seems fine in
kubectl get all --namespace=kube-system
then,
on the worker node:
kubeadm join --token 864655.fdf6d0b389867b79 192.168.100.17:6443 --discovery-token-ca-cert-hash sha256:a2d840808b17b53b9612e6271ccde489f13dbede7d354f97188d0faa9e210af2
The output seems fine and is as below:
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.100.17:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.100.17:6443"
[discovery] Requesting info from "https://192.168.100.17:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.100.17:6443"
[discovery] Successfully established connection with API Server "192.168.100.17:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
BUT as soon as I run this command, all hell breaks loose. The
kubectl get all --namespace=kube-system
starts showing that all pods are kind of restarting all the time. the status keeps changing between Pending and Running, and at time some of the pods will even disappear and may have ContainerCreating status etc.
NAME READY STATUS RESTARTS AGE
po/etcd-ubuntu 0/1 Pending 0 0s
po/kube-controller-manager-ubuntu 0/1 Pending 0 0s
po/kube-dns-6f4fd4bdf-cmcfk 3/3 Running 0 13m
po/kube-proxy-2chb6 1/1 Running 0 13m
po/kube-scheduler-ubuntu 0/1 Pending 0 0s
po/weave-net-ptdxr 2/2 Running 0 11m
I have also tried the second tutorial, with flannel, and get the exact same issue.
My Set Up
I created two new VMs with a fresh installation of Ubuntu 17.10 on VMware with 2 processor/2core 6 GB of ram and 50 GB hard disk each. My physical machine is a i7-6700k with 32gb of ram. I installed kubeadm, kubelet and docker on both of them and then followed the steps as mentioned above.
I have also tried switching between NAT and Bridge on VMware and nothing changed.
The initial IP of both VMs with bridge network was 192.168.100.12 and 192.168.100.17. The hostname -I
for master:
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2
The hostname -I
for worker-node:
192.168.100.12 172.17.0.1 10.44.0.0 10.32.0.1
journalctl -xeu kubelet
shows the following:
https://gist.github.com/saad749/9a771a3460bf88c274498b5bc4b7fd84
While trying with flannel (and still the same issue), the result from
kubectl describe nodes
is
https://gist.github.com/saad749/d24c453c8b4e663e9abf572a0fb38bf4
Am I missing any step before kubeadm init? Should I change the IP addresses (to what)? Are there any specific logs I should look into? Is there a more comprehensive tutorial for this? All Issues start after kubeadm join on the worker node, I can deploy the kubernetes on the master node or any other stuff, and it works fine.
UPDATE:
Even after applying the suggestions from errordeveloper, The same issue persists.
I add the following flag to kubeadm init:
--apiserver-advertise-address 192.168.100.17
I updated the kubeadm.conf to following and did reload and restart: https://gist.github.com/saad749/c7149c87ec3e75a40586f626cf04279a
and also tried changing the cluster dns https://gist.github.com/saad749/5fa66bebc22841e58119333e75600e40
This the log from after initializing the master:
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-ubuntu 1/1 Running 0 22s 192.168.100.17 ubuntu
kube-system kube-apiserver-ubuntu 1/1 Running 0 29s 192.168.100.17 ubuntu
kube-system kube-controller-manager-ubuntu 1/1 Running 0 13s 192.168.100.17 ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 3/3 Running 0 1m 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 1m 192.168.100.17 ubuntu
kube-system kube-scheduler-ubuntu 1/1 Running 0 34s 192.168.100.17 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 32s 192.168.100.17 ubuntu
The hostname -i results:
kube-master@ubuntu:~$ hostname -I
192.168.100.17 172.17.0.1 10.32.0.1 10.32.0.2 10.32.0.3 10.32.0.4 10.32.0.5 10.32.0.6 10.244.0.0 10.244.0.1
kube-master@ubuntu:~$ hostname -i
192.168.100.17
Results from:
kubectl describe nodes
https://gist.github.com/saad749/8f460650182a04d0ddf3158a52761a9a
The Internal IP seems correct now.
After joining from second node, this happens:
kube-master@ubuntu:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ubuntu Ready master 49m v1.9.3
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kube-controller-manager-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 0/3 ContainerCreating 0 49m <none> ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 49m 192.168.100.17 ubuntu
kube-system kube-scheduler-ubuntu 1/1 Running 0 1s 192.168.100.17 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 48m 192.168.100.17 ubuntu
ifconfig -a results:
https://gist.github.com/saad749/63a5a52bd3246ff72477b2aca7d158d0
journalctl -xeu kubelet results
https://gist.github.com/saad749/8a60870b35f93df8565e66cb208aff32
Sometimes, the pods IP is shown at 192.168.100.12 which is the IP of the non-master second node.
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system etcd-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-apiserver-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system kube-controller-manager-ubuntu 1/1 Running 0 0s 192.168.100.12 ubuntu
kube-system kube-dns-6f4fd4bdf-wfqhb 2/3 Running 0 3h 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 3h 192.168.100.12 ubuntu
kube-system kube-scheduler-ubuntu 0/1 Pending 0 0s <none> ubuntu
kube-system weave-net-fkgnh 2/2 Running 1 3h 192.168.100.17 ubuntu
kube-master@ubuntu:~$ kubectl get pod --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kube-dns-6f4fd4bdf-wfqhb 3/3 Running 0 3h 10.32.0.7 ubuntu
kube-system kube-proxy-h4hz9 1/1 Running 0 3h 192.168.100.12 ubuntu
kube-system weave-net-fkgnh 2/2 Running 0 3h 192.168.100.12 ubuntu
kubectl describe nodes
Name: ubuntu
Roles: master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=ubuntu
node-role.kubernetes.io/master=
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: node-role.kubernetes.io/master:NoSchedule
CreationTimestamp: Fri, 02 Mar 2018 08:21:47 -0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 08:21:43 -0800 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Fri, 02 Mar 2018 11:38:36 -0800 Fri, 02 Mar 2018 11:28:25 -0800 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.100.12
Hostname: ubuntu
Capacity:
cpu: 4
memory: 6080832Ki
pods: 110
Allocatable:
cpu: 4
memory: 5978432Ki
pods: 110
System Info:
Machine ID: 59bf65b835b242a3aa182f4b8a542219
System UUID: 0C3C4D56-4747-D59E-EE09-F16F2793677E
Boot ID: 658b4a08-d724-425e-9246-2b41995ecc46
Kernel Version: 4.13.0-36-generic
OS Image: Ubuntu 17.10
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.13.1
Kubelet Version: v1.9.3
Kube-Proxy Version: v1.9.3
ExternalID: ubuntu
Non-terminated Pods: (3 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system kube-dns-6f4fd4bdf-wfqhb 260m (6%) 0 (0%) 110Mi (1%) 170Mi (2%)
kube-system kube-proxy-h4hz9 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system weave-net-fkgnh 20m (0%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
280m (7%) 0 (0%) 110Mi (1%) 170Mi (2%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Rebooted 12m (x814 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
Normal NodeHasNoDiskPressure 10m kubelet, ubuntu Node ubuntu status is now: NodeHasNoDiskPressure
Normal Starting 10m kubelet, ubuntu Starting kubelet.
Normal NodeAllocatableEnforced 10m kubelet, ubuntu Updated Node Allocatable limit across pods
Normal NodeHasSufficientDisk 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientDisk
Normal NodeHasSufficientMemory 10m kubelet, ubuntu Node ubuntu status is now: NodeHasSufficientMemory
Normal NodeNotReady 10m kubelet, ubuntu Node ubuntu status is now: NodeNotReady
Warning Rebooted 2m (x870 over 2h) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 658b4a08-d724-425e-9246-2b41995ecc46
Warning Rebooted 15s (x60 over 10m) kubelet, ubuntu Node ubuntu has been rebooted, boot id: 16efd500-a2a5-446f-ba25-1187857996e0
What am I doing wrong?
So after following the advice from @errordeveloper and still hitting the wall, I was able to solve the issue that turns out to be pretty simple.
Both my VMs had the same hostname.
hostname -f
would return
ubuntu
on both, and that causes issue with kubernetes, apparently.
I changed the name on my non-master node with
hostnamectl set-hostname kminion
and in the following files:
/etc/hostname
/etc/hosts
and everything went smooth onward!
Should I change the IP addresses (to what)?
Yes, this is typically the way to make things work on VMs where the default route is for NATed access to the Internet.
You want to use the IP of the bridge network, for you master that appears to be 192.168.100.17
(but please double check).
First, please try using kubeadm init --apiserver-advertise-address 192.168.100.17
, but that may not solve all of the issues.
In your ouput of kubectl describe nodes
, I can see this
Addresses:
InternalIP: 172.17.0.1
Hostname: ubuntu
So you probably want to make sure that kubelet also doesn't used the NATed interface, for which you would need to use kubelet's --node-ip
flag.
However, there are other ways to fix this problem, e.g. if you can ensure that hostname -i
returns the IP of the bridged interface (which you can do by tweaking /etc/hosts
).