I have setup a small cluster with kubeadm, it was working fine and 6443 port was up. But after rebooting my system, the cluster is not getting up anymore.
What should I do?
Here is some information:
systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Sun 2020-04-05 14:16:44 UTC; 6s ago
Docs: https://kubernetes.io/docs/home/
Main PID: 31079 (kubelet)
Tasks: 20 (limit: 4915)
CGroup: /system.slice/kubelet.service
└─31079 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet
k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://infra01.mydomainname.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dtest-infra01&limit=500&resourceVersion=0: dial tcp 116.66.187.210:6443: connect: connection refused
kubectl get nodes
The connection to the server infra01.mydomainname.com:6443 was refused - did you specify the right host or port?
kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:12:12Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
journalctl -xeu kubelet
6 18167 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458:
Failed to list *v1.Node: Get https://infra01.mydomainname.com
1 18167 reflector.go:153]
k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://huawei-infra01.s
4 18167 aws_credentials.go:77] while getting AWS credentials
NoCredentialProviders: no valid providers in chain. Deprecated.
messaging see aws.Config.CredentialsChainVerboseErrors
6 18167 kuberuntime_manager.go:211] Container runtime docker initialized,
version: 19.03.7, apiVersion: 1.40.0
6 18167 server.go:1113] Started kubelet
1 18167 kubelet.go:1302] Image garbage collection failed once. Stats
initialization may not have completed yet: failed to get imageF
8 18167 server.go:144] Starting to listen on 0.0.0.0:10250
4 18167 server.go:778] Starting healthz server failed: listen tcp
127.0.0.1:10248: bind: address already in use
5 18167 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
4 18167 volume_manager.go:265] Starting Kubelet Volume Manager
1 18167 desired_state_of_world_populator.go:138] Desired state populator
starts to run
3 18167 server.go:384] Adding debug handlers to kubelet server.
4 18167 server.go:158] listen tcp 0.0.0.0:10250: bind: address already in
use
Docker
docker run hello-world
Hello from Docker!
ubuntu
lsb_release -a
Ubuntu 18.04.2 LTS
swap && kubeconfig
swap is turned off and kubeconfig was correctly exported
Note
Things can be fixed by resetting the cluster, but this should be the final option.
Kubelet is not started because of port already in use and hence not able to create pod for api server. Use following command to find out which process is holding the port 10250
root@master admin]# ss -lntp | grep 10250
LISTEN 0 128 :::10250 :::* users:(("kubelet",pid=23373,fd=20))
It will give you PID of that process and name of that process. If it is unwanted process which is holding the port, you can always kill the process and that port becomes available to use by kubelet.
After killing the process again run the above command, it should return no value.
Just to be on safe side run kubeadm reset and then run kubeadm init and it should go through
Edit:
Using snap stop kubelet
did the trick of stopping kubelet on the node.