Two of my microk8s clusters running version 1.21 just stopped working.
kubectl run locally returns "The connection to the server 127.0.0.1:16443 was refused - did you specify the right host or port?"
microk8s.status says not running, and microk8s.inspect only checks four services:
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver-kicker is running
Service snap.microk8s.daemon-kubelite is running
The apiserver isn't mentioned, and it isn't running (checking its status separately says "Will not run along with kubelite").
I didn't change anything on any of the machines.
I tried upgrading microk8s to 1.22 - no change.
journal.log for apiserver says:
Oct 18 07:57:05 myserver microk8s.daemon-kubelite[30037]: I1018 07:57:05.143264 30037 daemon.go:65] Starting API Server
Oct 18 07:57:05 myserver microk8s.daemon-kubelite[30037]: Flag --insecure-port has been deprecated, This flag has no effect now and will be removed in v1.24.
Oct 18 07:57:05 myserver microk8s.daemon-kubelite[30037]: I1018 07:57:05.144650 30037 server.go:654] external host was not specified, using 192.168.1.10
Oct 18 07:57:05 myserver microk8s.daemon-kubelite[30037]: W1018 07:57:05.144719 30037 authentication.go:507] AnonymousAuth is not allowed with the AlwaysAllow authorizer. Resetting AnonymousAuth to false. You should use a different authorizer
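For reference, that log came from the kubelite unit's journal; since 1.21 the API server runs inside kubelite, so this is the unit to watch (assuming the standard snap unit name):

```shell
# Follow the kubelite daemon's journal; the API server logs appear here,
# not under a separate apiserver unit, on kubelite-based releases
sudo journalctl -u snap.microk8s.daemon-kubelite -f
```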
snap services:
Service Startup Current Notes
microk8s.daemon-apiserver enabled inactive -
microk8s.daemon-apiserver-kicker enabled active -
microk8s.daemon-cluster-agent enabled active -
microk8s.daemon-containerd enabled active -
microk8s.daemon-control-plane-kicker enabled inactive -
microk8s.daemon-controller-manager enabled inactive -
microk8s.daemon-etcd enabled inactive -
microk8s.daemon-flanneld enabled inactive -
microk8s.daemon-kubelet enabled inactive -
microk8s.daemon-kubelite enabled active -
microk8s.daemon-proxy enabled inactive -
microk8s.daemon-scheduler enabled inactive -
It's not this issue (https://github.com/ubuntu/microk8s/issues/2486): both info.yaml and cluster.yaml have the correct contents.
All machines are virtual Ubuntus running in Hyper-V in a Windows Server cluster.
It turns out there were two different problems in the cluster, and my claim that I hadn't changed anything was not entirely true.
On one node, cluster.yaml was not correct after all - it was empty. Copying the contents of localnode.yaml to cluster.yaml fixed the problem there.
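A sketch of that fix, assuming the default dqlite backend directory of a microk8s snap install (the path may differ on your setup):

```shell
# dqlite state on a default microk8s snap install (assumed path)
BACKEND=/var/snap/microk8s/current/var/kubernetes/backend

sudo microk8s stop
# Keep a backup of the (empty) cluster.yaml before overwriting it
sudo cp "$BACKEND/cluster.yaml" "$BACKEND/cluster.yaml.bak"
sudo cp "$BACKEND/localnode.yaml" "$BACKEND/cluster.yaml"
sudo microk8s start
```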
A second node had gone offline (microk8s not running) due to a stuck, unsuccessful auto-refresh of the microk8s snap.
I had also temporarily shut down a third node for a couple of days. That left only one node to hold the dqlite master vote, which failed. By the time the shut-down node was turned back on, the cluster had already failed. Unsticking the auto-refresh on the stuck node fixed the cluster.
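Unsticking the refresh was done with the standard snap tooling; roughly (the change ID will differ, so read it from `snap changes` first):

```shell
# List recent/in-progress snap changes to find the stuck refresh
snap changes
# Abort the stuck change, substituting the ID shown by `snap changes`
sudo snap abort <change-id>
# Then retry the refresh manually
sudo snap refresh microk8s
```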