The goal is to set up a Kubernetes cluster with 4 Raspberry Pis, one master and 3 workers. I am following this guide. After the initial setup the cluster works fine, but it is rendered useless after a restart. After some investigation I found that the Docker daemon does not come back up after a reboot, which prevents the necessary Kubernetes containers from starting. In addition, my filesystem goes into read-only mode after a restart. The output of sudo service docker status
shows the following:
docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2019-10-03 12:23:57 CEST; 21min ago
Docs: https://docs.docker.com
Process: 1126 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/
Main PID: 1126 (code=exited, status=1/FAILURE)
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Stopped Docker Application Container Engine.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Start request repeated too quickly.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.service: Failed with result 'exit-code'.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Failed to start Docker Application Container Engine.
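Since the filesystem also comes up read-only after a reboot, I suspect dockerd may simply be failing because /var/lib/docker is not writable. A quick check like the following (just a sketch; it assumes the usual Raspbian layout where the root partition is /dev/mmcblk0p2) should show whether that is the case:
# Check whether the root filesystem is currently mounted read-only
findmnt -n -o OPTIONS /
# If it reports "ro", try remounting it read-write for this session
sudo mount -o remount,rw /
# Inspect the filesystem state recorded on the root partition
# (assumes /dev/mmcblk0p2 is the root partition, the Raspbian default)
sudo tune2fs -l /dev/mmcblk0p2 | grep -i state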
Trying to run any docker command results in
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Docker was installed via curl -sSL get.docker.com | sh && sudo usermod pi -aG docker && newgrp docker.
I can't even uninstall it, because it was not installed via apt-get:
sudo apt-get remove docker
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package 'docker' is not installed, so not removed
0 upgraded, 0 newly installed, 0 to remove and 62 not upgraded.
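If I understand the get.docker.com convenience script correctly, it installs packages named docker-ce, docker-ce-cli and containerd.io through apt rather than a package literally called docker, so something along these lines should list and remove what is actually installed (the package names are my assumption, hence the dpkg check first):
# List every Docker/containerd related package actually installed
dpkg -l | grep -Ei 'docker|containerd'
# Remove the packages the convenience script typically installs
# (docker-ce, docker-ce-cli, containerd.io -- adjust to whatever dpkg listed)
sudo apt-get purge docker-ce docker-ce-cli containerd.io
sudo rm -rf /var/lib/docker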
The output of journalctl -xe
is
The unit docker.service has entered the 'failed' state with result 'exit-code'.
Oct 03 12:23:57 k8smaster-2 systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: A start job for unit docker.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- A start job for unit docker.service has finished with a failure.
--
-- The job identifier is 1060 and the job result is failed.
Oct 03 12:23:57 k8smaster-2 systemd[1]: docker.socket: Failed with result 'service-start-limit-hit'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
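The 'service-start-limit-hit' part makes me think systemd simply gave up after the quick restart attempts, so the actual dockerd error is probably further up in the log. What I would try next (just a sketch of standard commands, nothing specific to my setup) is resetting the failure counter and running the daemon in the foreground to see the real error:
# Clear the start-limit state so the unit can be started again
sudo systemctl reset-failed docker.service
# Show the full log of the last failed start attempts
journalctl -u docker.service --no-pager -n 100
# Run the daemon in the foreground with debug output to see why it exits
sudo dockerd --debug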
I googled the various errors I was getting, but they only led to GitHub issues that were 2 years old and either unsolved or solved with a solution that didn't help me. (Ref. this, this and this)
I also tried sudo systemctl enable docker
to get an automatic start on boot, but I do not think that is the problem. It looks like a configuration problem that could be solved with a fresh install, which is exactly what I need to avoid if I want to run a Kubernetes cluster that can shut down and come back up gracefully. I really hope someone can help me.
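If it helps, the enable state and boot ordering can be checked with the standard systemctl queries below (the unit names are the defaults installed by the Docker package, and the grep pattern is just a guess at typical SD-card/ext4 error messages):
# Confirm the units really are enabled for boot
systemctl is-enabled docker.service docker.socket
# Show what docker.service waits for at boot (network, containerd, ...)
systemctl list-dependencies --after docker.service
# Check whether the last boot logged filesystem errors that would explain the read-only /
journalctl -b -p err --no-pager | grep -iE 'ext4|mmcblk|remount'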