I have set up a Vagrant VM and installed Kubernetes and Helm on it.
vagrant@vagrant:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-19T00:05:56Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.8", GitCommit:"c138b85178156011dc934c2c9f4837476876fb07", GitTreeState:"clean", BuildDate:"2018-05-21T18:53:18Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
vagrant@vagrant:~$ helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
After the first vagrant up that creates the VM, Tiller has no issues. I power off the VM with vagrant halt and bring it back up with vagrant up. Then Tiller starts to misbehave: it racks up restarts and at some point enters a CrashLoopBackOff state.
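The exact cycle that triggers it is just:

vagrant up      # first boot: Tiller is healthy
vagrant halt    # power off the VM
vagrant up      # second boot: Tiller starts crashing

After the second vagrant up, kubectl get pods -n kube-system shows: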
NAME READY STATUS RESTARTS AGE
etcd-vagrant 1/1 Running 2 1h
heapster-5449cf95bd-h9xk8 1/1 Running 2 1h
kube-apiserver-vagrant 1/1 Running 2 1h
kube-controller-manager-vagrant 1/1 Running 2 1h
kube-dns-6f4fd4bdf-xclbb 3/3 Running 6 1h
kube-proxy-8n8tc 1/1 Running 2 1h
kube-scheduler-vagrant 1/1 Running 2 1h
kubernetes-dashboard-5bd6f767c7-lrdjp 1/1 Running 3 1h
tiller-deploy-78f96d6f9-cswbm 0/1 CrashLoopBackOff 8 38m
weave-net-948jt 2/2 Running 5 1h
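To dig further, the logs of the last crashed instance can be pulled with --previous (a sketch, using the pod name from the listing above):

kubectl -n kube-system logs tiller-deploy-78f96d6f9-cswbm --previous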
I take a look at the pod's events and see that the Liveness and Readiness probes are failing.
vagrant@vagrant:~$ kubectl describe pod tiller-deploy-78f96d6f9-cswbm -n kube-system
Name: tiller-deploy-78f96d6f9-cswbm
Namespace: kube-system
Node: vagrant/10.0.2.15
Start Time: Wed, 23 May 2018 08:51:54 +0000
Labels: app=helm
name=tiller
pod-template-hash=349528295
Annotations: <none>
Status: Running
IP: 10.32.0.28
Controlled By: ReplicaSet/tiller-deploy-78f96d6f9
Containers:
tiller:
Container ID: docker://389470b95c46f0a5ba6b4b5457f212b0e6f3e3a754beb1aeae835260de3790a7
Image: gcr.io/kubernetes-helm/tiller:v2.9.1
Image ID: docker-pullable://gcr.io/kubernetes-helm/tiller@sha256:417aae19a0709075df9cc87e2fcac599b39d8f73ac95e668d9627fec9d341af2
Ports: 44134/TCP, 44135/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Wed, 23 May 2018 09:26:53 +0000
Finished: Wed, 23 May 2018 09:27:12 +0000
Ready: False
Restart Count: 8
Liveness: http-get http://:44135/liveness delay=1s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:44135/readiness delay=1s timeout=1s period=10s #success=1 #failure=3
Environment:
TILLER_NAMESPACE: kube-system
TILLER_HISTORY_MAX: 0
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-fl44z (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-fl44z:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-fl44z
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulMountVolume 38m kubelet, vagrant MountVolume.SetUp succeeded for volume "default-token-fl44z"
Normal Scheduled 38m default-scheduler Successfully assigned tiller-deploy-78f96d6f9-cswbm to vagrant
Normal Pulled 29m (x2 over 38m) kubelet, vagrant Container image "gcr.io/kubernetes-helm/tiller:v2.9.1" already present on machine
Normal Killing 29m kubelet, vagrant Killing container with id docker://tiller:Container failed liveness probe.. Container will be killed and recreated.
Normal Created 29m (x2 over 38m) kubelet, vagrant Created container
Normal Started 29m (x2 over 38m) kubelet, vagrant Started container
Warning Unhealthy 28m (x2 over 37m) kubelet, vagrant Readiness probe failed: Get http://10.32.0.19:44135/readiness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 17m (x30 over 37m) kubelet, vagrant Liveness probe failed: Get http://10.32.0.19:44135/liveness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Normal SuccessfulMountVolume 11m kubelet, vagrant MountVolume.SetUp succeeded for volume "default-token-fl44z"
Warning FailedCreatePodSandBox 10m (x7 over 11m) kubelet, vagrant Failed create pod sandbox.
Normal SandboxChanged 10m (x8 over 11m) kubelet, vagrant Pod sandbox changed, it will be killed and re-created.
Normal Pulled 10m kubelet, vagrant Container image "gcr.io/kubernetes-helm/tiller:v2.9.1" already present on machine
Normal Created 10m kubelet, vagrant Created container
Normal Started 10m kubelet, vagrant Started container
Warning Unhealthy 10m kubelet, vagrant Liveness probe failed: Get http://10.32.0.28:44135/liveness: dial tcp 10.32.0.28:44135: getsockopt: connection refused
Warning Unhealthy 10m kubelet, vagrant Readiness probe failed: Get http://10.32.0.28:44135/readiness: dial tcp 10.32.0.28:44135: getsockopt: connection refused
Warning Unhealthy 8m (x2 over 9m) kubelet, vagrant Liveness probe failed: Get http://10.32.0.28:44135/liveness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 8m (x2 over 9m) kubelet, vagrant Readiness probe failed: Get http://10.32.0.28:44135/readiness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning BackOff 1m (x22 over 7m) kubelet, vagrant Back-off restarting failed container
After entering this state, it stays there. Only after I delete the Tiller pod does it come back up, and then everything runs smoothly.
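For reference, I delete it by label rather than by the generated pod name, using the name=tiller label from the describe output above, so the ReplicaSet schedules a fresh pod:

kubectl -n kube-system delete pod -l name=tiller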
vagrant@vagrant:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
etcd-vagrant 1/1 Running 2 1h
heapster-5449cf95bd-h9xk8 1/1 Running 2 1h
kube-apiserver-vagrant 1/1 Running 2 1h
kube-controller-manager-vagrant 1/1 Running 2 1h
kube-dns-6f4fd4bdf-xclbb 3/3 Running 6 1h
kube-proxy-8n8tc 1/1 Running 2 1h
kube-scheduler-vagrant 1/1 Running 2 1h
kubernetes-dashboard-5bd6f767c7-lrdjp 1/1 Running 4 1h
tiller-deploy-78f96d6f9-tgx4z 1/1 Running 0 7m
weave-net-948jt 2/2 Running 5 1h
However, the new pod's events still show the same Unhealthy warnings.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m default-scheduler Successfully assigned tiller-deploy-78f96d6f9-tgx4z to vagrant
Normal SuccessfulMountVolume 8m kubelet, vagrant MountVolume.SetUp succeeded for volume "default-token-fl44z"
Normal Pulled 7m kubelet, vagrant Container image "gcr.io/kubernetes-helm/tiller:v2.9.1" already present on machine
Normal Created 7m kubelet, vagrant Created container
Normal Started 7m kubelet, vagrant Started container
Warning Unhealthy 7m kubelet, vagrant Readiness probe failed: Get http://10.32.0.28:44135/readiness: dial tcp 10.32.0.28:44135: getsockopt: connection refused
Warning Unhealthy 7m kubelet, vagrant Liveness probe failed: Get http://10.32.0.28:44135/liveness: dial tcp 10.32.0.28:44135: getsockopt: connection refused
Warning Unhealthy 1m (x6 over 3m) kubelet, vagrant Liveness probe failed: Get http://10.32.0.28:44135/liveness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 41s (x14 over 7m) kubelet, vagrant Readiness probe failed: Get http://10.32.0.28:44135/readiness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
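Given the 1s probe timeout shown in the describe output, one workaround I am considering is relaxing the probe timing (an untested sketch; it assumes the stock tiller-deploy Deployment created by helm init):

kubectl -n kube-system patch deployment tiller-deploy --type=json -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/timeoutSeconds", "value": 5},
  {"op": "replace", "path": "/spec/template/spec/containers/0/readinessProbe/timeoutSeconds", "value": 5}
]'

But I would rather understand why the probes start timing out only after the halt/up cycle.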
Any insight would be appreciated.