Kubernetes HA leader control plane services restarting

9/13/2017

While following kubernetes-the-hard-way and porting it to local Ubuntu 16.04 VMs, bringing up the HA control plane is giving me problems. The kube-apiserver and kube-controller-manager services for the elected leader keep failing and restarting but the control planes on the other two non-leader masters come up fine. This behavior follows leader around the cluster.

EtcD looks happy:

member 7f44a7567a5e995 is healthy: got healthy result from https://10.1.15.117:2379
member 43d2258c438cbf4e is healthy: got healthy result from https://10.1.15.116:2379
member a83b22e9f907f471 is healthy: got healthy result from https://10.1.15.115:2379
cluster is healthy

I've verified the current leader by:

kubectl get endpoints kube-controller-manager --namespace=kube-system  -o yaml


apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"df-dev-kube-test02","leaseDurationSeconds":15,"acquireTime":"2017-09-13T17:15:06Z","renewTime":"2017-09-13T18:10:46Z","leaderTransitions":7}'
  creationTimestamp: 2017-09-13T14:59:57Z
  name: kube-controller-manager
  namespace: kube-system
  resourceVersion: "7814"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
  uid: 3599a959-9894-11e7-a34f-005056b24128
subsets: []

kube-apiserver logs: Pastebin apiserver logs

kube-controller-manager logs: Pastebin controller logs

-- MrShushhh
kubernetes

1 Answer

9/15/2017

Considering the various "storage" related errors in the logs and since this is related to leader elections, I thought perhaps the issue might've been etcd related. Etcd kept reporting fine though. For grins I removed --experimental-encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\from the api-server.service and now the control pane is happy.

I'm not sure if it was the encryption itself that was breaking things or if my automation of the encryption hosed it.

-- MrShushhh
Source: StackOverflow