Kubernetes: How are etcd component services health checked?

6/15/2016

I have a k8s cluster in AWS that looks partially up, but won't actually do deployments. When looking at the health of components, etcd is shown as unhealthy. This looks like an issue with the etcd endpoints being queried over http instead of https:

kubectl --kubeconfig=Lab_42/kubeconfig.yaml get componentstatuses --namespace=default
NAME                 STATUS      MESSAGE                                                                                                 ERROR
controller-manager   Healthy     ok                                                                                                      
scheduler            Healthy     ok                                                                                                      
etcd-2               Unhealthy   Get http://ip-10-42-2-50.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02"   
etcd-1               Unhealthy   Get http://ip-10-42-2-41.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02"   
etcd-0               Unhealthy   Get http://ip-10-42-2-40.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02" 
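
The "\x15\x03\x01\x00\x02\x02" bytes are a TLS alert record, which is what comes back when a plain-HTTP request hits a TLS-only port. For reference, querying the health endpoint over https with curl should return {"health": "true"} if a member is up (cert paths and hostname as in my setup; adjust for your cluster):

curl --cacert /etc/ssl/etcd/etcd-ca.pem \
  --cert /etc/ssl/etcd/etcd-client.pem \
  --key /etc/ssl/etcd/etcd-client-key.pem \
  https://ip-10-42-2-50.ec2.internal:2379/health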

I'm not using the --ca-config option; instead I'm passing the TLS values directly to the apiserver invocation. My apiserver config:

command:
  - /hyperkube
  - apiserver
  - --advertise-address=10.42.2.50
  - --admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota
  - --allow-privileged=true
  - --authorization-mode=AlwaysAllow
  - --bind-address=0.0.0.0
  - --client-ca-file=/etc/ssl/kubernetes/k8s-ca.pem
  - --etcd-cafile=/etc/ssl/etcd/etcd-ca.pem
  - --etcd-certfile=/etc/ssl/etcd/etcd-client.pem
  - --etcd-keyfile=/etc/ssl/etcd/etcd-client-key.pem
  - --etcd-servers=https://127.0.0.1:2379
  - --kubelet-certificate-authority=/etc/ssl/kubernetes/k8s-ca.pem
  - --kubelet-client-certificate=/etc/ssl/kubernetes/k8s-apiserver-client.pem
  - --kubelet-client-key=/etc/ssl/kubernetes/k8s-apiserver-client-key.pem
  - --kubelet-https=true
  - --logtostderr=true
  - --runtime-config=extensions/v1beta1/deployments=true,extensions/v1beta1/daemonsets=true,api/all
  - --secure-port=443
  - --service-account-lookup=false
  - --service-cluster-ip-range=10.3.0.0/24
  - --tls-cert-file=/etc/ssl/kubernetes/k8s-apiserver.pem
  - --tls-private-key-file=/etc/ssl/kubernetes/k8s-apiserver-key.pem

The real problem is that simple deployments don't do anything, and I'm not sure whether the unhealthy etcd status is the cause, since we have many other certificates in the mix.

kubectl --kubeconfig=Lab_42/kubeconfig.yaml get deployments --namespace=default
NAME               DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3         0         0            0           2h
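
To dig into why the deployment produces nothing, something like the following should surface the deployment's events and any replica sets it created (same kubeconfig and namespace as above):

kubectl --kubeconfig=Lab_42/kubeconfig.yaml describe deployment nginx-deployment --namespace=default
kubectl --kubeconfig=Lab_42/kubeconfig.yaml get events --namespace=default
kubectl --kubeconfig=Lab_42/kubeconfig.yaml get rs --namespace=default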

I can query etcd directly if I use the local https endpoint:

/usr/bin/etcdctl --ca-file /etc/ssl/etcd/etcd-ca.pem \
  --cert-file /etc/ssl/etcd/etcd-client.pem \
  --key-file /etc/ssl/etcd/etcd-client-key.pem \
  --endpoints 'https://127.0.0.1:2379' \
  get /registry/minions/ip-10-42-2-50.ec2.internal | jq "."
{
  "kind": "Node",
  "apiVersion": "v1",
  "metadata": {
    "name": "ip-10-42-2-50.ec2.internal",
    "selfLink": "/api/v1/nodes/ip-10-42-2-50.ec2.internal",
...SNIP
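
etcdctl should also be able to report cluster health over the same TLS endpoint, which is another way to rule etcd itself out (same flags as above):

/usr/bin/etcdctl --ca-file /etc/ssl/etcd/etcd-ca.pem \
  --cert-file /etc/ssl/etcd/etcd-client.pem \
  --key-file /etc/ssl/etcd/etcd-client-key.pem \
  --endpoints 'https://127.0.0.1:2379' \
  cluster-health
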
-- BeachGuru
kubernetes

2 Answers

6/22/2017

I think your kube-apiserver config is missing the --etcd-servers=xxx option.

-- sure ruan
Source: StackOverflow

6/16/2016

So it turns out that the component statuses were a red herring. The real problem was that my controller configuration was wrong: the master was set to http://master_ip:8080 instead of http://127.0.0.1:8080. The apiserver's insecure port is not exposed on external interfaces, so the controller could not connect.

Switching to either the insecure loopback endpoint or the secure :443 endpoint solved my problem.
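
For reference, a minimal sketch of what the corrected invocation might look like when pointed at the insecure loopback port (assuming the controller in question is kube-controller-manager run via hyperkube, as with the apiserver above; trimmed to the relevant flags):

command:
  - /hyperkube
  - controller-manager
  - --master=http://127.0.0.1:8080
  - --logtostderr=true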

When using the CoreOS hyperkube and kubelet-wrapper, you lose out on the automatically symlinked container logs in /var/log/containers. To find them, you can do something like:

ls -latr /var/lib/docker/containers/*/*-json.log

I was actually able to see the errors causing my problem this way.
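
The directory name in that path is the Docker container ID, so once you spot the right log you can also follow it live (assuming the Docker CLI is available on the node; <container-id> is whichever ID you found):

docker ps -a --format '{{.ID}} {{.Names}}'
docker logs --tail=100 -f <container-id>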

-- BeachGuru
Source: StackOverflow