I have a k8s cluster in AWS that looks partially up, but won't actually run deployments. When I check the health of the components, etcd shows as unhealthy. It looks like the etcd endpoints are being queried over http instead of https:
kubectl --kubeconfig=Lab_42/kubeconfig.yaml get componentstatuses --namespace=default
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-2 Unhealthy Get http://ip-10-42-2-50.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
etcd-1 Unhealthy Get http://ip-10-42-2-41.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
etcd-0 Unhealthy Get http://ip-10-42-2-40.ec2.internal:2379/health: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
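For what it's worth, the "\x15\x03\x01..." bytes are the start of a TLS alert record, so the etcd members appear to only speak HTTPS while the health check is connecting over plain HTTP. A rough way to confirm that from the master is something like the following (using curl and reusing the client cert paths from my apiserver config below; adjust if yours differ):
curl http://ip-10-42-2-50.ec2.internal:2379/health
# plain http gets rejected (etcd answers with a TLS alert, the bytes in the error above)
curl --cacert /etc/ssl/etcd/etcd-ca.pem \
  --cert /etc/ssl/etcd/etcd-client.pem \
  --key /etc/ssl/etcd/etcd-client-key.pem \
  https://ip-10-42-2-50.ec2.internal:2379/health
# should return etcd's health JSON if the certs and TLS setup are fine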
I'm not using the --ca-config option; instead I'm putting the config values directly into the apiserver invocation. My apiserver config:
command:
- /hyperkube
- apiserver
- --advertise-address=10.42.2.50
- --admission_control=NamespaceLifecycle,NamespaceAutoProvision,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota
- --allow-privileged=true
- --authorization-mode=AlwaysAllow
- --bind-address=0.0.0.0
- --client-ca-file=/etc/ssl/kubernetes/k8s-ca.pem
- --etcd-cafile=/etc/ssl/etcd/etcd-ca.pem
- --etcd-certfile=/etc/ssl/etcd/etcd-client.pem
- --etcd-keyfile=/etc/ssl/etcd/etcd-client-key.pem
- --etcd-servers=https://127.0.0.1:2379
- --kubelet-certificate-authority=/etc/ssl/kubernetes/k8s-ca.pem
- --kubelet-client-certificate=/etc/ssl/kubernetes/k8s-apiserver-client.pem
- --kubelet-client-key=/etc/ssl/kubernetes/k8s-apiserver-client-key.pem
- --kubelet-https=true
- --logtostderr=true
- --runtime-config=extensions/v1beta1/deployments=true,extensions/v1beta1/daemonsets=true,api/all
- --secure-port=443
- --service-account-lookup=false
- --service-cluster-ip-range=10.3.0.0/24
- --tls-cert-file=/etc/ssl/kubernetes/k8s-apiserver.pem
- --tls-private-key-file=/etc/ssl/kubernetes/k8s-apiserver-key.pem
The underlying problem is that simple deployments don't do anything at all, and I'm not sure whether the unhealthy etcd is the cause, since we have many other certificates in the mix.
kubectl --kubeconfig=Lab_42/kubeconfig.yaml get deployments --namespace=default
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 3 0 0 0 2h
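For anyone chasing the same symptom, describing the deployment and dumping events (same kubeconfig and namespace as above) is a reasonable next step; in my case the errors that mattered ultimately showed up in the container logs instead (see the end of this post):
kubectl --kubeconfig=Lab_42/kubeconfig.yaml describe deployment nginx-deployment --namespace=default
kubectl --kubeconfig=Lab_42/kubeconfig.yaml get events --namespace=default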
I can query etcd directly if I use the local https endpoint:
/usr/bin/etcdctl --ca-file /etc/ssl/etcd/etcd-ca.pem --cert-file /etc/ssl/etcd/etcd-client.pem --key-file /etc/ssl/etcd/etcd-client-key.pem \
--endpoints 'https://127.0.0.1:2379' \
get /registry/minions/ip-10-42-2-50.ec2.internal | jq "."
{
"kind": "Node",
"apiVersion": "v1",
"metadata": {
"name": "ip-10-42-2-50.ec2.internal",
"selfLink": "/api/v1/nodes/ip-10-42-2-50.ec2.internal",
...SNIP
I think your kube-apiserver config is missing the --etcd-servers=xxx option.
So it turns out that the component statuses were a red herring. The real problem was that my controller-manager configuration was wrong: the master was set to http://master_ip:8080 instead of http://127.0.0.1:8080. The apiserver's insecure port is not exposed on external interfaces, so the controller-manager could not connect.
Switching to either the insecure loopback endpoint or the secure port on :443 solved my problem.
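For reference, a minimal sketch of what the corrected controller-manager command looks like, in the same hyperkube style as the apiserver manifest above (only the relevant flags shown; the rest of the manifest is omitted):
command:
- /hyperkube
- controller-manager
# was --master=http://master_ip:8080, which is not reachable from outside the host
- --master=http://127.0.0.1:8080
- --logtostderr=true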
When using the CoreOS hyperkube image and kubelet-wrapper, you lose out on the automatically linked container logs in /var/log/containers. To find them, you can do something like:
ls -latr /var/lib/docker/containers/*/*-json.log
I was actually able to see the errors causing my problem this way.
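Those files are JSON lines from Docker's json-file log driver, so something along these lines makes them easier to read (<container_id> is a placeholder):
tail -n 50 /var/lib/docker/containers/<container_id>/<container_id>-json.log | jq -r '.log'
# and to map a container ID back to its name:
docker inspect --format '{{.Name}}' <container_id>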