Kubernetes 1.14.2 HA Master NGINX load balancer log.go:172] http: TLS handshake error from 192.168.5.32:43148: remote error: tls: bad certificate

5/26/2019

This is driving me crazy. I am no Kubernetes expert, but I am not a novice either.

I have tried unsuccessfully for three days to get past this issue, and I am at the end of my rope.

I can query the cluster from my desktop after copying the certificates from the API server (kube-apiserver-1:/etc/kubernetes/pki/*) to my desktop:

$ kubectl -n kube-system get nodes
NAME               STATUS   ROLES    AGE   VERSION
kube-apiserver-1   Ready    master   71m   v1.14.2
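For reference, the kubeconfig used from the desktop points at the load balancer's DNS name and references the copied certificates. A minimal sketch (the file paths and cluster/user names below are illustrative, not taken from the original post):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    certificate-authority: /home/me/pki/ca.crt        # copied from kube-apiserver-1
    server: https://kube-apiserver.mydomain.com:6443  # the NGINX load balancer
users:
- name: kubernetes-admin
  user:
    client-certificate: /home/me/pki/admin.crt        # illustrative client cert
    client-key: /home/me/pki/admin.key
contexts:
- name: kubernetes-admin@kubernetes
  context:
    cluster: kubernetes
    user: kubernetes-admin
current-context: kubernetes-admin@kubernetes
```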

The Kubernetes cluster appears healthy when I query the kube-system pods:

$ kubectl -n kube-system get pods
NAME                                       READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-6c85q                    1/1     Running   3          65m
coredns-fb8b8dccf-qwxlp                    1/1     Running   3          65m
kube-apiserver-kube-apiserver-1            1/1     Running   2          72m
kube-controller-manager-kube-apiserver-1   1/1     Running   2          72m
kube-flannel-ds-amd64-phntk                1/1     Running   2          62m
kube-proxy-swxrz                           1/1     Running   2          65m
kube-scheduler-kube-apiserver-1            1/1     Running   1          54m

but when I check the logs of the kube-apiserver pod, I see a flood of TLS handshake errors:

$ kubectl -n kube-system logs kube-apiserver-kube-apiserver-1 
...
I0526 04:33:51.523828       1 log.go:172] http: TLS handshake error from 192.168.5.32:43122: remote error: tls: bad certificate
I0526 04:33:51.537258       1 log.go:172] http: TLS handshake error from 192.168.5.32:43124: remote error: tls: bad certificate
I0526 04:33:51.540617       1 log.go:172] http: TLS handshake error from 192.168.5.32:43126: remote error: tls: bad certificate
I0526 04:33:52.333817       1 log.go:172] http: TLS handshake error from 192.168.5.32:43130: remote error: tls: bad certificate
I0526 04:33:52.334354       1 log.go:172] http: TLS handshake error from 192.168.5.32:43128: remote error: tls: bad certificate
I0526 04:33:52.335570       1 log.go:172] http: TLS handshake error from 192.168.5.32:43132: remote error: tls: bad certificate
I0526 04:33:52.336703       1 log.go:172] http: TLS handshake error from 192.168.5.32:43134: remote error: tls: bad certificate
I0526 04:33:52.338792       1 log.go:172] http: TLS handshake error from 192.168.5.32:43136: remote error: tls: bad certificate
I0526 04:33:52.391557       1 log.go:172] http: TLS handshake error from 192.168.5.32:43138: remote error: tls: bad certificate
I0526 04:33:52.396566       1 log.go:172] http: TLS handshake error from 192.168.5.32:43140: remote error: tls: bad certificate
I0526 04:33:52.519666       1 log.go:172] http: TLS handshake error from 192.168.5.32:43142: remote error: tls: bad certificate
I0526 04:33:52.524702       1 log.go:172] http: TLS handshake error from 192.168.5.32:43144: remote error: tls: bad certificate
I0526 04:33:52.537127       1 log.go:172] http: TLS handshake error from 192.168.5.32:43146: remote error: tls: bad certificate
I0526 04:33:52.550177       1 log.go:172] http: TLS handshake error from 192.168.5.32:43150: remote error: tls: bad certificate
I0526 04:33:52.550613       1 log.go:172] http: TLS handshake error from 192.168.5.32:43148: remote error: tls: bad certificate

On the NGINX load balancer (IP: 192.168.5.32) I have configured the TCP passthrough option as specified in the Kubernetes documentation:

upstream kubernetes-api-cluster {
   server 192.168.5.19:6443;
   server 192.168.5.29:6443;
}
server {
   listen 6443;
   ssl_certificate /etc/nginx/ssl/kube-apiserver.pem;
   ssl_certificate_key /etc/nginx/ssl/private/kube-apiserver.key;
   ssl_prefer_server_ciphers on;
   ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
   ssl_ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS;
   proxy_pass kubernetes-api-cluster;
}
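For comparison, a true TCP passthrough setup would live in NGINX's `stream` context rather than an `http` server block, with no `ssl_*` directives at all, so that the client's TLS session (including its client certificate) terminates at the apiserver instead of at the load balancer. A minimal sketch along the lines of the Kubernetes HA guidance (top-level `nginx.conf` context assumed):

```nginx
# Layer-4 (stream) proxy: no TLS termination at the load balancer,
# the client certificate passes through untouched to the apiserver.
stream {
    upstream kubernetes-api-cluster {
        server 192.168.5.19:6443;
        server 192.168.5.29:6443;
    }
    server {
        listen 6443;
        proxy_pass kubernetes-api-cluster;
    }
}
```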

I can query the API server directly from the NGINX LB (IP: 192.168.5.32):

$ curl -v https://192.168.5.29:6443
* Rebuilt URL to: https://192.168.5.29:6443/
*   Trying 192.168.5.29...
* TCP_NODELAY set
* Connected to 192.168.5.29 (192.168.5.29) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=kube-apiserver
*  start date: May 26 03:39:36 2019 GMT
*  expire date: May 25 03:39:36 2020 GMT
*  subjectAltName: host "192.168.5.29" matched cert's IP address!
*  issuer: CN=kubernetes
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55840f1d9900)
> GET / HTTP/2
> Host: 192.168.5.29:6443
> User-Agent: curl/7.58.0
> Accept: */*

I can also query the API using the DNS entry for the load balancer, as specified in the documentation:

curl -v https://kube-apiserver.mydomain.com:6443
* Rebuilt URL to: https://kube-apiserver.mydomain.com:6443/
*   Trying 10.50.1.50...
* TCP_NODELAY set
* Connected to kube-apiserver.mydomain.com (10.50.1.50) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=kube-apiserver
*  start date: May 26 03:39:36 2019 GMT
*  expire date: May 25 03:39:36 2020 GMT
*  subjectAltName: host "kube-apiserver.mydomain.com" matched cert's "kube-apiserver.mydomain.com"
*  issuer: CN=kubernetes
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x564287cbd900)
> GET / HTTP/2
> Host: kube-apiserver.mydomain.com:6443
> User-Agent: curl/7.58.0
> Accept: */*

I can query the API server with curl from the API server itself as well:

curl -v https://kube-apiserver.mydomain.com:6443
* Rebuilt URL to: https://kube-apiserver.mydomain.com:6443/
*   Trying 10.50.1.50...
* TCP_NODELAY set
* Connected to kube-apiserver.mydomain.com (10.50.1.50) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Request CERT (13):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Certificate (11):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=kube-apiserver
*  start date: May 26 03:39:36 2019 GMT
*  expire date: May 25 03:39:36 2020 GMT
*  subjectAltName: host "kube-apiserver.mydomain.com" matched cert's "kube-apiserver.mydomain.com"
*  issuer: CN=kubernetes
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x5628b9dbc900)
> GET / HTTP/2
> Host: kube-apiserver.mydomain.com:6443
> User-Agent: curl/7.58.0
> Accept: */*

The kube-apiserver manifest on the API server contains:

cat /etc/kubernetes/manifests/kube-apiserver.yaml
...
  - command:
    - kube-apiserver
    - --advertise-address=192.168.5.29
    - --allow-privileged=true
    - --authorization-mode=Node,RBAC
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --enable-admission-plugins=NodeRestriction
    - --enable-bootstrap-token-auth=true
    - --etcd-servers=http://etcd-cluster.mydomain.com:2379
    - --insecure-port=0
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
    - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
    - --requestheader-allowed-names=front-proxy-client
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --secure-port=6443
    - --service-account-key-file=/etc/kubernetes/pki/sa.pub
    - --service-cluster-ip-range=10.96.0.0/12
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    image: k8s.gcr.io/kube-apiserver:v1.14.2
    imagePullPolicy: IfNotPresent
...
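One thing worth checking alongside the manifest is whether the serving certificate's subject alternative names (SANs) cover every name and IP that clients connect to: the load balancer's DNS name, each apiserver IP, and the service cluster IP. A quick way to inspect SANs with openssl is sketched below; the throwaway self-signed certificate is only a stand-in for `/etc/kubernetes/pki/apiserver.crt`, and the `-addext` flag assumes OpenSSL 1.1.1 or newer:

```shell
# Generate a throwaway self-signed cert as a stand-in for apiserver.crt
# (on a real control-plane node, point the x509 command below at
#  /etc/kubernetes/pki/apiserver.crt instead)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/apiserver-demo.key -out /tmp/apiserver-demo.crt \
  -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=DNS:kube-apiserver.mydomain.com,IP:192.168.5.29,IP:10.96.0.1"

# Print the SANs -- every DNS name and IP clients use should appear here
openssl x509 -in /tmp/apiserver-demo.crt -noout -text | \
  grep -A1 "Subject Alternative Name"
```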

If you have any ideas or hints on how to fix this, I am all ears. I am very frustrated with this issue; it has really gotten to me at this point. I will continue to work on it, but if anyone has a clue and can help, that would be great.

Thank you.

-- Daniel Maldonado
Tags: handshake, kubernetes, ssl

3 Answers

5/27/2019

This is more of a troubleshooting idea to narrow down the source of the problem. If you can run:

kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes 

from the API server and you get a response, then the problem is NOT the load balancer. To further prove this, you can copy the appropriate certificate and kubeconfig files to a remote workstation and run the same command:

kubectl --kubeconfig [workstation location]/admin.conf get nodes

This second test implies that the workstation reaches the API through the load balancer.
If it works too, you have confirmation that client certificates are being passed through the TCP load balancer correctly.

However, the logged error will persist: the load balancer performs an availability (health) check against the backend servers, and that check does not present a client certificate, which is what produces the TLS handshake error.

-- Daniel Maldonado
Source: StackOverflow

5/26/2019

Your current NGINX config isn't setting up a client certificate. ssl_certificate is the server certificate; if you want NGINX to present the incoming client certificate to kubernetes-api-cluster, you'll have to configure it to forward that certificate. I've previously done this using proxy_set_header X-SSL-CERT $ssl_client_escaped_cert (see the NGINX documentation for $ssl_client_escaped_cert).

upstream kubernetes-api-cluster { 
    server 192.168.5.19:6443;
    server 192.168.5.29:6443; 
} 

server { 
    listen 6443;
    ssl_certificate /etc/nginx/ssl/kube-apiserver.pem;
    ssl_certificate_key /etc/nginx/ssl/private/kube-apiserver.key;
    ssl_prefer_server_ciphers on;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+3DES:DH+3DES:RSA+AESGCM:RSA+AES:RSA+3DES:!aNULL:!MD5:!DSS;
    proxy_pass kubernetes-api-cluster; 

    #forward incoming client certificate
    ssl_verify_client optional; #requests the client certificate and verifies it if the certificate is present
    proxy_set_header X-SSL-CERT $ssl_client_escaped_cert;
}
-- Misantorp
Source: StackOverflow

6/7/2019

The actual root cause of the original issue was (quoting the author of this post, @Daniel Maldonado):

This was my mistake: I had a firewall configuration error, and all tests indicated that it was the load balancer probing the kube-apiserver when in fact it was not. The issue was entirely local to the API server itself. If anyone gets to this point, please verify that ALL required ports are reachable on the API server from itself, i.e. over loopback.

-- Nepomucen
Source: StackOverflow