kubernetes pods http: TLS handshake error from x.x.x.x:38676: EOF

2/14/2020

When starting cert-manager, I get the following message in the webhook pod's log:

TLS handshake error from 10.42.152.128:38676: EOF

$ kubectl -n cert-manager logs cert-manager-webhook-8575f88c85-l4tlw
I0214 19:41:28.147106       1 main.go:64]  "msg"="enabling TLS as certificate file flags specified"  
I0214 19:41:28.147365       1 server.go:126]  "msg"="listening for insecure healthz connections"  "address"=":6080"
I0214 19:41:28.147418       1 server.go:138]  "msg"="listening for secure connections"  "address"=":10250"
I0214 19:41:28.147437       1 server.go:155]  "msg"="registered pprof handlers"  
I0214 19:41:28.147570       1 tls_file_source.go:144]  "msg"="detected private key or certificate data on disk has changed. reloading certificate"  
2020/02/14 19:43:32 http: TLS handshake error from 10.42.152.128:38676: EOF

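Interestingly, there is no pod with that IP (the only match for 128 is the webhook pod itself, whose address is 10.42.112.128, not 10.42.152.128):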

$ kubectl get pod -o wide --all-namespaces | grep 128
cert-manager    cert-manager-webhook-8575f88c85-l4tlw             1/1     Running     0          4m56s   10.42.112.128   node002   <none>           <none>

There is a similar error in the cert-manager pod:

E0214 19:38:22.540589       1 controller.go:131] cert-manager/controller/ingress-shim "msg"="re-queuing item  due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout" "key"="kube-system/dashboard-kubernetes-dashboard"

I have two ClusterIssuers:

kubectl get ClusterIssuer --namespace cert-manager
NAME              READY   AGE
letsencrypt-prd   True    42d
letsencrypt-stg   True    42d

But no certificate yet:

kubectl get certificate --all-namespaces
No resources found

When I try to request a certificate, I get the same error:

kubectl apply -f mycert.yml
Error from server (InternalError): error when creating "cert-wyssmann-dev.yml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout

I am not sure how I can get to the bottom of the problem. I ran sonobuoy to see if it would help, but the tests failed on 2 of my 3 nodes.

Plugin: e2e
Status: failed
Total: 1
Passed: 0
Failed: 1
Skipped: 0

Failed tests:
Container e2e is in a terminated state (exit code 1) due to reason: Error: 

Plugin: systemd-logs
Status: failed
Total: 3
Passed: 1
Failed: 2
Skipped: 0

Failed tests:
timeout waiting for results

For the failing nodes, I can see this in the sonobuoy logs:

E0214 19:38:22.540589       1 controller.go:131] cert-manager/controller/ingress-shim "msg"="re-queuing item  due to error processing" "error"="Internal error occurred: failed calling webhook \"webhook.cert-manager.io\": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: TLS handshake timeout" "key"="kube-system/dashboard-kubernetes-dashboard"
-- papanito
cert-manager
kubernetes

1 Answer

2/15/2020

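If you don't actually need the webhook, one quick way to get past this error is to disable the webhook, as described in the cert-manager documentation. Note that this works around the TLS handshake timeout rather than fixing whatever is blocking the API server from reaching the webhook.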
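
As a rough sketch, and assuming cert-manager was installed with the jetstack Helm chart in a version that still exposes the `webhook.enabled` value (older charts did; newer releases require the webhook), disabling it could look something like the following. The release name, namespace and chart reference are assumptions, so adjust them to your installation. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Assumed values, not taken from the question: adjust to your setup.
RELEASE="cert-manager"          # Helm release name (assumption)
NAMESPACE="cert-manager"        # namespace seen in the kubectl output above
CHART="jetstack/cert-manager"   # chart reference (assumption)

# webhook.enabled=false is a chart value from older cert-manager charts;
# check the documentation for your version before relying on it.
HELM_CMD="helm upgrade $RELEASE $CHART --namespace $NAMESPACE --reuse-values --set webhook.enabled=false"

# Print the command for review instead of running it directly.
echo "$HELM_CMD"
```

Once the printed command looks right for your cluster, it can be run as-is. After the upgrade, the cert-manager-webhook pod should no longer be scheduled.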

-- Arghya Sadhu
Source: StackOverflow