Kubernetes API server dropping requests from a pod, causing dial errors

3/6/2016

I've upgraded Kubernetes to version 1.1.7 and got this error from one of my pods, which was calling the Kubernetes API server frequently to check the liveness status of every other pod.

Error #01: Get http://[api-server]:8080/api/v1/namespaces/production/pods?labelSelector=app%3Dworkflow-worker-mandrill-hook-handler: dial tcp [api-server]:8080: connect: cannot assign requested address

The requests were being sent at a rate of ~80 requests/second. While the error was occurring, I was still able to call that API from my local machine. Restarting the pod solved the issue, but the next day it happened again. It seems that the apiserver was blocking that pod to avoid a DoS?

I'm using Docker version 1.7.1, build 2c2c52b-dirty, and CoreOS v773.0.0:

Linux ***** 4.1.5-coreos #2 SMP Thu Aug 13 09:18:45 UTC 2015 x86_64 Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz GenuineIntel GNU/Linux

Kubernetes API server error log:

I0306 07:32:13.087599       1 logs.go:40] http: TLS handshake error from ***:60033: EOF
I0306 07:32:14.596398       1 logs.go:40] http: TLS handshake error from ***:57257: EOF
I0306 07:32:15.126962       1 logs.go:40] http: TLS handshake error from ***:60035: EOF
I0306 07:32:15.136445       1 logs.go:40] http: TLS handshake error from ***:60054: EOF
I0306 07:32:15.210656       1 logs.go:40] http: TLS handshake error from ***:45384: EOF
I0306 07:32:15.215155       1 logs.go:40] http: TLS handshake error from ***:45385: EOF
I0306 07:32:15.253877       1 logs.go:40] http: TLS handshake error from ***:37527: EOF
I0306 07:32:15.265899       1 logs.go:40] http: TLS handshake error from ***:57258: EOF
I0306 07:32:15.272564       1 logs.go:40] http: TLS handshake error from ***:57249: EOF
I0306 07:32:15.282808       1 logs.go:40] http: TLS handshake error from ***:59928: EOF

dmesg on the master node:

[Sun Mar  6 07:32:04 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:04 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:04 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:04 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:04 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:15 2016] net_ratelimit: 34 callbacks suppressed
[Sun Mar  6 07:32:15 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:18 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:18 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:18 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:21 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:21 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:21 2016] TCP: too many orphaned sockets
[Sun Mar  6 07:32:29 2016] TCP: too many orphaned sockets
-- Quyen Nguyen Tuan
coreos
kubernetes
sockets

1 Answer

3/11/2016

After 4 hours of investigating, it turned out to be my own application querying the k8s API server. It is written in Go and used the "gorequest" library to make REST calls to the API server.

gorequest didn't close the request after sending it, even though I closed it explicitly in the code, so each poll leaked a connection. It was also hard to check the number of open connections because the app runs inside a Docker container: normally it's enough to run ls /proc/PID/fd | wc -l on the host, but this time I had to get inside the container to check. In the end I switched to the standard "net/http" library instead of gorequest, and that solved the problem!

-- Quyen Nguyen Tuan
Source: StackOverflow