Only the apiserver talks directly to etcd. The etcd cluster has many hosts, and I would like to see which etcd host the apiserver is talking to. This may differ for each API resource, such as Pod or Node. Ideally I would see the etcd host for each request.
Specifically, I am using Kubernetes 1.6.13 and etcd 3.1.14 with the v3 store.
I have tried:
Enable etcd client and grpc logging on the Kubernetes apiserver.
I think grpc only logs on unexpected events, and the same goes for etcd clientv3. I was not able to get any information about the etcd side of the connection.
Enable http2 debug logging with GODEBUG=http2debug=2 on the apiserver.
To my surprise, the http2 debug logs print a lot of information about each request, but I could not find the remote endpoint in them. I am not completely sure about this; I may have missed a mention in the log files.
Debug logs on the etcd side.
Enabling debug logs as described in Enabling Debug Logging only prints v2 store accesses. For the v3 store one could use the http://<host>:2379/debug/requests endpoint, but that is not available in my version of etcd (3.1.14).
I have not yet tried GODEBUG=http2debug=2 on the etcd side. Maybe the http2 logs on etcd have the information I need.
tcpdump or tcpflow
The apiserver <-> etcd connection is encrypted. Would these show me the request URL? I do not think I saw that information in the dumps.
Man-in-the-middle the apiserver <-> etcd connection with mitmproxy. I do not think this should be that complicated.
I hope I have missed a super obvious and simple way to accomplish this.
Update:
About lsof-based approaches:
Using lsof, we can list the connections and their endpoints at a single point in time. I do not think lsof output contains enough information to determine the endpoint for each request. The apiserver opens a lot of connections to etcd. Looking at the code, that observation seems reasonable to me. See NewStorage in the Kubernetes source.
$ sudo lsof -p 20816 | grep :2379 | wc -l
130
The connections look like this:
$ sudo lsof -p 20816 | grep :2379 | head -n 5
hyperkube 20816 root 3u IPv4 58093240 0t0 TCP compute-master7001.dsv31.boxdc.net:36360->compute-etcd7001.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root 5u IPv4 58085987 0t0 TCP compute-master7001.dsv31.boxdc.net:26005->compute-etcd7002.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root 6u IPv4 58085988 0t0 TCP compute-master7001.dsv31.boxdc.net:55650->compute-etcd7003.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root 7u IPv4 58102030 0t0 TCP compute-master7001.dsv31.boxdc.net:36366->compute-etcd7001.dsv31.boxdc.net:2379 (ESTABLISHED)
hyperkube 20816 root 8u IPv4 58085990 0t0 TCP compute-master7001.dsv31.boxdc.net:55654->compute-etcd7003.dsv31.boxdc.net:2379 (ESTABLISHED)
........
Looking at this, I cannot know which etcd is used for each request between the apiserver and etcd.
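Although this does not give per-request information, the snapshot can at least be summarized per etcd member. The following is a minimal sketch, not a definitive tool: the function name tally_etcd_peers is made up for illustration, and it assumes the lsof NAME column has the local->remote form shown in the listing above and that etcd listens on port 2379.

```shell
# Tally connections per remote etcd endpoint from lsof output on stdin.
# Assumption: the NAME column looks like local:port->remote:2379,
# as in the listing above. tally_etcd_peers is a hypothetical helper name.
tally_etcd_peers() {
  awk '{
         for (i = 1; i <= NF; i++) {
           if ($i ~ /->.*:2379$/) {
             split($i, ends, "->")   # ends[2] is the remote host:port
             count[ends[2]]++
           }
         }
       }
       END { for (peer in count) print count[peer], peer }' | sort -k2
}
```

It could be used as `sudo lsof -p 20816 | tally_etcd_peers`, which prints one line per etcd endpoint with the number of open connections in front.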
Update:
I think that in the etcd v3 client code that ships with Kubernetes 1.6.13, the grpc.Balancer.Get function returns the endpoint address used for each grpc request. One could add a log statement there and make the apiserver log the etcd address per request.
Find the PID of the apiserver:
ps aux | grep apiserver
Then use lsof to see the open socket connections:
lsof -p $PID | grep :2379
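The two steps can be combined into one helper. This is only a sketch under stated assumptions: the function name list_etcd_conns is made up for illustration, pgrep must be installed, the apiserver's command line must contain the given pattern (default "apiserver"), and etcd must listen on port 2379.

```shell
# List the apiserver's TCP connections to etcd, one line per socket.
# Assumptions: pgrep is installed, the process command line matches the
# pattern (default "apiserver"), and etcd listens on port 2379.
# list_etcd_conns is a hypothetical helper name.
list_etcd_conns() {
  pattern=${1:-apiserver}
  for pid in $(pgrep -f "$pattern"); do
    # -a ANDs the filters; -n and -P skip host and port name lookups
    lsof -a -n -P -p "$pid" -i TCP:2379
  done
}
```

Calling `list_etcd_conns` (as root, or under sudo, if the apiserver runs as another user) prints the same kind of listing as above, with numeric addresses and ports. Like the manual steps, it shows a snapshot of open connections, not which connection served a given request.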