We would like some help debugging an issue we are seeing. Our kube-apiserver pods are continually restarting over time, exiting with code 0, sometimes multiple times in an hour. We are still able to connect to the API server and perform normal kubectl operations; the problem is the high, ongoing restart count, for example 471 restarts in 7 days.

Our investigation led us to the pod's liveness probe, which we switched from HTTP to TCP with no effect. We have also bumped the API server log verbosity up to 10 and can see the etcd healthz check failing multiple times in the logs. We are not certain that's directly related, but since the last failure occurs so close to the pod shutdown, it seems possible.
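For reference, the switched probe looks roughly like the fragment below. This is a sketch only; the host, port, and timing values are illustrative defaults (6443 is the usual secure port) rather than our exact manifest:

```yaml
# kube-apiserver static pod manifest (fragment) -- liveness probe after the change.
# This replaces the previous httpGet probe against /healthz with a plain TCP
# connect check, so the probe itself no longer depends on the etcd healthz check.
livenessProbe:
  tcpSocket:
    host: 127.0.0.1
    port: 6443          # default secure port; adjust to your deployment
  initialDelaySeconds: 15
  timeoutSeconds: 15
  failureThreshold: 8
```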
We've followed the etcd errors further, and to verify that this is indeed what is causing the restarts, we have excluded etcd from the healthz check (https://gitlab.dev.cncf.ci/kubernetes/kubernetes/commit/69be3057de702e93c2bee84d00f24e932c36590f) and are currently monitoring the apiserver.
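If a less invasive way to run this experiment turns out to be useful: as far as we can tell, newer releases (Kubernetes 1.16+) expose /livez and /readyz endpoints that accept an exclude query parameter, so individual checks such as etcd can be skipped from the probe configuration rather than a code patch. A hedged sketch, assuming a version with that support and the same placeholder host/port/timings as above:

```yaml
# Hypothetical liveness probe that keeps an HTTP health check but asks the
# apiserver to skip its etcd check; requires a release whose health endpoints
# (/livez, /readyz) support the `exclude` query parameter (1.16+).
livenessProbe:
  httpGet:
    host: 127.0.0.1
    path: /livez?exclude=etcd
    port: 6443
    scheme: HTTPS
  initialDelaySeconds: 15
  timeoutSeconds: 15
  failureThreshold: 8
```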
Any additional thoughts on what could be causing this continual/high restart count for the apiserver?