Kubernetes is complicated, and in some scenarios the kubelet can run into deadlocks after running for a long time.
Is there a way to dump the goroutine stack traces of a running kubelet?
The expected output looks like the following, which is very helpful for debugging deadlock-style issues in the kubelet.
goroutine 386 [chan send, 1140 minutes]:
k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).relist(0xc42069ea20)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:261 +0x74e
k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).(k8s.io/kubernetes/pkg/kubelet/pleg.relist)-fm()
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:130 +0x2a
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc4212ee520)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc4212ee520, 0x3b9aca00, 0x0, 0x1, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc4212ee520, 0x3b9aca00, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by k8s.io/kubernetes/pkg/kubelet/pleg.(*GenericPLEG).Start
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/pleg/generic.go:130 +0x88
...
goroutine 309 [sleep]:
time.Sleep(0x12a05f200)
/usr/local/go/src/runtime/time.go:102 +0x166
k8s.io/kubernetes/pkg/kubelet.(*Kubelet).syncLoop(0xc4205e3b00, 0xc420ff2780, 0x3e56a60, 0xc4205e3b00)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1777 +0x1e7
k8s.io/kubernetes/pkg/kubelet.(*Kubelet).Run(0xc4205e3b00, 0xc420ff2780)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1396 +0x27f
k8s.io/kubernetes/cmd/kubelet/app.startKubelet.func1()
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubelet/app/server.go:998 +0x67
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc42105dfb0)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x54
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc42105dfb0, 0x0, 0x0, 0x1, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134 +0xbd
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc42105dfb0, 0x0, 0xc420056540)
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by k8s.io/kubernetes/cmd/kubelet/app.startKubelet
/workspace/anago-v1.11.5-beta.0.24+753b2dbc622f5c/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubelet/app/server.go:996 +0xea
...
I would appreciate it if anyone could share their experience of dumping the goroutine stack traces of the kubelet, similar to what Docker provides[1]:
$ pkill -SIGUSR1 dockerd
[1]. https://success.docker.com/article/how-to-dump-goroutines-stacktraces
Option 1: pprof, which keeps the kubelet running. The kubelet exposes the standard Go net/http/pprof handlers on its debug endpoint, so you can fetch a full goroutine dump while it stays up; see the example below.
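A minimal sketch, assuming the kubelet's debugging handlers are enabled (--enable-debugging-handlers, the default) and that <node-name> and the client certificate paths are placeholders you substitute for your environment:

$ kubectl get --raw "/api/v1/nodes/<node-name>/proxy/debug/pprof/goroutine?debug=2" > kubelet-goroutines.txt

# Or query the kubelet's secure port (default 10250) directly with suitable client credentials:
$ curl -sk --cert client.crt --key client.key \
    "https://127.0.0.1:10250/debug/pprof/goroutine?debug=2"

debug=2 prints every goroutine's full stack, including how long each one has been blocked, which matches the sample output above.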
Option 2: send a signal to the kubelet that causes it to exit with a stack dump of all goroutines:
$ kill -SIGABRT <kubelet-pid>
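A sketch of the second approach, assuming the kubelet runs as a systemd unit named kubelet (adjust the unit name if yours differs). The kubelet terminates after the dump and systemd will normally restart it:

$ systemctl kill -s SIGABRT kubelet        # or: kill -SIGABRT $(pidof kubelet)
# The goroutine dump goes to the kubelet's stderr, i.e. the journal:
$ journalctl -u kubelet --no-pager > kubelet-goroutines.txt

This is essentially the mechanism the Docker article above describes for dockerd, except dockerd handles SIGUSR1 itself and keeps running, while SIGABRT makes the Go runtime print all goroutine stacks and abort the process.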