Tiller timing out every 30 seconds, unable to proceed with installation

6/23/2019

We have recently been seeing Tiller time out every 30 seconds with the error below for helm init/upgrade/install commands, although other commands such as helm init and helm list work fine. I have even tried removing the --wait option, but that does not appear to be the cause.

I have tried rebooting the nodes, upgrading GKE to the latest version, restarting the Tiller pod, increasing the value of the timeout option, and running the command without the timeout option. The Tiller log shows:

[tiller] 2019/06/23 15:18:57 warning: Upgrade "xx" failed: Failed to recreate resource: Timeout: request did not complete within requested timeout 30s && Failed to recreate resource: Timeout: request did not complete within requested timeout 30s

Output of helm version:

Client: &version.Version{SemVer:"v2.13.0", GitCommit:"79d07943b03aea2b76c12644b4b54733bc5958d6", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.13.0", GitCommit:"79d07943b03aea2b76c12644b4b54733bc5958d6", GitTreeState:"clean"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.6-gke.13", GitCommit:"fcbc1d20b6bca1936c0317743055ac75aef608ce", GitTreeState:"clean", BuildDate:"2019-06-19T20:50:07Z", GoVersion:"go1.11.5b4", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.):

GKE

-- satishkumar432
google-kubernetes-engine
kubernetes
kubernetes-helm

2 Answers

6/24/2019

Upgrading Tiller may resolve the issue. Please run the command below:

helm init --upgrade

-- Manoj Kasa
Source: StackOverflow

6/26/2019

The cause of this timeout turned out to be a webhook that the app below had configured on the cluster. By inspecting the API server logs, the Google support team confirmed that this webhook was blocking the deployments. After the webhook was deleted from the cluster, the deployments went through.

https://github.com/reactiveops/polaris
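For anyone hitting the same symptom, here is a rough sketch of how the admission webhooks on a cluster could be listed and one of them removed with kubectl. The helper names list_admission_webhooks and delete_webhook are made up for illustration, and the name of the offending webhook is not given above, so you would need to pick it out of the listing yourself. It is also an assumption that the webhook could be of either the validating or the mutating kind.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of helm or kubectl.

# Print every admission webhook configuration registered on the cluster.
# Either kind can intercept the create/update requests that Tiller sends.
list_admission_webhooks() {
  kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o name
}

# Delete one webhook configuration.
# $1 is an entry exactly as printed by list_admission_webhooks, in the form
# <kind>.admissionregistration.k8s.io/<name>; the actual name depends on the cluster.
delete_webhook() {
  kubectl delete "$1"
}
```

Run list_admission_webhooks first, then pass the entry you suspect to delete_webhook and retry the helm command. Deleting a webhook configuration turns off whatever checks it enforced, so it is worth confirming the culprit (for example from the API server logs, as was done here) before removing it.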

-- satishkumar432
Source: StackOverflow