helm unable to connect to tiller in GKE

5/11/2020

I was encountering the error message below when running helm version or helm list:

kubectl port-forward -n kube-system tiller-deploy-xxxxxxxxxxxxxxx 44134
error: error upgrading connection: error dialing backend: No SSH tunnels currently open. Were the targets able to accept an ssh-key for user "gke-xxxxxxxxxxx"?

The root issue appears to be related to GKE port-forwarding. Is the SSH key configurable anywhere? I can see this key being added to my metadata, but it is not part of the metadata for the GKE nodes.
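For reference, this is roughly how I am checking that metadata (NODE_NAME and ZONE are placeholders):

    # Project-wide metadata (the GKE-managed key shows up under the "ssh-keys" entry)
    gcloud compute project-info describe
    # Per-node metadata for one of the GKE nodes
    gcloud compute instances describe NODE_NAME --zone ZONE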

-- turnupthechill
google-kubernetes-engine
kubernetes
kubernetes-helm

2 Answers

5/11/2020

The issue could be related to certificates. Please check the relevant part of this question:

Q: On GKE (Google Container Engine) I get "No SSH tunnels currently open"
-- dany L
Source: StackOverflow

5/12/2020
  • Under the hood, helm initiates a short-lived kubectl port-forward to tiller.

If that port-forward is failing, your issue is with it rather than with tiller:

  • kubectl port-forward relies on the cluster's master being able to talk to the nodes in the cluster. However, because the master isn't in the same Compute Engine network as your cluster's nodes, we rely on SSH tunnels to enable secure communication.

  • GKE saves an SSH public key file in your Compute Engine project metadata. All Compute Engine VMs using Google-provided images regularly check their project's common metadata and their instance's metadata for SSH keys to add to the VM's list of authorized users. GKE also adds a firewall rule to your Compute Engine network allowing SSH access from the master's IP address to each node in the cluster.
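Roughly speaking, only the kubectl subcommands that stream through the kubelet depend on this tunnel, so they can be used to narrow the problem down (POD_NAME is a placeholder for any pod in kube-system):

    # These only talk to the API server and should still work:
    kubectl get nodes
    kubectl get pods -n kube-system
    # These go through the master-to-node tunnel and should fail with the same SSH error if it is broken:
    kubectl logs -n kube-system POD_NAME
    kubectl port-forward -n kube-system POD_NAME 44134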

If any of the above kubectl commands don't run, it's likely that the master is unable to open SSH tunnels with the nodes. Check for these potential causes:

  1. The cluster doesn't have any nodes:
    If you've scaled down the number of nodes in your cluster to zero, SSH tunnels won't work.

To fix it, resize your cluster to have at least one node.
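For example, something like this (CLUSTER_NAME, POOL_NAME, and ZONE are placeholders):

    gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME --num-nodes 1 --zone ZONE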

  2. Pods in the cluster have gotten stuck in a terminating state and have prevented nodes that no longer exist from being removed from the cluster:
    This is an issue that should only affect Kubernetes version 1.1, but could be caused by repeated resizing of the cluster.

To fix it, delete the Pods that have been in a terminating state for more than a few minutes. The old nodes are then removed from the master's API and replaced by the new nodes.
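A rough way to find and clear them (NAMESPACE and POD_NAME are placeholders; force-deleting should be a last resort):

    # Find pods stuck in Terminating
    kubectl get pods --all-namespaces | grep Terminating
    # Force-remove one that has been stuck for more than a few minutes
    kubectl delete pod POD_NAME -n NAMESPACE --grace-period=0 --force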

  3. Your network's firewall rules don't allow SSH access from the master to the nodes.

    All Compute Engine networks are created with a firewall rule called "default-allow-ssh" that allows SSH access from all IP addresses (requiring a valid private key, of course). GKE also inserts an SSH rule for each cluster of the form gke-cluster-name-random-characters-ssh that allows SSH access specifically from the cluster's master IP to the cluster's nodes. If neither of these rules exists, then the master will be unable to open SSH tunnels.

To fix it, re-add a firewall rule allowing access to VMs with the tag that's on all the cluster's nodes from the master's IP address.
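A rough sketch of re-creating such a rule, assuming the default network (the rule name is only an example; MASTER_IP and NODE_TAG are placeholders for the cluster master's IP and the tag on the cluster's nodes):

    gcloud compute firewall-rules create gke-master-to-node-ssh \
        --network=default \
        --allow=tcp:22 \
        --source-ranges=MASTER_IP/32 \
        --target-tags=NODE_TAG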

  4. Your project's common metadata entry for "ssh-keys" is full.

    If the project's metadata entry named "ssh-keys" is close to the 32KiB size limit, then GKE isn't able to add its own SSH key to enable it to open SSH tunnels. You can see your project's metadata by running the following command:

    gcloud compute project-info describe [--project=PROJECT]

    And then check the length of the list of ssh-keys.

To fix it, delete some of the SSH keys that are no longer needed.
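One way to gauge how close the entry is to that limit (assumes jq is installed; PROJECT is a placeholder):

    gcloud compute project-info describe --project=PROJECT --format=json \
        | jq -r '.commonInstanceMetadata.items[] | select(.key=="ssh-keys") | .value' \
        | wc -c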

  5. You have set a metadata field with the key "ssh-keys" on the VMs in the cluster.

    The node agent on VMs prefers per-instance ssh-keys to project-wide SSH keys, so if you've set any SSH keys specifically on the cluster's nodes, then the master's SSH key in the project metadata won't be respected by the nodes. To check, run gcloud compute instances describe <VM-name> and look for an "ssh-keys" field in the metadata.

To fix it, delete the per-instance SSH keys from the instance metadata.
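Something like the following removes that entry (NODE_NAME and ZONE are placeholders; note this drops the whole per-instance ssh-keys entry, so re-add any keys you still need afterwards):

    gcloud compute instances remove-metadata NODE_NAME --zone ZONE --keys=ssh-keys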

You can also find these troubleshooting steps in the GCP documentation under Kubernetes Engine Troubleshooting (kubectl).


As a side note, I also suggest trying Helm 3:

It no longer needs a Tiller installation, has many new features, and is supported on GKE.
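If you have existing v2 releases, there is an official helm-2to3 plugin for migrating them; a rough sketch (RELEASE_NAME is a placeholder):

    # Install the migration plugin into the Helm 3 client
    helm plugin install https://github.com/helm/helm-2to3
    # Move Helm v2 configuration and data, then convert a release
    helm 2to3 move config
    helm 2to3 convert RELEASE_NAME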

-- willrof
Source: StackOverflow