Helm error: dial tcp *:10250: i/o timeout

2/11/2019

Created a local cluster using Vagrant + Ansible + VirtualBox. Manually deploying works fine, but when using Helm:

:~$ helm install stable/nginx-ingress --name nginx-ingress-controller --set rbac.create=true
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 10.0.52.15:10250: i/o timeout

Kubernetes cluster info:

:~$ kubectl get nodes,po,deploy,svc,ingress --all-namespaces -o wide
NAME                        STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
node/ubuntu18-kube-master   Ready    master   32m   v1.13.3   10.0.51.15    <none>        Ubuntu 18.04.1 LTS   4.15.0-43-generic   docker://18.6.1
node/ubuntu18-kube-node-1   Ready    <none>   31m   v1.13.3   10.0.52.15    <none>        Ubuntu 18.04.1 LTS   4.15.0-43-generic   docker://18.6.1

NAMESPACE     NAME                                               READY   STATUS    RESTARTS   AGE     IP           NODE                   NOMINATED NODE   READINESS GATES
default       pod/nginx-server                                   1/1     Running   0          40s     10.244.1.5   ubuntu18-kube-node-1   <none>           <none>
default       pod/nginx-server-b8d78876d-cgbjt                   1/1     Running   0          4m25s   10.244.1.4   ubuntu18-kube-node-1   <none>           <none>
kube-system   pod/coredns-86c58d9df4-5rsw2                       1/1     Running   0          31m     10.244.0.2   ubuntu18-kube-master   <none>           <none>
kube-system   pod/coredns-86c58d9df4-lfbvd                       1/1     Running   0          31m     10.244.0.3   ubuntu18-kube-master   <none>           <none>
kube-system   pod/etcd-ubuntu18-kube-master                      1/1     Running   0          31m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/kube-apiserver-ubuntu18-kube-master            1/1     Running   0          30m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/kube-controller-manager-ubuntu18-kube-master   1/1     Running   0          30m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/kube-flannel-ds-amd64-jffqn                    1/1     Running   0          31m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/kube-flannel-ds-amd64-vc6p2                    1/1     Running   0          31m     10.0.52.15   ubuntu18-kube-node-1   <none>           <none>
kube-system   pod/kube-proxy-fbgmf                               1/1     Running   0          31m     10.0.52.15   ubuntu18-kube-node-1   <none>           <none>
kube-system   pod/kube-proxy-jhs6b                               1/1     Running   0          31m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/kube-scheduler-ubuntu18-kube-master            1/1     Running   0          31m     10.0.51.15   ubuntu18-kube-master   <none>           <none>
kube-system   pod/tiller-deploy-69ffbf64bc-x8lkc                 1/1     Running   0          24m     10.244.1.2   ubuntu18-kube-node-1   <none>           <none>

NAMESPACE     NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS     IMAGES                                  SELECTOR
default       deployment.extensions/nginx-server    1/1     1            1           4m25s   nginx-server   nginx                                   run=nginx-server
kube-system   deployment.extensions/coredns         2/2     2            2           32m     coredns        k8s.gcr.io/coredns:1.2.6                k8s-app=kube-dns
kube-system   deployment.extensions/tiller-deploy   1/1     1            1           24m     tiller         gcr.io/kubernetes-helm/tiller:v2.12.3   app=helm,name=tiller

NAMESPACE     NAME                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE   SELECTOR
default       service/kubernetes      ClusterIP   10.96.0.1      <none>        443/TCP         32m   <none>
default       service/nginx-server    NodePort    10.99.84.201   <none>        80:31811/TCP    12s   run=nginx-server
kube-system   service/kube-dns        ClusterIP   10.96.0.10     <none>        53/UDP,53/TCP   32m   k8s-app=kube-dns
kube-system   service/tiller-deploy   ClusterIP   10.99.4.74     <none>        44134/TCP       24m   app=helm,name=tiller

Vagrantfile:

...

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| 
  $hosts.each_with_index do |(hostname, parameters), index|
    ip_address = "#{$subnet}.#{$ip_offset + index}"

    config.vm.define vm_name = hostname do |vm_config|
      vm_config.vm.hostname = hostname
      vm_config.vm.box = box
      vm_config.vm.network "private_network", ip: ip_address

      vm_config.vm.provider :virtualbox do |vb|        
        vb.gui = false
        vb.name = hostname
        vb.memory = parameters[:memory]
        vb.cpus = parameters[:cpus]
        vb.customize ['modifyvm', :id, '--macaddress1', "08002700005#{index}"]
        vb.customize ['modifyvm', :id, '--natnet1', "10.0.5#{index}.0/24"]
      end
    end
  end
end

Workaround for a VirtualBox issue: set a different MAC address and NAT network for each VM.

Ideally the fix would live in one of the existing configuration files (the Vagrantfile or the Ansible roles). Any ideas what is causing this?

-- EnjoyLife
kubernetes
kubernetes-helm
vagrant
virtualbox

1 Answer

2/12/2019

Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 10.0.52.15:10250: i/o timeout

You're getting bitten by a very common kubernetes-on-Vagrant bug: the kubelet believes its IP address is that of eth0, which is the NAT interface in Vagrant, rather than the private_network interface defined in your Vagrantfile. Since some operations talk directly to the kubelet (rather than going through the API server), things like kubectl exec, kubectl logs, and Helm's port-forwarding fail in exactly the way you see.

The solution is to force the kubelet to bind to (and advertise) the private network interface. Alternatively, you could switch your Vagrantfile to use a bridged network, if that's an option for you -- just so long as the interface the kubelet uses isn't the NAT one.
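On kubeadm-based clusters a common way to do this is to pass --node-ip to the kubelet via KUBELET_EXTRA_ARGS. The snippet below is a sketch, not a verified fix for this exact setup: it assumes the private_network interface is eth1 (typical in VirtualBox, where eth0 is the NAT interface) and that the kubelet's systemd unit sources /etc/default/kubelet (the path can differ by distro and version).

```shell
#!/bin/sh
# Sketch: make the kubelet advertise the private_network IP instead of the NAT IP.
# Assumptions (verify on your boxes): eth1 is the private_network interface,
# and this is a kubeadm install whose systemd drop-in reads KUBELET_EXTRA_ARGS
# from /etc/default/kubelet.
NODE_IP=$(ip -4 -o addr show eth1 | awk '{print $4}' | cut -d/ -f1)
echo "KUBELET_EXTRA_ARGS=--node-ip=${NODE_IP}" | sudo tee /etc/default/kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```

Since it's just a shell script, it can be dropped into a Vagrant shell provisioner or an Ansible task, which keeps the fix in your existing configuration files as you wanted.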

-- mdaniel
Source: StackOverflow