GKE very slow to start with node pools - cluster and k8s/gcloud API unavailable

11/14/2018

We have a GKE cluster with 7 nodes running 9 microservices at the moment. We also add 2 node pools with 2 nodes by default. We use Istio for load balancing between the microservices.

Our CI environment creates everything with a script. The issue is that it takes minutes for the cluster to become available with the node pools.

My main question is: why is the API unavailable during this time?

There are also a lot of errors in the kube-system logs. Here is a small excerpt:

k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
github.com/GoogleCloudPlatform/k8s-stackdriver/event-exporter/watchers/watcher.go:55: Failed to list *v1.Event: Get https://10.0.0.1:443/api/v1/events?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:42.925080 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused"
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:42.873176 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused"
k8s.io/heapster/metrics/heapster.go:331: Failed to list *v1.Pod: Get https://10.0.0.1:443/api/v1/pods?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/processors/namespace_based_enricher.go:90: Failed to list *v1.Namespace: Get https://10.0.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/heapster/metrics/util/util.go:32: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
Error while getting cluster status: Get https://10.0.0.1:443/api/v1/nodes: dial tcp 10.0.0.1:443: getsockopt: connection refused
k8s.io/dns/pkg/dns/dns.go:192: Failed to list *v1.Service: Get https://10.0.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
k8s.io/dns/pkg/dns/dns.go:189: Failed to list *v1.Endpoints: Get https://10.0.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.0.0.1:443: connect: connection refused
github.com/GoogleCloudPlatform/k8s-stackdriver/event-exporter/watchers/watcher.go:55: Failed to list *v1.Event: Get https://10.0.0.1:443/api/v1/events?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/heapster.go:254: Failed to list *v1.Pod: Get https://10.0.0.1:443/api/v1/pods?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/processors/namespace_based_enricher.go:85: Failed to list *v1.Namespace: Get https://10.0.0.1:443/api/v1/namespaces?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
github.com/kubernetes-incubator/metrics-server/metrics/util/util.go:52: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused
"ERROR: logging before flag.Parse: E1114 09:50:41.824128 1 reflector.go:205] k8s.io/autoscaler/addon-resizer/nanny/kubernetes_client.go:107: Failed to list *v1.Node: Get https://10.0.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 10.0.0.1:443: getsockopt: connection refused"

-- unludo
google-kubernetes-engine
kubernetes

1 Answer

11/19/2018

Creating the GCE resources takes time: provisioning one or more VMs takes a while in any environment. The endpoint is unavailable because the master is not ready yet. The "connection refused" errors in your kube-system logs are consistent with this, since every one of them is a failed request to the API server at 10.0.0.1:443. Once the cluster has been created, you can add the 2 additional node pools without interrupting the master.
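
If the CI script has to carry on as soon as the master answers again, one option is to poll for readiness instead of failing on the first refused connection. The following is only a minimal sketch in Python, not something taken from the question's script: the function names, the 15-minute timeout and the 10-second interval are all assumptions, and the example probe simply treats a zero exit code from "kubectl get nodes" as "the API server answers".

```python
import subprocess
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 900.0,   # assumed budget, tune for your CI
                     interval_s: float = 10.0,   # assumed polling interval
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Call probe() repeatedly until it returns True or timeout_s has elapsed.

    Returns True as soon as the probe succeeds, False if the deadline passes.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def api_server_answers() -> bool:
    """Hypothetical probe: True once `kubectl get nodes` exits with status 0."""
    try:
        result = subprocess.run(
            ["kubectl", "get", "nodes"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # kubectl missing or hanging counts as "not ready yet".
        return False
    return result.returncode == 0


# Example use in a CI step (not executed here):
#     if not wait_until_ready(api_server_answers):
#         raise SystemExit("API server still unreachable after the timeout")
```

Keeping the probe separate from the retry loop means the same loop can wrap a different check, for example one that inspects the cluster status reported by gcloud, without changing the timing logic.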

-- xavierc
Source: StackOverflow