GKE | Cluster won't provision in any region

11/10/2018

I have a GKE cluster running in us-central1 with a preemptible node pool. I have nodes in each zone (us-central1-b, us-central1-c, us-central1-f). For the last 10 hours, I have been getting the following error for the underlying node VMs:

Instance '[instance-name]' creation failed: The zone '[instance-zone]' does not have enough resources available to fulfill the request. Try a different zone, or try again later.

I tried creating new clusters in different regions with different machine types, using HA (multi-zone) settings and I get the same error for every cluster.
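
The error's advice ("try a different zone, or try again later") can be sketched as a retry loop with zone fallback and exponential backoff. This is only a minimal illustration: `ZoneExhaustedError`, `create_with_zone_fallback`, and the `create` callable are hypothetical names, not part of gcloud or any Google client library, and `create` stands in for whatever actually provisions the instance.

```python
import time
from typing import Callable, Sequence


class ZoneExhaustedError(Exception):
    """Hypothetical stand-in for the 'not enough resources' failure."""


def create_with_zone_fallback(
    create: Callable[[str], str],
    zones: Sequence[str],
    rounds: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each zone in order; if all fail, back off and try the list again.

    `create` is an assumed callable that takes a zone name and either
    returns an instance identifier or raises ZoneExhaustedError.
    """
    for attempt in range(rounds):
        for zone in zones:
            try:
                return create(zone)  # first zone with capacity wins
            except ZoneExhaustedError:
                continue  # "try a different zone"
        if attempt < rounds - 1:
            # "or try again later": wait 1x, 2x, 4x ... the base delay
            sleep(base_delay * 2 ** attempt)
    raise ZoneExhaustedError(
        f"no capacity in any of {list(zones)} after {rounds} rounds"
    )
```

In my case this kind of fallback would not have helped within us-central1, since every zone returned the same error; it only pays off when at least one zone (or a later attempt) has capacity.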

I saw an issue on the Google Cloud Status Dashboard and tried with the console, as recommended, but it fails with a timeout error.

Is anyone else having this problem? Any idea what I may be doing wrong?

UPDATES

  • Nov 11
    • I stood up a cluster in us-west2; it was the only one that would work. I used the gcloud command line, as the UI did not seem to be effective. The Google Cloud Status Dashboard had a note for a similar situation: use gcloud, not the UI.
    • I tried creating node pools in us-central1 with both the gcloud command line and the UI, to no avail.
    • I'm now federating deployments across regions and standing up multi-region ingress.
  • Nov 12
    • Cannot create HA clusters in us-central1; same message as listed above.
    • Reached out via Twitter and received a response.
    • Working with the K8s guide to federation to see if I can get multi-cluster running. Most likely going to use Kelsey Hightower's approach.
    • The only problem: I can't spin up clusters to federate.

Findings

  • Talked with Google support; I need a $150/mo. package to get a tech person to answer my questions.
  • Preemptible instances are not a good option for a primary node pool. I did this because I'm cheap, and it bit me hard.
    • The new architecture is a primary node pool with committed-use VMs that do not autoscale, and a secondary node pool with preemptible instances for autoscale needs. The secondary pool will have minimum nodes = 0 and max nodes = 5 (for now); this cluster is regional, so instances are spread across all zones.
    • Cost for an n1-standard-1 with sustained use (assuming 24/7) is a 30% discount off list.
    • Cost for a 1-year n1-standard-1 committed use is about a 37% discount off list.
    • Preemptible instances are re-provisioned every 24 hours, if they are not taken from you first when resource needs spike in the region.
    • I believe I fell prey to a resource spike in us-central1.
  • A must-watch for people looking to federate K8s: Kelsey Hightower - CNCF Keynote | Kubernetes Federation
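
To make the discount comparison above concrete, here is a rough sketch of the arithmetic. The 30% and ~37% figures come from the findings above; `LIST_PRICE` is a hypothetical placeholder (not an actual GCP price), and `discounted_cost` is just an illustrative helper, so substitute the real list price for your machine type and region.

```python
def discounted_cost(list_price: float, discount: float) -> float:
    """Return the price after a fractional discount (0.30 means 30% off)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return list_price * (1.0 - discount)


# Hypothetical list price per node per month; NOT a real GCP figure.
LIST_PRICE = 100.0
SUSTAINED_USE_DISCOUNT = 0.30  # from the findings, assumes the node runs 24/7
COMMITTED_USE_DISCOUNT = 0.37  # from the findings, approximate, 1-year term

sustained = discounted_cost(LIST_PRICE, SUSTAINED_USE_DISCOUNT)
committed = discounted_cost(LIST_PRICE, COMMITTED_USE_DISCOUNT)

print(f"sustained use: {sustained:.2f}")
print(f"committed use: {committed:.2f}")
print(f"extra saving from committing: {sustained - committed:.2f}")
```

With these numbers, committing for a year saves roughly 7% of list price beyond what sustained use already gives, which is why the non-autoscaling primary pool is the natural place for committed-use VMs.
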
-- Bill Bensing
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

11/13/2018

Issue appears to be resolved as of Nov 13th.

-- Bill Bensing
Source: StackOverflow