I have an existing GKE cluster that was created from Terraform config I got from a tutorial on GitHub. The cluster has a default node pool with 3 nodes. I tried to add another node pool with 3 nodes through the GKE console, but when I run kubectl get nodes I only see 4 nodes, not 6. When I tried the same through the gcloud command line, I remember seeing a message about IP space. It seems I cannot have 6 nodes because of IP space.
How can I change the IP space of my existing cluster? From my research it seems this cannot be changed for an existing cluster in GKE, is that right? How and where can I set this IP space for a new cluster then?
UPDATE:
I found the error message in my notifications in GCP:
(1) deploy error: Not all instances running in IGM after 19.314823406s. Expect 1. Current errors: [IP_SPACE_EXHAUSTED]: Instance '--6fa3ebb6-cw6t' creation failed: IP space of 'projects//regions/us-east4/subnetworks/-pods-4851bf1518184e60' is exhausted.
(2) deploy error: Not all instances running in IGM after 19.783096708s. Expect 1. Current errors: [IP_SPACE_EXHAUSTED]: Instance '-spec--bf111c8e-h8mm' creation failed: IP space of 'projects//regions/us-east4/subnetworks/-pods-4851bf1518184e60' is exhausted.
I have figured out the issue.
Background on the issue can be read in detail here.
Specifically, the part:
"...if you set the maximum Pods per node to 30 then, per the table above, a /26 CIDR range is used, and each Node is assigned 64 IP addresses. If you do not configure the maximum number of Pods per node, a /24 CIDR range is used, and each node is assigned 256 IP addresses..."
I had deployed this cluster through a Terraform demo, so I was not sure how to make the changes through the GCP Console or the command line. Instead, I made changes to the Terraform config, which resolved the issue.
The Terraform variable kubernetes_pods_ipv4_cidr was set to 10.1.92.0/22, which meant the range 10.1.92.0 – 10.1.95.255 (1,024 addresses) was assigned to the cluster's nodes for Pods. According to the GCP documentation, by default a node has a maximum of 110 Pods and is assigned 256 IP addresses. Hence, with the default maximum Pods per node, my cluster could have at most 4 nodes, since each node claims 256 of the 1,024 available Pod IP addresses.
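To make the arithmetic explicit:

10.1.92.0/22   ->  2^(32-22) = 1,024 Pod IP addresses for the whole cluster
/24 per node   ->  256 addresses reserved per node
1,024 / 256    ->  4 nodes maximum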
I added a new field, default_max_pods_per_node, to my Terraform config to reduce this maximum from the default of 110 to 55:
resource "google_container_cluster" "my-cluster" {
provider = "google-beta"
name = "my-cluster"
project = "${local.my-cluster_project_id}"
location = "${var.region}"
default_max_pods_per_node = 55
After that, my cluster was able to support more nodes: with 55 Pods per node, GKE reserves a /25 (128 addresses) per node instead of a /24, so the same /22 now fits 1,024 / 128 = 8 nodes.
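For anyone not using Terraform: the same setting is available from the gcloud CLI, though the cluster-wide default can only be set at cluster creation time. The names below (my-cluster, pool-2, us-east4) are placeholders matching this question:

gcloud container clusters create my-cluster \
    --region us-east4 \
    --enable-ip-alias \
    --default-max-pods-per-node 55

# Or set it per node pool on an existing VPC-native cluster:
gcloud container node-pools create pool-2 \
    --cluster my-cluster \
    --region us-east4 \
    --max-pods-per-node 55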
Alternatively, you can also change the IP range assigned to kubernetes_pods_ipv4_cidr.
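A minimal sketch of that alternative, assuming a new cluster (the /20 below is only an illustrative value, and the variable may be wired into the cluster resource differently in your copy of the demo):

variable "kubernetes_pods_ipv4_cidr" {
  # Hypothetical wider range: a /20 gives 4,096 addresses,
  # i.e. room for 16 nodes at the default /24 (256 IPs) per node.
  default = "10.1.80.0/20"
}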
The error message you are seeing isn't saying that your GKE cluster is out of IP space (which can happen if you create a cluster with a small CIDR range for Pod IPs), but rather that the underlying GCP subnetwork in which the cluster exists is out of space. If you look at the subnetwork where the cluster is running (it looks like it's called -pods-4851bf1518184e60, based on your error message), you should see that it doesn't have sufficient space to add additional nodes.
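Something like this should show the subnetwork's primary and secondary ranges (SUBNET_NAME stands in for the real name, which is truncated in your error message):

gcloud compute networks subnets describe SUBNET_NAME \
    --region us-east4 \
    --format="yaml(ipCidrRange, secondaryIpRanges)"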
You can confirm that this is the problem by deleting the new node pool and trying to scale the original node pool from 3 to 6 nodes.
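For example, assuming the original pool kept GKE's usual name default-pool:

gcloud container clusters resize my-cluster \
    --node-pool default-pool \
    --region us-east4 \
    --num-nodes 6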
I don't recall if there is a way to expand the size of a subnet dynamically. If so, then you can add IP space to the subnet to add nodes. If not, you will need to create a new cluster in a larger subnet.
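For what it's worth, gcloud does have a command to widen a subnet's primary range in place; as far as I know it does not expand secondary ranges, which is where a VPC-native GKE cluster draws its Pod IPs from:

gcloud compute networks subnets expand-ip-range SUBNET_NAME \
    --region us-east4 \
    --prefix-length 20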