What are the limiting factors for the number of simultaneously running pods in GKE?

9/26/2020

I have 3 nodes of type n1-standard-16 (16 vCPU each) running pods that each need around 1.2 CPU plus 0.2 CPU of overhead, i.e. 1.4 CPU per pod. Theoretically I should be able to schedule more than 30 pods across the three nodes (3 × 16 = 48 vCPU, and 48 / 1.4 ≈ 34 pods). However, the moment additional pods get created, they are terminated immediately, which I can easily reproduce by increasing the replica count. The number of pods that always remains is 10, regardless of what I pick for the replica parameter (anything between 10 and 30 gives me 10; below 10 of course gives me fewer pods). Even while those 10 were running, I checked the nodes and they all had plenty of room to accept more pods.
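To double-check that the headroom really is there, the allocatable CPU and the CPU already requested on each node can be compared with something like the following (standard kubectl output, nothing cluster-specific assumed):

  # Allocatable CPU per node (slightly less than 16 because of system reservations)
  kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu

  # CPU/memory already requested on each node, to confirm there is room left
  kubectl describe nodes | grep -A 8 "Allocated resources"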

The first place I looked was the quotas (CPU and IP usage were the closest to their limits, but still nowhere near them); everything was fine there. I also looked at the kubectl describe pods pod-name output just before the termination, and nothing was out of the ordinary. The last few lines are:

  Normal   Created    21s   kubelet  Created container X
  Normal   Started    20s   kubelet  Started container X
  Normal   Pulled     20s   kubelet  Container image Z already present on machine
  Normal   Created    20s   kubelet  Created container Y
  Normal   Started    20s   kubelet  Started container Y
  Normal   Killing    15s   kubelet  Stopping container X
  Normal   Killing    15s   kubelet  Stopping container Y
  Warning  Unhealthy  11s   kubelet  Readiness probe failed: HTTP probe failed with statuscode: 500

My question is: why can GKE schedule my pods and start the containers, only to stop them immediately? What I have checked so far:

  1. CPU and Memory: Checked - That is not the case for my scenario
  2. Quota limitations: Checked - That is not the case for my scenario
  3. Pod logs: Checked - Nothing informative (commands sketched below)
  4. ?
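For anyone retracing checks 2 and 3, they can be reproduced with commands along these lines (the region, pod, and container names are placeholders):

  # 2. Regional quota usage vs. limits (look for CPUS and IN_USE_ADDRESSES)
  gcloud compute regions describe us-central1

  # 3. Logs of the previous (terminated) container instance, plus recent events
  kubectl logs pod-name -c X --previous
  kubectl get events --sort-by=.lastTimestamp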
-- Greg
google-kubernetes-engine
kubernetes

1 Answer

9/26/2020

Is it possible that you set the --max-pods-per-node flag when creating your cluster or node pool?
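If so, the effective limit is visible on the nodes themselves, and you can also grep the cluster description for any max-pods setting (cluster name and zone below are placeholders):

  # Pod capacity each node actually advertises to the scheduler
  kubectl get nodes -o custom-columns=NAME:.metadata.name,MAX_PODS:.status.allocatable.pods

  # Look for any max-pods related field on the cluster / node pools
  gcloud container clusters describe my-cluster --zone us-central1-a --format=yaml | grep -i maxPods

The GKE default is 110 pods per node, so a much smaller value here (e.g. 3 or 4) would be consistent with a hard ceiling of roughly 10 pods across your three nodes.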

-- dany L
Source: StackOverflow