I have 3 nodes of type n1-standard-16 with pods that each request about 1.2 CPU + 0.2 overhead = 1.4 CPU. Theoretically, I should be able to schedule more than 30 pods across the three nodes. However, the moment the pods are created, they get terminated immediately. I can easily see this by increasing the replicas: the number of pods that remain running is always 10, regardless of what I set the replica count to (anything between 10 and 30 gives me 10; below 10 of course gives me that many pods). Even while those 10 were running, I checked the nodes and they all had a ton of room to accept more pods.
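For reference, this is roughly how I checked the remaining capacity (the node name is a placeholder):

    # Allocatable CPU and the per-node pod limit reported for each node
    kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,PODS:.status.allocatable.pods

    # Requested vs. allocatable resources on one specific node
    kubectl describe node my-node-1 | grep -A 8 "Allocated resources"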
The first place I looked was the quotas (CPU and IP usage were close but nowhere near the limits), and everything was fine. I also looked at the output of kubectl describe pod pod-name just before the termination, and nothing seemed out of the ordinary. The last few lines are:
Normal Created 21s kubelet Created container X
Normal Started 20s kubelet Started container X
Normal Pulled 20s kubelet Container image Z already present on machine
Normal Created 20s kubelet Created container Y
Normal Started 20s kubelet Started container Y
Normal Killing 15s kubelet Stopping container X
Normal Killing 15s kubelet Stopping container Y
Warning Unhealthy 11s kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
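If it helps, I can also pull the cluster-level events or the previous container's logs right after a pod is killed, e.g. (the pod name is a placeholder):

    # Cluster-wide events, newest last, to see what is issuing the deletes
    kubectl get events --sort-by=.lastTimestamp

    # Logs of the terminated container X, while the pod object still exists
    kubectl logs my-pod -c X --previous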
My question is: why can GKE schedule my pods and run the containers, but then immediately stop them?
Is it possible that you changed the --max-pods-per-node flag when creating your cluster/node pool?
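You could double-check what limit the node pool was actually created with using something like the following (cluster, pool, and zone names are placeholders, so adjust them and the --zone/--region flag to your setup):

    # Per-node pod limit configured on the GKE node pool, if one was set
    gcloud container node-pools describe my-pool \
        --cluster my-cluster --zone us-central1-a \
        --format="value(maxPodsConstraint.maxPodsPerNode)"

and compare it with the pod allocatable value the nodes report via kubectl.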