We had a GKE cluster with 3 nodes.
On those nodes one ReplicationController was set to run 3 pods of type A and another ReplicationController was set to run 4 pods of type B.
We set up an instance group manager to autoscale the nodes on CPU.
Since there was no load on the cluster, it scaled down to 1 node. That node was then running only 2 pods of type B and 0 of type A.
I was kinda expecting it to at least have 1 pod of A and 1 of B left after the scale down, but that didn't happen. Is there a way to configure Kubernetes (or GKE) to always have at least 1 of each pod?
Yes, there is. Use a Replication Controller and set replicas: 1.
The cluster autoscaler generally sets the number of nodes based on the target utilization level of your VMs. It doesn't know anything about what you are running on the VMs (pods or otherwise) and only looks at the utilization.
The Google Container Engine / Kubernetes scheduler looks at the resource requests for each pod and finds an available node on which to run the pod. If there isn't space available, then the pod will stay in the Pending state rather than start running.
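To make the scheduling behaviour above concrete, here is a rough Python sketch of request-based placement. It is a simplified first-fit model, not the real Kubernetes scheduler (which filters and scores nodes on many more criteria). The 300m CPU request per pod and the 1000m of allocatable CPU per node are made-up numbers for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_allocatable_m: int  # allocatable CPU in millicores (assumed value)
    cpu_requested_m: int = 0  # sum of requests of pods already placed here


def schedule(pods, nodes):
    """Place each (pod_name, cpu_request_m) on the first node with room.

    Returns (placements, pending). Pods that fit nowhere stay pending,
    which mirrors the Pending state described above.
    """
    placements = {}
    pending = []
    for pod_name, request_m in pods:
        for node in nodes:
            if node.cpu_requested_m + request_m <= node.cpu_allocatable_m:
                node.cpu_requested_m += request_m
                placements[pod_name] = node.name
                break
        else:
            # No node had enough unrequested CPU for this pod.
            pending.append(pod_name)
    return placements, pending


# 3 pods of type A and 4 of type B, each requesting a hypothetical 300m CPU,
# on a cluster that has been scaled down to a single node.
pods = [(f"a-{i}", 300) for i in range(3)] + [(f"b-{i}", 300) for i in range(4)]
nodes = [Node("node-1", 1000)]
placements, pending = schedule(pods, nodes)
print(f"running: {len(placements)}, pending: {len(pending)}")
```

Which pods end up running in this sketch depends only on the order of the list. Nothing in a request-based fit check tries to keep one pod of each type alive.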
It sounds like you are experiencing a situation where the pods that are running aren't using sufficient CPU to cause the autoscaler to add new nodes to your cluster, but the existing nodes don't have enough capacity for the pods that you want to schedule.
When configuring the VM autoscaler, you can set the minimum number of VMs (see https://cloud.google.com/compute/docs/reference/latest/autoscalers#resource) based on the minimum pod footprint that you want to always be running in your cluster. Then the autoscaler won't delete the VMs that are necessary for all of your pods to run.
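One way to pick that minimum is to estimate how many nodes the pod footprint needs from the pods' CPU requests. The sketch below uses first-fit decreasing bin packing as an approximation. The request sizes and node capacity are assumptions, and it ignores memory and the system pods that also take up room on each node. The result would be the value to put in the autoscaler's minimum size setting (the `minNumReplicas` field of the autoscaling policy in the resource linked above).

```python
def min_nodes_needed(pod_requests_m, node_allocatable_m):
    """Estimate the node count needed to fit all CPU requests.

    Uses first-fit decreasing bin packing: an approximation, not a
    guarantee of what the real scheduler will do.
    """
    free = []  # remaining unrequested CPU (millicores) per node
    for request in sorted(pod_requests_m, reverse=True):
        if request > node_allocatable_m:
            raise ValueError(f"a {request}m pod can never fit on a node")
        for i, room in enumerate(free):
            if request <= room:
                free[i] -= request
                break
        else:
            # No existing node has room, so another node is required.
            free.append(node_allocatable_m - request)
    return len(free)


NODE_ALLOCATABLE_M = 1000  # hypothetical allocatable CPU per node

# Minimum footprint: 1 pod of type A and 1 of type B at 300m each (assumed).
print(min_nodes_needed([300, 300], NODE_ALLOCATABLE_M))

# Full footprint: 3 pods of type A and 4 of type B at 300m each (assumed).
print(min_nodes_needed([300] * 7, NODE_ALLOCATABLE_M))
```

With these made-up numbers the minimum footprint fits on 1 node and the full set of 7 pods needs 3, so a minimum of 3 VMs would keep every pod schedulable.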
You can also look at the Horizontal Pod Autoscaler in Kubernetes 1.1 to increase the number of pod replicas in your replication controller based on their observed CPU usage.
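As a rough illustration of how that scaling decision works, the sketch below applies the ratio rule described in the Kubernetes documentation: scale the replica count by observed CPU over target CPU, round up, and clamp to the configured bounds. The function and parameter names are my own, and the real controller also applies a tolerance band and other safeguards that are left out here.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas):
    """Simplified replica calculation in the style of the Horizontal Pod Autoscaler."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # Never go below the configured minimum or above the maximum.
    return max(min_replicas, min(max_replicas, raw))


# 3 replicas at 90% CPU with a 60% target: scale up.
print(desired_replicas(3, 90, 60, min_replicas=1, max_replicas=10))

# 3 idle replicas: the minimum keeps the count from dropping to zero.
print(desired_replicas(3, 0, 60, min_replicas=1, max_replicas=10))
```

Note that this only controls how many replicas are requested. Whether they actually run still depends on the nodes having room for them.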