Does HorizontalPodAutoscaler make sense when there is only one Deployment on GKE (Google Container Engine) Kubernetes cluster?

7/18/2016

I have a "homogeneous" Kubernetes setup. By this I mean that I am only running instances of a single type of pod (an http server) with a load balancer service distributing traffic to them.

By my reasoning, to get the most out of my cluster (edit: to be concrete, getting the best average response times to HTTP requests) I should have:

  1. At least one pod running on every node: a node without a pod is a node I am paying for without having it ready to serve a request.
  2. At most one pod running on every node: the pods are threaded HTTP servers that can each saturate a node on their own, so running multiple pods on a node does not net me anything.

This means that I should have exactly one pod per node. I achieve this using a DaemonSet.
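For reference, a minimal sketch of such a DaemonSet (the name, image, and port are placeholders, and the exact apiVersion depends on your cluster version; apps/v1 is shown here):

```yaml
# Minimal DaemonSet sketch: one http-server pod per node.
# Name, image, and port are placeholders for illustration.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: http-server
spec:
  selector:
    matchLabels:
      app: http-server
  template:
    metadata:
      labels:
        app: http-server
    spec:
      containers:
      - name: http-server
        image: example/http-server:latest   # hypothetical image
        ports:
        - containerPort: 8080
```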

The alternative is to configure a Deployment, apply a HorizontalPodAutoscaler to it, and let Kubernetes handle the number of pods and the pod-to-node mapping. Is there any disadvantage to my approach compared to that?
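For comparison, a rough sketch of that Deployment plus HorizontalPodAutoscaler alternative (again, the names, image, replica cap, and CPU target are only illustrative):

```yaml
# Sketch of the alternative: a Deployment scaled by an HPA.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: http-server
  template:
    metadata:
      labels:
        app: http-server
    spec:
      containers:
      - name: http-server
        image: example/http-server:latest   # hypothetical image
        ports:
        - containerPort: 8080
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: http-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: http-server
  minReplicas: 1
  maxReplicas: 10                      # cap chosen arbitrarily for the example
  targetCPUUtilizationPercentage: 70   # example target
```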

My evaluation is that the HorizontalPodAutoscaler is relevant mainly in heterogeneous situations, where one HorizontalPodAutoscaler can scale up a Deployment at the expense of another Deployment. But since I have only one type of pod, I would have only one Deployment, and I would be scaling it up at the expense of itself, which does not make sense.

-- user2771609
autoscaling
google-kubernetes-engine
kubernetes

1 Answer

7/18/2016

HorizontalPodAutoscaler is actually a valid solution for your needs. To address your two concerns:

1. At least one pod running on every node

This isn't your real concern. The concern is underutilizing your cluster. However, you can be underutilizing your cluster even if you have a pod running on every node. Consider a three-node cluster:

  1. Scenario A: a pod running on each node, at 10% CPU usage per node
  2. Scenario B: a pod running on only one node, at 70% CPU usage on that node

Even though Scenario A has a pod on each node, the cluster is actually less utilized than in Scenario B, where only one node has a pod: three nodes at 10% add up to only 30% of a single node's CPU, versus 70% in Scenario B.

2. At most one pod running on every node

The Kubernetes scheduler tries to spread pods around so that you don't end up with multiple pods of the same type on a single node. Since in your case the other nodes should be empty, the scheduler should have no problem starting the pods on the other nodes. Additionally, if each pod requests resources roughly equivalent to a node's allocatable capacity, the scheduler cannot place a second pod on a node that already has one.
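To make that second point concrete, here is a sketch of what such resource requests could look like in the pod template; the values are made-up examples and need to be sized to your actual machine type, leaving headroom for system pods:

```yaml
# Fragment of spec.template.spec in the Deployment/DaemonSet above.
# Requests are sized close to a node's allocatable capacity so a second
# copy of this pod cannot be scheduled onto the same node.
containers:
- name: http-server
  image: example/http-server:latest   # hypothetical image
  resources:
    requests:
      cpu: "1800m"    # example: most of a 2-vCPU node, minus room for kube-system pods
      memory: "6Gi"   # example value; adjust to the node's memory
```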

Now, you can achieve the same effect with either a DaemonSet or an HPA, but I would personally go with the HPA, since I think it fits your semantics better and it would also work much better if you eventually decide to add other types of pods to your cluster.

Using a DaemonSet means that the pod has to run on every node (or some subset of them). This is a great fit for something like a logger or a metrics collector, which is inherently per-node. But you really just want to use available cluster resources to power your pod as needed, which matches the intent of an HPA much better.

As an aside, I believe GKE supports cluster autoscaling, so you should never be paying for nodes that aren't needed.

-- Pixel Elephant
Source: StackOverflow