I have deployed an app using Kubernetes to a Google Cloud Container Engine Cluster.
I got into autoscaling, and I found the following options:
Kubernetes Horizontal Pod Autoscaling (HPA)
As explained here, Kubernetes offers the HPA on deployments. As per the docs:
Horizontal Pod Autoscaling automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization
Google Cloud Container Cluster
Now I have a Google Cloud Container Cluster using 3 instances, with autoscaling enabled. As per the docs:
Cluster Autoscaler enables users to automatically resize clusters so that all scheduled pods have a place to run.
This means I have two places to define my autoscaling. Hence my questions:
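Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Are these two parameters doing the same thing (i.e. creating/removing VM instances inside my cluster)? If not, what is their behaviour compared to one another?
What happens if e.g. I have a number of pods between 3 and 10 and a cluster with a number of instances between 1 and 3, and autoscaling kicks in? When and how would both scale?
Many thanks!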
Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Multiple Pods can run on the same instance (called a node in Kubernetes). In the deployment YAML you can define how much CPU and memory each Pod requests and the maximum it may consume (resource requests and limits). See the docs. This is an important prerequisite for autoscaling.
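As a rough illustration of the point above, a deployment YAML with resource requests and limits might look like the sketch below. The name my-app, the image path, and all CPU/memory values are hypothetical placeholders rather than anything from the question. Adjust them to your workload.

```yaml
# Minimal sketch of a Deployment with per-container resources.
# All names and numbers here are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                    # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/my-project/my-app:1.0   # hypothetical image
          resources:
            requests:             # what the scheduler reserves on a node
              cpu: 250m
              memory: 256Mi
            limits:               # maximum the container may consume
              cpu: 500m
              memory: 512Mi
```

The requests are what the scheduler uses to decide whether a Pod fits on a node, and CPU utilization for the HPA is measured relative to the CPU request.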
Are these two parameters doing the same thing (i.e. creating/removing VM instances inside my cluster)? If not, what is their behaviour compared to one another?
No, they work at different levels. The Kubernetes Horizontal Pod Autoscaler changes the number of Pod replicas, and the additional Pods are scheduled onto your existing nodes. The Google cluster autoscaler adds worker nodes (new VM instances) to your cluster. It looks for pending Pods that cannot be scheduled because there is no room left in the cluster, and when it finds them it adds nodes.
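For the Pod side, a HorizontalPodAutoscaler could be sketched roughly as follows. It assumes a hypothetical Deployment named my-app and a cluster recent enough to serve the autoscaling/v2 API (older clusters use autoscaling/v1 with a targetCPUUtilizationPercentage field instead). The 3 and 10 bounds come from the question. The 50% CPU target is an assumed value.

```yaml
# Minimal sketch of an HPA that keeps between 3 and 10 replicas.
# The target Deployment name and the 50% threshold are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                    # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                  # hypothetical Deployment to scale
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization       # percentage of the Pods' CPU requests
          averageUtilization: 50
```

Note that this object only changes the replica count. Adding or removing nodes is configured separately on the cluster's node pool.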
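What happens if e.g. I have a number of pods between 3 and 10 and a cluster with a number of instances between 1 and 3, and autoscaling kicks in? When and how would both scale?
The HPA reacts first: as load grows it raises the number of Pods, up to 10. Only when new Pods no longer fit on the existing nodes and stay pending does the Google autoscaler step in. It uses the resource requests you define for your Pods to estimate how many new nodes are required to run all the pending Pods, and adds them up to the cluster maximum of 3.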
Also read this article.