How to implement EKS Cluster Autoscaling effectively

5/4/2021

I've created an EKS cluster with a single managed nodegroup in us-east-2, using the following eksctl configuration:

managedNodeGroups:
  - name: eks-worker-nodegroup
    instanceType: m5.xlarge
    desiredCapacity: 2
    minSize: 2
    maxSize: 3
    volumeSize: 200
    instanceName: eks-worker-node
    ebsOptimized: true
    volumeEncrypted: true
    labels: {mynodegroup: qa-a}
    tags:
      mynodegroup: qa-a
    iam:
      withAddonPolicies:
        imageBuilder: true
        autoScaler: true
        albIngress: true
        cloudWatch: true
        ebs: true
        efs: true

I want to use the Cluster Autoscaler and have been reviewing the documentation here, but I need some guidance on exactly how to implement it.

The eksctl create cluster command created a single Auto Scaling group (ASG) in AWS using the sizes I specified in the configuration above. This ASG spans all three Availability Zones in us-east-2. The documentation from Amazon, however, recommends setting up one ASG per Availability Zone.

The behavior I'm looking for is this:

  • Two nodes to start with, running a series of applications and services. (None are stateful at this time, but that could change in the future, which is why I think the one-ASG-per-AZ route is actually the better approach.)
  • As demand increases, Kubernetes will horizontally scale my pods via additional replicas as defined in my deployment manifests.
  • As available resources within the nodegroup are depleted, the Cluster Autoscaler spins up additional nodes and adds them to the nodegroup temporarily to support the increased demand. Pods are added and/or rebalanced gracefully, and as demand falls off, the additional nodes are spun down.
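To make the second bullet concrete: replica scaling of this kind is normally driven by a HorizontalPodAutoscaler object rather than by the Deployment manifest alone. The following is only a sketch under assumptions. The name my-app, the replica bounds, and the 70% CPU target are all hypothetical, and the autoscaling/v2 API version assumes a reasonably recent cluster (older clusters would need autoscaling/v2beta2). It also assumes the metrics server is installed and that the pods declare CPU requests.

```yaml
# Hypothetical HorizontalPodAutoscaler; every name and number is a placeholder.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app              # hypothetical Deployment to scale
  minReplicas: 2              # assumed lower bound
  maxReplicas: 10             # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target: 70% of requested CPU
```

When new replicas cannot be scheduled because no node has room, they stay Pending, and those Pending pods are what the Cluster Autoscaler reacts to when it adds nodes.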

My question has two parts:

  1. Should I have designed my cluster configuration file differently in order to support one ASG per AZ (and should I therefore tear down the infrastructure I've already created and start over)?

  2. Is doing this as simple as manually creating new ASGs per AZ and then tagging each one identically to the default ASG that eksctl created for me? If so, then deploying the Cluster Autoscaler after doing this should allow the manually created ASGs to be automatically detected - but is this the correct way to go about this?
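For context on the "automatically detected" part of item 2, the Cluster Autoscaler on AWS can discover ASGs by tag through its --node-group-auto-discovery flag. The excerpt below is a sketch of the container arguments only, not a complete Deployment. <cluster-name> is a placeholder, the image is deliberately left out, and the tag keys shown are the conventional ones, assumed here to be present on the ASGs in question.

```yaml
# Excerpt from a Cluster Autoscaler Deployment (pod spec, container section only).
# <cluster-name> is a placeholder; the image and version are intentionally omitted.
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      # Discover every ASG that carries both of these tag keys.
      - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
      # Try to keep similar node groups (for example, one per AZ) at similar sizes.
      - --balance-similar-node-groups
```

With tag-based discovery, any ASG that carries both tag keys is treated as a scalable node group, regardless of how the ASG was created.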

What is the most appropriate way to achieve cluster autoscaling based on the behavior I'm seeking?

EDIT: Based on one of the comments below and re-reading AWS's docs, I'm now wondering if the approach of having a single Managed Nodegroup then manually creating AZ-specific ASGs is wrong.

Is the better approach to create multiple Managed Nodegroups, each restricted to its own AZ? This would have the effect of creating one ASG per AZ.
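As a rough illustration of what that could look like in the eksctl configuration, here is a sketch that splits the original nodegroup into three, each pinned to one zone with the availabilityZones field. The nodegroup names and the per-group sizes are assumptions chosen for illustration (the original config used 2/2/3 for a single group), and only a subset of the original fields is repeated to keep the example short.

```yaml
# Sketch only: one managed nodegroup (and therefore one ASG) per Availability Zone.
# Names and min/desired/max values are hypothetical; adjust to the real capacity needs.
managedNodeGroups:
  - name: eks-worker-nodegroup-2a
    availabilityZones: ["us-east-2a"]
    instanceType: m5.xlarge
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
    labels: {mynodegroup: qa-a}
    iam:
      withAddonPolicies:
        autoScaler: true
  - name: eks-worker-nodegroup-2b
    availabilityZones: ["us-east-2b"]
    instanceType: m5.xlarge
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
    labels: {mynodegroup: qa-a}
    iam:
      withAddonPolicies:
        autoScaler: true
  - name: eks-worker-nodegroup-2c
    availabilityZones: ["us-east-2c"]
    instanceType: m5.xlarge
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
    labels: {mynodegroup: qa-a}
    iam:
      withAddonPolicies:
        autoScaler: true
```

Note that with a minimum of one node per zone, this layout starts with three nodes rather than the two in the original configuration; setting minSize and desiredCapacity to 0 on one group would be one way to stay at two, assuming scale-from-zero works for the workloads involved.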

-- Bryan
amazon-eks
aws-auto-scaling
eksctl
kubernetes
