I had numerous deployment issues with getting the kubernetes cluster-autoscaler working on AWS EKS. At length, I got it working.
Notes:
don't use the helm chart.
Refer to their version chart here
Based on your version of kubernetes, you'll need to go to that branch of their code. WARN: their "version" does NOT necessarily line up with the container version being deployed.
Now RTFM, but here are the other gotchas.
The examples in their AWS directory don't assume you're using EKS, which deploys on AMZN2 linux by default. You'll have to update the manifest as described here to use the appropriate SSL directories
The example in their AWS directory assumes the us-east-1
region for you. You're probably not using it, and it's not a variable. Update that manifest, round 2.
This was my biggest gotcha, and it was a problem of my own creation. Make sure your aws-auth config has the latest IAM role in it, if you've torn down / recreated your EKS cluster. Otherwise literally -nothing- will be able to auth to the K8s APIServer, including your nodes.
Errors I ran into:
Error: 1 static_autoscaler.go:118] Failed to update node registry: RequestError: send request failed caused by: Post https://autoscaling.us-west-2a.amazonaws.com/: dial tcp: lookup autoscaling.us-west-2a.amazonaws.com on RE.DA.CT.ED:53: no such host
Solution: wrong region in default manifest
Error: container_linux.go:262: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:57: mounting \\\"/etc/ssl/certs/ca-certificates.crt\\\" to rootfs \\\"/var/lib/docker/overlay2/60c34f52ad1e6f0e3368d95794c71036fc465707f8ef3247032b6291e71d9331/merged\\\" at \\\"/var/lib/docker/overlay2/60c34f52ad1e6f0e3368d95794c71036fc465707f8ef3247032b6291e71d9331/merged/etc/ssl/certs/ca-certificates.crt\\\" caused \\\"not a directory\\\"\""
Solution: %s/ca\-certificates\.crt/ca\-bundle\.crt/g