autodiscover ASGs: InvalidClientTokenId:

7/29/2020

I installed cluster-autoscaler in my Kubernetes v1.17.5 cluster,

and I get the following error during deployment:

E0729 15:09:09.661938       1 aws_manager.go:261] Failed to regenerate ASG cache: cannot autodiscover ASGs: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: a2b12..........
F0729 15:09:09.661961       1 aws_cloud_provider.go:376] Failed to create AWS Manager: cannot autodiscover ASGs: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: a2b12da3-.........
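(A quick way to sanity-check this: `InvalidClientTokenId` from STS generally means the access key/secret pair the pod presents is invalid, deactivated, or from the wrong account, rather than a permissions problem. A sketch, assuming the AWS CLI is installed and the placeholder values are replaced with the ones from values.yaml:)

```shell
# Sanity check: try the same static credentials outside the cluster.
# Placeholder values; substitute the real key pair from values.yaml.
export AWS_ACCESS_KEY_ID="xxxxxxxxxxx"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxx"
export AWS_DEFAULT_REGION="eu-west-2"

# This fails with the same InvalidClientTokenId if the key pair itself is
# bad (typo, deactivated key, or a key from a different account).
aws sts get-caller-identity
```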

My values.yaml:

autoDiscovery:
  # Only cloudProvider `aws` and `gce` are supported by auto-discovery at this time
  # AWS: Set tags as described in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md#auto-discovery-setup
  clusterName: my-cluster
  tags:
  - k8s.io/cluster-autoscaler/enabled
  - k8s.io/cluster-autoscaler/my-cluster
  - kubernetes.io/cluster/my-cluster
autoscalingGroups: []
# At least one element is required if not using autoDiscovery
  # - name: asg1
  #   maxSize: 2
  #   minSize: 1
  # - name: asg2
  #   maxSize: 2
  #   minSize: 1

autoscalingGroupsnamePrefix: []
# At least one element is required if not using autoDiscovery
  # - name: ig01
  #   maxSize: 10
  #   minSize: 0
  # - name: ig02
  #   maxSize: 10
  #   minSize: 0
# Required if cloudProvider=aws
awsRegion: "eu-west-2"
awsAccessKeyID: "xxxxxxxxxxx"
awsSecretAccessKey: "xxxxxxxxxxxxx"

# Required if cloudProvider=azure
# clientID/ClientSecret with contributor permission to Cluster and Node ResourceGroup
azureClientID: ""
azureClientSecret: ""
# Cluster resource Group
azureResourceGroup: ""
azureSubscriptionID: ""
azureTenantID: ""
# if using AKS azureVMType should be set to "AKS"
azureVMType: "AKS"
azureClusterName: ""
azureNodeResourceGroup: ""
# if using MSI, ensure subscription ID and resource group are set
azureUseManagedIdentityExtension: false

# Currently only `gce`, `aws`, `azure` & `spotinst` are supported
cloudProvider: aws

# Configuration file for cloud provider
cloudConfigPath: ~/.aws/credentials

image:
  repository: k8s.gcr.io/cluster-autoscaler
  tag: v1.17.1

  pullPolicy: IfNotPresent

  ## Optionally specify an array of imagePullSecrets.
  ## Secrets must be manually created in the namespace.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
  ##
  # pullSecrets:
  #   - myRegistryKeySecretName

tolerations: []

## Extra ENV passed to the container
extraEnv: {}

extraArgs:
  v: 4
  stderrthreshold: info
  logtostderr: true
  # write-status-configmap: true
  # leader-elect: true
  # skip-nodes-with-local-storage: false
  expander: least-waste
  scale-down-enabled: true
  # balance-similar-node-groups: true
  # min-replica-count: 2
  # scale-down-utilization-threshold: 0.5
  # scale-down-non-empty-candidates-count: 5
  # max-node-provision-time: 15m0s
  # scan-interval: 10s
  scale-down-delay-after-add: 10m
  scale-down-delay-after-delete: 0s
  scale-down-delay-after-failure: 3m
  # scale-down-unneeded-time: 10m
  # skip-nodes-with-local-storage: false
  # skip-nodes-with-system-pods: true

## Affinity for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
## affinity: {}

podDisruptionBudget: |
  maxUnavailable: 1
  # minAvailable: 2

## Node labels for pod assignment
## Ref: https://kubernetes.io/docs/user-guide/node-selection/
nodeSelector: {}

podAnnotations: {}
podLabels: {}
replicaCount: 1

rbac:
  ## If true, create & use RBAC resources
  ##
  create: true
  ## If true, create & use Pod Security Policy resources
  ## https://kubernetes.io/docs/concepts/policy/pod-security-policy/
  pspEnabled: false
  serviceAccount:
    # Specifies whether a service account should be created
    create: true
    # The name of the ServiceAccount to use.
    # If not set and create is true, a name is generated using the fullname template
    name: ""
  ## Annotations for the Service Account
  ##
  serviceAccountAnnotations: {}

resources: 
  limits:
    cpu: 100m
    memory: 300Mi
  requests:
    cpu: 100m
    memory: 300Mi

priorityClassName: "system-node-critical"

# Defaults to "ClusterFirst". Valid values are
# 'ClusterFirstWithHostNet', 'ClusterFirst', 'Default' or 'None'
# autoscaler does not depend on cluster DNS, recommended to set this to "Default"
dnsPolicy: "ClusterFirst"

## Security context for pod
## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext:
  runAsNonRoot: true
  runAsUser: 1001
  runAsGroup: 1001

## Security context for container
## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
containerSecurityContext:
  capabilities:
    drop:
      - all

## Deployment update strategy
## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
# updateStrategy:
#   rollingUpdate:
#     maxSurge: 1
#     maxUnavailable: 0
#   type: RollingUpdate

service:
  annotations: {}

  ## List of IP addresses at which the service is available
  ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
  ##
  externalIPs: []

  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  servicePort: 8085
  portName: http
  type: ClusterIP

spotinst:
  account: ""
  token: ""
  image:
    repository: spotinst/kubernetes-cluster-autoscaler
    tag: 0.6.0
    pullPolicy: IfNotPresent

## Are you using Prometheus Operator?
serviceMonitor:
  enabled: true
  interval: "10s"
  # Namespace Prometheus is installed in
  namespace: cattle-prometheus
  ## Defaults to what's used if you follow CoreOS [Prometheus Install Instructions](https://github.com/helm/charts/tree/master/stable/prometheus-operator#tldr)
  ## [Prometheus Selector Label](https://github.com/helm/charts/tree/master/stable/prometheus-operator#prometheus-operator-1)
  ## [Kube Prometheus Selector Label](https://github.com/helm/charts/tree/master/stable/prometheus-operator#exporters)
  selector:
    prometheus: kube-prometheus
  # The metrics path to scrape - autoscaler exposes /metrics (standard)
  path: /metrics

## String to partially override cluster-autoscaler.fullname template (will maintain the release name)
nameOverride: ""

## String to fully override cluster-autoscaler.fullname template
fullnameOverride: ""

# Allow overriding the .Capabilities.KubeVersion.GitVersion (useful for the "helm template" command)
kubeTargetVersionOverride: ""

## Priorities Expander
## If extraArgs.expander is set to priority, then expanderPriorities is used to define cluster-autoscaler-priority-expander priorities
## https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/expander/priority/readme.md
expanderPriorities: {}
-- breizh5729
amazon-web-services
kubernetes

2 Answers

7/30/2020

The following parameters are filled in with the correct keys:

# Required if cloudProvider=aws
awsRegion: "eu-west-2"
awsAccessKeyID: "xxxxxxxxxxx"
awsSecretAccessKey: "xxxxxxxxxxxxx"

So I don't understand why my config is wrong.

-- breizh5729
Source: StackOverflow

7/29/2020

How are you handling IAM roles for that pod?

The cluster autoscaler needs an IAM role with permissions to perform some operations on your Auto Scaling groups: https://github.com/helm/charts/tree/master/stable/cluster-autoscaler#iam
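For reference, the minimal policy from that README can be created with the AWS CLI. This is a sketch; the policy name `ClusterAutoscalerPolicy` is just an example:

```shell
# Create the minimal IAM policy the autoscaler needs. The actions are the
# ones listed in the chart README linked above; the policy name is illustrative.
aws iam create-policy \
  --policy-name ClusterAutoscalerPolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }]
  }'
```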

You need to create an IAM role and then the helm template should take care of creating a service account for you that uses that role. Just like they explain here: https://github.com/helm/charts/tree/master/stable/cluster-autoscaler#iam-roles-for-service-accounts-irsa

Once you have the IAM role configured, you would then need to pass `--set rbac.serviceAccountAnnotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName` when installing (the dots inside the annotation key must be backslash-escaped so Helm does not treat them as nested keys).
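Putting it together, an install would look roughly like this (release name, namespace, and role ARN are placeholders). One caveat: when using IRSA, leave `awsAccessKeyID`/`awsSecretAccessKey` empty. The chart injects static keys as environment variables, which take precedence in the AWS SDK credential chain, so a bad static key pair produces exactly the `InvalidClientTokenId` error above even when the role is correct.

```shell
# Sketch of an install using IRSA instead of static keys; names are placeholders.
helm install cluster-autoscaler stable/cluster-autoscaler \
  --namespace kube-system \
  --set cloudProvider=aws \
  --set awsRegion=eu-west-2 \
  --set autoDiscovery.clusterName=my-cluster \
  --set rbac.serviceAccountAnnotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName
```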

This is a pretty good explanation of how it works (although it may be a bit dense if you are just starting out): https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/

-- paulopontesm
Source: StackOverflow