kube2iam - Invalid role: does not match annotated role

4/30/2019

I have a fully working AWS EKS cluster (1.12) on which I use kube2iam for two production services. However, I am receiving an error when attempting to add another service to the cluster, and it has me stumped. I am running kube2iam 0.10.0. Thanks.

errors in kube2iam logs

time="2019-04-29T21:21:57Z" level=info msg="GET /latest (200) took 654374 ns" req.method=GET req.path=/latest req.remote=172.18.3.44 res.duration=654374 res.status=200
time="2019-04-29T21:21:57Z" level=info msg="GET /latest/meta-data/iam/security-credentials/ (200) took 18485 ns" req.method=GET req.path=/latest/meta-data/iam/security-credentials/ req.remote=172.18.3.44 res.duration=18485 res.status=200
time="2019-04-29T21:21:57Z" level=info msg="GET /latest/meta-data/iam/security-credentials/ui-eb-instance (200) took 37702 ns" req.method=GET req.path=/latest/meta-data/iam/security-credentials/ui-eb-instance req.remote=172.18.3.44 res.duration=37702 res.status=200
time="2019-04-29T21:21:57Z" level=info msg="GET /latest/dynamic/instance-identity/document (200) took 869383 ns" req.method=GET req.path=/latest/dynamic/instance-identity/document req.remote=172.18.3.44 res.duration=869383 res.status=200
time="2019-04-29T21:21:57Z" level=info msg="GET /latest/meta-data/iam/info (200) took 574052 ns" req.method=GET req.path=/latest/meta-data/iam/info req.remote=172.18.3.44 res.duration=574052 res.status=200
time="2019-04-29T21:21:57Z" level=error msg="Invalid role: does not match annotated role" ns.name=apollo-ui params.iam.role=e2e3-XXXXXXXXXX pod.iam.role="arn:aws:iam::XXXXXXXXXX:role/ui-eb-instance" req.method=GET req.path=/latest/meta-data/iam/security-credentials/e2e3-XXXXXXXXXX req.remote=172.18.3.44
time="2019-04-29T21:21:57Z" level=info msg="GET /latest/meta-data/iam/security-credentials/e2e3-XXXXXXXXXX (403) took 36262 ns" req.method=GET req.path=/latest/meta-data/iam/security-credentials/e2e3-XXXXXXXXXX req.remote=172.18.3.44 res.duration=36262 res.status=403
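As a reading aid, the error line above can be split into its key=value fields. The sketch below is not part of kube2iam: `parse_logfmt` is a hypothetical helper, and it assumes every value is either bare or double-quoted, as in the lines shown here.

```python
import shlex


def parse_logfmt(line: str) -> dict:
    """Split a logfmt-style line (key=value, values optionally double-quoted) into a dict."""
    fields = {}
    # shlex.split keeps a quoted value attached to its key and drops the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


# The error line from the kube2iam log, copied from above.
line = (
    'time="2019-04-29T21:21:57Z" level=error '
    'msg="Invalid role: does not match annotated role" ns.name=apollo-ui '
    'params.iam.role=e2e3-XXXXXXXXXX '
    'pod.iam.role="arn:aws:iam::XXXXXXXXXX:role/ui-eb-instance" '
    'req.method=GET '
    'req.path=/latest/meta-data/iam/security-credentials/e2e3-XXXXXXXXXX '
    'req.remote=172.18.3.44'
)

fields = parse_logfmt(line)
requested = fields["params.iam.role"]        # role name in the request path
annotated_arn = fields["pod.iam.role"]       # role kube2iam associates with the pod
annotated_name = annotated_arn.rsplit("/", 1)[-1]

print("requested:", requested)
print("annotated:", annotated_name)
print("match:", requested == annotated_name)
```

Read this way, the line says the pod requested credentials under the name e2e3-XXXXXXXXXX (the worker node role) while the role recorded for the pod is ui-eb-instance, and the two names do not match.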

my kubernetes deployment

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-04-29T20:13:44Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: ui
    app.kubernetes.io/managed-by: Tiller
    app.kubernetes.io/name: ui
    helm.sh/chart: ui-0.1.0
  name: ui
  namespace: apollo-ui
  resourceVersion: "3267240"
  selfLink: /apis/extensions/v1beta1/namespaces/apollo-ui/deployments/ui
  uid: 4a3cf375-6abb-11e9-b606-063dfc51737e
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: ui
      app.kubernetes.io/name: ui
  strategy:
    rollingUpdate:
      maxSurge: 50%
      maxUnavailable: 50%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: ui-eb-instance
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: ui
        app.kubernetes.io/name: ui
    spec:
      containers:
      - env:
        ...
        ...
        ...
        image: XXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/ui:709870106
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /health_check
            port: http
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: ui
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /health_check
            port: http
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2019-04-29T20:13:44Z"
    lastUpdateTime: "2019-04-29T20:13:44Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2019-04-29T21:21:33Z"
    lastUpdateTime: "2019-04-29T21:21:33Z"
    message: ReplicaSet "ui-756cf57f9b" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 1
  replicas: 2
  unavailableReplicas: 2
  updatedReplicas: 2

AWS "ui-eb-instance" role

{
    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Principal": {
                        "Service": "ec2.amazonaws.com"
                    },
                    "Effect": "Allow",
                    "Sid": ""
                },
                {
                    "Action": "sts:AssumeRole",
                    "Principal": {
                        "AWS": "arn:aws:iam::XXXXXXXXXX:role/e2e3-XXXXXXXXXX"
                    },
                    "Effect": "Allow",
                    "Sid": ""
                }
            ]
        },
        "MaxSessionDuration": 3600,
        "RoleId": "XXXXXXXXXX",
        "CreateDate": "2018-04-20T18:06:01Z",
        "RoleName": "ui-eb-instance",
        "Path": "/",
        "Arn": "arn:aws:iam::XXXXXXXXXX:role/ui-eb-instance"
    }
}

my AWS EKS worker node e2e3-XXXXXXXXXX role

{
    "Role": {
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "sts:AssumeRole",
                    "Principal": {
                        "Service": "ec2.amazonaws.com"
                    },
                    "Effect": "Allow",
                    "Sid": "EKSWorkerAssumeRole"
                }
            ]
        },
        "MaxSessionDuration": 3600,
        "RoleId": "XXXXXXXXXX",
        "CreateDate": "2019-04-10T23:13:31Z",
        "RoleName": "e2e3-XXXXXXXXXX",
        "Path": "/",
        "Arn": "arn:aws:iam::XXXXXXXXXX:role/e2e3-XXXXXXXXXX"
    }
}

my kube2iam daemonset

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  annotations:
    flux.weave.works/antecedent: infra:helmrelease/kube2iam
  creationTimestamp: "2019-04-16T18:37:54Z"
  generation: 3
  labels:
    app: kube2iam
    chart: kube2iam-0.9.1
    heritage: Tiller
    release: kube2iam
  name: kube2iam
  namespace: infra
  resourceVersion: "3252943"
  selfLink: /apis/extensions/v1beta1/namespaces/infra/daemonsets/kube2iam
  uid: bf258ef8-6076-11e9-b606-063dfc51737e
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: kube2iam
      release: kube2iam
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: kube2iam
        release: kube2iam
    spec:
      containers:
      - args:
        - --host-interface=eni+
        - --node=$(NODE_NAME)
        - --host-ip=$(HOST_IP)
        - --iptables=true
        - --auto-discover-base-arn=true
        - --auto-discover-default-role=true
        - --app-port=8181
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: jtblin/kube2iam:0.10.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8181
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        name: kube2iam
        ports:
        - containerPort: 8181
          hostPort: 8181
          protocol: TCP
        resources: {}
        securityContext:
          privileged: true
          procMount: Default
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: kube2iam
      serviceAccountName: kube2iam
      terminationGracePeriodSeconds: 30
  templateGeneration: 3
  updateStrategy:
    type: OnDelete
status:
  currentNumberScheduled: 2
  desiredNumberScheduled: 2
  numberAvailable: 2
  numberMisscheduled: 0
  numberReady: 2
  observedGeneration: 3
  updatedNumberScheduled: 2

Verifying the kube2iam role inside my problematic ui pod

bash-4.4#  curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
ui-eb-instance
-- Chris
amazon-eks
amazon-web-services
kubernetes

1 Answer

4/30/2019

A couple of things I can think of:

  1. Your node role e2e3-XXXXXXXXXX may not be permitted to assume the ui-eb-instance role. The trust relationship between the two roles is in place, but did you attach a permission policy to e2e3-XXXXXXXXXX that allows sts:AssumeRole?

    You can add an inline policy or attach a managed one. A reasonable starting point is a policy that allows sts:AssumeRole on all resources:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "sts:AssumeRole"
          ],
          "Effect": "Allow",
          "Resource": "*"
        }
      ]
    }
    

    This is also described in the kube2iam docs.

  2. It could also be that the base role ARN is not being auto-discovered by --auto-discover-base-arn, so you could try setting it explicitly: --base-role-arn=arn:aws:iam::xxxxxxx:role/
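
For the first suggestion, once the wildcard policy has confirmed that permissions are the issue, you may want to narrow "Resource" to the roles your pods actually use. Below is a minimal sketch that generates such a policy document. `assume_role_policy` is a hypothetical helper, not part of kube2iam or any AWS SDK, and the ARN is the redacted placeholder from the question.

```python
import json


def assume_role_policy(role_arns):
    """Build an IAM permission policy allowing sts:AssumeRole on the given role ARNs only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["sts:AssumeRole"],
                "Resource": list(role_arns),
            }
        ],
    }


# Placeholder ARN copied from the question; substitute the real account ID.
policy = assume_role_policy(["arn:aws:iam::XXXXXXXXXX:role/ui-eb-instance"])
print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and attached to the node role, for example with `aws iam put-role-policy --role-name e2e3-XXXXXXXXXX --policy-name <your-policy-name> --policy-document file://policy.json`, where the policy name and file name are yours to choose.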

-- Rico
Source: StackOverflow