Not authorized to perform sts:AssumeRoleWithWebIdentity - 403

2/28/2021

I have been trying to run an external-dns pod using the guide provided by the k8s-sig group. I have followed every step of the guide and am getting the error below.

time="2021-02-27T13:27:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 87a3ca86-ceb0-47be-8f90-25d0c2de9f48"

I had created the AWS IAM policy using Terraform, and it was created successfully. Everything else has been spun up via Terraform, except the IAM role for the service account, for which I used eksctl.

But then I got hold of this article, which says creating the AWS IAM policy using awscli would eliminate this error. So I deleted the policy created using Terraform and recreated it with awscli. Yet it is throwing the same error.

Below is my external-dns YAML file.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  # If you're using Amazon EKS with IAM Roles for Service Accounts, specify the following annotation.
  # Otherwise, you may safely omit it.
  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::268xxxxxxx:role/eksctl-ats-Eks1-addon-iamserviceaccoun-Role1-WMLL93xxxx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=xyz.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=Z0471542U7WSPZxxxx
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files

I am scratching my head, as there is no proper solution to this error anywhere on the net. Hoping to find a solution to this issue in this forum.

The end result should look something like the below and fill up the records in the hosted zone.

time="2020-05-05T02:57:31Z" level=info msg="All records are already up to date"
-- CK5
amazon-web-services
aws-cli
eksctl
external-dns
kubernetes

6 Answers

3/2/2021

There are a few possibilities here.

Before anything else, does your cluster have an OIDC provider associated with it? IRSA won't work without it.

You can check that in the AWS console, or via the CLI with:

aws eks describe-cluster --name {name} --query "cluster.identity.oidc.issuer"
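If no issuer URL comes back, the cluster has no OIDC provider associated yet, and you need one before IRSA can work. eksctl can associate it for you (substitute your cluster name):

```shell
eksctl utils associate-iam-oidc-provider --cluster {name} --approve
```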

First

Delete the iamserviceaccount, recreate it, remove the ServiceAccount definition from your ExternalDNS manifest (the entire first section), and re-apply it.

eksctl delete iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster}
eksctl create iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster} --attach-policy-arn {policy-arn} --approve --override-existing-serviceaccounts
kubectl apply -n {namespace} -f {your-externaldns-manifest.yaml}

It may be that there is some conflict because you have overwritten what you created with eksctl create iamserviceaccount by also specifying a ServiceAccount in your ExternalDNS manifest.

Second

Upgrade your cluster to v1.19 (if it's not there already):

eksctl upgrade cluster --name {name} will show you what will be done;

eksctl upgrade cluster --name {name} --approve will do it

Third

Some documentation suggests that in addition to setting securityContext.fsGroup: 65534, you also need to set securityContext.runAsUser: 0.
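That is, the pod-level securityContext in your Deployment would become (a sketch; running as root is only needed in some setups to read the projected token files):

```yaml
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
        runAsUser: 0   # Some documentation additionally requires root for token access
```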

-- LiquidPony
Source: StackOverflow

5/10/2021

I've been struggling with a similar issue after following the setup suggested here

I ended up with the errors below in the deploy logs.

time="2021-05-10T06:40:17Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3fda6c69-2a0a-4bc9-b478-521b5131af9b"
time="2021-05-10T06:41:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 7d3e07a2-c514-44fa-8e79-d49314d9adb6"

In my case, it was an issue with the wrong service account name mapped to the newly created role.

Here is a step-by-step approach to get this done without many hiccups.

  1. Create the IAM Policy
{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "route53:ChangeResourceRecordSets"
          ],
          "Resource": [
            "arn:aws:route53:::hostedzone/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "route53:ListHostedZones",
            "route53:ListResourceRecordSets"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }
  1. Create the IAM role and the service account for your EKS cluster.
eksctl create iamserviceaccount \
    --name external-dns-sa-eks \
    --namespace default \
    --cluster aecops-grpc-test \
    --attach-policy-arn arn:aws:iam::xxxxxxxx:policy/external-dns-policy-eks  \
    --approve \
    --override-existing-serviceaccounts
  1. Create a new hosted zone.

aws route53 create-hosted-zone --name "hosted.domain.com." --caller-reference "grpc-endpoint-external-dns-test-$(date +%s)"

  1. Deploy ExternalDNS after creating the ClusterRole and ClusterRoleBinding for the previously created service account.
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns-sa-eks
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
      # If you're using kiam or kube2iam, specify the following annotation.
      # Otherwise, you may safely omit it.
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::***********:role/eksctl-eks-cluster-name-addon-iamserviceacco-Role1-156KP94SN7D7
    spec:
      serviceAccountName: external-dns-sa-eks
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=hosted.domain.com. # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=my-hostedzone-identifier
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
  1. Update the Ingress resource with the domain name and reapply the manifest.

For ingress objects, ExternalDNS will create a DNS record based on the host specified for the ingress object.

- host: myapp.hosted.domain.com

  1. Validate the new records created.
BASH-3.2$ aws route53 list-resource-record-sets --output json \
  --hosted-zone-id "/hostedzone/Z065*********" --query "ResourceRecordSets[?Name == 'hosted.domain.com.']|[?Type == 'A']"

[
    {
        "Name": "myapp.hosted.domain.com..",
        "Type": "A",
        "AliasTarget": {
            "HostedZoneId": "ZCT6F*******",
            "DNSName": "****************.elb.ap-southeast-2.amazonaws.com.",
            "EvaluateTargetHealth": true
        }
    } ]
-- anrajme
Source: StackOverflow

8/24/2021

In our case, this issue occurred when using the Terraform module to create the EKS cluster and eksctl to create the iamserviceaccount for the aws-load-balancer-controller. It all works fine the first go-round. But if you do a terraform destroy, you need to do some cleanup, like deleting the CloudFormation stack created by eksctl. Somehow things got crossed, and the service account annotation referenced a role that was no longer valid. So check the annotation of the service account to ensure it's valid, and update it if necessary. Then, in my case, I deleted and redeployed the aws-load-balancer-controller.

%> kubectl describe serviceaccount aws-load-balancer-controller -n kube-system        
Name:                aws-load-balancer-controller
Namespace:           kube-system
Labels:              app.kubernetes.io/managed-by=eksctl
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-JQL4R3JM7I1A
Image pull secrets:  <none>
Mountable secrets:   aws-load-balancer-controller-token-b8hw7
Tokens:              aws-load-balancer-controller-token-b8hw7
Events:              <none>
%>

%> kubectl annotate --overwrite serviceaccount aws-load-balancer-controller eks.amazonaws.com/role-arn='arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-17A92GGXZRY6O' -n kube-system
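Since the role ARN and web identity token are injected into pods at creation time, existing pods keep the stale credentials; restarting the deployment (assuming it lives in kube-system) picks up the corrected annotation:

```shell
kubectl rollout restart deployment aws-load-balancer-controller -n kube-system
```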
-- Gary Morris
Source: StackOverflow

4/4/2022

In my case, attaching the OIDC role with a Route 53 permissions policy resolved the error.

https://medium.com/swlh/amazon-eks-setup-external-dns-with-oidc-provider-and-kube2iam-f2487c77b2a1

Then, with the external-dns service account, I used that role instead of the cluster role.

  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::<account>:role/external-dns-service-account-oidc-role
-- user17313446
Source: StackOverflow

5/5/2021

I also struggled with this error.

The problem was in the definition of the trust relationship.

You can see the following setup in some official AWS tutorials (like this):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>"
        }
      }
    }
  ]
}

Option 1 for failure

My problem was that I passed a wrong value for <my-service-account> at the end of ${OIDC_PROVIDER}:sub in the Condition part.
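One way to catch this mistake is to dump the role's trust policy and compare the :sub condition against the namespace and service account name the pod actually uses (substitute your role name):

```shell
aws iam get-role --role-name <role-name> --query 'Role.AssumeRolePolicyDocument.Statement[0].Condition'
```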

Option 2 for failure

After the previous fix I still faced the same error. It was solved by following this AWS tutorial, which shows the output of using eksctl with the command below:

eksctl create iamserviceaccount \
                --name my-serviceaccount \
                --namespace <your-ns> \
                --cluster <your-cluster-name> \
                --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
                --approve

When you look at the output in the Trust relationships tab in the AWS web console, you can see that an additional condition was added with the postfix :aud and the value sts.amazonaws.com.


So this needs to be added after the "${OIDC_PROVIDER}:sub" condition.
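With that addition, the Condition block (using the same placeholders as above) would look like this:

```json
"Condition": {
  "StringEquals": {
    "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>",
    "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
  }
}
```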

-- RtmY
Source: StackOverflow

3/12/2021

I was able to get help from the Kubernetes Slack (shout-out to @Rob Del), and this is what we came up with. There's nothing wrong with the k8s RBAC from the article; the issue is the way the IAM role is written. I am using Terraform v0.12.24, but I believe something similar to the following .tf should work for Terraform v0.14:

data "aws_caller_identity" "current" {}

resource "aws_iam_role" "external_dns_role" {
  name = "external-dns"

  assume_role_policy = jsonencode({
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": format(
            "arn:aws:iam::${data.aws_caller_identity.current.account_id}:%s", 
            replace(
              "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}", 
              "https://", 
              "oidc-provider/"
            )
          )
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            format(
              "%s:sub", 
              trimprefix(
                "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}", 
                "https://"
              )
            ) : "system:serviceaccount:default:external-dns"
          }
        }
      }
    ]
  })
}

The above .tf assumes you created your EKS cluster using Terraform and that you use the RBAC manifest from the external-dns tutorial.
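For anyone puzzling over the format/replace calls in the .tf: they just turn the cluster's OIDC issuer URL into the federated provider ARN and the :sub condition key. The same transformation in Python, with a made-up issuer URL and account ID purely for illustration:

```python
# Hypothetical values for illustration only.
account_id = "123456789012"
issuer = "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Strip the scheme; IAM identifies the OIDC provider by host/path, not URL.
provider = issuer.replace("https://", "", 1)

# Federated principal ARN for the trust policy's Principal block.
federated_arn = f"arn:aws:iam::{account_id}:oidc-provider/{provider}"

# Condition key that must match "system:serviceaccount:<namespace>:<name>".
sub_key = f"{provider}:sub"
sub_value = "system:serviceaccount:default:external-dns"

print(federated_arn)
print(sub_key, "=", sub_value)
```

If either the namespace or service account name in sub_value differs from what the pod runs as, STS returns exactly the 403 from the question.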

-- Ryan Walden
Source: StackOverflow