I have been trying to run an external-dns pod using the guide provided by the k8s-sig group. I have followed every step of the guide, but I am getting the error below.
time="2021-02-27T13:27:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 87a3ca86-ceb0-47be-8f90-25d0c2de9f48"
I had created the AWS IAM policy using Terraform, and it was created successfully. Everything has been spun up via Terraform except the IAM role for the service account, for which I used eksctl.
But then I came across this article, which says that creating the AWS IAM policy using awscli would eliminate this error. So I deleted the policy created using Terraform and recreated it with awscli. Yet it still throws the same error.
Below is my external dns yaml file.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-dns
  # If you're using Amazon EKS with IAM Roles for Service Accounts, specify the following annotation.
  # Otherwise, you may safely omit it.
  annotations:
    # Substitute your account ID and IAM service role name below.
    eks.amazonaws.com/role-arn: arn:aws:iam::268xxxxxxx:role/eksctl-ats-Eks1-addon-iamserviceaccoun-Role1-WMLL93xxxx
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=xyz.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=Z0471542U7WSPZxxxx
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
I am scratching my head, as I cannot find a proper solution to this error anywhere on the net. I am hoping to find a solution to this issue in this forum.
The end result should look something like the line below, with the records filled in in the hosted zone.
time="2020-05-05T02:57:31Z" level=info msg="All records are already up to date"
I have a few possibilities here.
Before anything else, does your cluster have an OIDC provider associated with it? IRSA won't work without it.
You can check that in the AWS console, or via the CLI with:
aws eks describe-cluster --name {name} --query "cluster.identity.oidc.issuer"
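An issuer URL on the cluster does not by itself mean that a matching OIDC provider is registered in IAM, so it can be worth checking both sides. The following is a minimal sketch, assuming the AWS CLI is configured for the right account; the function names and the `my-cluster` placeholder are hypothetical and not part of the guide.

```shell
# Hypothetical helper: strip the scheme from the issuer URL so the result can be
# matched against IAM provider ARNs (arn:aws:iam::<account>:oidc-provider/<host>/id/<id>).
issuer_to_provider_path() {
  printf '%s\n' "${1#https://}"
}

# Hypothetical helper: succeeds when an IAM OIDC provider matching the cluster's
# issuer is registered in the current account.
has_oidc_provider() {
  local cluster="$1"
  local issuer path
  issuer="$(aws eks describe-cluster --name "$cluster" \
    --query "cluster.identity.oidc.issuer" --output text)" || return 1
  path="$(issuer_to_provider_path "$issuer")"
  aws iam list-open-id-connect-providers \
    --query "OpenIDConnectProviderList[].Arn" --output text \
    | tr '\t' '\n' | grep -qF -- "oidc-provider/${path}"
}

# Usage (cluster name is an assumption):
#   has_oidc_provider my-cluster \
#     || eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve
```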
First
Delete the iamserviceaccount, recreate it, remove the ServiceAccount definition from your ExternalDNS manifest (the entire first section), and re-apply it.
eksctl delete iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster}
eksctl create iamserviceaccount --name {name} --namespace {namespace} --cluster {cluster} --attach-policy-arn {policy-arn} --approve --override-existing-serviceaccounts
kubectl apply -n {namespace} -f {your-externaldns-manifest.yaml}
It may be that there is a conflict, because by also specifying a ServiceAccount in your ExternalDNS manifest you have overwritten what you created with eksctl create iamserviceaccount.
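After re-applying, one way to confirm that the role actually reached the pod is to look for the two environment variables that the EKS pod identity webhook injects, `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. This is only a sketch: it assumes the `app=external-dns` label from the manifest in the question, and `has_irsa_env` and `check_pod_irsa` are hypothetical helper names.

```shell
# Hypothetical helper: takes a space-separated list of env var names and
# succeeds only if both IRSA variables are present.
has_irsa_env() {
  case " $1 " in
    *" AWS_ROLE_ARN "*) ;;
    *) return 1 ;;
  esac
  case " $1 " in
    *" AWS_WEB_IDENTITY_TOKEN_FILE "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: reads the env var names of the first container of the
# first external-dns pod in the given namespace (default: "default").
check_pod_irsa() {
  local ns="${1:-default}"
  local names
  names="$(kubectl get pods -n "$ns" -l app=external-dns \
    -o jsonpath='{.items[0].spec.containers[0].env[*].name}')" || return 1
  has_irsa_env "$names"
}

# Usage:
#   check_pod_irsa default || echo "IRSA variables missing: check the ServiceAccount annotation"
```

If the variables are missing, the pod was most likely created before the ServiceAccount carried the annotation, and it needs to be recreated.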
Second
Upgrade your cluster to v1.19 (if it's not there already):
eksctl upgrade cluster --name {name}
will show you what will be done;
eksctl upgrade cluster --name {name} --approve
will do it
Third
Some documentation suggests that in addition to setting securityContext.fsGroup: 65534, you also need to set securityContext.runAsUser: 0.
I've been struggling with a similar issue after following the setup suggested here.
I ended up with the exception below in the deploy logs.
time="2021-05-10T06:40:17Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 3fda6c69-2a0a-4bc9-b478-521b5131af9b"
time="2021-05-10T06:41:20Z" level=error msg="records retrieval failed: failed to list hosted zones: WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 7d3e07a2-c514-44fa-8e79-d49314d9adb6"
In my case, the problem was that the wrong service account name was mapped to the newly created role.
Here is a step-by-step approach to get this done without many hiccups.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "route53:ChangeResourceRecordSets"
      ],
      "Resource": [
        "arn:aws:route53:::hostedzone/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "route53:ListHostedZones",
        "route53:ListResourceRecordSets"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
eksctl create iamserviceaccount \
  --name external-dns-sa-eks \
  --namespace default \
  --cluster aecops-grpc-test \
  --attach-policy-arn arn:aws:iam::xxxxxxxx:policy/external-dns-policy-eks \
  --approve --override-existing-serviceaccounts
aws route53 create-hosted-zone --name "hosted.domain.com." --caller-reference "grpc-endpoint-external-dns-test-$(date +%s)"
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns-sa-eks
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
      # If you're using kiam or kube2iam, specify the following annotation.
      # Otherwise, you may safely omit it.
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::***********:role/eksctl-eks-cluster-name-addon-iamserviceacco-Role1-156KP94SN7D7
    spec:
      serviceAccountName: external-dns-sa-eks
      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.6
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=hosted.domain.com. # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        - --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --registry=txt
        - --txt-owner-id=my-hostedzone-identifier
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
For ingress objects, ExternalDNS will create a DNS record based on the host specified for the ingress object.
- host: myapp.hosted.domain.com
BASH-3.2$ aws route53 list-resource-record-sets --output json \
  --hosted-zone-id "/hostedzone/Z065*********" --query "ResourceRecordSets[?Name == 'hosted.domain.com..']|[?Type == 'A']"
[
  {
    "Name": "myapp.hosted.domain.com..",
    "Type": "A",
    "AliasTarget": {
      "HostedZoneId": "ZCT6F*******",
      "DNSName": "****************.elb.ap-southeast-2.amazonaws.com.",
      "EvaluateTargetHealth": true
    }
  }
]
In our case this issue occurred when using the Terraform module to create the EKS cluster and eksctl to create the iamserviceaccount for the aws-load-balancer-controller. It all works fine the first time around. But if you do a terraform destroy, you need to do some cleanup, such as deleting the CloudFormation stack created by eksctl. Somehow things got crossed and, according to CloudTrail, a role that was no longer valid was being passed along. So check the annotation on the service account to ensure it's valid, and update it if necessary. Then, in my case, I deleted and redeployed the aws-load-balancer-controller.
%> kubectl describe serviceaccount aws-load-balancer-controller -n kube-system
Name: aws-load-balancer-controller
Namespace: kube-system
Labels: app.kubernetes.io/managed-by=eksctl
Annotations: eks.amazonaws.com/role-arn: arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-JQL4R3JM7I1A
Image pull secrets: <none>
Mountable secrets: aws-load-balancer-controller-token-b8hw7
Tokens: aws-load-balancer-controller-token-b8hw7
Events: <none>
%>
%> kubectl annotate --overwrite serviceaccount aws-load-balancer-controller eks.amazonaws.com/role-arn='arn:aws:iam::212222224610:role/eksctl-ch-test-addon-iamserviceaccou-Role1-17A92GGXZRY6O' -n kube-system
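The check described above can be scripted. The sketch below reads the role ARN from the service account annotation and asks IAM whether that role still exists; it assumes `kubectl` and the AWS CLI both point at the right cluster and account, and the helper names are hypothetical.

```shell
# Hypothetical helper: the role name is the last path segment of the role ARN.
role_name_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: succeeds only if the service account carries an
# eks.amazonaws.com/role-arn annotation AND that role exists in IAM.
annotated_role_exists() {
  local sa="$1" ns="$2"
  local arn
  arn="$(kubectl get serviceaccount "$sa" -n "$ns" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}')" || return 1
  [ -n "$arn" ] || return 1
  aws iam get-role --role-name "$(role_name_from_arn "$arn")" >/dev/null
}

# Usage:
#   annotated_role_exists aws-load-balancer-controller kube-system \
#     || echo "annotation points at a missing role"
```

Pods only pick up a changed annotation when they are recreated, which is consistent with having to redeploy the controller after fixing it.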
In my case, attaching the Route 53 permissions policy to the OIDC role resolved the error:
https://medium.com/swlh/amazon-eks-setup-external-dns-with-oidc-provider-and-kube2iam-f2487c77b2a1
I then used that role in the external-dns service account annotation, instead of the cluster role.
annotations:
  # Substitute your account ID and IAM service role name below.
  eks.amazonaws.com/role-arn: arn:aws:iam::<account>:role/external-dns-service-account-oidc-role
I also struggled with this error.
The problem was in the definition of the trust relationship.
You can see the following setup in some official AWS tutorials (like this):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:<my-namespace>:<my-service-account>"
        }
      }
    }
  ]
}
Option 1 for failure
My problem was that I passed the wrong value for my-service-account at the end of ${OIDC_PROVIDER}:sub in the Condition part.
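A quick way to catch this mismatch is to build the `sub` value that the pod's token will carry and look for it in the role's trust policy. The sketch below assumes the AWS CLI is configured; `expected_sub` and `trust_policy_allows` are hypothetical helper names, and the role name in the usage line is a placeholder.

```shell
# Hypothetical helper: the subject claim of a service account token has the
# form system:serviceaccount:<namespace>:<service-account-name>.
expected_sub() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Hypothetical helper: succeeds if the role's trust policy contains the exact
# quoted subject string for the given namespace and service account.
trust_policy_allows() {
  local role="$1" ns="$2" sa="$3"
  aws iam get-role --role-name "$role" \
    --query 'Role.AssumeRolePolicyDocument' --output json \
    | grep -qF -- "\"$(expected_sub "$ns" "$sa")\""
}

# Usage (role name is a placeholder):
#   trust_policy_allows my-external-dns-role default external-dns \
#     || echo "trust policy does not mention this service account"
```

This is a plain string match, so it will not recognize policies that use wildcards with StringLike; it is meant only to spot a mistyped name or namespace.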
Option 2 for failure
After the previous fix, I still faced the same error. It was solved by following this AWS tutorial, which shows the result of running eksctl with the command below:
eksctl create iamserviceaccount \
--name my-serviceaccount \
--namespace <your-ns> \
--cluster <your-cluster-name> \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
--approve
When you look at the result in the trust relationship tab in the AWS web console, you can see that an additional condition was added with the suffix :aud and the value sts.amazonaws.com.
So this needs to be added after the "${OIDC_PROVIDER}:sub" condition.
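Put together, a trust policy with both conditions could be generated as in the sketch below. The account ID, OIDC provider path, namespace, and service account name are all parameters you supply; `render_trust_policy` is a hypothetical helper, and the output simply mirrors the tutorial's JSON with the `:aud` line added.

```shell
# Hypothetical helper: print a trust policy with both the :sub and :aud
# conditions. Arguments: account ID, OIDC provider (issuer without https://),
# namespace, service account name.
render_trust_policy() {
  local account="$1" oidc="$2" ns="$3" sa="$4"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account}:oidc-provider/${oidc}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${oidc}:sub": "system:serviceaccount:${ns}:${sa}",
          "${oidc}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
}

# Usage (all values are placeholders):
#   render_trust_policy 111122223333 oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE \
#     default external-dns > trust.json
#   aws iam update-assume-role-policy --role-name my-role \
#     --policy-document file://trust.json
```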
I was able to get help from the Kubernetes Slack (shout out to @Rob Del) and this is what we came up with. There's nothing wrong with the k8s rbac from the article, the issue is the way the IAM role is written. I am using Terraform v0.12.24, but I believe something similar to the following .tf should work for Terraform v0.14:
data "aws_caller_identity" "current" {}

resource "aws_iam_role" "external_dns_role" {
  name = "external-dns"
  assume_role_policy = jsonencode({
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Federated": format(
            "arn:aws:iam::${data.aws_caller_identity.current.account_id}:%s",
            replace(
              "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}",
              "https://",
              "oidc-provider/"
            )
          )
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {
            format(
              "%s:sub",
              trimprefix(
                "${aws_eks_cluster.<YOUR_CLUSTER_NAME>.identity[0].oidc[0].issuer}",
                "https://"
              )
            ) : "system:serviceaccount:default:external-dns"
          }
        }
      }
    ]
  })
}
The above .tf assumes that you created your EKS cluster using Terraform and that you use the RBAC manifest from the external-dns tutorial.