I deployed an EKS cluster to AWS via Terraform. There are two Fargate profiles: one for kube-system and the other for default. After creating the cluster, all pods under kube-system are stuck in Pending. The error is:
$ kubectl get pods -A
NAMESPACE     NAME                       READY   STATUS    RESTARTS   AGE
kube-system   coredns-6db676b456-694w8   0/1     Pending   0          3m43s
kube-system   coredns-6db676b456-tddtd   0/1     Pending   0          3m43s
kube-system   coredns-b8f47f545-7wzm8    0/1     Pending   0          78m
$ kubectl describe --namespace kube-system pod coredns-6db676b456-694w8
Warning  FailedScheduling  21s (x3 over 92s)  default-scheduler  no nodes available to schedule pods
It seems Fargate is not scheduling these pods. What did I do wrong? Below is the full Terraform configuration:
resource "aws_eks_cluster" "elk" {
name = "elk"
role_arn = aws_iam_role.elk.arn
version = 1.20
vpc_config {
subnet_ids = [module.vpc.private_subnets[0], module.vpc.private_subnets[1]]
}
# Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
# Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
depends_on = [
aws_iam_role_policy_attachment.elk-AmazonEKSClusterPolicy,
aws_iam_role_policy_attachment.elk-AmazonEKSVPCResourceController,
]
}
output "endpoint" {
value = aws_eks_cluster.elk.endpoint
}
output "kubeconfig-certificate-authority-data" {
value = aws_eks_cluster.elk.certificate_authority[0].data
}
# Fargate
resource "aws_eks_fargate_profile" "elk" {
  cluster_name           = aws_eks_cluster.elk.name
  fargate_profile_name   = "elk_profile"
  pod_execution_role_arn = aws_iam_role.fargate_profile.arn
  subnet_ids             = [module.vpc.private_subnets[0], module.vpc.private_subnets[1]]

  selector {
    namespace = "default"
  }
}

resource "aws_eks_fargate_profile" "kube_system" {
  cluster_name           = aws_eks_cluster.elk.name
  fargate_profile_name   = "kube_system_profile"
  pod_execution_role_arn = aws_iam_role.fargate_profile.arn
  subnet_ids             = [module.vpc.private_subnets[0], module.vpc.private_subnets[1]]

  selector {
    namespace = "kube-system"
  }
}
# IAM Role
resource "aws_iam_role" "elk" {
  name = "eks-cluster-elk"

  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "eks.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
POLICY
}

resource "aws_iam_role_policy_attachment" "elk-AmazonEKSClusterPolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.elk.name
}

# Optionally, enable Security Groups for Pods
# Reference: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
resource "aws_iam_role_policy_attachment" "elk-AmazonEKSVPCResourceController" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController"
  role       = aws_iam_role.elk.name
}
# IAM role for service account
data "tls_certificate" "elk" {
  url = aws_eks_cluster.elk.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "elk" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.elk.certificates[0].sha1_fingerprint]
  url             = aws_eks_cluster.elk.identity[0].oidc[0].issuer
}

data "aws_iam_policy_document" "elk_assume_role_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.elk.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:kube-system:aws-node"]
    }

    principals {
      identifiers = [aws_iam_openid_connect_provider.elk.arn]
      type        = "Federated"
    }
  }
}

# resource "aws_iam_role" "elk" {
#   assume_role_policy = data.aws_iam_policy_document.elk_assume_role_policy.json
#   name               = "elk"
# }
# IAM role for fargate profile
resource "aws_iam_role" "fargate_profile" {
  name = "eks-fargate-profile"

  assume_role_policy = jsonencode({
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "eks-fargate-pods.amazonaws.com"
      }
    }]
    Version = "2012-10-17"
  })
}

resource "aws_iam_role_policy_attachment" "AmazonEKSFargatePodExecutionRolePolicy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy"
  role       = aws_iam_role.fargate_profile.name
}
It is possible that you need to patch the CoreDNS deployment. By default it is configured to run only on EC2 worker nodes and not on Fargate. See the "(Optional) Update CoreDNS" section in this doc page.
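For reference, a minimal sketch of what that section does, assuming your CoreDNS deployment still carries the default eks.amazonaws.com/compute-type: ec2 annotation: remove the annotation from the pod template and restart the rollout so the new pods can match your kube_system Fargate profile. (The ~1 is JSON Pointer escaping for the / in the annotation key.)

# Remove the compute-type annotation that pins CoreDNS to EC2 nodes
$ kubectl patch deployment coredns \
    -n kube-system \
    --type json \
    -p='[{"op": "remove", "path": "/spec/template/metadata/annotations/eks.amazonaws.com~1compute-type"}]'
# Recreate the CoreDNS pods so they get scheduled onto Fargate
$ kubectl rollout restart -n kube-system deployment coredns

After the rollout, the CoreDNS pods should come up Running on fargate-* nodes; you can confirm with kubectl get pods -n kube-system -o wide.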