AWS EKS Terraform - Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found

8/5/2018

I followed "https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html" to create an EKS cluster using Terraform.

I was able to create the config map successfully, but I am unable to get the node details -

$ ./kubectl_1.10.3_darwin get nodes 
No resources found.

Service details -

$ ./kubectl_1.10.3_darwin get services
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   2h

Kubelet logs on the nodes -

Aug  5 09:14:32 ip-172-31-18-205 kubelet: I0805 09:14:32.617738   25463 aws.go:1026] Building AWS cloudprovider
Aug  5 09:14:32 ip-172-31-18-205 kubelet: I0805 09:14:32.618168   25463 aws.go:988] Zone not specified in configuration file; querying AWS metadata service
Aug  5 09:14:32 ip-172-31-18-205 kubelet: E0805 09:14:32.794914   25463 tags.go:94] Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.
Aug  5 09:14:32 ip-172-31-18-205 kubelet: F0805 09:14:32.795622   25463 server.go:233] failed to run Kubelet: could not init cloud provider "aws": AWS cloud failed to find ClusterID
Aug  5 09:14:32 ip-172-31-18-205 systemd: kubelet.service: main process exited, code=exited, status=255/n/a
Aug  5 09:14:32 ip-172-31-18-205 systemd: Unit kubelet.service entered failed state.
Aug  5 09:14:32 ip-172-31-18-205 systemd: kubelet.service failed.

The AWS getting started documentation doesn't mention anything about tags: "https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html".

After a while I found out that I had not added resource tags like "kubernetes.io/cluster/*" to my networking resources.

My networking resources are pre-created, and I use remote state to fetch the required details. I believe I can either add tags to them or create a new VPC environment.

Is there any alternate way to solve this without adding tags or provisioning new resources?

-- vivekyad4v
amazon-eks
amazon-web-services
kubernetes

1 Answer

8/5/2018

Make sure you add a tag like the one below to your VPC, subnets, and ASGs -

"kubernetes.io/cluster/${CLUSTER_NAME}" = "shared"

NOTE: The usage of the specific kubernetes.io/cluster/* resource tags below are required for EKS and Kubernetes to discover and manage networking resources.
NOTE: The usage of the specific kubernetes.io/cluster/* resource tag below is required for EKS and Kubernetes to discover and manage compute resources. - Terraform docs

I had not propagated the tags from the Auto Scaling group to the worker nodes. After I added the code below to my ASG Terraform module, it started working; at least, the nodes were able to join the cluster. You also need to add the tag to the VPC and subnets so that EKS and Kubernetes can discover and manage networking resources.

For VPC -

locals {
  cluster_tags = {
    "kubernetes.io/cluster/${var.project}-${var.env}-cluster" = "shared"
  }
}

resource "aws_vpc" "myvpc" {
  cidr_block = "${var.vpc_cidr}"
  enable_dns_hostnames = true

  tags = "${merge(map("Name", format("%s-%s-vpcs", var.project, var.env)), var.default_tags, local.cluster_tags)}"
}

resource "aws_subnet" "private_subnet" {
  count = "${length(var.private_subnets)}"

  vpc_id            = "${aws_vpc.myvpc.id}"
  cidr_block        = "${var.private_subnets[count.index]}"
  availability_zone = "${element(var.azs, count.index)}"

  tags = "${merge(map("Name", format("%s-%s-pvt-%s", var.project, var.env, element(var.azs, count.index))), var.default_tags, local.cluster_tags)}"
}

resource "aws_subnet" "public_subnet" {
  count = "${length(var.public_subnets)}"

  vpc_id            = "${aws_vpc.myvpc.id}"
  cidr_block        = "${var.public_subnets[count.index]}"
  availability_zone = "${element(var.azs, count.index)}"
  map_public_ip_on_launch = "true"

  tags = "${merge(map("Name", format("%s-%s-pub-%s", var.project, var.env, element(var.azs, count.index))), var.default_tags, local.cluster_tags)}"
}
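
If the VPC and subnets are pre-created and managed elsewhere, as in the question, the tag still has to be added to them, but it may not be necessary to recreate anything. A minimal sketch, assuming an AWS provider version that includes the aws_ec2_tag resource (newer than what this answer was originally written against) and a hypothetical variable existing_subnet_ids holding the subnet IDs read from remote state:

```hcl
# Hypothetical input: IDs of the pre-created subnets, e.g. taken from a
# terraform_remote_state data source. The name is an assumption for illustration.
variable "existing_subnet_ids" {
  default = []
}

# Adds only the cluster tag to each existing subnet, without taking over
# management of the subnet itself.
resource "aws_ec2_tag" "cluster_subnet" {
  count       = "${length(var.existing_subnet_ids)}"
  resource_id = "${element(var.existing_subnet_ids, count.index)}"
  key         = "kubernetes.io/cluster/${var.project}-${var.env}-cluster"
  value       = "shared"
}
```

The same pattern should work for the VPC ID. One caveat: if the configuration that owns the subnets also sets tags on them, it will likely try to remove this tag on its next apply, so adding the tag in that configuration directly may be the more robust choice.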

For ASGs -

resource "aws_autoscaling_group" "asg-node" {
    name = "${var.project}-${var.env}-asg-${aws_launch_configuration.lc-node.name}"

    vpc_zone_identifier       = ["${var.vpc_zone_identifier}"]
    min_size                  = 1
    desired_capacity          = 1
    max_size                  = 1
    target_group_arns         = ["${var.target_group_arns}"]
    default_cooldown          = 100
    health_check_grace_period = 100
    termination_policies      = ["ClosestToNextInstanceHour", "NewestInstance"]
    health_check_type         = "EC2"
    depends_on                = ["aws_launch_configuration.lc-node"]
    launch_configuration      = "${aws_launch_configuration.lc-node.name}"

    lifecycle {
        create_before_destroy = true
    }

    # Tags propagated to every worker instance at launch. The cluster tag is the
    # one the kubelet's AWS cloud provider needs in order to find the ClusterID.
    tags = [
        {
            key                 = "Name"
            value               = "${var.project}-${var.env}-asg-eks"
            propagate_at_launch = true
        },
        {
            key                 = "role"
            value               = "eks-worker"
            propagate_at_launch = true
        },
        {
            key                 = "kubernetes.io/cluster/${var.project}-${var.env}-cluster"
            value               = "owned"
            propagate_at_launch = true
        }
    ]
}

I was able to deploy a sample application after making the changes above.

PS - I am answering my own question because the AWS EKS getting started documentation does not make these instructions clear, and people creating ASGs manually may run into this issue. This might save others some time.

-- vivekyad4v
Source: StackOverflow