I followed "https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html" to create an EKS cluster using Terraform.
I was able to create the config map successfully, but I am unable to get the node details -
$ ./kubectl_1.10.3_darwin get nodes
No resources found.
Service details -
$ ./kubectl_1.10.3_darwin get services
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   2h
Kubelet logs on the nodes -
Aug 5 09:14:32 ip-172-31-18-205 kubelet: I0805 09:14:32.617738 25463 aws.go:1026] Building AWS cloudprovider
Aug 5 09:14:32 ip-172-31-18-205 kubelet: I0805 09:14:32.618168 25463 aws.go:988] Zone not specified in configuration file; querying AWS metadata service
Aug 5 09:14:32 ip-172-31-18-205 kubelet: E0805 09:14:32.794914 25463 tags.go:94] Tag "KubernetesCluster" nor "kubernetes.io/cluster/..." not found; Kubernetes may behave unexpectedly.
Aug 5 09:14:32 ip-172-31-18-205 kubelet: F0805 09:14:32.795622 25463 server.go:233] failed to run Kubelet: could not init cloud provider "aws": AWS cloud failed to find ClusterID
Aug 5 09:14:32 ip-172-31-18-205 systemd: kubelet.service: main process exited, code=exited, status=255/n/a
Aug 5 09:14:32 ip-172-31-18-205 systemd: Unit kubelet.service entered failed state.
Aug 5 09:14:32 ip-172-31-18-205 systemd: kubelet.service failed.
The AWS getting started documentation doesn't mention anything about the required tags: "https://docs.aws.amazon.com/eks/latest/userguide/getting-started.html".
After a while I found out that I had missed adding resource tags like "kubernetes.io/cluster/*" to my networking resources.
My networking resources are pre-created; I use remote state to fetch the required details. I believe I can either add the tags to them OR create a new VPC environment.
Is there any alternate way to solve this without adding tags or provisioning new resources?
Make sure you add a tag like the one below to your VPC, subnets & ASGs -
"kubernetes.io/cluster/${CLUSTER_NAME}" = "shared"
NOTE: The usage of the specific kubernetes.io/cluster/* resource tags below are required for EKS and Kubernetes to discover and manage networking resources.
NOTE: The usage of the specific kubernetes.io/cluster/* resource tag below is required for EKS and Kubernetes to discover and manage compute resources. - Terraform docs
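Since your networking resources are pre-created, you don't need to provision anything new - the tags can be attached to the existing resources in place. A minimal sketch with the AWS CLI (the resource IDs and ASG name are placeholders of mine):

# Tag an existing VPC and its subnets for cluster discovery
aws ec2 create-tags \
  --resources vpc-0abc1234 subnet-0abc1234 subnet-0def5678 \
  --tags Key=kubernetes.io/cluster/${CLUSTER_NAME},Value=shared

# Tag an existing Auto Scaling group and propagate the tag to new instances
aws autoscaling create-or-update-tags \
  --tags "ResourceId=my-eks-asg,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/${CLUSTER_NAME},Value=owned,PropagateAtLaunch=true"

Note that PropagateAtLaunch only applies to instances launched after the tag is added, so existing workers may need to be replaced (or tagged individually with ec2 create-tags) before the kubelet can find its ClusterID.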
I had missed propagating the tags through the auto-scaling group to the worker nodes. I added the code below to my ASG Terraform module and it started working - at least the nodes were able to connect to the master cluster. You also need to add the tag to the VPC & subnets for EKS and Kubernetes to discover and manage networking resources.
For VPC -
locals {
  cluster_tags = {
    "kubernetes.io/cluster/${var.project}-${var.env}-cluster" = "shared"
  }
}

resource "aws_vpc" "myvpc" {
  cidr_block           = "${var.vpc_cidr}"
  enable_dns_hostnames = true

  # Merge the Name tag, any default tags, and the cluster discovery tag.
  # cluster_tags is defined in the locals block above, so reference it as
  # local.cluster_tags (not var.cluster_tags).
  tags = "${merge(map("Name", format("%s-%s-vpcs", var.project, var.env)), var.default_tags, local.cluster_tags)}"
}

resource "aws_subnet" "private_subnet" {
  count             = "${length(var.private_subnets)}"
  vpc_id            = "${aws_vpc.myvpc.id}"
  cidr_block        = "${var.private_subnets[count.index]}"
  availability_zone = "${element(var.azs, count.index)}"

  tags = "${merge(map("Name", format("%s-%s-pvt-%s", var.project, var.env, element(var.azs, count.index))), var.default_tags, local.cluster_tags)}"
}

resource "aws_subnet" "public_subnet" {
  count                   = "${length(var.public_subnets)}"
  vpc_id                  = "${aws_vpc.myvpc.id}"
  cidr_block              = "${var.public_subnets[count.index]}"
  availability_zone       = "${element(var.azs, count.index)}"
  map_public_ip_on_launch = true

  tags = "${merge(map("Name", format("%s-%s-pub-%s", var.project, var.env, element(var.azs, count.index))), var.default_tags, local.cluster_tags)}"
}
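The cluster name embedded in the tag must match the name of the EKS cluster itself, otherwise discovery still fails. As a minimal sketch of how the two line up (aws_eks_cluster is the standard resource; the IAM role variable is a placeholder of mine):

resource "aws_eks_cluster" "eks" {
  # Must be the exact string used inside the kubernetes.io/cluster/... tag
  name     = "${var.project}-${var.env}-cluster"
  role_arn = "${var.eks_cluster_role_arn}" # placeholder variable

  vpc_config {
    subnet_ids = ["${concat(aws_subnet.private_subnet.*.id, aws_subnet.public_subnet.*.id)}"]
  }
}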
For ASGs -
resource "aws_autoscaling_group" "asg-node" {
name = "${var.project}-${var.env}-asg-${aws_launch_configuration.lc-node.name}"
vpc_zone_identifier = ["${var.vpc_zone_identifier}"]
min_size = 1
desired_capacity = 1
max_size = 1
target_group_arns = ["${var.target_group_arns}"]
default_cooldown= 100
health_check_grace_period = 100
termination_policies = ["ClosestToNextInstanceHour", "NewestInstance"]
health_check_type="EC2"
depends_on = ["aws_launch_configuration.lc-node"]
launch_configuration = "${aws_launch_configuration.lc-node.name}"
lifecycle {
create_before_destroy = true
}
tags = ["${data.null_data_source.tags.*.outputs}"]
tags = [
{
key = "Name"
value = "${var.project}-${var.env}-asg-eks"
propagate_at_launch = true
},
{
key = "role"
value = "eks-worker"
propagate_at_launch = true
},
{
key = "kubernetes.io/cluster/${var.project}-${var.env}-cluster"
value = "owned"
propagate_at_launch = true
}
]
}
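To verify the fix, you can check that the tag actually reached the group and then watch for the nodes to register (the ASG name is a placeholder):

# Confirm the kubernetes.io/cluster/* tag exists on the group
aws autoscaling describe-tags \
  --filters "Name=auto-scaling-group,Values=myproject-dev-asg-eks"

# Once tagged instances launch, the nodes should appear
./kubectl_1.10.3_darwin get nodes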
I was able to deploy a sample application after the above changes.
PS - Answering this since the AWS EKS getting started documentation isn't very clear about these instructions, and people trying to create ASGs manually may run into this issue. Hopefully this saves others some time.