I followed this tutorial: Kubernetes Cluster with private networking on AWS using Kops
However, after creating the Kubernetes cluster, I get the following error when validating it:
$ kops validate cluster
Using cluster from kubectl context: k8s-cluster.mydomain.com
Validating cluster k8s-cluster.mydomain.com
unexpected error during validation: error listing nodes: Get https://subdomain.eu-central-1.elb.amazonaws.com/api/v1/nodes: EOF
Any ideas on how to debug or resolve this issue?
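One way to gather more information is to check whether any masters ever registered as healthy behind the API load balancer (a sketch; the ELB name below is a placeholder that can be read from the EC2 console or from aws elb describe-load-balancers):
$ curl -k https://subdomain.eu-central-1.elb.amazonaws.com/api/v1/nodes
$ aws elb describe-instance-health --load-balancer-name <api-elb-name> --region ${REGION}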
The steps I used to create the cluster are below:
Setup VPC and Subnets
Create the VPC
$ aws ec2 create-vpc --cidr-block 10.0.0.0/16 --region eu-central-1
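The ${VPC_ID}-style variables in the following commands are assumed to be captured from each call's output; a minimal sketch using --query (the same pattern works for the gateway, subnet, allocation, and route table IDs below):
$ VPC_ID=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 --region eu-central-1 --query 'Vpc.VpcId' --output text)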
Allow DNS hostnames
$ aws ec2 modify-vpc-attribute --vpc-id ${VPC_ID} --enable-dns-hostnames "{\"Value\":true}" --region ${REGION}
Create internet gateway
$ aws ec2 create-internet-gateway --region ${REGION}
Attach internet gateway to VPC
$ aws ec2 attach-internet-gateway --internet-gateway-id ${INTERNET_GATEWAY_ID} --vpc-id ${VPC_ID} --region ${REGION}
[PUBLIC SUBNETS] Create three public subnets, one per availability zone (3x); a loop sketch follows the command
$ aws ec2 create-subnet --vpc-id ${VPC_ID} --cidr-block 10.0.0.0/20 --availability-zone ${AVAILABILITY_ZONE_1} --region ${REGION}
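As a sketch, the three calls can be expressed as a loop; note the second and third CIDR blocks (10.0.16.0/20 and 10.0.32.0/20) are my assumption of contiguous /20s, not values from the tutorial:
$ CIDRS=(10.0.0.0/20 10.0.16.0/20 10.0.32.0/20)
$ AZS=(${AVAILABILITY_ZONE_1} ${AVAILABILITY_ZONE_2} ${AVAILABILITY_ZONE_3})
$ for i in 0 1 2; do
    aws ec2 create-subnet --vpc-id ${VPC_ID} --cidr-block ${CIDRS[$i]} --availability-zone ${AZS[$i]} --region ${REGION}
  done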
Set public subnets to auto-assign public IPs to instances (3x)
$ aws ec2 modify-subnet-attribute --subnet-id ${PUBLIC_SUBNET_1} --map-public-ip-on-launch --region ${REGION}
[PRIVATE SUBNETS] Create three private subnets, one per availability zone (3x)
$ aws ec2 create-subnet --vpc-id ${VPC_ID} --cidr-block 10.0.48.0/20 --availability-zone ${AVAILABILITY_ZONE_1} --region ${REGION}
[SETUP NAT GATEWAYS] Allocate Elastic IP addresses (3x)
$ aws ec2 allocate-address --domain vpc --region ${REGION}
Create a NAT gateway in each public subnet (3x)
$ aws ec2 create-nat-gateway --subnet-id ${PUBLIC_SUBNET_1} --allocation-id ${EIP_ALLOCATION_ID_1} --region ${REGION}
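NAT gateways take a few minutes to become available, and the private routes created below will not pass traffic against a still-pending gateway; the CLI ships a waiter for this:
$ aws ec2 wait nat-gateway-available --nat-gateway-ids ${NAT_GW_1} --region ${REGION}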
[CONFIGURE ROUTE TABLES] Create route table
$ aws ec2 create-route-table --vpc-id ${VPC_ID} --region ${REGION}
Create a default route to the internet gateway
$ aws ec2 create-route --route-table-id ${RTB_PUBLIC_1} --destination-cidr-block 0.0.0.0/0 --gateway-id ${INTERNET_GATEWAY_ID} --region ${REGION}
Associate public subnets with route table (3x)
$ aws ec2 associate-route-table --route-table-id ${RTB_PUBLIC_1} --subnet-id ${PUBLIC_SUBNET_1} --region ${REGION}
[ROUTE TABLE FOR PRIVATE ZONES] Create route table for each private zone (3x)
$ aws ec2 create-route-table --vpc-id ${VPC_ID} --region ${REGION}
Create route to NAT Gateway (3x)
$ aws ec2 create-route --route-table-id ${RTB_PRIVATE_1} --destination-cidr-block 0.0.0.0/0 --nat-gateway-id ${NAT_GW_1} --region ${REGION}
Associate subnets (3x)
$ aws ec2 associate-route-table --route-table-id ${RTB_PRIVATE_1} --subnet-id ${PRIVATE_SUBNET_1} --region ${REGION}
Other Configuration
Set up an S3 bucket as the Kops state store
$ aws s3api create-bucket --bucket my-state-store --region ${REGION} --create-bucket-configuration LocationConstraint=eu-central-1
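The kops documentation recommends enabling versioning on the state store bucket so cluster state can be recovered, and the commands below assume the store and cluster name are exported:
$ aws s3api put-bucket-versioning --bucket my-state-store --versioning-configuration Status=Enabled
$ export KOPS_STATE_STORE=s3://my-state-store
$ export NAME=k8s-cluster.mydomain.com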
Create cluster
$ kops create cluster --node-count 3 --zones ${AVAILABILITY_ZONE_1},${AVAILABILITY_ZONE_2},${AVAILABILITY_ZONE_3} --master-zones ${AVAILABILITY_ZONE_1},${AVAILABILITY_ZONE_2},${AVAILABILITY_ZONE_3} --state ${KOPS_STATE_STORE} --dns-zone=${DNS_ZONE_PRIVATE_ID} --dns private --node-size m5.large --master-size m5.large --topology private --networking weave --vpc=${VPC_ID} --bastion ${NAME}
Edit the cluster to configure the subnets
$ kops edit cluster ${NAME}
Note: update the subnets section of the cluster spec to reference the public/private subnets created above (see the sketch below)
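For reference, a sketch of what one zone's entries can look like after editing, assuming kops should adopt the existing subnets by id (the subnet and NAT gateway IDs are placeholders; the egress field for reusing an existing NAT gateway may require a newer kops version):
subnets:
- cidr: 10.0.48.0/20
  id: subnet-0aaaaaaaaaaaaaaaa
  egress: nat-0123456789abcdef0
  name: eu-central-1a
  type: Private
  zone: eu-central-1a
- cidr: 10.0.0.0/20
  id: subnet-0bbbbbbbbbbbbbbbb
  name: utility-eu-central-1a
  type: Utility
  zone: eu-central-1a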
$ kops update cluster ${NAME} --yes
Issue resolved. It was not a kops problem; the issue was with the AWS M5 instance type and the Linux image version.
The default kops Debian jessie images do not support NVMe for EBS volumes, which the AWS M5 instance types use. As a result, the masters fail to start because they cannot mount their EBS volumes.
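In case it helps others, a sketch of the fix: switch each instance group to a Debian stretch image with NVMe support (the image name and instance group name below are illustrative; current image names are listed in the kops docs), or alternatively stay on jessie with a non-NVMe instance type such as m4.large:
$ kops edit ig master-eu-central-1a --name ${NAME} --state ${KOPS_STATE_STORE}
# in the editor, replace the image line, for example:
#   image: kope.io/k8s-1.9-debian-stretch-amd64-hvm-ebs-2018-03-11
$ kops update cluster ${NAME} --yes --state ${KOPS_STATE_STORE}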