I'm trying to set up Flannel 0.5 on AWS so that pods can communicate with each other. I followed the guide at https://coreos.com/blog/introducing-flannel-0.5.0-with-aws-and-gce/
Created three Amazon EC2 instances with an IAM role whose policy is AmazonVPCFullAccess. AmazonVPCFullAccess includes all privileges. My VPC is "172.31.0.0/16".
Disabled the "Source/Destination Check" for the three instances.
Installed k8s and flannel and configured them.
Set the backend type to "aws-vpc" with the command:
etcdctl set /atomic.io/network/config '{"Network":"172.31.0.0/20","Backend": {"Type": "aws-vpc"}}'
Started the services and created a k8s rc.
Then I get these errors in /var/log/messages:
Sep 27 15:18:54 ip-172-31-0-242 flanneld: I0927 15:18:54.023391 02400 etcd.go:129] Found lease (172.31.1.0/24) for current IP (172.31.0.242), reusing
Sep 27 15:18:54 ip-172-31-0-242 flanneld: I0927 15:18:54.024120 02400 etcd.go:84] Subnet lease acquired: 172.31.1.0/24
Sep 27 15:18:54 ip-172-31-0-242 flanneld: I0927 15:18:54.389834 02400 awsvpc.go:106] Warning- disabling source destination check failed: You are not authorized to perform this operation. (UnauthorizedOperation)
Sep 27 15:18:54 ip-172-31-0-242 flanneld: I0927 15:18:54.389882 02400 awsvpc.go:116] RouteRouteTableID: rtb-9893a9f1
Sep 27 15:18:54 ip-172-31-0-242 flanneld: E0927 15:18:54.511297 02400 network.go:71] Failed to initialize network (type aws-vpc): unable to add route 172.31.1.0/24: cannot create a more specific route for 172.31.1.0/24 than local route 172.31.0.0/16 in route table rtb-9893a9f1 (InvalidParameterValue)
The log shows that the "aws-vpc" type configuration works. Flannel also finds the right AWS route table, "rtb-9893a9f1", which means the IAM role configuration works too.
But the log reports "UnauthorizedOperation" and "InvalidParameterValue", and I have no idea why.
BTW, without setting the IAM role and the aws-vpc type, after starting k8s, etcd and flannel, the routing table includes the flannel interface:
[root@ip-172-31-14-47 ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.31.0.1 0.0.0.0 UG 0 0 0 eth0
172.31.0.0 0.0.0.0 255.255.240.0 U 0 0 0 eth0
172.31.0.0 0.0.0.0 255.255.240.0 U 0 0 0 flannel0
172.31.14.0 0.0.0.0 255.255.255.0 U 0 0 0 docker0
Although the routing table had the flannel interface, flannel could not work. The reason is clear: AWS doesn't follow the internal route table of an instance.
But after setting the IAM role and the type, flannel can no longer add the interface. It fails with UnauthorizedOperation and InvalidParameterValue, even though I have made sure the IAM role has the AmazonVPCFullAccess policy.
Issue resolved.
First Issue
The warning below can be ignored. It is emitted when flannel's awsvpc.go calls "m.disableSrcDestCheck(instanceID, ec2c)", and it is only an info-level log:
log.Infof("Warning- disabling source destination check failed: %v", err)
The background is that flannel tries to make sure the "Source/Destination Check" for the instance is disabled. It would be much better if the log said something like "please go to the AWS console to disable the source/destination check".
Warning- disabling source destination check failed: You are not authorized to perform this operation. (UnauthorizedOperation)
Although the log above is not actually serious, it misled me: I thought I had misconfigured the policy and spent too much time on privileges.
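One way to avoid this trap is to read the severity letter at the start of each flanneld message rather than the word "Warning" in its text. The sketch below is my own illustration, not part of flannel. It assumes the glog-style prefix seen in the log above (severity letter, date, time, thread id, file:line]); the names GLOG_RE and glog_severity are hypothetical.

```python
import re

# Assumed glog-style prefix: severity letter, mmdd, time, thread id, file:line]
GLOG_RE = re.compile(
    r"\b([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ ([\w.]+:\d+)\]"
)

SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def glog_severity(line):
    """Return (severity, source location) for a glog-style line, or None."""
    match = GLOG_RE.search(line)
    if match is None:
        return None
    return SEVERITY[match.group(1)], match.group(4)


warning_line = (
    "Sep 27 15:18:54 ip-172-31-0-242 flanneld: I0927 15:18:54.389834 02400 "
    "awsvpc.go:106] Warning- disabling source destination check failed"
)
error_line = (
    "Sep 27 15:18:54 ip-172-31-0-242 flanneld: E0927 15:18:54.511297 02400 "
    "network.go:71] Failed to initialize network (type aws-vpc)"
)

print(glog_severity(warning_line))  # the "Warning" text is logged at INFO
print(glog_severity(error_line))    # the route failure is the real ERROR
```

Filtering on the "E" lines would have pointed straight at network.go:71 instead of the UnauthorizedOperation message.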
Second Issue
The "Request limit exceeded" error is caused by too many AWS API retries. Several conditions can exceed a limit, not only the limit of 50 entries per route table. After flannel had been stopped for a while, the AWS API served requests again. Without that pause, it kept rejecting them.
Third Issue
Finally, the root cause of flannel's failure is the route table conflict: the flannel network 172.31.0.0/20 lies inside the VPC range 172.31.0.0/16, so AWS refuses to add a more specific route for it. After assigning another subnet, the issue was resolved. The flannel log is right when it says "InvalidParameterValue". At first I was misled by the earlier "UnauthorizedOperation" and tried to change the privilege policy...
network.go:71] Failed to initialize network (type aws-vpc): unable to add route 172.31.1.0/24: cannot create a more specific route for 172.31.1.0/24 than local route 172.31.0.0/16 in route table rtb-9893a9f1 (InvalidParameterValue)
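The conflict can be checked before flanneld starts. The sketch below is not part of flannel. It uses Python's standard ipaddress module to test whether a candidate flannel network overlaps the VPC CIDR, and only then builds the JSON value for /atomic.io/network/config. The 10.20.0.0/16 network is purely an illustrative assumption (I am not stating here which subnet was finally used), and the function names are hypothetical.

```python
import ipaddress
import json

VPC_CIDR = "172.31.0.0/16"  # VPC range from the question


def overlaps_vpc(flannel_network, vpc_cidr=VPC_CIDR):
    """Return True if the flannel network shares any address with the VPC."""
    return ipaddress.ip_network(flannel_network).overlaps(
        ipaddress.ip_network(vpc_cidr)
    )


def flannel_config(flannel_network):
    """Build the JSON config value for the aws-vpc backend."""
    if overlaps_vpc(flannel_network):
        raise ValueError(
            "%s overlaps VPC %s; AWS rejects routes inside the local range"
            % (flannel_network, VPC_CIDR)
        )
    return json.dumps(
        {"Network": flannel_network, "Backend": {"Type": "aws-vpc"}}
    )


print(overlaps_vpc("172.31.0.0/20"))  # network from the question: True
print(overlaps_vpc("10.20.0.0/16"))   # hypothetical replacement: False
print(flannel_config("10.20.0.0/16"))
```

The printed JSON can then be passed to etcdctl set /atomic.io/network/config in place of the original value.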