I'm trying to create a service of type NodePort in my Kubernetes cluster, but it's not working as expected, and I suspect it has to do with the fact that I've disabled ELB permissions for the IAM role used by my master node. I wouldn't think ELB permissions should matter for NodePort, but I'm seeing an error message that suggests they do. Am I doing something wrong? Is this a known issue others have seen before?
I deployed a service of type NodePort to my cluster, expecting to be able to reach the service on any of the nodes' public IPs at the assigned port, but I can't. There is 1 master and 2 non-master nodes, and no process is even bound to port 30095 (the assigned NodePort) except on the master node. SSH'ing onto the master and curling that port in a variety of ways gets nothing (curl just hangs). Curling the endpoints associated with the service works fine. kubectl describe on the service suggests there was some error creating a load balancer, but I don't know why it would be trying to create one.
I'll reiterate that I specifically blocked the IAM role used by the master nodes from performing any ELB actions. I don't want developers using my Kubernetes cluster to be able to spin up ELBs in my account, or to do anything else that would create AWS resources in my account.
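For context, the restriction looked roughly like the following statement attached to the master role. This is a sketch, not my exact policy: the statement ID is invented, and an explicit Deny is just one way to do it (dropping the Allow for elasticloadbalancing:* altogether has the same effect).
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllELBActions",
      "Effect": "Deny",
      "Action": "elasticloadbalancing:*",
      "Resource": "*"
    }
  ]
}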
Information about the service (commands run from my local workstation) -- note the CreatingLoadBalancerFailed error in the output of kubectl describe service:
$ kubectl get services frontend -oyaml
apiVersion: v1
kind: Service
---SNIP---
ports:
- nodePort: 30095
  port: 80
  protocol: TCP
  targetPort: 80
selector:
  app: guestbook
  tier: frontend
sessionAffinity: None
type: NodePort
status:
  loadBalancer: {}
$ kubectl describe services frontend
Name:              frontend
Namespace:         default
Labels:            app=guestbook
                   tier=frontend
Selector:          app=guestbook,tier=frontend
Type:              NodePort
IP:                100.67.10.125
Port:              <unset> 80/TCP
NodePort:          <unset> 30095/TCP
Endpoints:         100.96.1.2:80,100.96.2.2:80,100.96.2.4:80
Session Affinity:  None
Events:
  FirstSeen  LastSeen  Count  From                   SubObjectPath  Type     Reason                      Message
  ---------  --------  -----  ----                   -------------  ----     ------                      -------
  1h         4m        15     {service-controller }                 Warning  CreatingLoadBalancerFailed  (events with common reason combined)
Looking for processes bound to the port on a non-master node:
$ netstat -tulpn | grep 30095
# no output
Looking for processes bound to the port on the master node:
$ netstat -tulpn | grep 30095
tcp6 0 0 :::30095 :::* LISTEN 1540/kube-proxy
Attempting to curl the service (it just hangs):
$ curl localhost:30095
# just hangs
^C
$ curl -g -6 http://[::1]:30095
# just hangs
^C
$ curl -vvvg -6 http://[::1]:30095
* Rebuilt URL to: http://[::1]:30095/
* Hostname was NOT found in DNS cache
* Trying ::1...
* Connected to ::1 (::1) port 30095 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.38.0
> Host: [::1]:30095
> Accept: */*
>
# just hangs after that
^C
$ curl 100.67.10.125:30095
# just hangs
^C
Curling an endpoint from the master node (this works, so the pods themselves are running fine):
$ curl 100.96.2.4
<html ng-app="redis">
<head>
---SNIP---
</body>
</html>
I expected curling the external IP of any of the nodes on the service's assigned NodePort of 30095 to give the same result as curling the endpoints.
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1+82450d0", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"not a git tree", BuildDate:"2016-12-14T04:09:31Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.6", GitCommit:"e569a27d02001e343cb68086bc06d47804f62af6", GitTreeState:"clean", BuildDate:"2016-11-12T05:16:27Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
GitHub issue: https://github.com/kubernetes/kubernetes/issues/39214
Mailing list post: https://groups.google.com/forum/#!topic/kubernetes-dev/JNC_bk1L3iI
Kubernetes does this because it assumes that a service that is now of type NodePort may previously have been of type LoadBalancer, so it may need to clean up a cloud load balancer. A PR was opened that would fix this issue, but then closed. In the meantime, switching the IAM policy for the master role to allow elasticloadbalancing:DescribeLoadBalancers instead of elasticloadbalancing:* solved the issue: the rest of the cluster, including NodePort services, works fine, and people are still prevented from creating ELBs.
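For example, the relevant statement in the master role's policy ends up looking roughly like this (a sketch; the statement ID is invented, and ELB describe calls are granted against a wildcard resource since they don't support resource-level restrictions):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowELBDescribeOnly",
      "Effect": "Allow",
      "Action": "elasticloadbalancing:DescribeLoadBalancers",
      "Resource": "*"
    }
  ]
}
With only the describe permission, the service controller can confirm that no load balancer exists for the service and finish reconciling it, while creating, modifying, or deleting ELBs stays off-limits.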