Kubernetes keeps removing Heapster & Grafana services due to already-used NodePort

7/13/2016

I am running a Kubernetes cluster on Ubuntu (trusty) locally via Docker.

Since I'm using Vagrant to create the Ubuntu VM, I had to modify the docker run command from the official Kubernetes guide a bit:

docker run -d \
    --volume=/:/rootfs:ro \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host \
    --privileged=true \
    --pid=host \
    gcr.io/google_containers/hyperkube:v1.3.0 \
    /hyperkube kubelet \
        --allow-privileged=true \
        --api-servers=http://localhost:8080 \
        --v=2 \
        --address=0.0.0.0 \
        --enable-server \
        --hostname-override=192.168.10.30 \
        --config=/etc/kubernetes/manifests-multi \
        --containerized \
        --cluster-dns=10.0.0.10 \
        --cluster-domain=cluster.local

Additionally, running a reverse proxy allows me to access my cluster's services via a browser from outside of the VM:

docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v1.3.0 \
/hyperkube proxy --master=http://127.0.0.1:8080 --v=2
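
To verify that the apiserver is reachable inside the VM before going any further, a plain curl is enough (the /version endpoint is served by the apiserver itself):

    # Should return a JSON blob whose gitVersion matches the hyperkube tag (v1.3.0)
    curl http://localhost:8080/version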

These steps work fine and eventually I'm able to access the Kubernetes UI in my browser.

vagrant@trusty-vm:~$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Now I'd like to run Heapster in that Kubernetes cluster with an InfluxDB backend and a Grafana UI, just as described in this guide. In order to do so, I've cloned the Heapster repo and configured grafana-service.yaml to expose Grafana on a port of every node by adding type: NodePort:

apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP. 
  type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    name: influxGrafana

Creating the services, replication controllers, etc.:

vagrant@trusty-vm:~/heapster$ kubectl create -f deploy/kube-config/influxdb/
You have exposed your service on an external port on all nodes in your
cluster.  If you want to expose this service to the external internet, you may
need to set up firewall rules for the service port(s) (tcp:30593) to serve traffic.

See http://releases.k8s.io/release-1.3/docs/user-guide/services-firewalls.md for more details.
service "monitoring-grafana" created
replicationcontroller "heapster" created
service "heapster" created
replicationcontroller "influxdb-grafana" created
service "monitoring-influxdb" created

vagrant@trusty-vm:~/heapster$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
Heapster is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
monitoring-grafana is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY     STATUS              RESTARTS   AGE
kube-system   heapster-y2yci                      1/1       Running             0          32m
kube-system   influxdb-grafana-6udas              2/2       Running             0          32m
kube-system   k8s-master-192.168.10.30            4/4       Running             0          58m
kube-system   k8s-proxy-192.168.10.30             1/1       Running             0          58m
kube-system   kube-addon-manager-192.168.10.30    2/2       Running             0          57m
kube-system   kube-dns-v17-y4cwh                  3/3       Running             0          58m
kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running             0          58m

vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
default       kubernetes             10.0.0.1     <none>        443/TCP             18m
kube-system   heapster               10.0.0.234   <none>        80/TCP              3s
kube-system   kube-dns               10.0.0.10    <none>        53/UDP,53/TCP       18m
kube-system   kubernetes-dashboard   10.0.0.58    <none>        80/TCP              18m
kube-system   monitoring-grafana     10.0.0.132   <nodes>       80/TCP              3s
kube-system   monitoring-influxdb    10.0.0.197   <none>        8083/TCP,8086/TCP   16m

As you can see, everything seems to run smoothly and I can also access Grafana's UI at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/ via a browser.

However, after about a minute, both the Heapster and Grafana entries disappear from kubectl cluster-info.

vagrant@trusty-vm:~/heapster$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

Browser output:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "endpoints \"monitoring-grafana\" not found",
  "reason": "NotFound",
  "details": {
    "name": "monitoring-grafana",
    "kind": "endpoints"
  },
  "code": 404
}
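
The corresponding endpoints object is indeed gone, which can be double-checked directly against the kube-system namespace:

    # Returns a NotFound error once the service and its endpoints have been removed
    kubectl get endpoints monitoring-grafana --namespace=kube-system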

Pods are still up & running ...

vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY     STATUS              RESTARTS   AGE
kube-system   heapster-y2yci                      1/1       Running             0          32m
kube-system   influxdb-grafana-6udas              2/2       Running             0          32m
kube-system   k8s-master-192.168.10.30            4/4       Running             0          58m
kube-system   k8s-proxy-192.168.10.30             1/1       Running             0          58m
kube-system   kube-addon-manager-192.168.10.30    2/2       Running             0          57m
kube-system   kube-dns-v17-y4cwh                  3/3       Running             0          58m
kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running             0          58m

... but the Heapster and Grafana services have disappeared:

vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
default       kubernetes             10.0.0.1     <none>        443/TCP             19m
kube-system   kube-dns               10.0.0.10    <none>        53/UDP,53/TCP       19m
kube-system   kubernetes-dashboard   10.0.0.58    <none>        80/TCP              19m
kube-system   monitoring-influxdb    10.0.0.197   <none>        8083/TCP,8086/TCP   17m

While checking the output of kubectl cluster-info dump, I discovered the following errors:

I0713 09:31:09.088567       1 proxier.go:427] Adding new service "kube-system/monitoring-grafana:" at 10.0.0.227:80/TCP
E0713 09:31:09.273385       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.395280       1 proxier.go:427] Adding new service "kube-system/heapster:" at 10.0.0.111:80/TCP
E0713 09:31:09.466306       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.480468       1 proxier.go:502] Setting endpoints for "kube-system/monitoring-grafana:" to [172.17.0.5:3000]
E0713 09:31:09.519698       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.532026       1 proxier.go:502] Setting endpoints for "kube-system/heapster:" to [172.17.0.4:8082]
E0713 09:31:09.558527       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:31:17.249001       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:22.252280       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:27.257895       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:31.126035       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:31:32.264430       1 server.go:294] Starting health server failed: E0709 09:32:01.153168       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" ($
E0713 09:31:37.265109       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:42.269035       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:47.270950       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:52.272354       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:31:57.273424       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
E0713 09:32:01.153168       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:32:02.276318       1 server.go:294] Starting health server failed: listen tcp 127.0.0.1:10249: bind: address already in use
I0713 09:32:06.105878       1 proxier.go:447] Removing service "kube-system/monitoring-grafana:"
I0713 09:32:07.175025       1 proxier.go:447] Removing service "kube-system/heapster:"
I0713 09:32:07.210270       1 proxier.go:517] Removing endpoints for "kube-system/monitoring-grafana:"
I0713 09:32:07.249824       1 proxier.go:517] Removing endpoints for "kube-system/heapster:"

Apparently, the Heapster and Grafana services and their endpoints are removed because their nodePort is already in use. But I didn't specify a designated nodePort in grafana-service.yaml, which means Kubernetes should pick one that is still free - so how can this be an error? And is there a way to fix it?
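
For illustration, a fixed port can also be requested explicitly in the service spec (30700 below is just an example value inside the default NodePort range of 30000-32767), although automatic allocation should already avoid collisions:

    spec:
      type: NodePort
      ports:
      - port: 80
        targetPort: 3000
        nodePort: 30700  # example value; must be free and within 30000-32767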


OS: Ubuntu 14.04.4 LTS (trusty) | Kubernetes: v1.3.0 | Docker: v1.11.2

-- IggyBlob
docker
grafana
heapster
influxdb
kubernetes

2 Answers

8/9/2016

Same issue in our environment: Kubernetes v1.3.4, Docker 1.12, Heapster built from the latest master branch.

-- Artak
Source: StackOverflow

10/4/2016

I ran into a very similar issue.

In the grafana-service.yaml file (and probably the heapster-service.yaml file) you have the label kubernetes.io/cluster-service: 'true'. This label means the service is managed by the addon-manager. When the addon-manager runs its periodic checks, it sees that no grafana/heapster service is defined in /etc/kubernetes/addons and therefore removes the service(s).

To work around this, you have two options:

  1. Change the label to kubernetes.io/cluster-service: 'false' (see the sketch after this list).
  2. Move the controller and service yaml files into /etc/kubernetes/addons (or wherever the addon-manager is configured to look for yaml files) on the master node.
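
A minimal sketch of option 1, assuming you are editing the grafana-service.yaml shown in the question (only the label value changes; the rest of the file stays as-is):

    metadata:
      labels:
        # Anything other than 'true' keeps the addon-manager from garbage-collecting the service
        kubernetes.io/cluster-service: 'false'
        kubernetes.io/name: monitoring-grafana
      name: monitoring-grafana
      namespace: kube-system

Then recreate the service (kubectl delete -f followed by kubectl create -f). For option 2, copying the manifests into the addons directory on the master would look roughly like sudo cp deploy/kube-config/influxdb/*.yaml /etc/kubernetes/addons/, depending on where your addon-manager is actually configured to read its yaml files from.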

Hope that helps

-- arobinson
Source: StackOverflow