I use Kubernetes on AWS with CoreOS and a flannel VXLAN network (I followed this guide: https://coreos.com/kubernetes/docs/latest/getting-started.html). The Kubernetes version is 1.4.6.

I have the following node-exporter DaemonSet:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  labels:
    app: node-exporter
    tier: monitor
    category: platform
spec:
  template:
    metadata:
      labels:
        app: node-exporter
        tier: monitor
        category: platform
      name: node-exporter
    spec:
      containers:
      - image: prom/node-exporter:0.12.0
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: scrape
      hostNetwork: true
      hostPID: true
When I run this, kube-controller-manager repeatedly logs the following error:
E1117 18:31:23.197206 1 endpoints_controller.go:513]
Endpoints "node-exporter" is invalid:
[subsets[0].addresses[0].nodeName: Forbidden: Cannot change NodeName for 172.17.64.5 to ip-172-17-64-5.ec2.internal,
subsets[0].addresses[1].nodeName: Forbidden: Cannot change NodeName for 172.17.64.6 to ip-172-17-64-6.ec2.internal,
subsets[0].addresses[2].nodeName: Forbidden: Cannot change NodeName for 172.17.80.5 to ip-172-17-80-5.ec2.internal,
subsets[0].addresses[3].nodeName: Forbidden: Cannot change NodeName for 172.17.80.6 to ip-172-17-80-6.ec2.internal,
subsets[0].addresses[4].nodeName: Forbidden: Cannot change NodeName for 172.17.96.6 to ip-172-17-96-6.ec2.internal]
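For context, the node-exporter Endpoints object this error refers to looks roughly like the sketch below (abridged; the addresses are taken from the error message above, the rest is my reconstruction). The nodeName field on each address is what the controller says it cannot change:

apiVersion: v1
kind: Endpoints
metadata:
  name: node-exporter
subsets:
- addresses:
  - ip: 172.17.64.5
    nodeName: 172.17.64.5        # controller now wants ip-172-17-64-5.ec2.internal here
  - ip: 172.17.64.6
    nodeName: 172.17.64.6
  # ...remaining node addresses omitted
  ports:
  - name: scrape
    port: 9100
    protocol: TCP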
Just for information: despite this error message, node_exporter is still accessible, e.g. at 172.17.96.6:9100. (My nodes, including the Kubernetes master, are in a private network.) However, these log lines are emitted so often that they drown out other entries in our log console. How can I resolve this error?
Because I built my Kubernetes cluster from scratch, the --cloud-provider=aws flag was not enabled at first; I turned it on only recently, and I'm not sure whether that is related to this issue.
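For reference, this is roughly where I enabled the flag, following the static-pod layout from the CoreOS guide (a sketch of my own setup; the file path, image tag, and surrounding flags may differ in other clusters):

# /etc/kubernetes/manifests/kube-controller-manager.yaml (abridged)
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: kube-controller-manager
    image: quay.io/coreos/hyperkube:v1.4.6_coreos.0
    command:
    - /hyperkube
    - controller-manager
    - --master=http://127.0.0.1:8080
    - --cloud-provider=aws    # the flag I recently turned on
    - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --root-ca-file=/etc/kubernetes/ssl/ca.pem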
It looks like this is caused by another of my manifest files:
apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  labels:
    app: node-exporter
    tier: monitor
    category: platform
  annotations:
    prometheus.io/scrape: 'true'
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
I thought this headless Service was necessary to expose the node-exporter DaemonSet above, but it seems to conflict with the endpoints controller when the pods run with hostNetwork: true. I'm not 100% certain, but after I deleted this Service the error disappeared, and I can still reach 172.17.96.6:9100 from outside the Kubernetes cluster.
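Since that Service mainly existed so Prometheus could find the exporters, one alternative would be to let Prometheus discover the nodes directly and skip the node-exporter Service/Endpoints object entirely. The snippet below is only a sketch (the job name and relabeling are my own, and the kubernetes_sd_configs syntax may differ between Prometheus versions); it assumes node-exporter is reachable on each node at hostPort 9100 as above:

# prometheus.yml (sketch): scrape node-exporter via node discovery
scrape_configs:
- job_name: node-exporter
  kubernetes_sd_configs:
  - role: node                     # one target per cluster node
  relabel_configs:
  - source_labels: [__address__]   # defaults to <node address>:10250 (the kubelet port)
    regex: '(.*):10250'
    replacement: '${1}:9100'       # rewrite to node-exporter's hostPort
    target_label: __address__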
I simply followed this post when setting up Prometheus and node-exporter: https://coreos.com/blog/prometheus-and-kubernetes-up-and-running.html. In case others run into the same problem, I'm leaving my notes here.