I've deployed my cluster on Debian based on https://github.com/CiscoCloud/kubernetes-ansible. Everything works fine except some add-ons, and I suspect name resolution.
I have an rc defined like this:
```
apiVersion: v1
kind: ReplicationController
metadata:
  name: monitoring-influx-grafana-v1
  namespace: kube-system
  labels:
    k8s-app: influxGrafana
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: influxGrafana
    version: v1
  template:
    metadata:
      labels:
        k8s-app: influxGrafana
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - image: gcr.io/google_containers/heapster_influxdb:v0.3
        name: influxdb
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
        ports:
        - containerPort: 8083
          hostPort: 8083
        - containerPort: 8086
          hostPort: 8086
        volumeMounts:
        - name: influxdb-persistent-storage
          mountPath: /data
      - image: gcr.io/google_containers/heapster_grafana:v0.7
        name: grafana
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
        env:
        - name: INFLUXDB_EXTERNAL_URL
          value: /api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:api/db/
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: INFLUXDB_PORT
          value: "8086"
      volumes:
      - name: influxdb-persistent-storage
        emptyDir: {}
```
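The "monitoring-influxdb" DNS name referenced below comes from a Service object. A minimal sketch of what that Service would look like, with the ports and labels taken from the Endpoints object shown further down (the selector here is an assumption based on the rc's labels):

```
apiVersion: v1
kind: Service
metadata:
  name: monitoring-influxdb
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: InfluxDB
spec:
  # Assumed selector, matching the labels on the rc's pod template above.
  selector:
    k8s-app: influxGrafana
  ports:
  - name: http
    port: 8083
    targetPort: 8083
  - name: api
    port: 8086
    targetPort: 8086
```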
So the container connects to a host named "monitoring-influxdb". This name is resolved correctly by the DNS when I use a busybox pod:
```
kubectl exec busybox --namespace=kube-system -- nslookup monitoring-influxdb
Server:    10.233.0.10
Address 1: 10.233.0.10

Name:      monitoring-influxdb
Address 1: 10.233.83.114
```
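As an extra check, looking up the fully qualified name directly also works (the cluster domain osm.local is visible in the ping output below):

```
kubectl exec busybox --namespace=kube-system -- nslookup monitoring-influxdb.kube-system.svc.osm.local
```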
But inside the monitoring-influx-grafana-v1 pod, the name is not resolved:
```
kubectl exec monitoring-influx-grafana-v1-obu8h --namespace=kube-system -- ping -c 1 monitoring-influxdb
PING monitoring-influxdb.kube-system.svc.osm.local (10.233.83.114) 56(84) bytes of data.

--- monitoring-influxdb.kube-system.svc.osm.local ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
```
Note that the "kubernetes" service is resolved:
```
kubectl exec monitoring-influx-grafana-v1-obu8h --namespace=kube-system -- ping -c 1 kubernetes
PING kubernetes.default.svc.osm.local (10.233.0.1) 56(84) bytes of data.
64 bytes from 10.233.0.1: icmp_seq=1 ttl=63 time=0.441 ms

--- kubernetes.default.svc.osm.local ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
```
The service's Endpoints object is defined as follows in etcd:
```
{'apiVersion': 'v1',
 'kind': 'Endpoints',
 'metadata': {'creationTimestamp': '2015-08-31T16:11:47Z',
              'labels': {'kubernetes.io/cluster-service': 'true',
                         'kubernetes.io/name': 'InfluxDB'},
              'name': 'monitoring-influxdb',
              'namespace': 'kube-system',
              'resourceVersion': '219',
              'selfLink': '/api/v1/namespaces/kube-system/endpoints/monitoring-influxdb',
              'uid': 'fab1bd0d-4ffa-11e5-bad3-52540011ed64'},
 'subsets': [{'addresses': [{'ip': '10.233.3.5',
                             'targetRef': {'kind': 'Pod',
                                           'name': 'monitoring-influx-grafana-v1-obu8h',
                                           'namespace': 'kube-system',
                                           'resourceVersion': '391',
                                           'uid': 'fa6fb734-4ffa-11e5-bad3-52540011ed64'}}],
              'ports': [{'name': 'http', 'port': 8083, 'protocol': 'TCP'},
                        {'name': 'api', 'port': 8086, 'protocol': 'TCP'}]}]}
```
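For reference, the same objects can be inspected through the API server instead of reading etcd directly; this is a quick sanity check that the service has a cluster IP and endpoints:

```
kubectl get svc monitoring-influxdb --namespace=kube-system
kubectl get endpoints monitoring-influxdb --namespace=kube-system
```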
I think this problem causes errors for other pods too:
```
kubectl logs kibana-logging-v1-ifztn --namespace=kube-system
...
{"@timestamp":"2015-08-31T16:13:29.082Z","level":"info","message":"Unable to connect to elasticsearch at http://elasticsearch-logging:9200. Retrying in 2.5 seconds.","node_env":"production"}
...
```

```
kubectl logs monitoring-heapster-v8-9cy06 --namespace=kube-system
...
E0831 16:12:27.367871 1 driver.go:326] Database creation failed: Post http://monitoring-influxdb:8086/db?u=root&p=root: dial tcp: lookup monitoring-influxdb: no such host. Retrying after 30 seconds
...
```
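A quick way to compare the DNS configuration of the working busybox pod with a failing pod is to dump /etc/resolv.conf from each (assuming the images ship a cat binary); the nameserver and search entries should be identical:

```
kubectl exec busybox --namespace=kube-system -- cat /etc/resolv.conf
kubectl exec monitoring-heapster-v8-9cy06 --namespace=kube-system -- cat /etc/resolv.conf
```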
Regards, Smana
Smana,

1) Contrary to your guess, it is not a DNS resolution problem. The output line `PING monitoring-influxdb.kube-system.svc.osm.local (10.233.83.114) 56(84) bytes of data.` shows that ping resolved the name and is pinging IP 10.233.83.114.

2) ICMP is not supported for cluster IPs. Currently, only TCP and UDP are supported. Hence ping fails.
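To verify connectivity, test the service over TCP instead of ICMP. A sketch, assuming wget is available in the container image (the /ping path is an assumption about this InfluxDB version; any response on port 8086, even an HTTP error, proves the TCP connection works):

```
# ICMP to a cluster IP is not forwarded by kube-proxy, so use TCP instead.
kubectl exec monitoring-influx-grafana-v1-obu8h --namespace=kube-system -- \
  wget -qO- http://monitoring-influxdb:8086/ping
```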