I'm running 10 nodes and about 160 pods in our cluster. The cluster had been up for a month when this error suddenly started happening frequently on requests between services in the cluster:
"response": {
"err": {
"code": "ENOTFOUND",
"errno": "ENOTFOUND",
"syscall": "getaddrinfo",
"hostname": "svc-api-mapper",
"host": "svc-api-mapper",
"port": "80"
}
},
"attempt": 0
The pods selected by the service are up and running, and there were no events indicating that any pods went down and came back up, but somehow the service name svc-api-mapper cannot be resolved.
$ kubectl describe se svc-api-mapper
Name: svc-api-mapper
Namespace: production
Labels: app=workflow-koa-api-mapper
Selector: app=workflow-koa-api-mapper
Type: LoadBalancer
IP: 10.3.0.33
Port: <unnamed> 80/TCP
NodePort: <unnamed> 30106/TCP
Endpoints: 10.2.8.126:80,10.2.80.248:80
Session Affinity: None
No events.
When I looked at the skydns logs, there were a lot of errors like this one, printed every second:
2015/11/22 05:40:00 skydns: can not forward, name too short (less than 2 labels): svc-rethinkdb-driver.
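To make sense of the "less than 2 labels" message, here is a rough sketch of how a resolv.conf-style resolver expands a short service name before querying DNS. The search domains and the ndots value below are my assumptions about a typical pod configuration (I have not confirmed them against my pods' /etc/resolv.conf), and `candidate_names` is a hypothetical helper, not part of skydns or Kubernetes.

```python
def candidate_names(name, search_domains, ndots=5):
    """Return the fully qualified names a resolv.conf-style resolver
    would try for `name`, in order.

    search_domains and ndots are assumed values for illustration; the
    real ones come from the pod's /etc/resolv.conf.
    """
    # A trailing dot means the name is already fully qualified.
    if name.endswith("."):
        return [name]

    # Each search domain is appended to the short name in turn.
    expanded = [f"{name}.{domain.rstrip('.')}." for domain in search_domains]
    bare = name + "."

    # Names with at least `ndots` dots are tried as-is first;
    # shorter names are tried as-is only after the search list.
    if name.count(".") >= ndots:
        return [bare] + expanded
    return expanded + [bare]


if __name__ == "__main__":
    # Assumed search list for a pod in the "production" namespace.
    search = ["production.svc.cluster.local", "svc.cluster.local", "cluster.local"]
    for candidate in candidate_names("svc-rethinkdb-driver", search):
        print(candidate)
```

If this is what happens, the last candidate is the bare single-label name `svc-rethinkdb-driver.`, which looks like the name in the skydns error. That would mean the lookups against the search domains had already failed by the time that message was logged.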
Note that the ENOTFOUND error doesn't happen on every request, just frequently. Could it be related to a performance issue?
I'm using Kubernetes 1.0.1 on a CoreOS 773.1.0 cluster.
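To put a number on "frequently", something like the following could be run from inside a pod. It is only a sketch: `failure_rate` is a hypothetical helper, and the attempt count and delay are arbitrary values I picked. It calls `getaddrinfo`, the same call that fails in the error above, and counts how often resolution fails.

```python
import socket
import time


def failure_rate(hostname, port=80, attempts=20, delay=0.5,
                 resolve=socket.getaddrinfo):
    """Resolve `hostname` repeatedly and return the fraction of failed lookups.

    `resolve` is injectable so the loop can be exercised without real DNS.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")

    failures = 0
    for _ in range(attempts):
        try:
            resolve(hostname, port)
        except socket.gaierror:
            # Resolution failure, the Python counterpart of ENOTFOUND.
            failures += 1
        time.sleep(delay)
    return failures / attempts


if __name__ == "__main__":
    rate = failure_rate("svc-api-mapper")
    print(f"failed lookups: {rate:.0%}")
```

Comparing the rate for the short name with the rate for a fully qualified name might show whether the failures depend on the search-domain expansion or affect every lookup.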