DNS does not resolve with NGINX in Kubernetes

11/18/2016

I have a Kubernetes cluster that I set up with kube-aws. I'm trying to run a custom NGINX configuration that uses DNS resolution in proxy_pass. Here is the NGINX block:

location /api/v1/lead {
  resolver 10.3.0.10 ipv6=off;
  set $container lead-api;
  proxy_pass http://$container:3000;
}

10.3.0.10 comes from the cluster IP of the DNS service found in Kubernetes. I've also tried 127.0.0.11 which is what we use in the docker-compose/docker environments.

$ kubectl describe --namespace=kube-system service kube-dns
Name:                   kube-dns
Namespace:              kube-system
Labels:                 k8s-app=kube-dns
                        kubernetes.io/cluster-service=true
                        kubernetes.io/name=KubeDNS
Selector:               k8s-app=kube-dns
Type:                   ClusterIP
IP:                     10.3.0.10
Port:                   dns     53/UDP
Endpoints:              10.2.26.61:53
Port:                   dns-tcp 53/TCP
Endpoints:              10.2.26.61:53
Session Affinity:       None

This configuration works well in three different environments that use docker-compose. However, I get the following error in the NGINX logs on the Kubernetes cluster:

[error] 9#9: *20 lead-api could not be resolved (2: Server failure), client: 10.2.26.0, server: , request: "GET /api/v1/lead/661DF757-722B-41BB-81BD-C7FD398BBC88 HTTP/1.1"

If I run nslookup within the NGINX pod, I can resolve the host with the same DNS server:

$ kubectl exec nginx-1855584872-kdiwh -- nslookup lead-api
Server:         10.3.0.10
Address:        10.3.0.10#53

Name:   lead-api.default.svc.cluster.local
Address: 10.3.0.167

I don't know if it matters, but notice that the "server" part of the error is empty. When I look at the pod logs for dnsmasq, I don't see anything relevant. If I change the NGINX block to hardcode the hostname in proxy_pass, as in the block that follows, it resolves fine. However, I have other configurations that require dynamic proxy names. I could hardcode every upstream this way, but I want to know how to make the DNS resolver work.

location /api/v1/lead {
  proxy_pass http://lead-api:3000;
}
-- blockloop
amazon-web-services
dns
kubernetes
nginx

2 Answers

11/19/2016

Resolving the name fails because you need to use the fully qualified domain name (FQDN). That is, you should use:

lead-api.<namespace>.svc.cluster.local

not just

lead-api

Using just the hostname usually works because, in Kubernetes, each pod's resolv.conf is configured with search domains, so lookups that go through the system resolver (such as nslookup) don't need a service's FQDN. For example:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.3.240.10
options ndots:5

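However, specifying the FQDN is necessary when you tell NGINX to use a custom resolver. With the resolver directive, NGINX sends the name to that DNS server exactly as written and does not apply the search domains from resolv.conf, so the bare name lead-api is never expanded to lead-api.<namespace>.svc.cluster.local.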
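
Applied to the block from the question, that might look like the sketch below. It assumes lead-api lives in the default namespace (which is what the nslookup output in the question shows) and that the cluster domain is cluster.local; adjust both if your cluster differs.

```nginx
location /api/v1/lead {
  # kube-dns cluster IP, taken from the question
  resolver 10.3.0.10 ipv6=off;
  # FQDN instead of the short name; "default" and "cluster.local" are assumptions
  set $container lead-api.default.svc.cluster.local;
  proxy_pass http://$container:3000;
}
```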

-- MrE
Source: StackOverflow

11/19/2016

You need to use a Service

http://kubernetes.io/docs/user-guide/services/

A Kubernetes Service proxies traffic to your Pods (i.e. what you call a 'service', which is your application).

I guess you use Kubernetes for the ability to deploy and scale your applications (Pods). Once you scale and have multiple Pods to talk to, traffic needs to be load balanced across them. This is what a Service does.

A Service has its own IP address. As long as the Service exists, an NGINX Pod referencing this Service as its upstream will work fine.

NGINX (free version) refuses to start when it can't resolve a hostname used as an upstream, but if the Service is defined, it has its own IP and the name gets resolved.

If the Pods behind the Service are not running, NGINX will not see that. It will still try to forward the traffic and will return a 502 (Bad Gateway).

So, just define the Service and then bring up your Pods with the proper label so the Service will pick them up. You can delete, scale, or replace those Pods without affecting the NGINX Pod. As long as there is at least one Pod running behind the Service, NGINX will always be able to connect to your API.

-- MrE
Source: StackOverflow