I have a few internal services which talk to one or more internal <service_name>.example.com hosts. How can I deploy a cluster where calls to <service_name>.example.com are routed to the actual in-cluster service?
Note: I might also need to create aliases such as <service_name>.internal.example.com -> <service_name>.example.com.
The idea is that a lot of components in the architecture make HTTP calls to the .example.com domain, and for the migration to work I want Kubernetes to take care of mapping the appropriate .example.com names to the services within the cluster, not the outside ones, without having to rename every .example.com reference to .svc.cluster.local.
These services shouldn't be exposed externally; only the ingress is exposed externally.
What would be the best way to achieve this?
From reading the Kubernetes docs, it looks like kube-dns handles all DNS resolution in your cluster. By default it just does what the node does (use public DNS resolution), so it would actually try to talk out to *.example.com.
So I think the way to do it is to override the behaviour of kube-dns. The Kubernetes docs have some information about it here: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
I'm not really good with DNS, but from that docs page the CoreDNS option looks interesting because it has a rewrite plugin, so it may be easy to rewrite calls to service.example.com into calls to service.svc.cluster.local.
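For illustration, a minimal Corefile along those lines might look like the sketch below. It is only a sketch: 10.10.10.10 is a placeholder for the ClusterIP of the cluster DNS service, and a full, working configuration is spelled out in the answer below.
example.com:53 {
    # rewrite <service_name>.example.com to <service_name>.svc.cluster.local
    rewrite name regex (.*).example.com {1}.svc.cluster.local
    # hand the rewritten name to the cluster DNS for final resolution
    proxy . 10.10.10.10
}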
This works. The assumption here is that a service <service_name>.example.com maps to <service_name>.svc.cluster.local. Usually a namespace will be involved, so the rewrite would look more like {1}.{1}.svc.cluster.local (where <service_name> is also the <namespace_name>), or the namespace can be hard coded as needed: {1}.<namespace_name>.svc.cluster.local.
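For example, with a hypothetical service named auth, the two variants behave like this:
# service and namespace share the same name
rewrite name regex (.*).example.com {1}.{1}.svc.cluster.local
#   auth.example.com -> auth.auth.svc.cluster.local

# namespace hard coded (here a hypothetical namespace "backend")
rewrite name regex (.*).example.com {1}.backend.svc.cluster.local
#   auth.example.com -> auth.backend.svc.cluster.local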
Keep in mind not to set the kubernetes.io/cluster-service: "true" label (it is commented out in the Service manifest below for that reason): if it is set to true, GKE keeps removing the service. I did not look into why this was happening.
The CoreDNS proxy plugin will not take a DNS name; it takes an IP, IP:PORT or a FILENAME (such as /etc/resolv.conf).
The proxy/upstream is needed because once DNS resolution is handed to CoreDNS and CoreDNS rewrites the name to a local cluster service, that local cluster service DNS entry still has to be resolved (see the effects on pods in the Kubernetes DNS documentation). The final resolution to an IP happens through the proxy, or perhaps even through an upstream server which points back to kube-dns.kube-system.svc.cluster.local.
apiVersion: v1
kind: ConfigMap
metadata:
  name: internal-dns
  namespace: kube-system
data:
  Corefile: |
    example.com:53 {
        log
        errors
        health
        prometheus :9153
        rewrite name regex (.*).example.com {1}.svc.cluster.local
        proxy . 10.10.10.10 ### ip of kube-dns.kube-system.svc.cluster.local
    }
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: internal-dns
  namespace: kube-system
  labels:
    k8s-app: internal-dns
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: internal-dns
  template:
    metadata:
      labels:
        k8s-app: internal-dns
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      containers:
      - name: coredns
        image: coredns/coredns:1.2.6
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: internal-dns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: internal-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: internal-dns
    #kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: internal-dns
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
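The 10.10.10.10 in the proxy line of the Corefile above is a placeholder; it must be replaced with the ClusterIP of kube-dns in your cluster, which can be looked up with something like:
kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}'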
As pointed out in the comments above by @patrick-w and @danny-l, a stub domain needs to be inserted into kube-dns, which then delegates calls to example.com to CoreDNS.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  # 10.20.20.20 is the ip of internal-dns.kube-system.svc.cluster.local.
  stubDomains: |
    {"example.com": ["10.20.20.20"]}
The stub domain is capable of taking a DNS name, and internal-dns.kube-system.svc.cluster.local (the name of the CoreDNS/internal-dns service) would have worked, but because of a bug in kube-dns (dnsmasq) the dnsmasq container fails to start and ends up in a CrashLoopBackOff.
dnsmasq error:
I1115 17:19:20.506269 1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1115 17:19:20.506570 1 sync.go:167] Updated stubDomains to map[example.com:[internal-dns.kube-system.svc.cluster.local]]
I1115 17:19:20.506734 1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053 --server /example.com/internal-dns.kube-system.svc.cluster.local]
I1115 17:19:20.507923 1 nanny.go:116]
I1115 17:19:20.507952 1 nanny.go:116] dnsmasq: bad command line options: bad address
I1115 17:19:20.507966 1 nanny.go:119]
W1115 17:19:20.507970 1 nanny.go:120] Got EOF from stderr
I1115 17:19:20.507978 1 nanny.go:119]
W1115 17:19:20.508079 1 nanny.go:120] Got EOF from stdout
F1115 17:19:20.508091 1 nanny.go:190] dnsmasq exited: exit status 1
dnsmasq starts successfully when using the IP in the stub domain:
I1115 17:24:18.499937 1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1115 17:24:18.500605 1 sync.go:167] Updated stubDomains to map[example.com:[10.20.20.20]]
I1115 17:24:18.500668 1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053 --server /example.com/10.20.20.20]
I1115 17:24:18.850687 1 nanny.go:119]
W1115 17:24:18.850726 1 nanny.go:120] Got EOF from stdout
I1115 17:24:18.850748 1 nanny.go:116] dnsmasq[15]: started, version 2.78 cachesize 1000
I1115 17:24:18.850765 1 nanny.go:116] dnsmasq[15]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I1115 17:24:18.850773 1 nanny.go:116] dnsmasq[15]: using nameserver 10.20.20.20#53 for domain example.com
I1115 17:24:18.850777 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1115 17:24:18.850780 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1115 17:24:18.850783 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1115 17:24:18.850788 1 nanny.go:116] dnsmasq[15]: reading /etc/resolv.conf
I1115 17:24:18.850791 1 nanny.go:116] dnsmasq[15]: using nameserver 10.20.20.20#53 for domain example.com
I1115 17:24:18.850796 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1115 17:24:18.850800 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1115 17:24:18.850803 1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1115 17:24:18.850850 1 nanny.go:116] dnsmasq[15]: read /etc/hosts - 7 addresses
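To verify the whole chain end to end, a quick test can be run from a throwaway pod inside the cluster (myservice is a hypothetical service name here):
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup myservice.example.com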