GKE custom internal domain for services

9/2/2018

I have a few internal services which talk to one or more internal <service_name>.example.com endpoints. How can I deploy a cluster where calls to <service_name>.example.com route to the actual service?

Note: I might need to create aliases such as <service_name>.internal.example.com ---> <service_name>.example.com

The idea is that a lot of the components in the architecture make HTTP calls to the .example.com domain. For the migration to work, I want Kubernetes to take care of mapping the appropriate .example.com names to the services within the cluster, and not the outside ones, without having to rename every .example.com reference to .svc.cluster.local.

These services shouldn't be exposed externally; only the ingress is exposed externally.

What would be the best way to achieve this?

-- aclokay
google-cloud-platform
google-kubernetes-engine
kubernetes

2 Answers

9/2/2018

So from reading some Kubernetes docs, it looks like kube-dns handles all DNS resolution in your cluster. By default it resolves the same way the node does (which is to use public DNS resolution), so it would actually try to talk out to *.example.com.

So I think the way to do it is to override the behaviour of kube-dns. The Kubernetes docs have some information about it here: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/

I'm not really good with DNS stuff, but from that docs page the CoreDNS option looks interesting because it has a rewrite config. So it may be easy to rewrite calls to service.example.com into calls to service.svc.cluster.local.
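For example, a minimal Corefile sketch of that idea might look like this (untested; it assumes the services live in the default namespace, and 10.10.10.10 is a placeholder for the kube-dns ClusterIP that actually resolves the rewritten name):

example.com:53 {
  rewrite name regex (.*).example.com {1}.default.svc.cluster.local
  proxy . 10.10.10.10   # placeholder: kube-dns ClusterIP
}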

-- Hitobat
Source: StackOverflow

11/15/2018

This works. The assumption here is that a service <service_name>.example.com maps to <service_name>.svc.cluster.local. Usually a namespace will be involved, so the rewrite would look more like {1}.{1}.svc.cluster.local (where <service_name> is also the <namespace_name>), or the namespace can be hard-coded as needed: {1}.<namespace_name>.svc.cluster.local. Both variants are sketched below.
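As a sketch, the two variants of the rewrite line would look like this (<namespace_name> is a placeholder for whatever namespace the service lives in):

# <service_name> doubles as the namespace name
rewrite name regex (.*).example.com {1}.{1}.svc.cluster.local

# namespace hard-coded
rewrite name regex (.*).example.com {1}.<namespace_name>.svc.cluster.local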

Keep in mind not to set the label kubernetes.io/cluster-service: "true" (it is commented out in the Service below); if it is set to true, GKE keeps removing the service. I did not look into why this happens.

The CoreDNS proxy plugin will not take a DNS name; it takes an IP, IP:PORT, or a FILENAME (such as /etc/resolv.conf).
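For instance, all of these are valid proxy destinations, while a hostname such as kube-dns.kube-system.svc.cluster.local is not:

proxy . 10.10.10.10
proxy . 10.10.10.10:53
proxy . /etc/resolv.conf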

The proxy/upstream is needed because once DNS resolution is handed to CoreDNS and CoreDNS rewrites the name to a local cluster service, that local cluster service name still has to be resolved (see "effects on pods" in the Kubernetes documentation). The final resolution to an IP happens via the proxy, or perhaps via an upstream server, pointing back to kube-dns.kube-system.svc.cluster.local.
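The ClusterIP to put on the proxy line (10.10.10.10 below is a placeholder) can be looked up with kubectl, for example:

kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}'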

apiVersion: v1
kind: ConfigMap
metadata:
  name: internal-dns
  namespace: kube-system
data:
  Corefile: |
    example.com:53 {
      log
      errors
      health
      prometheus :9153
      rewrite name regex (.*).example.com {1}.svc.cluster.local
      proxy . 10.10.10.10 ### ip of kube-dns.kube-system.svc.cluster.local
    }
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: internal-dns
  namespace: kube-system
  labels:
    k8s-app: internal-dns
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: internal-dns
  template:
    metadata:
      labels:
        k8s-app: internal-dns
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      containers:
      - name: coredns
        image: coredns/coredns:1.2.6
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
name: internal-dns # must match the ConfigMap name defined above
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: internal-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: internal-dns
    #kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: internal-dns
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP

As pointed out in the comments above by @patrick-w and @danny-l, a stubdomain needs to be inserted into kube-dns, which then delegates calls to example.com to CoreDNS.

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"example.com": ["10.20.20.20"]} ### ip of internal-dns.kube-system.svc.cluster.local.

The stubdomain is capable of taking a DNS name, so internal-dns.kube-system.svc.cluster.local should have worked, but because of a bug in kube-dns (dnsmasq) the dnsmasq container fails to start and ends up in a CrashLoopBackOff.

internal-dns.kube-system.svc.cluster.local is the name of the CoreDNS/internal-dns service.
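Since the stubdomain therefore needs the IP (10.20.20.20 above is a placeholder), it can be looked up the same way:

kubectl -n kube-system get svc internal-dns -o jsonpath='{.spec.clusterIP}'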

dnsmasq error:

I1115 17:19:20.506269       1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1115 17:19:20.506570       1 sync.go:167] Updated stubDomains to map[example.com:[internal-dns.kube-system.svc.cluster.local]]
I1115 17:19:20.506734       1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053 --server /example.com/internal-dns.kube-system.svc.cluster.local]
I1115 17:19:20.507923       1 nanny.go:116]
I1115 17:19:20.507952       1 nanny.go:116] dnsmasq: bad command line options: bad address
I1115 17:19:20.507966       1 nanny.go:119]
W1115 17:19:20.507970       1 nanny.go:120] Got EOF from stderr
I1115 17:19:20.507978       1 nanny.go:119]
W1115 17:19:20.508079       1 nanny.go:120] Got EOF from stdout
F1115 17:19:20.508091       1 nanny.go:190] dnsmasq exited: exit status 1

dnsmasq starts successfully when using the IP in the stubdomain:

I1115 17:24:18.499937       1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1115 17:24:18.500605       1 sync.go:167] Updated stubDomains to map[example.com:[10.20.20.20]]
I1115 17:24:18.500668       1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053 --server /example.com/10.20.20.20]
I1115 17:24:18.850687       1 nanny.go:119]
W1115 17:24:18.850726       1 nanny.go:120] Got EOF from stdout
I1115 17:24:18.850748       1 nanny.go:116] dnsmasq[15]: started, version 2.78 cachesize 1000
I1115 17:24:18.850765       1 nanny.go:116] dnsmasq[15]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I1115 17:24:18.850773       1 nanny.go:116] dnsmasq[15]: using nameserver 10.20.20.20#53 for domain example.com
I1115 17:24:18.850777       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1115 17:24:18.850780       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1115 17:24:18.850783       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1115 17:24:18.850788       1 nanny.go:116] dnsmasq[15]: reading /etc/resolv.conf
I1115 17:24:18.850791       1 nanny.go:116] dnsmasq[15]: using nameserver 10.20.20.20#53 for domain example.com
I1115 17:24:18.850796       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1115 17:24:18.850800       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1115 17:24:18.850803       1 nanny.go:116] dnsmasq[15]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1115 17:24:18.850850       1 nanny.go:116] dnsmasq[15]: read /etc/hosts - 7 addresses
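
Once the stubdomain and the internal-dns deployment are both in place, resolution can be spot-checked from a throwaway pod, for example:

kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup <service_name>.example.com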
-- krafts
Source: StackOverflow