How to avoid CoreDNS resolving overhead in Kubernetes

3/21/2020

I think the title is pretty much self-explanatory. I have run many experiments, and the sad truth is that CoreDNS adds about 20 ms of overhead to every request inside the cluster. At first we thought that adding more replicas and balancing the resolving requests across more instances might improve the response time, but it did not help at all (we scaled up from 2 pods to 4 pods).

There was some improvement in the fluctuation of resolving times after scaling up to 4 instances, but it wasn't what we were expecting, and the 20 ms overhead was still there.

We have some web services whose actual response time is < 30 ms, so with CoreDNS in the path we are doubling the response time, and that is not acceptable.

Before drawing a conclusion, we ran an experiment to double-check that this is not an OS-level overhead, and the results confirmed our expectation that the overhead does not come from the OS.

We thought we might implement/deploy a solution that puts a list of the needed hostname mappings for each pod into that pod's /etc/hosts (roughly the hostAliases sketch after my questions). So my final questions are as follows:

  • Has anyone else experienced something similar with coredns?
  • Can you please suggest alternative solutions to CoreDNS that work in a k8s environment?
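
For reference, the /etc/hosts idea could probably be done with Kubernetes' built-in hostAliases field, which appends entries to a pod's /etc/hosts. This is only a rough sketch with made-up names and IPs, not something we have tested:

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-app                 # hypothetical pod name
    spec:
      hostAliases:                      # entries are appended to the pod's /etc/hosts
      - ip: "10.96.45.10"               # example ClusterIP of a service we call a lot
        hostnames:
        - "orders.default.svc.cluster.local"   # hypothetical in-cluster hostname
      containers:
      - name: app
        image: example/app:latest       # placeholder image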

Any thoughts or insights are appreciated. Thanks in advance.

-- nima
coredns
dns
kubernetes

1 Answer

3/22/2020

There are several things to look at when running CoreDNS in your Kubernetes cluster:

  • Memory
  • AutoPath
  • Number of Replicas
  • Autoscaler
  • Other Plugins
  • Prometheus metrics
  • Separate Server blocks

Memory

The CoreDNS-recommended amount of memory for replicas is:

MB required (default settings) = (Pods + Services) / 1000 + 54

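As a rough worked example with made-up numbers, a cluster with 5000 Pods and 1000 Services would need about

MB required (default settings) = (5000 + 1000) / 1000 + 54 = 60 MB per replica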

Autopath

Autopath is a feature in CoreDNS that improves the response time for external queries by completing the Kubernetes search path on the server side instead of letting the client walk it query by query.

Normally a DNS query for an external name goes through

  1. <hostname>.<namespace>.svc.cluster.local
  2. <hostname>.svc.cluster.local
  3. <hostname>.cluster.local
  4. Then the configured forward, usually the host search path (/etc/resolv.conf)

For example, resolving example.com from the default namespace looks like:

Trying "example.com.default.svc.cluster.local"
Trying "example.com.svc.cluster.local"
Trying "example.com.cluster.local"
Trying "example.com"
Trying "example.com"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55265
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;example.com.           IN  A

;; ANSWER SECTION:
example.com.        30  IN  A   93.184.216.34
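
That expansion is driven by the search path and ndots setting in the pod's /etc/resolv.conf, which for a ClusterFirst pod in the default namespace typically looks something like this (exact IPs and domains vary per cluster):

    nameserver 10.96.0.10
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5

Because ndots is 5, any name with fewer than five dots (like example.com) is tried against every search domain before being sent upstream as-is.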

This requires more memory, so the calculation now becomes

MB required (w/ autopath) = (Pods + Services) / 250 + 56
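
A minimal Corefile sketch with autopath enabled might look like the following, assuming the default cluster.local domain; note that autopath requires the kubernetes plugin's pods verified mode, which is where the extra memory goes:

    .:53 {
        errors
        autopath @kubernetes                 # complete the search path server-side
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods verified                    # required by autopath to know the querying pod's namespace
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf           # upstream for everything else
        cache 30
        loop
        reload
        loadbalance
    }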

Number of replicas

Defaults to 2 but enabling the Autoscaler should help with load issues.

Autoscaler

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: coredns
  namespace: kube-system   # match the namespace where CoreDNS is deployed (kube-system by default)
spec:
  maxReplicas: 20
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coredns
  targetCPUUtilizationPercentage: 50
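
Assuming the manifest above is saved as coredns-hpa.yaml (the file name is arbitrary) and CoreDNS lives in kube-system, applying and watching it is just:

    kubectl apply -f coredns-hpa.yaml
    kubectl get hpa coredns -n kube-system --watch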

Node local cache

Beta in Kubernetes 1.15

NodeLocal DNSCache improves Cluster DNS performance by running a dns caching agent on cluster nodes as a DaemonSet. In today’s architecture, Pods in ClusterFirst DNS mode reach out to a kube-dns serviceIP for DNS queries. This is translated to a kube-dns/CoreDNS endpoint via iptables rules added by kube-proxy. With this new architecture, Pods will reach out to the dns caching agent running on the same node, thereby avoiding iptables DNAT rules and connection tracking. The local caching agent will query kube-dns service for cache misses of cluster hostnames (cluster.local suffix by default).

https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
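
Deployment roughly follows the linked page: grab nodelocaldns.yaml from the kubernetes repo and substitute the pillar variables. The sketch below assumes kube-proxy in iptables mode and the documented link-local address; the IPVS variant uses a slightly different substitution, so check the page above:

    # fetch the manifest (path as documented on the page above)
    wget https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml

    # values assumed for this sketch
    kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP})
    domain=cluster.local
    localdns=169.254.20.10

    # iptables-mode substitution
    sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml

    kubectl create -f nodelocaldns.yaml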

Other Plugins

These will also help you see what is going on inside CoreDNS:

errors - any errors encountered during query processing are printed to standard output.

trace - enables OpenTracing of how a request flows through CoreDNS.

log - query logging.

health - reports that CoreDNS is up and running; the endpoint returns a 200 OK HTTP status code.

ready - by enabling ready, an HTTP endpoint on port 8181 will return 200 OK when all plugins that are able to signal readiness have done so.
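
With errors and log enabled, the query log can be tailed straight from the CoreDNS pods (assuming the standard k8s-app=kube-dns label used by the stock deployment):

    kubectl logs -n kube-system -l k8s-app=kube-dns -f --tail=20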

The ready and health endpoints should be wired into the deployment as readiness and liveness probes:

        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP

Prometheus Metrics

Prometheus Plugin

coredns_health_request_duration_seconds{} - duration to process an HTTP query to the local /health endpoint. As this is a local operation, it should be fast. A (large) increase in this duration indicates the CoreDNS process is having trouble keeping up with its query load.
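
To eyeball the metrics without a full Prometheus setup, you can port-forward the metrics port and grep; a quick sketch assuming the stock deployment name and the default :9153 port exposed by the prometheus plugin:

    kubectl -n kube-system port-forward deployment/coredns 9153:9153 &
    curl -s http://localhost:9153/metrics | grep coredns_dns_request_duration_seconds   # overall query latency histogram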

https://github.com/coredns/deployment/blob/master/kubernetes/Scaling_CoreDNS.md

Separate Server blocks

One last bit of advice is to separate the cluster DNS server block from the external (upstream) server block. In the example below, CLUSTER_DOMAIN is typically cluster.local, REVERSE_CIDRS the reverse zones for your service and pod ranges (e.g. in-addr.arpa ip6.arpa), and UPSTREAMNAMESERVER your upstream resolver (e.g. /etc/resolv.conf):

    CLUSTER_DOMAIN REVERSE_CIDRS {
        errors
        health
        kubernetes
        ready
        prometheus :9153
        loop
        reload
        loadbalance
    }

    . {
        errors
        autopath @kubernetes
        forward . UPSTREAMNAMESERVER
        cache
        loop
    }

More information about the kubernetes plugin and other options here: https://github.com/coredns/coredns/blob/master/plugin/kubernetes/README.md

-- strongjz
Source: StackOverflow