I am new to Kubernetes and am trying to build a cluster with kubeadm. kubeadm init completed (command and arguments below), but the CoreDNS pods are stuck in the CrashLoopBackOff state. I checked the events of one of the pods and found the error below. I am not sure which service the probe is calling or why the connection is refused. Could you please help me fix this issue? These are the commands I executed.
kubeadm init:
kubeadm init --pod-network-cidr=10.244.0.0/16 \
--apiserver-advertise-address=192.168.56.2
CNI network plugin:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Pods in all namespaces:
root@kube-master:/etc# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS             RESTARTS   AGE
kube-system   coredns-66bff467f8-5z9f6              0/1     CrashLoopBackOff   32         143m
kube-system   coredns-66bff467f8-tsjdj              0/1     CrashLoopBackOff   32         143m
kube-system   etcd-kube-master                      1/1     Running            0          143m
kube-system   kube-apiserver-kube-master            1/1     Running            0          143m
kube-system   kube-controller-manager-kube-master   1/1     Running            0          143m
kube-system   kube-flannel-ds-amd64-2kklk           1/1     Running            0          123m
kube-system   kube-proxy-5p9jf                      1/1     Running            0          143m
kube-system   kube-scheduler-kube-master            1/1     Running            0          143m
coredns-event check:
root@kube-master:/etc# kubectl describe pod coredns-66bff467f8-tsjdj -n kube-system
Name: coredns-66bff467f8-tsjdj
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: kube-master/192.168.0.142
Start Time: Fri, 10 Jul 2020 15:54:45 -0400
Labels: k8s-app=kube-dns
pod-template-hash=66bff467f8
Annotations: <none>
Status: Running
IP: 10.244.0.3
IPs:
IP: 10.244.0.3
Controlled By: ReplicaSet/coredns-66bff467f8
Containers:
coredns:
Container ID: docker://20d8697b21f1bfe955de6769abfbb18c658f3c4134a2505f580b8146c998871d
Image: k8s.gcr.io/coredns:1.6.7
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:2c8d61c46f484d881db43b34d13ca47a269336e576c81cf007ca740fa9ec0800
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 10 Jul 2020 17:55:00 -0400
Finished: Fri, 10 Jul 2020 17:55:01 -0400
Ready: False
Restart Count: 32
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-2w9w6 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-2w9w6:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-2w9w6
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 13m (x523 over 123m) kubelet, kube-master Back-off restarting failed container
Warning Unhealthy 9m18s kubelet, kube-master Readiness probe failed: Get http://10.244.0.3:8181/ready: dial tcp 10.244.0.3:8181: connect: connection refused
Normal Pulled 7m42s (x4 over 9m23s) kubelet, kube-master Container image "k8s.gcr.io/coredns:1.6.7" already present on machine
Normal Created 7m42s (x4 over 9m22s) kubelet, kube-master Created container coredns
Normal Started 7m41s (x4 over 9m21s) kubelet, kube-master Started container coredns
Warning BackOff 4m14s (x26 over 9m17s) kubelet, kube-master Back-off restarting failed container
Could you please help me to fix this issue?
Thanks
Edit your coredns ConfigMap and add the ready plugin:
kubectl -n kube-system edit configmap coredns -o yaml
This opens the YAML definition for editing. Add the line ready :8181 as shown below (if it does not already exist):
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.as-gmbh.de in-addr.arpa ip6.arpa {
          pods verified
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        ready :8181 # <---- 🔴 Ignore other lines and Focus Here 🔴
        forward . 172.18.2.21
        cache 30
        loop
        reload
        loadbalance
    }
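Before editing, it may be worth checking whether the ready plugin is already enabled. Here is a minimal sketch: corefile_has_ready is a hypothetical helper name (not part of kubectl or CoreDNS) that reads a Corefile on stdin and succeeds if some line starts with the word ready. A bare ready line with no address also counts, since the plugin listens on port 8181 by default.

```shell
# Hypothetical helper: read a Corefile on stdin and succeed (exit 0) if a
# line has "ready" as its first word, i.e. the ready plugin is enabled.
corefile_has_ready() {
    awk '$1 == "ready" { found = 1 } END { exit !found }'
}

# Example use against a live cluster (assumes the ConfigMap is named coredns):
# kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' \
#     | corefile_has_ready && echo "ready plugin already configured"
```

If the check succeeds and the pods still crash, the readiness probe failure is probably a symptom rather than the cause, and the container logs are the next place to look.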
I hit this issue with both the older and the latest version. The cause was a forwarding loop: CoreDNS was forwarding DNS queries back to itself, and it exits as soon as its loop plugin detects that. See https://coredns.io/plugins/loop/#troubleshooting for more details.
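One way to confirm a loop is to read the logs of the crashed container rather than the pod events. This is only a sketch: loop_log_lines is a hypothetical helper name, and it assumes the loop plugin tags its messages with plugin/loop, as described on the troubleshooting page linked above.

```shell
# Hypothetical helper: read CoreDNS log text on stdin and print only the
# lines written by the loop plugin. Exit status is non-zero if there are none.
loop_log_lines() {
    grep 'plugin/loop'
}

# Example use (pod name taken from the question; --previous shows the logs
# of the last terminated container, which is the one that crashed):
# kubectl -n kube-system logs coredns-66bff467f8-tsjdj --previous | loop_log_lines
```

If this prints nothing, the crash likely has a different cause and the full log output is worth reading.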
I checked /etc/resolv.conf on the node and the nameserver was 127.0.1.1, a loopback address, so CoreDNS detected a loop when forwarding DNS queries. I changed my network configuration as described here:
https://askubuntu.com/questions/627899/nameserver-127-0-1-1-in-resolv-conf-wont-go-away
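The resolv.conf check can be sketched as a small shell function. loopback_nameservers is a hypothetical helper name; it prints any nameserver entries that point at a loopback address (127.x.x.x or ::1) in the given file, defaulting to /etc/resolv.conf.

```shell
# Hypothetical helper: print loopback nameserver addresses found in a
# resolv.conf-style file (default: /etc/resolv.conf). Prints nothing if
# every nameserver is a non-loopback address.
loopback_nameservers() {
    file="${1:-/etc/resolv.conf}"
    awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2 }' "$file"
}

# Example use on the node:
# loopback_nameservers /etc/resolv.conf
```

Any output here means that a CoreDNS instance forwarding to this file would send queries to a loopback address, which is the situation the loop plugin reports.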
Now it is working fine. I did not even need to edit the coredns ConfigMap to add port 8181 to ready.
root@kube-master:/etc# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-522nr              1/1     Running   0          11m
kube-system   coredns-66bff467f8-8qskt              1/1     Running   0          11m
kube-system   etcd-kube-master                      1/1     Running   0          11m
kube-system   kube-apiserver-kube-master            1/1     Running   0          11m
kube-system   kube-controller-manager-kube-master   1/1     Running   0          11m
kube-system   kube-flannel-ds-amd64-4jbpb           1/1     Running   0          10m
kube-system   kube-proxy-hz94b                      1/1     Running   0          11m
kube-system   kube-scheduler-kube-master            1/1     Running   0          11m
coredns log:
root@kube-master:/etc# kubectl logs coredns-66bff467f8-522nr -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
linux/amd64, go1.13.6, da7f65b
coredns pod event:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 118s (x12 over 3m27s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Normal Scheduled 107s default-scheduler Successfully assigned kube-system/coredns-66bff467f8-522nr to kube-master
Normal Pulled 98s kubelet, kube-master Container image "k8s.gcr.io/coredns:1.6.7" already present on machine
Normal Created 96s kubelet, kube-master Created container coredns
Normal Started 94s kubelet, kube-master Started container coredns
Thanks for all your help @Abdennour TOUMI and @HelloWorld