I created a cluster with kubeadm, and all the pods except for CoreDNS are up and running; CoreDNS is always in the CrashLoopBackOff state and never starts properly.
NAME                                            READY   STATUS             RESTARTS   AGE
coredns-5c98db65d4-qx4mq                        0/1     CrashLoopBackOff   3          81s
coredns-5c98db65d4-v5mg8                        0/1     CrashLoopBackOff   3          81s
etcd-localhost.localdomain                      1/1     Running            0          33s
kube-apiserver-localhost.localdomain            1/1     Running            0          22s
kube-controller-manager-localhost.localdomain   1/1     Running            0          40s
kube-flannel-ds-amd64-gltqj                     1/1     Running            0          73s
kube-proxy-x2crp                                1/1     Running            0          81s
kube-scheduler-localhost.localdomain            1/1     Running            0          15s
VM: 2 CPUs, 4 GB memory
cat /etc/os-release:
CentOS 7.6
uname -a:
Linux localhost.localdomain 3.10.0-957.el7.x86_64
1. Output of journalctl -f -u kubelet:
-- Logs begin at Mon 2019-08-05 14:29:46 CST. --
Aug 05 16:43:29 localhost.localdomain kubelet[23907]: E0805 16:43:29.325790 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:43:30 localhost.localdomain kubelet[23907]: E0805 16:43:30.337973 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:43:31 localhost.localdomain kubelet[23907]: E0805 16:43:31.826577 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:43:35 localhost.localdomain kubelet[23907]: E0805 16:43:35.781871 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
Aug 05 16:43:44 localhost.localdomain kubelet[23907]: E0805 16:43:44.689542 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:43:48 localhost.localdomain kubelet[23907]: E0805 16:43:48.690229 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
Aug 05 16:43:58 localhost.localdomain kubelet[23907]: E0805 16:43:58.689996 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:44:00 localhost.localdomain kubelet[23907]: E0805 16:44:00.690532 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
Aug 05 16:44:12 localhost.localdomain kubelet[23907]: E0805 16:44:12.689339 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
Aug 05 16:44:14 localhost.localdomain kubelet[23907]: E0805 16:44:14.690199 23907 pod_workers.go:190] Error syncing pod 34dc0078-481a-4d2d-b013-6c65a1ba8d5a ("coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-v5mg8_kube-system(34dc0078-481a-4d2d-b013-6c65a1ba8d5a)"
Aug 05 16:44:25 localhost.localdomain kubelet[23907]: E0805 16:44:25.689499 23907 pod_workers.go:190] Error syncing pod 209945cb-f289-450b-9c25-c0cdc3655940 ("coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=coredns pod=coredns-5c98db65d4-qx4mq_kube-system(209945cb-f289-450b-9c25-c0cdc3655940)"
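The kubelet log only records the back-off, not the underlying failure. The exit code of the last crashed container can be pulled straight from the pod status; a quick sketch, using one of the pod names above:
kubectl -n kube-system get pod coredns-5c98db65d4-qx4mq -o jsonpath='{.status.containerStatuses[0].lastState.terminated}'
The describe output in the next step shows the same information.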
2. Output of kubectl describe pod coredns-5c98db65d4-v5mg8 -n kube-system:
Name: coredns-5c98db65d4-v5mg8
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: localhost.localdomain/10.0.2.15
Start Time: Mon, 05 Aug 2019 16:42:01 +0800
Labels: k8s-app=kube-dns
pod-template-hash=5c98db65d4
Annotations: <none>
Status: Running
IP: 10.244.0.11
Controlled By: ReplicaSet/coredns-5c98db65d4
Containers:
coredns:
Container ID: docker://daf187222dfaa4d686dfd587e782369cb18c7de0c4de4850d8dd871b0dbe200c
Image: k8s.gcr.io/coredns:1.3.1
Image ID: docker://sha256:eb516548c180f8a6e0235034ccee2428027896af16a509786da13022fe95fe8c
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 139
Started: Mon, 05 Aug 2019 16:44:53 +0800
Finished: Mon, 05 Aug 2019 16:44:52 +0800
Ready: False
Restart Count: 5
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8080/health delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-hzkdx (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-hzkdx:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-hzkdx
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m7s (x3 over 5m27s) default-scheduler 0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
Normal Scheduled 5m5s default-scheduler Successfully assigned kube-system/coredns-5c98db65d4-v5mg8 to localhost.localdomain
Normal Pulled 3m42s (x5 over 5m4s) kubelet, localhost.localdomain Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
Normal Created 3m42s (x5 over 5m4s) kubelet, localhost.localdomain Created container coredns
Normal Started 3m41s (x5 over 5m3s) kubelet, localhost.localdomain Started container coredns
Warning BackOff 3m40s (x10 over 5m1s) kubelet, localhost.localdomain Back-off restarting failed container
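Note the Exit Code: 139 in the Last State above: 139 = 128 + 11, meaning the coredns process was killed by SIGSEGV. A segfault usually also leaves a trace in the kernel log, which can be checked on the node with, for example:
dmesg -T | grep -i segfault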
Environment: Kubernetes version (kubectl version):
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
3. Output of kubectl logs coredns-5c98db65d4-qx4mq -n kube-system:
(empty)
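Since the current container produced no output, the previous instance can be queried instead, or the container runtime can be asked directly (a sketch; the long ID is the container ID from the describe output above, which belongs to the v5mg8 pod):
kubectl logs coredns-5c98db65d4-qx4mq -n kube-system --previous
docker logs daf187222dfaa4d686dfd587e782369cb18c7de0c4de4850d8dd871b0dbe200c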
4. Output of docker version:
Client: Docker Engine - Community
Version: 19.03.1
API version: 1.39 (downgraded from 1.40)
Go version: go1.12.5
Git commit: 74b1e89
Built: Thu Jul 25 21:21:07 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 18.09.7
API version: 1.39 (minimum version 1.12)
Go version: go1.10.8
Git commit: 2d0083d
Built: Thu Jun 27 17:26:28 2019
OS/Arch: linux/amd64
Experimental: false
5. The CoreDNS Deployment YAML (via kubectl edit):
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{"deployment.kubernetes.io/revision":"1"},"creationTimestamp":"2019-08-05T10:53:11Z","generation":1,"labels":{"k8s-app":"kube-dns"},"name":"coredns","namespace":"kube-system","resourceVersion":"930","selfLink":"/apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns","uid":"7250d0fc-7827-4910-bf87-f8340cde9f09"},"spec":{"progressDeadlineSeconds":600,"replicas":2,"revisionHistoryLimit":10,"selector":{"matchLabels":{"k8s-app":"kube-dns"}},"strategy":{"rollingUpdate":{"maxSurge":"25%","maxUnavailable":1},"type":"RollingUpdate"},"template":{"metadata":{"creationTimestamp":null,"labels":{"k8s-app":"kube-dns"}},"spec":{"containers":[{"args":["-conf","/etc/coredns/Corefile"],"image":"k8s.gcr.io/coredns:1.3.1","imagePullPolicy":"IfNotPresent","livenessProbe":{"failureThreshold":5,"httpGet":{"path":"/health","port":8080,"scheme":"HTTP"},"initialDelaySeconds":60,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":5},"name":"coredns","ports":[{"containerPort":53,"name":"dns","protocol":"UDP"},{"containerPort":53,"name":"dns-tcp","protocol":"TCP"},{"containerPort":9153,"name":"metrics","protocol":"TCP"}],"readinessProbe":{"failureThreshold":3,"httpGet":{"path":"/health","port":8080,"scheme":"HTTP"},"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1},"resources":{"limits":{"memory":"170Mi"},"requests":{"cpu":"100m","memory":"70Mi"}},"securityContext":{"allowPrivilegeEscalation":true,"capabilities":{"add":["NET_BIND_SERVICE"],"drop":["all"]},"readOnlyRootFilesystem":true},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","volumeMounts":[{"mountPath":"/etc/coredns","name":"config-volume","readOnly":true}]}],"dnsPolicy":"Default","nodeSelector":{"beta.kubernetes.io/os":"linux"},"priorityClassName":"system-cluster-critical","restartPolicy":"Always","schedulerName":"default-scheduler","securityContext":{},"serviceAccount":"coredns","serviceAccountName":"coredns","terminationGracePeriodSeconds":30,"tolerations":[{"key":"CriticalAddonsOnly","operator":"Exists"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"}],"volumes":[{"configMap":{"defaultMode":420,"items":[{"key":"Corefile","path":"Corefile"}],"name":"coredns"},"name":"config-volume"}]}}},"status":{"conditions":[{"lastTransitionTime":"2019-08-05T10:53:26Z","lastUpdateTime":"2019-08-05T10:53:26Z","message":"Deployment does not have minimum availability.","reason":"MinimumReplicasUnavailable","status":"False","type":"Available"},{"lastTransitionTime":"2019-08-06T01:45:12Z","lastUpdateTime":"2019-08-06T01:45:12Z","message":"ReplicaSet \"coredns-5c98db65d4\" has timed out progressing.","reason":"ProgressDeadlineExceeded","status":"False","type":"Progressing"}],"observedGeneration":1,"replicas":2,"unavailableReplicas":2,"updatedReplicas":2}}
creationTimestamp: "2019-08-05T10:53:11Z"
generation: 2
labels:
k8s-app: kube-dns
name: coredns
namespace: kube-system
resourceVersion: "1334"
selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/coredns
uid: 7250d0fc-7827-4910-bf87-f8340cde9f09
spec:
progressDeadlineSeconds: 600
replicas: 2
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kube-dns
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 1
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
k8s-app: kube-dns
spec:
containers:
- args:
- -conf
- /etc/coredns/Corefile
image: k8s.gcr.io/coredns:1.3.1
imagePullPolicy: IfNotPresent
"/tmp/kubectl-edit-t5zl3.yaml" 135L, 6559C
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
securityContext:
allowPrivilegeEscalation: true
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/coredns
name: config-volume
readOnly: true
dnsPolicy: Default
nodeSelector:
beta.kubernetes.io/os: linux
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: coredns
serviceAccountName: coredns
terminationGracePeriodSeconds: 30
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- effect: NoSchedule
key: node-role.kubernetes.io/master
volumes:
- configMap:
defaultMode: 420
items:
- key: Corefile
path: Corefile
name: coredns
name: config-volume
status:
conditions:
- lastTransitionTime: "2019-08-05T10:53:26Z"
lastUpdateTime: "2019-08-05T10:53:26Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
- lastTransitionTime: "2019-08-06T01:50:20Z"
lastUpdateTime: "2019-08-06T01:50:20Z"
message: ReplicaSet "coredns-7688bbffb9" is progressing.
reason: ReplicaSetUpdated
status: "True"
type: Progressing
observedGeneration: 2
replicas: 3
unavailableReplicas: 3
updatedReplicas: 2
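The status above shows a rollout still in flight (replicas: 3 with 3 unavailable while ReplicaSet coredns-7688bbffb9 progresses). If it helps, the rollout can be watched until it settles or hits the progress deadline; a minimal sketch:
kubectl -n kube-system rollout status deployment/coredns
kubectl -n kube-system get rs -l k8s-app=kube-dns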
6. I looked at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/#pods-in-runcontainererror-crashloopbackoff-or-error-state, but it did not solve my problem.
7. I initialized the cluster with kubeadm init --config ./kubeadm.yml --ignore-preflight-errors=Swap. kubeadm.yml:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.2.15
nodeRegistration:
  taints:
  - effect: PreferNoSchedule
    key: node-role.kubernetes.io/master
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.15.0
networking:
  podSubnet: 10.244.0.0/16
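The podSubnet above matches flannel's default (10.244.0.0/16). As a sanity check, the pod CIDR actually assigned to the node can be read back; a quick sketch, using the node name from the outputs above:
kubectl get node localhost.localdomain -o jsonpath='{.spec.podCIDR}'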
A few things to verify (see the commands sketched below):
- Verify that the node is in the Ready state: kubectl get nodes -o wide.
- Verify your flannel network (by default it should be 10.244.0.0/16); it should have been set up during cluster initialization via --pod-network-cidr=10.244.0.0/16.
- If you installed a version of Docker older than 1.12.1, remove the MountFlags=slave option when booting dockerd with systemd, and restart Docker.
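A minimal sketch of those checks (the MountFlags query assumes Docker is managed by systemd, as it is on CentOS 7):
kubectl get nodes -o wide
systemctl show docker --property=MountFlags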
If those steps do not help, please provide more details about your cluster initialization and configuration so this issue can be recreated.
I have tried to reproduce this case using the settings below.
uname -a
Linux g-dvmpku-0 3.10.0-957.27.2.el7.x86_64
cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)
kubectl version:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
kubeadm version: &version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:15:32Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
kubelet version: Kubernetes v1.15.1
kubectl get pods --all-namespaces:
kube-system   coredns-5c98db65d4-grgcq             1/1   Running   0   5m22s
kube-system   coredns-5c98db65d4-gvk4w             1/1   Running   0   5m22s
kube-system   etcd-g-dvmpku-0                      1/1   Running   0   4m44s
kube-system   kube-apiserver-g-dvmpku-0            1/1   Running   0   4m32s
kube-system   kube-controller-manager-g-dvmpku-0   1/1   Running   0   4m36s
kube-system   kube-flannel-ds-amd64-zhb9v          1/1   Running   0   4m38s
kube-system   kube-proxy-6mdmr                     1/1   Running   0   5m22s
kube-system   kube-scheduler-g-dvmpku-0            1/1   Running   0   4m25s
Everything is working as expected.
I am wondering about the coredns image.
In your case:
Image ID: docker://sha256:eb516548c180f8a6e0235034ccee2428027896af16a509786da13022fe95fe8c
While in my case:
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:02382353821b12c21b062c59184e227e001079bb13ebd01f9d3270ba0fcbf1e4
If I am wrong, please elaborate with more details.
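To compare, the local image digest can be inspected and the tag re-pulled from the registry; a quick sketch, assuming Docker as the runtime, as in the outputs above:
docker images --digests | grep coredns
docker pull k8s.gcr.io/coredns:1.3.1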