Docker local repository not working after node reboot

4/8/2020

The local Docker repository server does not work properly after the k8s master node reboots.

The repository server still exists and has the same IP address it had when it was working, which is the address I added to the Docker daemon.json. The repository volume is also available.

I was able to push and pull images from the local repository again, and the pods started to run as expected, only after the following steps: running 'sudo docker-compose up', getting a successful curl response from a second terminal, stopping it with Ctrl+C, and then re-creating and re-tagging the images.
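The recovery steps above can be sketched as a small script. This is only a sketch under assumptions: the registry address and image name are taken from the pod description further down, `docker-compose up -d` (detached) stands in for the foreground run followed by Ctrl+C, the build context `.` is a guess at how the image is re-created, and `run` / `recover_registry` are hypothetical helper names.

```shell
#!/bin/sh
# Registry address and image name as they appear in the pod description.
REGISTRY="${REGISTRY:-10.107.241.131:5000}"
IMAGE="${IMAGE:-simpleapp}"

# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    printf '%s\n' "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Bring the registry up, confirm it answers, then rebuild, tag and push.
# Stops at the first step that fails.
recover_registry() {
    run sudo docker-compose up -d &&
    run curl -fsS "http://${REGISTRY}/v2/_catalog" &&
    run sudo docker build -t "${IMAGE}" . &&
    run sudo docker tag "${IMAGE}" "${REGISTRY}/${IMAGE}:latest" &&
    run sudo docker push "${REGISTRY}/${IMAGE}:latest"
}
```

Calling `DRY_RUN=1 recover_registry` only prints the commands, which makes it possible to review them before running anything against the node.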

Docker details
 Version: 19.03.6
 API version: 1.40
 Go version: go1.12.17
 Git commit: 369ce74a3c
 Built: Fri Feb 28 23:45:43 2020
 OS/Arch: linux/amd64
 Experimental: false

Server:
 Engine:
  Version: 19.03.6
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.17
  Git commit: 369ce74a3c
  Built: Wed Feb 19 01:06:16 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.3.3-0ubuntu1~18.04.1
  GitCommit:
 runc:
  Version:
  spec: 1.0.1-dev
  GitCommit:
 docker-init:
  Version: 0.18.0
  GitCommit:

To me this looks like a Docker daemon bug. If so, is there a proper, permanent fix for this problem?

~$ kubectl get pods,svc,pvc,pv,deployment
NAME READY STATUS RESTARTS AGE
pod/nginx-595f85746d-w8khf 1/1 Running 4 2d23h
pod/registry-cbc9b4779-nhf5f 1/1 Running 4 2d23h
pod/try1-7f766ff65-7x79x 0/1 ImagePullBackOff 0 19h
pod/try1-7f766ff65-fpzpx 0/1 ImagePullBackOff 0 19h
pod/try1-7f766ff65-hq7ql 0/1 ImagePullBackOff 0 19h
pod/try1-7f766ff65-j7nh2 0/1 ImagePullBackOff 0 19h
pod/try1-7f766ff65-vzcsk 0/1 ImagePullBackOff 0 19h
pod/try1-7f766ff65-x646d 0/1 ImagePullBackOff 0 19h


$ kubectl describe pod try1-7f766ff65-x646d
Name: try1-7f766ff65-x646d
Namespace: default
Priority: 0
Node: kw1/...
Start Time: Tue, 07 Apr 2020 01:46:05 +0000
Labels: app=try1
pod-template-hash=7f766ff65
Annotations: cni.projectcalico.org/podIP: 192.168.159.102/32
Status: Pending
IP: 192.168.159.102
IPs:
IP: 192.168.159.102
Controlled By: ReplicaSet/try1-7f766ff65
Containers:
simpleapp:
Container ID:
Image: 10.107.241.131:5000/simpleapp:latest
Image ID:
Port:
Host Port:
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Readiness: exec [cat /tmp/healthy] delay=0s timeout=1s period=5s #success=1 #failure=3
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-zqbsw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-zqbsw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-zqbsw
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25h default-scheduler Successfully assigned default/try1-7f766ff65-x646d to kw1
Normal Pulling 25h kubelet, kw1 Pulling image "10.107.241.131:5000/simpleapp:latest"
Normal Pulled 25h kubelet, kw1 Successfully pulled image "10.107.241.131:5000/simpleapp:latest"
Normal Created 25h kubelet, kw1 Created container simpleapp
Normal Started 25h kubelet, kw1 Started container simpleapp
Warning Unhealthy 25h (x120 over 25h) kubelet, kw1 Readiness probe failed: cat: /tmp/healthy: No such file or directory
Warning FailedCreatePodSandBox 4m48s kubelet, kw1 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "3c107122748a2d99f9f89e92d960d939c29392fbf94f28761c1cdce66240b844" network for pod "try1-7f766ff65-x646d": networkPlugin cni failed to set up pod "try1-7f766ff65-x646d_default" network: error getting ClusterInformation: Get https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default: dial tcp 10.96.0.1:443: i/o timeout
Normal SandboxChanged 4m47s kubelet, kw1 Pod sandbox changed, it will be killed and re-created.
Warning Failed 4m32s kubelet, kw1 Failed to pull image "10.107.241.131:5000/simpleapp:latest": rpc error: code = Unknown desc = Error response from daemon: Get http://10.107.241.131:5000/v2/: dial tcp 10.107.241.131:5000: connect: no route to host
Normal BackOff 3m15s (x5 over 4m31s) kubelet, kw1 Back-off pulling image "10.107.241.131:5000/simpleapp:latest"
Normal Pulling 3m2s (x4 over 4m45s) kubelet, kw1 Pulling image "10.107.241.131:5000/simpleapp:latest"
Warning Failed 3m2s (x4 over 4m32s) kubelet, kw1 Error: ErrImagePull
Warning Failed 3m2s (x3 over 4m19s) kubelet, kw1 Failed to pull image "10.107.241.131:5000/simpleapp:latest": rpc error: code = Unknown desc = Error response from daemon: manifest for 10.107.241.131:5000/simpleapp:latest not found: manifest unknown: manifest unknown
Warning Failed 2m47s (x6 over 4m31s) kubelet, kw1 Error: ImagePullBackOff
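
The last events show two different failures: first the registry could not be reached at all ("no route to host"), and later it answered but did not have the requested manifest ("manifest unknown"). One way to tell the two cases apart from the node is to query the registry's HTTP API directly. A minimal sketch, assuming the registry address and image name from the output above and plain HTTP; `tags_url` and `has_tag` are hypothetical helper names:

```shell
#!/bin/sh
# Registry address as it appears in the pod events.
REGISTRY="${REGISTRY:-10.107.241.131:5000}"

# Build the Registry HTTP API v2 URL that lists the tags of one repository.
tags_url() {
    printf '%s\n' "http://${REGISTRY}/v2/$1/tags/list"
}

# Exit status 0 only if the registry answers and the response mentions the tag.
# A connection error and a missing tag both give a non-zero status; curl's
# own error message on stderr shows which of the two happened.
has_tag() {
    curl -fsS --max-time 5 "$(tags_url "$1")" | grep -q "\"$2\""
}
```

For example, `has_tag simpleapp latest && echo present` reports whether the image the pods are waiting for is currently stored in the registry.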
-- Stefan
docker
docker-compose
kubernetes

1 Answer

4/8/2020

Check your Docker image; it looks like the node can't pull it. If you use minikube, just run

minikube start --insecure-registry=<URHOST>:5000 

or edit /etc/default/docker and add

 DOCKER_OPTS="--insecure-registry <URHOST>:5000"
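
On hosts where Docker is started by systemd, /etc/default/docker is generally not read, so the same setting is usually placed in /etc/docker/daemon.json under "insecure-registries" (the question says that file is already in use). A minimal sketch for generating and checking that entry, assuming the registry address from the question; `insecure_registry_json` and `registry_is_listed` are hypothetical helper names, and the generated body covers only this one setting:

```shell
#!/bin/sh
# Print a minimal daemon.json body that lists one insecure registry.
insecure_registry_json() {
    printf '{ "insecure-registries": ["%s"] }\n' "$1"
}

# Exit status 0 if the given file (default: /etc/docker/daemon.json)
# mentions the registry address.
registry_is_listed() {
    grep -qF "$1" "${2:-/etc/docker/daemon.json}"
}
```

For example, `registry_is_listed 10.107.241.131:5000 || insecure_registry_json 10.107.241.131:5000` prints the entry only when it is missing. After daemon.json changes, the daemon has to be restarted (for example with `sudo systemctl restart docker`) before the setting takes effect.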
-- GeeketteTN
Source: StackOverflow