I set up a Harbor registry, which has worked without problems for a couple of weeks. For each deployment and namespace I have a secret with the credentials from my ~/.docker/config.json file to get access to the registry. Since last weekend I have not been able to pull images from that registry anymore, and I didn't change anything! The cluster is running on GKE v1.12.5, by the way.
What works? I can pull and push images from my local machine with Docker.
What does not work? My Kubernetes cluster cannot pull images anymore and runs into a timeout.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 13m default-scheduler Successfully assigned k8s-test7/nginx-k8s-test7-6f7b8fdd79-2ffmp to gke-k8s-cloudops-test-default-pool-72fccd21-hrhk
Normal SandboxChanged 12m kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Pod sandbox changed, it will be killed and re-created.
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Failed to pull image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10": rpc error: code = Unknown desc = Error response from daemon: Get https://core.k8s-harbor-test.my-domain.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Failed 11m (x3 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ErrImagePull
Normal BackOff 11m (x7 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Back-off pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Normal Pulling 10m (x4 over 13m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk pulling image "core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10"
Warning Failed 3m2s (x38 over 12m) kubelet, gke-k8s-cloudops-test-default-pool-72fccd21-hrhk Error: ImagePullBackOff
deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-k8s-test7
  namespace: k8s-test7
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-k8s-test7
    spec:
      containers:
      - name: nginx-k8s-test7
        image: core.k8s-harbor-test.my-domain.com/nginx-test/nginx:1.15.10
        volumeMounts:
        - name: webcontent
          mountPath: /usr/share/nginx/html
        ports:
        - containerPort: 80
      volumes:
      - name: webcontent
        configMap:
          name: webcontent
      imagePullSecrets:
      - name: harborcred
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: webcontent
  namespace: k8s-test7
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 5Gi
The secret "harborcred" exists in every namespace so that the deployment can access it. The secret was created as described in the Kubernetes documentation:
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
kubectl create secret generic harborcred \
    --from-file=.dockerconfigjson=~/.docker/config.json \
    --type=kubernetes.io/dockerconfigjson \
    --namespace=k8s-test7
Any help would be appreciated!
Hi, at first glance, could you please try the following:
Change the image source to a public one, for example nginx, to verify that your deployment doesn't have other issues.
https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ also provides more details about inspecting the secret.
Please also run additional connectivity tests directly from your node, as described in the post How to debug "ImagePullBackOff"?
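For the connectivity test, here is a minimal sketch of what you could run on the node itself (for example after SSHing in). The helper name check_registry and the 10 second timeout are my own assumptions, not something taken from the linked post. It requests the same /v2/ endpoint that shows up in your kubelet error. A registry that speaks the Docker Registry HTTP API V2 normally answers that endpoint with 200 or 401, and either one would mean the network path from the node works.

```shell
# Probe the registry's /v2/ endpoint with a hard timeout and report the result.
# check_registry is a hypothetical helper name; the 10 second limit is an assumption.
check_registry() {
  host="$1"
  # -m 10 caps the whole request at 10 seconds; -w prints only the HTTP status code.
  if code=$(curl -sS -o /dev/null -m 10 -w '%{http_code}' "https://${host}/v2/"); then
    echo "reachable, HTTP ${code}"
  else
    # $? still holds curl's exit status here (28 means the request timed out).
    echo "not reachable (curl exit status $?)"
    return 1
  fi
}

# Usage on the node (hostname taken from the events in the question):
# check_registry core.k8s-harbor-test.my-domain.com
```

If this times out on the node but works from your workstation, the problem is most likely on the network path (firewall rules, DNS, load balancer) rather than in the secret.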
Additional steps to find the root cause:
1. Convert your secrets data:
kubectl get secret harborcred -n k8s-test7 \
    --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode
2. Decode the "auth" field from the output of step 1 and compare the result with your Docker credentials:
echo "your auth data" | base64 --decode
3. To find the root cause please use also:
kubectl get events -n k8s-test7 | grep pull
Please share the resulting logs.
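To make steps 1 and 2 concrete without touching the cluster, here is a small self-contained sketch. The registry name and the credentials are made up for illustration. It builds the kind of data a dockerconfigjson secret holds and then decodes it in the same two stages. The sed expression assumes the compact "auth":"..." form used in this sample; a real config.json may have extra whitespace.

```shell
# Hypothetical values for illustration only; substitute your own registry and login.
registry='registry.example.invalid'
user='demo-user'
password='demo-pass'

# Docker stores "user:password" base64-encoded in the "auth" field.
auth=$(printf '%s:%s' "$user" "$password" | base64)

# The secret stores the whole config JSON base64-encoded again under .dockerconfigjson.
config_json=$(printf '{"auths":{"%s":{"auth":"%s"}}}' "$registry" "$auth")
secret_data=$(printf '%s' "$config_json" | base64 | tr -d '\n')

# Step 1: decode the secret data back into the config JSON.
decoded_json=$(printf '%s' "$secret_data" | base64 --decode)

# Step 2: extract the "auth" value and decode it to "user:password".
decoded_auth=$(printf '%s' "$decoded_json" \
  | sed -n 's/.*"auth":"\([^"]*\)".*/\1/p' \
  | base64 --decode)

echo "$decoded_json"
echo "$decoded_auth"
```

With a real secret, the value from step 2 should match the user name and password you use for docker login against Harbor. If the decoded JSON has no "auth" entry for your registry host, the secret does not carry usable credentials.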