Kubernetes cannot pull from insecure registry and cannot run container from local image on offline cluster

9/15/2017

I am working on an offline cluster (the machines have no internet access), deploying Docker images with Ansible and Docker Compose scripts. My servers run CentOS 7, and I have set up an insecure Docker registry on the machines. We are going to change environment, and I am installing Kubernetes to manage my pool of containers.

I followed this guide to install Kubernetes: https://severalnines.com/blog/installing-kubernetes-cluster-minions-centos7-manage-pods-services

After the installation, I tried to launch a test pod with

    kubectl create -f nginx.yml

Here is the yml:

    apiVersion: v1
    kind: Pod
    metadata: 
      name: nginx
    spec:
      containers:
      - name: nginx
        image: [my_registry_addr]:[my_registry_port]/nginx:v1
        ports:
        - containerPort: 80

I used kubectl describe to get more information on what was wrong:

    Name:       nginx
    Namespace:  default
    Node:       [my node]
    Start Time: Fri, 15 Sep 2017 11:29:05 +0200
    Labels:     <none>
    Status:     Pending
    IP:     
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       
        Image:          [my_registry_addr]:[my_registry_port]/nginx:v1
        Image ID:           
        Port:           80/TCP
        State:          Waiting
          Reason:           ContainerCreating
        Ready:          False
        Restart Count:      0
        Volume Mounts:      <none>
        Environment Variables:  <none>
    Conditions:
      Type      Status
      Initialized   True 
      Ready     False 
      PodScheduled  True 
    No volumes.
    QoS Class:  BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen LastSeen    Count   From                SubObjectPath           Type        Reason      Message
      --------- --------    -----   ----                -------------   --------    ------      -------
      2m        2m      1   {default-scheduler }                Normal      Scheduled   Successfully assigned nginx to [my kubernet node]
      1m        1m      2   {kubelet [my kubernet node]}            Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_addr]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"

      54s   54s 1   {kubelet [my kubernet node]}        Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"[my_registry_addr]:[my_registry_port]\""

      8s    8s  1   {kubelet [my kubernet node]}        Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Network timed out while trying to connect to https://index.docker.io/v1/repositories/library/[my_registry_addr]/images. You may want to check your internet connection or if you are behind a proxy."

Then I went to my node and ran journalctl -xe:

    sept. 15 11:22:02 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:02.350930396+02:00" level=info msg="{Action=create, LoginUID=4294967295, PID=11555}"
    sept. 15 11:22:17 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:17.351536727+02:00" level=warning msg="Error getting v2 registry: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
    sept. 15 11:22:17 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:17.351606330+02:00" level=error msg="Attempting next endpoint for pull after error: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
    sept. 15 11:22:32 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:32.353946452+02:00" level=error msg="Not continuing with pull after error: Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"
    sept. 15 11:22:32 [my_node_ip] kubelet[11555]: E0915 11:22:32.354309   11555 docker_manager.go:2161] Failed to create pod infra container: ErrImagePull; Skipping pod "nginx_default(8b5c40e5-99f4-11e7-98db-f8bc12456ee4)": Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving
    sept. 15 11:22:32 [my_node_ip] kubelet[11555]: E0915 11:22:32.354390   11555 pod_workers.go:184] Error syncing pod 8b5c40e5-99f4-11e7-98db-f8bc12456ee4, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Error while pulling image: Get https://index.docker.io/v1/repositories/library/[my_registry_ip]/images: dial tcp: lookup index.docker.io on [kubernet_master_ip]:53: server misbehaving"
    sept. 15 11:22:44 [my_node_ip] dockerd-current[9861]: time="2017-09-15T11:22:44.350708175+02:00" level=error msg="Handler for GET /v1.24/images/[my_registry_ip]:[my_registry_port]/json returned error: No such image: [my_registry_ip]:[my_registry_port]"

I am sure that my Docker configuration is good, because I use it every day with Ansible or Mesos.

Docker version is 1.12.6; Kubernetes version is 1.5.2.

What can I do now? I didn't find any configuration key for this usage.

When I saw that pulling was failing, I manually pulled the image on all the nodes. I tagged the image explicitly so that Kubernetes would not try to pull it by default, and set "imagePullPolicy: IfNotPresent".
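For reference, the pull policy sits next to the image line in the container spec. A minimal sketch of the manifest with that setting added, keeping the same bracketed placeholders as above (they must be replaced with the real registry address and port before the file is usable):

```shell
# Sketch only: writes the test manifest with an explicit pull policy.
# [my_registry_addr] and [my_registry_port] are placeholders, not real values.
cat > nginx.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: [my_registry_addr]:[my_registry_port]/nginx:v1
    # Use the image already present on the node instead of pulling it again.
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 80
EOF

# Show the line that controls the pull behaviour.
grep 'imagePullPolicy' nginx.yml
```

With IfNotPresent, the kubelet only pulls when no image with that name and tag exists on the node. The file is then submitted with kubectl create -f nginx.yml, as before.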

-- valentin blondeau
centos7
docker
docker-registry
kubernetes

2 Answers

9/18/2017

The syntax for specifying the Docker image is:

    [docker_registry]/[image_name]:[image_tag]

In your manifest file, you have used ":" to separate the Docker registry host and the port the registry is listening on. The default port for a private Docker registry is, I believe, 5000. So change your image declaration from

    Image:          [my_registry_addr]:[my_registry_port]/nginx:v1

to

    Image:          [my_registry_addr]/nginx:v1

Also, check the network connectivity from the worker node to your Docker registry with a ping:

    ping [my_registry_addr]

If you also want to check whether port 443 is open on the registry, you can do a TCP check against that port on the host running the Docker registry:

    curl telnet://[my_registry_addr]:443

Hope that helps.

-- Swarup Donepudi
Source: StackOverflow

9/28/2017

I finally found what the problem was.

To work, Kubernetes needs a pause container, and Kubernetes was trying to fetch the pause image from the internet.

I pushed a custom pause image to my registry and configured Kubernetes to use that image as its pause container.

After that, Kubernetes worked like a charm.
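A rough sketch of what this can look like, not the exact commands used here. The registry address, port 5000 and the pause:v1 image name are hypothetical placeholders, and the script assumes a pause image has already been loaded onto the machine (for example with docker load). The kubelet flag --pod-infra-container-image is what selects the pause image; on the CentOS 7 packages it is usually set in /etc/kubernetes/kubelet, but check your own installation.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder, not taken from the real setup.
REGISTRY="my_registry_addr:5000"   # hypothetical registry host:port
PAUSE_IMAGE="pause:v1"             # hypothetical name:tag of the locally loaded pause image
DRY_RUN=1                          # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN is 0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Re-tag the local pause image for the private registry and push it there.
run docker tag "$PAUSE_IMAGE" "$REGISTRY/$PAUSE_IMAGE"
run docker push "$REGISTRY/$PAUSE_IMAGE"

# Flag to pass to the kubelet on every node so it stops looking on the internet.
KUBELET_FLAG="--pod-infra-container-image=$REGISTRY/$PAUSE_IMAGE"
echo "$KUBELET_FLAG"
```

After adding the printed flag to the kubelet arguments on each node, restart the kubelet (systemctl restart kubelet) and recreate the pod.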

-- valentin blondeau
Source: StackOverflow