rkt discovery failed error while bringing flannel up (Kubernetes setup)

2/16/2017

We are trying to setup Kubernetes cluster on 3 nodes with coreos following official step by step documentation - https://coreos.com/kubernetes/docs/latest/deploy-master.html Servers are behind company proxy, and have proxy service defined in both

/etc/systemd/system/docker.service.d 
/etc/systemd/system/flanneld.service.d

Following is picked in

systemctl cat flanneld
# /usr/lib/systemd/system/flanneld.service
[Unit]
Description=flannel - Network fabric for containers (System Application Container)
Documentation=https://github.com/coreos/flannel
After=etcd.service etcd2.service etcd-member.service
Before=docker.service flannel-docker-opts.service
Requires=flannel-docker-opts.service

[Service]
Type=notify
Restart=always
RestartSec=10s
LimitNOFILE=40000
LimitNPROC=1048576

Environment="FLANNEL_IMAGE_TAG=v0.6.2"
Environment="FLANNEL_OPTS=--ip-masq=true"
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid"
EnvironmentFile=-/run/flannel/options.env

ExecStartPre=/sbin/modprobe ip_tables
ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos /run/flannel
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/flannel-wrapper.uuid
ExecStart=/usr/lib/coreos/flannel-wrapper $FLANNEL_OPTS
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/flannel-wrapper.uuid

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
# /etc/systemd/system/flanneld.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"

and

systemctl cat docker
# /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=containerd.service docker.socket early-docker.target network.target
Requires=containerd.service docker.socket early-docker.target

[Service]
Type=notify
EnvironmentFile=-/run/flannel/flannel_docker_opts.env

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/lib/coreos/dockerd --host=fd:// --containerd=/var/run/docker/libcontainerd/docker-containerd.sock $DOCKER_OPTS $DOCKER_CGROUPS $DOCKER_OPT_BIP $DOCKER_OP
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/docker.service.d/40-flannel.conf
[Unit]
Requires=flanneld.service
After=flanneld.service
[Service]
EnvironmentFile=/etc/kubernetes/cni/docker_opts_cni.env

# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"

# /etc/systemd/system/flanneld.service.d/40-ExecStartPre-symlink.conf
[Service]
ExecStartPre=/usr/bin/ln -sf /etc/flannel/options.env /run/flannel/options.env
# /etc/systemd/system/flanneld.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.140.65.114:8080/"
Environment="HTTPS_PROXY=http://10.140.65.114:8080/"

after running systemctl daemon-reload and systemctl start flannel, getting following error

Feb 16 19:50:40 localhost systemd[1]: Starting flannel - Network fabric for containers (System Application Container)...
-- Subject: Unit flanneld.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has begun starting up.
Feb 16 19:50:40 localhost rkt[52933]: rm: cannot get pod: no matches found for "26778eb4-9d8a-4d3c-9bb7-6ffb13a55d6a"
Feb 16 19:50:40 localhost rkt[52933]: rm: failed to remove one or more pods
Feb 16 19:50:40 localhost flannel-wrapper[52947]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper.uuid --trust-keys-from-https --mount volume=notify,target=/run/systemd/notify --volume notify,kind=host,source=/run/systemd/notify --set-env=NOTIFY_SOCKET=/run/systemd/notify --net=host --volume run-flannel,kind=host,source=/run/flannel,readOnly=false --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=run-flannel,target=/run/flannel --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=usr-share-certs,target=/usr/share/ca-certificates --mount volume=etc-hosts,target=/etc/hosts --mount volume=etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.6.2 -- --ip-masq=true
Feb 16 19:50:41 localhost sudo[52978]:    admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:41 localhost sudo[52978]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:41 localhost sudo[52978]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:41 localhost sudo[52978]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:42 localhost flannel-wrapper[52947]: image: keys already exist for prefix "quay.io/coreos/flannel", not fetching again
Feb 16 19:50:43 localhost sudo[52990]:    admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:43 localhost sudo[52990]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:43 localhost sudo[52990]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:43 localhost sudo[52990]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:44 localhost flannel-wrapper[52947]: Downloading signature:  0 B/473 B
Feb 16 19:50:44 localhost flannel-wrapper[52947]: Downloading signature:  473 B/473 B
Feb 16 19:50:45 localhost flannel-wrapper[52947]: Downloading signature:  473 B/473 B
Feb 16 19:50:45 localhost flannel-wrapper[52947]: run: Get https://quay-registry.s3.amazonaws.com/sharedimages/36acf4f7-a5bd-470b-9a44-13cbd244b571/layer?Signature=v8rQghQZR0k%2B1UxDG8oGw89vTqY%3D&Expires=1487255465&AWSAccessKeyId=AKIAJWZWUIS24TWSMWRA: Blocked site:
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Main process exited, code=exited, status=254/n/a
Feb 16 19:50:45 localhost rkt[52993]: stop: cannot get pod: no matches found for "26778eb4-9d8a-4d3c-9bb7-6ffb13a55d6a"
Feb 16 19:50:45 localhost rkt[52993]: stop: failed to stop 1 pod(s)
Feb 16 19:50:45 localhost systemd[1]: Failed to start flannel - Network fabric for containers (System Application Container).
-- Subject: Unit flanneld.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flanneld.service has failed.
--
-- The result is failed.
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Unit entered failed state.
Feb 16 19:50:45 localhost systemd[1]: flanneld.service: Failed with result 'exit-code'.
Feb 16 19:50:45 localhost systemd[1]: Starting flannel docker export service - Network fabric for containers (System Application Container)...
-- Subject: Unit flannel-docker-opts.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flannel-docker-opts.service has begun starting up.
Feb 16 19:50:45 localhost sudo[53003]:    admin : TTY=pts/1 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/journalctl -e -u kubelet
Feb 16 19:50:45 localhost sudo[53003]: pam_unix(sudo:session): session opened for user root by admin(uid=0)
Feb 16 19:50:45 localhost sudo[53003]: pam_systemd(sudo:session): Cannot create session: Already running in a session
Feb 16 19:50:45 localhost sudo[53003]: pam_unix(sudo:session): session closed for user root
Feb 16 19:50:45 localhost rkt[53000]: rm: cannot get pod: UUID cannot be empty
Feb 16 19:50:45 localhost rkt[53000]: rm: failed to remove one or more pods
Feb 16 19:50:45 localhost flannel-wrapper[53019]: + exec /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/flannel-wrapper2.uuid --trust-keys-from-https --net=host --volume run-flannel,kind=host,source=/run/flannel,readOnly=false --volume etc-ssl-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume usr-share-certs,kind=host,source=/usr/share/ca-certificates,readOnly=true --volume etc-hosts,kind=host,source=/etc/hosts,readOnly=true --volume etc-resolv,kind=host,source=/etc/resolv.conf,readOnly=true --mount volume=run-flannel,target=/run/flannel --mount volume=etc-ssl-certs,target=/etc/ssl/certs --mount volume=usr-share-certs,target=/usr/share/ca-certificates --mount volume=etc-hosts,target=/etc/hosts --mount volume=etc-resolv,target=/etc/resolv.conf --inherit-env --stage1-from-dir=stage1-fly.aci quay.io/coreos/flannel:v0.6.2 --exec=/opt/bin/mk-docker-opts.sh -- -d /run/flannel/flannel_docker_opts.env -i
Feb 16 19:50:46 localhost flannel-wrapper[53019]: run: discovery failed
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Main process exited, code=exited, status=254/n/a
Feb 16 19:50:46 localhost systemd[1]: Failed to start flannel docker export service - Network fabric for containers (System Application Container).
-- Subject: Unit flannel-docker-opts.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit flannel-docker-opts.service has failed.
--
-- The result is failed.
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Unit entered failed state.
Feb 16 19:50:46 localhost systemd[1]: flannel-docker-opts.service: Failed with result 'exit-code'.

We tried different document https://www.upcloud.com/support/deploy-kubernetes-coreos/ following it, getting same type error while starting kubelet.

Seems to be problem with rkt and quay registry issue behind company proxy. Let us know if we missed something or configured something wrong.

-- Vikrant Kamble
coreos
flannel
kubelet
kubernetes
rkt

1 Answer

3/10/2017

can you please try to $ sudo rkt fetch quay.io/coreos/flannel:v0.6.2

first in the shell.

I believe the issue is due to either running https proxy over http, or rkt fetch running as an unprivileged user and not inheriting system environmental variables.

-- Aleks
Source: StackOverflow