weave-net pods in pending state, error in scheduler logs

10/13/2019

I am trying to set up a fresh Kubernetes cluster and am facing an issue using Weave as the networking solution. The Weave pods are stuck in the Pending state, and no events or logs are available from the kubectl command line.

I am setting up a Kubernetes cluster from scratch as part of an online course. The master nodes are up, with the API server, controller manager and scheduler running, and the worker nodes have the kubelet and kube-proxy running.

Node status:

vagrant@master-1:~$ kubectl get nodes -n kube-system

NAME       STATUS     ROLES    AGE   VERSION
worker-1   NotReady   <none>   25h   v1.13.0
worker-2   NotReady   <none>   9h    v1.13.0

As the next step, to enable networking I am using Weave. I have downloaded and extracted Weave on the worker nodes.

Now when I run the command below:

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

I see the DaemonSet getting created, but the pods remain in the Pending state.

vagrant@master-1:~$ kubectl get pods -n kube-system

NAME              READY   STATUS    RESTARTS   AGE
weave-net-ccrqs   0/2     Pending   0          73m
weave-net-vrm5f   0/2     Pending   0          73m

The command below does not return any ongoing events:

vagrant@master-1:~$ kubectl describe pods -n kube-system
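
For completeness, here are additional checks one could run to see whether the scheduler attempted anything at all (a diagnostic sketch, assuming the admin kubeconfig works; not part of the course material):

# List recent events in kube-system, oldest first; an empty list suggests
# the scheduler never tried to bind the weave-net pods
kubectl get events -n kube-system --sort-by=.metadata.creationTimestamp

# Inspect one of the pending pods directly
kubectl describe pod weave-net-ccrqs -n kube-system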

In the scheduler service logs, I can see the errors below:

Oct 13 16:46:51 master-2 kube-scheduler[14569]: E1013 16:46:51.973883   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:anonymous" cannot list resource "statefulsets" in API group "apps" at the cluster scope
Oct 13 16:46:51 master-2 kube-scheduler[14569]: E1013 16:46:51.982228   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.StorageClass: storageclasses.storage.k8s.io is forbidden: User "system:anonymous" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.338171   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:anonymous" cannot list resource "persistentvolumes" in API group "" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.745288   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Service: services is forbidden: User "system:anonymous" cannot list resource "services" in API group "" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.765103   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1beta1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:anonymous" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.781419   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:anonymous" cannot list resource "replicasets" in API group "apps" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.785872   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:anonymous" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.786117   14569 reflector.go:134] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:232: Failed to list *v1.Pod: pods is forbidden: User "system:anonymous" cannot list resource "pods" in API group "" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.786790   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Node: nodes is forbidden: User "system:anonymous" cannot list resource "nodes" in API group "" at the cluster scope
Oct 13 16:46:52 master-2 kube-scheduler[14569]: E1013 16:46:52.787016   14569 reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:anonymous" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
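
The common thread in these errors is the identity: every request reaches the API server as system:anonymous instead of system:kube-scheduler. One way to compare what the two users are allowed to do, using impersonation from a working admin kubeconfig (a diagnostic sketch, not one of the course steps):

# The scheduler's intended identity should be allowed to list pods...
kubectl auth can-i list pods --as=system:kube-scheduler --kubeconfig admin.kubeconfig
# ...while anonymous requests should not be
kubectl auth can-i list pods --as=system:anonymous --kubeconfig admin.kubeconfig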

Since I am quite new to Kubernetes, please excuse me if I have missed any relevant information; I will share it as soon as possible. Any help is appreciated.

Added kubeconfig for scheduler:

    {
      kubectl config set-cluster kubernetes-the-hard-way \
        --certificate-authority=ca.crt \
        --embed-certs=true \
        --server=https://127.0.0.1:6443 \
        --kubeconfig=kube-scheduler.kubeconfig

      kubectl config set-credentials system:kube-scheduler \
        --client-certificate=kube-scheduler.crt \
        --client-key=kube-scheduler.key \
        --embed-certs=true \
        --kubeconfig=kube-scheduler.kubeconfig

      kubectl config set-context default \
        --cluster=kubernetes-the-hard-way \
        --user=system:kube-scheduler \
        --kubeconfig=kube-scheduler.kubeconfig

      kubectl config use-context default --kubeconfig=kube-scheduler.kubeconfig
    }
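
As a sanity check on this kubeconfig (assuming the certificates were generated as in the course), the CN of the client certificate is the user name the API server will see, and it must be system:kube-scheduler for the built-in RBAC binding to apply:

# Should print a subject with CN = system:kube-scheduler
openssl x509 -in kube-scheduler.crt -noout -subject

# Confirm the cluster, user and context were embedded as expected
kubectl config view --kubeconfig=kube-scheduler.kubeconfig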

Added scheduler service definition:

cat <<EOF | sudo tee /etc/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --kubeconfig=/var/lib/kubernetes/kube-scheduler.kubeconfig \\
  --address=127.0.0.1 \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
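
Note that the unit file points at /var/lib/kubernetes/kube-scheduler.kubeconfig, while the kubeconfig above was generated in the working directory, so it has to be copied into place. A quick check, with paths taken from the unit file above:

# Copy the generated kubeconfig to where the unit expects it (if not already done)
sudo cp kube-scheduler.kubeconfig /var/lib/kubernetes/

# Confirm which API server endpoint the scheduler will actually talk to
sudo grep server: /var/lib/kubernetes/kube-scheduler.kubeconfig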

Started scheduler using:

sudo systemctl enable kube-scheduler
sudo systemctl start kube-scheduler
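
Since the unit file was just created, systemd needs a daemon-reload before it can see it; a typical start-and-verify sequence looks like this (sketch):

sudo systemctl daemon-reload
sudo systemctl enable kube-scheduler
sudo systemctl start kube-scheduler

# Check that the service is running and watch its logs for authorization errors
sudo systemctl status kube-scheduler --no-pager
sudo journalctl -u kube-scheduler -f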

Component status:

vagrant@master-1:~$ kubectl get componentstatuses --kubeconfig admin.kubeconfig
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
-- anmsal
kubernetes
weave

2 Answers

10/13/2019

The log messages from the scheduler suggest it is not configured to run under a system account: it is being treated as system:anonymous, which lacks the rights to see what is going on in the cluster and to make changes.

Presumably you are supposed to configure this in an earlier step.

I don't think it has anything to do with Weave Net - probably you would face the same difficulty trying to run anything.
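
For reference, a cluster with default RBAC already contains a binding that grants the scheduler's permissions to the user "system:kube-scheduler"; the forbidden errors above mean the scheduler is simply not presenting that identity. Something like this (run with a working admin kubeconfig) shows the binding it should be matching:

# The built-in binding ties the system:kube-scheduler ClusterRole to the
# user "system:kube-scheduler"
kubectl get clusterrolebinding system:kube-scheduler -o yaml --kubeconfig admin.kubeconfig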

-- Bryan
Source: StackOverflow

10/14/2019

I restarted the kube-scheduler and controller manager on both master nodes participating in HA, which I believe allowed the load balancer URL for the API server to take effect, and the errors observed earlier were eliminated.
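
For anyone following along, the restart is just the control-plane services being bounced on each master so they re-read their kubeconfigs; roughly, on master-1 and master-2 (assuming the controller manager unit is named kube-controller-manager):

sudo systemctl restart kube-scheduler kube-controller-manager
sudo systemctl status kube-scheduler kube-controller-manager --no-pager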

After this, I set up a worker node and installed Weave; the pod was deployed and the node became Ready.

vagrant@master-1:~$ kubectl get pods -n kube-system
NAME              READY   STATUS    RESTARTS   AGE
weave-net-zswht   1/2     Running   0          41s
vagrant@master-1:~$ kubectl get nodes
NAME       STATUS   ROLES    AGE     VERSION
worker-1   Ready    <none>   4m51s   v1.13.0
-- anmsal
Source: StackOverflow