Kubernetes Canal CNI error on masters

5/18/2018

I'm setting up a Kubernetes cluster at a customer's site.

I've done this process multiple times before, including dealing with Vagrant specifics, and I've consistently been able to get a K8s cluster up and running without too much fuss.

Now, at this customer I'm doing the same, but I've been running into a lot of issues during setup, which is completely unexpected. Compared to other places where I've set up Kubernetes, the only obvious difference is that there's a proxy server I constantly have to battle with. Nothing that a NO_PROXY env variable hasn't been able to handle.

The main issue I'm facing is setting up Canal (Calico + Flannel). For some reason, on masters 2 and 3 it just won't start.

NAMESPACE     NAME                                             READY     STATUS              RESTARTS   AGE   IP            NODE
kube-system   canal-2pvpr                                      2/3       CrashLoopBackOff    7          14m   10.136.3.37   devmn2.cpdprd.pt
kube-system   canal-rdmnl                                      2/3       CrashLoopBackOff    7          14m   10.136.3.38   devmn3.cpdprd.pt
kube-system   canal-swxrw                                      3/3       Running             0          14m   10.136.3.36   devmn1.cpdprd.pt
kube-system   kube-apiserver-devmn1.cpdprd.pt                  1/1       Running             1          1h    10.136.3.36   devmn1.cpdprd.pt
kube-system   kube-apiserver-devmn2.cpdprd.pt                  1/1       Running             1          4h    10.136.3.37   devmn2.cpdprd.pt
kube-system   kube-apiserver-devmn3.cpdprd.pt                  1/1       Running             1          1h    10.136.3.38   devmn3.cpdprd.pt
kube-system   kube-controller-manager-devmn1.cpdprd.pt         1/1       Running             0          15m   10.136.3.36   devmn1.cpdprd.pt
kube-system   kube-controller-manager-devmn2.cpdprd.pt         1/1       Running             0          15m   10.136.3.37   devmn2.cpdprd.pt
kube-system   kube-controller-manager-devmn3.cpdprd.pt         1/1       Running             0          15m   10.136.3.38   devmn3.cpdprd.pt
kube-system   kube-dns-86f4d74b45-vqdb4                        0/3       ContainerCreating   0          1h    <none>        devmn2.cpdprd.pt
kube-system   kube-proxy-4j7dp                                 1/1       Running             1          2h    10.136.3.38   devmn3.cpdprd.pt
kube-system   kube-proxy-l2wpm                                 1/1       Running             1          2h    10.136.3.36   devmn1.cpdprd.pt
kube-system   kube-proxy-scm9g                                 1/1       Running             1          2h    10.136.3.37   devmn2.cpdprd.pt
kube-system   kube-scheduler-devmn1.cpdprd.pt                  1/1       Running             1          1h    10.136.3.36   devmn1.cpdprd.pt
kube-system   kube-scheduler-devmn2.cpdprd.pt                  1/1       Running             1          4h    10.136.3.37   devmn2.cpdprd.pt
kube-system   kube-scheduler-devmn3.cpdprd.pt                  1/1       Running             1          1h    10.136.3.38   devmn3.cpdprd.pt

Digging for the specific error, I've found that the issue is with the kube-flannel container, which is failing with:

[exXXXXX@devmn1 ~]$ kubectl logs canal-rdmnl -n kube-system -c kube-flannel
I0518 16:01:22.555513       1 main.go:487] Using interface with name ens192 and address 10.136.3.38
I0518 16:01:22.556080       1 main.go:504] Defaulting external address to interface address (10.136.3.38)
I0518 16:01:22.565141       1 kube.go:130] Waiting 10m0s for node controller to sync
I0518 16:01:22.565167       1 kube.go:283] Starting kube subnet manager
I0518 16:01:23.565280       1 kube.go:137] Node controller sync successful
I0518 16:01:23.565311       1 main.go:234] Created subnet manager: Kubernetes Subnet Manager - devmn3.cpdprd.pt
I0518 16:01:23.565331       1 main.go:237] Installing signal handlers
I0518 16:01:23.565388       1 main.go:352] Found network config - Backend type: vxlan
I0518 16:01:23.565440       1 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
E0518 16:01:23.565619       1 main.go:279] Error registering network: failed to acquire lease: node "devmn3.cpdprd.pt" pod cidr not assigned
I0518 16:01:23.565671       1 main.go:332] Stopping shutdownHandler...

I just can't understand why.

Some relevant info:

  • My clusterCIDR and podCIDR are 192.168.151.0/25 (I know, it's weird; don't ask unless it's a huge issue)
  • I've set up etcd as a systemd service
  • I've modified kube-controller-manager.yaml to change the node mask size to 25 (otherwise the CIDR mentioned above wouldn't work).

I'm installing everything with kubeadm. One weird thing I did notice was that, when viewing the config (kubeadm config view), much of the information I had set in the kubeadm config.yaml (for kubeadm init) was not present, including the paths to the etcd certs. I'm not sure why that happened, but I've (hopefully) fixed it by editing the kubeadm config map (kubectl edit cm kubeadm-config -n kube-system) and saving it.

Still no luck with canal.

Can anyone help me figure out what's wrong? I've documented pretty much every step of the configuration, so if required I can provide it.

EDIT:

I've figured out in the meantime that masters 2 and 3 indeed do not have a podCIDR assigned. Why would this happen? And how can I add one?
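The check above can be reproduced, and worked around by hand, roughly like this (a sketch; the node name and CIDR values are the ones from this question and are only illustrative — podCIDR is immutable once set, so this only works on nodes where it is still empty):

```shell
# Show which nodes actually have a podCIDR assigned
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR

# Stopgap: patch a podCIDR onto a node that has none.
# The range must come out of the cluster CIDR and must not overlap
# a range already allocated to another node.
kubectl patch node devmn2.cpdprd.pt -p '{"spec":{"podCIDR":"192.168.151.0/25"}}'
```

The cleaner fix is to let the controller-manager allocate the CIDRs itself, as the answer below describes.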

-- Zed_Blade
flannel
kubernetes
project-calico

1 Answer

5/21/2018

Try editing /etc/kubernetes/manifests/kube-controller-manager.yaml and adding:

--allocate-node-cidrs=true  
--cluster-cidr=192.168.151.0/25

then reload the kubelet.
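Concretely, the command section of that static-pod manifest would end up looking something like this (a sketch; the surrounding structure is the stock kubeadm layout, and only the two added flags are the point):

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --cluster-cidr=192.168.151.0/25
    # ...keep the existing flags as they were...
```

Since the kubelet watches the static-pod manifest directory, saving the file is usually enough for the pod to be recreated; `systemctl restart kubelet` forces it.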

I found this information here and it was useful for me.

-- Nick Rak
Source: StackOverflow