Kubernetes upgrade failed to keep Pod Subnet

7/5/2018

I have upgraded my home bare-metal cluster from 1.10.2 to 1.11.0 using kubeadm. The upgrade initially failed due to a cgroup driver mismatch (the kubelet needed systemd but cgroupfs was configured instead :( ).

My cluster now reports that it is on 1.11.0. Pods are running and can communicate with each other on the same node; however, pods are not able to communicate with pods on other nodes.

I have flannel as my CNI, but somehow pods are being assigned addresses from 172.17.0.0/16 (the docker0 bridge) when they should come from 10.244.0.0/16 (the flannel pod subnet). Networking, ingress, and services all work as long as everything runs on a single host, but as soon as a second host is involved, traffic between pods is not routed.
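
A quick way to confirm which bridge pods are actually landing on (standard kubectl/iproute2 commands, nothing assumed beyond shell access to the nodes):

# pod IPs per node: 172.17.x.x means docker0, 10.244.x.x means flannel/cni0
kubectl get pods --all-namespaces -o wide

# the podCIDR the control plane allocated to each node (should be 10.244.N.0/24 slices)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'

# on each node: routes for the pod network should point at flannel.1/cni0, not docker0
ip route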

Q: How do I modify my CNI setup so that pods on my single-NIC nodes use the flannel interface instead of the docker interface?

kubeadm reporting that I am on the latest version

[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I0704 23:08:54.560787    4588 feature_gate.go:230] feature gates: &{map[]}
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.11.0
[upgrade/versions] kubeadm version: v1.11.0
[upgrade/versions] Latest stable version: v1.11.0
[upgrade/versions] Latest version in the v1.11 series: v1.11.0

Awesome, you're up-to-date! Enjoy!
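
(The output above is the tail of kubeadm upgrade plan. For reference, the standard kubeadm workflow for this jump looks roughly like the sketch below; package names assume CentOS/yum, adjust for your distro.)

# on the master
yum install -y kubeadm-1.11.0 --disableexcludes=kubernetes
kubeadm upgrade plan
kubeadm upgrade apply v1.11.0

# then on each node
yum install -y kubelet-1.11.0 kubectl-1.11.0 --disableexcludes=kubernetes
systemctl daemon-reload && systemctl restart kubelet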

master interfaces (ip a)

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s20u3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 68:1d:ef:06:2c:3a brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.103/24 brd 192.168.0.255 scope global noprefixroute enp0s20u3
       valid_lft forever preferred_lft forever
    inet6 fe80::6a1d:efff:fe06:2c3a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:3a:a6:ec:db brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 76:35:47:c7:1f:38 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::7435:47ff:fec7:1f38/64 scope link 
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:58:0a:f4:00:01 brd ff:ff:ff:ff:ff:ff
    inet 10.244.0.1/24 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::8820:b3ff:fed0:bff9/64 scope link 
       valid_lft forever preferred_lft forever
11: veth6494e2fc@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default 
    link/ether fa:84:37:5f:ac:ac brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::f884:37ff:fe5f:acac/64 scope link 
       valid_lft forever preferred_lft forever

node 1 interfaces (ip a)

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s20u3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 68:1d:ef:06:2f:d3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.104/24 brd 192.168.0.255 scope global noprefixroute enp0s20u3
       valid_lft forever preferred_lft forever
    inet6 fe80::6a1d:efff:fe06:2fd3/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:6c:7d:0e:a5 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:6cff:fe7d:ea5/64 scope link 
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 9a:88:74:72:0a:45 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::9888:74ff:fe72:a45/64 scope link 
       valid_lft forever preferred_lft forever
5: cni0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 0a:58:0a:f4:01:01 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.1/24 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::50c0:eaff:fede:e09e/64 scope link 
       valid_lft forever preferred_lft forever
34: veth69dcb3f@if33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether be:eb:08:71:dc:8a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::bceb:8ff:fe71:dc8a/64 scope link 
       valid_lft forever preferred_lft forever
36: veth89519e8@if35: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 8e:c8:77:5c:b3:85 brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::8cc8:77ff:fe5c:b385/64 scope link 
       valid_lft forever preferred_lft forever
44: veth7e0c05a@if43: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 4a:36:e1:60:78:cc brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::4836:e1ff:fe60:78cc/64 scope link 
       valid_lft forever preferred_lft forever
46: veth944bd64@if45: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether f6:36:31:d7:df:11 brd ff:ff:ff:ff:ff:ff link-netnsid 6
    inet6 fe80::f436:31ff:fed7:df11/64 scope link 
       valid_lft forever preferred_lft forever
48: vethe018c5f@if47: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether ca:6b:b1:7d:15:63 brd ff:ff:ff:ff:ff:ff link-netnsid 7
    inet6 fe80::c86b:b1ff:fe7d:1563/64 scope link 
       valid_lft forever preferred_lft forever
50: vethbd59e85@if49: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 3e:71:2e:ae:97:02 brd ff:ff:ff:ff:ff:ff link-netnsid 8
    inet6 fe80::3c71:2eff:feae:9702/64 scope link 
       valid_lft forever preferred_lft forever
52: veth6ba9feb@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether e2:57:42:a0:ec:3b brd ff:ff:ff:ff:ff:ff link-netnsid 9
    inet6 fe80::e057:42ff:fea0:ec3b/64 scope link 
       valid_lft forever preferred_lft forever
58: veth33e51b9@if57: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 86:c6:09:5e:d0:b9 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::84c6:9ff:fe5e:d0b9/64 scope link 
       valid_lft forever preferred_lft forever

node 2 interfaces (ip a)

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s20u3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 68:1d:ef:06:2d:9b brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.105/24 brd 192.168.0.255 scope global noprefixroute enp0s20u3
       valid_lft forever preferred_lft forever
    inet6 fe80::6a1d:efff:fe06:2d9b/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:2e:c6:f8:ae brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:2eff:fec6:f8ae/64 scope link 
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 96:b0:30:d1:9f:ca brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::94b0:30ff:fed1:9fca/64 scope link 
       valid_lft forever preferred_lft forever
14: veth2d26912@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether b6:f1:1c:ea:80:bb brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::b4f1:1cff:feea:80bb/64 scope link 
       valid_lft forever preferred_lft forever
16: veth600a995@if15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether e2:cb:50:d1:c8:3d brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet6 fe80::e0cb:50ff:fed1:c83d/64 scope link 
       valid_lft forever preferred_lft forever
18: vethba1dfe3@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 7e:a8:37:60:0a:11 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::7ca8:37ff:fe60:a11/64 scope link 
       valid_lft forever preferred_lft forever
20: vethc330c28@if19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether c6:19:7a:ec:f3:05 brd ff:ff:ff:ff:ff:ff link-netnsid 3
    inet6 fe80::c419:7aff:feec:f305/64 scope link 
       valid_lft forever preferred_lft forever
22: vethed97c29@if21: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether a6:55:f4:ce:31:48 brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::a455:f4ff:fece:3148/64 scope link 
       valid_lft forever preferred_lft forever
24: vethd8a7c40@if23: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether fe:8e:10:2a:b7:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::fc8e:10ff:fe2a:b7c3/64 scope link 
       valid_lft forever preferred_lft forever
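
The veth entries above are the tell: on the worker nodes the container veths are attached to docker0 (master docker0) rather than to cni0. A quick way to list this directly on a node (plain iproute2):

# interfaces enslaved to each bridge
ip link show master docker0
ip link show master cni0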

/etc/cni/net.d/10-flannel.conf (similar on all hosts)

{
  "name": "cbr0",
  "type": "flannel",
  "delegate": {
    "isDefaultGateway": true
  }
}
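
For comparison only (not what is deployed here), newer flannel manifests ship the same configuration as a .conflist with a plugins array and a portmap plugin; whether that difference matters for this problem is an assumption on my part:

{
  "name": "cbr0",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}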

kubeadm config

api:
  advertiseAddress: 192.168.0.103
  bindPort: 6443
  controlPlaneEndpoint: ""
apiServerExtraArgs:
  authorization-mode: Node,RBAC
apiVersion: kubeadm.k8s.io/v1alpha2
auditPolicy:
  logDir: /var/log/kubernetes/audit
  logMaxAge: 2
  path: ""
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
etcd:
  local:
    dataDir: /var/lib/etcd
    image: ""
imageRepository: k8s.gcr.io
kind: MasterConfiguration
kubeProxy:
  config:
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 5
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 15m0s
    conntrack:
      max: null
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      syncPeriod: 30s
    metricsBindAddress: 127.0.0.1:10249
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: -999
    portRange: ""
    resourceContainer: /kube-proxy
    udpIdleTimeout: 250ms
kubeletConfiguration:
  baseConfig:
    address: 0.0.0.0
    authentication:
      anonymous:
        enabled: false
      webhook:
        cacheTTL: 2m0s
        enabled: true
      x509:
        clientCAFile: /etc/kubernetes/pki/ca.crt
    authorization:
      mode: Webhook
      webhook:
        cacheAuthorizedTTL: 5m0s
        cacheUnauthorizedTTL: 30s
    cgroupDriver: systemd
    cgroupsPerQOS: true
    clusterDNS:
    - 10.96.0.10
    clusterDomain: cluster.local
    containerLogMaxFiles: 5
    containerLogMaxSize: 10Mi
    contentType: application/vnd.kubernetes.protobuf
    cpuCFSQuota: true
    cpuManagerPolicy: none
    cpuManagerReconcilePeriod: 10s
    enableControllerAttachDetach: true
    enableDebuggingHandlers: true
    enforceNodeAllocatable:
    - pods
    eventBurst: 10
    eventRecordQPS: 5
    evictionHard:
      imagefs.available: 15%
      memory.available: 100Mi
      nodefs.available: 10%
      nodefs.inodesFree: 5%
    evictionPressureTransitionPeriod: 5m0s
    failSwapOn: true
    fileCheckFrequency: 20s
    hairpinMode: promiscuous-bridge
    healthzBindAddress: 127.0.0.1
    healthzPort: 10248
    httpCheckFrequency: 20s
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    imageMinimumGCAge: 2m0s
    iptablesDropBit: 15
    iptablesMasqueradeBit: 14
    kubeAPIBurst: 10
    kubeAPIQPS: 5
    makeIPTablesUtilChains: true
    maxOpenFiles: 1000000
    maxPods: 110
    nodeStatusUpdateFrequency: 10s
    oomScoreAdj: -999
    podPidsLimit: -1
    port: 10250
    registryBurst: 10
    registryPullQPS: 5
    resolvConf: /etc/resolv.conf
    rotateCertificates: true
    runtimeRequestTimeout: 2m0s
    serializeImagePulls: true
    staticPodPath: /etc/kubernetes/manifests
    streamingConnectionIdleTimeout: 4h0m0s
    syncFrequency: 1m0s
    volumeStatsAggPeriod: 1m0s
kubernetesVersion: v1.11.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
nodeRegistration: {}
unifiedControlPlaneImage: ""
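
Note that this configuration still has podSubnet: 10.244.0.0/16 and kube-proxy clusterCIDR: 10.244.0.0/16, so the control plane side looks correct. Two quick checks against the live cluster (ConfigMap names assume a kubeadm 1.11 cluster):

# the config kubeadm stored in the cluster
kubectl -n kube-system get cm kubeadm-config -o yaml | grep -A 3 networking

# the per-version kubelet config that nodes consume; cgroupDriver lives here
kubectl -n kube-system get cm kubelet-config-1.11 -o yaml | grep cgroupDriver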

pods (truncated)

NAMESPACE         NAME                                              READY     STATUS             RESTARTS   AGE       IP              NODE
cert-manager      cert-manager-65b7d47f7d-77nqj                     2/2       Running            4          9h        172.17.0.4      k8s-node-1.home
docker-registry   reg-server-76695985b6-bc49x                       0/1       CrashLoopBackOff   22         1h        172.17.0.2      k8s-node-2.home
docker-registry   registry-5df69cb5f7-2n2lv                         1/1       Running            0          1h        172.17.0.3      k8s-node-2.home
ingress-nginx     nginx-ingress-controller-699cdf846-w2dmj          1/1       Running            0          2d        172.17.0.8      k8s-node-1.home
kube-system       coredns-78fcdf6894-v2hrg                          0/1       CrashLoopBackOff   18         1h        172.17.0.6      k8s-node-2.home
kube-system       coredns-df995dbb4-j9pzw                           1/1       Running            0          2d        10.244.0.53     k8s-master.home
kube-system       kube-apiserver-k8s-master.home                   1/1       Running            0          1h        192.168.0.103   k8s-master.home
kube-system       kube-controller-manager-k8s-master.home          1/1       Running            0          56m       192.168.0.103   k8s-master.home
kube-system       kube-flannel-ds-6flvr                             1/1       Running            15         96d       192.168.0.103   k8s-master.home
kube-system       kube-proxy-mjxn9                                  1/1       Running            0          35m       192.168.0.103   k8s-master.home
kube-system       kube-scheduler-k8s-master.home                   1/1       Running            27         2d        192.168.0.103   k8s-master.home

Note: reg-server-76695985b6-bc49x is trying to reach registry-5df69cb5f7-2n2lv through an ingress and is failing. When both pods are on the same node as the nginx ingress controller, the requests succeed.
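
A minimal cross-node smoke test with a throwaway busybox pod (the target IP and port below are placeholders; substitute a real pod IP from another node):

# start a shell in a temporary pod
kubectl run -it --rm net-test --image=busybox --restart=Never -- sh
# inside the pod, try a pod that is scheduled on a different node
ping -c 3 10.244.2.10                      # placeholder pod IP
wget -qO- http://10.244.2.10:5000/v2/      # placeholder registry IP:port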

-- AWippler
kubeadm
kubernetes

1 Answer

7/5/2018

So apparently, with the new Kubernetes 1.11.x version, there are new ConfigMaps in kube-system that specify the cgroup driver being used. Once I edited those to use the correct driver, systemd (I run CentOS nodes), waited several minutes, and restarted the pods on the nodes, the flannel subnet was being used again.
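
The ConfigMap in question is presumably the per-version kubelet config that kubeadm 1.11 creates; a sketch of the edit and the follow-up steps (the kubeadm upgrade node config step is how nodes re-download that config in 1.11, as I understand it):

# set cgroupDriver: systemd in the kubelet ConfigMap
kubectl -n kube-system edit cm kubelet-config-1.11

# on each node, re-sync the local kubelet config and restart the kubelet
kubeadm upgrade node config --kubelet-version v1.11.0
systemctl restart kubelet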

EDIT
Looks like I also had to edit /var/lib/kubelet/kubeadm-flags.env to include:

KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --network-plugin=cni
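
The kubelet needs a restart to pick up the new flags, and the container runtime's cgroup driver has to match; the daemon.json line is a sketch assuming Docker as the runtime:

# align Docker's cgroup driver with the kubelet (merge into /etc/docker/daemon.json):
#   { "exec-opts": ["native.cgroupdriver=systemd"] }
systemctl daemon-reload
systemctl restart docker kubelet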
-- AWippler
Source: StackOverflow