impaired / delayed connectivity to cluster IPs from the k8s master node

1/3/2020

I am using Kubernetes 1.17 on CentOS 7 with flannel:v0.11.0 and am having issues with the reachability of my CLUSTER-IPs from the control plane.

I installed and set up the cluster manually with kubeadm.

This is basically my cluster:

k8s-master-01 10.0.0.50/24
k8s-worker-01 10.0.0.60/24 
k8s-worker-02 10.0.0.61/24

Pod CIDR: 10.244.0.0/16
Service CIDR: 10.96.0.0/12

Note: each node has two NICs (eth0: uplink, eth1: private). The IPs listed above are assigned to eth1 on each node. kubelet, kube-proxy and flannel are configured to send/receive their traffic over the private network on eth1.
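A quick way to confirm that pinning (assuming the stock flannel log messages) is to check which interface flannel picked and which IPs the nodes advertise:

$ kubectl -n kube-system logs kube-flannel-ds-amd64-c5rj2 | grep -i 'using interface'
$ kubectl get nodes -o wide    # INTERNAL-IP should show the 10.0.0.x addresses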

I first ran into the problem when I tried to expose the metrics-server API through kube-apiserver. I followed the instructions from here. It seems that the control plane isn't able to communicate with the service network properly.

Here are my pods in the kube-system namespace:

$ kubectl get pods -n kube-system -o wide
NAME                                    READY   STATUS    RESTARTS   AGE     IP            NODE            NOMINATED NODE   READINESS GATES
coredns-6955765f44-jrbs6                0/1     Running   9          24d     10.244.0.30   k8s-master-01   <none>           <none>
coredns-6955765f44-mwn2l                1/1     Running   8          24d     10.244.1.37   k8s-worker-01   <none>           <none>
etcd-k8s-master-01                      1/1     Running   9          24d     10.0.0.50     k8s-master-01   <none>           <none>
kube-apiserver-k8s-master-01            1/1     Running   0          2m26s   10.0.0.50     k8s-master-01   <none>           <none>
kube-controller-manager-k8s-master-01   1/1     Running   15         24d     10.0.0.50     k8s-master-01   <none>           <none>
kube-flannel-ds-amd64-7d6jq             1/1     Running   11         26d     10.0.0.60     k8s-worker-01   <none>           <none>
kube-flannel-ds-amd64-c5rj2             1/1     Running   11         26d     10.0.0.50     k8s-master-01   <none>           <none>
kube-flannel-ds-amd64-dsg6l             1/1     Running   11         26d     10.0.0.61     k8s-worker-02   <none>           <none>
kube-proxy-mrz9v                        1/1     Running   10         24d     10.0.0.50     k8s-master-01   <none>           <none>
kube-proxy-slt95                        1/1     Running   9          24d     10.0.0.61     k8s-worker-02   <none>           <none>
kube-proxy-txlrp                        1/1     Running   9          24d     10.0.0.60     k8s-worker-01   <none>           <none>
kube-scheduler-k8s-master-01            1/1     Running   14         24d     10.0.0.50     k8s-master-01   <none>           <none>
metrics-server-67684d476-mrvj2          1/1     Running   2          7d23h   10.244.2.43   k8s-worker-02   <none>           <none>

So here are my services:

$ kubectl get services --all-namespaces -o wide
NAMESPACE              NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE    SELECTOR
default                kubernetes                  ClusterIP   10.96.0.1       <none>        443/TCP                  26d    <none>
default                phpdemo                     ClusterIP   10.96.52.157    <none>        80/TCP                   11d    app=phpdemo
kube-system            kube-dns                    ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP   26d    k8s-app=kube-dns
kube-system            metrics-server              ClusterIP   10.96.71.138    <none>        443/TCP                  5d3h   k8s-app=metrics-server
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP   10.99.136.237   <none>        8000/TCP                 23d    k8s-app=dashboard-metrics-scraper
kubernetes-dashboard   kubernetes-dashboard        ClusterIP   10.97.209.113   <none>        443/TCP                  23d    k8s-app=kubernetes-dashboard

The metrics API doesn't work due to failed connection checks:

$ kubectl describe apiservice v1beta1.metrics.k8s.io
...
Status:
  Conditions:
    Last Transition Time:  2019-12-27T21:25:01Z
    Message:               failing or missing response from https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: Get https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:  
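For completeness, the ClusterIP should have the metrics-server pod as its endpoint (the pod is Running and ready); this can be double-checked with:

$ kubectl -n kube-system get endpoints metrics-server
$ kubectl -n kube-system describe service metrics-server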

kube-apiserver doesn't get a connection:

$ kubectl logs --tail=20 kube-apiserver-k8s-master-01 -n kube-system
...
I0101 22:27:00.712413       1 controller.go:107] OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io
W0101 22:27:00.712514       1 handler_proxy.go:97] no RequestInfo found in the context
E0101 22:27:00.712559       1 controller.go:114] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0101 22:27:00.712591       1 controller.go:127] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
E0101 22:27:04.712991       1 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: Get https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0101 22:27:09.714801       1 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: Get https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0101 22:27:34.709557       1 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: Get https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
E0101 22:27:39.714173       1 available_controller.go:419] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: Get https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

I tried to figure out what happens inside kube-apiserver and could finally confirm the problem: I only get a delayed response after >60s (unfortunately, time is not installed in the container):

$ kubectl exec -it kube-apiserver-k8s-master-01 -n kube-system -- /bin/sh
# echo -e "GET /apis/metrics.k8s.io/v1beta1 HTTP/1.1\r\nHost:v1beta1.metrics.k8s.io\r\n" | openssl s_client -connect 10.96.71.138:443 -quiet
Can't use SSL_get_servername
depth=1 CN = localhost-ca@1577481905
verify error:num=19:self signed certificate in certificate chain
verify return:1
depth=1 CN = localhost-ca@1577481905
verify return:1
depth=0 CN = localhost@1577481906
verify return:1
HTTP/1.1 400 Bad Request
Content-Type: text/plain; charset=utf-8
Connection: close
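Since time is not available in that container, the delay can at least be approximated with date (plain POSIX sh, same request as above):

# measure elapsed seconds without the time command
start=$(date +%s)
echo -e "GET /apis/metrics.k8s.io/v1beta1 HTTP/1.1\r\nHost:v1beta1.metrics.k8s.io\r\n" | openssl s_client -connect 10.96.71.138:443 -quiet
echo "took $(( $(date +%s) - start ))s"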

The same command succeeds from two of my own test pods (one on each of the two worker nodes), so the service IPs are reachable from the pod network on the worker nodes:

$ kubectl exec -it phpdemo-55858f97c4-fjc6q -- /bin/sh
/usr/local/bin # echo -e "GET /apis/metrics.k8s.io/v1beta1 HTTP/1.1\r\nHost:v1beta1.metrics.k8s.io\r\n" | openssl s_client -connect 10.96.71.138:443 -quiet
Can't use SSL_get_servername
depth=1 CN = localhost-ca@1577481905
verify error:num=19:self signed certificate in certificate chain
verify return:1
depth=1 CN = localhost-ca@1577481905
verify return:1
depth=0 CN = localhost@1577481906
verify return:1
HTTP/1.1 403 Forbidden
Content-Type: application/json
X-Content-Type-Options: nosniff
Date: Wed, 01 Jan 2020 22:53:44 GMT
Content-Length: 212

{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"forbidden: User \"system:anonymous\" cannot get path \"/apis/metrics.k8s.io/v1beta1\"","reason":"Forbidden","details":{},"code":403}

And from the worker node as well:

[root@k8s-worker-02 ~ ] time curl -k https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/apis/metrics.k8s.io/v1beta1\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
}
real    0m0.146s
user    0m0.048s
sys 0m0.089s

This doesn't work on my master node; I only get a delayed response after >60s:

[root@k8s-master-01 ~ ] time curl -k https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/apis/metrics.k8s.io/v1beta1\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
}
real    1m3.248s
user    0m0.061s
sys 0m0.079s

On the master node, conntrack shows lots of unreplied connections stuck in SYN_SENT:

[root@k8s-master-01 ~ ] conntrack -L -d 10.96.71.138
tcp      6 75 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48550 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=19813 mark=0 use=1
tcp      6 5 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48287 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=23710 mark=0 use=1
tcp      6 40 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48422 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=24286 mark=0 use=1
tcp      6 5 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48286 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=35030 mark=0 use=1
tcp      6 80 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48574 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=40636 mark=0 use=1
tcp      6 50 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48464 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=65512 mark=0 use=1
tcp      6 5 SYN_SENT src=10.0.2.15 dst=10.96.71.138 sport=48290 dport=443 [UNREPLIED] src=10.244.2.38 dst=10.244.0.0 sport=4443 dport=47617 mark=0 use=1
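So the DNAT to the metrics-server endpoint is clearly happening, but the SYNs are never answered. To see where the packets get lost, they could be captured along the flannel path on the master (assuming tcpdump is installed; 8472 is flannel's default VXLAN port):

[root@k8s-master-01 ~ ] tcpdump -ni flannel.1 tcp port 4443   # DNATed traffic towards the metrics-server endpoint
[root@k8s-master-01 ~ ] tcpdump -ni eth1 udp port 8472        # VXLAN-encapsulated flannel traffic to the workers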

The iptables rules are in place:

[root@k8s-master-01 ~ ] iptables-save | grep 10.96.71.138
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.71.138/32 -p tcp -m comment --comment "kube-system/metrics-server: cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.71.138/32 -p tcp -m comment --comment "kube-system/metrics-server: cluster IP" -m tcp --dport 443 -j KUBE-SVC-LC5QY66VUV2HJ6WZ
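The per-endpoint DNAT rules behind that service chain can be dumped as well (the KUBE-SEP-* chain names are cluster-specific):

[root@k8s-master-01 ~ ] iptables -t nat -nL KUBE-SVC-LC5QY66VUV2HJ6WZ
[root@k8s-master-01 ~ ] iptables-save -t nat | grep -E 'KUBE-SEP.*4443'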

kube-proxy is up and running on each node without errors:

$ kubectl get pods -A -o wide
...
kube-system            kube-proxy-mrz9v                             1/1     Running   10         21d    10.0.0.50     k8s-master-01   <none>           <none>
kube-system            kube-proxy-slt95                             1/1     Running   9          21d    10.0.0.61     k8s-worker-02   <none>           <none>
kube-system            kube-proxy-txlrp                             1/1     Running   9          21d    10.0.0.60     k8s-worker-01   <none>           <none>
$ kubectl -n kube-system logs kube-proxy-mrz9v
W0101 21:31:14.268698       1 server_others.go:323] Unknown proxy mode "", assuming iptables proxy
I0101 21:31:14.283958       1 node.go:135] Successfully retrieved node IP: 10.0.0.50
I0101 21:31:14.284034       1 server_others.go:145] Using iptables Proxier.
I0101 21:31:14.284624       1 server.go:571] Version: v1.17.0
I0101 21:31:14.286031       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0101 21:31:14.286093       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0101 21:31:14.287207       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0101 21:31:14.298760       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0101 21:31:14.298984       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0101 21:31:14.300618       1 config.go:313] Starting service config controller
I0101 21:31:14.300665       1 shared_informer.go:197] Waiting for caches to sync for service config
I0101 21:31:14.300720       1 config.go:131] Starting endpoints config controller
I0101 21:31:14.300740       1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0101 21:31:14.400864       1 shared_informer.go:204] Caches are synced for service config 
I0101 21:31:14.401021       1 shared_informer.go:204] Caches are synced for endpoints config 

$ kubectl -n kube-system logs kube-proxy-slt95
W0101 21:31:13.856897       1 server_others.go:323] Unknown proxy mode "", assuming iptables proxy
I0101 21:31:13.905653       1 node.go:135] Successfully retrieved node IP: 10.0.0.61
I0101 21:31:13.905704       1 server_others.go:145] Using iptables Proxier.
I0101 21:31:13.906370       1 server.go:571] Version: v1.17.0
I0101 21:31:13.906983       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0101 21:31:13.907032       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0101 21:31:13.907413       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0101 21:31:13.912221       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0101 21:31:13.912321       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0101 21:31:13.915322       1 config.go:313] Starting service config controller
I0101 21:31:13.915353       1 shared_informer.go:197] Waiting for caches to sync for service config
I0101 21:31:13.915755       1 config.go:131] Starting endpoints config controller
I0101 21:31:13.915779       1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0101 21:31:14.016995       1 shared_informer.go:204] Caches are synced for endpoints config 
I0101 21:31:14.017115       1 shared_informer.go:204] Caches are synced for service config 

$ kubectl -n kube-system logs kube-proxy-txlrp
W0101 21:31:13.552518       1 server_others.go:323] Unknown proxy mode "", assuming iptables proxy
I0101 21:31:13.696793       1 node.go:135] Successfully retrieved node IP: 10.0.0.60
I0101 21:31:13.696846       1 server_others.go:145] Using iptables Proxier.
I0101 21:31:13.697396       1 server.go:571] Version: v1.17.0
I0101 21:31:13.698000       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0101 21:31:13.698101       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0101 21:31:13.698509       1 conntrack.go:83] Setting conntrack hashsize to 32768
I0101 21:31:13.704280       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0101 21:31:13.704467       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0101 21:31:13.704888       1 config.go:131] Starting endpoints config controller
I0101 21:31:13.704935       1 shared_informer.go:197] Waiting for caches to sync for endpoints config
I0101 21:31:13.705046       1 config.go:313] Starting service config controller
I0101 21:31:13.705059       1 shared_informer.go:197] Waiting for caches to sync for service config
I0101 21:31:13.806299       1 shared_informer.go:204] Caches are synced for endpoints config 
I0101 21:31:13.806430       1 shared_informer.go:204] Caches are synced for service config 

Here are my (default) kube-proxy settings:

$ kubectl -n kube-system get configmap kube-proxy -o yaml
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 5
    clusterCIDR: 10.244.0.0/16
    configSyncPeriod: 15m0s
    conntrack:
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    hostnameOverride: ""
    iptables:
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: ""
    nodePortAddresses: null
    oomScoreAdj: -999
    portRange: ""
    udpIdleTimeout: 250ms
    winkernel:
      enableDSR: false
      networkName: ""
      sourceVip: ""
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://10.0.0.50:6443
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  creationTimestamp: "2019-12-06T22:07:40Z"
  labels:
    app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "185"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kube-proxy
  uid: bac4a8df-e318-4c91-a6ed-9305e58ac6d9
$ kubectl -n kube-system get daemonset kube-proxy -o yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "2"
  creationTimestamp: "2019-12-06T22:07:40Z"
  generation: 2
  labels:
    k8s-app: kube-proxy
  name: kube-proxy
  namespace: kube-system
  resourceVersion: "115436"
  selfLink: /apis/apps/v1/namespaces/kube-system/daemonsets/kube-proxy
  uid: 64a53d29-1eaa-424f-9ebd-606bcdb3169c
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-proxy
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: kube-proxy
    spec:
      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: k8s.gcr.io/kube-proxy:v1.17.0
        imagePullPolicy: IfNotPresent
        name: kube-proxy
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/kube-proxy
          name: kube-proxy
        - mountPath: /run/xtables.lock
          name: xtables-lock
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: kube-proxy
      serviceAccountName: kube-proxy
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: kube-proxy
        name: kube-proxy
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 3
  desiredNumberScheduled: 3
  numberAvailable: 3
  numberMisscheduled: 0
  numberReady: 3
  observedGeneration: 2
  updatedNumberScheduled: 3
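For completeness, the flannel routes and subnet lease on each node can be compared as well (with the vxlan backend there should normally be one 10.244.x.0/24 route per remote node via flannel.1, plus the local subnet on cni0):

[root@k8s-master-01 ~ ] ip route show | grep 10.244
[root@k8s-master-01 ~ ] cat /run/flannel/subnet.env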

Is this just a result of misconfiguration or is this a bug? Any help is appreciated.

-- user10374572
api
flannel
kubernetes
metrics
timeout

2 Answers

4/13/2020

This issue is related to flannel bugs with the VXLAN backend that cause NAT rules and/or routing entries to be missing or incomplete:

https://github.com/coreos/flannel/issues/1243
https://github.com/coreos/flannel/issues/1245

As a workaround, setting up a static route to the service network via the cni0 interface on my nodes fixed it for me instantly:

ip route add 10.96.0.0/12 dev cni0
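The new route can be verified with ip route get, and the immediate response can be confirmed by repeating the earlier test (note that a route added this way does not survive a reboot):

ip route get 10.96.71.138       # should now go out via dev cni0
time curl -k https://10.96.71.138:443/apis/metrics.k8s.io/v1beta1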
-- user10374572
Source: StackOverflow

1/4/2020

Here is what I did to make it work:

1. Set the --enable-aggregator-routing=true flag on the kube API server.

2. Set the flags below in metrics-server-deployment.yaml:

- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP

3. Set hostNetwork: true in metrics-server-deployment.yaml (a combined sketch of all three changes follows below).
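Put together, the changes look roughly like this (a sketch, assuming a kubeadm cluster with the API server as a static pod manifest and a metrics-server container that already defines an args list; adjust the names to your manifests):

# 1. On the master, add the flag to the kube-apiserver args in
#    /etc/kubernetes/manifests/kube-apiserver.yaml:
#      - --enable-aggregator-routing=true
# 2./3. Add the kubelet flags and host networking to the metrics-server deployment
#       (assumes the container already has an args list; otherwise edit the yaml directly):
kubectl -n kube-system patch deployment metrics-server --type=json -p '[
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"},
  {"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-preferred-address-types=InternalIP"},
  {"op":"add","path":"/spec/template/spec/hostNetwork","value":true}
]'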

-- Arghya Sadhu
Source: StackOverflow