I'm trying to create a single control-plane cluster with kubeadm on 3 bare-metal nodes (1 master and 2 workers) running Debian 10, with Docker as the container runtime. Each node has an external IP and an internal IP. I want the cluster to run on the internal network and be accessible from the Internet. I used this command for that (please correct me if something is wrong):
kubeadm init --control-plane-endpoint=10.10.0.1 --apiserver-cert-extra-sans={public_DNS_name},10.10.0.1 --pod-network-cidr=192.168.0.0/16
I got:
kubectl get no -o wide
NAME                          STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
dev-k8s-master-0.public.dns   Ready    master   16h   v1.18.2   10.10.0.1     <none>        Debian GNU/Linux 10 (buster)   4.19.0-8-amd64   docker://19.3.8
The init phase completed successfully and the cluster is accessible from the Internet. All pods are up and running except coredns, which should start running once a pod network is applied:
kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calico.yaml
After the network was applied, the coredns pods are still not ready:
kubectl get po -A
NAMESPACE     NAME                                       READY   STATUS             RESTARTS   AGE
kube-system   calico-kube-controllers-75d56dfc47-g8g9g   0/1     CrashLoopBackOff   192        16h
kube-system   calico-node-22gtx                          1/1     Running            0          16h
kube-system   coredns-66bff467f8-87vd8                   0/1     Running            0          16h
kube-system   coredns-66bff467f8-mv8d9                   0/1     Running            0          16h
kube-system   etcd-dev-k8s-master-0                      1/1     Running            0          16h
kube-system   kube-apiserver-dev-k8s-master-0            1/1     Running            0          16h
kube-system   kube-controller-manager-dev-k8s-master-0   1/1     Running            0          16h
kube-system   kube-proxy-lp6b8                           1/1     Running            0          16h
kube-system   kube-scheduler-dev-k8s-master-0            1/1     Running            0          16h
Some logs from the failing pods:
kubectl -n kube-system logs calico-kube-controllers-75d56dfc47-g8g9g
2020-04-22 08:24:55.853 [INFO][1] main.go 88: Loaded configuration from environment config=&config.Config{LogLevel:"info", ReconcilerPeriod:"5m", CompactionPeriod:"10m", EnabledControllers:"node", WorkloadEndpointWorkers:1, ProfileWorkers:1, PolicyWorkers:1, NodeWorkers:1, Kubeconfig:"", HealthEnabled:true, SyncNodeLabels:true, DatastoreType:"kubernetes"}
2020-04-22 08:24:55.855 [INFO][1] k8s.go 228: Using Calico IPAM
W0422 08:24:55.855525 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2020-04-22 08:24:55.856 [INFO][1] main.go 109: Ensuring Calico datastore is initialized
2020-04-22 08:25:05.857 [ERROR][1] client.go 255: Error getting cluster information config ClusterInformation="default" error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
2020-04-22 08:25:05.857 [FATAL][1] main.go 114: Failed to initialize Calico datastore error=Get https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default: context deadline exceeded
coredns:
[INFO] plugin/ready: Still waiting on: "kubernetes"
I0422 08:29:12.275344 1 trace.go:116] Trace[1050055850]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105 (started: 2020-04-22 08:28:42.274382393 +0000 UTC m=+59491.429700922) (total time: 30.000897581s):
Trace[1050055850]: [30.000897581s] [30.000897581s] END
E0422 08:29:12.275388 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0422 08:29:12.276163 1 trace.go:116] Trace[188478428]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105 (started: 2020-04-22 08:28:42.275499997 +0000 UTC m=+59491.430818380) (total time: 30.000606394s):
Trace[188478428]: [30.000606394s] [30.000606394s] END
E0422 08:29:12.276198 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
I0422 08:29:12.277424 1 trace.go:116] Trace[16697023]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105 (started: 2020-04-22 08:28:42.276675998 +0000 UTC m=+59491.431994406) (total time: 30.000689778s):
Trace[16697023]: [30.000689778s] [30.000689778s] END
E0422 08:29:12.277452 1 reflector.go:153] pkg/mod/k8s.io/client-go@v0.17.2/tools/cache/reflector.go:105: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"
Any thoughts on what's wrong?
This answer is meant to call attention to @florin's suggestion:
I've seen similar behavior when I had multiple public interfaces on the node and Calico selected the wrong one.
What I did was set IP_AUTODETECTION_METHOD in the Calico config.
The method to use to autodetect the IPv4 address for this host. This is only used when the IPv4 address is being autodetected. See IP Autodetection methods for details of the valid methods.
Learn more here: https://docs.projectcalico.org/reference/node/configuration#ip-autodetection-methods
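As a rough sketch of how that setting could be applied: the variable documented at the link above is named IP_AUTODETECTION_METHOD, and the stock calico.yaml manifest runs calico-node as a DaemonSet in kube-system (the calico-node-22gtx pod in the question). The helper name pin_calico_interface and the interface name eth1 are assumptions made for illustration; check the real name of the internal interface with ip addr on each node.

```shell
# Hypothetical helper (not part of Calico or kubectl): sets Calico's IPv4
# autodetection method on the calico-node DaemonSet, then waits for the
# calico-node pods to be rolled out again with the new setting.
pin_calico_interface() {
  method="$1"
  if [ -z "$method" ]; then
    echo "usage: pin_calico_interface <method>, e.g. interface=eth1" >&2
    return 1
  fi
  kubectl -n kube-system set env daemonset/calico-node \
    "IP_AUTODETECTION_METHOD=${method}" &&
    kubectl -n kube-system rollout status daemonset/calico-node
}
```

Calling pin_calico_interface "interface=eth1" would pin Calico to the interface assumed to carry the 10.10.0.x network. The documented can-reach=DESTINATION form (for example can-reach=10.10.0.1) is an alternative when interface names differ between nodes. Whether this resolves the timeouts to 10.96.0.1 depends on the wrong interface actually being the cause.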