K8S: Node does not join cluster

9/17/2015

I have an issue where my Kubernetes node does not register with the Kubernetes master.

I have seen lots of issues similar to my problem, but most of them were bugs that have since been fixed. The prerequisites and the different Kubernetes components seem operational. I most likely have a configuration mistake, but trying configurations that work for others doesn't solve it for me.

I'm following the Step by Step tutorial from the CoreOS team.

My configuration:

  • VirtualBox: 5.0.2
  • Vagrant: 1.7.3
  • CoreOS: 801
  • Hyperkube: 1.0.6

My procedure:

- I boot a Kubernetes master
  - start etcd
  - start flanneld
  - start docker after flanneld
  - start kubelet
    - it starts the apiserver (as a container)
    - it starts the controller-manager (as a container)
    - it starts the scheduler (as a container)
    - it starts the proxy (as a container)

- I start a Kubernetes node
  - start etcd
  - start flanneld
  - start docker after flanneld
  - start the kubelet
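
For reference, here is roughly what my node kubelet unit looks like, trimmed to the relevant flags (the apiserver address and node IP match the ones in the logs below):

```ini
[Service]
ExecStart=/usr/bin/kubelet \
  --api-servers=http://192.168.1.88:8080 \
  --register-node=true \
  --config=/etc/kubernetes/manifests \
  --hostname-override=192.168.1.31
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target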

ETCD2:

  • I can share values between the master and the node.
  • The node is an etcd proxy.

FLANNELD:

  • I have created a container on each host.
  • I am able to ping from one container to the other.

MASTER KUBELET:

  • It starts the components that are in the config folder

KUBERNETES SEEMS TO RUN:

  • If I set the master kubelet's register-node to true, it registers as a node
  • In this test, the master kubelet is not set to register
  • I can start pods (they work if the master kubelet is switched to register)

NODE KUBELET:

Here are the logs:

$ journalctl -fu kubelet --since=2012-01-01
-- Logs begin at Thu 2015-09-17 09:38:17 UTC. --
Sep 17 09:39:37 node1 systemd[1]: Starting Kubernetes Kubelet for Node...
Sep 17 09:39:37 node1 systemd[1]: Started Kubernetes Kubelet for Node.
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.080731    1634 manager.go:127] cAdvisor running in container: "/system.slice/kubelet.service"
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.081391    1634 fs.go:93] Filesystem partitions: map[/dev/sda9:{mountpoint:/ major:8 minor:9} /dev/sda3:{mountpoint:/usr major:8 minor:3} /dev/sda6:{mountpoint:/usr/share/oem major:8 minor:6}]
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.083078    1634 manager.go:156] Machine: {NumCores:1 CpuFrequency:3403222 MemoryCapacity:4048441344 MachineID:1c0a9b68c0044cfdb5024dc80a5cdec2 SystemUUID:35A45175-4822-4FFA-9CBF-ECC10430ED28 BootID:18baf9ac-73a9-42f3-9bc5-2dca985d03e9 Filesystems:[{Device:/dev/sda6 Capacity:113229824} {Device:/dev/sda9 Capacity:16718393344} {Device:/dev/sda3 Capacity:1031946240}] DiskMap:map[8:0:{Name:sda Major:8 Minor:0 Size:19818086400 Scheduler:cfq}] NetworkDevices:[{Name:eth0 MacAddress:08:00:27:8c:0a:cd Speed:0 Mtu:1500} {Name:eth1 MacAddress:08:00:27:bc:e6:70 Speed:0 Mtu:1500} {Name:eth2 MacAddress:08:00:27:b9:33:63 Speed:0 Mtu:1500} {Name:flannel0 MacAddress: Speed:10 Mtu:1472}] Topology:[{Id:0 Memory:4048441344 Cores:[{Id:0 Threads:[0] Caches:[{Size:32768 Type:Data Level:1} {Size:32768 Type:Instruction Level:1} {Size:262144 Type:Unified Level:2}]}] Caches:[{Size:6291456 Type:Unified Level:3}]}]}
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.087467    1634 manager.go:163] Version: {KernelVersion:4.1.6-coreos-r2 ContainerOsVersion:CoreOS 801.0.0 DockerVersion:1.8.1 CadvisorVersion:0.15.1}
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.087674    1634 plugins.go:69] No cloud provider specified.
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.087698    1634 docker.go:295] Connecting to docker on unix:///var/run/docker.sock
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.088720    1634 server.go:663] Adding manifest file: /etc/kubernetes/manifests
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.088734    1634 server.go:673] Watching apiserver
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.110463    1634 reflector.go:136] Failed to list *api.Node: Get http://192.168.1.88:8080/api/v1/nodes?fieldSelector=metadata.name%3D192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.111317    1634 reflector.go:136] Failed to list *api.Service: Get http://192.168.1.88:8080/api/v1/services: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.111641    1634 reflector.go:136] Failed to list *api.Pod: Get http://192.168.1.88:8080/api/v1/pods?fieldSelector=spec.nodeName%3D192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.219264    1634 plugins.go:56] Registering credential provider: .dockercfg
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.221429    1634 server.go:635] Started kubelet
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.221752    1634 kubelet.go:682] Image garbage collection failed: unable to find data for container /
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.230631    1634 kubelet.go:702] Running in container "/kubelet"
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.235396    1634 server.go:63] Starting to listen on 0.0.0.0:10250
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.257384    1634 event.go:194] Unable to write event: 'Post http://192.168.1.88:8080/api/v1/namespaces/default/events: dial tcp 192.168.1.88:8080: connection refused' (may retry after sleeping)
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.368996    1634 factory.go:226] System is using systemd
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.369627    1634 factory.go:234] Registering Docker factory
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.370640    1634 factory.go:89] Registering Raw factory
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.490377    1634 manager.go:946] Started watching for new ooms in manager
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.490733    1634 oomparser.go:183] oomparser using systemd
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.491323    1634 manager.go:243] Starting recovery of all containers
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.647835    1634 manager.go:248] Recovery completed
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.702130    1634 status_manager.go:76] Starting to sync pod status with apiserver
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.702375    1634 kubelet.go:1725] Starting kubelet main sync loop.
Sep 17 09:39:37 node1 kubelet[1634]: E0917 09:39:37.712658    1634 kubelet.go:1641] error getting node: node 192.168.1.31 not found
Sep 17 09:39:37 node1 kubelet[1634]: I0917 09:39:37.736035    1634 provider.go:91] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
Sep 17 09:39:37 node1 kubelet[1634]: W0917 09:39:37.743037    1634 status_manager.go:80] Failed to updated pod status: error updating status for pod "kube-proxy-192.168.1.31_default": Get http://192.168.1.88:8080/api/v1/namespaces/default/pods/kube-proxy-192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:38 node1 kubelet[1634]: E0917 09:39:38.113116    1634 reflector.go:136] Failed to list *api.Pod: Get http://192.168.1.88:8080/api/v1/pods?fieldSelector=spec.nodeName%3D192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:38 node1 kubelet[1634]: E0917 09:39:38.113170    1634 reflector.go:136] Failed to list *api.Service: Get http://192.168.1.88:8080/api/v1/services: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:38 node1 kubelet[1634]: E0917 09:39:38.113191    1634 reflector.go:136] Failed to list *api.Node: Get http://192.168.1.88:8080/api/v1/nodes?fieldSelector=metadata.name%3D192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:39 node1 kubelet[1634]: E0917 09:39:39.114141    1634 reflector.go:136] Failed to list *api.Node: Get http://192.168.1.88:8080/api/v1/nodes?fieldSelector=metadata.name%3D192.168.1.31: dial tcp 192.168.1.88:8080: connection refused
Sep 17 09:39:39 node1 kubelet[1634]: E0917 09:39:39.114207    1634 reflector.go:136] Failed to list *api.Service: Get http://192.168.1.88:8080/api/v1/services: dial tcp 192.168.1.88:8080: connection refused

There are lots of messages like this one: 192.168.1.88:8080: connection refused

When I look at the registered nodes:

$ kubectl get nodes
NAME      LABELS    STATUS

My guess is that the apiserver credentials are not configured correctly, since the local kubelet can register but the remote one cannot.

So here is my apiserver configuration:

 $ cat /etc/kubernetes/manifests/kube-apiserver.yml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver

spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: gcr.io/google_containers/hyperkube:v1.0.6
    command:
    - /hyperkube
    - apiserver
    - --bind-address=0.0.0.0
    - --etcd_servers=http://192.168.1.88:2379
    - --allow-privileged=true
    - --service-cluster-ip-range=10.3.0.0/24
    - --secure_port=443
    - --advertise-address=192.168.1.88
    - --admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota
    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --client-ca-file=/etc/kubernetes/ssl/ca.pem
    - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
    - --cloud-provider=
    ports:
    - containerPort: 443
      hostPort: 443
      name: https
    - containerPort: 7080
      hostPort: 7080
      name: http
    - containerPort: 8080
      hostPort: 8080
      name: local
    volumeMounts:
    - mountPath: /etc/kubernetes/ssl
      name: ssl-certs-kubernetes
      readOnly: true
    - mountPath: /etc/ssl/certs
      name: ssl-certs-host
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/ssl
    name: ssl-certs-kubernetes
  - hostPath:
      path: /usr/share/ca-certificates
    name: ssl-certs-host

The certificates are present:

core@master1 ~ $ ls -l /etc/kubernetes/ssl/
total 40
-rw-r--r-- 1 core core 1675 Sep 17 09:31 apiserver-key.pem
-rw-r--r-- 1 core core 1099 Sep 17 09:31 apiserver.pem
-rw-r--r-- 1 core core 1090 Sep 17 09:31 ca.pem

And the logs from the apiserver:

I0917 09:33:48.692147       1 plugins.go:69] No cloud provider specified.
I0917 09:33:49.049701       1 master.go:273] Node port range unspecified. Defaulting to 30000-32767.
E0917 09:33:49.080829       1 reflector.go:136] Failed to list *api.ResourceQuota: Get http://127.0.0.1:8080/api/v1/resourcequotas: dial tcp 127.0.0.1:8080: connection refused
E0917 09:33:49.080955       1 reflector.go:136] Failed to list *api.Secret: Get http://127.0.0.1:8080/api/v1/secrets?fieldSelector=type%3Dkubernetes.io%2Fservice-account-token: dial tcp 127.0.0.1:8080: connection refused
E0917 09:33:49.081032       1 reflector.go:136] Failed to list *api.ServiceAccount: Get http://127.0.0.1:8080/api/v1/serviceaccounts: dial tcp 127.0.0.1:8080: connection refused
E0917 09:33:49.081075       1 reflector.go:136] Failed to list *api.LimitRange: Get http://127.0.0.1:8080/api/v1/limitranges: dial tcp 127.0.0.1:8080: connection refused
E0917 09:33:49.081141       1 reflector.go:136] Failed to list *api.Namespace: Get http://127.0.0.1:8080/api/v1/namespaces: dial tcp 127.0.0.1:8080: connection refused
E0917 09:33:49.081186       1 reflector.go:136] Failed to list *api.Namespace: Get http://127.0.0.1:8080/api/v1/namespaces: dial tcp 127.0.0.1:8080: connection refused
[restful] 2015/09/17 09:33:49 log.go:30: [restful/swagger] listing is available at https://192.168.1.88:443/swaggerapi/
[restful] 2015/09/17 09:33:49 log.go:30: [restful/swagger] https://192.168.1.88:443/swaggerui/ is mapped to folder /swagger-ui/
W0917 09:33:49.132239       1 controller.go:212] Resetting endpoints for master service "kubernetes" to &{{ } {kubernetes  default    0 0001-01-01 00:00:00 +0000 UTC <nil> map[] map[]} [{[{192.168.1.88 <nil>}] [{ 443 TCP}]}]}
I0917 09:33:49.148355       1 server.go:441] Serving securely on 0.0.0.0:443
I0917 09:33:49.148404       1 server.go:483] Serving insecurely on 127.0.0.1:8080
-- Thibault Deheurles
coreos
kubernetes

1 Answer

9/17/2015

Per the last two lines of your apiserver log, it is listening on 0.0.0.0 (all interfaces) on port 443, and 127.0.0.1 (localhost) on port 8080.

From the log output of your kubelet, you have it trying to reach the apiserver on 192.168.1.88:8080 (which it is not listening on).

Remote kubelets should use "https://192.168.1.88" (the public interface, via port 443) to connect to the apiserver.

Depending on your TLS configuration, you will likely also need to configure a kubeconfig for the kubelet that uses the proper TLS certificates; this is covered in: https://coreos.com/kubernetes/docs/latest/deploy-workers.html#set-up-kubeconfig
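
As a sketch, a worker kubeconfig along the lines of that guide looks like this (the worker.pem / worker-key.pem file names are the guide's examples; substitute the paths of the certificates you actually generated):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    certificate-authority: /etc/kubernetes/ssl/ca.pem
users:
- name: kubelet
  user:
    client-certificate: /etc/kubernetes/ssl/worker.pem
    client-key: /etc/kubernetes/ssl/worker-key.pem
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context
```

The node kubelet would then point at the secure port with something like --api-servers=https://192.168.1.88 and --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml (the kubeconfig path here is an example).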

-- Aaron Levy
Source: StackOverflow