I am trying to deploy an etcd + flanneld + Kubernetes cluster on CentOS machines. etcd and flanneld are running fine, but Kubernetes is not.
My environment:
coreos05: CentOS7 - 192.168.0.114
coreos08: CentOS7 - 192.168.2.57
ETCD:
[root@coreos05 ~]# etcdctl -C 192.168.0.114:4001 member list
e83ffc60b9b71862: name=coreos05 peerURLs=http://coreos05:2380,http://coreos05:7001 clientURLs=http://192.168.0.114:2379,http://192.168.0.114:4001
f877fb31ab0f7105: name=coreos08 peerURLs=http://coreos08:2380,http://coreos08:7001 clientURLs=http://192.168.2.57:2379,http://192.168.2.57:4001
[root@coreos05 ~]# etcdctl -C 192.168.2.57:4001 member list
e83ffc60b9b71862: name=coreos05 peerURLs=http://coreos05:2380,http://coreos05:7001 clientURLs=http://192.168.0.114:2379,http://192.168.0.114:4001
f877fb31ab0f7105: name=coreos08 peerURLs=http://coreos08:2380,http://coreos08:7001 clientURLs=http://192.168.2.57:2379,http://192.168.2.57:4001
INTER-NODE COMMUNICATION BETWEEN ETCD AND FLANNELD:
[root@coreos05 ~]# netstat -putona | egrep 'etcd|flanneld' |grep 2.57
tcp 0 0 192.168.0.114:4001 192.168.2.57:42996 ESTABLISHED 16288/etcd keepalive (14,65/0/0)
tcp 0 0 192.168.0.114:2380 192.168.2.57:32817 ESTABLISHED 16288/etcd off (0.00/0/0)
[root@coreos05 ~]#
STATUS OF ALL SERVICES IN THE MASTER:
[root@coreos05 ~]# for SERVICES in etcd flanneld kube-apiserver kube-controller-manager kube-scheduler; do systemctl status $SERVICES ; done
etcd.service - Etcd Server
Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled)
Active: active (running) since mar 2015-05-12 11:54:16 CEST; 33min ago
Main PID: 16590 (etcd)
CGroup: /system.slice/etcd.service
└─16590 /usr/bin/etcd
may 12 11:54:16 coreos05 etcd[16590]: 2015/05/12 11:54:16 raft: e83ffc60b9b71862 became follower at term 46
may 12 11:54:16 coreos05 etcd[16590]: 2015/05/12 11:54:16 raft: newRaft e83ffc60b9b71862 [peers: [], term: 46, commit: 5235, applied: 0, lastindex: 5235, lastterm: 46]
may 12 11:54:16 coreos05 etcd[16590]: 2015/05/12 11:54:16 etcdserver: added local member e83ffc60b9b71862 [http://coreos05:2380 http://coreos05:7001] to cluster 85bb0f76f652d0f6
may 12 11:54:16 coreos05 etcd[16590]: 2015/05/12 11:54:16 etcdserver: added member f877fb31ab0f7105 [http://coreos08:2380 http://coreos08:7001] to cluster 85bb0f76f652d0f6
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 raft: e83ffc60b9b71862 [term: 46] received a MsgVote message with higher term from f877fb31ab0f7105 [term: 47]
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 raft: e83ffc60b9b71862 became follower at term 47
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 raft: e83ffc60b9b71862 [logterm: 46, index: 5235, vote: 0] voted for f877fb31ab0f7105 [logterm: 46, index: 5235] at term 47
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 raft.node: e83ffc60b9b71862 elected leader f877fb31ab0f7105 at term 47
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 rafthttp: starting client stream to f877fb31ab0f7105 at term 47
may 12 11:54:17 coreos05 etcd[16590]: 2015/05/12 11:54:17 etcdserver: published {Name:coreos05 ClientURLs:[http://192.168.0.114:2379 http://192.168.0.114:4001]} to cluster 85bb0f76f652d0f6
flanneld.service - Flanneld overlay address etcd agent
Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled)
Active: active (running) since mar 2015-05-12 11:54:17 CEST; 33min ago
Main PID: 16611 (flanneld)
CGroup: /system.slice/flanneld.service
└─16611 /usr/bin/flanneld -etcd-endpoints=http://192.168.0.114:4001 -etcd-prefix=/kuberdock/network/ --iface=enp3s0
may 12 11:54:17 coreos05 systemd[1]: Starting Flanneld overlay address etcd agent...
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.024119 16611 main.go:247] Installing signal handlers
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.025078 16611 main.go:205] Using 192.168.0.114 as external interface
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.868493 16611 subnet.go:83] Subnet lease acquired: 10.10.93.0/24
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.869081 16611 main.go:215] UDP mode initialized
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.869106 16611 udp.go:239] Watching for new subnet leases
may 12 11:54:17 coreos05 flanneld[16611]: I0512 11:54:17.871602 16611 udp.go:264] Subnet added: 10.10.65.0/24
may 12 11:54:17 coreos05 systemd[1]: Started Flanneld overlay address etcd agent.
kube-apiserver.service - Kubernetes API Server
Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; enabled)
Drop-In: /etc/systemd/system/kube-apiserver.service.d
└─pre-start.conf
Active: active (running) since mar 2015-05-12 11:54:17 CEST; 33min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 16690 (kube-apiserver)
CGroup: /system.slice/kube-apiserver.service
└─16690 /usr/bin/kube-apiserver --logtostderr=true --v=0 --etcd_servers=http://coreos05:4001 --address=0.0.0.0 --port=8080 --kubelet_port=10250 --allow_privileged=false --portal_net=10.10.0.0/16 --admission_control=Namespac...
may 12 11:54:17 coreos05 kube-apiserver[16690]: E0512 11:54:17.985524 16690 reflector.go:123] Failed to list *api.Namespace: Get http://0.0.0.0:8080/api/v1beta3/namespaces: dial tcp 0.0.0.0:8080: connection refused
may 12 11:54:17 coreos05 kube-apiserver[16690]: I0512 11:54:17.986149 16690 master.go:236] Will report 192.168.0.114 as public IP address.
may 12 11:54:17 coreos05 kube-apiserver[16690]: E0512 11:54:17.987132 16690 reflector.go:123] Failed to list *api.LimitRange: Get http://0.0.0.0:8080/api/v1beta3/limitranges: dial tcp 0.0.0.0:8080: connection refused
may 12 11:54:17 coreos05 kube-apiserver[16690]: E0512 11:54:17.987437 16690 reflector.go:123] Failed to list *api.ResourceQuota: Get http://0.0.0.0:8080/api/v1beta3/resourcequotas: dial tcp 0.0.0.0:8080: connection refused
may 12 11:54:18 coreos05 kube-apiserver[16690]: [restful] 2015/05/12 11:54:18 log.go:30: [restful/swagger] listing is available at https://192.168.0.114:6443/swaggerapi/
may 12 11:54:18 coreos05 kube-apiserver[16690]: [restful] 2015/05/12 11:54:18 log.go:30: [restful/swagger] https://192.168.0.114:6443/swaggerui/ is mapped to folder /swagger-ui/
may 12 11:54:18 coreos05 kube-apiserver[16690]: I0512 11:54:18.093361 16690 server.go:353] Serving read-only insecurely on 0.0.0.0:7080
may 12 11:54:18 coreos05 kube-apiserver[16690]: I0512 11:54:18.093784 16690 server.go:390] Serving securely on 0.0.0.0:6443
may 12 11:54:18 coreos05 kube-apiserver[16690]: I0512 11:54:18.100679 16690 server.go:418] Serving insecurely on 0.0.0.0:8080
may 12 11:54:18 coreos05 kube-apiserver[16690]: I0512 11:54:18.925329 16690 server.go:400] Using self-signed cert (/var/run/kubernetes/apiserver.crt, /var/run/kubernetes/apiserver.key)
kube-controller-manager.service - Kubernetes Controller Manager
Loaded: loaded (/usr/lib/systemd/system/kube-controller-manager.service; enabled)
Active: active (running) since mar 2015-05-12 11:54:18 CEST; 33min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 16714 (kube-controller)
CGroup: /system.slice/kube-controller-manager.service
└─16714 /usr/bin/kube-controller-manager --logtostderr=true --v=0 --machines=coreos08
may 12 12:26:48 coreos05 kube-controller-manager[16714]: I0512 12:26:48.282325 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:26:48.282313291 +0200 CEST is later than 2015-05-12 12:26:48.282311109 +0200 CEST + 4m20s
may 12 12:26:53 coreos05 kube-controller-manager[16714]: I0512 12:26:53.468254 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:26:53.468242266 +0200 CEST is later than 2015-05-12 12:26:53.468240541 +0200 CEST + 4m20s
may 12 12:26:58 coreos05 kube-controller-manager[16714]: I0512 12:26:58.677179 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:26:58.677166286 +0200 CEST is later than 2015-05-12 12:26:58.67716449 +0200 CEST + 4m20s
may 12 12:27:03 coreos05 kube-controller-manager[16714]: I0512 12:27:03.778387 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:03.778376111 +0200 CEST is later than 2015-05-12 12:27:03.778374466 +0200 CEST + 4m20s
may 12 12:27:08 coreos05 kube-controller-manager[16714]: I0512 12:27:08.879548 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:08.879537205 +0200 CEST is later than 2015-05-12 12:27:08.879535608 +0200 CEST + 4m20s
may 12 12:27:13 coreos05 kube-controller-manager[16714]: I0512 12:27:13.980986 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:13.980974374 +0200 CEST is later than 2015-05-12 12:27:13.980972639 +0200 CEST + 4m20s
may 12 12:27:19 coreos05 kube-controller-manager[16714]: I0512 12:27:19.574960 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:19.574947254 +0200 CEST is later than 2015-05-12 12:27:19.574945586 +0200 CEST + 4m20s
may 12 12:27:24 coreos05 kube-controller-manager[16714]: I0512 12:27:24.699798 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:24.699787548 +0200 CEST is later than 2015-05-12 12:27:24.699785704 +0200 CEST + 4m20s
may 12 12:27:29 coreos05 kube-controller-manager[16714]: I0512 12:27:29.876981 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:29.876968588 +0200 CEST is later than 2015-05-12 12:27:29.876966413 +0200 CEST + 4m20s
may 12 12:27:34 coreos05 kube-controller-manager[16714]: I0512 12:27:34.988483 16714 nodecontroller.go:504] Evicting pods2: 2015-05-12 12:27:34.988471519 +0200 CEST is later than 2015-05-12 12:27:34.988469853 +0200 CEST + 4m20s
kube-scheduler.service - Kubernetes Scheduler Plugin
Loaded: loaded (/usr/lib/systemd/system/kube-scheduler.service; enabled)
Active: active (running) since mar 2015-05-12 11:54:18 CEST; 33min ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 16734 (kube-scheduler)
CGroup: /system.slice/kube-scheduler.service
└─16734 /usr/bin/kube-scheduler --logtostderr=true --v=0
may 12 11:54:18 coreos05 systemd[1]: kube-scheduler.service: main process exited, code=exited, status=2/INVALIDARGUMENT
may 12 11:54:18 coreos05 systemd[1]: Unit kube-scheduler.service entered failed state.
may 12 11:54:18 coreos05 systemd[1]: Starting Kubernetes Scheduler Plugin...
may 12 11:54:18 coreos05 systemd[1]: Started Kubernetes Scheduler Plugin.
may 12 11:54:18 coreos05 kube-scheduler[16734]: W0512 11:54:18.139880 16734 server.go:83] Neither --kubeconfig nor --master was specified. Using default API client. This might not work.
may 12 12:09:18 coreos05 kube-scheduler[16734]: E0512 12:09:18.150197 16734 reflector.go:158] watch of *api.Service ended with: very short watch
may 12 12:09:18 coreos05 kube-scheduler[16734]: E0512 12:09:18.156710 16734 reflector.go:158] watch of *api.Node ended with: very short watch
may 12 12:24:19 coreos05 kube-scheduler[16734]: E0512 12:24:19.154734 16734 reflector.go:158] watch of *api.Service ended with: very short watch
may 12 12:24:19 coreos05 kube-scheduler[16734]: E0512 12:24:19.160947 16734 reflector.go:158] watch of *api.Node ended with: very short watch
NOTE the errors in kube-apiserver.service:
Failed to list *api.Namespace: Get http://0.0.0.0:8080/api/v1beta3/namespaces: dial tcp 0.0.0.0:8080: connection refused
KUBECTL NODE STATUS:
[root@coreos05 ~]# kubectl get node
NAME LABELS STATUS
coreos08 <none> NotReady
Why are the nodes not ready?
Check the following entries for Kubernetes:
cd /etc/kubernetes && vim config
KUBE_MASTER="--master=http://<master_hostname>:8080"
KUBE_ETCD_SERVERS="--etcd-servers=http://<master_hostname>:2379"
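With coreos05 as the master, the filled-in file would presumably look like the following (a sketch for this environment; note the apiserver above was started with --etcd_servers=http://coreos05:4001, and the etcd member list shows client URLs on both 2379 and 4001, so either client port should work):
# /etc/kubernetes/config (assumed values for this environment)
KUBE_MASTER="--master=http://coreos05:8080"
KUBE_ETCD_SERVERS="--etcd-servers=http://coreos05:4001"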
I can think of three possible causes for this issue. Whenever I have seen one or more nodes in the NotReady state, it has been because the DNS was not configured properly, the network plugin was not running, or the kubelet was not running. Since you are using flannel for networking and it is running properly, the problem is likely one of the other two. Try the following to pinpoint the issue:
To check whether DNS is properly configured, run the following command:
kubectl get pods -o wide --all-namespaces
Check whether the pods named kube-dns or coredns are running. If they are not, this is the problem.
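If a DNS pod exists but is not healthy, its logs usually show why. A minimal sketch, assuming the DNS add-on runs in the kube-system namespace; <dns-pod-name> is a placeholder for the actual pod name from the listing:
# List the add-on pods and check their STATUS column:
kubectl get pods --namespace=kube-system
# Print the logs of the DNS pod (substitute the real name):
kubectl logs --namespace=kube-system <dns-pod-name>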
To check whether the kubelet is running properly, run the following command on the node:
systemctl status kubelet
If it is not running, we have found the problem; look at the kubelet logs (sketched below) and fix whatever they report.
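To inspect those logs, something like the following should work, assuming the kubelet runs as a systemd unit named kubelet (as on a CentOS setup like this one):
# Show the 50 most recent kubelet entries from the systemd journal on the node (coreos08 here):
journalctl -u kubelet --no-pager -n 50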
Looking at your configuration, it seems you are trying to run two Kubernetes masters?
Please read the documentation; the getting started guides are a good place to begin: https://github.com/GoogleCloudPlatform/kubernetes/tree/master/docs/getting-started-guides