I've followed the guide Kubernetes The Hard Way and its AWS adaptation, Kubernetes The Hard Way - AWS.
Everything runs fine with the DNS add-on and even the dashboard, as explained here.
But if I create a LoadBalancer service, it doesn't work, because cloud-controller-manager isn't deployed (neither as a master component nor as a DaemonSet).
I read https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/ to get some information on how to deploy it, but when I apply the required changes (on the kubelet: --cloud-provider=external; see the unit excerpt after the manifest below) and deploy this DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: cloud-controller-manager
  name: cloud-controller-manager
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: cloud-controller-manager
  template:
    metadata:
      labels:
        k8s-app: cloud-controller-manager
    spec:
      serviceAccountName: cloud-controller-manager
      containers:
      - name: cloud-controller-manager
        image: k8s.gcr.io/cloud-controller-manager:v1.8.0
        command:
        - /usr/local/bin/cloud-controller-manager
        - --cloud-provider=aws
        - --leader-elect=true
        - --use-service-account-credentials
        - --allocate-node-cidrs=true
        - --configure-cloud-routes=true
        - --cluster-cidr=${CLUSTERCIRD}
      tolerations:
      - key: node.cloudprovider.kubernetes.io/uninitialized
        value: "true"
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      nodeSelector:
        node-role.kubernetes.io/master: ""
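For reference, this is roughly how I added the flag on the workers (the kubelet runs as a systemd unit in the hard-way setup; the paths below are the guide's defaults, so adjust them to your layout):
# excerpt from /etc/systemd/system/kubelet.service on each worker
# (only --cloud-provider=external is new, everything else is left as the guide sets it)
[Service]
ExecStart=/usr/local/bin/kubelet \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --cloud-provider=external \
  <remaining flags unchanged from the guide>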
The instances (controllers and workers) all have the correct IAM roles.
I can't even create a pod; the status stays "Pending"...
Do you know how to deploy cloud-controller-manager as a DaemonSet or as a master component (without using kops, kubeadm, ...) on an AWS cluster?
Do you know a guide that could help me with that?
Could you give an example of a cloud-controller-manager DaemonSet configuration?
Thanks in advance
UPDATE
When executing kubectl get nodes, I get No resources found.
And when describing a launched pod, I get only one event:
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  28s (x2 over 28s)  default-scheduler  no nodes available to schedule pods
The question now is: how do I get nodes Ready with cloud-controller-manager deployed for AWS?
Forget about cloud-controller-manager; you don't seem to have a functioning Kubernetes cluster to run it on!
Kubernetes tells you exactly that, but you ignored it...
No offense, but if you aren't experienced with Kubernetes, maybe you shouldn't try to follow a guide called Kubernetes The Hard Way (it failed, and you haven't provided any information for me to point out exactly why or how), but should use kops or kubeadm instead?
I had the same issue trying to set cloud-provider with GCE. I solved the problem by adding the following flags to kube-apiserver.service, kubelet.service, and kube-controller-manager.service.
--cloud-provider=gce \
--cloud-config=/var/lib/gce.conf \
The gce.conf file was based on the JSON key file generated for the Google IAM service account, but in Gcfg format. I'm sure AWS has something similar. The format looks like this:
[Global]
type = xxx
project-id = xxx
private-key-id = xxx
private-key = xxx
client-email = xxx
client-id = xxx
auth-uri = xxx
token-uri = xxx
auth-provider-x509-cert-url = xxx
client-x509-cert-url = xxx
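I haven't used it on AWS myself, but the in-tree AWS provider reads a similar Gcfg-style file via --cloud-config; treat the key names below as assumptions to verify against the AWS provider documentation for your version:
[Global]
Zone = us-east-1a
VPC = vpc-xxxxxxxx
SubnetID = subnet-xxxxxxxx
KubernetesClusterTag = my-cluster
KubernetesClusterID = my-cluster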
For more info, see the K8s documentation on cloud-provider.
As samhain1138 mentioned, your cluster does not look healthy enough to install anything on. In simple cases it can be fixed, but sometimes it is better to reinstall everything.
Let's try to investigate the problem.
First of all, check the state of your master node. At a minimum, it should have a kubelet service running.
Check the kubelet log for errors:
$ journalctl -u kubelet
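If the unit isn't running at all, systemctl will tell you; and to narrow down a noisy log, something like this usually helps:
$ systemctl status kubelet
$ journalctl -u kubelet --since "1 hour ago" | grep -iE 'error|fail'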
Next, check the state of your static pods. You can find a list of them in the /etc/kubernetes/manifests directory:
$ ls /etc/kubernetes/manifests
etcd.yaml
kube-apiserver.yaml
kube-controller-manager.yaml
kube-scheduler.yaml
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5cbdc1c13c25 8a7739f672b4 "/sidecar --v=2 --..." 2 weeks ago Up 2 weeks k8s_sidecar_kube-dns-86c47599bd-l7d6m_kube-system_...
bd96ffafdfa6 6816817d9dce "/dnsmasq-nanny -v..." 2 weeks ago Up 2 weeks k8s_dnsmasq_kube-dns-86c47599bd-l7d6m_kube-system_...
69931b5b4cf9 55ffe31ac578 "/kube-dns --domai..." 2 weeks ago Up 2 weeks k8s_kubedns_kube-dns-86c47599bd-l7d6m_kube-system_...
60885aeffc05 k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_kube-dns-86c47599bd-l7d6m_kube-system_...
93144593660c 9f355e076ea7 "/install-cni.sh" 2 weeks ago Up 2 weeks k8s_install-cni_calico-node-nxljq_kube-system_...
b55f57529671 7eca10056c8e "start_runit" 2 weeks ago Up 2 weeks k8s_calico-node_calico-node-nxljq_kube-system_...
d8767b9c07c8 46a3cd725628 "/usr/local/bin/ku..." 2 weeks ago Up 2 weeks k8s_kube-proxy_kube-proxy-lf8gd_kube-system_...
f924cefb953f k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_calico-node-nxljq_kube-system_...
09ceddabdeb9 k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_kube-proxy-lf8gd_kube-system_...
9fc90839bb6f 821507941e9c "kube-apiserver --..." 2 weeks ago Up 2 weeks k8s_kube-apiserver_kube-apiserver-kube-master_kube-system_...
8ea410ce00a6 b8df3b177be2 "etcd --advertise-..." 2 weeks ago Up 2 weeks k8s_etcd_etcd-kube-master_kube-system_...
dd7f9b381e4f 38521457c799 "kube-controller-m..." 2 weeks ago Up 2 weeks k8s_kube-controller-manager_kube-controller-manager-kube-master_kube-system_...
f6681365bea8 37a1403e6c1a "kube-scheduler --..." 2 weeks ago Up 2 weeks k8s_kube-scheduler_kube-scheduler-kube-master_kube-system_...
0638e47ec57e k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_etcd-kube-master_kube-system_...
5bbe35abb3a3 k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_kube-controller-manager-kube-master_kube-system_...
2dc6ee716bb4 k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_kube-scheduler-kube-master_kube-system_...
b15dfc9f089a k8s.gcr.io/pause:3.1 "/pause" 2 weeks ago Up 2 weeks k8s_POD_kube-apiserver-kube-master_kube-system_...
You can see the detailed description of any pod’s container using the command:
$ docker inspect <container_id>
Or check the logs:
$ docker logs <container_id>
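For example, to look at the last lines of the API server container (the name comes from the k8s_kube-apiserver_... entry in the docker ps output above):
$ docker ps -f name=kube-apiserver
$ docker logs --tail 50 <container_id>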
This should be enough to understand what to do next: either try to fix the cluster, or tear everything down and start from the beginning.
To simplify the process of provisioning a Kubernetes cluster, you could use kubeadm as follows:
# This instruction is for ubuntu VMs, if you use CentOS, the commands will be
# slightly different.
### These steps are the same for the master and the worker nodes
# become root
$ sudo su
# add repository and keys
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
# install components
$ apt-get update
$ apt-get -y install ebtables ethtool docker.io apt-transport-https kubelet kubeadm kubectl
# adjust sysctl settings
$ cat <<EOF >>/etc/ufw/sysctl.conf
net/ipv4/ip_forward = 1
net/bridge/bridge-nf-call-ip6tables = 1
net/bridge/bridge-nf-call-iptables = 1
net/bridge/bridge-nf-call-arptables = 1
EOF
$ sysctl --system
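# (optional check, added as a suggestion) confirm the sysctls took effect;
# the net.bridge.* keys only exist once the br_netfilter module is loaded
$ sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables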
### Next steps are for the master node only.
# Create Kubernetes cluster
$ kubeadm init --pod-network-cidr=192.168.0.0/16
or if you want to use older KubeDNS instead of CoreDNS:
$ kubeadm init --pod-network-cidr=192.168.0.0/16 --feature-gates=CoreDNS=false
# Configure kubectl
$ mkdir -p $HOME/.kube
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ chown $(id -u):$(id -g) $HOME/.kube/config
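# (optional) quick sanity check that kubectl can reach the new control plane
$ kubectl cluster-info
$ kubectl get pods -n kube-system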
# install Calico network
$ kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
# or install Flannel (not both)
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Untaint the master and/or join other nodes:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
# run on master if you forgot the join command:
$ kubeadm token create --print-join-command
# run the command printed in the previous step on the worker node to join it to the existing cluster.
# At this point you should have a ready-to-use Kubernetes cluster.
$ kubectl get nodes -o wide
$ kubectl get pods,svc,deployments,daemonsets --all-namespaces
After recovering the cluster, could you try to install cloud-controller-manager again and share the results?
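One more thing to double-check when you redeploy it: the DaemonSet in your question references serviceAccountName: cloud-controller-manager, so that ServiceAccount and a binding granting it sufficient permissions must exist first. A minimal sketch for debugging purposes (the upstream docs ship a more fine-grained RBAC manifest, which you should prefer for anything real):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: cloud-controller-manager
  namespace: kube-system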