Kubernetes The hard way on AWS - Deploy and configure cloud-controller-manager

9/21/2018

I've worked through the guide Kubernetes The Hard Way and its adaptation for AWS, Kubernetes The Hard Way - AWS.

Everything runs fine with the DNS addon and even the dashboard as explained here.

But if I create a LoadBalancer service, it doesn't work because cloud-controller-manager isn't deployed (neither as a master component nor as a DaemonSet).
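
(The Service I'm testing with is nothing special; a minimal example, where the name, selector, and ports are just placeholders:)

apiVersion: v1
kind: Service
metadata:
  name: test-lb
spec:
  type: LoadBalancer
  selector:
    app: test-app
  ports:
  - port: 80
    targetPort: 8080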

I read https://kubernetes.io/docs/tasks/administer-cluster/running-cloud-controller/ to get some information on how to deploy it, but when I apply the required change on the kubelet (--cloud-provider=external; sketched just after the manifest below) and deploy this DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: cloud-controller-manager
  name: cloud-controller-manager
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: cloud-controller-manager
  template:
    metadata:
      labels:
        k8s-app: cloud-controller-manager
    spec:
      serviceAccountName: cloud-controller-manager
      containers:
      - name: cloud-controller-manager
        image: k8s.gcr.io/cloud-controller-manager:v1.8.0
        command:
        - /usr/local/bin/cloud-controller-manager
        - --cloud-provider=aws
        - --leader-elect=true
        - --use-service-account-credentials
        - --allocate-node-cidrs=true
        - --configure-cloud-routes=true
        - --cluster-cidr=${CLUSTERCIDR}
      tolerations:
      - key: node.cloudprovider.kubernetes.io/uninitialized
        value: "true"
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      nodeSelector:
        node-role.kubernetes.io/master: ""
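
(For reference, the kubelet change mentioned above is just an extra flag in the kubelet unit. A rough, abridged sketch of the relevant part of kubelet.service, assuming the systemd layout used by Kubernetes The Hard Way; the guide's other flags, such as the container runtime and CNI settings, stay exactly as the guide sets them:)

[Service]
ExecStart=/usr/local/bin/kubelet \
  --config=/var/lib/kubelet/kubelet-config.yaml \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --cloud-provider=external \
  --register-node=true \
  --v=2
Restart=on-failure
RestartSec=5

# reload and restart after editing the unit
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet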

The instances (controllers and workers) have all the right roles.

I can't even create a pod; the status stays "Pending"...

Do you know how to deploy cloud-controller-manager as a DaemonSet or as a master component (without using kops, kubeadm, ...) on an AWS cluster?

Do you know a guide that could help me with that?

Could you give an example of a cloud-controller-manager DaemonSet configuration?

Thanks in advance

UPDATE

When executing kubectl get nodes, I get No resources found.

And when describing a launched pod, I get only one event:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  28s (x2 over 28s)  default-scheduler  no nodes available to schedule pods

The question should now be: how do I get the nodes Ready with cloud-controller-manager deployed for AWS?

-- Fiftoine
amazon-web-services
cloud
kubernetes

3 Answers

9/27/2018

Forget about cloud-controller-manager for now: you don't seem to have a functioning Kubernetes cluster to run it on.
Kubernetes tells you exactly that, but you ignored it...

No offense, but if you aren't experienced with Kubernetes, maybe you shouldn't try to follow a guide called Kubernetes The Hard Way (it didn't work out, and you haven't provided enough information for me to point out exactly why or how); use kops or kubeadm instead?

-- samhain1138
Source: StackOverflow

12/19/2018

I had the same issue trying to set up the cloud provider with GCE. I solved the problem by adding the following flags to kube-apiserver.service, kubelet.service, and kube-controller-manager.service:

--cloud-provider=gce \
--cloud-config=/var/lib/gce.conf \

The gce.conf file is based on the JSON key file generated for a Google IAM service account, but in Gcfg format. I'm sure AWS has something similar (a guess at the AWS equivalent follows the example below). The format looks like this:

[Global]
type = xxx
project-id = xxx
private-key-id = xxx
private-key = xxx
client-email = xxx
client-id = xxx
auth-uri = xxx
token-uri = xxx
auth-provider-x509-cert-url = xxx
client-x509-cert-url = xxx
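
For AWS I would expect something along the same lines with --cloud-provider=aws. A minimal sketch, assuming the legacy in-tree AWS provider; note that the --cloud-config file is optional there, since most values are discovered from EC2 instance metadata and resource tags, and the [Global] key names below are only the commonly used ones, so check them against your Kubernetes version:

--cloud-provider=aws \
--cloud-config=/var/lib/cloud.conf \

[Global]
Zone = eu-west-1a
VPC = vpc-xxxxxxxx
SubnetID = subnet-xxxxxxxx
RouteTableID = rtb-xxxxxxxx
KubernetesClusterTag = kubernetes
KubernetesClusterID = kubernetes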

For more info, see the Kubernetes documentation on cloud providers.

-- ronen
Source: StackOverflow

9/27/2018

As samhain1138 mentioned, your cluster does not look healthy enough to install anything on it. In simple cases it can be fixed, but sometimes it is better to reinstall everything.

Let's try to investigate the problem.
First of all, check the state of your master node. At a minimum, it should have a kubelet service running.
Check the kubelet log for errors:

$ journalctl -u kubelet
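
If the log looks clean, also confirm the service itself is active (an extra check, assuming kubelet runs under systemd, as it does in most setups):

$ systemctl status kubelet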

Next, check the state of your static pods; you can find a list of them in the /etc/kubernetes/manifests directory (this assumes the control plane runs as static pods, as with kubeadm; in a Kubernetes The Hard Way setup the control plane components are systemd services instead, so check those with systemctl status):

$ ls /etc/kubernetes/manifests

etcd.yaml  
kube-apiserver.yaml  
kube-controller-manager.yaml  
kube-scheduler.yaml

$ docker ps

CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
5cbdc1c13c25        8a7739f672b4           "/sidecar --v=2 --..."   2 weeks ago         Up 2 weeks                              k8s_sidecar_kube-dns-86c47599bd-l7d6m_kube-system_...
bd96ffafdfa6        6816817d9dce           "/dnsmasq-nanny -v..."   2 weeks ago         Up 2 weeks                              k8s_dnsmasq_kube-dns-86c47599bd-l7d6m_kube-system_...
69931b5b4cf9        55ffe31ac578           "/kube-dns --domai..."   2 weeks ago         Up 2 weeks                              k8s_kubedns_kube-dns-86c47599bd-l7d6m_kube-system_...
60885aeffc05        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-dns-86c47599bd-l7d6m_kube-system_...
93144593660c        9f355e076ea7           "/install-cni.sh"        2 weeks ago         Up 2 weeks                              k8s_install-cni_calico-node-nxljq_kube-system_...
b55f57529671        7eca10056c8e           "start_runit"            2 weeks ago         Up 2 weeks                              k8s_calico-node_calico-node-nxljq_kube-system_...
d8767b9c07c8        46a3cd725628           "/usr/local/bin/ku..."   2 weeks ago         Up 2 weeks                              k8s_kube-proxy_kube-proxy-lf8gd_kube-system_...
f924cefb953f        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_calico-node-nxljq_kube-system_...
09ceddabdeb9        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-proxy-lf8gd_kube-system_...
9fc90839bb6f        821507941e9c           "kube-apiserver --..."   2 weeks ago         Up 2 weeks                              k8s_kube-apiserver_kube-apiserver-kube-master_kube-system_...
8ea410ce00a6        b8df3b177be2           "etcd --advertise-..."   2 weeks ago         Up 2 weeks                              k8s_etcd_etcd-kube-master_kube-system_...
dd7f9b381e4f        38521457c799           "kube-controller-m..."   2 weeks ago         Up 2 weeks                              k8s_kube-controller-manager_kube-controller-manager-kube-master_kube-system_...
f6681365bea8        37a1403e6c1a           "kube-scheduler --..."   2 weeks ago         Up 2 weeks                              k8s_kube-scheduler_kube-scheduler-kube-master_kube-system_...
0638e47ec57e        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_etcd-kube-master_kube-system_...
5bbe35abb3a3        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-controller-manager-kube-master_kube-system_...
2dc6ee716bb4        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-scheduler-kube-master_kube-system_...
b15dfc9f089a        k8s.gcr.io/pause:3.1   "/pause"                 2 weeks ago         Up 2 weeks                              k8s_POD_kube-apiserver-kube-master_kube-system_...

You can see the detailed description of any pod’s container using the command:

$ docker inspect <container_id>

Or check the logs:

$ docker logs <container_id>

This should be enough to understand what to do next: either try to fix the cluster, or tear everything down and start from the beginning.

To simplify the process of provisioning a Kubernetes cluster, you could use kubeadm as follows:

# This instruction is for ubuntu VMs, if you use CentOS, the commands will be
# slightly different.

### These steps are the same for the master and the worker nodes
# become root
$ sudo su

# add repository and keys
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -

$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF

# install components
$ apt-get update
$ apt-get -y install ebtables ethtool docker.io apt-transport-https kubelet kubeadm kubectl

# adjust sysctl settings
$ cat <<EOF >>/etc/ufw/sysctl.conf
net/ipv4/ip_forward = 1
net/bridge/bridge-nf-call-ip6tables = 1
net/bridge/bridge-nf-call-iptables = 1
net/bridge/bridge-nf-call-arptables = 1
EOF

$ sysctl --system

### Next steps are for the master node only.

# Create Kubernetes cluster
$ kubeadm init --pod-network-cidr=192.168.0.0/16
# or, if you want to use the older KubeDNS instead of CoreDNS:
$ kubeadm init --pod-network-cidr=192.168.0.0/16 --feature-gates=CoreDNS=false

# Configure kubectl
$ mkdir -p $HOME/.kube
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ chown $(id -u):$(id -g) $HOME/.kube/config

# install Calico network
$ kubectl apply -f https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/kubeadm/1.7/calico.yaml
# or install Flannel (not both)
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Untaint the master and/or join other nodes:
$ kubectl taint nodes --all node-role.kubernetes.io/master-

# run on master if you forgot the join command:
$ kubeadm token create --print-join-command

# run the command printed in the previous step on the worker node to join it to the existing cluster.

# At this point you should have a ready-to-use Kubernetes cluster.
$ kubectl get nodes -o wide
$ kubectl get pods,svc,deployments,daemonsets --all-namespaces

After recovering the cluster, could you try to install cloud-controller-manager again and share the results?
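
One thing worth checking before you retry: the DaemonSet in the question sets serviceAccountName: cloud-controller-manager, so that ServiceAccount (and a binding granting it permissions) has to exist in kube-system first, otherwise the pods will not be created. A minimal sketch; binding to cluster-admin is only the simplest thing that can work, and a proper setup should use a dedicated, narrower ClusterRole:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: cloud-controller-manager
  namespace: kube-system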

-- VAS
Source: StackOverflow