Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead

5/9/2019

I have a hand-built Kubernetes 1.11.4 cluster on CentOS, running as AWS EC2 instances, with 1 master and 1 minion. The cluster is very stable. I want to deploy JupyterHub into the cluster. The docs here and here call out some details for provisioning EFS; I elected to go with EBS instead.

The pvc fails with:

Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l

Below is the StorageClass definition:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
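
For reference, the is-default-class annotation above is what makes gp2 show up with a "(default)" marker when listing classes:

$ kubectl get storageclass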

The pv yaml:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: jupyterhub-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 30Gi
  accessModes:
    - ReadWriteMany
  awsElasticBlockStore:
    volumeID: vol-0ddb700735db435c7
    fsType: ext4

The pvc yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: jupyterhub-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

$ kubectl -n jhub describe pvc hub-db-dir

returns:

Name:          hub-db-dir
Namespace:     jhub
StorageClass:  standard  <========from an earlier try
Status:        Pending
Volume:
Labels:        app=jupyterhub
               chart=jupyterhub-0.8.2
               component=hub
               heritage=Tiller
               release=jhub
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason              Age                     From                         Message
  ----       ------              ----                    ----                         -------
  Warning    ProvisioningFailed  110s (x106 over 3h43m)  persistentvolume-controller  Failed to provision volume with StorageClass "standard": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l

To me, this looks like the pod attempting to mount the storage and failing, and isolating the error has been a challenge. I tried patching the pvc to update its storage class to gp2, which is now marked as default but was not at the time I deployed the pvc. The patch had no effect:

$ kubectl -n jhub patch pvc hub-db-dir -p '{"spec":{"StorageClass":"gp2"}}'
persistentvolumeclaim/hub-db-dir patched (no change)
$ kubectl -n jhub describe pvc hub-db-dir
Name:          hub-db-dir
Namespace:     jhub
StorageClass:  standard  <====== Not changed
Status:        Pending
Volume:
Labels:        app=jupyterhub
               chart=jupyterhub-0.8.2
               component=hub
               heritage=Tiller
               release=jhub
Annotations:   volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Events:
  Type       Reason              Age                      From                         Message
  ----       ------              ----                     ----                         -------
  Warning    ProvisioningFailed  2m26s (x108 over 3h48m)  persistentvolume-controller  Failed to provision volume with StorageClass "standard": Failed to get AWS Cloud Provider. GetCloudProvider returned <nil> instead
Mounted By:  hub-76ffd7d94b-dmj8l
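
In hindsight, the patch was probably a no-op because the JSON key should be spec.storageClassName (camelCase), not StorageClass, and unrecognized fields in a patch tend to be dropped silently, which would explain the "(no change)" output. A corrected patch would presumably look like the line below, although spec.storageClassName is normally immutable once the claim exists, so deleting and recreating the PVC (or redeploying the chart now that gp2 is the default class) may be the only workable route:

$ kubectl -n jhub patch pvc hub-db-dir -p '{"spec":{"storageClassName":"gp2"}}'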

The JupyterHub deployment is managed by Helm/Tiller, so whenever changes are made, I use the following to update the pods:

$ helm upgrade jhub jupyterhub/jupyterhub --version=0.8.2 -f config.yaml

The relevant sections in the config.yaml file that allocate user storage are:

proxy:
  secretToken: "<random value>"
singleuser:
  cloudMetadata:
    enabled: true
singleuser:
  storage:
    dynamic:
      storageClass: gp2
singleuser:
  storage:
    extraVolumes:
      - name: jupyterhub-pv
        persistentVolumeClaim:
          claimName: jupyterhub-pvc
    extraVolumeMounts:
      - name: jupyterhub-pv
        mountPath: /home/shared
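
One thing I noticed while writing this up: if these three stanzas live in the same config.yaml, the duplicate top-level singleuser: keys will either be rejected or silently override one another, depending on the YAML parser, so presumably they are meant to be merged under a single key, roughly like this:

singleuser:
  cloudMetadata:
    enabled: true
  storage:
    dynamic:
      storageClass: gp2
    extraVolumes:
      - name: jupyterhub-pv
        persistentVolumeClaim:
          claimName: jupyterhub-pvc
    extraVolumeMounts:
      - name: jupyterhub-pv
        mountPath: /home/shared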

Part of the troubleshooting has also focused on letting the cluster know that its resources are provisioned by AWS. To that end, I have in the kubelet systemd drop-in file:

/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

the line:

Environment="KUBELET_EXTRA_ARGS=--cloud-provider=aws --cloud-config=/etc/kubernetes/cloud-config.conf

where /etc/kubernetes/cloud-config.conf contains:

[Global]
KubernetesClusterTag=kubernetes
KubernetesClusterID=kubernetes
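
For completeness, the kubelet has to be restarted after these edits for the flag to take effect; the usual systemd commands, plus a quick check that the running process actually picked up --cloud-provider=aws:

$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
$ ps aux | grep kubelet | grep cloud-provider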

In the files kube-controller-manager.yaml and kube-apiserver.yaml, I added the line:

- --cloud-provider=aws
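
Those are the static pod manifests under /etc/kubernetes/manifests/ in a kubeadm-built cluster, and the flag goes into each container's command list; the kubelet re-creates the pods automatically when a manifest changes. An illustrative excerpt for the controller manager, with other flags omitted:

spec:
  containers:
  - command:
    - kube-controller-manager
    - --cloud-provider=aws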

I have not yet tagged any AWS resources, but will start doing it based on this.
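
As I understand it, the in-tree AWS provider discovers resources through cluster tags, so the EC2 instances (and typically the subnets and security groups) need a tag matching the cluster ID used in cloud-config.conf. Something along these lines, where the instance ID is just a placeholder:

$ aws ec2 create-tags \
    --resources i-0123456789abcdef0 \
    --tags Key=kubernetes.io/cluster/kubernetes,Value=owned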

What are my next steps for troubleshooting?

Thanks!

-- RandyB
jupyterhub
kubernetes

1 Answer

5/9/2019

Maybe this link can help?

You must have the --cloud-provider=aws flag added to the kubelet before adding the node to the cluster. Key to the AWS integration is a particular field on the Node object, the .spec.providerID field, and that field only gets populated if the flag is present when the node is added to the cluster. If you add a node to the cluster and then add the command-line flag afterward, this field/value won't get populated and the integration won't work as expected. No error is surfaced in this situation (at least, not that I've been able to find).

If you do find yourself with a missing .spec.providerID field on the Node object, you can add it with a kubectl edit node command. The format of the value for this field is aws:///<az-of-instance>/<instance-id>.
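
A quick, illustrative way to check whether the field is populated (the zone and instance ID in the comment below are placeholders):

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID
# If PROVIDER_ID is empty, run "kubectl edit node <node-name>" and add under spec:
#   providerID: aws:///us-east-1a/i-0123456789abcdef0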

-- Vasily Angapov
Source: StackOverflow