It's not so digital ocean specific, would be really nice to verify if this is an expected behavior or not.
I'm trying to setup ElasticSearch cluster on DO managed Kubernetes cluster with helm chart from ElasticSearch itself
And they say that I need to specify a storageClassName
in a volumeClaimTemplate
in order to use volume which is provided by managed kubernetes service. For DO it's do-block-storages
according to their docs. Also seems to be it's not necessary to define PVC, helm chart should do it itself.
Here's config I'm using
# Specify node pool
nodeSelector:
doks.digitalocean.com/node-pool: elasticsearch
# Shrink default JVM heap.
esJavaOpts: "-Xmx128m -Xms128m"
# Allocate smaller chunks of memory per pod.
resources:
requests:
cpu: "100m"
memory: "512M"
limits:
cpu: "1000m"
memory: "512M"
# Specify Digital Ocean storage
# Request smaller persistent volumes.
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: do-block-storage
resources:
requests:
storage: 10Gi
extraInitContainers: |
- name: create
image: busybox:1.28
command: ['mkdir', '/usr/share/elasticsearch/data/nodes/']
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-master
- name: file-permissions
image: busybox:1.28
command: ['chown', '-R', '1000:1000', '/usr/share/elasticsearch/']
volumeMounts:
- mountPath: /usr/share/elasticsearch/data
name: elasticsearch-master
Helm chart i'm setting with terraform, but it doesn't matter anyway, which way you'll do it:
resource "helm_release" "elasticsearch" {
name = "elasticsearch"
chart = "elastic/elasticsearch"
namespace = "elasticsearch"
values = [
file("charts/elasticsearch.yaml")
]
}
Here's what I've got when checking pod logs:
51s Normal Provisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-2"
2m28s Normal ExternalProvisioning persistentvolumeclaim/elasticsearch-master-elasticsearch-master-2 waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I'm pretty sure the problem is a volume. it should've been automagically provided by kubernetes. Describing persistent storage gives this:
holms@debian ~/D/c/s/b/t/s/post-infra> kubectl describe pvc elasticsearch-master-elasticsearch-master-0 --namespace elasticsearch
Name: elasticsearch-master-elasticsearch-master-0
Namespace: elasticsearch
StorageClass: do-block-storage
Status: Pending
Volume:
Labels: app=elasticsearch-master
Annotations: volume.beta.kubernetes.io/storage-provisioner: dobs.csi.digitalocean.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: elasticsearch-master-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Provisioning 4m57s (x176 over 14h) dobs.csi.digitalocean.com_master-setupad-eu_04e43747-fafb-11e9-b7dd-e6fd8fbff586 External provisioner is provisioning volume for claim "elasticsearch/elasticsearch-master-elasticsearch-master-0"
Normal ExternalProvisioning 93s (x441 over 111m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "dobs.csi.digitalocean.com" or manually created by system administrator
I've google everything already, it seems to be everything is correct, and volume should be up withing DO side with no problems, but it hangs in pending state. Is this expected behavior or should I ask DO support to check what's going on their side?
Yes, this is expected behavior. This chart might not be compatible with Digital Ocean Kubernetes service.
Digital Ocean documentation has the following information in Known Issues section:
Support for resizing DigitalOcean Block Storage Volumes in Kubernetes has not yet been implemented.
In the DigitalOcean Control Panel, cluster resources (worker nodes, load balancers, and block storage volumes) are listed outside of the Kubernetes page. If you rename or otherwise modify these resources in the control panel, you may render them unusable to the cluster or cause the reconciler to provision replacement resources. To avoid this, manage your cluster resources exclusively with
kubectl
or from the control panel’s Kubernetes page.
In the charts/stable/elasticsearch there are specific requirements mentioned:
Prerequisites Details
- Kubernetes 1.10+
- PV dynamic provisioning support on the underlying infrastructure
You can ask Digital Ocean support for help or try to deploy ElasticSearch without helm chart.
It is even mentioned on github that:
Automated testing of this chart is currently only run against GKE (Google Kubernetes Engine).
Update:
The same issue is present on my kubeadm ha cluster.
However I managed to get it working by manually creating PersistentVolumes
's for my storageclass
.
My storageclass definition: storageclass.yaml
:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
parameters:
type: pd-ssd
$ kubectl apply -f storageclass.yaml
$ kubectl get sc
NAME PROVISIONER AGE
ssd local 50m
My PersistentVolume definition: pv.yaml
:
apiVersion: v1
kind: PersistentVolume
metadata:
name: task-pv-volume
labels:
type: local
spec:
storageClassName: ssd
capacity:
storage: 30Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/mnt/data"
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- <name of the node>
kubectl apply -f pv.yaml
After that I ran helm chart:
helm install stable/elasticsearch --name my-release --set data.persistence.storageClass=ssd,data.storage=30Gi --set data.persistence.storageClass=ssd,master.storage=30Gi
PVC finally got bound.
$ kubectl get pvc -A
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
default data-my-release-elasticsearch-data-0 Bound task-pv-volume2 30Gi RWO ssd 17m
default data-my-release-elasticsearch-master-0 Pending 17m
Note that I only manually satisfied only single pvc and ElasticSearch manual volume provisioning might be very inefficient.
I suggest contacting DO support for automated volume provisioning solution.
What a strange situation, after I've changed 10Gi
to 10G
it started to work. Maybe it has to do something with a storage class it's self, but it started to work.