Elasticsearch data gets deleted when creating with existing disk using a new kubernetes cluster

9/11/2019

I have created a PersistentVolume, a PersistentVolumeClaim, and a StorageClass for Elasticsearch in a perisistance.yaml file.

The StorageClass, PersistentVolume, and PersistentVolumeClaim are all created successfully, and the claim binds to the volume.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
reclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-persistent-volume
spec:
  storageClassName: ssd
  capacity:
    storage: 30G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: gke-webtech-instance-2-pvc-f5964ddc-d446-11e9-9d1c-42010a800076
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: ssd
  volumeName: pv-persistent-volume
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30G

[screenshot: pv-claim bound successfully]

I have also attached the deployment.yaml for Elasticsearch below.

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    name: elasticsearch
spec:
  type: NodePort
  ports:
    - name: elasticsearch-port1
      port: 9200
      protocol: TCP
      targetPort: 9200
    - name: elasticsearch-port2
      port: 9300
      protocol: TCP
      targetPort: 9300
  selector:
    app: elasticsearch
    tier: elasticsearch
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: elasticsearch-application
  labels:
    app: elasticsearch
spec:
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: elasticsearch
        tier: elasticsearch
    spec:
      hostname: elasticsearch
      containers:
        - image: gcr.io/xxxxxxx/elasticsearch:7.3.1 
          name: elasticsearch
          ports:
            - containerPort: 9200
              name: elasticport1
            - containerPort: 9300
              name: elasticport2
          env:
            - name: discovery.type
              value: single-node
          volumeMounts:
          - mountPath: "/usr/share/elasticsearch/html"
            name: pv-volume
      volumes:
        - name: pv-volume
          persistentVolumeClaim:
             claimName: pv-claim

I have created the deployment.yaml file as well. The Elasticsearch application runs successfully without any issues, and I am able to hit the Elasticsearch URL. I have run tests and populated data into Elasticsearch, and I am able to view that data as well.

Once I delete the cluster in Kubernetes and create a new one, I try to connect with the same disk, which holds the persisted data. Everything comes up fine, but I am not able to get the data that was already stored. My data is lost, and I guess I am left with an empty disk.

-- klee
datapersistance
elasticsearch
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

9/11/2019

Kubernetes has a reclaimPolicy for persistent volumes, which in most cases defaults to Delete. You can change it with:

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

Or simply add persistentVolumeReclaimPolicy: Retain to the PersistentVolume spec in your YAML.
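For reference, here is a sketch of what the asker's PersistentVolume would look like with the reclaim policy set explicitly in the manifest (all names and values are taken from the question's perisistance.yaml; only the persistentVolumeReclaimPolicy line is new):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-persistent-volume
spec:
  storageClassName: ssd
  # Keep the underlying GCE disk (and its data) even after the claim is deleted
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 30G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: gke-webtech-instance-2-pvc-f5964ddc-d446-11e9-9d1c-42010a800076
    fsType: ext4
```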

Edit: As in the comment below, this problem may not be about data being lost. Pasting my comment below:

"I don't think your data is lost. Elasticsearch just needs to index existing data because it doesn't just grab existing stored data. You need to reingest data to elasticsearch or save snapshots regularly or use master, data, client architecture."

-- Akın Özer
Source: StackOverflow