Kubernetes On-Prem StatefulSet Pods - ELK Stack

12/13/2019

I am trying to implement the ELK stack on our Kubernetes setup running in our data center. The cluster is made up of 3 master and 3 worker nodes. So far I have created a headless Service for the Elasticsearch component and am now running the Elasticsearch pods on the cluster as a StatefulSet. Below is the YAML code -

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.4.2
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.seed_hosts
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: cluster.initial_master_nodes
            value: "es-cluster-0,es-cluster-1,es-cluster-2"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi

Problems -

  • Though K8s starts 2 pods, only one pod is consuming a PV/PVC and the other pod is waiting with the error "pod has unbound immediate PersistentVolumeClaims (repeated 3 times)"

  • Do we need to create a PersistentVolume first?

  • Can 3 StatefulSet replicas of Elasticsearch use the same volumeMounts on on-prem Kubernetes?

Any help is highly appreciated -

-- Raghavendra Guttur
elasticsearch
kubernetes

1 Answer

12/13/2019

I assume you are using kubeadm.

Though K8s starts 2 pods, only one pod is consuming a PV/PVC and the other pod is waiting with the error "pod has unbound immediate PersistentVolumeClaims (repeated 3 times)"

If you are using a StatefulSet you need to know that it creates pods in order, from 0 up through N-1, one at a time, and each pod must be running correctly before the next one is created. So in your case: the 1st pod bound to a PV and is running correctly. The StatefulSet then wants to create the second pod, but it cannot bind to a PV, so that pod stays in the Pending state. Because this pod is not working correctly, the StatefulSet does not create the next one.
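You can confirm this from the pod list and the pending pod's events; the pod names follow the StatefulSet naming (es-cluster-0, es-cluster-1, ...), so in your case the stuck pod would be es-cluster-1:

$ kubectl get pods -n kube-logging -l app=elasticsearch
$ kubectl describe pod es-cluster-1 -n kube-logging    # the Events section shows the unbound PVC message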

Minikube and managed cloud clusters usually come with a default StorageClass already defined.

On Minikube it would look like this:

$ kubectl get sc
NAME                 PROVISIONER                AGE
standard (default)   k8s.io/minikube-hostpath   21m

or on GKE:

$ kubectl get sc
NAME                 PROVISIONER            AGE
standard (default)   kubernetes.io/gce-pd   25d

Because of this default StorageClass you only need to create a PersistentVolumeClaim, and Kubernetes will automatically provision a PersistentVolume with the requested resources.
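For example, with a default StorageClass in place a bare claim like the one below is enough, and the provisioner creates a matching PV on its own (the claim name is just an illustration; omitting storageClassName makes the default class apply):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim              # illustrative name
  namespace: kube-logging
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  # no storageClassName set, so the default StorageClass is used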

However, kubeadm does not define a default StorageClass. That means you need to create the PersistentVolumes (and any standalone PersistentVolumeClaims) manually.

In the Persistent Volumes docs, especially in the Binding chapter, you will find this information:

Once bound, PersistentVolumeClaim binds are exclusive, regardless of how they were bound. A PVC to PV binding is a one-to-one mapping.

You can check this StackOverflow thread for more information.

Do we need to create a PersistentVolume first?

Yes. On a kubeadm cluster you need to create the PersistentVolumes first and specify their capacity in the PV spec; an example PV is sketched below.
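A minimal sketch, assuming simple hostPath storage on the worker nodes; the name and path are placeholders, and the size and access mode match your volumeClaimTemplates. You would need one such PV per replica:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-data-0                    # one PV per Elasticsearch replica
  labels:
    app: elasticsearch
spec:
  capacity:
    storage: 5Gi                     # must cover the 5Gi request in volumeClaimTemplates
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/es-data-0             # placeholder path on the node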

Can 3 StatefulSet replicas of Elasticsearch use the same volumeMounts on on-prem Kubernetes?

Yes, the pods can all use the same volumeMounts path, but each pod needs its own PVC (and therefore its own PV).
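With your volumeClaimTemplates named data, the StatefulSet creates one claim per replica (data-es-cluster-0, data-es-cluster-1, data-es-cluster-2), and you can check that each of them binds to its own PV:

$ kubectl get pvc -n kube-logging -l app=elasticsearch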

In Addition

You could consider creating a default StorageClass.
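For example, after installing a dynamic provisioner suited to your environment (NFS, local-path, etc.), you can mark its StorageClass as the default with the documented annotation; the class name below is just a placeholder:

$ kubectl patch storageclass my-storage-class \
    -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'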

In your YAML you have accessModes: [ "ReadWriteOnce" ]. This allows the volume to be mounted by only one node at a time. You can find a good explanation of access modes here.

I was not able to create this StatefulSet using the elasticsearch:7.4.2 image. The newest version at the time of writing is elasticsearch:7.5.0.

You can also check this article.

-- PjoterS
Source: StackOverflow