We would like to set up a highly available Elasticsearch cluster in Kubernetes. We would like to deploy the objects below and be able to scale them independently.
Please share your suggestions if you have implemented this kind of setup, preferably using open source tools.
Here are some points for a proposed architecture:
Use a headless Service for master discovery and a ConfigMap to manage the Elasticsearch nodes' settings. Something like this:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  labels:
    component: elasticsearch
    role: master
    version: v6.5.0  # or whatever version you require
spec:
  selector:
    component: elasticsearch
    role: master
    version: v6.5.0
  ports:
    - name: transport
      port: 9300  # no need to expose port 9200, as master nodes don't need it
      protocol: TCP
  clusterIP: None
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-master-configmap
data:
  elasticsearch.yml: |
    # these should get you going
    # if you want more fine-grained control, feel free to add other ES settings
    cluster.name: "${CLUSTER_NAME}"
    node.name: "${NODE_NAME}"
    network.host: 0.0.0.0
    # (number_of_master_eligible_nodes / 2) + 1
    # i.e. with the 3 master-eligible replicas below: (3 / 2) + 1 = 2
    discovery.zen.minimum_master_nodes: 2
    discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
    node.master: true
    node.data: false
    node.ingest: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch-master
  labels:
    component: elasticsearch
    role: master
    version: v6.5.0
spec:
  replicas: 3  # 3 is the recommended minimum
  selector:
    matchLabels:
      component: elasticsearch
      role: master
      version: v6.5.0
  template:
    metadata:
      labels:
        component: elasticsearch
        role: master
        version: v6.5.0
    spec:
      affinity:
        # you can also add node affinity in case you have a specific node pool
        podAntiAffinity:
          # make sure 2 ES processes don't end up on the same machine
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: component
                    operator: In
                    values:
                      - elasticsearch
                  - key: role
                    operator: In
                    values:
                      - master
              topologyKey: kubernetes.io/hostname
      initContainers:
        # just basic ES environment configuration
        - name: init-sysctl
          image: busybox:1.27.2
          command:
            - sysctl
            - -w
            - vm.max_map_count=262144
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch-master
          image: # your preferred image
          imagePullPolicy: Always
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: CLUSTER_NAME
              value: elasticsearch-cluster
            - name: DISCOVERY_SERVICE
              value: elasticsearch-discovery
            - name: ES_JAVA_OPTS
              value: -Xms256m -Xmx256m  # or more, if you want
          ports:
            - name: tcp-transport
              containerPort: 9300
          volumeMounts:
            - name: configmap
              mountPath: /etc/elasticsearch/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: storage
              mountPath: /usr/share/elasticsearch/data
      volumes:
        - name: configmap
          configMap:
            name: elasticsearch-master-configmap
        - name: storage
          emptyDir:
            medium: ""
Client nodes can also be deployed in a very similar fashion, so I will not repeat the full manifests here; see the sketch below.
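A coordinating-only (client) node ConfigMap could look like the following. This is only a sketch: the manifest name is illustrative, the environment variables are assumed to be injected the same way as for the master and data nodes, and the matching Deployment would expose port 9200 and can use an emptyDir volume, since client nodes hold no shards.

apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-client-configmap  # illustrative name
data:
  elasticsearch.yml: |
    cluster.name: "${CLUSTER_NAME}"
    node.name: "${NODE_NAME}"
    network.host: 0.0.0.0
    # (number_of_master_eligible_nodes / 2) + 1
    discovery.zen.minimum_master_nodes: 2
    discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
    # coordinating-only: not master-eligible, not a data node, not an ingest node
    node.master: false
    node.data: false
    node.ingest: false

The data nodes are then deployed with a Service, a ConfigMap and a StatefulSet: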
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  labels:
    component: elasticsearch
    role: data
    version: v6.5.0
spec:
  ports:
    - name: http
      port: 9200  # in this example, data nodes are being used as client nodes
    - name: transport
      port: 9300
  selector:
    component: elasticsearch
    role: data
    version: v6.5.0
  type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-data-configmap
data:
  elasticsearch.yml: |
    cluster.name: "${CLUSTER_NAME}"
    node.name: "${NODE_NAME}"
    network.host: 0.0.0.0
    # (number_of_master_eligible_nodes / 2) + 1
    discovery.zen.minimum_master_nodes: 2
    discovery.zen.ping.unicast.hosts: ${DISCOVERY_SERVICE}
    node.master: false
    node.data: true
    node.ingest: false
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-data
  labels:
    component: elasticsearch
    role: data
    version: v6.5.0
spec:
  serviceName: elasticsearch
  replicas: 1  # choose the appropriate number
  selector:
    matchLabels:
      component: elasticsearch
      role: data
      version: v6.5.0
  template:
    metadata:
      labels:
        component: elasticsearch
        role: data
        version: v6.5.0
    spec:
      affinity:
        # again, I recommend using nodeAffinity
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: component
                    operator: In
                    values:
                      - elasticsearch
                  - key: role
                    operator: In
                    values:
                      - data
              topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 180
      initContainers:
        - name: init-sysctl
          image: busybox:1.27.2
          command:
            - sysctl
            - -w
            - vm.max_map_count=262144
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch-production-container
          image: # use the same image that you use for the master nodes
          imagePullPolicy: Always
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: CLUSTER_NAME
              value: elasticsearch-cluster
            - name: DISCOVERY_SERVICE
              value: elasticsearch-discovery
            - name: ES_JAVA_OPTS
              value: -Xms31g -Xmx31g  # do not exceed 32 GB, or the JVM loses compressed object pointers
          ports:
            - name: http
              containerPort: 9200
            - name: tcp-transport
              containerPort: 9300
          volumeMounts:
            - name: configmap
              mountPath: /etc/elasticsearch/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: elasticsearch-node-pvc
              mountPath: /usr/share/elasticsearch/data
          readinessProbe:
            httpGet:
              path: /_cluster/health?local=true
              port: 9200
            initialDelaySeconds: 15
          livenessProbe:
            exec:
              command:
                - /usr/bin/pgrep
                - -x
                - "java"
            initialDelaySeconds: 15
          resources:
            requests:
              # adjust these as per your needs
              memory: "32Gi"
              cpu: "11"
      volumes:
        - name: configmap
          configMap:
            name: elasticsearch-data-configmap
  volumeClaimTemplates:
    - metadata:
        name: elasticsearch-node-pvc
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: # this is dependent on your K8s environment
        resources:
          requests:
            storage: 350Gi  # choose the desired storage size for each ES data node
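Once everything is deployed, you can check that the cluster has formed correctly, for example by port-forwarding the elasticsearch Service and querying the health endpoint (a quick sketch, assuming the Service name above and the default namespace):

kubectl port-forward svc/elasticsearch 9200:9200
curl 'http://localhost:9200/_cluster/health?pretty'

With 3 master replicas and your chosen number of data nodes, number_of_nodes should match the total replica count, and the status should turn green once all shard replicas are allocated.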
Hope this helps!