I have deployed InfluxDB 2.0.0 as a StatefulSet with EBS volume persistence. I've noticed that if the pod gets rescheduled to another node for some reason, or if we scale the StatefulSet down to replicas = 0 and then back up, the effect on the persisted data is the same: it is lost.
Initially, for the case where the pod is rescheduled to another node, I thought the problem was with the EBS volume: that it was not being unmounted and then mounted on the node where the pod replica is now running. That is NOT the case. The EBS volume is present and the same PV/PVC exists, but the data is lost.
To figure out what the problem might be, I deliberately set up InfluxDB, added some data, and then ran this:
kubectl scale statefulsets influxdb --replicas=0
...
kubectl scale statefulsets influxdb --replicas=1
The effect was the same as when the InfluxDB pod got rescheduled: the data was lost.
Is there any specific reason why something like that would happen?
My environment: I'm using EKS with Kubernetes 1.15 on both the control plane and the workers.
Fortunately, the problem turned out to be caused by a big change between InfluxDB 1.x and the 2.0.0 beta in where the actual data is persisted.
In 1.x, data was persisted in:
/var/lib/influxdb
while in 2.x, data is persisted, by default, in:
/root/.influxdbv2
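My EBS volume was mounted at the 1.x location, so InfluxDB 2.x was writing its data to the container's own filesystem rather than to the volume. With every restart of the pod (whether caused by scaling down or by rescheduling to another node), the EBS volume was attached correctly but at the wrong path, and whatever had been written under /root/.influxdbv2 went away with the old container. That was the reason why there was no data.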
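As a rough sketch of the difference, the relevant part of the container spec looks something like the fragment below. The volume name influxdb-data matches the StatefulSet further down; the commented-out lines illustrate the old 1.x mount path described above and are not a copy of my original manifest.

```yaml
# Fragment of the container spec, not a complete manifest.
volumeMounts:
# Before (1.x layout): the volume sits where InfluxDB 2.x does not write by default.
# - mountPath: /var/lib/influxdb
#   name: influxdb-data
# After (2.x layout): the volume sits on the default 2.x data directory.
- mountPath: /root/.influxdbv2
  name: influxdb-data
```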
Another difference I noticed is that, for this 2.x version, configuration parameters cannot be provided via a configuration file (as they could in 1.x, where I had the configuration file mounted into the container as a ConfigMap). Additional configuration parameters have to be provided inline instead. This link explains how: https://v2.docs.influxdata.com/v2.0/reference/config-options/
In the end, this is the working version of the StatefulSet:
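For illustration only, here is one way inline configuration might look in the container spec, using environment variables. This is a sketch under assumptions: it assumes the image honors the INFLUXD_BOLT_PATH and INFLUXD_ENGINE_PATH variables listed on the config-options page, and the values simply restate the default 2.x locations so that they point at the mounted volume. It is not part of the manifest I actually run.

```yaml
# Fragment of the container spec, not a complete manifest.
env:
# Assumption: the image reads these variables as the config-options page describes.
# Path of the BoltDB metadata file (default location under the 2.x data directory).
- name: INFLUXD_BOLT_PATH
  value: /root/.influxdbv2/influxd.bolt
# Path of the storage engine directory (default location under the 2.x data directory).
- name: INFLUXD_ENGINE_PATH
  value: /root/.influxdbv2/engine
```

Pinning both paths explicitly makes it harder for a future default change to move the data off the persistent volume again.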
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: influxdb
  name: influxdb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: influxdb
  serviceName: influxdb
  template:
    metadata:
      labels:
        app: influxdb
    spec:
      containers:
      - image: quay.io/influxdb/influxdb:2.0.0-beta
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /ping
            port: api
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: influxdb
        ports:
        - containerPort: 9999
          name: api
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /ping
            port: api
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "800m"
            memory: 1200Mi
          requests:
            cpu: 100m
            memory: 256Mi
        volumeMounts:
        - mountPath: /root/.influxdbv2
          name: influxdb-data
  volumeClaimTemplates:
  - metadata:
      name: influxdb-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
      volumeMode: Filesystem