Problem: When running multiple replicas of the Prometheus server for high availability, the query results shown in Grafana are not consistent across refreshes. I want to achieve consistent storage access together with high availability (HA) of the Prometheus server.
Scenario-1: replicaCount=1, statefulSet=false, created PersistentVolume count = 1. The application works as expected, but it is not HA.
Scenario-2: replicaCount=3, statefulSet=false, created PersistentVolume count = 1. In this scenario only one pod runs; the other pods fail with a TSDB lock error from the Prometheus server.
Scenario-2.1: replicaCount=3, statefulSet=false, created PersistentVolume count = 1, with "--storage.tsdb.no-lockfile". Three pods are created, but all except one throw Go errors from the Prometheus server application. A sketch of the values override I used is shown below.
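For reference, this is roughly the values override for Scenario-2.1. It assumes the `server.extraFlags` key of the stable/prometheus chart (which takes flag names without the leading `--`); please verify the key names against chart version 10.0.1:

```yaml
# values-scenario-2.1.yaml -- sketch, assuming stable/prometheus chart key names
server:
  replicaCount: 3
  statefulSet:
    enabled: false
  extraFlags:
    - web.enable-lifecycle
    - storage.tsdb.no-lockfile   # disables the TSDB lock file so replicas can share one volume
  persistentVolume:
    enabled: true
    size: 8Gi
```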
Scenario-3: replicaCount=3, statefulSet=true, created PersistentVolume count = 3. In this scenario there are 3 replica pods, each with its own persistent volume via the StatefulSet, which is the configuration recommended by the community. However, it does not give consistent metrics in Grafana: each replica scrapes and stores its own copy of the data, and because the session is sticky, queries keep hitting whichever replica they first landed on. A sketch of the values override for this scenario follows.
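A minimal sketch of the Scenario-3 values, again assuming the stable/prometheus chart layout (the storage class name is an assumption for EKS):

```yaml
# values-scenario-3.yaml -- sketch, assuming stable/prometheus chart key names
server:
  replicaCount: 3
  statefulSet:
    enabled: true            # StatefulSet creates one PVC per replica via volumeClaimTemplates
  persistentVolume:
    enabled: true
    size: 8Gi
    storageClass: gp2        # assumed EKS storage class; adjust to your cluster
```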
Infrastructure information: Helm chart stable/prometheus (chart version 10.0.1, appVersion 2.15.2) on AWS EKS with Kubernetes 1.14.
Question: How can I achieve HA of the Prometheus server with persistent storage on Kubernetes, while also planning for HPA or VPA?
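To make the HPA part of the plan concrete, this is the kind of manifest I have in mind. It is illustrative only; the target name is an assumption based on the chart's usual naming (`<release>-prometheus-server`), and `autoscaling/v2beta2` is available on Kubernetes 1.14:

```yaml
# hpa-sketch.yaml -- illustrative only; check the real StatefulSet name with `kubectl get statefulset`
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: prometheus-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: prometheus-server   # assumed name from the Helm release
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```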
Resolution: I am thinking of handling this problem with the approach mentioned below.
Has anyone faced this issue before? If yes, how did you achieve HA with persistent storage on Kubernetes? If my configuration is not the right way to meet this requirement, can you suggest a recommended approach and configuration?
Please feel free to ask if anything is missing from my explanation of the current implementation. Thanks in advance.