We are trying to monitor K8s with Grafana and the Prometheus Operator. Most of the metrics are working as expected and I was able to see the dashboards with the right values. Our system contains 10 nodes with 500 pods overall. The problem: when I restart Prometheus, all the data is deleted (I want it to be kept for two weeks).
My question is: how can I configure a volume for Prometheus so that the data is kept for two weeks, or up to a 100GB database?
I found the following (we use the Prometheus Operator):
https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/storage.md
This is the config of the Prometheus Operator:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus-operator
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
        image: quay.io/coreos/prometheus-operator:v0.29.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
This is the config of Prometheus:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    prometheus: prometheus
spec:
  replicas: 2
  serviceAccountName: prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      role: observeable
  tolerations:
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoSchedule"
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoExecute"
  resources:
    limits:
      cpu: 8000m
      memory: 24000Mi
    requests:
      cpu: 6000m
      memory: 6000Mi
  storage:
    volumeClaimTemplate:
      spec:
        selector:
          matchLabels:
            app: prometheus
        resources:
          requests:
            storage: 100Gi
We have an NFS file system, and the above storage config doesn't work. My questions are:
- What I am missing here is how to configure the volume, server, and path; in the following they sit under the nfs section. Where should I find this "/path/to/prom/db"? How can I refer to it? Should I create it somehow, or just provide the path? (We have NFS configured in our system.)
- How do I wire it up to Prometheus?
As I don't have deep knowledge of PVC and PV, I've created the following (not sure about those values: what is my server, and what path should I provide?):
server: myServer
path: "/path/to/prom/db"
What should I put there, and how do I make my Prometheus (i.e. the config provided above in the question) use it?
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
    prometheus: prometheus
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce # required
  nfs:
    server: myServer
    path: "/path/to/prom/db"
If there is any other kind of persistent volume besides NFS that I can use for my use case, please advise how.
You must use a persistent volume and persistent volume claim (PV and PVC) to persist data. See https://kubernetes.io/docs/concepts/storage/persistent-volumes/ and look carefully at provisioning, reclaim policy, access modes, and storage types.
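For the NFS case in the question, a minimal sketch might look like the following. It assumes your NFS server already exports a directory for Prometheus to use; the server and path values are placeholders you must replace (you do not create the path through Kubernetes, it must already exist as an export on the NFS server). Note also that a PersistentVolume is cluster-scoped, so it takes no namespace:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
  labels:
    app: prometheus           # matched by the selector in the volumeClaimTemplate
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: myServer          # placeholder: your NFS server hostname or IP
    path: "/path/to/prom/db"  # placeholder: a directory exported by that server
With that PV in place, the volumeClaimTemplate in the Prometheus resource from the question should bind to it through the app: prometheus selector; setting storageClassName: "" in the template's spec prevents a default StorageClass from taking over with dynamic provisioning.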
Refer to the code below. Define storage-retention as 7d (or the required number of retention days) in a ConfigMap and load it as an environment variable in the container, as shown:
containers:
- name: prometheus
  image: prom/prometheus:latest
  args:
  - '--storage.tsdb.path=/prometheus'
  - '--storage.tsdb.retention=$(STORAGE_RETENTION)'
  - '--web.enable-lifecycle'
  - '--storage.tsdb.no-lockfile'
  - '--config.file=/etc/prometheus/prometheus.yml'
  ports:
  - name: web
    containerPort: 9090
  env:
  - name: STORAGE_RETENTION
    valueFrom:
      configMapKeyRef:
        name: prometheus.cfg
        key: storage-retention
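The ConfigMap referenced above is not shown; a minimal sketch of it could be the following (the name prometheus.cfg and the key storage-retention only have to match the configMapKeyRef; set the value to 14d for the two weeks asked about):
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus.cfg
  namespace: monitoring
data:
  storage-retention: 14d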
You might need to adjust these settings in the Prometheus Operator files. To determine when to remove old data, use the --storage.tsdb.retention flag, e.g. --storage.tsdb.retention=7d (by default, Prometheus keeps data for 15 days).
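Since the question deploys through the Prometheus Operator, you would not edit container args directly; the Prometheus custom resource exposes a retention field that the operator turns into --storage.tsdb.retention. A sketch against the spec from the question:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  retention: 2w   # keep data for two weeks
  ...
Newer operator versions also expose a retentionSize field (mapping to Prometheus's --storage.tsdb.retention.size) for the 100GB-style cap; check whether your operator and Prometheus versions support it before relying on it.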
To delete data explicitly, use the TSDB admin API's delete_series endpoint, which takes one or more series selectors:
$ curl -X POST -g 'http://<your_host>:9090/api/v1/admin/tsdb/delete_series?match[]=<series_selector>'
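Note that the admin endpoints are disabled by default, so presumably you would also need to start Prometheus with the admin API enabled, e.g. in the container args:
args:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--web.enable-admin-api'   # required for the delete_series endpoint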
EDIT
Kubernetes snippet sample
...
spec:
  containers:
  - name: prometheus
    image: docker.io/prom/prometheus:v2.0.0
    args:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.retention=7d'
    ports:
    - name: web
      containerPort: 9090
...
I started working with the operator chart recently, and managed to add persistence without defining a PV and PVC.
With the new chart configuration, adding persistence is much easier than described above: just edit the file /helm/vector-chart/prometheus-operator-chart/values.yaml under prometheus.prometheusSpec:
storageSpec:
  volumeClaimTemplate:
    spec:
      storageClassName: prometheus
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
      selector: {}
And add this file as /helm/vector-chart/prometheus-operator-chart/templates/prometheus/storageClass.yaml:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: prometheus
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
parameters:
  type: gp2
  zones: "ap-southeast-2a, ap-southeast-2b, ap-southeast-2c"
  encrypted: "true"
This will automatically create both a PV and a PVC, which in turn creates an EBS volume in AWS where all your data is stored.
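To also get the two-week retention from the question through the chart, the same prometheusSpec block accepts a retention value, since the chart's prometheusSpec maps onto the Prometheus custom resource spec. A sketch (verify the keys against your chart version):
prometheus:
  prometheusSpec:
    retention: 14d
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: prometheus
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi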