Non-persistent Prometheus metrics on Kubernetes

5/30/2019

I'm collecting Prometheus metrics from a uWSGI application hosted on Kubernetes, but the metrics are not retained after the pods are deleted. The Prometheus server is hosted on the same Kubernetes cluster, and I have assigned persistent storage to it.

How do I retain the metrics from the pods even after they are deleted?

The Prometheus deployment yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
            - "--storage.tsdb.retention=2200h"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          persistentVolumeClaim:
            claimName: azurefile
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus
  name: prometheus
spec:
  type: LoadBalancer
  loadBalancerIP: ...
  ports:
    - port: 80
      protocol: TCP
      targetPort: 9090
  selector:
    app: prometheus

Application deployment yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-app
  template:
    metadata:
      labels:
        app: api-app
    spec:
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          preStop:
            exec:
              command: ["/usr/sbin/nginx","-s","quit"]
        ports:
          - containerPort: 80
            protocol: TCP
        resources:
          limits:
            cpu: 50m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 50Mi
        volumeMounts:
          - name: app-api
            mountPath: /var/run/app
          - name: nginx-conf
            mountPath: /etc/nginx/conf.d
      - name: api-app
        image: azurecr.io/app_api_se:opencv
        workingDir: /app
        command: ["/usr/local/bin/uwsgi"]
        args:
          - "--die-on-term"
          - "--manage-script-name"
          - "--mount=/=api:app_dispatch"
          - "--socket=/var/run/app/uwsgi.sock"
          - "--chmod-socket=777"
          - "--pyargv=se"
          - "--metrics-dir=/storage"
          - "--metrics-dir-restore"
        resources:
          requests:
            cpu: 150m
            memory: 1Gi
        volumeMounts:
          - name: app-api
            mountPath: /var/run/app
          - name: storage
            mountPath: /storage
      volumes:
        - name: app-api
          emptyDir: {}
        - name: storage  
          persistentVolumeClaim:
            claimName: app-storage
        - name: nginx-conf
          configMap:
            name: app
      tolerations:
      - key: "sku"
        operator: "Equal"
        value: "test"
        effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: api-app
  name: api-app
spec:
  ports:
    - port: 80
      protocol: TCP
      targetPort: 80
  selector:
    app: api-app
-- user1334557
flask
kubernetes
monitoring
prometheus
uwsgi

2 Answers

5/30/2019

With this volume configuration, the data is removed when the pod is released. You are basically looking for a PersistentVolume; see the documentation and example.

Also check PersistentVolumeClaim.
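
A minimal sketch of such a claim, assuming the cluster's default StorageClass handles provisioning; the name prometheus-data and the 10Gi size are placeholders, not values from the question:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data        # placeholder name, not from the question
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi            # placeholder size, adjust to your retention needs

The Deployment would then reference it under volumes via persistentVolumeClaim.claimName, much like the question's manifest already does with the azurefile claim.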

-- Vishrant
Source: StackOverflow

7/2/2019

Your issue is caused by the type of controller used to deploy Prometheus.
The Deployment controller is the wrong choice in this case: it is meant for stateless applications that don't need to maintain any persistent identity or data across Pod rescheduling.

You should switch to the StatefulSet kind* if you require persistence of data (the metrics scraped by Prometheus) across Pod (re)scheduling.

*This is how the prometheus-operator deploys Prometheus by default.
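
A minimal sketch of that switch, reusing the container, args, and config volume from the question's manifest; the volumeClaimTemplates entry (with a placeholder 50Gi size and the default StorageClass) is an assumption that replaces the manually bound azurefile claim, and a headless Service named prometheus is assumed for serviceName:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: default
spec:
  serviceName: prometheus          # assumes a headless Service of this name exists
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
            - "--storage.tsdb.retention=2200h"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
  volumeClaimTemplates:
    - metadata:
        name: prometheus-storage-volume   # one PVC per replica, kept across rescheduling
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi                 # placeholder size, not from the question

This way the controller creates the claim and re-attaches it to the Pod wherever it is rescheduled, instead of the volume being bound by hand to a single claim name.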

-- Nepomucen
Source: StackOverflow