How to get notified or alerted on config reloader error in prometheus operator?

4/9/2019

I use prometheus operator for a deployment of a monitoring stack on kubernetes. I would like to know if there is a way to be aware if the config deployed by the config reloader failed. This is valable for prometheus and alert manager ressources that use a config reloader container to reload their configs. When the config failed. We have a log in the container but can we have a notification or an alert based on a failed config reloading ?

-- Florian MEDJA
kubernetes
prometheus
prometheus-alertmanager
prometheus-operator

1 Answer

4/9/2019

Prometheus exposes a /metric endpoint you can scrape. In particular, there is a metric indicating if the last reload suceeded:

# HELP prometheus_config_last_reload_successful Whether the last configuration reload attempt was successful.
# TYPE prometheus_config_last_reload_successful gauge
prometheus_config_last_reload_successful 0

You can use it to alert on failed reload.

groups:
- name: PrometheusAlerts
  rules:
  - alert: FailedReload
    expr: prometheus_config_last_reload_successful == 0
    for: 5m
    labels:
      severity: warning
    annotations:
      description: Reloading Prometheus' configuration has failed for {{$labels.namespace}}/{{ $labels.pod}}.
      summary: Prometheus configuration reload has failed
-- Michael Doubez
Source: StackOverflow