Need to integrate Slack and PagerDuty with Prometheus and define custom rules

6/21/2018

I have configured Prometheus and Grafana in a GCP Kubernetes environment using the manifests provided in https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/manifests

Everything is working and my cluster details are showing in Grafana. Now I want to configure alerts for Prometheus and send them to my Slack channel. If anyone has an idea how to do this, please let me know.

Thanks in advance

-- prometheusmanu manu
kubernetes
prometheus

2 Answers

6/21/2018

Right - so. You'll need to do a few things.

First, download and run Alertmanager (you can download it here). You can get away with running the simple config.

Then add Alertmanager to your Prometheus config.

Sample in prometheus.yml

alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "localhost:9093"

Assuming you already have rules running and now want to integrate with Slack, add the following to the Alertmanager config file:

global:
  slack_api_url: '...'

route:
  repeat_interval: 5s
  receiver: slack-alert # replace this field
  group_by:
    - WebsiteStaus
    - InstanceDownTime

receivers:
- name: 'slack-alert'
  slack_configs:
  - channel: '#some-channel'
    username: 'prometheus-bot'
    send_resolved: true
    # title: '{{ range .Alerts }} {{ .Annotations.summary }} {{ end }}'
    # text: '<!channel> \n {{ range .Alerts }} {{ .Annotations.description }} \n {{ end }}'
    title: '{{ .CommonAnnotations.summary }}'
    text: '{{ .CommonAnnotations.description }}'

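Just a note on .CommonAnnotations vs .Annotations. .CommonAnnotations holds only the annotations that are identical across every alert grouped into one notification, whereas .Annotations belongs to a single alert and is reached by ranging over .Alerts. So if your rules fire two alerts at the same time and their summaries differ, .CommonAnnotations.summary comes out empty, and the commented-out range form above is the one to use.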
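
To check the Slack receiver without waiting for a real outage, you can push a hand-made alert straight into Alertmanager. The following is only a sketch: the function names and the ALERTMANAGER_URL variable are made up for illustration, the address is the one from the prometheus.yml sample, and it assumes a release that serves the v2 HTTP API (older releases exposed /api/v1/alerts instead).

```shell
#!/usr/bin/env bash
# Sketch: send a test alert to Alertmanager to exercise the Slack receiver.
set -euo pipefail

# Assumed address, taken from the prometheus.yml sample; override if needed.
ALERTMANAGER_URL="${ALERTMANAGER_URL:-http://localhost:9093}"

# Print a one-element JSON array describing a test alert.
# $1 = alert name, $2 = instance label. The values are not JSON-escaped,
# so keep them free of quotes and backslashes.
build_test_alert() {
  local name="$1" instance="$2"
  printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"page"},' "$name" "$instance"
  printf '"annotations":{"summary":"Test alert","description":"%s failed to probe"}}]\n' "$instance"
}

# POST the alert to Alertmanager; fails loudly on a non-2xx response.
send_test_alert() {
  build_test_alert "$1" "$2" |
    curl --fail --silent --show-error \
      -H 'Content-Type: application/json' \
      --data-binary @- \
      "${ALERTMANAGER_URL}/api/v2/alerts"
}
```

Calling send_test_alert SiteDown example.com should produce a message in the configured channel once the route's group_wait has elapsed. If nothing arrives, the Alertmanager log is usually the first place to look for a rejected Slack webhook.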

If you don't have any rules set, here is a sample rule that I use to alert me if a website goes down.

groups:
# ============================================================================
# Website alerts
# --------------
# ============================================================================
- name: WebsiteStaus
  rules:
  # Fire when a probe fails (probe_success is 0)
  - alert: SiteDown
    expr: probe_success == 0
    for: 5s
    labels:
      severity: page
    annotations:
      summary: A website has gone down!
      description: '<{{ $labels.instance }}|{{ $labels.instance }}> failed to probe'
-- Giannis Katsini
Source: StackOverflow

1/2/2019

Using the prometheus-operator, it took me a while to figure out that the alertmanager configuration is stored as a secret in https://github.com/coreos/prometheus-operator/blob/master/contrib/kube-prometheus/manifests/alertmanager-secret.yaml

You need to base64-decode it, edit it, encode it again, and apply the updated secret:

echo "Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgImdyb3VwX2ludGVydmFsIjogIjVtIgogICJncm91cF93YWl0IjogIjMwcyIKICAicmVjZWl2ZXIiOiAibnVsbCIKICAicmVwZWF0X2ludGVydmFsIjogIjEyaCIKICAicm91dGVzIjogCiAgLSAibWF0Y2giOiAKICAgICAgImFsZXJ0bmFtZSI6ICJEZWFkTWFuc1N3aXRjaCIKICAgICJyZWNlaXZlciI6ICJudWxsIg==" | base64 --decode 
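
The full round trip could look roughly like the sketch below. The two helper functions are made up for illustration. The secret name alertmanager-main, the monitoring namespace, and the alertmanager.yaml data key in the comments are assumptions based on the kube-prometheus manifests, so check them against your own cluster.

```shell
#!/usr/bin/env bash
# Sketch: decode, edit, and re-encode the Alertmanager config held in a Secret.
set -euo pipefail

# Decode a base64 string (the value under the Secret's data key) to plain YAML.
decode_config() {
  printf '%s' "$1" | base64 --decode
}

# Encode a YAML file as a single-line base64 string, ready to paste back
# into the Secret manifest. tr strips the line wrapping some base64 builds add.
encode_config() {
  base64 < "$1" | tr -d '\n'
}

# Assumed workflow (names are assumptions, adjust to your cluster):
#   enc=$(kubectl -n monitoring get secret alertmanager-main \
#           -o jsonpath='{.data.alertmanager\.yaml}')
#   decode_config "$enc" > alertmanager.yaml
#   ... edit alertmanager.yaml, e.g. add the Slack receiver ...
#   encode_config alertmanager.yaml   # paste the output into alertmanager-secret.yaml
#   kubectl apply -f alertmanager-secret.yaml
```

Keeping the decoded alertmanager.yaml next to the manifest makes later edits easier, since only the encode and apply steps have to be repeated.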
-- Christina A
Source: StackOverflow