Grafana Dashboard setup for Prometheus Federation

4/24/2020

I am using Prometheus federation to scrape metrics from multiple k8s clusters. It works OK, and I would like to create some dashboards in Grafana that I can filter by tenant (cluster). I am trying to use variables, but the thing I do not understand is that even though I did not specify anything special for kube_pod_container_status_restarts_total, it contains the tenant label I specified under static_configs, while kube_node_spec_unschedulable does not.

So where does this difference come from, and what should I do? Also, what is the best-practice way to set up a dashboard that can be filtered by multiple cluster names? Should I use relabeling?
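For reference, the kind of filtering I have in mind is a Grafana template variable populated from the tenant label and then used in the panel queries, roughly like this (just a sketch; $tenant is the variable name I would pick):

  # Grafana variable query (type: Query, Prometheus datasource)
  label_values(kube_pod_container_status_restarts_total, tenant)

  # panel query filtered by the selected tenant(s)
  sum by (pod) (rate(kube_pod_container_status_restarts_total{tenant=~"$tenant"}[5m]))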

kube_pod_container_status_restarts_total{app="kube-state-metrics",container="backup",....,tenant="022"}

kube_node_spec_unschedulable{app="kube-state-metrics",....kubernetes_pod_name="kube-state-metrics-7d54b595f-r6m9k",node="022-kube-master01",pod_template_hash="7d54b595f"}

Prometheus Server

prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts

  scrape_configs:
    - job_name: prometheus
      static_configs:
        - targets:
          - localhost:9090

Central Cluster

  scrape_configs:
    - job_name: federation_012
      scrape_interval: 5m
      scrape_timeout: 1m

      honor_labels: true
      honor_timestamps: true
      metrics_path: /prometheus/federate

      params:
        'match[]':
          - '{job!=""}'
      scheme: https

      static_configs:
        - targets:
          - host
          labels:
            tenant: 012

      tls_config:
        insecure_skip_verify: true

    - job_name: federation_022
      scrape_interval: 5m
      scrape_timeout: 1m

      honor_labels: true
      honor_timestamps: true
      metrics_path: /prometheus/federate

      params:
        'match[]':
          - '{job!=""}'
      scheme: https

      static_configs:
        - targets:
          - host
          labels:
            tenant: 022

      tls_config:
        insecure_skip_verify: true
-- semural
grafana
grafana-variable
kubernetes
monitoring
prometheus

1 Answer

4/25/2020

Central Prometheus server

  scrape_configs:
    - job_name: federate
      scrape_interval: 5m
      scrape_timeout: 1m

      honor_labels: true
      honor_timestamps: true
      metrics_path: /prometheus/federate

      params:
        'match[]':
          - '{job!=""}'
      scheme: https

      static_configs:
        - targets:
          - source_host_012
          - source_host_022

      tls_config:
        insecure_skip_verify: true

Source Prometheus (tenant 012)

prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts

  scrape_configs:
    - job_name: tenant_012
      static_configs:
        - targets:
          - localhost:9090
          labels:
            tenant: 012

Source Prometheus (tenant 022)

prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts

  scrape_configs:
    - job_name: tenant_022
      static_configs:
        - targets:
          - localhost:9090
          labels:
            tenant: 022
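With the tenant label attached at the source, a quick sanity check on the central server should show it on every federated series, for example (a sketch using one of the metrics from the question):

  # count federated series per tenant for the metric that was missing the label
  count by (tenant) (kube_node_spec_unschedulable)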

If you still don't get the labels you need, try adding relabel_configs to your federate job and differentiate the metrics by the source job name:

relabel_configs:
  - source_labels: [job]
    target_label: tenant      # copies the source job name (e.g. tenant_012) into the tenant label

or extract distinguishing information from the __address__ label (or from any other __-prefixed label), for example:

relabel_configs:
  - source_labels: [__address__]
    target_label: tenant_host  # copies the scrape target's host:port into a tenant_host label
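If the tenant ID is encoded in the target address, a regex can pull out just that part instead of the whole host:port (a sketch, assuming hypothetical addresses like 012-prometheus.example.com:443):

relabel_configs:
  - source_labels: [__address__]
    regex: '(\d+)-.*'          # hypothetical: tenant ID is the numeric prefix of the hostname
    target_label: tenant
    replacement: '$1'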

PS: keep in mind that labels starting with __ will be removed from the label set after target relabeling is completed.

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config

-- dmkvl
Source: StackOverflow