Prometheus job "kubernetes-nodes" endpoints in state "UNKNOWN"

6/11/2019

We are facing an issue where some endpoints of the Prometheus job "kubernetes-nodes" are in state "UNKNOWN".

The nodes and Prometheus have all been up for several days. We tried to curl the "kubernetes-nodes" endpoints that are in the "UNKNOWN" state. The metrics can be fetched correctly with curl, but the endpoint state is still "UNKNOWN". We don't know the reason, that is, the criteria under which an endpoint is marked "UNKNOWN".

I know that before Prometheus does its first scrape, endpoints are in the "UNKNOWN" state. Then, if the scrape succeeds, the endpoint becomes "UP"; if it fails, "DOWN". However, in the screenshot below it seems some endpoints have never been scraped. We just don't know why.

Could you please advise on the possible reasons for this? Does it mean that something is wrong with this node (name hidden by the red block)? If so, is it possible to fix it so that Prometheus treats it as "UP"?

Thanks in advance.

[Screenshot of the targets in question; image not available]

- job_name: kubernetes-nodes
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - api_server: null
    role: node
    namespaces:
      names: []
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: kubernetes.default.svc:443
    action: replace
  - source_labels: [__meta_kubernetes_node_name]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
-- Lijing Zhang
kubernetes
prometheus

1 Answer

2/14/2020

I think you're missing the nodes/proxy resource in your Prometheus cluster role. Here's the official example: github.com/prometheus/documentation/examples/rbac-setup.yml.
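
To make the suggestion concrete, here is a minimal sketch of a cluster role that grants access to nodes/proxy, which the scrape path /api/v1/nodes/${1}/proxy/metrics in the question goes through. It is not a copy of the official example file. The names "prometheus" and "monitoring" are assumptions for illustration and should be replaced with the service account and namespace your Prometheus actually runs under.

```yaml
# Sketch only: role and binding names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus            # assumed name
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy               # needed for scraping via the API server proxy path
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus            # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus            # assumed service account used by the Prometheus pod
  namespace: monitoring       # assumed namespace
```

If you want to check whether the permission is already in place before changing anything, something like `kubectl auth can-i get nodes --subresource=proxy --as=system:serviceaccount:monitoring:prometheus` should tell you, again with the namespace and service account swapped for your own.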

-- aisbaa
Source: StackOverflow