I have deployed Prometheus to a running Kubernetes cluster.
I want it to use service discovery to find all our services and scrape their metrics endpoints to collect their metrics.
The problem is that even though it finds the services using the kubernetes-apiservers scrape config, it says all the services are down.
I don't need a complete solution; even suggestions on how I might debug this would be very helpful.
Within my config I have:
scrape_configs:
- job_name: kubernetes-apiservers
  honor_timestamps: true
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  authorization:
    type: Bearer
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  follow_redirects: true
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: default;kubernetes;https
    replacement: $1
    action: keep
  - separator: ;
    regex: (.*)
    target_label: job
    replacement: apiserver
    action: replace
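
For context, my understanding is that the keep rule above only retains the single default;kubernetes;https endpoint (i.e. the API server itself), so every other discovered target would be dropped before it is ever scraped. To frame what I am after, this is roughly the kind of extra job I imagine the other services would need; it is only a sketch, and it assumes our services carry the usual prometheus.io/scrape annotations, which may not be the case:

- job_name: kubernetes-service-endpoints   # hypothetical second job, not in my current config
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # keep only endpoints whose backing service is annotated prometheus.io/scrape=true (assumption)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # let a prometheus.io/path annotation override the default /metrics path
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # record the namespace and service name on the scraped series
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

If the kubernetes-apiservers job is really only meant to scrape the API server, that would at least explain the 42 dropped targets, but I would like to confirm that before changing the config.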
When I look at the Service Discovery tab it says that only 1 of 43 kubernetes-apiservers targets is active.
On the Status -> Targets page everything seems to be up. Any guess as to where I can see why the other 42 services aren't being scraped?
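
In case it narrows down where to look: I assume the targets that relabelling dropped (together with the labels they had before relabelling) can also be listed from the Prometheus HTTP API, along these lines; the namespace and service name below are guesses for our setup:

# forward the Prometheus UI locally (the monitoring namespace and service name are assumptions)
kubectl -n monitoring port-forward svc/prometheus 9090:9090
# list targets dropped during relabelling, including their pre-relabelling discovered labels
curl -s 'http://localhost:9090/api/v1/targets?state=dropped'

Is that the right place to look for the reason a target was dropped, or does Prometheus record it somewhere else?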