Unable to scrape metrics from pods


I am able to scrape Prometheus metrics from a Kubernetes service using this Prometheus job configuration:

- job_name: 'prometheus-potapi'
  static_configs:
  - targets: ['potapi-service.potapi:1234']

It uses Kubernetes DNS, so each scrape returns the metrics from whichever of the three pods behind my service happens to answer.

I would like to see the result for each pod.

I am able to see the data I want for a single pod by pointing a job directly at that pod's IP:

- job_name: 'prometheus-potapi-pod'
  static_configs:
  - targets: ['']

I have searched and experimented with the service discovery mechanism available in Prometheus. Unfortunately, I don't understand how it should be set up. The service discovery reference isn't really helpful if you don't already know how it works.

I am looking for an example where the job using the IP number is replaced with some service discovery mechanism. Specifying the IP was enough for me to see that the data I'm looking for is exposed.

The pods I want to scrape metrics from all live in the same namespace, potapi.

The metrics are always exposed through the same port, 1234.

Finally, they are all named like this:


When I do

kubectl describe pod potapi-deployment-754d96f855-pslgg -n potapi

I get this description:

Name:           potapi-deployment-754d96f855-pslgg
Namespace:      potapi
Node:           docker-for-desktop/
Start Time:     Tue, 07 Aug 2018 14:18:55 +0200
Labels:         app=potapi
Annotations:    <none>
Status:         Running
Controlled By:  ReplicaSet/potapi-deployment-754d96f855
    Container ID:   docker://72a0bafbda9b82ddfc580d79488a8e3c480d76a6d17c43d7f7d7ab18458c56ee
    Image:          potapi-service
    Image ID:       docker://sha256:d64e94c2dda43c40f641008c122e6664845d73cab109768efa0c3619cb0836bb
    Ports:          4567/TCP, 4568/TCP, 1234/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Tue, 07 Aug 2018 14:18:57 +0200
    Ready:          True
    Restart Count:  0
    Environment:    <none>
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4fttn (ro)
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4fttn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
  node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

How would you rewrite the job definition given these prerequisites?

-- Thomas Sundberg

2 Answers


Here they use example.io/scrape=true (and similar annotations for specifying the scrape port and the scrape path if it's not /metrics), which is how one achieves the "autodiscovery" part.

If you apply that annotation, along with the relevant config snippets in the Prom config, to a Service, then Prom will scrape the port and path on the Service, meaning you will have stats for the Service itself and not for the individual Endpoints behind it. Similarly, if you annotate the Pods, you will gather metrics per Pod, but they would need to be rolled up to get a cross-Pod view of the state of affairs. Several other resource types can be autodiscovered as well, including node and ingress. They all behave similarly.

Unless you have grave CPU or storage concerns for your Prom instance, I absolutely wouldn't enumerate the scrape targets in the config like that: I would use the scrape annotations, meaning you can change who is scraped, what port, etc. without having to reconfigure Prom each time.

Be aware that if you want to use their example as-is and apply those annotations from within the Kubernetes resource YAML, you must quote the 'true' value. Otherwise YAML will parse it as a boolean literal, and Kubernetes annotations can only be string values.
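As a minimal sketch of that quoting, the annotation on a pod template might look like the fragment below. The example.io/scrape key comes from their example and the app: potapi label from the question; the surrounding template layout is assumed from a typical Deployment.

```yaml
# Fragment of a Deployment's pod template (assumed layout, for illustration only).
template:
  metadata:
    annotations:
      # Quoted, so YAML keeps it as the string "true" instead of a boolean.
      example.io/scrape: 'true'
    labels:
      app: potapi
```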

Applying the annotations from the command line will work just fine:

kubectl annotate pod -l app=potapi example.io/scrape=true

(BTW, they use example.io/ in their example, but there is nothing special about that string except it namespaces the scrape part to keep it from colliding with something else named scrape. So feel free to use your organization's namespace if you wish to avoid having something weird named example.io/ in your cluster)
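
For orientation, the Prometheus side of this could look roughly like the sketch below. It is an illustration under assumptions, not a copy of their file: the job name is hypothetical, and example.io/port and example.io/path are assumed names for the "similar annotations" mentioned above. Prometheus exposes each pod annotation as a __meta_kubernetes_pod_annotation_<name> label, with characters such as . and / replaced by underscores.

```yaml
scrape_configs:
  - job_name: 'annotated-pods'          # hypothetical job name
    kubernetes_sd_configs:
      - role: pod                       # discover every pod the API server reports
    relabel_configs:
      # Keep only pods annotated with example.io/scrape: 'true'.
      - source_labels: [__meta_kubernetes_pod_annotation_example_io_scrape]
        action: keep
        regex: 'true'
      # If example.io/path is set (assumed annotation name), use it instead of /metrics.
      - source_labels: [__meta_kubernetes_pod_annotation_example_io_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__
      # If example.io/port is set (assumed annotation name), scrape that port.
      # The two source labels are joined with ';' before the regex is applied.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_example_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

Pods without the port or path annotation fall through those two rules unchanged, because the regex does not match and the replace action is then skipped.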

-- mdaniel
Source: StackOverflow


I ended up with this solution:


- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__]
    action: replace
    regex: ([^:]+)(?::\d+)?
    replacement: $1:1234
    target_label: __address__


There are two parts.

  • Check for an annotation prometheus.io/scrape with the value 'true'. This is done by the first source_labels entry. It may not be self-evident that prometheus_io_scrape translates to prometheus.io/scrape: Prometheus replaces characters that are not valid in a label name, here the dot and the slash, with underscores.

  • Get the address and add the desired port to it. This is done by the second source_labels entry. The __address__ label holds a host name or IP address, optionally followed by a port. The cryptic regex ([^:]+)(?::\d+)? captures everything before the first colon, in this case the pod's IP, and drops any existing port. The port I want to use is 1234, so I hardcoded it in replacement. The result is that __address__ now contains the pod's IP followed by :1234.

With this configuration in Prometheus I should be able to find pods with the proper annotation.
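
One possible refinement, not part of the solution as described here, is to restrict discovery to the potapi namespace and to copy the pod name into a label so that each pod's series can be told apart. The sketch below assumes the same job as above; the target label name pod is an arbitrary choice.

```yaml
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - potapi            # only discover pods in this namespace
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__]
    action: replace
    regex: ([^:]+)(?::\d+)?
    replacement: $1:1234
    target_label: __address__
  # Assumed addition: expose the pod name as a "pod" label on every scraped series.
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
```

Without that last rule, the pods would still be distinguishable by the instance label, which holds the rewritten address.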

Where should the annotation be added then? I ended up adding it in my Kubernetes deployment template description.

The complete deployment description looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: potapi-deployment
  namespace: potapi
  labels:
    app: potapi
spec:
  replicas: 3
  selector:
    matchLabels:
      app: potapi
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
      labels:
        app: potapi
    spec:
      containers:
      - name: potapi
        image: potapi-service
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 4567
          name: service
        - containerPort: 1234
          name: metrics

The interesting annotation is added in the template section, so it ends up on every pod the deployment creates.

-- Thomas Sundberg
