I am able to scrape Prometheus metrics from a Kubernetes service using this Prometheus job configuration:
- job_name: 'prometheus-potapi'
  static_configs:
  - targets: ['potapi-service.potapi:1234']
It uses Kubernetes DNS, and it gives me the metrics from any one of the three pods backing my service.
I would like to see the result for each pod.
I am able to see the data I want using this configuration:
- job_name: 'prometheus-potapi-pod'
  static_configs:
  - targets: ['10.1.0.126:1234']
I have searched and experimented with the service discovery mechanism available in Prometheus. Unfortunately, I don't understand how it should be set up. The service discovery reference isn't really helpful if you don't know how it works.
I am looking for an example where the job using the IP number is replaced with some service discovery mechanism. Specifying the IP was enough for me to see that the data I'm looking for is exposed.
The pods I want to scrape metrics from all live in the same namespace, potapi. The metrics are always exposed through the same port, 1234.
Finally, they are all named like this:
potapi-deployment-754d96f855-lkh4x
potapi-deployment-754d96f855-pslgg
potapi-deployment-754d96f855-z2zj2
When I do
kubectl describe pod potapi-deployment-754d96f855-pslgg -n potapi
I get this description:
Name:           potapi-deployment-754d96f855-pslgg
Namespace:      potapi
Node:           docker-for-desktop/192.168.65.3
Start Time:     Tue, 07 Aug 2018 14:18:55 +0200
Labels:         app=potapi
                pod-template-hash=3108529411
Annotations:    <none>
Status:         Running
IP:             10.1.0.127
Controlled By:  ReplicaSet/potapi-deployment-754d96f855
Containers:
  potapi:
    Container ID:   docker://72a0bafbda9b82ddfc580d79488a8e3c480d76a6d17c43d7f7d7ab18458c56ee
    Image:          potapi-service
    Image ID:       docker://sha256:d64e94c2dda43c40f641008c122e6664845d73cab109768efa0c3619cb0836bb
    Ports:          4567/TCP, 4568/TCP, 1234/TCP
    Host Ports:     0/TCP, 0/TCP, 0/TCP
    State:          Running
      Started:      Tue, 07 Aug 2018 14:18:57 +0200
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-4fttn (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  default-token-4fttn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-4fttn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
How would you rewrite the job definition given these prerequisites?
Here they use example.io/scrape=true (and similar annotations for specifying the scrape port and the scrape path if it's not /metrics), which is how one achieves the "autodiscovery" part.
If you apply that annotation -- and the relevant config snippets in the Prom config -- to a Service, then Prom will scrape the port and path on the Service, meaning you will have stats for the Service itself, and not the individual Endpoints behind it. Similarly, if you annotate the Pods, you will gather metrics for the Pods, but they would need to be rolled up to have a cross-Pod view of the state of affairs. There are multiple different resource types that can be autodiscovered, including node and ingress; they all behave similarly.
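For concreteness, here is a rough sketch of what those config snippets could look like for Service-level discovery. The example.io/scrape annotation comes from their example; the example.io/port and example.io/path names are my assumptions following the same naming pattern:
- job_name: 'example-services'
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  # Keep only Services annotated with example.io/scrape=true
  - source_labels: [__meta_kubernetes_service_annotation_example_io_scrape]
    action: keep
    regex: true
  # Optionally override the metrics path from example.io/path
  - source_labels: [__meta_kubernetes_service_annotation_example_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Optionally override the port from example.io/port
  - source_labels: [__address__, __meta_kubernetes_service_annotation_example_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__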
Unless you have grave CPU or storage concerns for your Prom instance, I absolutely wouldn't enumerate the scrape targets in the config like that: I would use the scrape annotations, meaning you can change who is scraped, what port, etc. without having to reconfigure Prom each time.
Be aware that if you want to use their example as-is, and you want to apply those annotations from within the Kubernetes resource YAML, ensure that you quote the value: 'true'. Otherwise YAML will promote it to a boolean literal, and Kubernetes annotations can only be string values.
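For example, in a manifest the difference looks like this:
metadata:
  annotations:
    example.io/scrape: 'true'    # quoted: a string, accepted by the Kubernetes API
    # example.io/scrape: true    # unquoted: a YAML boolean, rejected for annotations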
Applying the annotations from the command line will work just fine:
kubectl annotate pod -l app=potapi example.io/scrape=true
(BTW, they use example.io/ in their example, but there is nothing special about that string except that it namespaces the scrape part to keep it from colliding with something else named scrape. So feel free to use your organization's namespace if you wish to avoid having something weird named example.io/ in your cluster.)
I ended up with this solution:
...
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__]
    action: replace
    regex: ([^:]+)(?::\d+)?
    replacement: $1:1234
    target_label: __address__
...
There are two parts:
1. Check for an annotation prometheus.io/scrape with the value 'true'. This is done by the first source_labels entry. It may not be self-evident that prometheus_io_scrape translates to prometheus.io/scrape.
2. Get the address and add the desired port to it. This is done by the second source_labels entry. The __address__ source is queried for a host name or an IP address. In this case, an IP address is found using the cryptic regex ([^:]+)(?::\d+)?. The port I want to use is 1234, so I hardcoded it in replacement (see the sketch after this list for an annotation-driven alternative). The result is that __address__ now contains the IP of the pod with the port 1234 attached, in the format 10.1.0.172:1234, where 10.1.0.172 is the IP address that was found.
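If you prefer not to hardcode the port, the canonical Prometheus Kubernetes example configuration reads it from a prometheus.io/port annotation instead. A sketch of that rule; when the annotation is absent, the regex does not match and __address__ is left unchanged:
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  action: replace
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $1:$2
  target_label: __address__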
With this configuration in Prometheus I should be able to find pods with the proper annotation.
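To sanity-check which pods carry the annotation, kubectl can list it per pod (the SCRAPE column name here is just illustrative):
kubectl get pods -n potapi -o custom-columns='NAME:.metadata.name,SCRAPE:.metadata.annotations.prometheus\.io/scrape'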
Where should the annotation be added then? I ended up adding it in my Kubernetes deployment template description.
The complete deployment description looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: potapi-deployment
  namespace: potapi
  labels:
    app: potapi
spec:
  replicas: 3
  selector:
    matchLabels:
      app: potapi
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
      labels:
        app: potapi
    spec:
      containers:
      - name: potapi
        image: potapi-service
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 4567
          name: service
        - containerPort: 1234
          name: metrics
The interesting annotation is added in the template section.
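One assumption worth noting: when Prometheus itself runs inside the cluster, kubernetes_sd_configs with role: pod only works if its ServiceAccount is allowed to list and watch pods. A minimal RBAC sketch, where the prometheus ServiceAccount name and the monitoring namespace are assumptions:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-pod-discovery
rules:
- apiGroups: ['']
  resources: ['pods']
  verbs: ['get', 'list', 'watch']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-pod-discovery
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-pod-discovery
subjects:
- kind: ServiceAccount
  name: prometheus      # assumed ServiceAccount used by Prometheus
  namespace: monitoring # assumed namespace where Prometheus runs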