I have a service running in a k8s cluster, which I want to monitor using Prometheus Operator. The service has a /metrics
endpoint, which returns simple data like:
myapp_first_queue_length 12
myapp_first_queue_processing 2
myapp_first_queue_pending 10
myapp_second_queue_length 4
myapp_second_queue_processing 4
myapp_second_queue_pending 0
The API runs in multiple pods, behind a basic Service object:
apiVersion: v1
kind: Service
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  ports:
  - port: 80
    name: myapp-api
    targetPort: 80
  selector:
    app: myapp-api
I've installed Prometheus using kube-prometheus, and added a ServiceMonitor object:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  selector:
    matchLabels:
      app: myapp-api
  endpoints:
  - port: myapp-api
    path: /api/metrics
    interval: 10s
Prometheus discovers all the pods running instances of the API, and I can query those metrics from the Prometheus graph. So far so good.
The issue is, those metrics are aggregate - each API instance/pod doesn't have its own queue, so there's no reason to collect those values from every instance. In fact it seems to invite confusion - if Prometheus collects the same value from 10 pods, it looks like the total value is 10x what it really is, unless you know to apply something like avg().
Is there a way to either tell Prometheus "this value is already aggregate and should always be presented as such" or better yet, tell Prometheus to just scrape the values once via the internal load balancer for that service, rather than hitting each pod?
Edit: The actual API is just a simple Deployment object:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp-api
  template:
    metadata:
      labels:
        app: myapp-api
    spec:
      imagePullSecrets:
      - name: mysecret
      containers:
      - name: myapp-api
        image: myregistry/myapp:2.0
        ports:
        - containerPort: 80
        volumeMounts:
        - name: config
          mountPath: "app/config.yaml"
          subPath: config.yaml
      volumes:
      - name: config
        configMap:
          name: myapp-api-config
In your case, to avoid the misleading aggregation you can either apply the avg() operator in your queries, as already mentioned in your post, or use a PodMonitor instead of a ServiceMonitor.
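For the avg() option, you don't have to remember to wrap every query by hand: a recording rule can publish a pre-averaged series. Here's a minimal sketch, assuming your Prometheus picks up PrometheusRule objects; the object name, rule names, and the release: kube-prometheus label are assumptions, so match the labels to whatever your Prometheus ruleSelector actually selects:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: myapp-queue-rules
  labels:
    release: kube-prometheus   # assumption: adjust to your Prometheus ruleSelector
spec:
  groups:
  - name: myapp-queues
    rules:
    # Collapse the identical per-pod copies into a single series
    - record: myapp:first_queue_length:avg
      expr: avg without(instance, pod) (myapp_first_queue_length)
    - record: myapp:second_queue_length:avg
      expr: avg without(instance, pod) (myapp_second_queue_length)

You can then graph and alert on the recorded series instead of the raw per-pod metric.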
The PodMonitor custom resource definition (CRD) allows you to declaratively define how a dynamic set of pods should be monitored. Which pods are selected for monitoring with the desired configuration is defined using label selectors.
Note that a PodMonitor still scrapes every pod its selector matches; if you point it at a single dedicated pod (for example, a one-replica exporter with its own label), Prometheus will scrape the metrics from that pod only.
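If you go the PodMonitor route, a minimal sketch could look like the following. The http port name is an assumption: podMetricsEndpoints.port refers to a named container port, so the Deployment's containerPort: 80 would need name: http added; the path and interval are copied from your existing ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  selector:
    matchLabels:
      app: myapp-api   # still matches every pod with this label; narrow it to a dedicated pod to scrape once
  podMetricsEndpoints:
  - port: http         # assumes the Deployment's containerPort 80 is named "http"
    path: /api/metrics
    interval: 10s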