I want my prometheus server to scrape metrics from a pod.
I followed these steps:
kubectl apply -f sample-app.deploy.yaml
kubectl apply -f sample-app.service.yaml
helm upgrade -i prometheus prometheus-community/prometheus -f prometheus-values.yaml
kubectl apply -f service-monitor.yaml
to add a target for prometheus.All pods are running, but when I open prometheus dashboard, I don't see sample-app service as prometheus target, under status>targets in dashboard UI.
I've verified following:
sample-app
when I execute kubectl get servicemonitors
I can see sample-app exposes metrics in prometheus format under at /metrics
At this point I debugged further, entered into the prometheus pod using
kubectl exec -it pod/prometheus-server-65b759cb95-dxmkm -c prometheus-server sh
, and saw that proemetheus configuration (/etc/config/prometheus.yml) didn't have sample-app as one of the jobs so I edited the configmap using
kubectl edit cm prometheus-server -o yaml
Added
- job_name: sample-app
static_configs:
- targets:
- sample-app:8080
Assuming all other fields such as scraping interval, scrape_timeout stays default.
I can see the same has been reflected in /etc/config/prometheus.yml, but still prometheus dashboard doesn't show sample-app
as targets under status>targets.
following are yamls for prometheus-server and service monitor.
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
autopilot.gke.io/resource-adjustment: '{"input":{"containers":[{"name":"prometheus-server-configmap-reload"},{"name":"prometheus-server"}]},"output":{"containers":[{"limits":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"requests":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"name":"prometheus-server-configmap-reload"},{"limits":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"requests":{"cpu":"500m","ephemeral-storage":"1Gi","memory":"2Gi"},"name":"prometheus-server"}]},"modified":true}'
deployment.kubernetes.io/revision: "1"
meta.helm.sh/release-name: prometheus
meta.helm.sh/release-namespace: prom
creationTimestamp: "2021-06-24T10:42:31Z"
generation: 1
labels:
app: prometheus
app.kubernetes.io/managed-by: Helm
chart: prometheus-14.2.1
component: server
heritage: Helm
release: prometheus
name: prometheus-server
namespace: prom
resourceVersion: "6983855"
selfLink: /apis/apps/v1/namespaces/prom/deployments/prometheus-server
uid: <some-uid>
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: prometheus
component: server
release: prometheus
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: prometheus
chart: prometheus-14.2.1
component: server
heritage: Helm
release: prometheus
spec:
containers:
- args:
- --volume-dir=/etc/config
- --webhook-url=http://127.0.0.1:9090/-/reload
image: jimmidyson/configmap-reload:v0.5.0
imagePullPolicy: IfNotPresent
name: prometheus-server-configmap-reload
resources:
limits:
cpu: 500m
ephemeral-storage: 1Gi
memory: 2Gi
requests:
cpu: 500m
ephemeral-storage: 1Gi
memory: 2Gi
securityContext:
capabilities:
drop:
- NET_RAW
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/config
name: config-volume
readOnly: true
- args:
- --storage.tsdb.retention.time=15d
- --config.file=/etc/config/prometheus.yml
- --storage.tsdb.path=/data
- --web.console.libraries=/etc/prometheus/console_libraries
- --web.console.templates=/etc/prometheus/consoles
- --web.enable-lifecycle
image: quay.io/prometheus/prometheus:v2.26.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /-/healthy
port: 9090
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 15
successThreshold: 1
timeoutSeconds: 10
name: prometheus-server
ports:
- containerPort: 9090
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /-/ready
port: 9090
scheme: HTTP
initialDelaySeconds: 30
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 4
resources:
limits:
cpu: 500m
ephemeral-storage: 1Gi
memory: 2Gi
requests:
cpu: 500m
ephemeral-storage: 1Gi
memory: 2Gi
securityContext:
capabilities:
drop:
- NET_RAW
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/config
name: config-volume
- mountPath: /data
name: storage-volume
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 65534
runAsGroup: 65534
runAsNonRoot: true
runAsUser: 65534
seccompProfile:
type: RuntimeDefault
serviceAccount: prometheus-server
serviceAccountName: prometheus-server
terminationGracePeriodSeconds: 300
volumes:
- configMap:
defaultMode: 420
name: prometheus-server
name: config-volume
- name: storage-volume
persistentVolumeClaim:
claimName: prometheus-server
status:
availableReplicas: 1
conditions:
- lastTransitionTime: "2021-06-24T10:43:25Z"
lastUpdateTime: "2021-06-24T10:43:25Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2021-06-24T10:42:31Z"
lastUpdateTime: "2021-06-24T10:43:25Z"
message: ReplicaSet "prometheus-server-65b759cb95" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 1
readyReplicas: 1
replicas: 1
updatedReplicas: 1
yaml for service Monitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"monitoring.coreos.com/v1","kind":"ServiceMonitor","metadata":{"annotations":{},"creationTimestamp":"2021-06-24T07:55:58Z","generation":1,"labels":{"app":"sample-app","release":"prometheus"},"name":"sample-app","namespace":"prom","resourceVersion":"6884573","selfLink":"/apis/monitoring.coreos.com/v1/namespaces/prom/servicemonitors/sample-app","uid":"34644b62-eb4f-4ab1-b9df-b22811e40b4c"},"spec":{"endpoints":[{"port":"http"}],"selector":{"matchLabels":{"app":"sample-app","release":"prometheus"}}}}
creationTimestamp: "2021-06-24T07:55:58Z"
generation: 2
labels:
app: sample-app
release: prometheus
name: sample-app
namespace: prom
resourceVersion: "6904642"
selfLink: /apis/monitoring.coreos.com/v1/namespaces/prom/servicemonitors/sample-app
uid: <some-uid>
spec:
endpoints:
- port: http
selector:
matchLabels:
app: sample-app
release: prometheus
You need to use the prometheus-community/kube-prometheus-stack
chart, which includes the Prometheus operator, in order to have Prometheus' configuration update automatically based on ServiceMonitor resources.
The prometheus-community/prometheus
chart you used does not include the Prometheus operator that watches for ServiceMonitor resources in the Kubernetes API and updates the Prometheus server's ConfigMap accordingly.
It seems that you have the necessary CustomResourceDefinitions (CRDs) installed in your cluster, otherwise you would not have been able to create a ServiceMonitor resource. These are not included in the prometheus-community/prometheus
chart so perhaps they were added to your cluster previously.