I'm trying to deploy Prometheus with the Prometheus Operator on a fresh Kubernetes cluster, using the following files:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: prometheus-operator
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      priorityClassName: "operator-critical"
      tolerations:
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoSchedule"
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoExecute"
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
        image: quay.io/coreos/prometheus-operator:v0.29.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
      nodeSelector:
      serviceAccountName: prometheus-operator
Now I want to apply this file (a Prometheus custom resource):
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    prometheus: prometheus
spec:
  replicas: 1
  priorityClassName: "operator-critical"
  serviceAccountName: prometheus
  nodeSelector:
    worker.garden.sapcloud.io/group: operator
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      role: observeable
  tolerations:
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoSchedule"
  - key: "WorkGroup"
    operator: "Equal"
    value: "operator"
    effect: "NoExecute"
Before that, I created the CRDs from here:
https://github.com/coreos/prometheus-operator/tree/master/example/prometheus-operator-crd
The problem is that the pods are not able to start (0/2), see the picture below. What could be the problem? Please advise.
Update:
When I look at the events of the Prometheus operator, I see the following error from the replicaset-controller: Error creating: pods "prometheus-operator-6944778645-" is forbidden: no PriorityClass with name operator-critical was found
Any idea?
You are referencing the operator-critical priority class, but no such class exists in the cluster, so the ReplicaSet controller cannot create the pods. Priority classes determine the scheduling priority of pods relative to other pods, including which pods may be preempted when resources are short.
To fix this issue, either remove the explicit priority class (priorityClassName: "operator-critical") from both files or create the operator-critical class:
apiVersion: scheduling.k8s.io/v1beta1
kind: PriorityClass
metadata:
name: operator-critical
value: 1000000
globalDefault: false
description: "Critical operator workloads"
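On newer clusters the beta API group used above may no longer be served: scheduling.k8s.io/v1 has been available since Kubernetes 1.14, and v1beta1 was removed in 1.22. A minimal sketch of the same class with the stable API version, keeping the name, value, and description from the manifest above:

```yaml
# Same PriorityClass as above, declared with the stable API version.
# Use this form if the cluster rejects scheduling.k8s.io/v1beta1.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: operator-critical   # must match priorityClassName in both manifests
value: 1000000              # higher values mean higher scheduling priority
globalDefault: false        # pods without a priorityClassName are unaffected
description: "Critical operator workloads"
```

PriorityClass is a cluster-scoped resource, so it takes no namespace and has to exist before the Deployment and the Prometheus resource that reference it can create pods.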
Prometheus and Alertmanager pods need volumes to store their data. If you configure persistent storage, make sure the persistent volumes (PVs) exist and are bound to the claims used by the respective pods. Alternatively, you can run those pods with ephemeral storage. Then it should work.
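As a rough sketch of the two storage options, assuming the storage section of the Prometheus resource is used to configure them. The storage class name standard and the size 10Gi are placeholders that do not come from the question, and the two documents are alternatives for the same resource, so only one of them should be applied:

```yaml
# Option A: persistent storage through a volume claim template.
# "standard" and "10Gi" are placeholder values; adjust them to the cluster.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: standard
        resources:
          requests:
            storage: 10Gi
---
# Option B: ephemeral storage; metrics are lost when the pod is rescheduled.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  storage:
    emptyDir: {}
```

With option A, the resulting claim stays Pending until a matching volume is available or can be provisioned, which in turn keeps the Prometheus pod from starting.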