{__name__="kube_pod_status_phase",namespace="mynamespace",phase="Failed"}
When I run the aforementioned PromQL query, it also returns pods that have never failed (what I see in Kubernetes). The pod is in running status and has never failed or restarted. How is Prometheus marking it as "Failed"?
According to this, the kube_pod_status_phase
metric is an enum metric, and exposes a time-series for each possible state. However, only the one the Pod is in has the value of 1
.
So if you want to see actually Failed
Pods, execute this query:
kube_pod_status_phase{namespace="mynamespace",phase="Failed"} == 1