I'm looking to build a PromQL expression that could be used in Grafana and Prometheus Alertmanager and would show the time it takes a container running in a Kubernetes Pod to pass its very first readiness probe. I expect it to show a new value after each new deployment or Pod/container restart.
Metrics that I'm trying to make use of:
prober_probe_total{pod="foo", probe_type="Readiness", result="successful"}
- counter, total number of successful probes
container_start_time_seconds{pod="foo",container="bar"}
- gauge(?), timestamp, updates on Pod or container restart
Both of those metrics come from kubelet and show useful information separately.
Here's my current failing attempt (gives no results):
timestamp(prober_probe_total{pod="foo", probe_type="Readiness", result="successful"} == 0)
-
container_start_time_seconds{pod="foo", container="bar"}
There are several issues with my approach:
1. timestamp() doesn't work the way I expect it to.
2. There can be container or Pod restarts that are not taken into account in my query.
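To make the first issue more concrete, here is a rough sketch of the direction I've been considering, not a working solution. It assumes that both metrics carry matching `pod` and `container` labels and that the `result="successful"` series first appears at the first successful probe; I haven't verified either for every kubelet version. The 1h window is arbitrary.

```promql
# Sketch only: earliest sample timestamp of the "successful" readiness series
# within the last hour, minus the container start time.
# Assumption: both metrics expose matching `pod` and `container` labels.
min_over_time(
  timestamp(
    prober_probe_total{pod="foo", container="bar", probe_type="Readiness", result="successful"}
  )[1h:]
)
- on (pod, container)
container_start_time_seconds{pod="foo", container="bar"}
```

Even if this returns something, the result can only be as precise as the scrape interval, and it still doesn't account for restarts, since the probe counter series may outlive a container restart while the start time changes.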
Are there any better approaches to achieve the desired result via metrics and PromQL?