How to build a PromQL expression for the time a Kubernetes container takes from start to its first successful probe?

8/25/2021

I'm looking to build a PromQL expression that could be used in Grafana and in Prometheus alerting rules (routed through Alertmanager), and that would show the time a container running in a Kubernetes Pod takes to pass its very first readiness probe. I expect it to show a new value on each new deployment or Pod/container restart.

Metrics that I'm trying to make use of:

  • prober_probe_total{pod="foo", probe_type="Readiness", result="successful"} - counter, total number of successful probes
  • container_start_time_seconds{pod="foo",container="bar"} - gauge(?), timestamp, updates on Pod or container restart

Both of those metrics come from kubelet and show useful information separately.

Here's my current failing attempt (gives no results):

timestamp(prober_probe_total{pod="foo", probe_type="Readiness", result="successful"} == 0)
-
container_start_time_seconds{pod="bar", container="bar"}

There are several issues with my approach:

  1. timestamp() doesn't work the way I expect it to.
  2. Container or Pod restarts are not taken into account in my query.
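Another likely reason for the empty result is label matching: a binary operator between two instant vectors only pairs series whose label sets are identical, and the probe counter carries labels such as probe_type and result that the start-time gauge does not have. The `== 0` filter probably never matches either, assuming the counter series for result="successful" only appears once the first probe has succeeded. A minimal, untested sketch of a direction that avoids both problems is shown below. It assumes that both metrics share the namespace, pod and container labels, that each side yields a single series per container, and that the 1h window and 15s resolution (both picked arbitrarily here) cover the container start:

```promql
# Earliest sample timestamp of the "successful readiness probe" counter
# seen within the last hour (subquery at an assumed 15s resolution) ...
min_over_time(
  timestamp(
    prober_probe_total{pod="foo", probe_type="Readiness", result="successful"}
  )[1h:15s]
)
# ... minus the container start time, matched only on the labels
# that both metrics are assumed to share.
- on (namespace, pod, container)
container_start_time_seconds{pod="foo", container="bar"}
```

The result can only be as precise as the scrape interval, because the earliest sample is the first scrape after the probe succeeded, not the probe itself. The sketch also does not handle container restarts within the same Pod: the counter series keeps its earliest timestamp while container_start_time_seconds moves forward, so the difference would become negative.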

Are there any better approaches to achieve the desired result via metrics and PromQL?

-- Brad Tris
kubernetes
kubernetes-pod
metrics
prometheus
promql

0 Answers