Prometheus statsd-exporter - how to tag status code in request duration metric (histogram)

12/19/2019

I have set up statsd-exporter to receive metrics from a gunicorn web server. My goal is to restrict the request duration metric to successful requests (non-5xx). However, statsd-exporter offers no way to tag the duration metric with the status code. Can anyone suggest a way to add the status code to the request duration metric, or a way to filter for successful request durations in Prometheus?

In particular, I want to export a histogram of successful request durations from statsd-exporter to Prometheus.

-- Santosh Kumar
kubernetes
prometheus
statsd

1 Answer

12/19/2019

To export a request duration histogram for successful requests from the gunicorn web server to Prometheus, you would need to add this functionality to the gunicorn source code.

First, take a look at the code that exports statsd metrics here. You should see this piece of code:

status = resp.status
...
self.histogram("gunicorn.request.duration", duration_in_ms)

Change the code to something like this:

self.histogram("gunicorn.request.duration.%d" % status, duration_in_ms)

From then on, the exported metric names will carry the status code, for example gunicorn_request_duration_200 or gunicorn_request_duration_404.
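As a minimal sketch of what that change puts on the wire, the snippet below builds the statsd line for one timing sample with the status code appended to the metric name. The helper name timing_line is hypothetical and is not part of gunicorn; the name:value|ms layout is the usual statsd timer format.

```python
def timing_line(status: int, duration_in_ms: float) -> str:
    """Build a statsd timer line whose metric name ends in the status code.

    Hypothetical helper for illustration only; it is not part of gunicorn.
    """
    name = "gunicorn.request.duration.%d" % status
    # "<name>:<value>|ms" is the statsd wire format for a timer sample.
    return "%s:%s|ms" % (name, duration_in_ms)


print(timing_line(200, 12.5))  # gunicorn.request.duration.200:12.5|ms
print(timing_line(404, 3.0))   # gunicorn.request.duration.404:3.0|ms
```

Each distinct status code therefore becomes its own statsd metric name, which is what the exporter sees.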

You can also go a step further and move the status code into a label by adding a mapping like the one below to your statsd_exporter configuration:

mappings:
  - match: gunicorn.request.duration.*
    name: "gunicorn_http_request_duration"
    labels:
      status: "$1"
      job: "gunicorn_request_duration"
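To make the effect of this mapping concrete, here is a rough Python imitation of it. It assumes that "*" in a glob match stands for exactly one dot-separated component and that "$1" refers to the first such capture. The function apply_mapping is hypothetical and is not how statsd_exporter is implemented.

```python
import re

# Regex equivalent of the glob "gunicorn.request.duration.*":
# the trailing "*" captures one dot-separated component.
PATTERN = re.compile(r"gunicorn\.request\.duration\.([^.]+)")


def apply_mapping(statsd_name: str):
    """Return (metric_name, labels) if the mapping applies, else None.

    Illustrative stand-in for the mapping above, not statsd_exporter code.
    """
    m = PATTERN.fullmatch(statsd_name)
    if m is None:
        return None  # the mapping does not apply to this name
    labels = {"status": m.group(1), "job": "gunicorn_request_duration"}
    return "gunicorn_http_request_duration", labels


print(apply_mapping("gunicorn.request.duration.200"))
print(apply_mapping("gunicorn.request.duration"))
```

The second call returns None because the name has no status component, so a metric sent without the suffix would not pick up this mapping.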

Your metrics will then look like this (statsd_exporter exposes statsd timers as summaries by default, hence the quantile labels):

# HELP gunicorn_http_request_duration Metric autogenerated by statsd_exporter.
# TYPE gunicorn_http_request_duration summary
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.5"} 2.4610000000000002e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.9"} 2.4610000000000002e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.99"} 2.4610000000000002e-06
gunicorn_http_request_duration_sum{job="gunicorn_request_duration",status="200"} 2.4610000000000002e-06
gunicorn_http_request_duration_count{job="gunicorn_request_duration",status="200"} 1
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.5"} 3.056e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.9"} 3.056e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.99"} 3.056e-06
gunicorn_http_request_duration_sum{job="gunicorn_request_duration",status="404"} 3.056e-06
gunicorn_http_request_duration_count{job="gunicorn_request_duration",status="404"} 1

To query all series except those with a 5xx status in Prometheus, you can run:

gunicorn_http_request_duration{status=~"[^5].*"}
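The small check below shows which status values that matcher lets through. It relies on the fact that Prometheus anchors label regexes at both ends, and it uses Python's re.fullmatch as a stand-in for Prometheus's RE2 engine, which behaves the same for a pattern this simple. The function name is_selected is made up for this example.

```python
import re


def is_selected(status: str, pattern: str = "[^5].*") -> bool:
    """Return True if a label value passes a fully anchored regex matcher."""
    # fullmatch mirrors the implicit ^...$ anchoring of Prometheus matchers.
    return re.fullmatch(pattern, status) is not None


for status in ("200", "404", "500", "503"):
    print(status, is_selected(status))
```

This prints True for 200 and 404 and False for 500 and 503. For three-digit status codes, the negative matcher status!~"5.." selects the same series and states the intent more directly.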

Let me know if it was helpful.

-- HelloWorld
Source: StackOverflow