I have set up statsd-exporter to collect metrics from a gunicorn web server. My goal is to restrict the request duration metric to successful (non-5xx) requests only. However, statsd-exporter gives me no way to tag the duration metric with the status code. Can anyone suggest a way to add the status code to the request duration metric, or a way to filter for successful request durations in Prometheus?
In particular, I want to export a histogram of successful request durations from statsd-exporter to Prometheus.
To export successful request duration histogram metrics from the gunicorn web server to Prometheus, you would need to add this functionality to the gunicorn source code.
First, take a look at the gunicorn code that exports statsd metrics. You should see this piece of code:
status = resp.status
...
self.histogram("gunicorn.request.duration", duration_in_ms)
By changing the code to something like this:
self.histogram("gunicorn.request.duration.%d" % status, duration_in_ms)
from this point on the exported metric names will include the status code, e.g. gunicorn_request_duration_200 or gunicorn_request_duration_404.
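As a minimal, self-contained sketch of the renaming idea: `FakeStatsd` and `record_duration` below are hypothetical stand-ins (not gunicorn's logger class or API) that only record what would be sent, so the naming logic can be tried without gunicorn installed. It assumes the status may arrive either as an integer or as a string such as "404 Not Found".

```python
class FakeStatsd:
    """Hypothetical stand-in for a statsd logger; records calls instead of sending them."""

    def __init__(self):
        self.sent = []  # list of (metric_name, value) tuples

    def histogram(self, name, value):
        self.sent.append((name, value))

    def record_duration(self, status, duration_in_ms):
        # Assumption: status is an int or a string like "404 Not Found".
        if isinstance(status, str):
            status = int(status.split(None, 1)[0])
        # Append the numeric status code as the last dotted component.
        self.histogram("gunicorn.request.duration.%d" % status, duration_in_ms)


stats = FakeStatsd()
stats.record_duration(200, 12.5)
stats.record_duration("404 Not Found", 3.0)
for name, value in stats.sent:
    print(name, value)
```

Keeping the status as the last dotted component is what lets a single glob mapping in statsd_exporter pick it up later.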
You can also go one step further and move the status code into a label by adding a mapping like the one below to your statsd_exporter configuration:
mappings:
  - match: gunicorn.request.duration.*
    name: "gunicorn_http_request_duration"
    labels:
      status: "$1"
      job: "gunicorn_request_duration"
So your metrics will now look like this:
# HELP gunicorn_http_request_duration Metric autogenerated by statsd_exporter.
# TYPE gunicorn_http_request_duration summary
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.5"} 2.4610000000000002e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.9"} 2.4610000000000002e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="200",quantile="0.99"} 2.4610000000000002e-06
gunicorn_http_request_duration_sum{job="gunicorn_request_duration",status="200"} 2.4610000000000002e-06
gunicorn_http_request_duration_count{job="gunicorn_request_duration",status="200"} 1
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.5"} 3.056e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.9"} 3.056e-06
gunicorn_http_request_duration{job="gunicorn_request_duration",status="404",quantile="0.99"} 3.056e-06
gunicorn_http_request_duration_sum{job="gunicorn_request_duration",status="404"} 3.056e-06
gunicorn_http_request_duration_count{job="gunicorn_request_duration",status="404"} 1
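To make the mapping step more concrete, here is a rough Python emulation of how a glob rule can turn the dotted StatsD name into a metric name plus labels. This is a simplified sketch, not statsd_exporter's actual matcher; `glob_to_regex` and `apply_mapping` are hypothetical helper names, and it assumes each `*` captures exactly one dot-separated component.

```python
import re


def glob_to_regex(pattern):
    """Turn a dotted glob such as 'a.b.*' into a regex; each '*' captures one component."""
    parts = [r"([^.]+)" if p == "*" else re.escape(p) for p in pattern.split(".")]
    return re.compile(r"\.".join(parts))


def apply_mapping(metric, match, name, labels):
    """Return (name, labels) with $1, $2, ... filled in, or None if the glob does not match."""
    m = glob_to_regex(match).fullmatch(metric)
    if m is None:
        return None
    resolved = {}
    for key, template in labels.items():
        # Replace $N with the N-th captured component.
        resolved[key] = re.sub(r"\$(\d+)", lambda g: m.group(int(g.group(1))), template)
    return name, resolved


result = apply_mapping(
    "gunicorn.request.duration.404",
    match="gunicorn.request.duration.*",
    name="gunicorn_http_request_duration",
    labels={"status": "$1", "job": "gunicorn_request_duration"},
)
print(result)
```

A name without a trailing status component, such as gunicorn.requests, does not match this rule and is left to whatever other mappings or defaults apply.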
And now, to query all series except those with a 5xx status in Prometheus, you can run:
gunicorn_http_request_duration{status=~"[^5].*"}
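If you want to sanity-check the label regex outside Prometheus, a quick sketch like the one below can help. It assumes Python's `re.fullmatch` is a fair analogue here, since Prometheus anchors label regexes at both ends; the list of status values is made up for illustration.

```python
import re

# Prometheus anchors label regexes at both ends, so fullmatch is the closest analogue.
matcher = re.compile(r"[^5].*")

# Illustrative status label values (assumed, not taken from real traffic).
statuses = ["200", "301", "404", "500", "503"]
kept = [s for s in statuses if matcher.fullmatch(s)]
print(kept)  # ['200', '301', '404']
```

Assuming three-digit status codes, a negative matcher such as status!~"5.." expresses the same intent and may read more directly.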
Let me know if it was helpful.