How to obtain real-time Kubernetes pod metrics with a poll interval of 2 seconds?

11/19/2021

My use case is to make a Kubernetes pod's metrics available for consumption every 2 seconds. Polling the pod at this interval is required to maintain a healthy control plane (to identify pods that may be choked and avoid routing traffic to those endpoints).

I'm using metrics-server right now, but it is not well suited to my use case. I came across the note below in the metrics-server documentation.

Metrics Server is not meant for non-autoscaling purposes. For example, don't use it to forward metrics to monitoring solutions, or as a source of monitoring solution metrics. In such cases please collect metrics from Kubelet /metrics/resource endpoint directly.

How often metrics are scraped? Default 60 seconds, can be changed using metric-resolution flag. We are not recommending setting values below 15s, as this is the resolution of metrics calculated by Kubelet.
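For reference, the Kubelet /metrics/resource endpoint mentioned in the note returns plain Prometheus text, so it can be read without metrics.k8s.io. One way to reach it is through the API server proxy path /api/v1/nodes/<node-name>/proxy/metrics/resource (for example with kubectl get --raw). Below is a rough sketch of turning two scrapes into a per-container CPU rate. The metric name container_cpu_usage_seconds_total and the namespace/pod/container labels are assumptions about what the endpoint exposes on a given cluster version, and the helper names are made up for illustration.

```python
import re

# One sample line of the Prometheus text format: name{labels} value [timestamp_ms]
_SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)(?:\s+(?P<ts>-?\d+))?$'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str, metric: str) -> dict:
    """Map (namespace, pod, container) -> (value, timestamp_ms or None) for one metric."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = _SAMPLE.match(line)
        if not m or m.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(m.group("labels") or ""))
        # Assumed label names; adjust to whatever your kubelet actually emits.
        key = (labels.get("namespace"), labels.get("pod"), labels.get("container"))
        ts = int(m.group("ts")) if m.group("ts") else None
        out[key] = (float(m.group("value")), ts)
    return out


def cpu_cores(prev: dict, curr: dict) -> dict:
    """Average cores used per container between two scrapes of a cumulative counter."""
    rates = {}
    for key, (value, ts) in curr.items():
        if key not in prev:
            continue
        prev_value, prev_ts = prev[key]
        # Skip samples whose timestamp did not advance (no new data yet)
        # and counters that reset (container restart).
        if ts is None or prev_ts is None or ts <= prev_ts or value < prev_value:
            continue
        rates[key] = (value - prev_value) / ((ts - prev_ts) / 1000.0)
    return rates
```

Note that scraping this every 2 seconds will mostly return samples with an unchanged timestamp, because the values only move when the Kubelet recomputes them, which is the resolution limit the note above describes.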

  1. How should one use the kubelet metrics endpoint directly? All examples I've come across use metrics.k8s.io.
  2. The other approach is to read /sys/fs/cgroup/cpu/cpuacct.usage from the Docker containers directly, but that requires an aggregation layer. How should one design this stats aggregation layer?
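To make the second approach concrete: cpuacct.usage is a cumulative count of CPU time in nanoseconds, so a usage figure only comes from the difference between two readings. A minimal sketch of the per-container sampling step follows, assuming the cgroup v1 path from the question is readable inside the container (cgroup v2 hosts lay this out differently). The function names are hypothetical, and how the samples are shipped to an aggregator is left open.

```python
import time
from pathlib import Path

# Path taken from the question; assumes cgroup v1.
CPUACCT_USAGE = Path("/sys/fs/cgroup/cpu/cpuacct.usage")


def read_cpu_ns(path: Path = CPUACCT_USAGE) -> int:
    """Return the cumulative CPU time in nanoseconds reported by cpuacct.usage."""
    return int(path.read_text().strip())


def cores_used(prev_ns: int, curr_ns: int, elapsed_s: float) -> float:
    """Average number of CPU cores consumed between two readings."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # If the counter went backwards the cgroup was recreated, so only the
    # current value counts as usage in this window.
    delta_ns = curr_ns - prev_ns if curr_ns >= prev_ns else curr_ns
    return delta_ns / (elapsed_s * 1e9)


def poll(interval_s: float = 2.0):
    """Yield one cores-used sample per interval, forever."""
    prev_ns, prev_t = read_cpu_ns(), time.monotonic()
    while True:
        time.sleep(interval_s)
        curr_ns, curr_t = read_cpu_ns(), time.monotonic()
        yield cores_used(prev_ns, curr_ns, curr_t - prev_t)
        prev_ns, prev_t = curr_ns, curr_t
```

Each value from poll() could then be pushed to, or scraped by, whatever aggregation layer is chosen. Using the measured elapsed time rather than the nominal 2 seconds keeps the rate correct when the sleep overruns.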

Are there other approaches? What is the best recommended way to address my requirement? Thanks.

-- Raji
docker-container
kubelet
kubernetes
kubernetes-pod
metrics-server

2 Answers

12/2/2021

I would like to expand a bit on @testfile's answer.

Adding a metrics endpoint to your workloads, scraping it with Prometheus, and finally using the Prometheus adapter to plug into HPA is a good idea.

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.

Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs.

The Prometheus documentation explains how to get started.

Based on this documentation, Prometheus Adapter can replace the metrics server on clusters that already run Prometheus and collect the appropriate metrics.

See also this article.

-- kkopczak
Source: StackOverflow

11/19/2021

I think the easiest way would be to add a /metrics endpoint to your workloads, scrape it with Prometheus, and then use the Prometheus adapter to plug into HPA.
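As a rough illustration of the first step, here is a minimal /metrics endpoint using only the Python standard library and the Prometheus text exposition format. The metric name app_inflight_requests, the STATE dict, and port 8000 are placeholders, not anything from the question; in a real workload a Prometheus client library would normally take care of this.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; the workload would update this as it handles requests.
STATE = {"inflight": 0}


def render_metrics(inflight: int) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    return (
        "# HELP app_inflight_requests Requests currently being handled.\n"
        "# TYPE app_inflight_requests gauge\n"
        f"app_inflight_requests {inflight}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(STATE["inflight"]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted (placeholder port)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() from a background thread in the workload exposes the gauge. Prometheus can then be pointed at it with a short scrape interval, and the adapter can surface the value to HPA.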

-- testfile
Source: StackOverflow