I am trying to expose custom metrics from a Kubernetes application for consumption by the Horizontal Pod Autoscaler (HPA). The metrics are app-wide rather than per-instance (chiefly the length of the backlog work queue, which comes from the shared backend and is the same for all worker instances), so I split the application into two deployments: k8worker holds the "worker bee" instances that are to be autoscaled, and k8worker-metrics is a separate deployment that only provides the metrics. The purpose of the split is to keep the number of k8worker-metrics instances small, so that if k8worker is scaled to 30 instances, Prometheus does not have to scrape 30 endpoints to get the very same metrics. I intend to keep k8worker-metrics at 2 or 3 instances (more than one, for the sake of resilience).
However, I have been unable to integrate the metrics into the HPA.
At first I tried to base the HPA manifest on the metrics scraped from the k8worker-metrics pods. I can see the metric readings arriving in custom.metrics.k8s.io:
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    ....
    {
      "name": "pods/k8worker_work_queue_length",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    ....
  ]
}
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/k8worker_work_queue_length" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/k8worker_work_queue_length"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "k8worker-metrics-758656df7-x67mh",
        "apiVersion": "/v1"
      },
      "metricName": "k8worker_work_queue_length",
      "timestamp": "2020-03-03T04:06:47Z",
      "value": "8"
    },
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "k8worker-metrics-758656df7-58dxb",
        "apiVersion": "/v1"
      },
      "metricName": "k8worker_work_queue_length",
      "timestamp": "2020-03-03T04:06:47Z",
      "value": "8"
    }
  ]
}
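To double-check that both metrics pods really report the same reading, the response above can be reduced to a pod-to-value map. This is only a minimal Python sketch: the helper name `readings_by_pod` is made up for illustration, the sample is trimmed from the output above, and it assumes plain integer values (the API can also return suffixed quantities such as `500m`, which this does not handle).

```python
import json

# Trimmed copy of the MetricValueList shown above.
SAMPLE = """
{
  "kind": "MetricValueList",
  "items": [
    {"describedObject": {"kind": "Pod", "name": "k8worker-metrics-758656df7-x67mh"},
     "metricName": "k8worker_work_queue_length", "value": "8"},
    {"describedObject": {"kind": "Pod", "name": "k8worker-metrics-758656df7-58dxb"},
     "metricName": "k8worker_work_queue_length", "value": "8"}
  ]
}
"""


def readings_by_pod(raw):
    """Map each described object's name to its metric value (hypothetical helper)."""
    doc = json.loads(raw)
    # Assumes plain integer strings; suffixed quantities like "500m" are not handled.
    return {item["describedObject"]["name"]: int(item["value"]) for item in doc["items"]}


if __name__ == "__main__":
    readings = readings_by_pod(SAMPLE)
    print(readings)
    # A single distinct value means every metrics pod sees the same queue length.
    print("consistent:", len(set(readings.values())) == 1)
```

In practice the input would be the saved output of the `kubectl get --raw` call rather than the inline sample.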
However, if I reference the metric in the HPA manifest as
  - type: Object
    object:
      metric:
        name: k8worker_work_queue_length
      describedObject:
        apiVersion: custom.metrics.k8s.io/v1beta1
        kind: Pod
        name: k8worker-metrics
      target:
        type: Value
        value: 4
then I get the following message:
$ kubectl describe hpa
....
"FailedGetObjectMetric ... unable to get metric k8worker_work_queue_length: Pod on default k8worker-metrics/unable to fetch metrics from custom metrics API: unable to map kind Pod.custom.metrics.k8s.io to resource: no matches for kind "Pod" in group "custom.metrics.k8s.io"
I also tried exposing the k8worker-metrics pods as a service (k8worker-metrics-svc), hoping I might be able to refer to it in the HPA manifest as
  - type: Object
    object:
      metric:
        name: k8worker_work_queue_length
      describedObject:
        apiVersion: custom.metrics.k8s.io/v1beta1
        kind: Service
        name: k8worker-metrics-svc
      target:
        type: Value
        value: 4
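Both manifests set `describedObject.apiVersion` to `custom.metrics.k8s.io/v1beta1`. If I read the "unable to map kind Pod.custom.metrics.k8s.io" error literally, the group part of that field seems to be combined with the kind before the lookup. The Python sketch below only mimics that split to show which pair each manifest would be asking for; `split_api_version` and `group_kind` are hypothetical helpers of mine, not Kubernetes client calls, and the `v1` line is there purely for comparison.

```python
def split_api_version(api_version):
    """Split 'group/version' into (group, version); the core group is ''."""
    group, sep, version = api_version.rpartition("/")
    return (group, version) if sep else ("", api_version)


def group_kind(described_object):
    """Render a describedObject as 'Kind.group', the form used in the error text."""
    group, _ = split_api_version(described_object["apiVersion"])
    # Core kinds (apiVersion 'v1') carry no group suffix.
    return f'{described_object["kind"]}.{group}' if group else described_object["kind"]


if __name__ == "__main__":
    pod_ref = {"apiVersion": "custom.metrics.k8s.io/v1beta1", "kind": "Pod"}
    svc_ref = {"apiVersion": "custom.metrics.k8s.io/v1beta1", "kind": "Service"}
    print(group_kind(pod_ref))
    print(group_kind(svc_ref))
    # For comparison only: a core-group reference.
    print(group_kind({"apiVersion": "v1", "kind": "Pod"}))
```

Whether this is really what the HPA controller does internally is an assumption on my part.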
However, no readings for the service appear under /apis/custom.metrics.k8s.io:
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep queue_length
"name": "namespaces/k8worker_work_queue_length",
"name": "pods/k8worker_work_queue_length",
"name": "jobs.batch/k8worker_work_queue_length",
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/k8worker-metrics-svc/k8worker_work_queue_length"
Error from server (NotFound): the server could not find the metric k8worker_work_queue_length for services
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep serv
[empty output]
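The two greps above can be folded into one check that lists, for a given metric, the resource types the adapter advertises. A minimal Python sketch, assuming the discovery document has been saved from `kubectl get --raw`; `resources_for_metric` is a hypothetical helper, and the inline sample holds only the three entries that the grep printed.

```python
import json

# Only the entries that matched the grep above; the real list is longer.
SAMPLE = json.dumps({
    "kind": "APIResourceList",
    "resources": [
        {"name": "namespaces/k8worker_work_queue_length"},
        {"name": "pods/k8worker_work_queue_length"},
        {"name": "jobs.batch/k8worker_work_queue_length"},
    ],
})


def resources_for_metric(raw, metric):
    """Return the sorted resource types that advertise `metric` (hypothetical helper)."""
    found = []
    for res in json.loads(raw)["resources"]:
        # Entries are named '<resource>/<metric>'.
        resource, _, name = res["name"].partition("/")
        if name == metric:
            found.append(resource)
    return sorted(found)


if __name__ == "__main__":
    resources = resources_for_metric(SAMPLE, "k8worker_work_queue_length")
    print(resources)
    print("services advertised:", "services" in resources)
```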
The service and its readings are nevertheless visible in the Prometheus UI (to capture the screenshots, I left both the pod and the service annotated for scraping).
Any advice would be very much appreciated.