Kubernetes HPA with metrics from Prometheus Node-Exporter

10/12/2018

I have a Kubernetes v1.12.1 cluster running some of my workloads. I would like to set up an HPA so that I can scale a particular pod based on metrics coming from Prometheus Node-Exporter.

My first question: is it even possible to drive an HPA from metrics outside of the 'POD' metric namespace? If so, here is the rest of what I am trying to do. I have set up Prometheus Node-Exporter to collect machine/node metrics and send them to Prometheus. Prometheus exposes these to Kubernetes via the Prometheus adapter. I want to autoscale pods based on one of these node metric values.

For example, if node_netstat_Udp_NoPorts >= '1', I want to scale out an additional pod. Likewise, if node_sockstat_udp_mem >= '87380', I want to scale out and also perform a slight kernel-level modification on the host.

The problem is that I cannot find any example of how to set up an HPA for a pod where the custom metric is not a part of the 'POD' metrics namespace.

As you can see in my API get command below, those metrics are exposed to me.

<pre>
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1|jq .|grep -i udp
  "name": "jobs.batch/node_netstat_Udp_InErrors",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp6_NoPorts",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_UdpLite6_InErrors",
  "name": "jobs.batch/node_netstat_Udp_InDatagrams",
  "name": "jobs.batch/node_sockstat_UDP_mem_bytes",
  "name": "jobs.batch/node_sockstat_UDP_inuse",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp_InDatagrams",
  "name": "jobs.batch/node_sockstat_UDP_mem",
  "name": "jobs.batch/node_netstat_Udp_NoPorts",
  "name": "roles.rbac.authorization.k8s.io/node_sockstat_UDP_mem",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp_NoPorts",
  "name": "jobs.batch/node_netstat_Udp6_OutDatagrams",
  "name": "jobs.batch/node_netstat_Udp6_NoPorts",
  "name": "jobs.batch/node_netstat_UdpLite6_InErrors",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp6_InErrors",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp6_InDatagrams",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp6_OutDatagrams",
  "name": "roles.rbac.authorization.k8s.io/node_sockstat_UDP_inuse",
  "name": "roles.rbac.authorization.k8s.io/node_sockstat_UDP_mem_bytes",
  "name": "jobs.batch/node_netstat_Udp6_InDatagrams",
  "name": "jobs.batch/node_netstat_Udp_OutDatagrams",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_UdpLite_InErrors",
  "name": "jobs.batch/node_netstat_UdpLite_InErrors",
  "name": "roles.rbac.authorization.k8s.io/node_sockstat_UDPLITE_inuse",
  "name": "jobs.batch/node_netstat_Udp6_InErrors",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp_OutDatagrams",
  "name": "jobs.batch/node_sockstat_UDPLITE_inuse",
  "name": "roles.rbac.authorization.k8s.io/node_netstat_Udp_InErrors"
</pre>

I just do not understand how to add one of them to an HPA descriptor:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: atl
  namespace: blackhole
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: awesome-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource: ????????
      name: ???????????
      target: ???????????

If anyone could help point me in the right direction that would be great.

Thanks!

-- Rory Savage
kubernetes
prometheus

1 Answer

10/12/2018

The documentation is a bit sketchy, but I believe you would use something like this:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: atl
  namespace: blackhole
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: awesome-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        name: node_sockstat_UDP_inuse
      describedObject:
        apiVersion: batch/v1
        kind: Job
        name: your-job-name
      target:
        type: Value
        value: 20

As per the docs, metrics of type: Resource are by default limited to cpu and memory.
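
Before applying the HPA, it may help to check that the adapter actually returns a value for the object the HPA describes. The following is a minimal sketch, assuming the adapter serves namespaced object metrics under the usual custom metrics API path layout; the namespace, object name, and metric are taken from the example above (your-job-name is still a placeholder), and the variable names are purely illustrative.

```shell
#!/bin/sh
# Hypothetical values: substitute your own namespace, object, and metric.
NAMESPACE="blackhole"
RESOURCE="jobs.batch"            # resource.group, as shown in the discovery output
OBJECT="your-job-name"           # placeholder, same as in the HPA above
METRIC="node_sockstat_UDP_inuse"

# Assumed path layout for a metric attached to a namespaced object.
METRIC_PATH="/apis/custom.metrics.k8s.io/v1beta1/namespaces/${NAMESPACE}/${RESOURCE}/${OBJECT}/${METRIC}"
echo "${METRIC_PATH}"

# Query the adapter only when kubectl is available on this machine.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get --raw "${METRIC_PATH}" | jq .
fi
```

If the query returns an error or an empty item list, the HPA would most likely be unable to read the metric either, so the adapter configuration is probably the first thing to look at.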

-- Rico
Source: StackOverflow