Accessing heapster metrics in Kubernetes code

11/4/2015

I would like to expand or shrink the number of kubelets used by a Kubernetes cluster based on resource usage. I have been looking at the code and have some idea of how to implement it at a high level.

I am stuck on 2 things:

  1. What would be a good way to access the cluster metrics (via Heapster)? Should I use kubedns to find the Heapster endpoint and query the API directly, or is there some other way? Also, I am not sure how to use kubedns to get the Heapster URL in the former case.

  2. The rescheduler, which expands/shrinks the number of nodes, will need to kick in every 30 minutes. What is the best way to do that? Is there an interface in the code that I can use for it, or should I write a code segment that gets called every 30 minutes and put it in the main loop?

Any help would be greatly appreciated :)

-- Peeyush
google-compute-engine
kubernetes

1 Answer

11/4/2015

Part 1:

What you said about using kubedns to find heapster and querying that REST API is fine.

You could also write a client interface that abstracts the interface to heapster -- that would help with unit testing.
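As a rough illustration of that idea, here is a minimal sketch of such an abstraction. All names in it (MetricsSource, NodeUsage, fakeSource, AverageCPU) are hypothetical and not part of Kubernetes or Heapster; a real implementation of the interface would wrap the Heapster REST calls, while tests use the in-memory fake.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeUsage is a hypothetical summary of one node's CPU usage.
type NodeUsage struct {
	Node          string
	CPUMillicores int64
}

// MetricsSource is a hypothetical abstraction over Heapster.
// Production code would implement it with calls to the Heapster REST API.
type MetricsSource interface {
	NodeUsages() ([]NodeUsage, error)
}

// fakeSource is an in-memory MetricsSource intended for unit tests.
type fakeSource struct {
	usages []NodeUsage
	err    error
}

func (f fakeSource) NodeUsages() ([]NodeUsage, error) { return f.usages, f.err }

// AverageCPU returns the mean CPU usage across all nodes, in millicores.
// It depends only on the interface, so it can be tested without a cluster.
func AverageCPU(src MetricsSource) (int64, error) {
	usages, err := src.NodeUsages()
	if err != nil {
		return 0, err
	}
	if len(usages) == 0 {
		return 0, errors.New("no node metrics available")
	}
	var total int64
	for _, u := range usages {
		total += u.CPUMillicores
	}
	return total / int64(len(usages)), nil
}

func main() {
	src := fakeSource{usages: []NodeUsage{
		{Node: "node-a", CPUMillicores: 400},
		{Node: "node-b", CPUMillicores: 600},
	}}
	avg, err := AverageCPU(src)
	fmt.Println(avg, err)
}
```

The scaling logic only sees MetricsSource, so swapping the fake for a Heapster-backed implementation should not require changes to the decision code.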

Take a look at this metrics client: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podautoscaler/metrics/metrics_client.go It doesn't do exactly what you want: it gets per-Pod stats instead of per-cluster or per-node stats. But you could modify it.

In function getForPods, you can see the code that resolves the heapster service and connects to it here:

        resultRaw, err := h.client.Services(h.heapsterNamespace).
            ProxyGet(h.heapsterService, metricPath, map[string]string{"start": startTime.Format(time.RFC3339)}).
            DoRaw()

where heapsterNamespace is "kube-system" and heapsterService is "heapster".

That metrics client is part of the "horizontal pod autoscaler" implementation. It is solving a slightly different problem, but you should take a look at it if you haven't already. It is described here: https://github.com/kubernetes/kubernetes/blob/master/docs/design/horizontal-pod-autoscaler.md

FYI: The Heapster REST API is defined here: https://github.com/kubernetes/heapster/blob/master/docs/model.md You should poke around and see if there are node-level or cluster-level CPU metrics that work for you.
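Once you have the raw bytes from a metric query, decoding them could look roughly like the sketch below. It assumes the response is JSON with a "metrics" array of timestamp/value points; treat that shape and the field names as an assumption and confirm them against model.md for your Heapster version. The type and function names (metricPoint, metricResult, latestValue) are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// metricPoint is one sample. The JSON field names are assumed, not verified.
type metricPoint struct {
	Timestamp time.Time `json:"timestamp"`
	Value     int64     `json:"value"`
}

// metricResult is the assumed top-level shape of a metric response.
type metricResult struct {
	Metrics []metricPoint `json:"metrics"`
}

// latestValue decodes raw and returns the value of the newest sample.
// The boolean is false when the response contains no samples.
func latestValue(raw []byte) (int64, bool, error) {
	var res metricResult
	if err := json.Unmarshal(raw, &res); err != nil {
		return 0, false, err
	}
	if len(res.Metrics) == 0 {
		return 0, false, nil
	}
	latest := res.Metrics[0]
	for _, p := range res.Metrics[1:] {
		if p.Timestamp.After(latest.Timestamp) {
			latest = p
		}
	}
	return latest.Value, true, nil
}

func main() {
	// Hand-written sample payload, not captured from a real cluster.
	raw := []byte(`{"metrics":[
		{"timestamp":"2015-11-04T10:00:00Z","value":120},
		{"timestamp":"2015-11-04T10:01:00Z","value":180}]}`)
	v, ok, err := latestValue(raw)
	fmt.Println(v, ok, err)
}
```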

Part 2:

There is no standard interface for shrinking nodes. It is different for each cloud provider. And if you are on-premises, then you can't shrink nodes.

Related discussion: https://github.com/kubernetes/kubernetes/issues/11935
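For the "every 30 minutes" part of the question, one possible shape is a ticker loop that calls a provider-specific resize hook. This is only a sketch using the Go standard library, not an existing Kubernetes interface: NodeScaler, recordingScaler, decide, and runEvery are hypothetical names, and the 30/80 percent thresholds are arbitrary. Each cloud provider would need its own NodeScaler implementation.

```go
package main

import (
	"fmt"
	"time"
)

// NodeScaler is a hypothetical hook; each cloud provider would need its own implementation.
type NodeScaler interface {
	Resize(delta int) error
}

// recordingScaler only records requests, which keeps the demo free of cloud calls.
type recordingScaler struct{ deltas []int }

func (r *recordingScaler) Resize(delta int) error {
	r.deltas = append(r.deltas, delta)
	return nil
}

// decide returns +1 to add a node, -1 to remove one, or 0 to do nothing.
func decide(cpuPercent, low, high int) int {
	switch {
	case cpuPercent > high:
		return 1
	case cpuPercent < low:
		return -1
	default:
		return 0
	}
}

// runEvery calls step on every tick until stop is closed.
func runEvery(interval time.Duration, stop <-chan struct{}, step func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			step()
		case <-stop:
			return
		}
	}
}

func main() {
	rec := &recordingScaler{}
	var scaler NodeScaler = rec
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		// A real loop would use 30*time.Minute; a short interval keeps the demo quick.
		runEvery(10*time.Millisecond, stop, func() {
			// 90 stands in for a CPU percentage fetched from Heapster.
			if d := decide(90, 30, 80); d != 0 {
				if err := scaler.Resize(d); err != nil {
					fmt.Println("resize failed:", err)
				}
			}
		})
	}()
	time.Sleep(35 * time.Millisecond)
	close(stop)
	<-done
	fmt.Println("resize requests:", rec.deltas)
}
```

Keeping the decision in a pure function like decide makes it testable without timers, and the stop channel lets the loop shut down cleanly.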

Side note: Among Kubernetes developers, we typically use the term "rescheduler" when talking about something that rebalances pods across machines by removing a pod from one machine and creating the same kind of pod on another machine. That is different from what you are talking about building. We haven't built a rescheduler yet, but there is an outline of how to build one here: https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/rescheduler.md

-- Eric Tune
Source: StackOverflow