Kubernetes HPA fails to detect a successfully published custom metric from Stackdriver

4/7/2019

I'm trying to scale a Kubernetes Deployment using a HorizontalPodAutoscaler, which listens to a custom metrics through Stackdriver.

I'm having a GKE cluster, with a Stackdriver adapter enabled. I'm able to publish the custom metric type to Stackdriver, and following is the way it's being displayed in Stackdriver's Metric Explorer.

enter image description here

enter image description here

This is how I have defined my HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

After successfully creating example-hpa, executing kubectl get hpa example-hpa, always shows TARGETS as <unknown>, and never detects the value from custom metrics.

NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/test-app-group-1-1   <unknown>/400   1         10        1          18m

I'm using a Java client which runs locally to publish my custom metrics. I have given the appropriate resource labels as mentioned here (hard coded - so that it can run without a problem in local environment). I have followed this document to create the Java client.

private static MonitoredResource prepareMonitoredResourceDescriptor() {
        Map<String, String> resourceLabels = new HashMap<>();
        resourceLabels.put("project_id", "<<<my-project-id>>>);
        resourceLabels.put("pod_id", "<my pod UID>");
        resourceLabels.put("container_name", "");
        resourceLabels.put("zone", "asia-southeast1-b");
        resourceLabels.put("cluster_name", "my-cluster");
        resourceLabels.put("namespace_id", "mynamespace");
        resourceLabels.put("instance_id", "");

        return MonitoredResource.newBuilder()
                .setType("gke_container")
                .putAllLabels(resourceLabels)
                .build();
    }

What am I doing wrong in the above-mentioned steps please? Thank you in advance for any answers provided!


EDIT [RESOLVED]: I think I have had some misconfigurations, since kubectl describe hpa [NAME] --v=9 showed me some 403 status code, as well as I was using type: External instead of type: Pods (Thanks MWZ for your answer, pointing out this mistake).

I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically everything from the beginning again). Then I changed my yaml file as follows, exactly as this document explains.

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-group-1-1
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-app-group-1-1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods                 # Earlier this was type: External
    pods:                      # Earlier this was external:
      metricName: baz                               # metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetAverageValue: 20

I'm now exporting as custom.googleapis.com/baz, and NOT as custom.googleapis.com/worker_pod_metrics/baz. Also, now I'm explicitly specifying the namespace for my HPA in the yaml.

-- Senthuran Ambalavanar
google-cloud-stackdriver
google-kubernetes-engine
kubernetes
stackdriver

2 Answers

4/8/2019

Since you can see your custom metric in Stackdriver GUI I'm guessing metrics are correctly exported. Based on Autoscaling Deployments with Custom Metrics I believe you wrongly defined metric to be used by HPA to scale the deployment.

Please try using this YAML:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: baz
      targetAverageValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

Please have in mind that:

The HPA uses the metrics to compute an average and compare it to the target average value. In the application-to-Stackdriver export example, a Deployment contains Pods that export metric. The following manifest file describes a HorizontalPodAutoscaler object that scales a Deployment based on the target average value for the metric.

Troubleshooting steps described on the page above can also be useful.

Side-note Since above HPA is using beta API autoscaling/v2beta1 I got error when running kubectl describe hpa [DEPLOYMENT_NAME]. I ran kubectl describe hpa [DEPLOYMENT_NAME] --v=9 and got response in JSON.

-- MWZ
Source: StackOverflow

4/10/2019

It is a good practice to put some unique labels to target your metrics. Right now, based on metrics labelled in your java client, only pod_id looks unique which can't be used due to its stateless nature.

So, I would suggest you try introducing a deployment/metrics wide unqiue identifier.

resourceLabels.put("<identifier>", "<could-be-deployment-name>");

After this, you can try modifying your HPA with something similar to following:

kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      metricSelector:
        matchLabels:
          # define labels to target
          metric.labels.identifier: <deployment-name>
      # scale +1 whenever it crosses multiples of mentioned value
      targetAverageValue: "400"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

Apart from this, this setup has no issues and should work smooth.

Helper command to see what metrics are exposed to HPA :

 kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|worker_pod_metrics|baz" | jq
-- daemonsl
Source: StackOverflow