How to use the latency of a service deployed on Kubernetes to scale the deployment?

10/18/2019

I have a simple Spring Boot application deployed on Kubernetes on GCP. The service is exposed on an external IP address. I am load testing this application using JMeter. It is just an HTTP GET request that returns True or False.

I want to collect the latency metrics over time and feed them to a HorizontalPodAutoscaler to implement a custom autoscaler. How do I implement this?

-- anushiya-thevapalan
google-cloud-platform
google-cloud-stackdriver
kubernetes
latency
prometheus

2 Answers

10/18/2019

Since you mentioned a custom autoscaler, I would suggest this simple solution, which makes use of some tools you might already have.

First part: create a service, cron job, or any time-based trigger that makes requests to your deployed application at a regular interval. This trigger then stores the resulting metrics in persistent storage, a file, a database, etc.

For example, if you use the simple ApacheBench CLI tool (you can also use JMeter or any other load-testing tool that generates structured output), you will get a detailed result for a single query. Use this link to look through a sample result for reference.
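As a minimal sketch of this first part (Python standard library only; the URL and the percentile choice are my own assumptions, not part of the answer), a probe could time a GET request and aggregate the samples before storing them:

```python
import time
import urllib.request

def measure_latency_ms(url):
    """Time one HTTP GET against the service and return the latency in ms."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()  # drain the body so the full response is timed
    return (time.perf_counter() - start) * 1000.0

def p95(samples):
    """95th-percentile latency from a list of samples (nearest-rank method)."""
    ordered = sorted(samples)
    index = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[index]
```

A cron-driven script could call `measure_latency_ms` in a loop and append the `p95` of each batch to a file or database, as described above.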

Second part: the same script can also trigger another event that checks the measured latency against the response-time limit configured per your requirements. If the response time is above the configured value, scale up; if it is below, scale down.

The logic for scaling down can be less trivial than this, but I will leave that to you.
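The check-and-scale decision above could be as simple as the following sketch (the 0.5 scale-down threshold and the replica bounds are arbitrary assumptions, added here as crude hysteresis to avoid flapping):

```python
def desired_replicas(current, latency_ms, target_ms,
                     min_replicas=1, max_replicas=10):
    """Step up by one replica when latency exceeds the target; step down
    only when latency is comfortably below it (crude hysteresis)."""
    if latency_ms > target_ms:
        return min(current + 1, max_replicas)
    if latency_ms < 0.5 * target_ms:
        return max(current - 1, min_replicas)
    return current
```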

Now, for actually scaling the deployment, you can use the Kubernetes API. You can refer to the official docs or this answer for details.
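For the actual scaling call, the Deployment's scale subresource can be PATCHed. A sketch that only builds the request (the namespace and deployment names are placeholders; actually sending it requires the cluster's API endpoint and a service-account token, which are omitted here):

```python
import json

API_SCALE_PATH = "/apis/apps/v1/namespaces/{ns}/deployments/{name}/scale"

def scale_request(namespace, deployment, replicas):
    """Build the path and merge-patch body for the scale subresource.
    Send it as PATCH with Content-Type: application/merge-patch+json."""
    path = API_SCALE_PATH.format(ns=namespace, name=deployment)
    body = json.dumps({"spec": {"replicas": replicas}})
    return path, body

# The equivalent one-liner from a shell would be:
#   kubectl scale deployment my-app --replicas=4
```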

-- damitj07
Source: StackOverflow

10/18/2019

There are two ways to auto scale with custom metrics:

1. You can export a custom metric from every Pod in the Deployment and target the average value per Pod.
2. You can export a custom metric from a single Pod outside of the Deployment and target the total value.

So follow these steps:

1. To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter. To run the adapter, you must first grant your user the ability to create the required authorization roles by running the following command:

kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

Then, to deploy the adapter:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml

2. You can export your metrics to Stackdriver either directly from your application, or by exposing them in Prometheus format and adding the Prometheus-to-Stackdriver adapter to your Pod's containers.

You can view the exported metrics from the Metrics Explorer by searching for custom/[METRIC_NAME]
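Exposing a metric in Prometheus text format (the second option in step 2) boils down to serving plain-text lines from a `/metrics` endpoint. A minimal rendering helper as a sketch; the metric name `request_latency_ms` is a placeholder of my own:

```python
def prometheus_gauge(name, value, labels=None):
    """Render one gauge sample in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join('{}="{}"'.format(k, v)
                         for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return "# TYPE {} gauge\n{}{} {}\n".format(name, name, label_str, value)
```

Any HTTP server in the Pod can serve this string from `/metrics`; the Prometheus-to-Stackdriver sidecar then scrapes and forwards it.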

Your metric needs to meet the following requirements:

  • Metric kind must be GAUGE
  • Metric type can be either DOUBLE or INT64
  • Metric name must start with custom.googleapis.com/ prefix, followed by a simple name
  • Resource type must be "gke_container"
  • Resource labels must include:
    • pod_id set to Pod UID, which can be obtained via the Downward API
    • container_name = ""
    • project_id, zone, cluster_name, which can be obtained by your application from the metadata server. To get values, you can use Google Cloud's compute metadata client.
    • namespace_id, instance_id, which can be set to any value

3. Once you have exported metrics to Stackdriver, you can deploy an HPA to scale your Deployment based on the metrics.
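Step 3 can then be a manifest like the following sketch (the HPA, Deployment, and metric names are placeholders; `autoscaling/v2beta1` with a `Pods` metric source was the form used for custom metrics on GKE at the time):

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: latency-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment            # placeholder Deployment name
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metricName: request_latency_ms   # exported as custom.googleapis.com/request_latency_ms
      targetAverageValue: 200
```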

View this on GitHub for additional code samples.

-- Rishit Dagli
Source: StackOverflow