Kubernetes AWS Cloudwatch adapter not fetching custom metric value for EKS HPA autoscaling

8/26/2020

I'm trying to enable AWS EKS autoscaling based on a custom CloudWatch metric via the Kubernetes CloudWatch adapter. I have pushed custom metrics to AWS CloudWatch and verified that they appear in the CloudWatch console and can be retrieved with the boto3 client's get_metric_data. This is the code I use to publish my custom metric to CloudWatch:

import boto3
from datetime import datetime

client = boto3.client('cloudwatch')

cloudwatch_response = client.put_metric_data(
    Namespace='TestMetricNS',
    MetricData=[
        {
            'MetricName': 'TotalUnprocessed',
            'Timestamp': datetime.now(),
            'Value': 40,
            'Unit': 'Megabytes',
        }
    ]
)

I have the following YAML files for setting up the external metric and the HPA in Kubernetes:

extMetricCustom.yaml:

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: test-custom-metric
spec:
  name: test-custom-metric
  resource:
    resource: "deployment"
  queries:
    - id: sqs_test
      metricStat:
        metric:
          namespace: "TestMetricNS"
          metricName: "TotalUnprocessed"
        period: 60
        stat: Average
        unit: Megabytes
      returnData: true

hpaCustomMetric.yaml:

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: test-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: sqs-consumer
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metricName: test-custom-metric
      targetAverageValue: 2

When I check whether the Kubernetes CloudWatch adapter is picking up my custom metric (kubectl get hpa), the metric always shows as 0:

NAME          REFERENCE                 TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
test-scaler   Deployment/sqs-consumer   0/2 (avg)   1         4         1          161m

How can I properly autoscale based on my CloudWatch custom metric?

-- rlindeborg
amazon-cloudwatch
amazon-eks
autoscaling
hpa
kubernetes

1 Answer

9/4/2020

Worked with OP on this out-of-band and still had the tab open for this question later in the day, so posting the outcome here for posterity for anyone that stumbles upon it.

The root cause of the issue was a timezone mismatch. The metrics monitor was looking for "current" metrics, but the following line from the metric generator script was producing timestamps in the local timezone, with no timezone specified:

            'Timestamp': datetime.now(),

Since there was "no data" for the current time (only data X hours in the past due to a -X UTC offset), the system did not initiate scaling because the value was effectively "0"/nil/null. Instead, a UTC timestamp can be specified to ensure the generated metrics are timely:

            'Timestamp': datetime.utcnow(),
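
Note that datetime.utcnow() still returns a naive datetime (no tzinfo attached) and is deprecated as of Python 3.12. A minimal sketch of a timezone-aware alternative follows; build_metric_datum is a hypothetical helper name, not something from the original script, and the values are the ones from the question:

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit):
    """Build one MetricData entry with an explicit, timezone-aware UTC timestamp.

    Hypothetical helper for illustration; not part of the original script.
    """
    return {
        'MetricName': name,
        # Aware timestamp: the UTC offset travels with the value, so the
        # result does not depend on the timezone of the machine running this.
        'Timestamp': datetime.now(timezone.utc),
        'Value': value,
        'Unit': unit,
    }


datum = build_metric_datum('TotalUnprocessed', 40, 'Megabytes')
print(datum['Timestamp'].isoformat())
```

The returned dict can be passed as the single element of MetricData in the put_metric_data call from the question.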

A secondary consideration was that the Kubernetes nodes need permission to poll the metrics from CloudWatch. This is done by attaching this policy to the nodes' IAM role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricData"
            ],
            "Resource": "*"
        }
    ]
}
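
To check both points (timely datapoints and read permission), it can help to query a short, recent window with the same metric definition as the ExternalMetric in the question. This is only a sketch: recent_values is a hypothetical helper, and the five-minute window is an assumption for illustration, not the adapter's documented behavior.

```python
from datetime import datetime, timedelta, timezone


def recent_values(client, minutes=5):
    """Return datapoints from the last `minutes` minutes for the question's metric.

    `client` is expected to be a boto3 CloudWatch client. The window length
    is an assumed value for illustration.
    """
    end = datetime.now(timezone.utc)
    response = client.get_metric_data(
        MetricDataQueries=[{
            'Id': 'sqs_test',
            'MetricStat': {
                'Metric': {
                    'Namespace': 'TestMetricNS',
                    'MetricName': 'TotalUnprocessed',
                },
                'Period': 60,
                'Stat': 'Average',
                'Unit': 'Megabytes',
            },
            'ReturnData': True,
        }],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
    )
    # One query was sent, so the first result holds its datapoints.
    return response['MetricDataResults'][0]['Values']


# Usage (needs AWS credentials and the boto3 package):
#   import boto3
#   print(recent_values(boto3.client('cloudwatch')))
```

An empty list here, while the console shows datapoints hours earlier, points to the timestamp problem described above; an access-denied error points to the missing IAM permission.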
-- Chase
Source: StackOverflow