Custom CloudWatch metrics with the EKS CloudWatch Agent

9/10/2019

I have set up Container Insights as described in the documentation.

Is there a way to remove some of the metrics sent over to CloudWatch?

Details:

I have a small cluster (3 client-facing namespaces, ~8 services per namespace) with some custom monitoring, logging, etc. in their own separate namespaces, and I just want to use CloudWatch for critical client-facing metrics.

The problem I am having is that the agent sends over 500 metrics to CloudWatch, whereas I am really only interested in a few of the important ones, especially as AWS bills per metric.

Is there any way to limit which metrics get sent to CloudWatch?

It would be especially helpful if I could send metrics only from certain namespaces, for example, excluding the kube-system namespace.

My configmap is:

  cwagentconfig.json: |
    {
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "cluster_name": "*****",
            "metrics_collection_interval": 60
          }
        },
        "force_flush_interval": 5
      }
    }

I have searched for a while now, but couldn't really find anything on:

        "metrics_collected": {
          "kubernetes": {
-- devops to dev
amazon-cloudwatch-metrics
eks
kubernetes

1 Answer

10/4/2019

I've looked as best I can and you're right, there's little or nothing to find on this topic. Before I make the obvious-but-unhelpful suggestions of either using Prometheus or asking on the AWS forums, here's a quick look at what the CloudWatch agent actually does.

The CloudWatch agent gets container metrics either from cAdvisor, which runs as part of the kubelet on each node, or from the Kubernetes metrics-server API (which also gets its metrics from the kubelet and cAdvisor). cAdvisor is well documented, and it's likely that the CloudWatch agent uses the Prometheus-format metrics cAdvisor produces to construct its own list of metrics.

Unfortunately that's just a guess, since the CloudWatch agent doesn't seem to be open source, so there's no way to check. It might be possible to set a 'measurement' option within the kubernetes section and select metrics by their Prometheus names, but that's probably not supported. (If you do ask AWS, the Premium Support team is supposed to keep an eye on the forums, so you might get lucky and get an answer without paying for support.)
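Just to make that concrete, here is roughly what such an option might look like if the kubernetes section accepted a "measurement" list the way the agent's cpu and mem sections do on plain EC2. To be clear, this is purely speculative: it isn't documented anywhere I can find, and the agent will most likely just ignore it.

  cwagentconfig.json: | # "measurement" here is a speculative key, not a documented option
    {
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "cluster_name": "*****",
            "metrics_collection_interval": 60,
            "measurement": ["pod_cpu_utilization", "pod_memory_utilization"]
          }
        },
        "force_flush_interval": 5
      }
    }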

So, if you can't cut down the metrics created by Container Insights, what are your other options? Prometheus is easy to deploy, and you can set up recording rules to cut down on the number of metrics it actually saves. It doesn't push to CloudWatch by default, but you can keep the metrics locally if you have some space on your nodes, or use a remote storage service like MetricFire (the company I work for, to be clear!), which provides Grafana to go along with it. You can also export metrics from CloudWatch and use Prometheus as your single source of truth, but that means more storage on your cluster.
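On the recording rules point: they are just a small block of YAML in your Prometheus rules file. As a rough sketch (the rule names and aggregations here are only examples, adjust them to whatever you actually care about), something like this pre-aggregates the raw per-container cAdvisor series into one series per namespace:

  groups:
    - name: namespace-aggregates
      rules:
        # per-namespace CPU usage rate over 5 minutes
        - record: namespace:container_cpu_usage_seconds:rate5m
          expr: sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))
        # per-namespace working set memory
        - record: namespace:container_memory_working_set_bytes:sum
          expr: sum by (namespace) (container_memory_working_set_bytes)

You can then dashboard and alert on the aggregated series and keep a short retention for the raw per-container ones to save storage.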

If you prefer to view your metrics in CloudWatch, there are tools like prometheus-to-cloudwatch which scrape Prometheus endpoints and send the data on to CloudWatch, much like (I'm guessing) the CloudWatch agent does. That service has include and exclude settings for deciding which metrics are sent to CloudWatch.
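As a rough sketch of how that wiring looks (the image and environment variable names here are from memory of the project's README, so treat them as assumptions and double-check them there), you point the container at a Prometheus scrape URL and give it include/exclude regexes:

  containers:
    - name: prometheus-to-cloudwatch
      # image and env var names are assumptions based on the project's README, verify before use
      image: cloudposse/prometheus-to-cloudwatch
      env:
        - name: CLOUDWATCH_NAMESPACE
          value: "MyCluster"
        - name: CLOUDWATCH_REGION
          value: "eu-west-1"
        - name: PROMETHEUS_SCRAPE_URL
          value: "http://my-service:9090/metrics"
        # regexes deciding which scraped metrics get forwarded to CloudWatch
        - name: INCLUDE_METRICS
          value: "^pod_(cpu|memory)_.*"
        - name: EXCLUDE_METRICS
          value: ".*kube_system.*"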

I've written a blog post on EKS Architecture and Monitoring in case that's of any help to you. Good luck, and let us know which option you go for!

-- Shevaun Frazier
Source: StackOverflow