Heapster stops retrieving CPU metrics from a pod when it is under load

1/23/2018

My environment:

  • Kubernetes 1.8.4 on AWS, deployed with kops.
  • Heapster 1.5.0 with the InfluxDB sink and a 60s metric resolution.

When my pods are idle or under low traffic, everything is fine: my HPAs can get data from Heapster, and I can see the data in Grafana, which reads it from InfluxDB.

When I start load testing a pod (putting it under some traffic, starting at 10 requests/second), I stop getting CPU usage data in Grafana, and the HPAs start reporting this:

Events:
  Type     Reason                        Age               From                       Message
  ----     ------                        ----              ----                       -------
  Warning  FailedGetResourceMetric       2m (x13 over 3h)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from heapster
  Warning  FailedComputeMetricsReplicas  2m (x13 over 3h)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from heapster

As soon as the load test finishes, CPU metrics come back almost immediately, both in InfluxDB and in the HPAs. Note that during the same period I never lose memory usage data.
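
To make the pattern above concrete (CPU samples missing during the load window while memory samples keep arriving), here is a minimal sketch of how the gap could be confirmed. It assumes the sample timestamps have already been exported (for example from InfluxDB) as ISO-8601 strings. The function name `find_gaps`, the tolerance factor, and the sample data are all hypothetical and only illustrate the check against the 60s resolution.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, resolution_s=60, tolerance=1.5):
    """Return (start, end) pairs where two consecutive samples are
    further apart than tolerance * resolution_s seconds."""
    parsed = sorted(
        datetime.strptime(t, "%Y-%m-%dT%H:%M:%SZ") for t in timestamps
    )
    limit = timedelta(seconds=resolution_s * tolerance)
    return [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a > limit]


# Hypothetical data: CPU samples are missing between 10:01 and 10:06,
# while memory samples arrive every minute from 10:00 to 10:07.
cpu = [
    "2018-01-23T10:00:00Z",
    "2018-01-23T10:01:00Z",
    "2018-01-23T10:06:00Z",
    "2018-01-23T10:07:00Z",
]
mem = [f"2018-01-23T10:0{m}:00Z" for m in range(8)]

cpu_gaps = find_gaps(cpu)
mem_gaps = find_gaps(mem)

for start, end in cpu_gaps:
    print(f"CPU samples missing between {start} and {end}")
print(f"memory gaps found: {len(mem_gaps)}")
```

If the CPU gaps line up with the load test window and the memory series shows none, that would support the idea that only the CPU metric is being dropped rather than the whole pod scrape.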

Any help with troubleshooting or solving this would be much appreciated.

For the record, I also opened an issue on Heapster's GitHub: https://github.com/kubernetes/heapster/issues/1937

-- whites11
heapster
kubernetes

0 Answers