We are receiving a KubeAPILatencyHigh error in Opsgenie

4/27/2020

Hi, we are getting the error below in Opsgenie:

Error: The API server has a 99th percentile latency of 5.21555555555553 seconds for GET nodes.

Please help us resolve this issue.

Description: 1.[INT_Prometheus]: [FIRING:3] KubeAPILatencyHigh (https apiserver default monitoring/k8s 0.99 kubernetes 3)

Source: /#/alerts?receiver=opsgenie
Integration: INT_PROMETHEUS (Prometheus)
Responders: FCA_EMEA
Owner Team: FCA_EMEA
Alias:
- alertname = KubeAPILatencyHigh
- endpoint = https
- job = apiserver
- namespace = default
- prometheus = monitoring/k8s
- quantile = 0.99
- resource = nodes
- scope = cluster
- service = kubernetes
- severity = 3
- verb = GET
Last Updated At: Apr 28, 2020 7:59 AM

Description

Alerts Firing:

Labels:
- alertname = KubeAPILatencyHigh
- endpoint = https
- job = apiserver
- namespace = default
- prometheus = monitoring/k8s
- quantile = 0.99
- resource = nodes
- scope = cluster
- service = kubernetes
- severity = 3
- verb = GET
Annotations:
- message = The API server has a 99th percentile latency of 5.21555555555553 seconds for GET nodes.

Labels:
- alertname = KubeAPILatencyHigh
- endpoint = https
- job = apiserver
- namespace = default
- prometheus = monitoring/k8s
- quantile = 0.99
- resource = pods
- scope = namespace
- service = kubernetes
- severity = 3
- verb = GET
Annotations:
- message = The API server has a 99th percentile latency of 8 seconds for GET pods.

Labels:
- alertname = KubeAPILatencyHigh
- endpoint = https
- job = apiserver
- namespace = default
- prometheus = monitoring/k8s
- quantile = 0.99
- resource = pods
- scope = namespace
- service = kubernetes
- severity = 3
- subresource = status
- verb = PUT
Annotations:
- message = The API server has a 99th percentile latency of 8 seconds for PUT pods.

-- Ravikumar G
google-kubernetes-engine
kubernetes

1 Answer

4/27/2020

The Kubernetes API server uses etcd as the backing store for all Kubernetes objects, so slow etcd reads and writes show up directly as API server latency. I would start by looking at the logs from the etcd server. Also set up alerts on EtcdHighCommitDurations, EtcdHighFsyncDurations, and EtcdHighNumberOfFailedGRPCRequests, so you know whether there is an ongoing issue with etcd.
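
To make the "99th percentile latency" figure in these alerts more concrete, here is a minimal Python sketch of how a quantile can be estimated from cumulative histogram buckets, in the spirit of PromQL's histogram_quantile. It is a simplified illustration, not Prometheus's actual implementation. The bucket counts and the 0.5 second threshold are invented for the example and are not taken from the cluster in the question; the function name and constants are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound_seconds, count) buckets.

    Simplified sketch: the lowest bucket is assumed to start at 0 and values
    are assumed to be spread evenly inside each bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if not buckets:
        return math.nan
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the last (+Inf) bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the last finite bound, so the best
                # available estimate is that bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Invented sample data: cumulative counts of disk fsync durations.
SAMPLE_FSYNC_BUCKETS = [(0.25, 900), (0.5, 980), (1.0, 1000), (math.inf, 1000)]
# Assumed threshold for illustration only; check your own alert rules.
ASSUMED_THRESHOLD_SECONDS = 0.5

p99 = histogram_quantile(0.99, SAMPLE_FSYNC_BUCKETS)
print(f"p99 = {p99:.3f}s, above threshold: {p99 > ASSUMED_THRESHOLD_SECONDS}")
```

In a real cluster the buckets would come from Prometheus metrics such as etcd_disk_wal_fsync_duration_seconds and etcd_disk_backend_commit_duration_seconds, and the comparison would be done by the alert rule itself rather than by hand.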

-- Arghya Sadhu
Source: StackOverflow