I have an application running in Kubernetes (Azure AKS) in which each pod contains two containers. I also have Grafana set up to display various metrics, some of which come from Prometheus. While troubleshooting a separate issue, I've noticed that some metrics don't appear to match up between data sources.
For example, kube_deployment_status_replicas_available returns the value 30, whereas kubectl -n XXXXXXXX get pod lists 100 pods, all of which are Running, and kube_deployment_status_replicas_unavailable returns a value of 0. Also, if I get the deployment in question using kubectl, I see the expected value:
$ kubectl get deployment XXXXXXXX
NAME       DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
XXXXXXXX   100       100       100          100         49d
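To be concrete, the comparison boils down to roughly the following (the deployment and namespace names are redacted here, and the Prometheus address is a placeholder):

# Prometheus side: the availability metric, filtered to the one deployment in question.
$ curl -s 'http://prometheus.example:9090/api/v1/query' \
    --data-urlencode 'query=kube_deployment_status_replicas_available{namespace="XXXXXXXX",deployment="XXXXXXXX"}'
# -> value 30

# Kubernetes side: the pods actually Running behind the same deployment.
$ kubectl -n XXXXXXXX get pods --field-selector=status.phase=Running
# -> 100 pods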
There are other applications (namespaces) in the same cluster where all the values correlate correctly, so I'm not sure where the fault lies or whether there's any way to know for sure which value is correct. Any guidance would be appreciated. Thanks.
Since you have the kube_deployment_status_replicas_available metric at all, I assume Prometheus is scraping your metrics from kube-state-metrics. It sounds like there's something quirky about that particular kube-state-metrics deployment or the way it's being scraped.
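One way to narrow it down is to scrape kube-state-metrics directly and see whether the odd value originates there or somewhere between it and Grafana. A rough sketch, assuming kube-state-metrics is exposed as a Service named kube-state-metrics on port 8080 in the kube-system namespace (adjust to wherever yours actually lives):

# Forward a local port to kube-state-metrics (service name, namespace, and port are assumptions).
$ kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080

# In another terminal, pull the raw metric straight from kube-state-metrics and compare it
# with what Prometheus/Grafana report for the same deployment.
$ curl -s http://localhost:8080/metrics | grep kube_deployment_status_replicas_available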
I would:
- Check the kube-state-metrics logs with kubectl logs.
- Run kube-state-metrics with the --log.level=debug flag and compare the values again (a sketch of both steps is below).
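A minimal sketch of those two steps, assuming kube-state-metrics runs as a Deployment named kube-state-metrics in the kube-system namespace (adjust names, namespace, and flags to your setup):

# Read the kube-state-metrics logs (deployment name and namespace are assumptions).
$ kubectl -n kube-system logs deploy/kube-state-metrics

# Raise verbosity by adding the logging flag to the container args; --log.level=debug is the
# flag mentioned above, but check which logging flags your kube-state-metrics version actually
# supports (some versions may use klog's --v=<level> instead).
$ kubectl -n kube-system edit deploy/kube-state-metrics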
Hope it helps.