How to create a Prometheus alert for Kubernetes resource quotas
I can't find anything that covers this: I need to catch the resource quota error and raise an alert through Prometheus.
Has anyone run into this and found a solution?
For example, consider this error:
Error from server (Forbidden): error when creating "https://k8s.io/examples/admin/resource/quota-mem-cpu-pod-2.yaml": pods "quota-mem-cpu-demo-2" is forbidden: exceeded quota: mem-cpu-demo, requested: requests.memory=700Mi, used: requests.memory=600Mi, limited: requests.memory=1Gi
How can I alert on this error with Prometheus?
One option: if you run kube-state-metrics, there is a handy kube_resourcequota metric with the labels
resource=<ResourceName> (e.g. limits.cpu) and type=<quota-type> (hard or used).
You could experiment with alerting when a namespace goes over X% of its quota for a resource. The expression below fires when a namespace exceeds 90% of a ResourceQuota, and it covers every resource in the quota (limits.cpu, limits.memory, etc.).
If you administer a cluster with many teams, each with its own namespace, this lets you keep an eye on runaway resources (e.g. misconfigured CronJobs chewing up CPU) and notify the owning team.
100 * kube_resourcequota{type="used"} / ignoring(instance, job, type) (kube_resourcequota{type="hard"} > 0) > 90
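To turn the expression above into an actual alert, it can be wrapped in a Prometheus alerting rule. The following is a minimal sketch, not a tested configuration: the group name, alert name, `for` duration, and severity label are assumptions chosen for illustration. The `namespace`, `resourcequota`, and `resource` labels used in the annotations are the ones kube-state-metrics attaches to kube_resourcequota.

```yaml
# Sketch of a Prometheus rule file.
# Group name, alert name, "for" duration, and severity are illustrative assumptions.
groups:
  - name: resource-quota
    rules:
      - alert: NamespaceQuotaAlmostFull
        # Same expression as above: percentage of the hard quota currently used.
        expr: |
          100 * kube_resourcequota{type="used"}
            / ignoring(instance, job, type)
          (kube_resourcequota{type="hard"} > 0) > 90
        # Only fire if the condition holds continuously for this long.
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: 'Namespace {{ $labels.namespace }} is above 90% of quota {{ $labels.resourcequota }} for {{ $labels.resource }}'
          description: 'Usage is {{ $value | printf "%.1f" }}% of the hard limit.'
```

Note that this alerts on quota usage, not on the Forbidden error itself. In the error from the question, used is 600Mi out of 1Gi (roughly 59%), so a 90% threshold would not have fired; the request was rejected because 600Mi + 700Mi exceeds 1Gi. You may want a lower threshold if single requests in your namespaces tend to be large relative to the quota.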
kube-state-metrics documentation for this metric:
https://github.com/kubernetes/kube-state-metrics/blob/master/docs/resourcequota-metrics.md
I lifted the above expression from here: https://sysdig.com/blog/alerting-kubernetes/