I've set up Prometheus on my cluster with a couple of alert rules like this one:
Can someone shed some light on the purpose of this rule? Also, in my case rate(node_context_switches_total[5m]) is always greater than 2000.
Is that something I should be worried about?
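A context switch is the act of saving the state of a process or thread so that it can be resumed later, which lets the CPU switch to running another one. According to the Prometheus documentation and the metric's own description, node_context_switches_total is the total number of context switches. It is a counter, so rate(node_context_switches_total[5m]) gives the average number of context switches per second over the last 5 minutes.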
A typical alert looks like:
- alert: ContextSwitching
  expr: rate(node_context_switches_total[5m]) > 1000
  for: 30m
  labels:
    severity: warning
  annotations:
    summary: "Context switching (instance {{ $labels.instance }})"
    description: "Context switching is growing on node (> 1000 / s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
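
Whether a steady rate above 1000 or 2000 is worrying depends a lot on how many cores the node has and what runs on it, so a fixed threshold is only a rough guide. One possible variation, shown here as a sketch rather than a recommended setting, divides the rate by the number of CPU cores so the threshold is per core. It assumes node_exporter also exposes node_cpu_seconds_total with cpu and mode labels. The group name, alert name and the 1000 per core threshold are placeholders to tune against your own baseline.

```yaml
groups:
  - name: node-context-switching        # hypothetical group name
    rules:
      - alert: ContextSwitchingPerCore  # hypothetical alert name
        # Context switches per second, divided by the number of cores.
        # The core count is taken as the number of "idle" CPU series per
        # instance (assumes node_cpu_seconds_total is being scraped).
        expr: |
          rate(node_context_switches_total[5m])
            / count without (cpu, mode) (node_cpu_seconds_total{mode="idle"})
            > 1000
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Context switching per core (instance {{ $labels.instance }})"
          description: "More than 1000 context switches / s per core\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
```

If the value is consistently high but stable and the node is otherwise healthy, raising the threshold to sit above your normal baseline is usually more useful than treating the default as a hard limit.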