What is node_context_switches_total and why rate(node_context_switches_total[5m]) > 1000?

6/23/2019

I've set up Prometheus on my cluster with a couple of alert rules like this one:

  - alert: ContextSwitching
    expr: rate(node_context_switches_total[5m]) > 1000
    for: 30m
    labels:
      severity: warning

Can someone shed some light on the purpose of this rule? Also, in my case rate(node_context_switches_total[5m]) is always greater than 2000.

Is that something I should be worried about?

-- jaybe78
kubernetes
monitoring
prometheus-node-exporter

1 Answer

6/24/2019

A context switch is the act of saving the state of a running process or thread so that it can be restored and resumed later, which lets the CPU switch to running another one. As per the Prometheus documentation and the metric's description:

node_context_switches_total is the total number of context switches.
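
Since the metric is an ever-increasing counter, the alert looks at its per-second rate rather than its raw value. Here is a minimal sketch of what rate(...[5m]) roughly computes. The function name and the sample values are made up for illustration, and the sketch ignores the extrapolation Prometheus applies at the edges of the window:

```python
def per_second_rate(samples):
    """Approximate per-second rate of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Hypothetical helper, not part of any Prometheus client library.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A counter only goes up; a drop means it was reset (e.g. a reboot),
        # so the new value is counted as the increase since the reset.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Made-up scrapes of node_context_switches_total, one per minute.
samples = [
    (0, 1_000_000),
    (60, 1_150_000),
    (120, 1_300_000),
    (180, 1_420_000),
    (240, 1_600_000),
]
rate = per_second_rate(samples)
print(rate)         # 2500.0 context switches per second
print(rate > 1000)  # True, so the alert expression would match
```

As far as I know, this counter covers the whole node and is not split per CPU, so a busy multi-core machine can sit above a fixed threshold like 1000 during normal operation. Treat the threshold as a starting point to tune against your own baseline.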

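A typical alert looks like the one below. It fires a warning when the per-second context switch rate, averaged over the last 5 minutes, stays above 1000 for 30 minutes: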

  - alert: ContextSwitching
    expr: rate(node_context_switches_total[5m]) > 1000
    for: 30m
    labels:
      severity: warning
    annotations:
      summary: "Context switching (instance {{ $labels.instance }})"
      description: "Context switching is growing on node (> 1000 / s)\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
-- VKR
Source: StackOverflow