One of my Kubernetes clusters (GKE) has only three nodes, and one node is often overloaded on CPU. I have about 32 deployments and most have 3 pods. When one node is overloaded I generally see 1 pod out of the 3 in CrashLoopBackOff. Ideally nothing would be crashing, and none of my nodes would sit above 100% utilization.
To solve this I delete pods, drain and uncordon the node, or pull the node entirely, and things usually return to normal. However, I wonder how others solve this.
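For context, the manual fix I apply today is roughly the following (pod and node names are placeholders):
# delete the crash-looping pod so it gets rescheduled
kubectl delete pod <pod-name>
# or evict everything from the hot node, then mark it schedulable again
kubectl drain <node-name> --ignore-daemonsets
kubectl uncordon <node-name>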
I use kubectl top nodes, kubectl top pods, and kubectl get pods -o wide to understand what's happening.
Typical node skew:
kubectl top nodes
NAME                                             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-staging-cluster-default-pool-386bd62d-bj36   481m         24%    4120Mi          38%
gke-staging-cluster-default-pool-386bd62d-cl3p   716m         37%    6583Mi          62%
gke-staging-cluster-default-pool-386bd62d-gms8   1999m        103%   6679Mi          63%
Pod resources:
kubectl top pod | sort -nr -k2
hchchc-staging-deployment-669ff7477c-lcx5d           248m    47Mi
ggg-hc-demo-staging-deployment-77f68db7f8-nf9b5      248m    125Mi
ggg-hc-demo-staging-deployment-77f68db7f8-c6jxd      247m    96Mi
ggg-hc-demo-staging-deployment-77f68db7f8-l44vj      244m    196Mi
athatha-staging-deployment-6dbdf7fb5d-h92h7          244m    95Mi
athatha-staging-deployment-6dbdf7fb5d-hqpm9          243m    222Mi
engine-cron-staging-deployment-77cfbfb948-9s9rv      142m    35Mi
hchchc-twitter-staging-deployment-7846f845c6-g8wt4   59m     83Mi
hchchc-worker-staging-deployment-7cbf995ddd-msrbt    51m     114Mi
hchchc-twitter-staging-deployment-7846f845c6-brlbl   51m     94Mi
Relating the pods and the nodes:
kubectl get pods -o wide | grep Crash
hchchc-twitter-staging-deployment-7846f845c6-v8mgh   1/2   CrashLoopBackOff   17   1h   10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
hchchc-worker-staging-deployment-66d7b5d7f4-thxn6    1/2   CrashLoopBackOff   17   1h   10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-hqqhb      1/2   CrashLoopBackOff   17   1h   10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-t4xmb      1/2   CrashLoopBackOff   17   1h   10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-zpkkf      1/2   CrashLoopBackOff   17   1h   10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
You may need to add pod anti-affinities to your deployments. This will spread out the load more evenly across all your nodes.
An example of (preferred) pod anti-affinity, where app: my-app is a placeholder for your deployment's own pod labels:
spec:                 # in the Deployment's pod template (spec.template.spec)
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-app   # placeholder: use your deployment's own pod labels
          topologyKey: kubernetes.io/hostname
This tells the scheduler to avoid placing a pod on a node that already runs a pod matching the same labels, i.e. a pod from the same deployment. So if you have a deployment with 3 replicas and 3 nodes, the 3 replicas should spread themselves across the nodes instead of all clumping up on a single node and draining its CPU.
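Once the deployments have rolled out with this rule, you can sanity-check the spread with the same kubectl get pods -o wide you used above, filtered by the placeholder label from the example:
kubectl get pods -l app=my-app -o wide
# the NODE column should now show the replicas landing on different nodes (when capacity allows)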
This is not a perfect solution, because preferred anti-affinity is only a scheduling hint rather than a hard rule, but it should help balance the load a bit.
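If the soft preference is not enough, the hard form of the rule uses requiredDuringSchedulingIgnoredDuringExecution. A minimal sketch, again assuming the placeholder app: my-app label (note that with the hard rule a replica stays Pending if every node already runs a matching pod):
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-app   # placeholder: use your deployment's own pod labels
        topologyKey: kubernetes.io/hostname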
See more about anti-affinity here: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/