Q: How to set up a Kubernetes cluster to balance CPU load better?

3/22/2019

One of my Kubernetes clusters (GKE) has only three nodes, and one node is often overloaded on CPU. I have about 32 deployments, most with 3 pods. When a node is overloaded I generally see 1 of the 3 pods go into CrashLoopBackOff. Ideally nothing would crash and no node would sit over 100% CPU utilization.

To work around it I delete pods, drain and uncordon the node, or replace the node, and things usually return to normal. However, I wonder how others solve this:

  1. Do I need better health checking?
  2. Are my resource settings wrong (too low)? (I've sketched the kind of settings I mean right after this list.)
  3. What are the right ways to debug this? I use kubectl top nodes, kubectl top pods, and kubectl get pods -o wide to understand what's happening.
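
For reference, this is roughly the shape of the probe and resource block I'm asking about, inside each deployment's pod spec; the names, port, and numbers below are illustrative placeholders rather than my actual values:

containers:
- name: example                 # placeholder container name
  image: example:staging        # placeholder image
  resources:
    requests:
      cpu: 250m                 # the scheduler places pods by summing requests, not actual usage
      memory: 128Mi
    limits:
      cpu: 500m                 # CPU usage above the limit is throttled
      memory: 256Mi
  livenessProbe:
    httpGet:
      path: /healthz            # placeholder health endpoint
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 15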

Typical node skew:

kubectl top nodes
NAME                                             CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
gke-staging-cluster-default-pool-386bd62d-bj36   481m         24%       4120Mi          38%
gke-staging-cluster-default-pool-386bd62d-cl3p   716m         37%       6583Mi          62%
gke-staging-cluster-default-pool-386bd62d-gms8   1999m        103%      6679Mi          63%

Pod resources:

kubectl top pod | sort -nr -k2
hchchc-staging-deployment-669ff7477c-lcx5d            248m    47Mi
ggg-hc-demo-staging-deployment-77f68db7f8-nf9b5       248m    125Mi
ggg-hc-demo-staging-deployment-77f68db7f8-c6jxd       247m    96Mi
ggg-hc-demo-staging-deployment-77f68db7f8-l44vj       244m    196Mi
athatha-staging-deployment-6dbdf7fb5d-h92h7           244m    95Mi
athatha-staging-deployment-6dbdf7fb5d-hqpm9           243m    222Mi
engine-cron-staging-deployment-77cfbfb948-9s9rv       142m    35Mi
hchchc-twitter-staging-deployment-7846f845c6-g8wt4    59m     83Mi
hchchc-worker-staging-deployment-7cbf995ddd-msrbt     51m     114Mi
hchchc-twitter-staging-deployment-7846f845c6-brlbl    51m     94Mi

Relating the pods and the nodes:

kubectl get pods -o wide | grep Crash

hchchc-twitter-staging-deployment-7846f845c6-v8mgh        1/2       CrashLoopBackOff   17         1h    10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
hchchc-worker-staging-deployment-66d7b5d7f4-thxn6         1/2       CrashLoopBackOff   17         1h    10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-hqqhb           1/2       CrashLoopBackOff   17         1h    10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-t4xmb           1/2       CrashLoopBackOff   17         1h    10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
ggggg-worker-staging-deployment-76b84969d-zpkkf           1/2       CrashLoopBackOff   17         1h    10.0.199.31   gke-staging-cluster-default-pool-386bd62d-gms8
-- Charles Thayer
devops
google-kubernetes-engine
kubernetes

1 Answer

3/22/2019

You may need to add pod anti-affinity rules to your deployments. This spreads the load more evenly across all your nodes.

An example of a preferred anti-affinity rule (the app: my-app selector is a placeholder for whatever labels your deployment's pods carry):

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-app                      # placeholder: match this deployment's own pod labels
          topologyKey: kubernetes.io/hostname

This tells the scheduler to prefer not placing a pod on a node that already runs a pod matching the same labels. So if a deployment has 3 replicas, those replicas should spread across the 3 nodes instead of all clumping onto a single node and exhausting its CPU.
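
For context, here is a minimal sketch of where that block sits in a full Deployment, assuming your pod template carries a matching label; the name, image, and the app: my-app label are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                              # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app                         # the anti-affinity selector below matches this label
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: my-app               # "prefer not to share a node with pods carrying this label"
              topologyKey: kubernetes.io/hostname
      containers:
      - name: my-app
        image: my-app:latest                # placeholder image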

Because the rule is preferred rather than required, the scheduler can still co-locate replicas when it has no better option, so this is not a perfect solution, but it should help balance the load.

See more about anti-affinity here: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

-- Lindsay Landry
Source: StackOverflow