I've got a node pool defined with min instances set to 1 and max instances set to 5, and autoscaling enabled.
However, it does not seem to be scaling down.
The node in question has the following pods running on it:
All of the pods above are in the kube-system namespace except for the redis pod, which is defined in a DaemonSet.
Is there any additional configuration required? A pod disruption budget perhaps?
Output of kubectl describe -n kube-system configmap cluster-autoscaler-status:
Name:         cluster-autoscaler-status
Namespace:    kube-system
Labels:       <none>
Annotations:  cluster-autoscaler.kubernetes.io/last-updated=2018-06-15 10:40:16.289611397 +0000 UTC

Data
====
status:
----
Cluster-autoscaler status at 2018-06-15 10:40:16.289611397 +0000 UTC:
Cluster-wide:
  Health:      Healthy (ready=4 unready=0 notStarted=0 longNotStarted=0 registered=4 longUnregistered=0)
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:17:56.845900388 +0000 UTC
  ScaleUp:     NoActivity (ready=4 registered=4)
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:18:55.777577792 +0000 UTC
  ScaleDown:   NoCandidates (candidates=0)
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:39:03.33504599 +0000 UTC

NodeGroups:
  Name:        https://content.googleapis.com/compute/v1/projects/gcpwp-ayurved-subs-staging/zones/europe-west1-b/instanceGroups/gke-wordpress-preempt-nodes-9c33afcb-grp
  Health:      Healthy (ready=3 unready=0 notStarted=0 longNotStarted=0 registered=3 longUnregistered=0 cloudProviderTarget=3 (minSize=2, maxSize=3))
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:17:56.845900388 +0000 UTC
  ScaleUp:     NoActivity (ready=3 cloudProviderTarget=3)
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:18:55.777577792 +0000 UTC
  ScaleDown:   NoCandidates (candidates=0)
               LastProbeTime:      2018-06-15 10:40:14.942263061 +0000 UTC
               LastTransitionTime: 2018-06-15 09:39:03.33504599 +0000 UTC

Events: <none>
Also, as stated in the GKE FAQ, a node will not be scaled down until the sum of the CPU and memory requests of all pods running on it is smaller than 50% of the node's allocatable capacity.
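One quick way to check that (NODE_NAME here is just a placeholder for the node in question) is to look at the "Allocated resources" section of the node description, which shows requested CPU and memory as a percentage of allocatable:

kubectl describe node NODE_NAME | grep -A 10 "Allocated resources"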
See here for a duplicate question.
There are a few constraints that could prevent the node from scaling down.
You should verify the pods you listed one by one against the "What types of pods can prevent CA from removing a node?" documentation. This should help you discover whether one of them is preventing the scale-down.
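To go through them, it may help to list every pod running on that node (just a sketch; NODE_NAME is a placeholder for the actual node name):

kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=NODE_NAME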
If it is indeed the redis pod, then you could try adding the "safe to evict" annotation to it:
"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"
If it is one of the system pods, I would try the same thing on other nodes to see whether scale-down works for them. According to the GKE documentation, you should be able to scale your cluster down to 1 node per cluster, or scale a specific node pool down to zero.
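As a rough sketch, lowering the autoscaling minimum on the pool could look like this (CLUSTER_NAME and POOL_NAME are placeholders; the zone is taken from the instance group URL in your status output):

gcloud container clusters update CLUSTER_NAME \
  --zone europe-west1-b \
  --node-pool POOL_NAME \
  --enable-autoscaling --min-nodes 0 --max-nodes 5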