Kubernetes Horizontal Pod Autoscaler not utilising node resources

4/28/2018

I am currently running Kubernetes 1.9.7 and successfully using the Cluster Autoscaler and multiple Horizontal Pod Autoscalers.

However, I recently started noticing that the HPA would favour keeping newer pods when scaling down replicas.

For example, I had 1 replica of service A running on a node alongside several other services, and that node had plenty of available resources. During load, the CPU utilisation of service A rose above the configured target, so the HPA decided to scale it to 2 replicas. As there were no other nodes available, the CA spun up a new node on which the new replica was successfully scheduled - so far so good!
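For reference, the HPA for service A looks roughly like the sketch below (names and numbers are illustrative rather than my exact values):

```yaml
# Illustrative HPA for service A, using the autoscaling/v1 API available on 1.9.7.
# Resource names, replica bounds and the CPU target are example values only.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: service-a
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-a
  minReplicas: 1
  maxReplicas: 2
  targetCPUUtilizationPercentage: 70   # scale up when average CPU utilisation exceeds 70%
```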

The problem is that when the CPU utilisation dropped back below the target, the HPA decided to scale back down to 1 replica. I would expect the new replica on the new node to be removed, thereby allowing the CA to turn off that new node. Instead, the HPA removed the original service A replica that was running on the node with plenty of available resources. This means service A is now running by itself on a new node that the CA can't remove, even though there is plenty of room for it to be scheduled on the existing node.

Is this a problem with the HPA or the Kubernetes scheduler? Service A has now been running on the new node for 48 hours and still hasn't been rescheduled despite there being more than enough resources on the existing node.

-- syscll
autoscaling
kubernetes

1 Answer

4/29/2018

After scouring my cluster configuration, I managed to work out why this was happening.

Service A was configured to run on a public subnet, and the new node created by the CA was public. The existing node running the original replica of Service A was private, which led the HPA to remove that replica.

I'm not sure how Service A was scheduled onto this node in the first place, but that is a different issue.
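For anyone hitting something similar: by "configured to run on a public subnet" I mean a node selector on the Deployment, roughly like the sketch below (the subnet label key/value and the image are illustrative, depending on how your nodes are labelled):

```yaml
# Illustrative Deployment fragment for Service A. The "subnet: public" label is an
# example of whatever label your node provisioning applies to public-subnet nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-a
spec:
  replicas: 1
  selector:
    matchLabels:
      app: service-a
  template:
    metadata:
      labels:
        app: service-a
    spec:
      nodeSelector:
        subnet: public   # restricts scheduling to nodes carrying this label
      containers:
      - name: service-a
        image: example/service-a:latest
        resources:
          requests:
            cpu: 100m   # a CPU request is needed for the HPA's utilisation calculation
```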

-- syscll
Source: StackOverflow