Kubernetes stops updating current CPU utilization in HPA

9/20/2016

Some (but not all) HPAs in my cluster stop updating their CPU utilization. This appears to happen after a different HPA scales its target deployment.

Running kubectl describe hpa on the affected HPA yields these events:

  56m       <invalid>       453     {horizontal-pod-autoscaler }            Warning     FailedUpdateStatus      Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "sync-api": the object has been modified; please apply your changes to the latest version and try again

The controller-manager logs show that the affected HPAs start having problems right after a scaling event on another HPA:

I0920 03:50:33.807951       1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:33.821044       1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:34.982382       1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:35.002736       1 horizontal.go:403] Successfully updated status for greyhound-api
I0920 03:50:35.014838       1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:35.035785       1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:48.873503       1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:48.949083       1 horizontal.go:403] Successfully updated status for greyhound-api
I0920 03:50:49.005793       1 horizontal.go:403] Successfully updated status for sync-api
I0920 03:50:49.103726       1 horizontal.go:346] Successfull rescale of monolith, old size: 7, new size: 6, reason: All metrics below target
I0920 03:50:49.135993       1 horizontal.go:403] Successfully updated status for monolith
I0920 03:50:49.137008       1 event.go:216] Event(api.ObjectReference{Kind:"Deployment", Namespace:"default", Name:"monolith", UID:"086bfbee-7ec7-11e6-a6f5-0240c833a143", APIVersion:"extensions", ResourceVersion:"4210077", FieldPath:""}): type: 'Normal' reason: 'ScalingReplicaSet' Scaled down replica set monolith-1803096525 to 6
E0920 03:50:49.169382       1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:49.172986       1 replica_set.go:463] Too many "default"/"monolith-1803096525" replicas, need 6, deleting 1
E0920 03:50:49.222184       1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:50.573273       1 event.go:216] Event(api.ObjectReference{Kind:"ReplicaSet", Namespace:"default", Name:"monolith-1803096525", UID:"086e56d0-7ec7-11e6-a6f5-0240c833a143", APIVersion:"extensions", ResourceVersion:"4210080", FieldPath:""}): type: 'Normal' reason: 'SuccessfulDelete' Deleted pod: monolith-1803096525-gaz5x
E0920 03:50:50.634225       1 deployment_controller.go:400] Error syncing deployment default/monolith: Deployment.extensions "monolith" is invalid: status.unavailableReplicas: Invalid value: -1: must be greater than or equal to 0
I0920 03:50:50.666270       1 horizontal.go:403] Successfully updated status for aurora
I0920 03:50:50.955971       1 horizontal.go:403] Successfully updated status for greyhound-api
W0920 03:50:50.980039       1 horizontal.go:99] Failed to reconcile greyhound-api: failed to update status for greyhound-api: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "greyhound-api": the object has been modified; please apply your changes to the latest version and try again
I0920 03:50:50.995372       1 horizontal.go:403] Successfully updated status for sync-api
W0920 03:50:51.017321       1 horizontal.go:99] Failed to reconcile sync-api: failed to update status for sync-api: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "sync-api": the object has been modified; please apply your changes to the latest version and try again
I0920 03:50:51.032596       1 horizontal.go:403] Successfully updated status for aurora
W0920 03:50:51.084486       1 horizontal.go:99] Failed to reconcile monolith: failed to update status for monolith: Operation cannot be fulfilled on horizontalpodautoscalers.autoscaling "monolith": the object has been modified; please apply your changes to the latest version and try again
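
For readers unfamiliar with the "object has been modified" message: it is the API server's optimistic-concurrency conflict, returned when an update carries a resourceVersion older than the one currently stored. Below is a minimal toy sketch of that mechanism, not Kubernetes code. `ToyStore`, `Conflict`, and `update_with_retry` are hypothetical names made up for illustration, and the sketch does not explain why the conflicts persist in this cluster.

```python
class Conflict(Exception):
    """Raised when an update carries a stale resource version."""


class ToyStore:
    """Hypothetical in-memory stand-in for the storage of a single object."""

    def __init__(self, status):
        self.resource_version = 1
        self.status = status

    def get(self):
        # Return the current version together with the stored status.
        return self.resource_version, self.status

    def update(self, seen_version, new_status):
        # Reject the write if the caller read an older version of the object.
        if seen_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.status = new_status
        self.resource_version += 1


def update_with_retry(store, compute_status, attempts=3):
    """Re-read the object and retry when a write is rejected as stale."""
    for _ in range(attempts):
        version, status = store.get()
        try:
            store.update(version, compute_status(status))
            return True
        except Conflict:
            continue
    return False
```

Two writers that both read version 1 cannot both succeed: the first update bumps the version to 2, and the second is rejected until it re-reads the object.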

Manually updating the affected HPAs with kubectl edit fixes the problem, but it makes me worry about how reliable HPAs are for autoscaling.

Any help is appreciated. I am running v1.3.6.

-- VladLosev
kubernetes

1 Answer

10/27/2016

It is not correct to set up more than one HPA pointing to the same target deployment. When two different HPAs point to the same target (as described here), the system may behave unpredictably.
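
One way to check for this is to group HPAs by their scale target and look for targets claimed more than once. The following is a rough sketch, assuming the JSON list shape produced by `kubectl get hpa --all-namespaces -o json` and the autoscaling/v1 layout where the target lives in `spec.scaleTargetRef`; the function name `find_shared_targets` is hypothetical.

```python
from collections import defaultdict


def find_shared_targets(hpa_list):
    """Return {(namespace, kind, name): [hpa names]} for targets with more than one HPA.

    Assumes each item has metadata.name and spec.scaleTargetRef.{kind,name}.
    """
    by_target = defaultdict(list)
    for hpa in hpa_list.get("items", []):
        meta = hpa["metadata"]
        ref = hpa["spec"]["scaleTargetRef"]
        # An HPA scales an object in its own namespace, so the namespace is part of the key.
        key = (meta.get("namespace", "default"), ref["kind"], ref["name"])
        by_target[key].append(meta["name"])
    # Keep only targets that are referenced by two or more HPAs.
    return {key: sorted(names) for key, names in by_target.items() if len(names) > 1}
```

To use it, load the kubectl output with `json.load` and pass the resulting dict in. An empty result means no two HPAs share a target; any entry lists the HPA names that need to be consolidated into one.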

-- Jerzy Szczepkowski
Source: StackOverflow