I am running three deployments of the same app on a Kubernetes cluster. I recently got around to setting resource requests and limits for one of the deployments.
    resources:
      limits:
        cpu: 350m
        memory: 225Mi
      requests:
        cpu: 250m
        memory: 150Mi
After setting these, the affected pods have a much higher computation time than the two unchanged deployments, which does not make sense given the Kubernetes documentation as I understand it.
Running kubectl top pods allows me to confirm that my pods are running at or below their requested resources. When visualizing computation time (Prometheus + Grafana), it is clear that one of the deployments is significantly slower:
Two deployments at ~ 60ms and one at ~ 120ms
As this is the only change I have made, I don't understand why there should be any performance degradation. Am I missing something?
Removing the CPU limit but keeping the request brings pod performance back to what it is supposed to be. Keep in mind that these pods are running at the CPU request level (around 250m CPU), which is 100m below the limit.
Additional information: these pods are running a NodeJS app.
Pods without any requests or limits can use the node's resources without restriction, so they can be faster.
Pods with limits are capped at those limits, so they can be slower.
Please check resource consumption metrics of both deployments.
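One thing worth checking alongside usage is whether the limited pods are actually being throttled, since kubectl top only shows average usage. A minimal sketch, assuming you can read the container's cgroup cpu.stat file and that it exposes the nr_periods and nr_throttled counters (as it does on cgroup v1 and v2). The file location depends on the node setup (for example /sys/fs/cgroup/cpu/cpu.stat on v1 or /sys/fs/cgroup/cpu.stat on v2), and the sample contents below are made up for illustration, not measured data.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of CFS periods in which the cgroup hit its quota and was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample contents, for illustration only.
SAMPLE = """nr_periods 1000
nr_throttled 250
throttled_time 5000000000
"""

if __name__ == "__main__":
    stats = parse_cpu_stat(SAMPLE)
    print(f"throttled in {throttled_ratio(stats):.0%} of periods")
```

A ratio well above zero on the limited deployment, with low average usage, would point at throttling rather than a real lack of CPU.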
Reading this link, I understand that once a pod has started successfully, its containers are guaranteed the amount of resources they requested. Scheduling is therefore based on the requests field in the YAML, not the limits field, but the pod and its containers will not be allowed to exceed the limit specified in the YAML.
Pods will be throttled if they exceed their limit. If the limit is unspecified, the pods can use excess CPU when it is available.
See the link for the full text: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node/resource-qos.md#compressible-resource-guarantees
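To make the throttling concrete, here is a rough sketch of how a CPU limit can stretch a single request even when average usage is below the limit. It is a simplified model, not the kernel's actual scheduler: it assumes the limit is enforced as a CFS quota of limit x period per period (100ms by default), that the work is one single-threaded burst starting at the beginning of a period with a full quota, and that nothing else in the container uses CPU. The 60ms burst size is an assumption picked for illustration.

```python
def wall_time_ms(cpu_needed_ms, limit_millicores=None, period_ms=100.0):
    """Wall-clock time for one single-threaded burst of CPU work under a CFS quota.

    Simplified model: the burst starts at a period boundary with a full quota,
    and once the quota is used up the thread waits for the next period.
    """
    if limit_millicores is None or limit_millicores >= 1000:
        # No limit, or a quota that a single thread cannot exhaust in one period.
        return float(cpu_needed_ms)
    if limit_millicores <= 0:
        raise ValueError("limit_millicores must be positive")

    quota_ms = period_ms * limit_millicores / 1000.0
    full_periods, remainder = divmod(cpu_needed_ms, quota_ms)
    if remainder == 0 and full_periods > 0:
        # The work ends exactly when the last quota runs out: no final wait.
        return (full_periods - 1) * period_ms + quota_ms
    return full_periods * period_ms + remainder


if __name__ == "__main__":
    print(wall_time_ms(60))                        # no CPU limit
    print(wall_time_ms(60, limit_millicores=350))  # 350m limit, 100ms period
```

In this model the 350m limit gives a 35ms quota per 100ms period, so a 60ms burst runs for 35ms, waits out the rest of the period, and then finishes the remaining 25ms. If such a burst arrived every 240ms, the average usage would be 250m, below the limit, and each burst would still be throttled.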
Kubernetes CPU limits might not work as one would assume. I suggest watching this presentation, starting at 13:38.
A solution to the negative effect of CPU limits in k8s might be to set a different CFS quota period. By default it is set to 100ms; a better value might be 5ms. There is also an issue about this.
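As a rough illustration of why a shorter period can help, the sketch below compares the quota and the longest single stall for the two period values mentioned above, using the 350m limit from the question. It assumes the quota is simply limit x period and that the worst case is a thread that burns its whole quota at the start of a period. The function names are mine, not from any Kubernetes API.

```python
def cfs_quota_ms(limit_millicores, period_ms):
    """CPU time (ms) the container may use per period, assuming quota = limit * period."""
    return period_ms * limit_millicores / 1000.0


def longest_stall_ms(limit_millicores, period_ms):
    """Worst-case wait once the quota is spent right at the start of a period."""
    return max(0.0, period_ms - cfs_quota_ms(limit_millicores, period_ms))


if __name__ == "__main__":
    for period_ms in (100.0, 5.0):
        quota = cfs_quota_ms(350, period_ms)
        stall = longest_stall_ms(350, period_ms)
        print(f"period={period_ms}ms quota={quota}ms longest stall={stall}ms")
```

The share of CPU stays the same (35%) in both cases. What changes is how long a throttled thread can sit idle in one go: up to 65ms with a 100ms period versus 3.25ms with a 5ms period, which matters for latency-sensitive requests.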