We have big problems with setting limits/requests for our Django web servers / Python processors / Celery workers. Our current strategy is to look at the usage graphs for the last 7 days:
1) take the raw average, excluding peaks
2) add a 30% buffer to get the request
3) set the limit to 2x the request
It works more or less, but then the service's code changes and the limits that were set before are no longer valid. What other strategies are there?
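For reference, the strategy above can be sketched in a few lines of Python. This is only an illustration of the described heuristic; `suggest_resources` and its parameters are hypothetical names, and the 95th-percentile peak cutoff is an assumption since the question does not say how peaks are excluded:

```python
def suggest_resources(samples, peak_percentile=0.95, buffer=0.30, limit_factor=2.0):
    """Derive a (request, limit) pair from raw usage samples.

    Drops samples above the peak_percentile cutoff as "peaks",
    averages the rest, adds a buffer to get the request, and
    sets the limit at limit_factor * request.
    """
    ordered = sorted(samples)
    cutoff = ordered[int(len(ordered) * peak_percentile) - 1]
    baseline = [s for s in ordered if s <= cutoff]
    avg = sum(baseline) / len(baseline)
    request = avg * (1 + buffer)
    limit = request * limit_factor
    return request, limit

# e.g. nine samples at 100m CPU and one 1000m spike:
# the spike is excluded, request = 130m, limit = 260m
```

The weakness you describe is visible here: the output is only as good as the last 7 days of samples, so any code change invalidates it until the window refills.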
How would you set up limits/requests for these graphs:
1) Processors:
2) Celery-beat
3) Django (the artifacts are probably connected to rollouts somehow)
I would suggest starting with the average CPU and memory values the application uses, and then enabling autoscaling. Kubernetes has multiple kinds of autoscaling.
Horizontal Pod Autoscaling (HPA) is the most commonly used these days. The HPA automatically creates new pods when a pod's CPU or memory usage exceeds the percentage or absolute amount you set as the threshold.
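A minimal HPA manifest, as a sketch, could look like this (the `django-web` Deployment name, replica counts, and the 70% target are placeholder assumptions; note that the HPA measures utilization relative to the pod's CPU *request*, so your requests still need to be realistic):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: django-web            # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: django-web          # your Deployment's name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% of the CPU request
```

One caveat for your workloads: HPA fits the Django web tier and queue-driven processors well, but celery-beat is a singleton scheduler and should not be horizontally scaled.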
Monitor new releases before deployment and find out why exactly a new release needs more memory. Troubleshoot and try to reduce its resource consumption. If that is not possible, update the resource requests with the new CPU and memory values.
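When you do update them, the requests/limits live on each container in the Deployment spec. A sketch, with placeholder names, image, and values (plug in the numbers from your own 7-day graphs):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker          # placeholder name
spec:
  template:
    spec:
      containers:
        - name: worker
          image: registry.example.com/app:latest   # placeholder image
          resources:
            requests:
              cpu: 250m        # baseline average + buffer from your graphs
              memory: 512Mi
            limits:
              cpu: 500m        # 2x the request, per the strategy in the question
              memory: 1Gi
```

Keep in mind that exceeding the memory limit gets the container OOM-killed, while exceeding the CPU limit only throttles it, so memory limits deserve the larger safety margin.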