OpenShift - shared development cluster - terminate older deployments after a fixed time

5/30/2020

Our shared OpenShift development cluster is regularly loaded down by pods which developers start and then forget about - in many (but not all) cases they are simply throw-away development exercises. At present this is dealt with by someone (an administrator, or another developer with access to the namespace) manually investigating when the cluster becomes overloaded and terminating a few likely-looking suspects.

My team is about to roll out an easy mechanism which lets devs deploy a personal copy of the project into the development OpenShift environment for testing. Since I have no authority to manage or apply rules to the entire cluster, I'd like to ensure that we behave as good citizens and don't become a major cause of further load on the system.

At present I'm thinking that some kind of time-based lifespan for pods in our namespace might be the way forward - say, after one week a developer's private pod that has been left running would automatically die. I don't want this to be global, since certain services running in this namespace need to stay up in order to support the development pods, so ideally it would be some configuration on the deployment itself. Obviously if a pod simply terminates, Kubernetes will try to restart it, so I need something which un-deploys it properly. Could a "sidecar" container, hooked onto the post-start lifecycle event, which un-deploys its own pod after a fixed sleep using the Kubernetes API, be a solution?
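Concretely, something like this is what I have in mind - a rough, untested sketch, where the `self-destruct` container name, the seven-day sleep, the `reaper` service account and the `openshift/origin-cli` image are all placeholder assumptions on my part, and the sidecar's main command does the sleeping rather than a postStart hook:

```yaml
# Sketch only: a "self-destruct" sidecar added to a developer's personal copy.
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: myapp-edr                  # developer's personal copy (placeholder name)
spec:
  replicas: 1
  selector:
    app: myapp-edr
  template:
    metadata:
      labels:
        app: myapp-edr
    spec:
      serviceAccountName: reaper   # assumed SA with rights to scale dcs in this namespace
      containers:
      - name: myapp
        image: myapp:latest        # the actual application under test
      - name: self-destruct
        image: openshift/origin-cli:v3.11   # any image containing the oc binary
        env:
        - name: POD_NAMESPACE      # downward API gives us our own namespace
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: DC_NAME
          value: myapp-edr
        command:
        - /bin/sh
        - -c
        - |
          # Sleep for one week, then scale our own DeploymentConfig to zero so
          # Kubernetes stops recreating the pod. oc picks up the pod's
          # service-account token automatically when run inside the cluster.
          sleep 604800
          oc scale dc/"$DC_NAME" --replicas=0 -n "$POD_NAMESPACE"
```

The assumed `reaper` service account would need rights to scale deploymentconfigs in the namespace, e.g. something like `oc policy add-role-to-user edit -z reaper -n <our-namespace>`. Scaling to zero rather than deleting also means a developer can revive their copy with a single `oc scale ... --replicas=1` if they still need it.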

I'm interested to hear how other companies have solved this, and what my options are. Whilst cluster-wide solutions are of interest, I'm not a cluster administrator; getting something implemented at that level might be a longer-term solution, but it would be politically much more difficult, however technically simple it is.
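The only other namespace-local option I've come up with so far is a scheduled clean-up job instead of a sidecar per deployment: a CronJob in the same namespace which scales down anything carrying an opt-in label once it is older than a cut-off. Again a rough sketch - the `purpose=dev-sandbox` label, the `reaper` service account and the image are assumptions of mine, not anything that exists today:

```yaml
# Sketch only: a daily reaper for opted-in dev deployments in our own namespace.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: dev-deployment-reaper
spec:
  schedule: "0 6 * * *"                     # once a day, early morning
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: reaper        # assumed SA with edit rights in this namespace
          restartPolicy: Never
          containers:
          - name: reaper
            image: openshift/origin-cli:v3.11   # any image with oc and GNU date
            command:
            - /bin/sh
            - -c
            - |
              # Scale down every opted-in dev deployment older than 7 days.
              cutoff=$(date -d '7 days ago' +%s)
              oc get dc -l purpose=dev-sandbox \
                -o go-template='{{range .items}}{{.metadata.name}} {{.metadata.creationTimestamp}}{{"\n"}}{{end}}' \
              | while read -r name created; do
                  if [ "$(date -d "$created" +%s)" -lt "$cutoff" ]; then
                    echo "scaling down $name (created $created)"
                    oc scale dc/"$name" --replicas=0
                  fi
                done
```

That would rely on our deployment mechanism stamping each personal deployment with the opt-in label, which conveniently also keeps the shared supporting services out of scope.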

We are running OpenShift 3.11. My Kubernetes experience is about a year of intermittent use.

-- Ed Randall
kubernetes
openshift

0 Answers