I have a diverse set of artifacts that require processing and a non-homogeneous computing environment: some inputs are large and need a long time and lots of memory, others are small and need only a short time and little memory, and the nodes do not all have the same number of cores or amount of memory.
I would like a simple set of greedy rules for adding pods:
If there are cores remaining on a node, add a pod, choosing from the inputs still to be processed the largest one that fits in the remaining memory (see the sketch below).
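Concretely, the selection rule I have in mind looks something like this. The input list, the per-input memory estimates, and the per-pod core cost are my own placeholders, not anything Kubernetes provides:

```python
def pick_next_input(pending_inputs, free_cores, free_memory_bytes, cores_per_pod=1):
    """Greedy rule: among the inputs still to process, pick the largest
    one whose estimated memory fits in the node's remaining memory."""
    if free_cores < cores_per_pod:
        return None
    fitting = [i for i in pending_inputs
               if i["estimated_memory_bytes"] <= free_memory_bytes]
    if not fitting:
        return None
    return max(fitting, key=lambda i: i["estimated_memory_bytes"])
```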
So far, I have found it difficult to express this with the Horizontal Pod Autoscaler's target average CPU utilization, especially because average utilization refers to usage within the existing pods rather than to remaining cluster capacity. At this point, I think custom metrics, or some other way of integrating the selection of arguments with Kubernetes, is required.
But I don't know the best way forward. If I were to program this to integrate with Kubernetes, what API would I use? Ideally I could register a callback with Kubernetes, receive metrics information, and signal Kubernetes to add or remove pods simply by returning a value or calling another API.
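To make the question more concrete, this is roughly the shape of what I am imagining: an external control loop, outside the HPA, that polls the API server with the official Python client, computes free capacity per node, and launches one pod per input using the `pick_next_input` rule sketched above. The image name, the argument convention, and my minimal quantity parser are placeholders, and I don't know whether this is the right level to integrate at, which is really my question:

```python
import time
from kubernetes import client, config

def to_number(quantity):
    # Minimal parser for the resource quantities I expect to see;
    # the real Kubernetes quantity format has more suffixes than this.
    suffixes = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "m": 1e-3}
    for suffix, factor in suffixes.items():
        if quantity.endswith(suffix):
            return float(quantity[:-len(suffix)]) * factor
    return float(quantity)

def free_capacity(v1, node_name):
    # Allocatable capacity minus the requests of pods already on the node.
    node = v1.read_node(node_name)
    free_cpu = to_number(node.status.allocatable["cpu"])
    free_mem = to_number(node.status.allocatable["memory"])
    pods = v1.list_pod_for_all_namespaces(
        field_selector=(f"spec.nodeName={node_name},"
                        "status.phase!=Succeeded,status.phase!=Failed"))
    for pod in pods.items:
        for c in pod.spec.containers:
            requests = (c.resources.requests or {}) if c.resources else {}
            free_cpu -= to_number(requests.get("cpu", "0"))
            free_mem -= to_number(requests.get("memory", "0"))
    return free_cpu, free_mem

def launch(v1, node_name, input_item):
    # One pod per input; the image name and argument convention are made up.
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(generate_name="artifact-worker-"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            node_name=node_name,  # pin to the node whose capacity we measured
            containers=[client.V1Container(
                name="worker",
                image="worker-image",
                args=[input_item["path"]],
                resources=client.V1ResourceRequirements(requests={
                    "cpu": "1",
                    "memory": str(input_item["estimated_memory_bytes"])}))]))
    v1.create_namespaced_pod(namespace="default", body=pod)

def control_loop(pending_inputs, poll_seconds=30):
    config.load_kube_config()  # load_incluster_config() if run inside the cluster
    v1 = client.CoreV1Api()
    while pending_inputs:
        for node in v1.list_node().items:
            cores, mem = free_capacity(v1, node.metadata.name)
            chosen = pick_next_input(pending_inputs, cores, mem)  # greedy rule above
            if chosen is not None:
                launch(v1, node.metadata.name, chosen)
                pending_inputs.remove(chosen)
        time.sleep(poll_seconds)
```

This feels like reimplementing part of the scheduler by hand, though, which is why I'm asking whether there is a more appropriate integration point (custom metrics, an operator/controller, or something else entirely).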