Distributed Resource Allocation Architecture

3/15/2018

I am currently working on scaling a large-scale infrastructure that distributes complex calculations over a calculation farm (a cluster with a limited number of machines). The current system is based on a service-oriented architecture, whereby a fixed set of services runs on each machine in the cluster.

The resources (CPU, memory) used by each request sent to these services vary widely depending on the content of the request, but can be known (or at least predicted) in advance. In other words, it is possible to know, for a given request, the following (see the sketch after this list):

  • The time it will take to process the request (from milliseconds to minutes, sometimes hours).
  • The maximum memory required to process the request (from a few MB to several GB).
  • The maximum number of cores required to process the request (mostly single-threaded, occasionally multi-threaded).
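For example, a predicted resource profile for a single request might look something like the following; the format and field names here are purely illustrative, not part of our current system:

    # Hypothetical predicted resource profile for one request
    request_id: calc-20180315-042
    predicted:
      duration: "45m"     # anywhere from milliseconds up to hours
      max_memory: "4Gi"   # from a few MB to several GB
      max_cores: 1        # mostly single-threaded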

Our current architecture is problematic because our 'scheduler' does not take any of those parameters into account. As a result, one particular server often ends up occupied by several expensive, mutually 'incompatible' requests (in terms of memory usage, CPU cores, etc.), so processing each of them becomes wildly inefficient, while other servers are occupied only by relatively 'cheap' requests.

We would like to optimise this allocation process by moving our current infrastructure to a more modern orchestration system, such as Kubernetes (or another platform). My question is: given these requirements (efficient distribution of requests with varying resource needs that are known before processing), which currently available platforms would be a good fit for this type of workload?

Thanks, Jon

-- Jon
distributed-computing
kubernetes
orchestration
scheduling
service

1 Answer

3/15/2018

Kubernetes seems a good fit for this type of workload. Each request could be run as a Job, which would run one or more containers to process the request. Each container can request a minimum amount of resources ahead of time in its specification, and can also specify a limit on those resources (e.g. maximum memory and maximum number of cores). The Kubernetes scheduler can then pick a node within the cluster that satisfies these requirements, and the node will enforce the limits.
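As a minimal sketch, a per-request Job manifest could look like the one below. The Job name, image name, and resource figures are placeholders you would fill in from your per-request predictions:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: calc-request-42        # hypothetical per-request Job name
    spec:
      backoffLimit: 0              # do not automatically retry a failed calculation
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: worker
            image: registry.example.com/calc-worker:latest  # placeholder image
            resources:
              requests:            # minimum guaranteed; used for scheduling decisions
                cpu: "1"
                memory: "512Mi"
              limits:              # hard ceiling enforced on the node
                cpu: "1"
                memory: "2Gi"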

This lets you stop worrying about where the workloads actually run and focus on describing the requirements of each request accurately.
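If the manifest above were saved as job.yaml, it could be submitted with kubectl apply -f job.yaml and tracked with kubectl get jobs. Note that memory limits are enforced by terminating a container that exceeds them, while CPU limits are enforced by throttling, so the memory prediction is the one you most want to get right.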

-- dippynark
Source: StackOverflow