Choosing the compute resources of the nodes in the cluster with horizontal scaling

8/29/2018

Horizontal scaling means that we scale by adding more machines into the pool of resources. Still, there is a choice of how much power (CPU, RAM) each node in the cluster will have.

When a cluster is managed with Kubernetes, it is easy to set CPU and memory limits for Pods. But how do you choose the optimal CPU and memory size for the cluster nodes (or for the Pods in Kubernetes)?
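For reference, CPU and memory are set per container in the Pod spec via `resources.requests` and `resources.limits`. A minimal sketch (the Pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app        # illustrative name
spec:
  containers:
  - name: app
    image: example/app:1.0 # illustrative image
    resources:
      requests:
        cpu: "500m"        # half a vCPU reserved for scheduling
        memory: "256Mi"
      limits:
        cpu: "1"           # throttled if it tries to use more than 1 vCPU
        memory: "512Mi"    # OOM-killed if it exceeds this
```

The scheduler places Pods based on requests, while limits cap actual usage, so the node size question becomes: how many Pods of this shape should fit on one node?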

For example, there are 3 nodes in a cluster with 1 vCPU and 1GB RAM each. To handle more load there are 2 options:

  • Add the 4th node with 1 vCPU and 1GB RAM
  • Give each of the 3 existing nodes more power (e.g. 2 vCPU and 2 GB RAM)

A straightforward solution is to calculate the throughput and cost of each option and choose the cheaper one. Are there any more advanced approaches for choosing the compute resources of the nodes in a cluster with horizontal scalability?
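The straightforward cost comparison can be sketched as follows. All prices and throughput figures below are made-up placeholders; substitute your own load-test measurements and cloud pricing:

```python
# Compare the two scaling options by cost per request served.
# Hypothetical inputs: hourly node prices and measured cluster throughput.

def cost_per_request(hourly_cost: float, requests_per_sec: float) -> float:
    """Cost of serving one request, given total hourly cost and throughput."""
    return hourly_cost / (requests_per_sec * 3600)

# Option A: 4 nodes of 1 vCPU / 1 GB RAM (hypothetical $0.05/h per node,
# 400 req/s measured for the whole cluster)
option_a = cost_per_request(hourly_cost=4 * 0.05, requests_per_sec=400)

# Option B: 3 nodes of 2 vCPU / 2 GB RAM (hypothetical $0.10/h per node,
# 450 req/s measured for the whole cluster)
option_b = cost_per_request(hourly_cost=3 * 0.10, requests_per_sec=450)

cheaper = "A" if option_a < option_b else "B"
```

With these invented numbers option A wins, but the point is the shape of the calculation, not the result: throughput rarely scales linearly with node size, so both options should be measured, not extrapolated.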

-- Evgeniy Khyst
cloud
cluster-computing
horizontal-scaling
kubernetes
scalability

2 Answers

9/4/2018

The answer is related to such performance metrics as latency and throughput:

  • Latency is the time interval between sending a request and receiving the response.
  • Throughput is the request processing rate (requests per second).

Latency influences throughput: for a fixed number of concurrent requests, higher latency means lower throughput.

If a business transaction consists of multiple sequential calls to services that can't be parallelized, then compute resources (CPU and memory) have to be chosen based on the desired latency. Adding more instances of the services (horizontal scaling) will not improve latency in this case. What adding more instances does is increase throughput, allowing more requests to be processed in parallel (as long as there are no bottlenecks).

In other words: allocate enough CPU and memory for a single service instance to achieve the desired response time, and add more service instances (scale horizontally) to handle more requests in parallel.
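This latency/throughput relationship can be sketched with Little's law (throughput = in-flight requests / latency); the concurrency and latency numbers here are illustrative:

```python
def max_throughput(instances: int, concurrency_per_instance: int,
                   latency_s: float) -> float:
    """Little's law: throughput (req/s) = requests in flight / latency.

    Scaling horizontally (more instances) multiplies throughput, but the
    latency of a single, non-parallelizable request stays the same.
    """
    return instances * concurrency_per_instance / latency_s

# 3 instances, each handling 10 requests in flight, 0.2 s per request:
base = max_throughput(3, 10, 0.2)    # 150 req/s
scaled = max_throughput(4, 10, 0.2)  # 200 req/s: more throughput,
                                     # still 0.2 s per request
```

To reduce the 0.2 s itself, the only levers are faster instances (vertical scaling) or a faster code path, which is exactly the distinction the answer draws.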

-- Evgeniy Khyst
Source: StackOverflow

8/29/2018

For this particular example I would go for 2 vCPU per node instead of another 1 vCPU node, but that is mainly because I believe running an OS for anything serious on a single vCPU is just wrong. For the system to behave decently it needs 2+ cores available; otherwise it's too easy to overwhelm that one vCPU and grind the node to a halt. There is no ideal algorithm for this, though. It will depend on your budget, the characteristics of your workloads, etc.

As a rule of thumb, don't stick to instances that are too small, as there is a bunch of stuff that always has to run on every node regardless of its size, and the more nodes, the more overhead. 3x 4 vCPU + 16/32 GB RAM sounds like a nice plan for starters, but again... it depends on what you want, need, and can afford.

-- Radek 'Goblin' Pieczonka
Source: StackOverflow