Auto-scaling pods in Kubernetes depending on the number of connected users

4/10/2017

In order to practice scaling using Kubernetes, I have created the following scenario:


The Game:

  • I have a game written in Java which has 2 players: a master (who chooses a number below 100) and a guesser (who tries to guess the number).
  • The server-instance terminates once the guesser has correctly guessed the number, or one of the players disconnects.
  • Every server-instance only allows for a maximum of 2 connections (master, guesser).

Kubernetes:

  • Whenever a server-instance reaches its maximum number of connections, I want Kubernetes to automatically start another server-instance and use that new server-instance for new users that connect.

My idea was to use the Kubernetes-Client from the Java server-instance and update the Kubernetes cluster from each server-instance. This way I would have a decentralized way of managing the cluster.
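
A minimal sketch of that idea with the fabric8 kubernetes-client might look like the following. The Deployment name guess-game-server and the namespace are placeholders, and depending on the client version the Deployment resource may live under client.extensions() rather than client.apps():

```java
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ScaleUp {

    // Called by a server-instance once both the master and the guesser are connected.
    public static void addServerInstance() {
        // When running in-cluster, the client picks up the pod's service account credentials.
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            Deployment deployment = client.apps().deployments()
                    .inNamespace("default")
                    .withName("guess-game-server")   // assumed Deployment name
                    .get();

            int current = deployment.getSpec().getReplicas();

            // Add one more server-instance so the next pair of players gets a fresh pod.
            client.apps().deployments()
                    .inNamespace("default")
                    .withName("guess-game-server")
                    .scale(current + 1);
        }
    }
}
```

Note that several server-instances scaling the same Deployment at once can race with each other, and the pod running this code needs a ServiceAccount with RBAC permission to get and update Deployments.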

What are your thoughts on this? Is there a better way to approach this? Could I for example update metadata on the pod and use some form of auto-scaling for this?

-- Martin J. Rogalla
fabric8
java
kubernetes

1 Answer

4/10/2017

You probably want to use the cluster autoscaler functionality of Kubernetes. Please refer to the docs.

Your approach of using a fixed user number as a custom metric will probably work, but you may be better off using generic metrics, such as the CPU consumption on your node. For that, you define the maximum consumption that your containers are allowed to use. This information is required by the autoscaler to know whether it can schedule additional pods on a node.
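
As a rough illustration, here is a sketch using the fabric8 client the question already mentions. The Deployment name guess-game-server, the namespace, and the CPU values are placeholders, and a recent client version is assumed; plain YAML manifests achieve the same thing:

```java
import io.fabric8.kubernetes.api.model.Quantity;
import io.fabric8.kubernetes.api.model.ResourceRequirementsBuilder;
import io.fabric8.kubernetes.api.model.autoscaling.v1.HorizontalPodAutoscaler;
import io.fabric8.kubernetes.api.model.autoscaling.v1.HorizontalPodAutoscalerBuilder;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class AutoscaleSetup {
    public static void main(String[] args) {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // 1. Give the game container a CPU request/limit so the scheduler and the
            //    autoscaler know how much capacity one server-instance needs.
            client.apps().deployments()
                    .inNamespace("default")
                    .withName("guess-game-server")   // assumed Deployment name
                    .edit(d -> {
                        d.getSpec().getTemplate().getSpec().getContainers().get(0)
                                .setResources(new ResourceRequirementsBuilder()
                                        .addToRequests("cpu", new Quantity("100m"))
                                        .addToLimits("cpu", new Quantity("250m"))
                                        .build());
                        return d;
                    });

            // 2. Create a HorizontalPodAutoscaler that adds pods when the average CPU
            //    utilization across the server-instances exceeds 70%.
            HorizontalPodAutoscaler hpa = new HorizontalPodAutoscalerBuilder()
                    .withNewMetadata().withName("guess-game-server").endMetadata()
                    .withNewSpec()
                        .withNewScaleTargetRef()
                            .withApiVersion("apps/v1")
                            .withKind("Deployment")
                            .withName("guess-game-server")
                        .endScaleTargetRef()
                        .withMinReplicas(1)
                        .withMaxReplicas(10)
                        .withTargetCPUUtilizationPercentage(70)
                    .endSpec()
                    .build();

            client.resource(hpa).inNamespace("default").createOrReplace();
        }
    }
}
```

The same result can be reached declaratively with a resources: block in the pod spec plus a HorizontalPodAutoscaler manifest, or imperatively with kubectl autoscale.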

Note that there are two layers of scalability to consider: the pods and the nodes. The autoscaler will take the metric you define and schedule additional pods when required. Your nodes have limited capacity for pods, so, depending on the underlying infrastructure you use, you also need to scale the number of worker nodes. For example, if you run on AWS you need an Auto Scaling group. The best way to set this up is to have the AWS Auto Scaling group scale up based on pending pods: whenever there are pods in the queue that cannot be scheduled, your Auto Scaling group will add another node. Here is a great article that explains it.

-- Oswin Noetzelmann
Source: StackOverflow