Scaling service in response to requests on Kubernetes

7/29/2019

Let's say:

  1. I have a service which, when started, takes an id.

  2. I want each service process to run in a separate k8s pod.

  3. I want to put an API in front of this where a user enters an id N and in response:

    1. If a service for id N is running I route the user to it.

    2. If no service is running for id N I start one (i.e. spin up a new pod) then route the user there.

Some ideas I've had for (3.2):

  1. The "router" service directly spins up new pods using the k8s api. That feels wrong but perhaps it isn't?

  2. Incoming requests that have no running service go into a queue; horizontal pod scaling is triggered based on queue size, and each new service instance takes an id off the queue.
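Idea (1) above can be sketched concretely. The snippet below builds the Pod manifest the router would submit through the Kubernetes API for a given id; the names (`svc-` prefix, image, labels) are illustrative placeholders, not anything prescribed by Kubernetes:

```python
def pod_manifest(service_id: str) -> dict:
    """Build a Pod manifest for one service instance keyed by id."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"svc-{service_id}",
            # A label selector on "service-id" lets the router find the
            # pod (or a Service in front of it) for routing later.
            "labels": {"app": "svc", "service-id": service_id},
        },
        "spec": {
            "containers": [{
                "name": "svc",
                "image": "myregistry/svc:latest",  # hypothetical image
                "args": ["--id", service_id],
            }],
        },
    }

# Submitting it requires the official `kubernetes` client and cluster
# access (e.g. a ServiceAccount when the router runs in-cluster):
#   from kubernetes import client, config
#   config.load_incluster_config()
#   client.CoreV1Api().create_namespaced_pod(
#       namespace="default", body=pod_manifest("42"))
```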

Is there a primitive I've missed that could help me here? What's the most idiomatic way to implement this on Kubernetes? If it's at all relevant, I'll be running all this on AKS.

-- Niall Glynn
kubernetes

1 Answer

8/28/2019

Reading your requirements, it sounds to me like you need something closer to your own PaaS on top of Kubernetes than a scaling service. There are a couple of existing solutions out there; for example, check out 'Deis Workflow'.

If you really intend to build such a solution from scratch, as a proof of concept for (1 & 2) I would use Helm, a package manager for Kubernetes. It works at a higher level of abstraction, bundling into a single 'release' all the Kubernetes resources that make up a working application: Pod, Service, Persistent Volume, etc.

You could literally treat a 'release' as equivalent to an 'id': no release created in the cluster means your service is scaled to zero. Besides that, the 'helm' client tool gives you an easy way to find out application URLs (the target route for a specific user id). The same information is accessible from the Kubernetes API, either through the client libraries or directly via the REST API, which your frontend would use for (3).
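To make the release-per-id idea concrete, here is a minimal sketch of a router shelling out to the helm CLI. Helm 3 syntax is assumed (`helm install <release> <chart>`); the `svc-` release prefix, the `charts/svc` chart path, and the `service.id` value are hypothetical names for illustration:

```python
import subprocess

def helm_commands(service_id: str) -> dict:
    """Build the helm CLI invocations for one id-keyed release."""
    release = f"svc-{service_id}"
    return {
        # `helm status` exits non-zero when the release does not exist.
        "status": ["helm", "status", release],
        # Install the chart, passing the id through to the templates.
        "install": ["helm", "install", release, "charts/svc",
                    "--set", f"service.id={service_id}"],
    }

def ensure_release(service_id: str) -> None:
    """Install the release for this id if it is not already running."""
    cmds = helm_commands(service_id)
    if subprocess.run(cmds["status"], capture_output=True).returncode != 0:
        subprocess.run(cmds["install"], check=True)
```

A request for an unknown id would call `ensure_release(n)`, then look up the new Service's address (via the Kubernetes API, as described above) before proxying the user there.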

-- Nepomucen
Source: StackOverflow