Scale up and Scale down in containers

12/11/2019

I am learning containers and microservices. This question might be basic. Let us assume I have a microservice running in a container on my PC. I would like to know what happens during a container scale-up and scale-down. Please explain with an example.

-- Baala Srinivas K
containers
docker
kubernetes
microservices

2 Answers

12/11/2019

Your microservice might run as a number of replicated containers, each performing the same function. For example, if you have a web service, instead of just a single web server, you might have multiple web servers, each running as a container and serving the same web page. You would then also have a load balancer in front of these replicas that forwards incoming requests to any of these containers.

The reason for doing this is to increase the capacity of your microservice. If a single web server can serve 10 requests per second, then 10 web servers together can serve 100 requests per second. So, you increase the total capacity of your service to 100 requests per second.

Now, scaling up/down simply means increasing/decreasing the number of replica containers. For example, you can scale up to 20 replicas, which increases the total capacity of your service to 200 requests per second. If you don't need that much capacity and want to save resources, you could scale the application down to 5 replicas, which decreases the capacity to 50 requests per second.

In Kubernetes, the containers run as Pods, the object that manages a set of replicated Pods is a Deployment, and the load balancer is a Service.
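As a minimal sketch of this setup (the name `web` and the `nginx` image are placeholders, not from the question), a Deployment managing 10 replicas with a Service load-balancing in front might look like:

```yaml
# Deployment: manages 10 replicated Pods, each running the same web server
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 10          # total capacity = 10 x per-replica capacity
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25   # any web server image works here
          ports:
            - containerPort: 80
---
# Service: forwards incoming requests to any of the 10 Pods
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
```

Scaling up to 20 replicas (or down to 5) is then a single command, e.g. `kubectl scale deployment web --replicas=20`.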

-- weibeld
Source: StackOverflow

12/11/2019

I will assume from the tone of the question that this is more of a design question, and since you have tagged both kubernetes and docker, I will assume you are planning to use them or are already using them to some extent.

OK, now to the premise of the question: suppose you have a microservice, which in this case is interchangeable with a Docker container, running on your machine or VM.

So what does it mean to scale the service?

In Docker

  • You run multiple container instances of the same image on the said VM.
  • These instances are the same Docker image running on multiple assigned ports, serving the same application, web app, or static website.
  • The application is now scaled to a scale of 2, running on, say, ports 8080 and 8081, serving a simple web app.
  • To make these scaled instances useful for your use case, you add a load balancer on top of the two applications, which can be done with an Nginx reverse proxy.
  • To scale down, you stop one of the running container instances using `docker stop <container-id>`.
  • You have now scaled down to a scale of 1.
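The steps above might look like this on the command line (the image name `my-web-app` and container names are placeholders; this is a sketch, not a tested recipe for your setup):

```shell
# Scale "up" to 2 instances: run the same image on two host ports
docker run -d --name web1 -p 8080:80 my-web-app
docker run -d --name web2 -p 8081:80 my-web-app

# An Nginx reverse proxy in front would then balance traffic
# between localhost:8080 and localhost:8081.

# Scale "down" to 1 instance: stop and remove one container
docker stop web2
docker rm web2
```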

In Kubernetes

  • In a Kubernetes orchestration environment, which uses a container runtime such as Docker under the hood to do its cool stuff, the load-balancing and scaling mechanisms are already handled by Kubernetes itself.
  • So, in this case, say you have a web app container running in a Pod managed by a Deployment. When you execute a command like `kubectl scale deployment <name> --replicas=2`, this calls the Kubernetes API, which instructs the container runtime below to launch another instance of your image. (Note that you scale a Deployment or ReplicaSet, not an individual Pod.)
  • And since load balancing is handled by Kubernetes, you just get a simple endpoint provisioned for you (there are other concepts like Services and load-balancer endpoints, but we can skip those for this argument).
  • In a similar fashion, you can scale down the number of running replicas, and Kubernetes will take care of everything else while active traffic continues to be served.
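For illustration, scaling a Deployment up and back down might look like this (the name `web-app` is a placeholder; this assumes a Deployment by that name already exists in your cluster):

```shell
# Scale up to 2 replicas: Kubernetes launches another Pod from the same image
kubectl scale deployment web-app --replicas=2

# Scale back down to 1 replica: Kubernetes terminates one Pod gracefully
kubectl scale deployment web-app --replicas=1

# Check the current replica count
kubectl get deployment web-app
```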
-- damitj07
Source: StackOverflow