I have deployed 5 apps using Azure container instances, these are working fine, the issue I have is that currently, all containers are running all the time, which gets expensive.
What I want to do is to start/stop instances when required using for this a Master container or VM that will be working all the time.
E.G.
This master service gets a request to spin up service number 3 for 2 hours then shut it down and all other containers will be off until they receive a similar request.
For my use case, each service will be used for less than 5 hours a day most of the time.
Now, I know Kubernetes its an engine made to manage containers but all examples I have found are for high scale services, not for 5 services with only one container each, also not sure if Kubernetes allows to have all the containers off most of the time.
What I was thinking on is to handle all these throw some API, but I'm not fiding any service in Azure that allows something similar to this, I have only found options to create new containers, not to spin up and shut them down.
EDIT:
Also, this apps run process that are to heavy to have them on a serverless platform.
Solution is to define horizontal pod autoscaler for your deployment.
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can’t be scaled, for example, DaemonSets.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment to match the observed average CPU utilization to the target specified by user.
Configuration file should looks like this:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: hpa-images-service
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: example-deployment
minReplicas: 2
maxReplicas: 100
targetCPUUtilizationPercentage: 75
scaleRef should refer toyour deployment definition and minReplicas you can set as 0, value of targetCPUUtilization you can set according to your preferences.. Such approach should help you to save money due to termination pod which have high CPU utilization.
Kubernetes official documentation: kubernetes-hpa.
GKE autoscaler documentation: gke-autoscaler.
Useful blog about saving cash using GCP: kubernetes-google-cloud.