We are running 4 different microservices on 4 different EC2 Auto Scaling groups:
service-1 - vcpu:4, RAM:32 GB, VM count:8
service-2 - vcpu:4, RAM:32 GB, VM count:8
service-3 - vcpu:4, RAM:32 GB, VM count:8
service-4 - vcpu:4, RAM:32 GB, VM count:16
We are planning to migrate this workload to EKS (in containers).
We need help deciding on the right node configuration (in EKS) to start with. We can start with a small machine (vcpu:4, RAM:32 GB), but we will not get any cost saving, as each container will need a separate VM. We can use a large machine (vcpu:16, RAM:128 GB), but when these machines scale out, each added machine will be large and may therefore be underutilized. Or we can go with a medium machine (vcpu:8, RAM:64 GB).
Apart from this recommendation, we are also evaluating the cost saving of moving to containers. As per our understanding, every VM comes with the following overhead
Note: One large VM vs many small VMs cost the same on public cloud as cost is based on number of vCPUs + RAM.
Hypervisor/virtualization cost only applies when running on-prem, so there is no need to consider it. On the 2nd point, how many resources does a typical Linux machine need just to run the OS? If we provision a small machine (vcpu:2, RAM:4 GB), CPU usage is approximately 0.2% and memory consumption (other than user space) is about 500 MB. So running large instances (5 instances, compared to 40 small instances) would save 35 times this CPU and RAM, which does not seem significant.
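As a rough check of the arithmetic above, here is a minimal sketch. It assumes the 0.2% CPU figure is a fraction of the 2 vCPUs on the machine where it was measured, and that the fleet being compared is the current 40 VMs of 4 vCPU / 32 GB each; the function name and parameters are made up for illustration.

```python
def os_overhead_saved(small_count, large_count, mem_mb_per_os,
                      cpu_fraction_per_os, measured_vcpus):
    """Estimate the OS overhead removed by running fewer, larger instances.

    Assumption: every instance pays the same fixed OS overhead regardless
    of its size, so the saving is proportional to the instances removed.
    """
    removed = small_count - large_count
    mem_saved_gb = removed * mem_mb_per_os / 1024
    vcpu_saved = removed * cpu_fraction_per_os * measured_vcpus
    return removed, mem_saved_gb, vcpu_saved


# Figures from the question: 40 small vs 5 large instances,
# about 500 MB and 0.2% CPU of OS overhead, measured on a 2 vCPU machine.
removed, mem_saved_gb, vcpu_saved = os_overhead_saved(40, 5, 500, 0.002, 2)

# Current fleet for comparison: 40 VMs of 4 vCPU / 32 GB each.
total_mem_gb = 40 * 32
total_vcpus = 40 * 4

print(f"instances removed: {removed}")
print(f"memory saved: {mem_saved_gb:.1f} GB "
      f"({100 * mem_saved_gb / total_mem_gb:.2f}% of fleet RAM)")
print(f"vCPU saved: {vcpu_saved:.2f} "
      f"({100 * vcpu_saved / total_vcpus:.3f}% of fleet vCPUs)")
```

Under these assumptions the saving is on the order of one to two percent of fleet memory and a fraction of one vCPU, which matches the impression that it is not significant.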
You are unlikely to see any savings in resource cost when you move from applications running directly on VMs to containers in EKS.
A Linux container is just an isolated Linux process with specified resource limits; it is no different from a normal process when it comes to resource consumption. EKS still uses virtual machines to provide compute to the cluster, so you will still be running processes on a VM whether they are containerized or not, and from a resource point of view the two are equal. (See this answer for a more detailed comparison of VMs and containers)
When you add Kubernetes to the mix, you are actually adding more overhead compared to running directly on VMs. The Kubernetes control plane runs on a set of dedicated VMs. In EKS those are fully managed by AWS, but Amazon charges a small hourly fee for each cluster.
In addition to the dedicated control plane nodes, each worker node in the cluster needs a set of system components and pods (kubelet, kube-proxy etc.) to function properly, and you may also define containers that must run on each node (daemon sets), like log collectors and security agents.
When it comes to sizing the nodes you need to find a balance between scaling and cost optimization.
I tend to prefer small nodes, so that scaling can be handled efficiently. They should be slightly larger than what the largest containers require, so that system pods and daemon sets can also fit.
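To make that trade-off concrete, here is a minimal sketch that compares the three node sizes from the question. The numbers are assumptions for illustration only: each pod is assumed to request the same 4 vCPU / 32 GB as the current VMs, and 0.5 vCPU / 2 GB per node is a made-up reservation for system pods and daemon sets. Measure your own pods and agents before relying on figures like these.

```python
import math


def pods_per_node(node_vcpu, node_mem_gb, pod_vcpu, pod_mem_gb,
                  reserved_vcpu, reserved_mem_gb):
    """Number of application pods that fit on one node after the
    per-node reservation for system pods and daemon sets."""
    usable_vcpu = node_vcpu - reserved_vcpu
    usable_mem_gb = node_mem_gb - reserved_mem_gb
    if usable_vcpu <= 0 or usable_mem_gb <= 0:
        return 0
    return int(min(usable_vcpu // pod_vcpu, usable_mem_gb // pod_mem_gb))


def nodes_needed(pod_count, per_node):
    """Nodes required for pod_count pods, or None if no pod fits."""
    if per_node == 0:
        return None
    return math.ceil(pod_count / per_node)


# Hypothetical inputs: 40 pods (8 + 8 + 8 + 16), each requesting
# 4 vCPU / 32 GB, and 0.5 vCPU / 2 GB reserved on every node.
POD_COUNT = 40
POD_VCPU, POD_MEM_GB = 4, 32
RESERVED_VCPU, RESERVED_MEM_GB = 0.5, 2

# Candidate node sizes from the question: (vCPU, RAM in GB).
candidates = {"small": (4, 32), "medium": (8, 64), "large": (16, 128)}

results = {}
for name, (vcpu, mem_gb) in candidates.items():
    per_node = pods_per_node(vcpu, mem_gb, POD_VCPU, POD_MEM_GB,
                             RESERVED_VCPU, RESERVED_MEM_GB)
    results[name] = (per_node, nodes_needed(POD_COUNT, per_node))
    print(name, results[name])
```

With these inputs the small node cannot hold a single pod once the reservation is subtracted, the medium node holds one pod and leaves almost half of itself idle, and the large node holds three. If the pods can request somewhat less than a full 4 vCPU / 32 GB, the picture changes, so it is worth rerunning the comparison with measured requests.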