I'm currently working on deploying an Elasticsearch cluster in K8s. Can anyone help me understand the pros and cons of deploying the ES cluster inside our K8s cluster versus outside it? Thanks in advance!
A big pro is data ingestion. If you have your ES cluster inside your k8s cluster, data ingestion will be faster.
However, a big con is resources. ES will eat your resources worse than Google Chrome eats your RAM. And I mean a lot.
And maintaining it can be quite cumbersome. Not sure about your use case but if it is logging (as in most cases), usually cloud providers have their own solution for that.
If not, then:
I would recommend having dedicated nodes for ES in your cluster; otherwise ES might starve other pods when it hits a peak and starts using a lot of node resources.
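One common way to get dedicated nodes is to label and taint them, then give the ES pods a matching node selector and toleration. Here is a minimal sketch in Python that builds that part of the pod spec. The `dedicated=elasticsearch` key/value and the function name are made up for illustration; nothing in Kubernetes or ES requires those names.

```python
import json

# Hypothetical label/taint used to mark the dedicated nodes, e.g. set with:
#   kubectl label nodes <node> dedicated=elasticsearch
#   kubectl taint nodes <node> dedicated=elasticsearch:NoSchedule
DEDICATED_KEY = "dedicated"
DEDICATED_VALUE = "elasticsearch"


def dedicated_scheduling(key=DEDICATED_KEY, value=DEDICATED_VALUE):
    """Return the pod-spec fields that pin a pod to the dedicated nodes."""
    return {
        # Only schedule onto nodes carrying the label.
        "nodeSelector": {key: value},
        # Tolerate the matching NoSchedule taint; pods without this
        # toleration are kept off the tainted nodes.
        "tolerations": [
            {
                "key": key,
                "operator": "Equal",
                "value": value,
                "effect": "NoSchedule",
            }
        ],
    }


if __name__ == "__main__":
    # Merge the result into spec.template.spec of the ES StatefulSet manifest.
    print(json.dumps(dedicated_scheduling(), indent=2))
```

The taint keeps other workloads off the ES nodes, and the node selector keeps ES off everything else, so a peak in ES only hurts ES.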
Also make sure to familiarize yourself with hot-warm-cold data tiers and tune them for your data; it will save you a lot of time and resources.
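For the hot-warm-cold part, index lifecycle management (ILM) in Elasticsearch is the usual tool. Below is a rough sketch of a policy body, built in Python so it can be dumped as JSON and sent to `PUT _ilm/policy/<name>`. The ages, sizes, and the custom `data` node attribute are placeholder assumptions, not recommendations; tune them to your own retention needs.

```python
import json


def logs_ilm_policy(warm_after="7d", cold_after="30d", delete_after="90d"):
    """Build an ILM policy body with hot, warm, cold and delete phases."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over to a fresh index once it gets old or big.
                        "rollover": {"max_age": "1d", "max_size": "50gb"}
                    }
                },
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        # Merge segments; the index is no longer written to.
                        "forcemerge": {"max_num_segments": 1},
                        # Move shards to nodes started with
                        # node.attr.data=warm (custom attribute, assumed).
                        "allocate": {"require": {"data": "warm"}},
                    },
                },
                "cold": {
                    "min_age": cold_after,
                    "actions": {"allocate": {"require": {"data": "cold"}}},
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(logs_ilm_policy(), indent=2))
```

Recent ES versions also ship built-in data tiers that can take the place of the custom attribute, so check what your version supports before copying this.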
EDIT
I haven't emphasized how important this faster data ingestion is, so it might not seem like a good enough reason to deploy ES inside the cluster. The bottom line is pretty obvious: network latency and bandwidth.
These things can really add up (picking up all those logs from all those pods, then scaling those same pods, then expanding the cluster, then again...), so every unit counts. If your VMs do not suffer from those two (meaning they have the same latency as any other node of the cluster), I think it won't make a huge difference.
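To get a feel for how this adds up, here is a back-of-the-envelope estimate of sustained log traffic towards ES. The function name and all the numbers are made up for illustration; plug in your own pod counts and log rates.

```python
def log_bandwidth_mbit(pods, lines_per_sec_per_pod, avg_line_bytes):
    """Estimate sustained log traffic towards ES in megabits per second."""
    bytes_per_sec = pods * lines_per_sec_per_pod * avg_line_bytes
    return bytes_per_sec * 8 / 1_000_000


if __name__ == "__main__":
    # Made-up numbers: 200 pods, 50 lines/s each, 300 bytes per line.
    print(log_bandwidth_mbit(200, 50, 300))  # 24.0
    # Scale the same pods to twice as many and the traffic doubles.
    print(log_bandwidth_mbit(400, 50, 300))  # 48.0
```

This ignores protocol overhead, compression and bursts, so treat it as a rough lower bound for sizing rather than a measurement.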
On the other hand, I see no big benefit in separating them from the cluster. It is a part of your infrastructure anyway.
What if tomorrow you decide to switch to AWS or GKE? You would have to change your deployments and set up the whole thing again. On the other hand, if it's already a part of your cluster, it is just a kubectl apply.
I can also guess that you will try to set up an ELK stack. If time and goodwill allow, give fluentd a chance (it is 100% compatible with all logstash clients but much more lightweight).