Is it best practice to place monitoring tools like Prometheus and Grafana inside a Kubernetes cluster or outside a Kubernetes cluster?
I can see the case for either. It seems very easy to place it inside the cluster. But that seems less robust.
It seems people do this typically, likely they are running everything in their environment or app under K8S. In a bigger picture view if you have use cases outside of one specific app it likely makes sense to run this on another architecture. The reason why is that Prometheus doesn't support clustering. You can write to two instances, but there is not really an HA plan for this product. To me, that's a major problem for a monitoring technology.
Most organizations who use this tool heavily end up not meeting use cases which APM (transaction tracing and diagnostics) can do. Additionally, you'll need to deploy an ELK/Splunk type stack, so it gets very complex. They also find it difficult to manage and often will look towards a Datadog, SingalFx, Sysdig, or another system which can do more and is fully managed. Naturally, most of what I have said here has cost, so if you do not have a budget then you'll have to spend your time (time=money) doing all the work.