Kubernetes and Cloud Databases

9/9/2018

Could someone explain the benefits/issues with hosting a database in Kubernetes via a persistent volume claim combined with a storage volume over using an actual cloud database resource?

-- Barry Jacobs
database
kubernetes

2 Answers

9/9/2018

It's essentially a trade-off: convenience vs control. Take a concrete example: let's say you pay Amazon to use Athena, which is really just a nicely packaged version of Facebook's Presto that AWS kindly operates for you in exchange for $$. You could run Presto on EKS yourself, but why would you?

Now, let's say you want or need to use Apache Drill or Apache Impala. Amazon doesn't offer either. Nor do any of the other big public cloud providers at the time of writing, as far as I know.

Another thought: what if you want to migrate off of AWS? Your data has gravity as well.

-- Michael Hausenblas
Source: StackOverflow

9/9/2018

Could someone explain the benefits/issues with hosting a database in Kubernetes ... over using an actual cloud database resource?

As the previous excellent answer noted:

It's essentially a trade-off: convenience vs control

In addition to the previous example (Athena), take a look at RDS as well and see what you would need to handle yourself (and, as already said, why would you?):

  • Automatic backups
  • Multizone deployments
  • Snapshots
  • Engine upgrades
  • Read replicas

and other bells and whistles that come with a managed service as opposed to a self-hosted, self-managed one.

But there is more to it than just convenience/control, and that is what this post is trying to shed light on:

Kubernetes adds another layer of abstraction (pods, services...), and depending on how you handle storage (persistent volumes), there are two additional considerations:

  • Access speed (depending on your use case this can be negligible or a show stopper).
  • The storage you have at hand might not be optimized for relational-database-style I/O (or might restrict how efficiently you can schedule pods). These are the very same reasons you are advised not to run a db on NFS, for example.
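
To get a rough feel for the access-speed point on your own volume, something like the following might help. It is only a minimal sketch, not a proper benchmark: it times small synchronous writes (8 KiB, PostgreSQL's default page size) followed by fsync, which is roughly what a database commit has to wait for. The function name `measure_fsync_latency` and the `VOLUME_PATH` environment variable are made up for this illustration; point the path at wherever your persistent volume is mounted.

```python
import os
import statistics
import tempfile
import time


def measure_fsync_latency(directory, writes=50, block_size=8192):
    """Return per-write latencies (in seconds) for fsync'd block writes in `directory`."""
    payload = os.urandom(block_size)
    latencies = []
    # Create a scratch file on the volume under test; it is removed afterwards.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage, like a commit would
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # VOLUME_PATH is a hypothetical variable; it falls back to the system temp dir.
    target = os.environ.get("VOLUME_PATH", tempfile.gettempdir())
    samples = measure_fsync_latency(target)
    print(f"median fsync'd write: {statistics.median(samples) * 1000:.3f} ms")
    print(f"worst fsync'd write:  {max(samples) * 1000:.3f} ms")
```

Run it once inside a pod against the mounted volume and once against local disk and compare the numbers. If the gap is large, that is the kind of thing that turns "negligible" into "show stopper" for a write-heavy database.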

Several recent conference talks on Kubernetes have argued that running a database on Kubernetes is a big no-no. This is highly opinionated (we do run average-load MySQL and PostgreSQL databases in k8s), but heavy load and fast I/O are somewhat challenging to get right on k8s, whereas in a managed cloud solution somebody has already fine-tuned everything for you.

In conclusion:

It is all about convenience, control and capabilities.

-- Const
Source: StackOverflow