Does Elasticsearch need persistent storage when deployed on Kubernetes?

11/9/2015

The Kubernetes example of an Elasticsearch production deployment includes a warning about using emptyDir. It advises that the volume "be adapted according to your storage needs" and links to the Kubernetes documentation on persistent storage.

Is it better to use persistent storage, which is external to the node and therefore requires (heavy) I/O over the network, or can we deploy a reliable Elasticsearch cluster using multiple data nodes with local emptyDir storage?

Context: We're deploying Kubernetes on commodity hardware, and we prefer not to use a SAN for the storage layer (because it doesn't seem like commodity hardware).

-- Reza Mohammadi
elasticsearch
kubernetes
persistence

1 Answer

11/9/2015

The warning is there so that folks don't assume that emptyDir provides a persistent storage layer. An emptyDir volume persists only as long as the pod is running on the same host. If the host is replaced or its disk becomes corrupted, all of the data is lost. Using network-mounted storage is one way to work around both of these failure modes. If you want to use replicated storage instead, that works as well.
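
To make the two options concrete, here is a minimal sketch of a pod spec, not the manifest from the example itself. The pod name, image, mount path and claim name are all placeholders I made up for illustration. The only difference between the two approaches is the volume source at the bottom.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: es-data                  # hypothetical pod name
spec:
  containers:
  - name: elasticsearch
    image: elasticsearch         # placeholder image; substitute your own
    volumeMounts:
    - name: storage
      mountPath: /data           # assumed data path; match your Elasticsearch config
  volumes:
  - name: storage
    # Option 1: node-local storage. The data is deleted when the pod
    # is removed from the node, and is lost if the node or its disk fails.
    emptyDir: {}
    # Option 2: swap the emptyDir line above for a claim on a persistent
    # volume (for example, network-mounted storage).
    # persistentVolumeClaim:
    #   claimName: es-data-claim # hypothetical claim; it must already exist
```

With the emptyDir option, the data on a node is only as durable as that node, so any durability has to come from somewhere else, such as the replication mentioned above.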

-- Robert Bailey
Source: StackOverflow