Kubernetes: How to manage data with multiple replicas?

5/21/2019

I am currently learning Kubernetes and I'm stuck on how to handle the following situation:

I have a Spring Boot application that handles files (photos, PDFs, etc.) uploaded by users; users can also download these files. The application also produces logs, which are spread across 6 different files. To make my life easier, I decided to use a single root directory (appData) containing 2 subdirectories, one for user data and one for logs, so the application works with only 1 directory:

.appData
     |__ usersData
     |__ logsFile

I would like to use GKE (Google Kubernetes Engine) to deploy this application but I have these problems:

  • How do I handle multiple replicas that concurrently read and write data and logs in the appData directory?
  • Regarding logs, is it possible to have multiple Pods writing to the same file?
  • Say we have 3 replicas (Pod-A, Pod-B and Pod-C). If user A uploads a file that is handled by Pod-B, how will Pod-A and Pod-C find this file when the same user requests it later?
  • Should each replica have its own volume? (I would like to avoid this, which seems to be what happens when using a StatefulSet.)
  • Should I have only one replica? (Using Kubernetes would be pointless in that case.)

The same questions apply to database replicas. I use PostgreSQL: if there are multiple replicas and requests are sent to them at random, how can I be sure that a query will return the data?

I know there are a lot of questions. Thanks a lot for your clarifications.

-- akuma8
database
google-kubernetes-engine
kubernetes
replication

2 Answers

5/21/2019

I'd use two separate solutions, one for logs and one for shared files.

For logs, look at a log aggregator like Fluentd.

For a shared file system, you want NFS. Take a look at this example: https://github.com/kubernetes/examples/tree/master/staging/volumes/nfs. The NFS server is backed by a persistent volume from GKE, Azure, or AWS. The setup is not cloud-agnostic per se, but the only thing you change to run in a different cloud is the provisioner.
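
As a rough sketch of what the shared volume could look like, the manifest below declares an NFS-backed PersistentVolume and a matching claim with the ReadWriteMany access mode, which is what lets several Pods mount the same directory at once. The names appdata-nfs and appdata-claim, the server address, the export path and the size are all placeholders chosen for illustration, not values taken from the linked example.

```yaml
# Assumption: an NFS server is already reachable from the cluster.
# Every name, address and size here is a placeholder.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: appdata-nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany          # many Pods may mount the volume read-write
  nfs:
    server: 10.0.0.2         # placeholder NFS server address
    path: /exports           # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: appdata-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""       # empty string: bind to a pre-created PV, no dynamic provisioning
  volumeName: appdata-nfs    # bind explicitly to the PV declared above
  resources:
    requests:
      storage: 10Gi
```

With a claim like this, every replica sees the same files, so an upload handled by one Pod is visible to the others without giving each replica its own volume.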

-- frankd
Source: StackOverflow

5/21/2019

You can use a persistent volume backed by NFS in GKE (Google Kubernetes Engine) to share files across pods. Google Cloud Filestore is one way to provide it: https://cloud.google.com/filestore/docs/accessing-fileshares
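
To illustrate how the pods would consume such a volume, here is a minimal Deployment sketch with 3 replicas that all mount the same claim. It assumes a PersistentVolumeClaim named appdata-claim already exists and is bound to an NFS-backed volume (for example a Filestore share) with the ReadWriteMany access mode; the claim name, the Deployment name file-app, the image and the mount path are hypothetical.

```yaml
# Assumption: a PVC named "appdata-claim" (hypothetical) already exists
# and supports ReadWriteMany. Image and paths are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: file-app
spec:
  replicas: 3                          # Pod-A, Pod-B and Pod-C from the question
  selector:
    matchLabels:
      app: file-app
  template:
    metadata:
      labels:
        app: file-app
    spec:
      containers:
        - name: file-app
          image: example.com/file-app:latest   # placeholder image
          volumeMounts:
            - name: app-data
              mountPath: /appData/usersData    # uploads land on the shared volume
      volumes:
        - name: app-data
          persistentVolumeClaim:
            claimName: appdata-claim           # same claim for every replica
```

Only the user data directory is mounted from the share in this sketch; it would be just as valid to mount the whole appData directory, at the cost of several pods writing log files to the same location.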

-- dassum
Source: StackOverflow