Multiple Kubernetes pods sharing the same host-path/pvc will duplicate output

10/13/2017

I have a small problem and need to know the best way to approach/solve it.

I have deployed a few pods on Kubernetes and so far I have enjoyed learning about and working with it. I set up the persistent volume, volume claim, etc., and can see my data on the host, as I need those files for further processing.

Now the issue: 2 pods (2 replicas) sharing the same volume claim write to the same location on the host. That is expected, but unfortunately it causes the data to be duplicated in the output file.

What I need is:

  • To have a unique output for each pod on the host. Is the only way to achieve this to have two deployment files, in my case, each using a different volume claim/persistent volume? At the same time I'm not sure whether that is an optimal approach for future updates, upgrades, availability of a certain number of pods, etc.
  • Or can I still have one deployment file with 2 or more replicas and avoid the output duplication when sharing the same PVC?

Please note that I have a single-node deployment, and that's why I'm using hostPath at the moment.

Creating the PV:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: ls-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/ls-data/my-data2"

Claiming the PV:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ls-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

How I use my pv inside my deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: logstash
  namespace: default
  labels:
    component: logstash
spec:
  replicas: 2
  selector:
    matchLabels:
      component: logstash
#omitted 
        ports:
        - containerPort: 5044
          name: logstash-input
          protocol: TCP
        - containerPort: 9600
          name: transport
          protocol: TCP
        volumeMounts:
        - name: ls-pv-store
          mountPath: "/logstash-data"
      volumes:
      - name: ls-pv-store
        persistentVolumeClaim:
          claimName: ls-pv-claim
-- Zee
kubernetes

1 Answer

10/15/2017

Depending on what exactly you are trying to achieve, you could use StatefulSets instead of Deployments. Each Pod spawned from the StatefulSet's Pod template can have its own separate PersistentVolumeClaim, created from the volumeClaimTemplate (see the link for an example). You will need a StorageClass set up for this.
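A minimal sketch of what that StatefulSet could look like, assuming a StorageClass named `manual` exists and using an illustrative Logstash image tag; the names here are examples, not the asker's exact config:

```yaml
# Each replica (logstash-0, logstash-1, ...) gets its own PVC stamped
# out from volumeClaimTemplates, so their output never overlaps.
apiVersion: apps/v1beta1        # StatefulSet API group as of late 2017
kind: StatefulSet
metadata:
  name: logstash
spec:
  serviceName: logstash
  replicas: 2
  template:
    metadata:
      labels:
        component: logstash
    spec:
      containers:
      - name: logstash
        image: docker.elastic.co/logstash/logstash:5.6.2   # illustrative tag
        volumeMounts:
        - name: ls-data
          mountPath: /logstash-data
  volumeClaimTemplates:
  - metadata:
      name: ls-data
    spec:
      storageClassName: manual
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 100Gi
```

Note that hostPath PVs don't mix well with dynamic provisioning; on a single node you'd typically pre-create one PV per replica so each claim can bind.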

If you are looking for something simpler, you can write to /mnt/volume/$HOSTNAME from each Pod. This also ensures that they use separate files, as the hostnames of the Pods are unique.
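Applied to the asker's Deployment, that could look like the following container snippet. This is a sketch under assumptions: the image tag is illustrative, and `--path.data` is Logstash's flag for its data directory; the key point is only that $HOSTNAME expands to a unique per-pod name:

```yaml
# Sketch: each replica writes under its own $HOSTNAME subdirectory of
# the shared volume, so the two pods no longer clobber the same file.
containers:
- name: logstash
  image: docker.elastic.co/logstash/logstash:5.6.2   # illustrative tag
  command:
  - sh
  - -c
  - mkdir -p /logstash-data/$HOSTNAME && exec logstash --path.data /logstash-data/$HOSTNAME
  volumeMounts:
  - name: ls-pv-store
    mountPath: /logstash-data
```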

-- Janos Lenart
Source: StackOverflow