PVC template and failure-domain.beta.kubernetes.io/zone

11/11/2016

I have a PetSet with

  volumeClaimTemplates:
  - metadata:
      name: content
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
  - metadata:
      name: database
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
  - metadata:
      name: file
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
  - metadata:
      name: repository
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi

If I annotate it for dynamic volume provisioning, the claims and volumes get created in random availability zones and the pets can't start: each pet in this example needs exactly four 2Gi volumes, and since the volumes are bound to a zone, all four have to land in the same zone for the pod to be schedulable.

If I create the volumes manually, I can label them with failure-domain.beta.kubernetes.io/zone: us-east-1d (for example) and then write PVCs whose selector uses matchLabels on that failure-domain label; a sketch of that approach is below.

But how do I do something similar with volumeClaimTemplates? I don't want to pin everything to a single failure domain, of course, but for some reason the volume claim templates won't even create all of the volumes for one pet in the same failure domain.
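For illustration, the manual approach I mean looks roughly like this: a pre-created PV labelled with its zone, and a claim whose selector matches on that label (the names and the EBS volume ID below are just placeholders):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: content-pv-0                  # placeholder name
  labels:
    failure-domain.beta.kubernetes.io/zone: us-east-1d
spec:
  capacity:
    storage: 2Gi
  accessModes: [ "ReadWriteOnce" ]
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0   # placeholder EBS volume ID
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: content
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 2Gi
  selector:
    matchLabels:
      # only bind to a volume carrying this zone label
      failure-domain.beta.kubernetes.io/zone: us-east-1d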

Ideas?

-- Dmytro Leonenko
amazon-web-services
kubernetes

2 Answers

11/16/2016

PV creation isn't part of the StatefulSet code, so it doesn't "know" that all of the volumes for a given pod need to be in the same failure domain.

There's a piece of code in the volume provisioner that recognises PetSet-style claim names, hashes the base name, and applies the ordinal as an offset when picking a zone.

Thus, claims with the same base name get spread across zones. But here we have four different base names, so they hash differently and the four -0 volumes belonging to the same pet can end up in different zones.

If this is a requirement for you, raising an issue in the tracker about it is likely your best way forward.

However, have you considered doing the following?

apiVersion: apps/v1alpha1
kind: PetSet
# metadata, serviceName, replicas and so on omitted for brevity
spec:
  template:
    spec:
      containers:
      - # container name, image and so on omitted for brevity
        volumeMounts:
        # the single claim is mounted several times under different subPaths
        - name: contentdatabasefilerepository
          mountPath: /var/www/content
          subPath: content
        - name: contentdatabasefilerepository
          mountPath: /var/database
          subPath: database
        # ...and likewise for the file and repository subPaths
  volumeClaimTemplates:
  - metadata:
      name: contentdatabasefilerepository
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 8Gi

This way you only have one volume per pet, but you can mount subpaths of it in multiple places. It won't limit each directory to 2Gi, though, so it might not be suitable for your use case.

-- Mike Bryant
Source: StackOverflow

11/16/2016

You can create a storage class and specify the failure zone there. For example, create a storage class like this:

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: gp2storage
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1b
  encrypted: "true"

In the example above, we're creating PVs in the us-east-1b zone on AWS. Then reference that storage class in your volume claim template:

volumeClaimTemplates:
  - metadata:
      name: data
      annotations:
        volume.beta.kubernetes.io/storage-class: gp2storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
-- Steve Sloka
Source: StackOverflow