Cluster: 1 master, 2 workers.
I am deploying a StatefulSet with 3 replicas that uses local volumes through PVs (a kubernetes.io/no-provisioner StorageClass). I created 2 PVs, one for each worker node.
Expectation: the pods will be scheduled on both workers, sharing the same volume.
Result: all 3 stateful pods are created on a single worker node.
YAML:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-local-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv-1
spec:
  capacity:
    storage: 2Gi
  # volumeMode field requires BlockVolume Alpha feature gate to be enabled.
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv-2
spec:
  capacity:
    storage: 2Gi
  # volumeMode field requires BlockVolume Alpha feature gate to be enabled.
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /mnt/vol2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node2
---
# Headless service for stable DNS entries of StatefulSet members.
apiVersion: v1
kind: Service
metadata:
  name: test
  labels:
    app: test
spec:
  ports:
  - name: test-headless
    port: 8000
  clusterIP: None
  selector:
    app: test
---
apiVersion: v1
kind: Service
metadata:
  name: test-service
  labels:
    app: test
spec:
  ports:
  - name: test
    port: 8000
    protocol: TCP
    nodePort: 30063
  type: NodePort
  selector:
    app: test
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-stateful
spec:
  selector:
    matchLabels:
      app: test
  serviceName: test  # must match the headless Service defined above
  replicas: 3
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
      - name: container-1
        image: <Image-name>
        imagePullPolicy: Always
        ports:
        - name: http
          containerPort: 8000
        volumeMounts:
        - name: localvolume
          mountPath: /tmp/
      volumes:
      - name: localvolume
        persistentVolumeClaim:
          claimName: example-local-claim
This happened because Kubernetes doesn't guarantee any particular distribution of pods across nodes on its own, and in your case the shared volume makes it worse: all replicas mount the single PVC example-local-claim, and once that claim binds to one of the node-local PVs, every pod that uses it is pinned to that node. The mechanism for controlling placement is pod affinity/anti-affinity. For distributing pods across all workers, you can use pod anti-affinity. Furthermore, you can use soft anti-affinity (the differences I explain here); it isn't strict and still allows all of your pods to be scheduled. For example, the StatefulSet will look like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: my-app
  serviceName: my-app  # assumes a headless Service named my-app exists
  replicas: 3
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - my-app
            topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      containers:
      - name: app-name
        image: k8s.gcr.io/super-app:0.8
        ports:
        - containerPort: 21
          name: web
Because this example uses requiredDuringSchedulingIgnoredDuringExecution, the StatefulSet will place each pod on a different worker; if there are not enough workers, the remaining pods will stay Pending. With soft anti-affinity, the scheduler would instead fall back to spawning the extra pods on a node where one already exists.
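If you want the soft variant instead, the rule moves under preferredDuringSchedulingIgnoredDuringExecution, which takes a weight plus a podAffinityTerm. A minimal sketch of the pod spec's affinity section, assuming the same app: my-app labels:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - my-app
          topologyKey: kubernetes.io/hostname

With this rule the scheduler prefers spreading the pods but will still place extras on a node that already runs one.

Note also that anti-affinity alone will not fix your original manifests: every replica there mounts the single claim example-local-claim, and a pod forced onto the other worker could not use that node-local PV. To give each replica its own local PV, a StatefulSet normally declares volumeClaimTemplates instead of a shared volumes entry; a sketch, reusing your local-storage class and the localvolume name from your volumeMounts:

  volumeClaimTemplates:
  - metadata:
      name: localvolume
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: local-storage
      resources:
        requests:
          storage: 1Gi

With WaitForFirstConsumer, each generated claim binds to a local PV on whichever node its pod lands, so the pods can spread across the workers without losing their storage.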