I try to set up kubernetes cluster. I have Persistent Volomue, Persistent Volume Claim and Storage class all set-up and running but when I wan to create pod from deployment, pod is created but it hangs in Pending state. After describe I get only this warnig "1 node(s) had volume node affinity conflict." Can somebody tell me what I am missing in my volume configuration?
apiVersion: v1
kind: PersistentVolume
metadata:
creationTimestamp: null
labels:
io.kompose.service: mariadb-pv0
name: mariadb-pv0
spec:
volumeMode: Filesystem
storageClassName: local-storage
local:
path: "/home/gtcontainer/applications/data/db/mariadb"
accessModes:
- ReadWriteOnce
capacity:
storage: 2Gi
claimRef:
namespace: default
name: mariadb-claim0
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/cvl-gtv-42.corp.globaltelemetrics.eu
operator: In
values:
- master
status: {}
In my case, the root cause was that the persistent volume are in us-west-2c and the new worker nodes are relaunched to be in us-west-2a and us-west-2b. The solution is to either have more worker nodes so they are in more zones, or remove / widen node affinity for the application so that more worker nodes qualifies to be bounded to the persistent volume.
Great answer by Sownak Roy. I've had the same case of a PV being created in a different zone compared to the node that was supposed to use it. The solution I applied was based on Sownak's answer only in my case it was enough to specify the storage class without the "allowedTopologies" list, like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: cloud-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
volumeBindingMode: WaitForFirstConsumer
Different case from GCP GKE. Assume that you are using regional cluster and you created two PVC. Both were created in different zones (you didn't notice).
In next step you are trying to run the pod which will have mounted both PVC to the same pod. You have to schedule that pod to specific node in specific zone but because your volumes are on different zones the k8s won't be able to schedule that and you will receive the following problem.
For example - two simple PVC(s) on the regional cluster (nodes in different zones):
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: disk-a
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: disk-b
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
Next simple pod:
apiVersion: v1
kind: Pod
metadata:
name: debug
spec:
containers:
- name: debug
image: pnowy/docker-tools:latest
command: [ "sleep" ]
args: [ "infinity" ]
volumeMounts:
- name: disk-a
mountPath: /disk-a
- name: disk-b
mountPath: /disk-b
volumes:
- name: disk-a
persistentVolumeClaim:
claimName: disk-a
- name: disk-b
persistentVolumeClaim:
claimName: disk-b
Finally as a result it could happen that k8s won't be able schedule to pod because the volumes are on different zones.
The error "volume node affinity conflict" happens when the persistent volume claims that the pod is using are scheduled on different zones, rather than on one zone, and so the actual pod was not able to be scheduled because it cannot connect to the volume from another zone. To check this, you can see the details of all the Persistent Volumes. To check that, first get your PVCs:
$ kubectl get pvc -n <namespace>
Then get the details of the Persistent Volumes (not Volume claims)
$ kubectl get pv
Find the PVs, that correspond to your PVCs and describe them
$ kubectl describe pv <pv1> <pv2>
You can check the Source.VolumeID for each of the PV, most likely they will be different availability zone, and so your pod gives the affinity error. To fix this, create a storageclass for a single zone and use that storageclass in your PVC.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: region1storageclass
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
encrypted: "true" # if encryption required
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
values:
- eu-west-2b # this is the availability zone, will depend on your cloud provider
# multi-az can be added, but that defeats the purpose in our scenario
There a few things that can cause this error:
Node isn’t labeled properly. I had this issue on AWS when my worker node didn’t have appropriate labels(master had them though) like that:
failure-domain.beta.kubernetes.io/region=us-east-2
failure-domain.beta.kubernetes.io/zone=us-east-2c
After patching the node with the labels, the “1 node(s) had volume node affinity conflict” error was gone, so PV, PVC with a pod were deployed successfully. The value of these labels is cloud provider specific. Basically, it is the job of the cloud provider(with —cloud-provider option defined in cube-controller, API-server, kubelet) to set those labels. If appropriate labels aren’t set, then check that your CloudProvider integration is correct. I used kubeadm, so it is cumbersome to set up but with other tools, kops, for instance, it is working right away.
Based on your PV definition and the usage of nodeAffinity field, you are trying to use a local volume, (read here local volume description link, official docs), then make sure that you set "NodeAffinity field" like that(it worked in my case on AWS):
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- my-node # it must be the name of your node(kubectl get nodes)
So that after creating the resource and running describe on it it will show up there like that:
Required Terms:
Term 0: kubernetes.io/hostname in [your node name]
The "1 node(s) had volume node affinity conflict" error is created by the scheduler because it can't schedule your pod to a node that conforms with the persistenvolume.spec.nodeAffinity
field in your PersistentVolume (PV).
In other words, you say in your PV that a pod using this PV must be scheduled to a node with a label of kubernetes.io/cvl-gtv-42.corp.globaltelemetrics.eu = master
, but this isn't possible for some reason.
There may be various reason that your pod can't be scheduled to such a node:
The place to start looking for the cause is the definition of the node and the pod.
almost same problem described here... https://github.com/kubernetes/kubernetes/issues/61620
"If you're using local volumes, and the node crashes, your pod cannot be rescheduled to a different node. It must be scheduled to the same node. That is the caveat of using local storage, your Pod becomes bound forever to one specific node."