I have been working with Kubernetes (k8s) for about six months now, and I still wonder why we need both a PersistentVolume (PV) and a PersistentVolumeClaim (PVC). Can anyone please make this concept clear to me?
This information is from the book Kubernetes in Action.
Decoupling pods from the underlying storage technology
Ideally, a developer deploying their apps on Kubernetes should never have to know what kind of storage technology is used underneath, the same way they don’t have to know what type of physical servers are being used to run their pods.
When a developer needs a certain amount of persistent storage for their application, they can request it from Kubernetes, the same way they can request CPU, memory, and other resources when creating a pod. The system administrator can configure the cluster so it can give the apps what they request.
Introducing PersistentVolumes and PersistentVolumeClaims
To enable apps to request storage in a Kubernetes cluster without having to deal with infrastructure specifics, two new resources were introduced. They are PersistentVolumes and PersistentVolumeClaims.
Instead of the developer adding a technology-specific volume to their pod, it’s the cluster administrator who sets up the underlying storage and then registers it in Kubernetes by creating a PersistentVolume resource through the Kubernetes API server. When creating the PersistentVolume, the admin specifies its size and the access modes it supports.
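As a rough sketch of what the administrator registers, the snippet below builds a PersistentVolume manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The volume name, the NFS server address, and the export path are hypothetical placeholders, and NFS is only one possible backend.

```python
import json

# Hypothetical example values: the name, NFS server, and path are placeholders.
persistent_volume = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "example-pv"},
    "spec": {
        # Size of the volume the admin is exposing.
        "capacity": {"storage": "1Gi"},
        # Access modes this volume supports.
        "accessModes": ["ReadWriteOnce", "ReadOnlyMany"],
        # What happens to the volume once its claim is deleted.
        "persistentVolumeReclaimPolicy": "Retain",
        # Technology-specific part, hidden from the app developer.
        "nfs": {"server": "nfs.example.internal", "path": "/exports/data"},
    },
}

print(json.dumps(persistent_volume, indent=2))
```

Saving the output to a file (say pv.json, a made-up name) and running kubectl apply -f pv.json should register the volume, assuming the account has cluster-admin style permissions. Note that the manifest has no namespace: PVs are cluster-scoped.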
This image is also from the book Kubernetes in Action.
When a cluster user needs to use persistent storage in one of their pods, they first create a PersistentVolumeClaim manifest, specifying the minimum size and the access mode they require. The user then submits the PersistentVolumeClaim manifest to the Kubernetes API server, and Kubernetes finds the appropriate PersistentVolume and binds the volume to the claim.
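To make the claim side concrete, here is a minimal sketch of a PersistentVolumeClaim manifest, again as a Python dictionary printed as JSON. The claim name and sizes are hypothetical. The could_satisfy helper is a toy illustration of the size and access-mode check described above, not the real binding logic: the actual control plane also takes storage class, volume mode, label selectors, and other fields into account.

```python
import json

# Hypothetical claim: the name and requested size are placeholders.
persistent_volume_claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "example-pvc"},
    "spec": {
        # Minimum size the user needs.
        "resources": {"requests": {"storage": "1Gi"}},
        # Access mode the user requires.
        "accessModes": ["ReadWriteOnce"],
        # Empty string: bind to a pre-provisioned PV without a class
        # instead of triggering dynamic provisioning.
        "storageClassName": "",
    },
}

UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as '1Gi' to bytes (binary suffixes only)."""
    return int(quantity[:-2]) * UNITS[quantity[-2:]]


def could_satisfy(pv_spec: dict, pvc_spec: dict) -> bool:
    """Toy check: the PV is big enough and supports every requested mode."""
    requested = pvc_spec["resources"]["requests"]["storage"]
    big_enough = to_bytes(pv_spec["capacity"]["storage"]) >= to_bytes(requested)
    modes_ok = set(pvc_spec["accessModes"]) <= set(pv_spec["accessModes"])
    return big_enough and modes_ok


# Spec of a hypothetical pre-provisioned volume.
example_pv_spec = {
    "capacity": {"storage": "5Gi"},
    "accessModes": ["ReadWriteOnce", "ReadOnlyMany"],
}

print(json.dumps(persistent_volume_claim, indent=2))
print(could_satisfy(example_pv_spec, persistent_volume_claim["spec"]))
```

Unlike a PV, a claim lives in a namespace, so it is created in the same namespace as the pods that will use it.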
The PersistentVolumeClaim can then be used as one of the volumes inside a pod. Other users cannot use the same PersistentVolume until it has been released by deleting the bound PersistentVolumeClaim.
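A pod refers to the claim by name only, which is where the decoupling shows up in practice. The sketch below assumes a claim called example-pvc already exists in the pod's namespace; the pod name, container image, and mount path are hypothetical as well.

```python
import json

# Hypothetical pod: the pod name, image, mount path, and claim name
# are placeholders chosen for illustration.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx",
                # Mount the volume named "data" into the container.
                "volumeMounts": [{"name": "data", "mountPath": "/var/data"}],
            }
        ],
        "volumes": [
            {
                "name": "data",
                # Only the claim is named here; nothing about NFS, iSCSI,
                # or any other storage technology appears in the pod.
                "persistentVolumeClaim": {"claimName": "example-pvc"},
            }
        ],
    },
}

print(json.dumps(pod, indent=2))
```

Because the pod names only the claim, the same manifest can be reused on a cluster with a completely different storage backend.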
The separation of PVs and PVCs enables a division of responsibilities in Kubernetes cluster administration and resource management.
PVs are objects that the cluster administrators create, and they abstract away the underlying storage resources to expose a unified view to the user (i.e. this is the "volume" you can use, with this much space). Administrators only care about exposing the storage resource to the cluster, not about who is going to use it or how. Quoting the documentation:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
The PVC, on the other hand, can be used by the cluster users (i.e. those who are deploying and maintaining applications) to dynamically request and release blocks of storage without worrying about the underlying infrastructure. They don't necessarily have to care where the storage comes from or how it is actually managed. Quoting the documentation:
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
In dynamically provisioned deployments, Kubernetes takes over the PV management role as well. Users simply create a PVC, and Kubernetes provisions a matching PV and binds the PVC to it, making the storage provisioning process transparent to the cluster users.
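With dynamic provisioning, the only object the user writes is the claim; it names a StorageClass, and the provisioner behind that class creates the PV. A minimal sketch follows, assuming a class called "standard" exists on the cluster. That name is an assumption, as is the claim name; kubectl get storageclass lists what a given cluster actually offers.

```python
import json

# Hypothetical claim for dynamic provisioning. The class name "standard"
# is an assumption; available classes differ from cluster to cluster.
dynamic_claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "example-dynamic-pvc"},
    "spec": {
        # Asks the provisioner behind this class to create a PV on demand.
        "storageClassName": "standard",
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

print(json.dumps(dynamic_claim, indent=2))
```

No PersistentVolume manifest is written by hand in this case; the PV appears once the claim has been processed.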
Hope this helps!