Cancel or undo deletion of Persistent Volumes in kubernetes cluster

7/30/2018

I accidentally tried to delete all PVs in the cluster, but thankfully they still have PVCs bound to them, so all the PVs are stuck in Status: Terminating.

How can I get the PVs out of the "Terminating" status and back to a healthy state where they are "Bound" to their PVCs and fully working?

The key here is that I don't want to lose any data, and I want to make sure the volumes are functional and not at risk of being terminated if the claim goes away.

Here are some details from a kubectl describe on the PV.

$ kubectl describe pv persistent-vol-1
Finalizers:      [kubernetes.io/pv-protection foregroundDeletion]
Status:          Terminating (lasts 1h)
Claim:           ns/application
Reclaim Policy:  Delete

Here is the describe on the claim.

$ kubectl describe pvc application
Name:          application
Namespace:     ns
StorageClass:  standard
Status:        Bound
Volume:        persistent-vol-1
-- Hobbit-42
kubernetes
persistent-volumes

4 Answers

7/30/2018

Unfortunately, you can't save your PVs and data in this case. All you can do is recreate the PV with Reclaim Policy: Retain, which will prevent data loss in the future. You can read more about reclaim policies here and here.

What happens if I delete a PersistentVolumeClaim (PVC)? If the volume was dynamically provisioned, then the default reclaim policy is set to “delete”. This means that, by default, when the PVC is deleted, the underlying PV and storage asset will also be deleted. If you want to retain the data stored on the volume, then you must change the reclaim policy from “delete” to “retain” after the PV is provisioned.
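As a rough illustration of the policy change described in the quote, the sketch below patches one PV to Retain and prints the resulting policy. The function name retain_pv is made up for this example, and it assumes kubectl is already pointed at the right cluster.

```shell
# Hypothetical helper: switch one PV's reclaim policy to Retain,
# then print the policy the API server now reports.
# Usage: retain_pv <pv-name>
retain_pv() {
  pv_name="$1"
  # POSIX-shell quoting; on Windows cmd the JSON needs escaped double quotes.
  kubectl patch pv "$pv_name" \
    -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' || return 1
  kubectl get pv "$pv_name" \
    -o jsonpath='{.spec.persistentVolumeReclaimPolicy}{"\n"}'
}
```

For the PV in the question this would be called as: retain_pv persistent-vol-1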

-- VKR
Source: StackOverflow

9/14/2018

It is, in fact, possible to save data from your PersistentVolume with Status: Terminating and the reclaim policy set to the default (Delete). We have done so on GKE; I'm not sure about AWS or Azure, but I guess they are similar.

We had the same problem and I will post our solution here in case somebody else has an issue like this.

Your PersistentVolumes will not be terminated as long as there is a pod, a deployment or, to be more specific, a PersistentVolumeClaim using them.

The steps we took to remedy our broken state:

Once you are in a situation like the OP's, the first thing you want to do is create a snapshot of your PersistentVolumes.

In GKE console, go to Compute Engine -> Disks and find your volume there (use kubectl get pv | grep pvc-name) and create a snapshot of your volume.

Use the snapshot to create a disk: gcloud compute disks create name-of-disk --size=10 --source-snapshot=name-of-snapshot --type=pd-standard --zone=your-zone
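The snapshot step can also be done from the command line instead of the console. The sketch below is one possible way, not necessarily what was done here: it assumes the PV is an in-tree gcePersistentDisk volume (as in the manifest further down), and the function names pv_disk_name and snapshot_pv are invented for illustration.

```shell
# Hypothetical helper: print the GCE disk name backing a PV.
# Only works for in-tree gcePersistentDisk volumes (assumption).
pv_disk_name() {
  kubectl get pv "$1" -o jsonpath='{.spec.gcePersistentDisk.pdName}'
}

# Hypothetical helper: snapshot the disk behind a PV.
# Usage: snapshot_pv <pv-name> <snapshot-name> <zone>
snapshot_pv() {
  disk="$(pv_disk_name "$1")" || return 1
  if [ -z "$disk" ]; then
    echo "no gcePersistentDisk found on PV $1" >&2
    return 1
  fi
  gcloud compute disks snapshot "$disk" --snapshot-names="$2" --zone="$3"
}
```

For example: snapshot_pv name-of-pv name-of-snapshot your-zone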

At this point, stop the services using the volume and delete the volume and volume claim.

Recreate the volume manually with the data from the disk:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: name-of-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  gcePersistentDisk:
    fsType: ext4
    pdName: name-of-disk
  persistentVolumeReclaimPolicy: Retain

Now just update your volume claim to target the specific volume via volumeName, the last line of the YAML file:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-namespace
  labels:
    app: my-app
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: name-of-pv
-- Urosh T.
Source: StackOverflow

1/14/2019

Do not attempt this if you don't know what you're doing

There is another, fairly hacky way of undeleting PVs: directly editing the objects in etcd. Note that the following steps work only if you have control over etcd, which may not be true on certain cloud providers or managed offerings. Also note that you can easily make things much worse, since objects in etcd were never meant to be edited directly, so please approach this with caution.

We had a situation wherein our PVs had a reclaim policy of Delete and I accidentally ran a command deleting a majority of them, on k8s 1.11. Thanks to storage-object-in-use protection, they did not immediately disappear, but they hung around in a dangerous state. Any deletion or restarts of the pods that were binding the PVCs would have caused the kubernetes.io/pvc-protection finalizer to get removed and thereby triggered deletion of the underlying volume (in our case, EBS). New finalizers also cannot be added when the resource is in terminating state. From a k8s design standpoint, this is necessary in order to prevent race conditions.

Below are the steps I followed:

  • Back up the storage volumes you care about. This is just to cover yourself against possible deletion - AWS, GCP, Azure all provide mechanisms to do this and create a new snapshot.
  • Access etcd directly - if it's running as a static pod, you can ssh into it and check the http serving port. In our setup this was 4001, the legacy etcd client port; newer etcd versions listen for clients on 2379 by default. If you're running multiple etcd nodes, use any one.
  • Port-forward 4001 to your machine from the pod.
kubectl -n=kube-system port-forward etcd-server-ip-x.y.z.w-compute.internal 4001:4001 
  • Use the REST API, or a tool like etcdkeeper to connect to the cluster.

  • Navigate to /registry/persistentvolumes/ and find the corresponding PVs. Kubernetes marks a resource for deletion by setting the metadata.deletionTimestamp field on the object. Delete this field in order to have the controllers stop trying to delete the PV. This will revert them to the Bound state, which is probably where they were before you ran a delete.

  • You can also carefully edit the persistentVolumeReclaimPolicy to Retain and then save the objects back to etcd. The controllers will re-read the state soon, and you should shortly see it reflected in the kubectl get pv output as well.

Your PVs should go back to the old undeleted state:

$ kubectl get pv

NAME           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                        STORAGECLASS   REASON   AGE
pvc-b5adexxx   5Gi        RWO            Retain           Bound    zookeeper/datadir-zoo-0      gp2                     287d
pvc-b5ae9xxx   5Gi        RWO            Retain           Bound    zookeeper/datalogdir-zoo-0   gp2                     287d
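To double-check that no PV is still marked for deletion, something like the following sketch may help. It prints each PV's name, phase and deletion timestamp; the third column should be empty for a PV that is no longer being deleted. The function name pv_deletion_status is an invented label, not part of kubectl.

```shell
# Hypothetical helper: list every PV with its phase and its
# metadata.deletionTimestamp (blank when no deletion is pending).
pv_deletion_status() {
  kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\t"}{.metadata.deletionTimestamp}{"\n"}{end}'
}
```

Run it as pv_deletion_status before and after the change to compare.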

As a general best practice, it is best to use RBAC and the right persistent volume reclaim policy to prevent accidental deletion of PVs or the underlying storage.

-- Anirudh Ramanathan
Source: StackOverflow

1/5/2020

I found myself in this same situation due to a careless mistake. It was with a statefulset on Google Cloud/GKE. My PVC said Terminating because the pod referencing it was still running, and the PV was configured with a reclaim policy of Delete. I ended up finding a simpler method to get everything straightened out that also preserved all of the extra Google/Kubernetes metadata and names.

First, I would make a snapshot of your disk as suggested by another answer. You won't need it, but if something goes wrong, the other answer here can then be used to re-create a disk from it.

The short version is that you just need to reconfigure the PV to "Retain", allow the PVC to get deleted, then remove the previous claim from the PV. A new PVC can then be bound to it and all is well.

Details:

  1. Find the full name of the PV:
    kubectl get pv
  2. Reconfigure your PV to set the reclaim policy to "Retain" (I'm doing this on Windows, so you may need to handle the quotes differently depending on your OS):
    kubectl patch pv <your-pv-name-goes-here> -p "{\"spec\":{\"persistentVolumeReclaimPolicy\":\"Retain\"}}"
  3. Verify that the reclaim policy of the PV is now Retain.
  4. Shut down your pod/statefulset (and don't allow it to restart). Once that's finished, your PVC will get removed and the PV (and the disk it references!) will be left intact.
  5. Edit the PV:
    kubectl edit pv <your-pv-name-goes-here>
  6. In the editor, remove the entire "claimRef" section. Remove all of the lines from (and including) "claimRef:" until the next tag with the same indentation level. The lines to remove should look more or less like this:
      claimRef:
        apiVersion: v1
        kind: PersistentVolumeClaim
        name: my-app-pvc-my-app-0
        namespace: default
        resourceVersion: "1234567"
        uid: 12345678-1234-1234-1234-1234567890ab
  7. Save the changes and close the editor. Check the status of the PV and it should now show "Available".
  8. Now you can re-create your PVC exactly as you originally did. That should then find the now "Available" PV and bind itself to it. In my case, I have the PVC defined with my statefulset as a volumeClaimTemplate, so all I had to do was "kubectl apply" my statefulset.
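The claimRef removal described above can probably also be done without opening an editor, using a JSON patch. This is a sketch of that alternative, not the method used in this answer; release_pv is a made-up function name, and the PV may take a moment to report "Available" after the patch.

```shell
# Hypothetical helper: drop spec.claimRef from a PV with a JSON patch,
# then print the PV's current phase.
# Usage: release_pv <pv-name>
release_pv() {
  kubectl patch pv "$1" --type=json \
    -p '[{"op":"remove","path":"/spec/claimRef"}]' || return 1
  kubectl get pv "$1" -o jsonpath='{.status.phase}{"\n"}'
}
```

Only run this after the reclaim policy is Retain and the old PVC is gone, for example: release_pv <your-pv-name-goes-here>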
-- John
Source: StackOverflow