return persistent volume (pv) capacity as an integer instead of Gi, Mi, Ki, G, M, K etc

3/22/2021

I would like to calculate the total number of bytes allocated by the persistent volumes (PVs) in a cluster. Using the following:

$ kubectl get pv -A -o json

I can get a JSON list of all the cluster's PVs. For each PV in the items[] list, the spec.capacity.storage key holds the capacity. See the example below:

{
  "apiVersion": "v1",
  "kind": "PersistentVolume",
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "capacity": {
      "storage": "500Gi"
    },
    "claimRef": {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "name": "s3-storage-minio",
      "namespace": "default",
      "resourceVersion": "515932"
    },
    "persistentVolumeReclaimPolicy": "Delete",
    "volumeMode": "Filesystem"
  },
  "status": {
    "phase": "Bound"
  }
},

However, the returned values can use different suffixes: storage is expressed as a plain integer or as a fixed-point number using one of the suffixes E, P, T, G, M, k, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.

Is there a neat way to request the capacity as an integer (or in any other format, as long as it is consistent across all the PVs) using kubectl?

Otherwise, converting the different suffixes to a common one in Bash does not look very straightforward.

Thanks in advance for your help.

-- fantoman
kubernetes
kubernetes-pvc

1 Answer

3/24/2021

I haven't found a way to transform the value in .spec.capacity.storage using kubectl alone.

I've written some code using Python and its Kubernetes client library to extract the data and calculate the total size of all PVs. Please treat this code as an example, not as production-ready:

import re

from kubernetes import client, config

config.load_kube_config()  # use ~/.kube/config
v1 = client.CoreV1Api()

# Decimal (SI) and binary suffixes accepted by Kubernetes; kilo is a lowercase "k"
multiplier_dict = {
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4, "P": 1000**5, "E": 1000**6,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4, "Pi": 1024**5, "Ei": 1024**6,
}
size = 0

# for i in v1.list_persistent_volume_claim_for_all_namespaces(watch=False).items: # PVC
for i in v1.list_persistent_volume(watch=False).items:  # PV
    x = i.spec.capacity["storage"]  # PV
    # x = i.spec.resources.requests["storage"] # PVC
    y = re.findall(r'[A-Za-z]+|\d+', x)  # "500Mi" -> ['500', 'Mi']
    print(y)

    if len(y) == 1 and y[0].isdigit():
        # plain integer, no suffix: the value is already in bytes
        size += int(y[0])
    elif len(y) == 2 and y[0].isdigit() and y[1] in multiplier_dict:
        size += multiplier_dict[y[1]] * int(y[0])
    else:
        # fractional values (e.g. "1.5Gi") and other forms are not handled here
        raise ValueError("Unsupported quantity: " + x)

print("The size in bytes of all PV's is: " + str(size))

As an example, take a cluster that has the following PVs:

  • $ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-6b5236ec-547f-4f96-8448-e3dbe01c9039   500Mi      RWO            Delete           Bound    default/pvc-four    hostpath                4m13s
pvc-86d178bc-1673-44e0-9a89-2efb14a1d22c   512M       RWO            Delete           Bound    default/pvc-three   hostpath                4m15s
pvc-89b64f93-6bf4-4987-bdda-0356d19d6f59   1G         RWO            Delete           Bound    default/pvc-one     hostpath                4m15s
pvc-a3455e77-0db0-4cab-99c9-c72721a65632   10Ki       RWO            Delete           Bound    default/pvc-six     hostpath                4m14s
pvc-b47f92ef-f627-4391-943f-efa4241d0811   10k        RWO            Delete           Bound    default/pvc-five    hostpath                4m13s
pvc-c3e13d78-9047-4899-99e7-0b2667ce4698   1Gi        RWO            Delete           Bound    default/pvc-two     hostpath                4m15s
pvc-c57fe2b0-013a-412b-bca9-05050990766a   10         RWO            Delete           Bound    default/pvc-seven   hostpath                113s

The code would produce the following output:

['500', 'Mi']
['512', 'M']
['1', 'G']
['10', 'Ki']
['10', 'k']
['1', 'Gi']
['10']
The size in bytes of all PV's is: 3110050074
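
If installing the Kubernetes client library is not an option, the same conversion can be sketched with the Python standard library alone. This is a minimal sketch, not a complete quantity parser: the names to_bytes, total_pv_bytes and sample are made up for this illustration, the input is assumed to be shaped like the output of kubectl get pv -o json, and exponent notation (such as 1e3) and the milli suffix m are not handled. It does accept decimal values such as 1.5Gi. The sample data reuses the seven capacities from the example cluster:

```python
import json
import re
from decimal import Decimal

# Kubernetes quantity suffixes: decimal (SI) and binary (power-of-two).
MULTIPLIERS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}

# A number (optionally with a decimal part) followed by an optional suffix.
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def to_bytes(quantity: str) -> int:
    """Convert a quantity string such as "500Mi" or "1.5G" to bytes."""
    match = QUANTITY_RE.match(quantity.strip())
    if match is None or match.group(2) not in MULTIPLIERS:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    # Decimal avoids float rounding; int() drops any fractional byte.
    return int(Decimal(number) * MULTIPLIERS[suffix])


def total_pv_bytes(pv_list: dict) -> int:
    """Sum spec.capacity.storage over the items of a PV list."""
    return sum(to_bytes(pv["spec"]["capacity"]["storage"]) for pv in pv_list["items"])


# Stand-in for the JSON printed by kubectl, trimmed to the fields used here.
capacities = ["500Mi", "512M", "1G", "10Ki", "10k", "1Gi", "10"]
sample = json.loads(json.dumps(
    {"items": [{"spec": {"capacity": {"storage": c}}} for c in capacities]}
))

print(total_pv_bytes(sample))  # 3110050074
```

In practice the sample would be replaced by real data, for example by piping the output of kubectl get pv -o json into the script and reading it with json.load(sys.stdin).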

In addition, remember that the size requested by a PVC can differ from the actual size of the PV bound to it. Please refer to the documentation of your storage solution for details.

  • pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M

Part of the $ kubectl get pvc -o yaml output:

  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 100M # <-- REQUEST
    <-- REDACTED --> 
  status:
    accessModes:
    - ReadWriteOnce
    capacity:
      storage: 1Gi # <-- SIZE OF PV
    phase: Bound
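
To put numbers on the example above, the following short sketch converts both values to bytes by hand. The variable names are made up for this illustration; the only facts used are that M means 10^6 and Gi means 2^30:

```python
# Suffix multipliers needed for this example only.
DECIMAL_MEGA = 10**6  # "M"
BINARY_GIBI = 2**30   # "Gi"

requested_bytes = 100 * DECIMAL_MEGA  # spec.resources.requests.storage: 100M
provisioned_bytes = 1 * BINARY_GIBI   # status.capacity.storage: 1Gi

print(requested_bytes)                      # 100000000
print(provisioned_bytes)                    # 1073741824
print(provisioned_bytes - requested_bytes)  # 973741824
```

In a case like this one, summing the PVC requests instead of the PV capacities would give a much smaller total, so it matters which of the two fields the calculation reads.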

-- Dawid Kruk
Source: StackOverflow