I would like to calculate the total number of bytes allocated by the persistent volumes (PVs) in a cluster. Using the following command:
$ kubectl get pv -A -o json
I can get a JSON list of all the cluster's PVs, and for each PV in the items[] list I can read the spec.capacity.storage key to get the necessary information.
See the example below:
{
  "apiVersion": "v1",
  "kind": "PersistentVolume",
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "capacity": {
      "storage": "500Gi"
    },
    "claimRef": {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "name": "s3-storage-minio",
      "namespace": "default",
      "resourceVersion": "515932"
    },
    "persistentVolumeReclaimPolicy": "Delete",
    "volumeMode": "Filesystem"
  },
  "status": {
    "phase": "Bound"
  }
}
However, the returned values can use different suffixes: storage is expressed as a plain integer or as a fixed-point number using one of the suffixes E, P, T, G, M, k, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.
Is there a neat way to request the capacity as an integer (or in any other format, as long as it is consistent across all the PVs) using kubectl?
Otherwise, converting the different suffixes to a common one in Bash does not look very straightforward.
Thanks in advance for your help.
I haven't found a way to transform a value in .spec.capacity.storage using purely kubectl.
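One workaround that stays close to kubectl is to post-process its JSON output. The following is only a minimal sketch, not a complete quantity parser: it assumes values consist of a plain or fractional number followed by one of the suffixes listed in the question, and it rejects anything else (exponent notation, for example). The names quantity_to_bytes and total_pv_bytes are made up for this illustration.

```python
import json
import re
from decimal import Decimal

# Multipliers for the suffixes listed in the question (decimal and binary).
MULTIPLIERS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}

# A number (optionally fractional) followed by an optional alphabetic suffix.
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def quantity_to_bytes(quantity):
    """Convert a string such as '500Gi', '512M' or '10' to an integer byte count."""
    match = QUANTITY_RE.match(quantity.strip())
    if match is None or match.group(2) not in MULTIPLIERS:
        # Assumption: unknown forms are rejected rather than silently skipped.
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    # Decimal avoids float rounding; int() truncates any fractional byte.
    return int(Decimal(number) * MULTIPLIERS[suffix])


def total_pv_bytes(pv_list_json):
    """Sum spec.capacity.storage over the items of `kubectl get pv -o json` output."""
    items = json.loads(pv_list_json)["items"]
    return sum(quantity_to_bytes(pv["spec"]["capacity"]["storage"]) for pv in items)
```

Saved as a hypothetical pv_total.py that ends with print(total_pv_bytes(sys.stdin.read())) (plus import sys), this could be fed with kubectl get pv -o json | python3 pv_total.py.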
I've managed to write some code using Python and its Kubernetes library to extract the data and calculate the total size of all PVs. Please treat this code as an example, not as production-ready:
from kubernetes import client, config
import re

config.load_kube_config()  # use ~/.kube/config
v1 = client.CoreV1Api()

# Multipliers for the decimal and binary quantity suffixes
multiplier_dict = {
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4, "P": 1000**5, "E": 1000**6,
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4, "Pi": 1024**5, "Ei": 1024**6,
}

size = 0
# for i in v1.list_persistent_volume_claim_for_all_namespaces(watch=False).items:  # PVC
for i in v1.list_persistent_volume(watch=False).items:  # PV
    x = i.spec.capacity["storage"]  # PV
    # x = i.spec.resources.requests["storage"]  # PVC
    # Split e.g. "500Mi" into ['500', 'Mi']; fractional values such as "1.5Gi" are not handled
    y = re.findall(r'[A-Za-z]+|\d+', x)
    print(y)
    # try is used in case no suffix (like Mi) is present
    try:
        if y[1] in multiplier_dict:
            size += multiplier_dict.get(y[1]) * int(y[0])
        else:
            # Fail loudly instead of silently skipping an unrecognized value
            raise ValueError("Unsupported quantity: " + x)
    except IndexError:
        size += int(y[0])

print("The size in bytes of all PV's is: " + str(size))

Take as an example a cluster that has the following PVs:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-6b5236ec-547f-4f96-8448-e3dbe01c9039   500Mi      RWO            Delete           Bound    default/pvc-four    hostpath                4m13s
pvc-86d178bc-1673-44e0-9a89-2efb14a1d22c   512M       RWO            Delete           Bound    default/pvc-three   hostpath                4m15s
pvc-89b64f93-6bf4-4987-bdda-0356d19d6f59   1G         RWO            Delete           Bound    default/pvc-one     hostpath                4m15s
pvc-a3455e77-0db0-4cab-99c9-c72721a65632   10Ki       RWO            Delete           Bound    default/pvc-six     hostpath                4m14s
pvc-b47f92ef-f627-4391-943f-efa4241d0811   10k        RWO            Delete           Bound    default/pvc-five    hostpath                4m13s
pvc-c3e13d78-9047-4899-99e7-0b2667ce4698   1Gi        RWO            Delete           Bound    default/pvc-two     hostpath                4m15s
pvc-c57fe2b0-013a-412b-bca9-05050990766a   10         RWO            Delete           Bound    default/pvc-seven   hostpath                113s
The code would produce the following output:
['500', 'Mi']
['512', 'M']
['1', 'G']
['10', 'Ki']
['10', 'k']
['1', 'Gi']
['10']
The size in bytes of all PV's is: 3110050074

In addition, remember that the size requested by a PVC and the actual size of the PV can differ. Please refer to the documentation of the storage solution you use in that regard.
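To spot such differences, the request (spec.resources.requests.storage) and the bound capacity (status.capacity.storage) of each PVC can be read side by side. The sketch below assumes its input is the text printed by kubectl get pvc -A -o json; the function name pvc_request_vs_capacity is hypothetical, and the values are returned as the raw strings, without converting them to bytes.

```python
import json


def pvc_request_vs_capacity(pvc_list_json):
    """Return (namespace/name, requested, actual) tuples for each PVC in the list."""
    rows = []
    for pvc in json.loads(pvc_list_json)["items"]:
        meta = pvc.get("metadata", {})
        name = f'{meta.get("namespace", "")}/{meta.get("name", "")}'
        requested = pvc["spec"]["resources"]["requests"]["storage"]
        # status.capacity may be missing (for example while a claim is pending),
        # in which case the actual size is reported as None.
        actual = pvc.get("status", {}).get("capacity", {}).get("storage")
        rows.append((name, requested, actual))
    return rows
```

Each returned tuple can then be printed or compared; a request of 100M next to a capacity of 1Gi, for instance, shows that the provisioned volume is larger than what was asked for.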
pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M

Part of the $ kubectl get pvc -o yaml output:
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100M # <-- REQUEST
<-- REDACTED -->
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi # <-- SIZE OF PV
  phase: Bound

Additional resources: