I would like to calculate the total number of bytes allocated by the persistent volumes (PVs) in a cluster. Using the following:
$ kubectl get pv -A -o json
I can get a JSON list of all the cluster's PVs, and for each PV in the items[] list one can read the spec.capacity.storage key to access the necessary information.
See the example below:
{
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "spec": {
        "accessModes": [
            "ReadWriteOnce"
        ],
        "capacity": {
            "storage": "500Gi"
        },
        "claimRef": {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "name": "s3-storage-minio",
            "namespace": "default",
            "resourceVersion": "515932"
        },
        "persistentVolumeReclaimPolicy": "Delete",
        "volumeMode": "Filesystem"
    },
    "status": {
        "phase": "Bound"
    }
},
However, the returned values can use different suffixes: storage is expressed as a plain integer or as a fixed-point number using one of the suffixes E, P, T, G, M, k, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.
Is there a neat way to request the capacity as an integer (or in any other format, as long as it is consistent across all the PVs) using kubectl?
Otherwise, converting the different suffixes to a common one in Bash does not look very straightforward.
Thanks in advance for your help.
I haven't found a way to transform the value in .spec.capacity.storage using purely kubectl.
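Since kubectl hands back the quantity strings unchanged, any conversion has to happen on the client side. As a first step, here is a minimal sketch of pulling the raw spec.capacity.storage strings out of the JSON list with nothing but the standard library. The function name capacities_from_pv_json and the trimmed sample document are made up for illustration; the sample only keeps the keys that the function reads.

```python
import json


def capacities_from_pv_json(text):
    """Return the raw spec.capacity.storage string of every PV in a JSON list."""
    data = json.loads(text)
    return [item["spec"]["capacity"]["storage"] for item in data.get("items", [])]


# Trimmed-down stand-in for the output of `kubectl get pv -o json` (assumption:
# real output has many more keys, but the path to the capacity is the same).
sample = """
{"apiVersion": "v1", "kind": "List", "items": [
  {"kind": "PersistentVolume", "spec": {"capacity": {"storage": "500Gi"}}},
  {"kind": "PersistentVolume", "spec": {"capacity": {"storage": "512M"}}}
]}
"""

print(capacities_from_pv_json(sample))  # ['500Gi', '512M']
```

In practice the text could come from piping kubectl into the script and reading sys.stdin, instead of the hard-coded sample. The strings still carry their suffixes at this point, so they cannot be summed directly.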
I've managed to write some code with Python and its Kubernetes library to extract the data and calculate the total size of all used PVs. Please treat this code as an example and not as production ready:
from kubernetes import client, config
import re

config.load_kube_config()  # use ~/.kube/config
v1 = client.CoreV1Api()

# bytes per unit for each suffix
multiplier_dict = {"k": 1000, "Ki": 1024, "M": 1000000, "Mi": 1048576, "G": 1000000000, "Gi": 1073741824}  # and so on ...

size = 0

# for i in v1.list_persistent_volume_claim_for_all_namespaces(watch=False).items:  # PVC
for i in v1.list_persistent_volume(watch=False).items:  # PV
    x = i.spec.capacity["storage"]  # PV
    # x = i.spec.resources.requests["storage"]  # PVC
    y = re.findall(r'[A-Za-z]+|\d+', x)  # e.g. "500Mi" -> ['500', 'Mi']
    print(y)

    # try/except covers values without a suffix (plain bytes), where y[1] does not exist
    try:
        if y[1] in multiplier_dict:
            size += multiplier_dict.get(y[1]) * int(y[0])
    except IndexError:
        size += int(y[0])

print("The size in bytes of all PV's is: " + str(size))
Taking as an example a cluster that has the following PVs:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-6b5236ec-547f-4f96-8448-e3dbe01c9039   500Mi      RWO            Delete           Bound    default/pvc-four    hostpath                4m13s
pvc-86d178bc-1673-44e0-9a89-2efb14a1d22c   512M       RWO            Delete           Bound    default/pvc-three   hostpath                4m15s
pvc-89b64f93-6bf4-4987-bdda-0356d19d6f59   1G         RWO            Delete           Bound    default/pvc-one     hostpath                4m15s
pvc-a3455e77-0db0-4cab-99c9-c72721a65632   10Ki       RWO            Delete           Bound    default/pvc-six     hostpath                4m14s
pvc-b47f92ef-f627-4391-943f-efa4241d0811   10k        RWO            Delete           Bound    default/pvc-five    hostpath                4m13s
pvc-c3e13d78-9047-4899-99e7-0b2667ce4698   1Gi        RWO            Delete           Bound    default/pvc-two     hostpath                4m15s
pvc-c57fe2b0-013a-412b-bca9-05050990766a   10         RWO            Delete           Bound    default/pvc-seven   hostpath                113s
The code would produce the output of:
['500', 'Mi']
['512', 'M']
['1', 'G']
['10', 'Ki']
['10', 'k']
['1', 'Gi']
['10']
The size in bytes of all PV's is: 3110050074
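The regex split used above only copes with whole numbers and the six suffixes listed in the dictionary; a value such as 1.5Gi would be split into ['1', '5', 'Gi'] and silently skipped. Below is a rough sketch of a stricter converter that covers all the decimal and power-of-two suffixes and fractional values, and raises on anything it does not recognise. The names MULTIPLIERS, QUANTITY_RE and quantity_to_bytes are my own, and the sketch deliberately ignores other quantity forms (for example the milli suffix m or exponent notation), which I assume do not show up in PV capacities.

```python
from decimal import Decimal
import re

# Bytes per unit for the decimal (SI) and power-of-two suffixes.
MULTIPLIERS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}

# A number (optionally fractional) followed by an optional suffix.
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?)([A-Za-z]*)$")


def quantity_to_bytes(quantity):
    """Convert a storage quantity string such as '500Mi' to a number of bytes."""
    match = QUANTITY_RE.match(quantity.strip())
    if match is None or match.group(2) not in MULTIPLIERS:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    # Decimal avoids float rounding for fractional values like 1.5Gi.
    return int(Decimal(number) * MULTIPLIERS[suffix])


# Same capacities as in the example cluster above.
capacities = ["500Mi", "512M", "1G", "10Ki", "10k", "1Gi", "10"]
total = sum(quantity_to_bytes(c) for c in capacities)
print(total)  # 3110050074
```

Recent releases of the Python client also seem to ship a kubernetes.utils.parse_quantity helper; if your installed version has it, it is probably a better choice than a hand-rolled converter.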
In addition to the above, remember that the size requested by a PVC and the actual size of the PV can differ. Please refer to the documentation of your storage solution in that regard.
pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
Part of the $ kubectl get pvc -o yaml output:
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100M # <-- REQUEST
<-- REDACTED -->
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi # <-- SIZE OF PV
  phase: Bound
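To spot such differences programmatically, the two fields can be read side by side. This is only a small sketch working on a plain dictionary shaped like the PVC output above (for example, one element of the items list from kubectl get pvc -o json); the function name request_vs_capacity is hypothetical, and I am assuming that status.capacity may be missing for claims that are not bound yet.

```python
def request_vs_capacity(pvc):
    """Return the (requested, provisioned) storage strings of one PVC object."""
    requested = pvc["spec"]["resources"]["requests"]["storage"]
    # status.capacity may be missing (assumed for unbound claims), so fall back to None.
    provisioned = pvc.get("status", {}).get("capacity", {}).get("storage")
    return requested, provisioned


# Same values as in the YAML output above.
pvc = {
    "spec": {"resources": {"requests": {"storage": "100M"}}},
    "status": {"capacity": {"storage": "1Gi"}, "phase": "Bound"},
}

requested, provisioned = request_vs_capacity(pvc)
print(f"requested={requested} provisioned={provisioned}")
```

Both values are still quantity strings, so they need to be converted to bytes before comparing them numerically; comparing the raw strings would also flag equal sizes written with different suffixes.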