Grafana pod is in Init:Error state after adding an existing PVC

5/22/2021

I installed Grafana using its Helm chart; the deployment went well and the Grafana UI came up. I then needed to attach an existing persistent volume claim, so I ran the command below:

helm install grafana grafana/grafana -n prometheus --set persistence.enabled=true --set persistence.existingClaim=grafana-pvc

The init container crashes with the logs below:

kubectl logs grafana-847b88556f-gjr8b -n prometheus -c init-chown-data                    
chown: /var/lib/grafana: Operation not permitted
chown: /var/lib/grafana: Operation not permitted

Checking the deployment YAML, I found this section:

initContainers:
- command:                  # recursively chowns the data dir to grafana's UID:GID
  - chown
  - -R
  - 472:472
  - /var/lib/grafana
  image: busybox:1.31.1
  imagePullPolicy: IfNotPresent
  name: init-chown-data
  resources: {}
  securityContext:
    runAsNonRoot: false
    runAsUser: 0            # runs as root, yet the chown above still fails
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  volumeMounts:
  - mountPath: /var/lib/grafana
    name: storage
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
  fsGroup: 472
  runAsGroup: 472
  runAsUser: 472
serviceAccount: grafana
serviceAccountName: grafana

Why is the operation failing even though the init container runs with runAsUser: 0? The PVC has accessModes: ReadWriteMany. Is there a workaround, or am I missing something?

Thanks !!

-- Sanjay M. P.
grafana
kubernetes

2 Answers

5/28/2021

Sometimes you may want to avoid changing your storage provider's settings, or changing them may not be possible at all.

In my case, the error described in the question occurred while deploying the kube-prometheus-stack Helm chart. I wasn't able to change the storage provider's settings, so I read through the chart's example values, where I noticed the following section:

initChownData:
  ## If false, data ownership will not be reset at startup
  ## This allows the prometheus-server to be run with an arbitrary user
  ##
  enabled: true

I changed enabled to false, and after helm upgrade ... the pod initialized successfully, with storage working as expected. This appears to be a somewhat safer solution than changing the storage provider's security policy, and it certainly requires less effort.
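For the standalone Grafana chart used in the question, the equivalent override can be passed on the command line. A minimal sketch, reusing the release name, namespace, and claim name from the question (verify the initChownData key against your chart version's values):

helm upgrade grafana grafana/grafana -n prometheus \
  --set persistence.enabled=true \
  --set persistence.existingClaim=grafana-pvc \
  --set initChownData.enabled=false

With initChownData.enabled=false the init-chown-data container is not rendered at all, so the pod skips the chown and starts Grafana directly as UID 472.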

-- MichaƂ Grabowski
Source: StackOverflow

5/23/2021

NFS turns on root_squash mode by default, which functionally disables uid 0 on clients as a superuser (it maps those requests to some other UID/GID, usually 65534). You can disable this with the no_root_squash option in the server's exports, or use something other than NFS. I would recommend the latter; NFS is bad.
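For completeness, a minimal sketch of what that looks like in the server's /etc/exports, with a placeholder export path and client subnet:

# /etc/exports on the NFS server; path and subnet are placeholders
/srv/nfs/grafana  10.0.0.0/24(rw,sync,no_root_squash)

After editing, re-export with exportfs -ra. Keep in mind that no_root_squash lets any root user on those clients act as root on the export, which is exactly why squashing is on by default.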

-- coderanger
Source: StackOverflow