Updating a deployment that uses a ReadWriteOnce volume will fail on mount

1/23/2019

My deployment is using a couple of volumes, all defined as ReadWriteOnce.

When applying the deployment to a clean cluster, the pod is created successfully.

However, if I update my deployment (e.g. update the container image), the new pod created for my deployment always fails on volume mount:

/Mugen$ kubectl get pods
NAME                            READY     STATUS              RESTARTS   AGE
my-app-556c8d646b-4s2kg         5/5       Running             1          2d
my-app-6dbbd99cc4-h442r         0/5       ContainerCreating   0          39m

/Mugen$ kubectl describe pod my-app-6dbbd99cc4-h442r
      Type     Reason                  Age                 From                                             Message
      ----     ------                  ----                ----                                             -------
      Normal   Scheduled               9m                  default-scheduler                                Successfully assigned my-app-6dbbd99cc4-h442r to gke-my-test-default-pool-671c9db5-k71l
      Warning  FailedAttachVolume      9m                  attachdetach-controller                          Multi-Attach error for volume "pvc-b57e8a7f-1ca9-11e9-ae03-42010a8400a8" Volume is already used by pod(s) my-app-556c8d646b-4s2kg
      Normal   SuccessfulMountVolume   9m                  kubelet, gke-my-test-default-pool-671c9db5-k71l  MountVolume.SetUp succeeded for volume "default-token-ksrbf"
      Normal   SuccessfulAttachVolume  9m                  attachdetach-controller                          AttachVolume.Attach succeeded for volume "pvc-2cc1955a-1cb2-11e9-ae03-42010a8400a8"
      Normal   SuccessfulAttachVolume  9m                  attachdetach-controller                          AttachVolume.Attach succeeded for volume "pvc-2c8dae3e-1cb2-11e9-ae03-42010a8400a8"
      Normal   SuccessfulMountVolume   9m                  kubelet, gke-my-test-default-pool-671c9db5-k71l  MountVolume.SetUp succeeded for volume "pvc-2cc1955a-1cb2-11e9-ae03-42010a8400a8"
      Normal   SuccessfulMountVolume   9m                  kubelet, gke-my-test-default-pool-671c9db5-k71l  MountVolume.SetUp succeeded for volume "pvc-2c8dae3e-1cb2-11e9-ae03-42010a8400a8"
      Warning  FailedMount             52s (x4 over 7m)    kubelet, gke-my-test-default-pool-671c9db5-k71l  Unable to mount volumes for pod "my-app-6dbbd99cc4-h442r_default(affe75e0-1edd-11e9-bb45-42010a840094)": timeout expired waiting for volumes to attach or mount for pod "default"/"my-app-6dbbd99cc4-h442r". list of unmounted volumes=[...]. list of unattached volumes=[...]

What is the best strategy to apply changes to such a deployment, then? Will there have to be some service outage in order to keep using the same persistent volumes? (I wouldn't want to create new volumes - the data should be preserved.)

-- Mugen
google-kubernetes-engine
kubernetes
kubernetes-deployment
kubernetes-pvc

3 Answers

6/10/2019

I ended up with a better solution, where all my client pods are only readers of the content and an independent CI process writes the content. I do the following:

  • From CI: write the content to a Google Cloud Storage bucket: gs://my-storage, then restart all frontend pods
  • In the deployment definition: sync (download) the entire bucket to the pod's volatile storage and serve it from the file system, for the best performance.

How to achieve that: on the frontend Docker image, I added the gcloud installation block from https://github.com/GoogleCloudPlatform/cloud-sdk-docker/blob/master/debian_slim/Dockerfile :

ARG CLOUD_SDK_VERSION=249.0.0
ENV CLOUD_SDK_VERSION=$CLOUD_SDK_VERSION
ARG INSTALL_COMPONENTS
ENV PATH "$PATH:/opt/google-cloud-sdk/bin/"
RUN apt-get update -qqy && apt-get install -qqy \
        curl \
        gcc \
        python-dev \
        python-setuptools \
        apt-transport-https \
        lsb-release \
        openssh-client \
        git \
        gnupg \
    && easy_install -U pip && \
    pip install -U crcmod && \
    export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
    echo "deb https://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" > /etc/apt/sources.list.d/google-cloud-sdk.list && \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
    apt-get update && apt-get install -y google-cloud-sdk=${CLOUD_SDK_VERSION}-0 $INSTALL_COMPONENTS && \
    gcloud config set core/disable_usage_reporting true && \
    gcloud config set component_manager/disable_update_check true && \
    gcloud config set metrics/environment github_docker_image && \
    gcloud --version
VOLUME ["/root/.config"]

And in the pod deployment frontend.yaml I added the following lifecycle event:

...
spec:
  ...
  containers:
  ...
    lifecycle:
      postStart:
        exec:
          command: ["gsutil", "-m", "rsync", "-r", "gs://my-storage", "/usr/share/nginx/html"]

To "refresh" the frontend pods when bucket content is updated, I simply run the following from my CI:

kubectl set env deployment/frontend K8S_FORCE=$(date +%s)

-- Mugen
Source: StackOverflow

1/30/2019

You will need to tolerate an outage here, due to the access mode: the existing Pods must be deleted (unmounting the volumes) before the new ones can attach them.

A Deployment strategy - .spec.strategy.type - of “Recreate” will help achieve this: https://github.com/ContainerSolutions/k8s-deployment-strategies/blob/master/recreate/README.md
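
For reference, a minimal sketch of what that could look like in the Deployment manifest (the image and PVC name below are illustrative, not taken from the question):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  strategy:
    type: Recreate        # terminate old Pods (releasing the RWO volumes) before starting new ones
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: gcr.io/my-project/my-app:latest   # illustrative image
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: my-app-data                  # illustrative PVC name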

-- elithrar
Source: StackOverflow

1/24/2019

That seems like an error caused by the ReadWriteOnce access mode. Remember that when you update a deployment, new pods get created and then the older ones get killed. So the new pod may be trying to mount a volume that is already mounted, and that's why you get that message.

Have you tried using a volume that allows multiple readers/writers? You can see the list of current volumes in the Kubernetes Volumes documentation.
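
As a rough sketch, a PVC requesting ReadWriteMany could look like the following - assuming your storage backend actually supports multiple writers (on GKE that would be something like Filestore/NFS; plain GCE persistent disks do not support ReadWriteMany, and the storage class name below is illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-shared-data        # illustrative name
spec:
  accessModes:
    - ReadWriteMany               # multiple nodes may mount the volume read-write
  storageClassName: nfs-rwx       # illustrative; must be backed by RWX-capable storage
  resources:
    requests:
      storage: 10Gi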

-- ozrlz
Source: StackOverflow