Formatting GKE local SSD using LVM and btrfs

2/8/2022

I'm trying to format the local SSD that GKE node pools provide with btrfs. The drive comes preformatted with ext4. Is there a way to prevent this so that I can format it as btrfs myself?

I would prefer not to forcibly unmount the drive or modify the node in GKE, but I haven't been able to find a canonical way of doing this. The answer might be to run my own Kubernetes cluster, but I would prefer to avoid that.

I use terraform to create the cluster and ansible to perform the actual modification of the underlying host:

---
- hosts: localhost
  connection: local
  become: yes
  tasks:
    - name: Update the apt package index and upgrade all packages
      apt:
        name: "*"
        state: latest
        update_cache: yes
        force_apt_get: yes

    - name: Install btrfs partition tools
      apt:
        name:
          - parted
          - btrfs-progs
        state: latest

    - name: Enable btrfs
      community.general.modprobe:
        name: btrfs
        state: present

    - name: Unmount existing partition
      ansible.posix.mount:
        path: /mnt/stateful_partition/kube-ephemeral-ssd
        state: unmounted

    - name: Delete existing ext4 partition
      parted:
        device: /dev/nvme0n1
        # The parted module needs a partition number for state: absent
        number: 1
        state: absent

    - name: Create partition
      parted:
        device: /dev/nvme0n1
        number: 1
        flags: [ lvm ]
        state: present

    - name: Create Volume Group
      community.general.lvg:
        vg: replays
        # Use the partition created above, not the whole disk
        pvs: /dev/nvme0n1p1
-- Paul Johnson
google-cloud-platform
google-compute-engine
google-kubernetes-engine
kubernetes

1 Answer

2/9/2022

I dug through the GKE node initialization scripts and found the setting that controls this. I was able to disable it by modifying the kube-env metadata field in the instance template that GKE generates.

I wrote a script in TypeScript that makes this modification every time I start the cluster. It's pretty rough and uncommented, so if you need help doing this, or with node customization in general, leave a comment.
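For readers who want a starting point, here is a minimal sketch of the text-rewriting part of such a script. It is not the script mentioned above. It assumes kube-env is a plain document of `KEY: "value"` lines, and `EXAMPLE_FORMAT_LOCAL_SSD` is a made-up placeholder, not the real setting name, which you would have to look up in the node initialization scripts yourself. Fetching the metadata and creating a new instance template are left out.

```typescript
// Sketch only: rewrite one entry in a kube-env style "KEY: value" document.
// Assumption: each setting sits on its own line as KEY: "value".
function setKubeEnvEntry(kubeEnv: string, key: string, value: string): string {
  const prefix = `${key}:`;
  let replaced = false;
  const lines = kubeEnv.split("\n").map((line) => {
    if (line.startsWith(prefix)) {
      replaced = true;
      // JSON.stringify adds the double quotes around the value.
      return `${key}: ${JSON.stringify(value)}`;
    }
    return line;
  });
  if (!replaced) {
    // Fail loudly rather than silently producing an unchanged template.
    throw new Error(`key ${key} not found in kube-env`);
  }
  return lines.join("\n");
}

// Hypothetical input; neither key is taken from a real GKE template.
const originalKubeEnv = [
  'EXAMPLE_FORMAT_LOCAL_SSD: "true"',
  'OTHER_SETTING: "unchanged"',
].join("\n");

const updatedKubeEnv = setKubeEnvEntry(
  originalKubeEnv,
  "EXAMPLE_FORMAT_LOCAL_SSD",
  "false",
);
console.log(updatedKubeEnv);
```

Throwing when the key is missing is a deliberate choice here: if GKE renames or removes the setting, the script stops instead of starting nodes that still get formatted as ext4.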

<s> With a bit more research, it looks like it's possible, but pretty ugly. The local SSD isn't simply blank storage; the Kubernetes service itself uses it to store files.

ScyllaDB does something similar to get an XFS filesystem on the local storage.

It does this via a script that:
1. Shuts down the Kubernetes service on the node.
2. Copies the Kubernetes files on the local SSD to a temp directory.
3. Formats the local storage as XFS.
4. Copies the Kubernetes files to the newly created XFS drive.
5. Restarts the Kubernetes service.

All of this runs in a systemd service, so that it keeps running when the containers in the DaemonSet that starts the process shut down because the Kubernetes service has been stopped.

I think I'll investigate other hosted services.

Script: https://github.com/scylladb/scylla-operator/blob/6e9424fa2c4206c1e3e6fd74b9398e5a36d91f26/hack/gke/xfs-formatter/xfs-formatter.sh
Daemonset: https://github.com/scylladb/scylla-operator/blob/master/examples/gke/xfs-formatter-daemonset.yaml </s>
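The stop, copy, format, copy back, restart sequence described above can be sketched as a dry-run plan that only builds the shell commands, so the order can be reviewed before anything touches a node. This is an illustration, not the ScyllaDB script. The service name `kubelet`, the backup directory, and the explicit `mkdir`, `umount` and `mount` steps are assumptions added here.

```typescript
// Sketch only: build the ordered list of commands, without running them.
interface ReformatPlanOptions {
  device: string;      // block device backing the local SSD
  mountPoint: string;  // where the node mounts it
  backupDir: string;   // temp directory for the existing files (assumed)
  mkfsCommand: string; // e.g. "mkfs.btrfs -f" or "mkfs.xfs -f"
  service: string;     // systemd unit to stop and start (assumed "kubelet")
}

function buildReformatPlan(o: ReformatPlanOptions): string[] {
  return [
    `systemctl stop ${o.service}`,             // 1. stop the Kubernetes service
    `mkdir -p ${o.backupDir}`,
    `cp -a ${o.mountPoint}/. ${o.backupDir}/`, // 2. copy its files aside
    `umount ${o.mountPoint}`,
    `${o.mkfsCommand} ${o.device}`,            // 3. format the drive
    `mount ${o.device} ${o.mountPoint}`,
    `cp -a ${o.backupDir}/. ${o.mountPoint}/`, // 4. copy the files back
    `systemctl start ${o.service}`,            // 5. restart the service
  ];
}

const reformatPlan = buildReformatPlan({
  device: "/dev/nvme0n1",
  mountPoint: "/mnt/stateful_partition/kube-ephemeral-ssd",
  backupDir: "/tmp/kube-ephemeral-ssd-backup",
  mkfsCommand: "mkfs.btrfs -f",
  service: "kubelet",
});
reformatPlan.forEach((cmd) => console.log(cmd));
```

As noted above, these commands would have to run from something that survives the service being stopped, such as a systemd unit on the host.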

-- Paul Johnson
Source: StackOverflow