EKS: Is it possible to move the PV of a StatefulSet to a different Availability Zone by deleting it?

12/22/2021

I have an EKS cluster running a StatefulSet on EC2 nodes, with EBS volumes provisioned through the StorageClass. I want to move a pod of the StatefulSet from node 1 to node 2. I drain node 1 like so:

kubectl drain --ignore-daemonsets --delete-emptydir-data node1

The problem is that the pod doesn't come up on node 2, because the PV was created in us-east-1a and can't be attached to node 2, which is in us-east-1b (the cross-zone issue described here: https://stackoverflow.com/a/55514852/1259990).

When I describe the pod, I get the following scheduling error:

1 node(s) had volume node affinity conflict
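
The mismatch can be confirmed by comparing the zone label on each node with the zone recorded in the PV's node affinity. This is a minimal sketch, assuming the nodes carry the well-known `topology.kubernetes.io/zone` label; the function names are made up for illustration.

```shell
# List every node with its zone label shown as an extra column.
show_node_zones() {
  kubectl get nodes -L topology.kubernetes.io/zone
}

# Print the zone value(s) the PV is pinned to through its node affinity.
# $1 is the PV name (a placeholder; substitute your own).
show_pv_zones() {
  kubectl get pv "$1" \
    -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}'
}
```

For example, `show_node_zones` followed by `show_pv_zones pv-in-us-east-1a` should show node 2 in one zone and the PV pinned to another.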

I'm wondering if I can recreate the PV in us-east-1b without having to delete/redeploy the StatefulSet. If I were to delete the PV from my cluster (and possibly the PVC as well):

kubectl delete pv pv-in-us-east-1a

Would the StatefulSet recreate the PV in the correct zone, if node2 is the only schedulable node? If not, is there another way to accomplish this without deleting/recreating the full StatefulSet? The data on the PV is not important and doesn't need to be saved.

(I would just try to delete the PV, but I don't actually want to bring down this particular service if the PV doesn't get recreated.)

-- blindsnowmobile
amazon-ebs
amazon-ec2
amazon-eks
kubernetes
kubernetes-statefulset

2 Answers

1/12/2022

In case it's helpful to anyone, I was able to resolve my issue as follows:

  1. Cordoned the node that was in the "wrong" availability zone
  2. Deleted the PVC associated with the pod
  3. Deleted the associated PV
  4. Deleted the pod

The new pod came up automatically with a new PV, in the AZ I wanted. My pod is part of a StatefulSet, so it's automatically recreated when deleted.

One thing to note is that the old EBS volume wasn't automatically cleaned up in my AWS account. I had to delete it manually.
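
The four steps above could be sketched roughly as follows. All resource names are hypothetical placeholders passed in as arguments, and the function name is invented for illustration. The `--wait=false` flags are an assumption on my part: protection finalizers can keep the PVC and PV around until the pod using them is gone, so the deletes return immediately instead of blocking.

```shell
# Recreate a StatefulSet pod's volume in another zone.
# Arguments (all placeholders): node, PVC, PV, pod.
recreate_volume_in_new_zone() {
  node=$1
  pvc=$2
  pv=$3
  pod=$4

  # 1. Keep the scheduler away from the node in the unwanted zone.
  kubectl cordon "$node"
  # 2. Request deletion of the claim without waiting for it to disappear.
  kubectl delete pvc "$pvc" --wait=false
  # 3. Request deletion of the volume object as well.
  kubectl delete pv "$pv" --wait=false
  # 4. Delete the pod; the StatefulSet controller recreates it.
  kubectl delete pod "$pod"
}
```

A call might look like `recreate_volume_in_new_zone node1 data-my-sts-0 pv-in-us-east-1a my-sts-0`, where the PVC and pod names are invented examples. Only try this when the data on the volume is disposable, as it was in my case.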

-- blindsnowmobile
Source: StackOverflow

3/16/2022

What you need to do is: 1) label your nodes per zone (you can do that using the Auto Scaling group tags), and 2) on the StatefulSet that requires a PV, use a node selector for a specific zone, e.g. us-east-1a.

That StatefulSet will then be locked to a specific AZ, but you will avoid this problem in the future.
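
As a rough sketch of step 2, the node selector can be added to the StatefulSet's pod template with `kubectl patch`. This assumes the nodes carry the well-known `topology.kubernetes.io/zone` label; a custom per-zone label would work the same way with a different key. The function name and its arguments are placeholders for illustration.

```shell
# Pin a StatefulSet's pods to one zone by adding a nodeSelector to its
# pod template. $1 is the StatefulSet name, $2 the zone (both placeholders).
pin_statefulset_to_zone() {
  kubectl patch statefulset "$1" --type merge \
    -p "{\"spec\":{\"template\":{\"spec\":{\"nodeSelector\":{\"topology.kubernetes.io/zone\":\"$2\"}}}}}"
}
```

For example, `pin_statefulset_to_zone my-sts us-east-1b`. Changing the pod template normally triggers a rolling update of the StatefulSet's pods, so expect them to restart.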

-- Mariano Billinghurst
Source: StackOverflow