Watch and wait for POD deletion with Golang k8s client

2/15/2022

I need to watch (and wait) until a POD is deleted. I need to this is because I need to start a second pod (with the same name) immediately after the first one has been deleted.

This is what I'm trying:

func (k *k8sClient) waitPodDeleted(ctx context.Context, resName string) error {
	watcher, err := k.createPodWatcher(ctx, resName)
	if err != nil {
		return err
	}

	defer watcher.Stop()

	for {
		select {
		case event := <-watcher.ResultChan():

			if event.Type == watch.Deleted {
				k.logger.Debugf("The POD \"%s\" is deleted", resName)

				return nil
			}

		case <-ctx.Done():
			k.logger.Debugf("Exit from waitPodDeleted for POD \"%s\" because the context is done", resName)
			return nil
		}
	}
}

The problem with this approach is that when I get the Deleted event, is when the POD receives the event for deletion, but not when it is actually deleted. Doing some extra tests I ended debugging the process with this code:

case event := <-watcher.ResultChan():

	if event.Type == watch.Deleted {
		pod := event.Object.(*v1.Pod)
		k.logger.Debugf("EVENT %s PHASE %s MESSAGE %s", event.Type, pod.Status.Phase, pod.Status.Message)
	}

The log result for this is:

2022-02-15T08:21:51 DEBUG EVENT DELETED PHASE Running MESSAGE
2022-02-15T08:22:21 DEBUG EVENT DELETED PHASE Running MESSAGE

I'm getting two Deleted events. The first one right away I send the delete command. The last one when the pod is effectively deleted from the cluster.

My questions are:

  • Why I'm getting two Deleted events? How can I differentiate one from another? I've tried to compare the two events and they seems exactly the same (except the timestamps)
  • What is the best approach to watch and wait for a pod deletion, so I can immediately relaunch it? should I poll the API until the pod is not returned?

The usecase I'm trying to solve: In my application there is a feature to replace a tool with another with different options. The feature needs to delete the pod that contains the tool and relaunch it with another set of options. In this scenario I need to wait for the pod deletion so I can start it again.

Thanks in advance!

-- Alejandro Gonz&#225;lez
go
kubernetes
kubernetes-pod

1 Answer

2/21/2022

As I said in the comments, the real problem was the watcher I was creating to watch the pod I want to get deleted. In the watcher I was creating a LabelSelector that was selecting two pods instead of one. This is the complete solution, including the watcher.

func (k *k8sClient) createPodWatcher(ctx context.Context, resName string) (watch.Interface, error) {
	labelSelector := fmt.Sprintf("app.kubernetes.io/instance=%s", resName)
	k.logger.Debugf("Creating watcher for POD with label: %s", labelSelector)

	opts := metav1.ListOptions{
		TypeMeta:      metav1.TypeMeta{},
		LabelSelector: labelSelector,
		FieldSelector: "",
	}

	return k.clientset.CoreV1().Pods(k.cfg.Kubernetes.Namespace).Watch(ctx, opts)
}

func (k *k8sClient) waitPodDeleted(ctx context.Context, resName string) error {
    watcher, err := k.createPodWatcher(ctx, resName)
    if err != nil {
        return err
    }

    defer watcher.Stop()

    for {
        select {
        case event := <-watcher.ResultChan():

            if event.Type == watch.Deleted {
                k.logger.Debugf("The POD \"%s\" is deleted", resName)

                return nil
            }

        case <-ctx.Done():
            k.logger.Debugf("Exit from waitPodDeleted for POD \"%s\" because the context is done", resName)
            return nil
        }
    }
}

func (k *k8sClient) waitPodRunning(ctx context.Context, resName string) error {
	watcher, err := k.createPodWatcher(ctx, resName)
	if err != nil {
		return err
	}

	defer watcher.Stop()

	for {
		select {
		case event := <-watcher.ResultChan():
			pod := event.Object.(*v1.Pod)

			if pod.Status.Phase == v1.PodRunning {
				k.logger.Infof("The POD \"%s\" is running", resName)
				return nil
			}

		case <-ctx.Done():
			k.logger.Debugf("Exit from waitPodRunning for POD \"%s\" because the context is done", resName)
			return nil
		}
	}
}
-- Alejandro Gonz&#225;lez
Source: StackOverflow