Watch in k8s golang api watches and get events but after sometime doesn't get any more events

7/18/2018

I'm using AKS and K8s golang API.

I'm creating a Kubernetes watcher for watching jobs like

watchres, error := jobsClient.Watch(metav1.ListOptions{})

After that i'm getting the events channel like

eventres := watchres.ResultChan()

After that i'm getting events in a loop using

we := <-eventres

then on the basis of these event i'm performing some action (for example delete a resource when kubernetes job gets successful)

The issue i am facing is that everything seems to work fine but after some period of time the watcher does not delete resources however the jobs gets successful, what might be the issue, is there a timeout for the channel?? however i'm not closing the channel implicitly.

-- UmairAhmad
go
kubernetes

2 Answers

2/28/2019

As I said in a previous comment, I've had K8s just not feed me more events down the channel, but not actually hangup like the timeout is supposed to.

So I've worked out something here using select - the idea is to wait for events but restart the watcher every 30 minutes just in case K8s doesn't hang up. So far, this is working:

func watchEvents() {
    for {
        if err := RunLoop(); err != nil {
            log.Error(err)
        }
        time.Sleep(5 * time.Second)
    }
}

func runLoop() error {
    watcher, err := clientset.EventsV1beta1().Events("").Watch(metav1.ListOptions{})
    if err != nil {
        return err
    }
    ch := watcher.ResultChan()
    for {
        select {
        case event, ok := <-ch:
            if !ok {
                // the channel got closed, so we need to restart
                log.Info("Kubernetes hung up on us, restarting event watcher")
                return nil
            }

            // handle the event

        case <-time.After(30 * time.Minute):
            // deal with the issue where we get no events
            log.Infof("Timeout, restarting event watcher")
            return nil
        }
    }
}
-- Michael
Source: StackOverflow

7/18/2018

There is a default timeout on the watch. I believe it is set to 30 minutes.

You can override this value in ListOptions. So for example, to set the timeout to an hour:

timeout := int64(3600)
watchres, error := jobsClient.Watch(metav1.ListOptions{
    TimeoutSeconds: &timeout,
})
-- PhilipGough
Source: StackOverflow