I'm using AKS and K8s golang API.
I'm creating a Kubernetes watcher for watching jobs like
watchres, error := jobsClient.Watch(metav1.ListOptions{})
After that i'm getting the events channel like
eventres := watchres.ResultChan()
After that i'm getting events in a loop using
we := <-eventres
then on the basis of these event i'm performing some action (for example delete a resource when kubernetes job gets successful)
The issue i am facing is that everything seems to work fine but after some period of time the watcher does not delete resources however the jobs gets successful, what might be the issue, is there a timeout for the channel?? however i'm not closing the channel implicitly.
As I said in a previous comment, I've had K8s just not feed me more events down the channel, but not actually hangup like the timeout is supposed to.
So I've worked out something here using select - the idea is to wait for events but restart the watcher every 30 minutes just in case K8s doesn't hang up. So far, this is working:
func watchEvents() {
for {
if err := RunLoop(); err != nil {
log.Error(err)
}
time.Sleep(5 * time.Second)
}
}
func runLoop() error {
watcher, err := clientset.EventsV1beta1().Events("").Watch(metav1.ListOptions{})
if err != nil {
return err
}
ch := watcher.ResultChan()
for {
select {
case event, ok := <-ch:
if !ok {
// the channel got closed, so we need to restart
log.Info("Kubernetes hung up on us, restarting event watcher")
return nil
}
// handle the event
case <-time.After(30 * time.Minute):
// deal with the issue where we get no events
log.Infof("Timeout, restarting event watcher")
return nil
}
}
}
There is a default timeout on the watch. I believe it is set to 30 minutes.
You can override this value in ListOptions
. So for example, to set the timeout to an hour:
timeout := int64(3600)
watchres, error := jobsClient.Watch(metav1.ListOptions{
TimeoutSeconds: &timeout,
})