Kubernetes Job Informer Callback Called Very Late

11/7/2021

We are using the Kubernetes informer (source code) to receive Job update events from the Kubernetes API server. After we receive these events, we usually delete the Job records from the cluster.

Recently we found that many Job records were staying in the cluster because the client did not receive the Job update events from the Kubernetes API server immediately; instead, it received them more than an hour later.

Here is some information:

  1. job informer callback

    OnAdd(obj interface{})
    OnUpdate(oldObj, newObj interface{})     
    OnDelete(obj interface{})       
  2. cluster information

    kubernetes version: v1.20
    client-go version: v0.19.6

  3. other information
    There are more than 1000 Kubernetes Jobs in the cluster. They are all in Complete status, and we have not removed them only because we need them to debug other business logic.

    However, we found that after removing those Kubernetes Jobs and restarting the informer, everything went back to normal.

How can we solve the notification delay? Is there any way to debug this kind of issue?

-- Wallace
client-go
kubernetes
kubernetes-jobs

0 Answers