How to check when "kubectl delete" failed with "timeout waiting for ... to be synced"

10/9/2018

I have a Kubernetes v1.10.2 cluster and a cronjob on it. The job config is set to:

  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 3
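
For reference, whether these limits actually landed on the CronJob spec can be checked with something like the following (a sketch; <cronjob-name> and <namespace> are placeholders, and the two jsonpath fields are the standard CronJob spec fields):

$ kubectl get cronjob <cronjob-name> -n <namespace> \
    -o jsonpath='{.spec.successfulJobsHistoryLimit} {.spec.failedJobsHistoryLimit}{"\n"}'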

But it has created more than ten jobs, all of which are successful and none of which have been removed automatically. Now I am trying to delete them manually with kubectl delete job XXX, but the command times out:

$ kubectl delete job XXX
error: timed out waiting for "XXX" to be synced

I want to know how I can investigate such a situation. Is there a log file for the command execution?

I only know of the kubectl logs command, but it does not help in this situation.

"kubectl get" shows the job has already finished:

status:
  active: 1
  completionTime: 2018-08-27T21:20:21Z
  conditions:
  - lastProbeTime: 2018-08-27T21:20:21Z
    lastTransitionTime: 2018-08-27T21:20:21Z
    status: "True"
    type: Complete
  failed: 3
  startTime: 2018-08-27T01:00:00Z
  succeeded: 1

and "kubectl describe" output as:

$ kubectl describe job test-elk-xxx-1535331600 -ntest
Name:           test-elk-xxx-1535331600
Namespace:      test
Selector:       controller-uid=863a14e3-a994-11e8-8bd7-fa163e23632f
Labels:         controller-uid=863a14e3-a994-11e8-8bd7-fa163e23632f
                job-name=test-elk-xxx-1535331600
Annotations:    <none>
Controlled By:  CronJob/test-elk-xxx
Parallelism:    0
Completions:    1
Start Time:     Mon, 27 Aug 2018 01:00:00 +0000
Pods Statuses:  1 Running / 1 Succeeded / 3 Failed
Pod Template:
  Labels:  controller-uid=863a14e3-a994-11e8-8bd7-fa163e23632f
           job-name=test-elk-xxx-1535331600
  Containers:
   xxx:
    Image:      test-elk-xxx:18.03-3
    Port:       <none>
    Host Port:  <none>
    Args:
      --config
      /etc/elasticsearch-xxx/xxx.yml
      /etc/elasticsearch-xxx/actions.yml
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:        100m
      memory:     100Mi
    Environment:  <none>
    Mounts:
      /etc/elasticsearch-xxx from xxx-configs (ro)
  Volumes:
   xxx-configs:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      test-elk-xxx
    Optional:  false
Events:        <none>

It indicates that one pod is still running, but I don't know how to figure out that pod's name.
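
For reference, a job's pods carry the job-name label shown in the describe output above, so they should be listable with something like this (a sketch using the names from this question):

$ kubectl get pods -n test -l job-name=test-elk-xxx-1535331600 -o wide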

-- Michael.Sun
kubernetes
timeout

3 Answers

10/10/2018

I think this is the same as the problem reported on GitHub:

Cannot delete jobs when their associated pods are gone

Several people have reported this, and it is still not fixed.

And can use the "-v=X" (e.g. -v=8) option for the kubectl command, it will give more detailed debug info.

-- Michael.Sun
Source: StackOverflow

1/15/2020

As taken from https://github.com/kubernetes/kubernetes/issues/43168#issuecomment-375700293

Try using --cascade=false in your delete job command.

It worked for me.
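
Applied to the job from the question, that would look something like this (a sketch; --cascade=false orphans the job's pods instead of waiting for them to be cleaned up, which is why the delete no longer hangs on the missing pods; on kubectl v1.20+ the equivalent spelling is --cascade=orphan):

$ kubectl delete job test-elk-xxx-1535331600 -n test --cascade=false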

-- Mark Tickner
Source: StackOverflow

10/9/2018

Check if kubectl describe pod <pod name> (the pod associated with the job) still returns something, which would mean the pod still exists in some stuck state.

In that state, you can then consider a force deletion.
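
A force deletion would look something like this (a sketch; the pod name would come from listing the job's pods by the job-name label as shown in the question, and --grace-period=0 --force skips the normal graceful shutdown, so use it with care):

$ kubectl delete pod <pod-name> -n test --grace-period=0 --force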

-- VonC
Source: StackOverflow