How can I see my failing jobs with Kubernetes?

1/13/2020

I have an issue with a Job in Kubernetes. To debug it, I run:

kubectl describe job -n influx pamela-1578898800


Name:           pamela-1578898800
Namespace:      influx
Selector:       controller-uid=xxx
Labels:         controller-uid=xxx
                job-name=pamela-1578898800
Annotations:    <none>
Controlled By:  CronJob/pamela
Parallelism:    1
Completions:    1
Start Time:     Mon, 13 Jan 2020 08:00:04 +0100
Pods Statuses:  0 Running / 0 Succeeded / 5 Failed
Pod Template:
  Labels:  controller-uid=53110b24-35d2-11ea-bca1-06ecc706e86a
           job-name=pamela-1578898800
  Containers:
   pamela:
    Image:      registry.gitlab.com/xxx/pamela:latest
    Port:       <none>
    Host Port:  <none>
    Limits:
      cpu:     800m
      memory:  1000Mi
    Requests:
      cpu:     800m
      memory:  1000Mi
    Environment Variables from:
      pamela-env  Secret  Optional: false
    Environment:  <none>
    Mounts:
      /config from pamela-keys (rw)
      /log from pamela-claim (rw,path="log")
      /raw from pamela-claim (rw,path="raw")
  Volumes:
   pamela-claim:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pamela-claim
    ReadOnly:   false
   pamela-keys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  pamela-keys
    Optional:    false
Events:          <none>

Here you can see 5 failed pods, but I don't know how to see their logs.

When I run:

kubectl get po -A

no "pamela-xxx" pods are listed.

How can I find out what went wrong?
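
For reference, a minimal sketch of how the pods behind a Job could be looked up by label, assuming they have not already been deleted. The job-name label comes from the describe output above; the helper name job_debug is made up for illustration, and the kubectl subcommands and flags are standard ones.

```shell
# Hypothetical helper: list pods, logs and events for one Job.
# Usage: job_debug <namespace> <job-name>
job_debug() {
  ns="$1"
  job="$2"
  # Pods created by a Job carry the label job-name=<job>.
  kubectl get pods -n "$ns" -l "job-name=$job" -o wide
  # Logs from pods matching the same label (only works while the pods exist).
  kubectl logs -n "$ns" -l "job-name=$job" --tail=100
  # Events in the namespace, oldest first.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}
```

Calling job_debug influx pamela-1578898800 would run the three commands for the failing job. If the pod list comes back empty, the pods are gone, and events are only kept for about an hour by default, which may be why the describe output shows none.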

EDIT:

Here are the manifests I use to trigger the job.

# job.yaml - THIS ONE WORKS, LAUNCHING MANUALLY
apiVersion: batch/v1
kind: Job
metadata:
  name: pamela-singlerun
  namespace: influx
spec:
  template:
    spec:
      containers:
      - image: registry.gitlab.com/xxx/pamela:latest
        envFrom:
          - secretRef:
              name: pamela-env
        name: pamela
        volumeMounts:
        - mountPath: /raw
          name: pamela-claim
          subPath: raw
        - mountPath: /log
          name: pamela-claim
          subPath: log
        - mountPath: /config
          name: pamela-keys
      restartPolicy: Never
      volumes:
      - name: pamela-claim
        persistentVolumeClaim:
          claimName: pamela-claim
      - name: pamela-keys
        secret:
          secretName: pamela-keys
          items:
            - key: keys.yml
              path: keys.yml
      nodeSelector:
        kops.k8s.io/instancegroup: pamela-nodes
      imagePullSecrets:
        - name: gitlab-registry

And cronjob.yml, THIS ONE DOESN'T WORK

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pamela
  namespace: influx
spec:
  schedule: "0 7,19 * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - image: registry.gitlab.com/xxx/pamela:latest
            envFrom:
              - secretRef:
                  name: pamela-env
            name: pamela
            resources:
              limits:
                cpu: 800m
                memory: 1000Mi
              requests:
                cpu: 800m
                memory: 1000Mi
            volumeMounts:
            - mountPath: /raw
              name: pamela-claim
              subPath: raw
            - mountPath: /log
              name: pamela-claim
              subPath: log
            - mountPath: /config
              name: pamela-keys
          restartPolicy: Never
          volumes:
          - name: pamela-claim
            persistentVolumeClaim:
              claimName: pamela-claim
          - name: pamela-keys
            secret:
              secretName: pamela-keys
              items:
                - key: keys.yml
                  path: keys.yml
          nodeSelector:
            kops.k8s.io/instancegroup: pamela-nodes
          imagePullSecrets:
            - name: gitlab-registry

EDIT 2: After scheduling the cron job every 10 minutes, I can see my pods, and I get the expected results (meaning it works):

pamela-1578992400-ppgtx                0/1     Completed   0          21m
pamela-1578993000-kn8nd                0/1     Completed   0          11m

But right after this, when I try to get the logs, I get:

Error from server (NotFound): pods "pamela-1578992400-ppgtx" not found

which suggests the TTL is around 10 minutes. When I try to increase the TTL, I get an error about disabled feature gates. I am checking how to fix it.
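
A possible alternative to the TTL setting, sketched under assumptions: ttlSecondsAfterFinished on a Job depends on the TTLAfterFinished feature gate on clusters where that feature is still alpha, which would explain the feature-gates error. A CronJob also has successfulJobsHistoryLimit and failedJobsHistoryLimit fields (defaulting to 3 and 1) that control how many finished Jobs, and with them their pods, are kept, and these need no feature gate. Whether they explain the missing pods here is not confirmed. The helper names are hypothetical and the value 5 is an arbitrary example.

```shell
# Hypothetical helper: print the current history limits of a CronJob
# (empty values mean the field is unset and the default applies).
# Usage: show_history_limits <namespace> <cronjob>
show_history_limits() {
  kubectl get cronjob "$2" -n "$1" \
    -o jsonpath='{.spec.successfulJobsHistoryLimit}{" "}{.spec.failedJobsHistoryLimit}{"\n"}'
}

# Hypothetical helper: keep the last 5 successful and 5 failed Jobs.
# Usage: keep_more_history <namespace> <cronjob>
keep_more_history() {
  kubectl patch cronjob "$2" -n "$1" --type merge \
    -p '{"spec":{"successfulJobsHistoryLimit":5,"failedJobsHistoryLimit":5}}'
}
```

For the CronJob in this question the calls would be show_history_limits influx pamela and keep_more_history influx pamela.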

It is weird: after setting the cron job to run every 10 minutes, I get:

➜ kubectl get jobs -n influx
NAME                COMPLETIONS   DURATION   AGE
pamela-1578898800   0/1           32h        32h
pamela-1579007400   1/1           99s        159m
pamela-1579011000   1/1           97s        99m
pamela-1579014600   1/1           108s       39m

I use: schedule: "10 * * * *"

I don't understand what's going on here...
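
One possible reading, offered as a sketch: in cron syntax a minute field of 10 means "at minute 10 of every hour", while every 10 minutes would be written */10. The numeric suffix of each Job name also looks like the scheduled time as a Unix timestamp in seconds (1578898800 is 07:00:00 UTC on 13 Jan 2020, which matches the Start Time of 08:00:04 +0100 in the describe output), so decoding the suffixes shows how far apart the runs are. The function name job_time is made up, and date -d @N assumes GNU date.

```shell
# Hypothetical helper: decode the timestamp suffix of a CronJob-created Job name.
# Assumes the suffix is Unix seconds and that GNU date is available.
job_time() {
  # Strip everything up to the last '-' to keep only the numeric suffix.
  ts="${1##*-}"
  date -u -d "@$ts" '+%Y-%m-%dT%H:%M:%SZ'
}

# Job names taken from the kubectl get jobs output in this question.
for j in pamela-1578898800 pamela-1579007400 pamela-1579011000 pamela-1579014600; do
  printf '%s  %s\n' "$j" "$(job_time "$j")"
done
```

The last three names decode to 13:10, 14:10 and 15:10 UTC on 14 Jan 2020, one hour apart, which is consistent with "10 * * * *" firing hourly rather than every 10 minutes.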

-- Juliatzin
jobs
kubernetes
