Troubleshooting Kubernetes Tutorial

7/8/2019

I am working through the coarse parallel processing Kubernetes tutorial located at https://kubernetes.io/docs/tasks/job/coarse-parallel-processing-work-queue/#before-you-begin. I have my cluster set up with Rancher on AWS using EC2 instances. When I run:

 kubectl apply -f ./job.yaml
 kubectl describe jobs/job-wq-1

I receive the following output:

Name:           job-wq-1
Namespace:      default
Selector:       controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
Labels:         controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
                job-name=job-wq-1
Annotations:    kubectl.kubernetes.io/last-applied-configuration:
              {"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"job-wq-1","namespace":"default"},"spec":{"completions":8,"paral...
Parallelism:    2
Completions:    8
Start Time:     Mon, 08 Jul 2019 15:48:35 -0400
Pods Statuses:  0 Running / 0 Succeeded / 2 Failed
Pod Template:
  Labels:  controller-uid=5f9e1780-a1b9-11e9-a6b7-026525d9a49a
           job-name=job-wq-1
  Containers:
   c:
    Image:      mgladden/job-wq-1
    Port:       <none>
    Host Port:  <none>
    Environment:
      BROKER_URL:  amqp://guest:guest@rabbitmq-service:5672
      QUEUE:       job1
    Mounts:        <none>
  Volumes:         <none>
Events:
  Type     Reason                Age    From            Message
  ----     ------                ----   ----            -------
  Normal   SuccessfulCreate      10m    job-controller  Created pod: job-wq-1-z8kn6
  Normal   SuccessfulCreate      10m    job-controller  Created pod: job-wq-1-lqcfs
  Normal   SuccessfulDelete      9m35s  job-controller  Deleted pod: job-wq-1-z8kn6
  Normal   SuccessfulDelete      9m35s  job-controller  Deleted pod: job-wq-1-lqcfs

I am unsure how to troubleshoot at this point. It appears that none of the pods succeeded: the status shows 0 Running, 0 Succeeded and 2 Failed. Could this be due to my Rancher setup? I also noticed that in the tutorial the Annotations field is blank, whereas mine has output.

-- Morgan Gladden
kubernetes
rancher

1 Answer

7/9/2019

Thanks for the help. I checked the error logs and found the following error: "logging in to AMQP server: a socket error occurred". It looks to be a problem with using the older Ubuntu 14.04 base when building the Docker image. When I switched to Ubuntu 18.04, the tutorial finished as expected.
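
For anyone hitting the same failure, here is a rough sketch of one way to get at those logs. It is not necessarily the exact commands I ran. The helper name job_logs is made up for illustration; it relies on the job-name=<job> label that appears in the Labels section of the describe output, and on the standard kubectl get pods and kubectl logs subcommands.

```shell
# Hypothetical helper (name is an assumption): print the logs of every pod
# that belongs to the given Job.
job_logs() {
  job="$1"
  # The Job controller labels its pods with job-name=<job>, so select on that.
  # --output=name prints one "pod/<name>" entry per line.
  for pod in $(kubectl get pods --selector="job-name=${job}" --output=name); do
    echo "== ${pod} =="
    # --all-containers avoids hard-coding the container name ("c" in this Job).
    kubectl logs "${pod}" --all-containers=true
  done
}
```

Running job_logs job-wq-1 should then print whatever each worker wrote before it exited. Note that the Events in the question show the pods being deleted shortly after creation, so the logs may only be available if you check while a pod still exists.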

-- Morgan Gladden
Source: StackOverflow