I am trying to run the Kubeflow samples (pipelines, PyTorchJob, etc.), but the pods get stuck in ContainerCreating.
To debug this, I want to see the dockershim and Docker logs. Is there any documentation on how to do that?
The sample code is the hello world pipeline from https://github.com/kubeflow/pipelines/tree/master/samples/core/helloworld, which I ran as follows:
$ kubectl create serviceaccount pipeline-runner
$ python3 hello_world.py
$ kubectl create -f hello_world.py.yaml
$ kubectl describe workflow.argoproj.io/my-first-pipeline-wgkg2
Name: my-first-pipeline-wgkg2
Namespace: default
Labels: workflows.argoproj.io/phase=Running
Annotations: pipelines.kubeflow.org/pipeline_spec: {"description": "A hello world pipeline.", "name": "My first pipeline"}
API Version: argoproj.io/v1alpha1
Kind: Workflow
Metadata:
  Creation Timestamp: 2020-04-22T00:11:41Z
  Generate Name: my-first-pipeline-
  Generation: 3
  Resource Version: 23748
  Self Link: /apis/argoproj.io/v1alpha1/namespaces/default/workflows/my-first-pipeline-wgkg2
  UID: 50acafe4-2254-4dea-865d-7ec03496e523
Spec:
  Arguments:
  Entrypoint: my-first-pipeline
  Service Account Name: pipeline-runner
  Templates:
    Container:
      Args:
        echo "hello world"
      Command:
        sh
        -c
      Image: library/bash:4.4.23
      Name:
      Resources:
    Inputs:
    Metadata:
    Name: echo
    Outputs:
    Dag:
      Tasks:
        Arguments:
        Name: echo
        Template: echo
    Inputs:
    Metadata:
    Name: my-first-pipeline
    Outputs:
Status:
  Finished At: <nil>
  Nodes:
    my-first-pipeline-wgkg2:
      Children:
        my-first-pipeline-wgkg2-3423630397
      Display Name: my-first-pipeline-wgkg2
      Finished At: <nil>
      Id: my-first-pipeline-wgkg2
      Name: my-first-pipeline-wgkg2
      Phase: Running
      Started At: 2020-04-22T00:11:41Z
      Template Name: my-first-pipeline
      Type: DAG
    my-first-pipeline-wgkg2-3423630397:
      Boundary ID: my-first-pipeline-wgkg2
      Display Name: echo
      Finished At: <nil>
      Id: my-first-pipeline-wgkg2-3423630397
      Message: ContainerCreating
      Name: my-first-pipeline-wgkg2.echo
      Phase: Pending
      Started At: 2020-04-22T00:11:41Z
      Template Name: echo
      Type: Pod
  Phase: Running
  Started At: 2020-04-22T00:11:41Z
Events: <none>
The kubectl logs output is as follows:
$ kubectl logs my-first-pipeline-wgkg2-3423630397 -c wait
Error from server (BadRequest): container "wait" in pod "my-first-pipeline-wgkg2-3423630397" is waiting to start: ContainerCreating
$ kubectl logs my-first-pipeline-wgkg2-3423630397 -c main
Error from server (BadRequest): container "main" in pod "my-first-pipeline-wgkg2-3423630397" is waiting to start: ContainerCreating
Once you are logged in to the worker node, run docker ps to find the container ID of the container, then run docker logs <containerid> to see its logs.
Also look at the kubelet logs on the worker node to see whether there is any issue, by running journalctl -u kubelet.service -f on that node.
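The steps above can be wrapped in a few small shell helpers. This is only a sketch under some assumptions: the function names (pod_node, pod_events, node_container_logs, kubelet_logs_for_pod) are made up for illustration, the nodes use the Docker runtime with the kubelet running as the systemd unit kubelet.service, and the pod lives in the default namespace unless you pass another one. The first two helpers are meant to run wherever kubectl is configured; the last two are meant to run on the worker node itself. Since dockershim is built into the kubelet in this setup, its messages should show up in the kubelet journal rather than in a separate log.

```shell
#!/usr/bin/env bash
# Sketch: collect diagnostics for a pod stuck in ContainerCreating.
# All function names here are hypothetical helpers, not part of kubectl or docker.

# Print the name of the node the pod was scheduled on.
# Usage: pod_node POD [NAMESPACE]
pod_node() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.spec.nodeName}'
}

# Print the events recorded for the pod; image pull and volume mount
# failures are typically reported here.
# Usage: pod_events POD [NAMESPACE]
pod_events() {
  kubectl get events -n "${2:-default}" --field-selector "involvedObject.name=$1"
}

# Run on the worker node: print the last log lines of every Docker
# container whose name contains the pod name.
# Usage: node_container_logs POD
node_container_logs() {
  docker ps -a -q --filter "name=$1" | while read -r id; do
    echo "== container $id =="
    docker logs --tail 50 "$id"
  done
}

# Run on the worker node: print recent kubelet journal lines that mention the pod.
# Usage: kubelet_logs_for_pod POD [SINCE]
kubelet_logs_for_pod() {
  journalctl -u kubelet.service --since "${2:-1 hour ago}" --no-pager | grep -- "$1"
}
```

For the pod in the question, pod_node my-first-pipeline-wgkg2-3423630397 would tell you which worker node to log in to, and kubelet_logs_for_pod my-first-pipeline-wgkg2-3423630397 on that node narrows the kubelet journal down to lines about that pod. If no container has been created yet, node_container_logs prints nothing, in which case the pod events and the kubelet journal are the places to look.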