Sample Argo workflow stuck in Pending with no events or logs

7/23/2021

I am new to Argo Workflows and Kubernetes. I am trying to run the hello-world sample workflow on a kind cluster, and it stays pending forever:

Name:                hello-world-pmlm6
Namespace:           argo
ServiceAccount:      default
Status:              Running
Created:             Thu Jul 22 19:34:29 -0500 (8 minutes ago)
Started:             Thu Jul 22 19:34:29 -0500 (8 minutes ago)
Duration:            8 minutes 3 seconds
Progress:            0/1

STEP                  TEMPLATE  PODNAME            DURATION  MESSAGE
hello-world-pmlm6  whalesay  hello-world-pmlm6  8m          

This workflow does not have security context set. You can run your workflow pods more securely by setting it.
Learn more at https://argoproj.github.io/argo-workflows/workflow-pod-security-context/

The details of my workflow can be seen here:

user> user$ kubectl describe workflow hello-world-pmlm6 --namespace=argo
Name:         hello-world-pmlm6
Namespace:    argo
Labels:       workflows.argoproj.io/archive-strategy=false
              workflows.argoproj.io/phase=Running
Annotations:  workflows.argoproj.io/description:
                This is a simple hello world example.
                You can also run it in Python: https://couler-proj.github.io/couler/examples/#hello-world
API Version:  argoproj.io/v1alpha1
Kind:         Workflow
Metadata:
  Creation Timestamp:  2021-07-23T00:34:29Z
  Generate Name:       hello-world-
  Generation:          2
  Managed Fields:
    API Version:  argoproj.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:workflows.argoproj.io/description:
        f:generateName:
        f:labels:
          .:
          f:workflows.argoproj.io/archive-strategy:
      f:spec:
        .:
        f:arguments:
        f:entrypoint:
        f:templates:
      f:status:
        .:
        f:finishedAt:
    Manager:      argo
    Operation:    Update
    Time:         2021-07-23T00:34:29Z
    API Version:  argoproj.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:workflows.argoproj.io/phase:
      f:status:
        f:artifactRepositoryRef:
        f:nodes:
        f:phase:
        f:progress:
        f:startedAt:
    Manager:         workflow-controller
    Operation:       Update
    Time:            2021-07-23T00:34:29Z
  Resource Version:  207302
  UID:               23496e4a-9f88-471c-a7d9-3b95bb27c31e
Spec:
  Arguments:
  Entrypoint:  whalesay
  Templates:
    Container:
      Args:
        hello world
      Command:
        cowsay
      Image:  docker/whalesay:latest
      Name:   
      Resources:
    Inputs:
    Metadata:
    Name:  whalesay
    Outputs:
Status:
  Artifact Repository Ref:
    Default:    true
  Finished At:  <nil>
  Nodes:
    hello-world-pmlm6:
      Display Name:    hello-world-pmlm6
      Finished At:     <nil>
      Id:              hello-world-pmlm6
      Name:            hello-world-pmlm6
      Phase:           Pending
      Progress:        0/1
      Started At:      2021-07-23T00:34:29Z
      Template Name:   whalesay
      Template Scope:  local/hello-world-pmlm6
      Type:            Pod
  Phase:               Running
  Progress:            0/1
  Started At:          2021-07-23T00:34:29Z
Events:
  Type    Reason           Age    From                 Message
  ----    ------           ----   ----                 -------
  Normal  WorkflowRunning  3m27s  workflow-controller  Workflow Running

The details of the corresponding workflow-controller pod are:

user> user $ kubectl -n argo describe pods/workflow-controller-df6fb694c-d49c9
Name:         workflow-controller-df6fb694c-d49c9
Namespace:    argo
Priority:     0
Node:         kind-control-plane/172.18.0.2
Start Time:   Wed, 21 Jul 2021 09:48:18 -0500
Labels:       app=workflow-controller
              pod-template-hash=df6fb694c
Annotations:  <none>
Status:       Running
IP:           10.244.0.3
IPs:
  IP:           10.244.0.3
Controlled By:  ReplicaSet/workflow-controller-df6fb694c
Containers:
  workflow-controller:
    Container ID:  containerd://5c1b941ee73dfbfd280f707e3bc59e56abf32cd43816f0250ecaf4960fee71a5
    Image:         argoproj/workflow-controller:v3.0.3
    Image ID:      docker.io/argoproj/workflow-controller@sha256:45c20ed0a68dc8f118e119be051fccee59106654e0795c27c6ff6393c8931212
    Port:          9090/TCP
    Host Port:     0/TCP
    Command:
      workflow-controller
    Args:
      --configmap
      workflow-controller-configmap
      --executor-image
      argoproj/argoexec:v3.0.3
      --namespaced
    State:          Running
      Started:      Thu, 22 Jul 2021 12:26:32 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 22 Jul 2021 12:25:49 -0500
      Finished:     Thu, 22 Jul 2021 12:26:19 -0500
    Ready:          True
    Restart Count:  3
    Liveness:       http-get http://:metrics/metrics delay=30s timeout=1s period=30s #success=1 #failure=3
    Environment:
      LEADER_ELECTION_IDENTITY:  workflow-controller-df6fb694c-d49c9 (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wth84 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  kube-api-access-wth84:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

I am unable to find the source of the error. I am using Argo version v3.1.2. Any idea why workflows get stuck in the Pending state and how to fix it?

EDIT: I submitted a new hello-world workflow. Here is the output of kubectl describe pod for it:

user> user$ kubectl describe pod hello-world-9ccmt -n argo
Name:           hello-world-9ccmt
Namespace:      argo
Priority:       0
Node:           <none>
Labels:         workflows.argoproj.io/completed=false
                workflows.argoproj.io/workflow=hello-world-9ccmt
Annotations:    workflows.argoproj.io/node-name: hello-world-9ccmt
                workflows.argoproj.io/template:
                  {"name":"whalesay","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["cowsay"],"ar...
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  Workflow/hello-world-9ccmt
Containers:
  wait:
    Image:      argoproj/argoexec:v3.0.3
    Port:       <none>
    Host Port:  <none>
    Command:
      argoexec
      wait
      --loglevel
      info
    Environment:
      ARGO_POD_NAME:                    hello-world-9ccmt (v1:metadata.name)
      GODEBUG:                          x509ignoreCN=0
      ARGO_CONTAINER_RUNTIME_EXECUTOR:  pns
      ARGO_CONTAINER_NAME:              wait
    Mounts:
      /argo/podmetadata from podmetadata (rw)
      /argo/secret/my-minio-cred from my-minio-cred (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zmg5f (ro)
  main:
    Image:      docker/whalesay:latest
    Port:       <none>
    Host Port:  <none>
    Command:
      cowsay
    Args:
      hello world
    Environment:
      ARGO_CONTAINER_NAME:  main
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zmg5f (ro)
Volumes:
  podmetadata:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.annotations -> annotations
  my-minio-cred:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  my-minio-cred
    Optional:    false
  kube-api-access-zmg5f:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
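
The pod above shows Node: <none> and no events, which suggests it has not been bound to a node yet. Below is a minimal sketch of further checks that might narrow this down. It is not a confirmed fix. The function name check_unscheduled_pod and the KUBECTL override variable are hypothetical helpers introduced here for illustration; the kubectl subcommands themselves are standard. It assumes the controller runs as a Deployment named workflow-controller, as the ReplicaSet name in the output above suggests.

```shell
#!/bin/sh
# Hypothetical helper, not part of Argo or kubectl.
# KUBECTL can be overridden (for example with "echo") to dry-run the commands.
KUBECTL="${KUBECTL:-kubectl}"

check_unscheduled_pod() {
  ns="$1"   # namespace, e.g. argo
  pod="$2"  # pod name, e.g. hello-world-9ccmt

  # Is there at least one node in Ready state?
  $KUBECTL get nodes -o wide

  # Are the scheduler and the other control-plane pods running?
  $KUBECTL get pods -n kube-system

  # Namespace-wide events, oldest first, in case none show up on the pod itself
  $KUBECTL get events -n "$ns" --sort-by=.metadata.creationTimestamp

  # Which scheduler is this pod waiting for?
  $KUBECTL get pod "$pod" -n "$ns" -o jsonpath='{.spec.schedulerName}'

  # The controller shows Restart Count: 3; look at the previous container's logs
  $KUBECTL logs -n "$ns" deployment/workflow-controller --previous
}
```

For the pod in this question it would be called as check_unscheduled_pod argo hello-world-9ccmt.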
-- Mojo Jojo
argo-workflows
kubernetes
