ROS 2 on K8s: Multicast fails within same pod, succeeds in different pods

1/23/2020

Cross-posted from k8s discuss.

Cluster information:

Kubernetes version: v1.17.1
Cloud being used: bare-metal
Installation method: kubeadm
Host OS: Ubuntu 18.04 LTS Server
CNI and version: weave/flannel/kube-router/calico (latest releases of each)
CRI and version: docker/containerd (19.03.5/1.2.10)

Problem:

I am attempting to bring up a ROS 2 installation on Kubernetes, ideally using multiple containers in a single pod. Under the hood, ROS 2 relies on DDS for communication, and DDS in turn uses UDP multicast, by default at least for participant discovery.
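
For anyone wanting to capture the traffic, it helps to know which UDP ports are involved. Below is a minimal sketch of the default port mapping as I understand it from the DDSI-RTPS specification (PB=7400, DG=250, PG=2, d0=0, d1=10, d2=1, d3=11). The function name `rtps_ports` is my own, and I am assuming the DDS vendor in the image keeps these defaults along with domain ID 0, which I have not verified.

```python
# Default RTPS port parameters (assumed unchanged by the DDS vendor in use).
PB, DG, PG = 7400, 250, 2        # port base, domain gain, participant gain
D0, D1, D2, D3 = 0, 10, 1, 11    # offsets for the four port kinds


def rtps_ports(domain_id=0, participant_id=0):
    """Return the default RTPS UDP ports for one participant in one domain."""
    base = PB + DG * domain_id
    return {
        # Multicast ports are shared by every participant in the domain.
        "discovery_multicast": base + D0,
        "user_multicast": base + D2,
        # Unicast ports are per participant, so two participants that share
        # a network namespace (same pod) should end up on different ports.
        "discovery_unicast": base + D1 + PG * participant_id,
        "user_unicast": base + D3 + PG * participant_id,
    }


if __name__ == "__main__":
    for pid in (0, 1):
        print(pid, rtps_ports(domain_id=0, participant_id=pid))
```

With these defaults, discovery multicast for domain 0 lands on port 7400. In the single-pod case the talker and listener share one network namespace, so I would expect them to take participant IDs 0 and 1 and therefore different unicast ports.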

When I bring up a single pod with two containers in a producer-consumer configuration, the consumer rarely (if ever) receives a message from the producer. When I bring up two pods, each with a single container, in the same producer-consumer configuration, the consumer always receives the messages.

Surprises

Every once in a while, the consumer will start up and receive messages as expected.

Furthermore, if one logs into the consumer with kubectl exec -it ros2-1 -c consumer /bin/bash then runs /ros_entrypoint.sh ros2 run demo_nodes_cpp listener, messages are sometimes received from the producer in the single pod scenario.
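
While logged into a container this way, one check I can think of is whether plain UDP multicast loops back at all inside the pod's shared network namespace, independent of DDS. The following is only a sketch. It assumes Python 3 is available in the image, uses 239.255.0.1 (which I believe is the default DDS discovery group), and picks an arbitrary test port 17400 so as not to collide with the real discovery port. The function name `multicast_loopback_probe` is hypothetical.

```python
import socket
import struct

GROUP = "239.255.0.1"  # assumed default DDS discovery multicast group
PORT = 17400           # arbitrary test port (assumption), not a DDS port


def multicast_loopback_probe(group=GROUP, port=PORT, timeout=2.0):
    """Return True if a datagram sent to the group is received locally."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Receiver: bind to the port and join the group on the default interface.
        rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        rx.bind(("", port))
        mreq = struct.pack(
            "4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0")
        )
        rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        rx.settimeout(timeout)

        # Sender: keep the datagram on the local network and request loopback.
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        tx.sendto(b"probe", (group, port))

        data, _ = rx.recvfrom(1024)
        return data == b"probe"
    except OSError:
        # Covers timeouts as well as "no such device" / "network unreachable".
        return False
    finally:
        rx.close()
        tx.close()


if __name__ == "__main__":
    print("multicast loopback works:", multicast_loopback_probe())
```

If this prints False inside the two-container pod, I would take that as a hint that the problem sits below DDS. If it prints True, it only shows that loopback works for this one socket pair and says nothing definite about the DDS traffic itself.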

Expected Behavior

Successful messages appear in the logs of the consumer container as:

[INFO] [1579805884.017171859] [listener]: I heard: [Hello World: 1]
[INFO] [1579805885.017168023] [listener]: I heard: [Hello World: 2]
[INFO] [1579805886.017025092] [listener]: I heard: [Hello World: 3]

Actual Behavior

No such log messages are observed from the consumer.

Steps to Reproduce:

Failure within same pod

  1. Bring up a Kubernetes cluster
  2. Apply the following pod definition: ros2-1.yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: ros2-1
    spec:
      containers:
      - name: producer
        image: osrf/ros2:nightly
        args: ["ros2", "run", "demo_nodes_cpp", "talker"]
      - name: consumer
        image: osrf/ros2:nightly
        args: ["ros2", "run", "demo_nodes_cpp", "listener"]
      restartPolicy: Never
  3. Watch for messages from the consumer with kubectl logs --follow ros2-1 consumer.

Success in different pods

  1. Bring up a Kubernetes cluster
  2. Apply the following pod definitions: ros2-2.yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: ros2-2-producer
    spec:
      containers:
      - name: producer
        image: osrf/ros2:nightly
        args: ["ros2", "run", "demo_nodes_cpp", "talker"]
      restartPolicy: Never
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: ros2-2-consumer
    spec:
      containers:
      - name: consumer
        image: osrf/ros2:nightly
        args: ["ros2", "run", "demo_nodes_cpp", "listener"]
      restartPolicy: Never
  3. Watch for messages from the consumer with kubectl logs --follow ros2-2-consumer.

Questions:

  1. What is causing a single pod deployment to fail, but multi pod deployment to succeed?
  2. I am fairly experienced with debugging networking issues on bare metal, but unfamiliar with doing so in a Kubernetes environment. How should I go about investigating this issue under flannel, weave, or kube-router?
-- oursland