How can I set up a Helm chart that creates a "rank" system similar to MPI?

4/7/2020

I'm migrating an HPC app originally written with MPI to a Kubernetes cluster. We have removed MPI and are "rolling our own" way of managing the app layout using Helm.

In MPI, you essentially build an "appschema" that looks an awful lot like a Helm chart. I'm trying to replicate some of MPI's features in the chart and am unsure about the best approach.

In an MPI application, you launch several copies of the same binary, and each running copy is given a unique number, a rank, that identifies it within the application group. The rank determines which part of the problem that copy works on, and it is how the copy sends messages to and receives messages from the other copies in the group. Our approach would use service discovery and something like ZMQ to let ranks communicate with each other, but we still need a way of uniquely identifying each rank.

My plan for replicating this behavior is to pass an environment variable to each pod that specifies the rank for its container app, so that the Docker image can start the app with the following command:

CMD ["sh", "-c", "/apphome/workerApp $RANKNUM"]

The only thing is, I don't know how best to represent this in a Helm chart. My current line of thinking is to set replicaCount in the values.yaml to the number of desired ranks, but then how can I pass in a unique number for $RANKNUM to each replica? Is this even the best approach or should I use something other than replicaCount?

How can I pass in a unique numerical identifier as an environment variable to each replica in a Kubernetes Helm chart, and is replicaCount the appropriate way to represent MPI rank-like behavior in an HPC app?
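
To make the question concrete, here is a minimal sketch of one possible direction. It assumes the workers are created by a StatefulSet rather than a Deployment (nothing above says which is used). A StatefulSet names its pods <name>-<ordinal> and each pod's hostname matches its pod name, so the start command could derive the rank from the hostname instead of receiving $RANKNUM from the chart. The helper name rank_from_hostname is hypothetical, not part of Helm or Kubernetes.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): extract the trailing
# ordinal from a pod hostname that follows the StatefulSet pattern
# <name>-<ordinal>, e.g. "worker-3".
rank_from_hostname() {
    host="$1"
    # Strip everything up to and including the last "-".
    ordinal="${host##*-}"
    # Reject empty or non-numeric suffixes so a misnamed pod fails loudly.
    case "$ordinal" in
        ''|*[!0-9]*)
            echo "cannot derive rank from hostname: $host" >&2
            return 1
            ;;
    esac
    echo "$ordinal"
}

# Possible use in the container start script (left commented out here):
#   RANKNUM="$(rank_from_hostname "$(hostname)")" || exit 1
#   exec /apphome/workerApp "$RANKNUM"
```

By default, StatefulSet ordinals start at 0 and run up to one less than the replica count, which lines up with MPI-style ranks. Under this assumption the replica count in values.yaml would still set the number of ranks, and the chart would not need to template a distinct value per pod.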

-- stix
docker
kubernetes
kubernetes-helm
