How are git repositories used by Snakemake via kubernetes?

3/27/2021

The Snakemake documentation here:

https://snakemake.readthedocs.io/en/stable/executing/cloud.html

states the following under the section heading "Executing a Snakemake workflow via kubernetes":

Currently, this mode requires that the Snakemake workflow is stored in a git repository. Snakemake uses git to query necessary source files (the Snakefile, scripts, config, …) for workflow execution and encodes them into the kubernetes job.

This is confusing to me. Looking at the example command line given:

snakemake --kubernetes --use-conda --default-remote-provider $REMOTE --default-remote-prefix $PREFIX

I don't see any reference to a git repository. It seems to me that Snakemake will look for the Snakefile on the local host, in the working directory from which this command is issued. Where does the git repository come in?

-- mcrepeau
kubernetes
snakemake

1 Answer

4/5/2021

Okay, I think I figured this out. In the tutorials included with the documentation, they clone a GitHub repository that contains the Snakefile and other related files. For example, in this tutorial:

https://snakemake.readthedocs.io/en/stable/executor_tutorial/google_lifesciences.html

The command is:

git clone https://github.com/snakemake/snakemake-lsh-tutorial-data

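If you run ls -a in the directory that gets created, you'll see .git and .github subdirectories and a .gitpod.yml file. Of these, .git is what makes the directory a git repository; .github and .gitpod.yml hold GitHub and Gitpod configuration. Apparently snakemake --kubernetes relies on that repository to find the workflow's source files, so the working directory itself is the git repository the documentation refers to, and the command will fail if the Snakefile is not inside one.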
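
As a quick sanity check before submitting anything to the cluster, a minimal Python sketch like the one below looks for a .git entry at or above the working directory. The helper name find_git_root is hypothetical, and this is not how Snakemake itself performs the check; it only assumes that a git checkout is marked by a .git directory (or a .git file, as in worktrees and submodules).

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains a .git entry.

    Hypothetical helper for illustration; returns None if no .git is found.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        # .git is a directory in a normal clone and a plain file in
        # worktrees and submodules, so test for existence only.
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    root = find_git_root(Path.cwd())
    if root is None:
        print("No .git entry found at or above the working directory.")
    else:
        print(f"Git repository root: {root}")
```

Run from the directory that holds the Snakefile, it prints the repository root if one is found. It does not check whether the workflow files are actually tracked by git.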

-- mcrepeau
Source: StackOverflow