In this Kubeflow example https://github.com/kubeflow/examples/blob/master/financial_time_series/tensorflow_model/ml_pipeline.py an ML pipeline is constructed that invokes Python scripts via the command line.
This means that all the .py files being called (e.g. "python3 train.py --param value") have to be in the directory where the process runs. What I don't understand is where exactly I should put those .py files in the context of GCP.
Should I just copy them using Cloud Shell? Or should I add git clone <repository with .py files> to my Dockerfile?
To kickstart KFP development using Python, try the following tutorial: Data passing in Python components.
Should I just copy them using Cloud Shell? Or should I add git clone <repository with .py files> to my Dockerfile?
Ideally, the files should be inside the container image (the Dockerfile method). This ensures maximum reproducibility.
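For instance, a minimal Dockerfile sketch of this approach (the file names, base image, and dependencies below are placeholders for whatever your pipeline actually uses):

```dockerfile
FROM python:3.8-slim
WORKDIR /app

# Install dependencies first so Docker can cache this layer.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Bake the pipeline scripts into the image at build time.
# (Alternatively: RUN git clone <repository with .py files> .)
COPY train.py preprocess.py ./
```

You then build and push the image (e.g. docker build -t gcr.io/<your-project>/trainer:v1 . followed by docker push) and reference that image tag in the pipeline's ContainerOps, so every run executes exactly the code baked into that tag.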
For Python scripts that are not very complex, the Lightweight Python component feature lets you create a component from a Python function. In this case the script's code is stored in the component's command line, so you do not need to upload the code anywhere.
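For example, a sketch against the KFP v1 SDK (the function body is just a stand-in):

```python
from kfp.components import create_component_from_func

def add(a: float, b: float) -> float:
    """The function must be self-contained: any imports it needs go
    inside the body, because only its source is serialized."""
    return a + b

# The function's source code is embedded in the generated component
# definition, so no separate .py file has to live anywhere.
add_op = create_component_from_func(add, base_image='python:3.8')
```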
Putting scripts somewhere remote (e.g. cloud storage or a website) is possible, but can reduce reliability and reproducibility, since what runs depends on the current state of the remote location rather than on a fixed image.
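If you do go that route, it typically looks like a step that fetches the script before running it. A sketch against the KFP v1 SDK, with a hypothetical bucket path:

```python
import kfp.dsl as dsl

@dsl.pipeline(name='remote-script-example')
def remote_script_pipeline():
    # The script is fetched at run time, so each run depends on the
    # bucket's current contents rather than on an immutable image tag.
    dsl.ContainerOp(
        name='train',
        image='google/cloud-sdk:slim',
        command=['sh', '-c'],
        arguments=['gsutil cp gs://my-bucket/train.py . && '
                   'python3 train.py --param value'],
    )
```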