I have a lot of standard runtime Docker images, such as python3 with TensorFlow 1.7 installed, and I want to use these standard images to run customers' code that lives outside of them. The scenario seems quite similar to serverless. So what is the best way to get the code into these runtime containers?
Right now I am using a persistent volume to mount the code into the runtime, but it requires a lot of work. Is there an easier solution for this?
UPDATE
What is the workflow for Google Cloud ML Engine or FloydHub? I think what I want is similar: they have a command-line tool that combines local code with a standard environment.
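For example, FloydHub's CLI works roughly like this (project name and environment tag here are just placeholders):
$ floyd init my-project
$ floyd run --env tensorflow-1.7 "python train.py"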
If this is your case:
You have a Docker image with code in it.
Aim: to update the code inside the Docker image.
Solution:
Run a bash session in the Docker image with a directory from your file system mounted as a volume.
Place the updated code in the volume directory.
From the bash session, replace the existing code with the updated code from the volume.
Sample Commands
Assume ~/my-dir on your file system contains the new code, updated-code.py:
$ docker run -it --volume ~/my-dir:/workspace --workdir /workspace my-docker-image bash
A new bash session will now start inside the Docker container. Assuming the code lives at /code/code.py inside the container, you can update it with:
$ cp /workspace/updated-code.py /code/code.py
Or you can create a new directory and place the code there:
$ mkdir /my-new-dir && cp /workspace/updated-code.py /my-new-dir/code.py
Now the Docker container contains the updated code, but the changes will be lost once the container is removed; running the image again always starts a fresh container from the original image. To create a Docker image with the latest code, save this state of the container using docker commit. Keep the container running and open a new tab in the terminal.
$ docker ps
will list all running Docker containers. Find the CONTAINER ID of your container and use it to commit:
$ docker commit id-of-your-container new-docker-image-name
Now run the Docker image with the latest code:
$ docker run -it new-docker-image-name
Note: it is recommended to remove the old Docker image with the docker rmi command, as Docker images can take up a lot of disk space.
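For example, assuming the old image was named my-docker-image:
$ docker rmi my-docker-image
If a stopped container still references the image, remove that container with docker rm first.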
We're dealing with a similar challenge. Our approach is to build a static Docker image in which TensorFlow, Python, etc. are built once and maintained.
Each user has a PVC (persistent volume claim) where large files that may change, such as datasets and workspaces, live.
Then we have a bash script that launches the cluster resources and syncs the workspace using ksync (like rsync for a Kubernetes cluster).
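A rough sketch of that sync step, assuming pods labeled app=my-workspace and a /workspace directory inside the container (the label and paths are our own placeholders):
# One-time setup: install ksync's cluster-side components.
$ ksync init
# Run the local watcher that performs the syncing.
$ ksync watch &
# Keep ~/my-dir on this machine in sync with /workspace in
# every pod matching the selector.
$ ksync create --selector=app=my-workspace ~/my-dir /workspace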
Following cloud-native practices, code should be immutable, and releases and their dependencies uniquely identifiable for repeatability, replicability, etc. In short: you should really create images with your source code in them.
In your case, that would mean basing your Dockerfile on the upstream python3 or TF images. There are a couple of projects that may help with the workflow above (code + build-release-run).
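For example, a minimal Dockerfile along those lines might look like this (the base tag and the src/train.py layout are just placeholders for your own setup):
# Pin the upstream TensorFlow release you target.
FROM tensorflow/tensorflow:1.7.0
# Bake the code into the image so every release is immutable
# and uniquely identifiable by its image tag.
COPY src /src
WORKDIR /src
CMD ["python", "train.py"]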
Hope it helps --jjo
One of the best practices is NOT to mount the code into the container from a volume, but to create a client-specific image that uses your TensorFlow image as a base image:
# Your base image comes in here.
FROM aisensiy/tensorflow:1
# Copy the client code into your image.
COPY src /src
# As Kubernetes will run your containers with an
# arbitrary UID, we set the user to nobody.
USER nobody
# ... and they will run with GID 0, so we
# need to change the group to 0 and make
# your stuff accessible to GID 0.
RUN \
chgrp -R 0 /src && \
chmod -R g=u /src && \
true
CMD ["/usr/bin/python", ...]
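Building and running the client-specific image is then the usual workflow (the image name is just an example):
$ docker build -t my-client-image:1 .
$ docker run my-client-image:1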
Some more best practices are provided in the OpenShift documentation: https://docs.openshift.org/latest/creating_images/guidelines.html
The code file can also be passed via stdin when the container is started, which lets you run arbitrary code without rebuilding the image. See the example below:
root@node-1:~# cat hello.py
print("This line will be printed.")
root@node-1:~#
root@node-1:~# docker run --rm -i python python < hello.py
This line will be printed.
root@node-1:~#
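The same trick stretches to more than one file by piping an archive instead of a single script; a rough sketch, assuming your project is packed as app.tar with main.py at its root:
$ tar -cf app.tar main.py utils.py
$ docker run --rm -i python sh -c 'mkdir /app && tar -xf - -C /app && python /app/main.py' < app.tar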