I have a Python application whose Docker build takes about 15-20 minutes. Here is, more or less, what my Dockerfile looks like:
FROM ubuntu:18.04
...
COPY . /usr/local/app
RUN pip install -r /usr/local/app/requirements.txt
...
CMD ...
Now if I use Skaffold, any code change triggers a rebuild, and it is going to reinstall all requirements (since everything from the COPY step onwards is rebuilt), regardless of whether they were already installed. In docker-compose this issue would be solved using volumes. In Kubernetes, if we use volumes in the following way:
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - image: test:test
    name: test-container
    volumeMounts:
    - mountPath: /usr/local/venv  # this is the directory of the
                                  # virtualenv of python
      name: test-volume
  volumes:
  - name: test-volume
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
will this extra requirements install be avoided with Skaffold?
I can't speak for Skaffold specifically, but the container image build can be improved. If layer caching is available, the dependencies are only reinstalled when your requirements.txt changes. This is documented in the "ADD or COPY" Best Practices.
FROM ubuntu:18.04
...
COPY requirements.txt /usr/local/app/
RUN pip install -r /usr/local/app/requirements.txt
COPY . /usr/local/app
...
CMD ...
You may sometimes need to trigger an update if the module versions are loosely defined and, say, you want a new patch version. I've found requirements should be pinned to specific versions so they don't slide underneath your application without your knowledge/testing.
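For example, a fully pinned requirements.txt (the packages and versions here are purely illustrative) would look like:

flask==1.0.2
requests==2.22.0
gunicorn==19.9.0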
For kaniko builds to make use of a cache in a cluster where there is no persistent storage by default, kaniko needs either a persistent volume mounted (--cache-dir) or a container image repo (--cache-repo) with the layers available.
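As a sketch of what that looks like (the registry name and paths here are placeholders, not part of the setup above), a kaniko executor invocation using a cache repo could be:

/kaniko/executor \
  --dockerfile=Dockerfile \
  --context=dir:///workspace \
  --destination=registry.example.com/test:test \
  --cache=true \
  --cache-repo=registry.example.com/test/cache

With --cache=true, kaniko checks the cache repo for each cacheable layer before executing the corresponding Dockerfile command and reuses it on a hit, so an unchanged requirements.txt install is pulled rather than rebuilt.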
If your goal is to speed up the dev process: instead of triggering an entirely new deployment process every time you change a line of code, you can switch to a sync-based dev process, i.e. deploy once and then update the files within the running containers whenever you edit code.
Skaffold supports file sync to directly update files inside the deployed containers if you change them on your local machine. However, the docs state "File sync is alpha" (https://skaffold.dev/docs/how-tos/filesync/), and from working with it a while ago I can completely agree: the sync mechanism is only one-directional (no sync from the container back to local) and pretty buggy, i.e. it crashes frequently when switching git branches, installing dependencies etc., which can be pretty annoying.
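For reference, the alpha sync configuration was a simple map of source glob to destination inside the artifact. A minimal sketch for a Python project (the glob pattern is an assumption; check the docs for your Skaffold version's exact schema) might be:

apiVersion: skaffold/v1beta13
kind: Config
build:
  artifacts:
  - image: test:test
    sync:
      # copy changed .py files into the running container instead of rebuilding
      '**/*.py': .

With this in place, edits to matching files are copied into the running container, and only changes to non-matching files (such as requirements.txt) trigger a full rebuild.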
If you want a more stable alternative for sync-based Kubernetes development which is very easy to get started with, take a look at DevSpace: https://github.com/devspace-cloud/devspace
I am one of the maintainers of DevSpace and started the project because Skaffold was much too slow for our team and it did not have a file sync back then.
@Matt's answer is a great best practice (+1) - skaffold in and of itself won't solve the underlying layer cache invalidation issue, which results in having to re-install the requirements during every build.
For additional performance, you can cache all the Python packages in a volume mounted in your pod, for example:
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - image: test:test
    name: test-container
    volumeMounts:
    - mountPath: /usr/local/venv
      name: test-volume
    - mountPath: /root/.cache/pip
      name: pip-cache
  volumes:
  - name: test-volume
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
  - name: pip-cache
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
That way, if the build cache is ever invalidated and you have to re-install your requirements.txt, you'll save some time by fetching the packages from the cache instead of downloading them again.
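Note that /root/.cache/pip is pip's default cache directory when running as root, so a plain pip install will use the mounted cache automatically; you can also point pip at it explicitly (a sketch, assuming the app path from the question):

pip install --cache-dir=/root/.cache/pip -r /usr/local/app/requirements.txt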
If you're building with kaniko, you can also cache base images to a persistent disk using the kaniko-warmer, for example:
...
    volumeMounts:
    ...
    - mountPath: /cache
      name: kaniko-warmer
  volumes:
  ...
  - name: kaniko-warmer
    awsElasticBlockStore:
      volumeID: <volume-id>
      fsType: ext4
Running the kaniko-warmer inside the pod: docker run --rm -it -v /cache:/cache --entrypoint /kaniko/warmer gcr.io/kaniko-project/warmer --cache-dir=/cache --image=python:3.7-slim --image=nginx:1.17.3. Your skaffold.yaml might look something like:
apiVersion: skaffold/v1beta13
kind: Config
build:
  artifacts:
  - image: test:test
    kaniko:
      buildContext:
        localDir: {}
      cache:
        hostPath: /cache
  cluster:
    namespace: jx
    dockerConfig:
      secretName: jenkins-docker-cfg
  tagPolicy:
    envTemplate:
      template: '{{.DOCKER_REGISTRY}}/{{.IMAGE_NAME}}'
deploy:
  kubectl: {}