I have a monorepo setup with a growing number of services services. When I deploy the application I run a command and every service will be rebuilt and the final Docker images will be published. But as the number of services grows the time it takes to rebuilt all of them gets longer and longer, although changes were made to only a few of them.
Why does my setup rebuilt all Docker images although only a few have changed? My goal is to rebuilt and publish only the images that have actually changed.
I am using Bazel to build my Docker images, thus in the root of my project there is one BUILD
file which contains the target I run when I want to deploy. It is just a collection of k8s_objects
, where every service is included:
load("@io_bazel_rules_k8s//k8s:objects.bzl", "k8s_objects")
k8s_objects(
name = "kubernetes_deployment",
objects = [
"//services/service1",
"//services/service2",
"//services/service3",
"//services/service4",
# ...
]
)
Likewise there is one BUILD
file for every service which first creates a Typescript library from all the source files, then creates the Node.Js image and finally passes the image to the Kubernetes object:
load("@npm_bazel_typescript//:index.bzl", "ts_library")
ts_library(
name = "lib",
srcs = glob(
include = ["**/*.ts"],
exclude = ["**/*.spec.ts"]
),
deps = [
"//packages/package1",
"//packages/package2",
"//packages/package3",
],
)
load("@io_bazel_rules_docker//nodejs:image.bzl", "nodejs_image")
nodejs_image(
name = "image",
data = [":lib", "//:package.json"],
entry_point = ":index.ts",
)
load("@k8s_deploy//:defaults.bzl", "k8s_deploy")
k8s_object(
name = "service",
template = ":service.yaml",
kind = "deployment",
cluster = "my-cluster"
images = {
"gcr.io/project/service:latest": ":image"
},
)
Note that the Typescript lib also depends on some packages, which should also be accounted for when redeploying!
To deploy I run bazel run :kubernetes_deployment.apply
Initially one reason I decided to choose Bazel is because I thought it would handle building only changed services itself. But obviously this is either not the case or my setup is faulty in some way.
If you need more detailed insight into the project you can check it out here: https://github.com/flolude/cents-ideas
Looks like Bazel repo itself does something similar:
https://github.com/bazelbuild/bazel/blob/ef0f8e61b5d3a139016c53bf04361a8e9a09e9ab/scripts/ci/ci.sh
The rough steps are:
kind(.*_binary, rdeps(//..., set(file1.txt file2.txt)))
will find all binary targets which are dependents of either file1.txt
or file2.txt
)You will need to adapt this script to your need (e.g. make sure it finds docker image targets)
To find out the kind of a target, you can use bazel query //... --output label_kind
EDIT: A bit of warning for anyone who wants to go down this rabbit hole (especially if you absolutely do not want to miss tests in CI):
You need to think about:
I think there is a ton of complexity here going down this route (if even possible). It might be less erro-prone to rely on Bazel itself to figure out what changed, using remote caches & --subcommands
to calculate which side-effects need to be performed.