Our CI pipeline execution time for the main repository had ballooned from a manageable 12 minutes to an unacceptable 48 minutes. The trigger wasn’t a major feature release but a slow, creeping accumulation of services and tools. The repository housed Scala microservices for the core backend, Python scripts for data processing and ML model inference, and a TypeScript/React frontend. A trivial change in a Python utility’s docstring would inexplicably trigger a full rebuild and test run of unrelated Scala services. The developer feedback loop was fundamentally broken, and productivity was grinding to a halt.
The initial setup was a collection of disparate, language-specific tools stitched together with shell scripts in the CI pipeline. Scala services used SBT, Python used a requirements.txt with pip, and the frontend used npm. This approach is simple to start but scales disastrously in a monorepo. It lacks a unified dependency graph: it cannot understand that //services/order-api (Scala) and //tools/fraud-detection (Python) both depend on //protos/user/v1. Without this knowledge, the only “safe” option in CI is to rebuild and test everything, every time. This was the core pain point we had to solve. The objective was clear: achieve incremental builds and tests, supported by a shared remote cache that works across our entire polyglot codebase.
After evaluating several monorepo build tools, we settled on Bazel. Its primary advantage is its emphasis on hermetic, reproducible builds. Bazel forces you to explicitly declare all inputs and outputs for a build step. This strictness allows it to construct a precise, cross-language dependency graph of the entire repository. If the inputs to a build action haven’t changed, its output can be safely retrieved from a cache. This is not just a file-level cache; it’s a content-addressed cache of build actions. The decision came with the explicit acknowledgment of Bazel’s steep learning curve and verbose configuration. However, for our specific problem of CI performance in a complex, polyglot environment, the correctness and caching capabilities outweighed the initial setup cost.
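To make that action model concrete, here is a toy genrule, not part of our repository, showing the shape of every Bazel action: declared inputs, declared outputs, and a command. Bazel hashes the inputs (source files, tools, flags); if that hash is already known, the declared output comes straight from the cache instead of being rebuilt.
# Hypothetical BUILD snippet, for illustration only.
genrule(
    name = "release_notes",
    srcs = ["CHANGELOG.md"],       # declared inputs; undeclared files are invisible to the action
    outs = ["release_notes.txt"],  # declared outputs, stored content-addressed in the cache
    cmd = "head -n 20 $(location CHANGELOG.md) > $@",
)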
Workspace and Toolchain Configuration
The first step is establishing the root of the Bazel workspace, which is marked by a WORKSPACE file (it can start out empty). The real work is in defining the toolchains and external dependencies for each language: this is where we pull in the rules for Scala, Python, and Node.js, and configure them. Our WORKSPACE file begins by loading the necessary dependencies for the build rules themselves.
# WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# --- Scala Toolchain ---
# rules_scala provides the core rules for building Scala code.
# We also pull in rules_jvm_external for managing Maven/Ivy dependencies.
http_archive(
    name = "rules_scala",
    sha256 = "c3a77197a616214041d00fad218158d641f98d65c3645a2717906ca399f69f41",
    strip_prefix = "rules_scala-6.2.0",
    url = "https://github.com/bazelbuild/rules_scala/releases/download/v6.2.0/rules_scala-v6.2.0.tar.gz",
)

load("@rules_scala//scala:scala.bzl", "scala_register_toolchains")

scala_register_toolchains()

http_archive(
    name = "rules_jvm_external",
    sha256 = "623cf37424125b9868352b3c1d556556b5f631d2cfad3aa87332b3c95944063f",
    strip_prefix = "rules_jvm_external-5.3",
    url = "https://github.com/bazelbuild/rules_jvm_external/archive/refs/tags/5.3.zip",
)

load("@rules_jvm_external//:defs.bzl", "maven_install")

# --- Python Toolchain ---
# rules_python is the standard for integrating Python with Bazel.
http_archive(
    name = "rules_python",
    sha256 = "6447a2a1970ed3635417d7b1e4c8515c150c95a28b6d4e8c148413b65e714441",
    strip_prefix = "rules_python-0.25.0",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.25.0/rules_python-0.25.0.tar.gz",
)

load("@rules_python//python:repositories.bzl", "python_register_toolchains")

# We must specify a Python version to use for the toolchain.
python_register_toolchains(
    name = "python311",
    python_version = "3.11",
)

load("@python311//:defs.bzl", "interpreter")
load("@rules_python//python:pip.bzl", "pip_parse")

# --- Node.js/TypeScript Toolchain ---
# rules_nodejs manages Node.js, npm/yarn, and TypeScript compilation.
http_archive(
    name = "rules_nodejs",
    sha256 = "32354a5c53108502f9e421f6d3f2a8f895a0951b14e08c84a5a5196f353a27c0",
    strip_prefix = "rules_nodejs-5.8.2",
    url = "https://github.com/bazelbuild/rules_nodejs/releases/download/5.8.2/rules_nodejs-5.8.2.tar.gz",
)

load("@rules_nodejs//:index.bzl", "node_repositories")

# --- Protobuf Toolchain ---
# rules_proto is necessary for defining and compiling our shared protocol buffers.
http_archive(
    name = "rules_proto",
    sha256 = "83f6f1c7cd395b0f51d3886b515b0b6e98782a1f0547012b6a48e75204481358",
    strip_prefix = "rules_proto-5.3.0-2.1.0",
    urls = ["https://github.com/bazelbuild/rules_proto/releases/download/5.3.0-2.1.0/rules_proto-5.3.0-2.1.0.tar.gz"],
)
load("@rules_proto//proto:repositories.bzl", "proto_repositories")
proto_repositories()
This file sets up the foundation. Now, we define the actual third-party dependencies. For Scala, this means Maven artifacts. For Python, it’s PyPI packages. For the frontend, it’s npm packages.
For Scala dependencies, we use maven_install, which pins versions via a JSON lock file.
# WORKSPACE (continued)

# Pin Scala dependencies
maven_install(
    artifacts = [
        "org.scala-lang:scala-library:2.13.10",
        "com.typesafe.akka:akka-actor-typed_2.13:2.8.0",
        "com.typesafe.akka:akka-stream_2.13:2.8.0",
        "com.typesafe.akka:akka-http_2.13:10.5.0",
        "io.grpc:grpc-netty-shaded:1.53.0",
        # ... test dependencies
        "org.scalatest:scalatest_2.13:3.2.15",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
    # This generates a lock file to ensure reproducible builds.
    lock_file = "//:maven_lock.json",
)

# Pin Python dependencies from a requirements.txt file.
pip_parse(
    name = "pip_dependencies",
    requirements_lock = "//:requirements_lock.txt",
    python_interpreter_target = interpreter,
)

load("@pip_dependencies//:requirements.bzl", "install_deps")

install_deps()

# Pin Node.js dependencies from package.json
node_repositories(
    package_json = ["//:package.json"],
    lock_file = "//:package-lock.json",
)
The key here is the use of lock files (maven_lock.json, requirements_lock.txt, package-lock.json). This is critical for hermeticity. Bazel builds must not depend on network access to fetch dependencies during the build itself. Everything is fetched once, version-pinned, and stored under Bazel’s control.
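A note on producing the Python lock file: we keep a loose-constraint requirements.in (hypothetical name) and generate requirements_lock.txt from it. One way to do that inside Bazel is rules_python’s compile_pip_requirements helper; a minimal sketch, assuming it is available in the rules_python version pinned above:
# BUILD.bazel (repo root), sketch
load("@rules_python//python:pip.bzl", "compile_pip_requirements")

# Creates a :requirements.update target that regenerates the lock file
# consumed by pip_parse in the WORKSPACE.
compile_pip_requirements(
    name = "requirements",
    requirements_in = "requirements.in",
    requirements_txt = "requirements_lock.txt",
)
Running bazel run //:requirements.update then refreshes the pinned versions in one place.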
The Cross-Language Nexus: Protobuf
The most critical piece for demonstrating cross-language dependency management is a shared data contract. Protocol Buffers are ideal for this. We define a simple User service.
File structure:
.
└── protos/
└── user/
└── v1/
├── BUILD.bazel
└── user.proto
The .proto file is standard.
// protos/user/v1/user.proto
syntax = "proto3";

package user.v1;

option java_package = "com.example.user.v1";
option java_multiple_files = true;

message User {
  string user_id = 1;
  string email = 2;
  int64 created_at = 3;
}

message GetUserRequest {
  string user_id = 1;
}

message GetUserResponse {
  User user = 1;
}

service UserService {
  rpc GetUser(GetUserRequest) returns (GetUserResponse);
}
The magic is in the BUILD.bazel file, which defines how to compile this .proto file into both Scala and Python code.
# protos/user/v1/BUILD.bazel
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_scala//scala:scala.bzl", "scala_proto_library")

# Note: Python proto rules are often part of rules_python or a separate dependency.
# Assuming they are configured.
load("@rules_python//python:proto.bzl", "py_proto_library")

# 1. Define the abstract protocol buffer library.
# This target itself doesn't generate code, it just represents the .proto files.
proto_library(
    name = "user_proto",
    srcs = ["user.proto"],
    visibility = ["//visibility:public"],
)

# 2. Generate Scala code from the proto_library.
# This creates a target that Scala services can depend on.
scala_proto_library(
    name = "user_scala_proto",
    deps = [":user_proto"],
    visibility = ["//visibility:public"],
)

# 3. Generate Python code from the same proto_library.
# This creates a target that Python tools can depend on.
py_proto_library(
    name = "user_py_proto",
    deps = [":user_proto"],
    visibility = ["//visibility:public"],
)
Now, any change to user.proto will invalidate :user_proto, which in turn invalidates both :user_scala_proto and :user_py_proto, and consequently any Scala or Python code that depends on them. (You can inspect this chain directly with bazel query 'rdeps(//..., //protos/user/v1:user_proto)'.) This is the explicit dependency graph we were missing.
Implementing the Scala Service
Our Scala user service will implement the gRPC service defined in the proto.
File structure:
.
└── services/
└── user_api/
├── BUILD.bazel
└── src/
└── main/
└── scala/
└── com/
└── example/
└── userapi/
├── Main.scala
└── UserServiceImpl.scala
The implementation is standard Akka gRPC.
// services/user_api/src/main/scala/com/example/userapi/UserServiceImpl.scala
package com.example.userapi

import com.example.user.v1._

import scala.concurrent.Future

class UserServiceImpl extends UserService {
  override def getUser(in: GetUserRequest): Future[GetUserResponse] = {
    // In a real-world project, this would fetch from a database.
    // Here we return a hardcoded user for simplicity.
    println(s"Received request for user ID: ${in.userId}")
    if (in.userId == "123") {
      val user = User(
        userId = "123",
        email = "user@example.com",
        createdAt = System.currentTimeMillis()
      )
      Future.successful(GetUserResponse(Some(user)))
    } else {
      Future.failed(new RuntimeException("User not found"))
    }
  }
}
The BUILD.bazel file wires everything together.
# services/user_api/BUILD.bazel
load("@rules_scala//scala:scala.bzl", "scala_binary", "scala_library", "scala_test")

# Define a library for our service implementation.
# This is the core logic.
scala_library(
    name = "lib",
    srcs = glob(["src/main/scala/**/*.scala"]),
    deps = [
        # This is the critical dependency on the generated Scala code.
        "//protos/user/v1:user_scala_proto",
        "@maven//:com_typesafe_akka_akka_actor_typed_2_13",
        "@maven//:com_typesafe_akka_akka_stream_2_13",
        "@maven//:com_typesafe_akka_akka_http_2_13",
        "@maven//:io_grpc_grpc_netty_shaded",
    ],
)

# Define a runnable binary.
# This depends on our library and provides the main entry point.
scala_binary(
    name = "user_api_server",
    main_class = "com.example.userapi.Main",
    deps = [":lib"],
)

# Define a test suite for our library.
scala_test(
    name = "tests",
    srcs = glob(["src/test/scala/**/*.scala"]),
    deps = [
        ":lib",
        "@maven//:org_scalatest_scalatest_2_13",
    ],
)
With this file, we can now run bazel build //services/user_api:user_api_server or bazel test //services/user_api:tests. Bazel will first compile the proto, then compile the Scala library against the generated code, and finally build the binary or run the tests.
Implementing the Python Tool
Now we create a simple Python client that uses the same proto definition.
File structure:
.
└── tools/
└── user_client/
├── BUILD.bazel
├── requirements.txt # for pip_parse
└── client.py
The Python code imports the generated stubs.
# tools/user_client/client.py
import grpc
import logging

# This import path is determined by Bazel from the proto definition.
from protos.user.v1 import user_pb2
from protos.user.v1 import user_pb2_grpc


def run(user_id: str):
    """
    Makes a gRPC request to the user service.
    In a real scenario, the channel address would come from config.
    """
    logging.basicConfig(level=logging.INFO)
    try:
        with grpc.insecure_channel('localhost:8080') as channel:
            stub = user_pb2_grpc.UserServiceStub(channel)
            request = user_pb2.GetUserRequest(user_id=user_id)
            logging.info(f"Requesting user info for ID: {user_id}")
            response = stub.GetUser(request)
            logging.info(f"Received user: ID={response.user.user_id}, Email={response.user.email}")
    except grpc.RpcError as e:
        logging.error(f"RPC failed: {e.code()} - {e.details()}")


if __name__ == '__main__':
    # This is a simple example; proper arg parsing should be used.
    run("123")
The BUILD.bazel file for the Python tool mirrors the Scala one, but uses the Python rules.
# tools/user_client/BUILD.bazel
load("@rules_python//python:defs.bzl", "py_binary")
load("@pip_dependencies//:requirements.bzl", "requirement")

# A runnable Python binary.
py_binary(
    name = "user_client",
    srcs = ["client.py"],
    # Specify the Python version to ensure consistency.
    python_version = "PY3",
    deps = [
        # The key dependency on the generated Python code from our shared proto.
        "//protos/user/v1:user_py_proto",
        # Dependencies from our requirements_lock.txt file.
        requirement("grpcio"),
        requirement("protobuf"),
    ],
)
Now, bazel run //tools/user_client will build and execute the client. Crucially, it depends on the same //protos/user/v1:user_proto target as the Scala service.
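To keep bazel test //... meaningful on the Python side as well, the client can get a test target alongside the binary. A minimal sketch, assuming a hypothetical client_test.py next to client.py (the requirement loader is already imported at the top of this BUILD file):
# tools/user_client/BUILD.bazel (continued, sketch)
load("@rules_python//python:defs.bzl", "py_test")

py_test(
    name = "client_test",
    srcs = [
        "client.py",
        "client_test.py",  # hypothetical test module, not shown here
    ],
    deps = [
        "//protos/user/v1:user_py_proto",
        requirement("grpcio"),
        requirement("protobuf"),
    ],
)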
The Frontend Application with Emotion
To complete the picture, we add a simple React/TypeScript frontend. While it doesn’t directly consume the Protobuf-generated code, it depends on the API contract implicitly. In a more advanced setup, we could generate an OpenAPI spec from the Scala service and then generate a TypeScript client from that spec, creating another explicit link in the dependency graph. For now, we’ll keep it simple.
.
└── web/
└── app/
├── BUILD.bazel
├── package.json
├── src/
│ └── App.tsx
└── tsconfig.json
A sample component using Emotion for styling:
// web/app/src/App.tsx
import React, { useState, useEffect } from 'react';
import styled from '@emotion/styled';

const AppContainer = styled.div`
  font-family: sans-serif;
  text-align: center;
  padding: 2rem;
  background-color: #f0f2f5;
`;

const UserCard = styled.div`
  background-color: white;
  border-radius: 8px;
  padding: 1.5rem;
  margin: 1rem auto;
  max-width: 400px;
  box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
  border: 1px solid #ddd;
`;

const UserId = styled.p`
  color: #555;
  font-size: 0.9rem;
  code {
    background-color: #eee;
    padding: 2px 4px;
    border-radius: 4px;
  }
`;

type UserProfile = { userId: string; email: string };

const App = () => {
  // Typed explicitly so setUser() accepts the object below; useState(null) alone would infer 'null'.
  const [user, setUser] = useState<UserProfile | null>(null);

  // This simulates fetching data from our backend service.
  useEffect(() => {
    // In a real app, this would be a fetch to an endpoint like '/api/v1/users/123'
    // which is served by our Scala user_api_server.
    setTimeout(() => {
      setUser({ userId: '123', email: 'user@example.com' });
    }, 1000);
  }, []);

  return (
    <AppContainer>
      <h1>User Profile</h1>
      {user ? (
        <UserCard>
          <h2>{user.email}</h2>
          <UserId><code>ID: {user.userId}</code></UserId>
        </UserCard>
      ) : (
        <p>Loading user data...</p>
      )}
    </AppContainer>
  );
};

export default App;
The BUILD.bazel file uses rules_nodejs to build the application. This typically involves compiling TypeScript and then bundling it.
# web/app/BUILD.bazel
load("@npm//:defs.bzl", "npm_link_all_packages")
load("@rules_nodejs//nodejs:defs.bzl", "nodejs_binary")
load("@rules_nodejs//typescript:defs.bzl", "ts_project")

# This creates a node_modules directory symlink for tools to use
npm_link_all_packages(name = "node_modules")

# Compile our TypeScript sources into JavaScript
ts_project(
    name = "compile_ts",
    srcs = glob(["src/**/*.ts*"]),
    tsconfig = "//web/app:tsconfig.json",
    deps = ["//web/app:node_modules"],
)

# A simplified build script target. In a real project, this would use
# a bundler like Vite or Webpack via a nodejs_binary rule.
# This target represents the final, bundled web assets.
nodejs_binary(
    name = "build_app",
    data = [
        ":compile_ts",
        # ... bundler config files
    ],
    # The entry_point would be the bundler's CLI script.
    entry_point = "@npm//:node_modules/vite/bin/vite.js",
    args = ["build"],
)
Enabling the Remote Cache
With the entire project structure defined in Bazel, enabling remote caching is surprisingly simple. It’s a configuration change, not a code change. We’ll use the open-source bazel-remote server, which can be run easily via Docker.
A docker-compose.yml to run the cache:
# docker-compose.yml
version: '3.7'

services:
  bazel-remote-cache:
    image: buchgr/bazel-remote-cache:v2.6.0
    container_name: bazel-remote-cache
    ports:
      - "9092:9092" # gRPC port
    volumes:
      - bazel_cache_data:/data
    command: --max_size=10 # Max cache size in GB

volumes:
  bazel_cache_data:
Next, we instruct Bazel to use this cache by adding configuration to a .bazelrc file at the root of the repository.
# .bazelrc
# Common settings for all builds
build --disk_cache=~/.bazel/disk_cache
# Configuration for CI builds to use the remote cache
build:ci --remote_cache=grpc://localhost:9092
test:ci --remote_cache=grpc://localhost:9092
run:ci --remote_cache=grpc://localhost:9092
# Enable remote caching for all commands when using the :ci config
build:ci --remote_upload_local_results=true
test:ci --remote_upload_local_results=true
run:ci --remote_upload_local_results=true
# A pitfall is that Bazel can be picky about environment variables.
# This ensures a more consistent build environment between local and CI.
build:ci --incompatible_strict_action_env
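Two optional cache flags are worth evaluating once the basics work; both are standard Bazel options, but whether they pay off depends on your Bazel version, artifact sizes, and network:
# .bazelrc (optional additions, to be benchmarked in your environment)
# Compress cache blobs in transit and at rest (Bazel 5.2+).
build:ci --remote_cache_compression
# Only download top-level outputs; intermediate artifacts stay in the cache.
build:ci --remote_download_toplevel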
Now, in our CI script, we can run bazel test --config=ci //... across the whole repository and let Bazel decide what actually needs to execute.
The flow is as follows:
- CI job starts; docker-compose up -d brings up the cache.
- CI runs bazel test --config=ci //... (every target in the workspace).
- For each build action (e.g., compiling a Scala file), Bazel calculates a hash of its inputs (the file itself, compiler version, dependencies, compiler flags).
- It first checks the remote cache for an existing result for that hash.
- If found (a cache hit), the output artifact is downloaded immediately. The action is not re-executed.
- If not found (a cache miss), the action is executed locally. Upon completion, the output artifact is uploaded to the remote cache.
The first full build will be slow as it populates the cache. But subsequent builds are dramatically faster. If a developer pushes a change to only the Python client’s logging statement, the CI run will look like this:
- //protos/**: CACHED (no change)
- //services/user_api/**: CACHED (no change, and its dependencies are unchanged)
- //web/app/**: CACHED (no change)
- //tools/user_client/**: the py_binary rule will be re-run because its source file changed. The test for it will also re-run.
The CI pipeline time drops from 48 minutes to under 5 minutes, as the vast majority of work is now just downloading pre-computed artifacts.
graph TD
    subgraph CI Pipeline
        direction LR
        A[Start CI Job] --> B{bazel test //...};
        B --> C{Check Remote Cache};
        C -- Hit --> D[Download Artifact];
        C -- Miss --> E[Execute Action Locally];
        E --> F[Upload Artifact];
        F --> G[Test Complete];
        D --> G;
    end
    subgraph Dependency Graph in Bazel
        P["protos/user/v1/..._proto"]
        S["services/user_api/..._library"]
        PY["tools/user_client/..._binary"]
        W["web/app/..._bundle"]
        P --> S
        P --> PY
    end
    style S fill:#d4f0f0,stroke:#333,stroke-width:2px
    style PY fill:#fff0b3,stroke:#333,stroke-width:2px
    style W fill:#f5c2e0,stroke:#333,stroke-width:2px
This journey from a slow, unreliable build script orchestra to a fast, correct, and cached build system was a significant engineering investment. The main lingering issue is the cognitive overhead for new developers. Onboarding now requires a dedicated session on Bazel concepts like targets, visibility, and BUILD file syntax. IDE integration, particularly with IntelliJ’s Scala plugin, required careful configuration to make it recognize Bazel-managed dependencies. Furthermore, the remote cache itself is a piece of infrastructure that needs monitoring and maintenance; its storage is not infinite, and a cache eviction policy is necessary for long-term use. The next logical step in this evolution is to move from remote caching to remote execution, which would offload the build actions themselves to a dedicated cluster, further parallelizing the workload and reducing CI runner resource requirements.
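For reference, that switch is largely more of the same .bazelrc work; a sketch with a hypothetical endpoint, assuming a REAPI-compatible execution service:
# .bazelrc (future remote-execution config, sketch; endpoint is hypothetical)
build:remote --remote_executor=grpc://build-cluster.internal:8980
build:remote --remote_instance_name=main
# With remote workers, parallelism is no longer bounded by the CI runner.
build:remote --jobs=200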