Kernel-build benchmarks
Benchmark pipelines that compile a real Linux kernel through hpcc and report cache hit rate, cold-vs-warm wall time, and entry counts. They cover every shipped dispatch path:
| Script | Mode | hpcc surface exercised |
|---|---|---|
| kernel-bench.sh | local, source mode picked by HPCC_BENCH_SOURCE_MODE | hpcc wrap → in-process compile → disk cache |
| kernel-bench-both.sh | local, runs once per source mode | wraps kernel-bench.sh; prints side-by-side summary |
| kernel-bench-fc.sh | firecracker (P4), source mode picked by HPCC_BENCH_SOURCE_MODE | hpcc wrap → dispatch → scheduler → worker → FC VM |
| kernel-bench-fc-both.sh | firecracker, runs once per source mode | wraps kernel-bench-fc.sh; prints side-by-side summary |
Both benches default to source_mode = "cas"; pass
HPCC_BENCH_SOURCE_MODE=preprocessed (or use a -both.sh wrapper)
to exercise the preprocessed-bytes path. source_mode drives both
the local cache-key algorithm (cas → manifest digest, preprocessed →
hashed gcc -E output) and, under remote dispatch, the wire-format
choice. CAS mode also drops a .hpcc marker at the kernel root so
manifest paths normalize project-relative — the gate for cross-worker
/ cross-developer compile-result hits.
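As a minimal sketch of the mode selection described above, the helper below resolves the source mode from the environment and rejects anything other than the two documented values. The function name `resolve_source_mode` is hypothetical and not taken from the bench scripts; only the variable name, the two accepted values, and the `cas` default come from this README.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the bench scripts: resolve the source
# mode from HPCC_BENCH_SOURCE_MODE, defaulting to "cas" as documented above.
resolve_source_mode() {
    mode="${HPCC_BENCH_SOURCE_MODE:-cas}"
    case "$mode" in
        cas|preprocessed)
            printf '%s\n' "$mode"
            ;;
        *)
            echo "unsupported HPCC_BENCH_SOURCE_MODE: $mode" >&2
            return 1
            ;;
    esac
}
```

In practice the variable is simply set on the command line, for example `HPCC_BENCH_SOURCE_MODE=preprocessed ./bench/kernel-bench.sh`.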
Each script:
- Builds the hpcc binary from source.
- Materializes the right config (a single disk cache for local mode; a remote = enabled TOML pointing at the in-process Phase 4 stack for FC mode).
- Shallow-clones a pinned Linux tag (v7.0 by default) and runs the chosen kconfig target.
- Builds the kernel once cold, then warm HPCC_BENCH_WARM_RUNS times (make clean before each warm run, so only object files vanish; .config and the prepared sources stay).
- Writes a per-run report.md + stats.json under bench/results/<timestamp>-<mode>/ plus the raw build logs.
- Exits non-zero if the hit-rate / warm-vs-cold threshold isn't met.
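The cold-then-warm sequence in the steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scripts' actual code: `time_build` and `run_passes` are hypothetical names, the build and clean commands are passed in as plain strings, and timing uses whole seconds from `date +%s`.

```shell
#!/bin/sh
# Sketch only: the real scripts may time and log their builds differently.

# Run a command string and print its wall time in whole seconds.
time_build() {
    start=$(date +%s)
    sh -c "$1" >/dev/null 2>&1 || return 1
    end=$(date +%s)
    echo $((end - start))
}

# Usage: run_passes WARM_RUNS BUILD_CMD CLEAN_CMD
# One cold build, then WARM_RUNS warm builds with a clean before each one.
run_passes() {
    warm_runs="$1"; build_cmd="$2"; clean_cmd="$3"
    cold=$(time_build "$build_cmd") || return 1
    echo "cold $cold"
    i=1
    while [ "$i" -le "$warm_runs" ]; do
        sh -c "$clean_cmd" >/dev/null 2>&1
        warm=$(time_build "$build_cmd") || return 1
        echo "warm-$i $warm"
        i=$((i + 1))
    done
}
```

A call such as `run_passes 3 "make -j$(nproc) vmlinux" "make clean"` would print one `cold` line followed by three `warm-N` lines.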
Running locally
Local mode
./bench/kernel-bench.sh
Linux-only (needs to build vmlinux). Reasonable on any host with
gcc, make, git, awk, and go on PATH. Defaults to
make defconfig and uses nproc jobs.
Knobs (env vars, all optional):
| Var | Default | Meaning |
|---|---|---|
| HPCC_BENCH_KERNEL_TAG | v7.0 | Git tag in torvalds/linux to clone |
| HPCC_BENCH_JOBS | nproc | make -j parallelism |
| HPCC_BENCH_CONFIG | defconfig | kconfig target |
| HPCC_BENCH_TARGET | vmlinux | Top-level make target |
| HPCC_BENCH_MIN_HIT_RATE | 95 | Required cache hit rate (%) |
| HPCC_BENCH_MAX_WARM_PCT | 50 | Max acceptable median warm wall time as % of cold |
| HPCC_BENCH_WARM_RUNS | 3 | Warm-build repeats after the cold pass (median is the headline number) |
| HPCC_BENCH_KEEP | 0 | 1 preserves the work dir for inspection |
| HPCC_BENCH_SOURCE_MODE | cas | cas or preprocessed; picks the local cache-key algorithm |
Firecracker mode
sudo -E \
HPCC_FIRECRACKER_BIN=/usr/local/bin/firecracker \
HPCC_JAILER_BIN=/usr/local/bin/jailer \
HPCC_TEST_KERNEL=/tmp/fcassets/vmlinux \
./bench/kernel-bench-fc.sh
Linux + KVM + root only. The supervisor binary
(bench/cmd/fcstack/main.go) brings up the
scheduler, worker, and IdP, and pre-stages a rootfs from a public
toolchain OCI image; the shell script then drives the make dance
through it.
Cache configuration: the FC bench runs the worker in paranoid
mode with a worker-side disk cache at bench/work/firecracker/stack-<ts>/cache.
This is both the regulated-environment posture the README leads with
and the only configuration that meaningfully benchmarks the Phase 4
path — in non-paranoid mode every warm compile would short-circuit
on a client-side cache and never re-exercise scheduler/worker/FC
dispatch. The bench reads entry counts directly from that worker
cache directory; hpcc stats against the paranoid-mode client is
a no-op because clients have no local stores by design.
Defaults differ from local mode because per-TU FC dispatch overhead means warm builds never hit the same speedup multiplier:
| Var | Default |
|---|---|
| HPCC_BENCH_CONFIG | tinyconfig (defconfig is hours) |
| HPCC_BENCH_MAX_WARM_PCT | 75 |
| HPCC_BENCH_MIN_HIT_RATE | 90 |
| HPCC_BENCH_VM_MEMORY | 2GB |
| HPCC_BENCH_VM_VCPUS | 2 |
| HPCC_BENCH_POOL_MAX | 8 (concurrent VMs per tenant) |
| HPCC_BENCH_TOOLCHAIN_IMAGE | docker.io/library/gcc:13.2.0 (matched to ubuntu-latest's apt gcc patch version) |
| HPCC_BENCH_WARM_RUNS | 2 (fewer than local — each FC build is much slower) |
| HPCC_BENCH_SOURCE_MODE | cas (also accepts preprocessed) |
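To make the difference between the two sets of defaults concrete, here is a small sketch that looks up a knob's per-mode default and lets an environment variable override it. Both function names (`bench_default`, `bench_setting`) and the `local`/`fc` mode labels are hypothetical; only the knob names and values are taken from the tables in this README.

```shell
#!/bin/sh
# Illustrative only: the values are copied from the two defaults tables,
# the lookup mechanism itself is an assumption.
bench_default() {
    case "$1:$2" in
        local:HPCC_BENCH_CONFIG)        echo defconfig ;;
        local:HPCC_BENCH_MAX_WARM_PCT)  echo 50 ;;
        local:HPCC_BENCH_MIN_HIT_RATE)  echo 95 ;;
        local:HPCC_BENCH_WARM_RUNS)     echo 3 ;;
        fc:HPCC_BENCH_CONFIG)           echo tinyconfig ;;
        fc:HPCC_BENCH_MAX_WARM_PCT)     echo 75 ;;
        fc:HPCC_BENCH_MIN_HIT_RATE)     echo 90 ;;
        fc:HPCC_BENCH_WARM_RUNS)        echo 2 ;;
        *) return 1 ;;
    esac
}

# Usage: bench_setting MODE VAR_NAME
# A non-empty environment value wins over the per-mode default.
bench_setting() {
    eval "value=\${$2:-}"
    if [ -n "$value" ]; then
        echo "$value"
    else
        bench_default "$1" "$2"
    fi
}
```

With nothing set in the environment, `bench_setting fc HPCC_BENCH_MAX_WARM_PCT` yields 75 while the same call with `local` yields 50.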
Both source modes back-to-back
sudo -E \
HPCC_FIRECRACKER_BIN=/usr/local/bin/firecracker \
HPCC_JAILER_BIN=/usr/local/bin/jailer \
HPCC_TEST_KERNEL=/tmp/fcassets/vmlinux \
./bench/kernel-bench-fc-both.sh
Runs kernel-bench-fc.sh once with source_mode=cas, once with
source_mode=preprocessed, and prints a side-by-side table of cold
seconds, warm-median, hit-rate, and cache size. Each leg gets its
own bench/results/<ts>-firecracker-<mode>/ directory and its own
isolated stack (separate worker cache, daemon discovery file, XDG
config root) so the runs don't share state.
./bench/kernel-bench-both.sh is the local-mode equivalent: same
side-by-side shape, but exercising kernel-bench.sh instead — so
the comparison is between local cache-key strategies rather than
between dispatch wire formats.
What's measured
Each run consists of one cold build plus N warm builds (default
HPCC_BENCH_WARM_RUNS=3 for local, 2 for FC). The cache is
populated by the cold build and reused across all warm runs; make clean
between runs wipes object files but keeps .config. The
report surfaces the cold time, every warm sample, and the
min/median/max over the warm samples. The threshold check compares the
median warm time against the cold time, which is robust to a single
noisy run on a shared CI machine.
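The median-based threshold check can be sketched as below. This is an assumption-laden illustration rather than the scripts' own logic: the function names are hypothetical, samples are whole seconds, an even sample count averages the two middle values, and the comparison is "at most" rather than "strictly below".

```shell
#!/bin/sh
# Median of integer samples given as arguments. For an even count this
# averages the two middle values using integer division.
median() {
    [ "$#" -gt 0 ] || return 1
    n=$#
    sorted=$(printf '%s\n' "$@" | sort -n)
    if [ $((n % 2)) -eq 1 ]; then
        printf '%s\n' "$sorted" | sed -n "$(( (n + 1) / 2 ))p"
    else
        lo=$(printf '%s\n' "$sorted" | sed -n "$((n / 2))p")
        hi=$(printf '%s\n' "$sorted" | sed -n "$((n / 2 + 1))p")
        echo $(( (lo + hi) / 2 ))
    fi
}

# Usage: warm_within_budget COLD_SECS MAX_PCT WARM_SECS...
# Succeeds when the median warm time is at most MAX_PCT percent of cold.
warm_within_budget() {
    cold="$1"; max_pct="$2"; shift 2
    med=$(median "$@") || return 1
    [ $((med * 100)) -le $((cold * max_pct)) ]
}
```

For example, with a 100 s cold build and warm samples of 40, 60, and 45 s, the median is 45 s, which passes a 50% budget even though one sample exceeded it.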
Output at bench/results/<timestamp>-<mode>/:
- report.md: markdown table with the headline numbers
- stats.json: same numbers + raw warm-sample array in JSON
- build-cold.log, build-warm-1.log, build-warm-2.log, …: raw make output per run
- hpcc-config.toml: the exact client config used (for repro)
- fcstack.log (FC mode only): supervisor stderr
The success signal in both modes is a high cache hit rate combined
with a large cold/warm wall-time delta. Local mode reads hit-rate
from hpcc stats (single client-side disk cache); FC mode walks the
worker's paranoid-mode disk cache dir directly since the paranoid
client has no hpcc stats surface.
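A rough sketch of the two measurements follows. The hit and miss counts are taken as plain arguments because the output format of hpcc stats is not shown here, and counting regular files is only a stand-in for "entries": the worker cache's on-disk layout is an assumption. Both function names are hypothetical.

```shell
#!/bin/sh
# Usage: hit_rate_pct HITS MISSES
# Integer hit rate in percent; prints 0 when there were no lookups at all.
hit_rate_pct() {
    total=$(( $1 + $2 ))
    if [ "$total" -eq 0 ]; then
        echo 0
        return 0
    fi
    echo $(( $1 * 100 / total ))
}

# Usage: count_cache_entries CACHE_DIR
# Counts regular files below the directory, recursing into subdirectories.
count_cache_entries() {
    find "$1" -type f | wc -l | tr -d ' '
}
```

The resulting percentage would then be compared against HPCC_BENCH_MIN_HIT_RATE.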
TODO
- make modules target. vmlinux covers the bulk of the TU count but excludes most of drivers/. Adding a second pass with make modules would round out the workload.
- Toolchain-identity parity between local and FC modes. Today the local bench uses the host's apt gcc (gcc 13.2.0 on ubuntu-latest) and the FC bench uses whatever the configured OCI image ships. We pin gcc:13.2.0 as a workaround so the two versions match, but that's a manual chase: every kernel-tag bump or gcc:13 retag could break it again. The real fix is making the FC stack's toolchain provably the same as the host's: snapshot the host gcc/binutils/libc into a custom image at fcstack startup, or run a startup probe that asserts gcc --version parity inside vs outside the VM and refuses to run on mismatch. This is also what makes the README's "image digest IS the toolchain identity" pitch (plan §4) meaningful for cross-developer hit rates. Without parity, two developers with nominally identical hpcc setups can produce bit-different .o outputs depending on which dispatch path their compiles took.
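The startup-probe option in the second TODO item might look something like the sketch below. It does not exist in the repository as far as this README says; `assert_toolchain_parity` is a hypothetical name, and how the in-VM version string is collected is left out entirely. The sketch compares bare version strings such as those printed by `gcc -dumpfullversion`, since the first line of `gcc --version` also carries distro packaging text that can differ between builds of the same release.

```shell
#!/bin/sh
# Usage: assert_toolchain_parity HOST_VERSION VM_VERSION
# Hypothetical probe: refuse to continue when the two compiler versions
# differ. Both strings are supplied by the caller.
assert_toolchain_parity() {
    host_ver="$1"
    vm_ver="$2"
    if [ -z "$host_ver" ] || [ -z "$vm_ver" ]; then
        echo "toolchain probe: missing version string" >&2
        return 1
    fi
    if [ "$host_ver" != "$vm_ver" ]; then
        echo "toolchain mismatch: host=$host_ver vm=$vm_ver" >&2
        return 1
    fi
}
```

A caller could run `assert_toolchain_parity "$(gcc -dumpfullversion)" "$vm_version" || exit 1` before the cold build, where `vm_version` is whatever the stack reports from inside the guest. A version match alone does not prove identical binutils or libc, so this is weaker than the image-snapshot option.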
Directories
| Path | Synopsis |
|---|---|
| cmd/fcstack | Command fcstack boots an end-to-end hpcc Phase 4 stack in a single process so the kernel-build benchmark has a target to dispatch against. |