# Benchmarks → Docs Pipeline

This note records the Phase 1 design + Phase 2 implementation work for surfacing Criterion micro-benchmarks inside the mdBook site. It follows the flow described in `AGENTS.md`: tickets → specs → code/tests → data.
## Layout, naming, retention
- Raw output lives under `target/criterion/...` while `cargo bench` runs. `scripts/rust-bench.sh` copies only the `criterion` subtree into `data/bench/criterion/` (Git LFS) right after the run, so we keep curated JSON without polluting `data/` with arbitrary build detritus.
- Build junk (`data/bench/release`, `tmp/`, `.rustc_info.json`) stays untracked via `.gitignore`. If Cargo drops new cache directories, add them here before running benches on a ticket.
- Derived tables land in `docs/assets/bench/` as `<timestamp>_<group>.csv` plus a Markdown twin for mdBook. Each CSV gets a `.run.json` sidecar with `git_commit`, UTC `timestamp`, host/rust info, and the row count. Small symlinks (`current.csv`, `current_<group>.csv`, `current_<group>.md`) point at the latest export so docs can `{{#include}}` a stable path.
- Retention: the Python stage keeps the most recent five exports per group by default (value stored in `configs/bench/docs_local.json`). Older CSV/MD pairs and their `.run.json` sidecars are deleted to avoid asset bloat while still keeping short-term history for review diffs.
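The retention rule above can be sketched as a small pruning helper. This is illustrative only: the helper name and the exact filename pattern are assumptions; the shipped logic lives in `viterbo.bench.stage_docs`.

```python
from pathlib import Path


def prune_exports(asset_dir: Path, group: str, keep: int = 5) -> list[Path]:
    """Keep the `keep` newest <timestamp>_<group>.csv exports; delete the rest.

    Each export is a CSV plus a Markdown twin and a .run.json sidecar; all
    three are removed together. The `current_*` symlinks are skipped so the
    docs include path stays valid. (Sketch, not the shipped code.)
    """
    # ISO-style timestamps sort lexicographically, so plain sort() is oldest-first.
    exports = sorted(
        p for p in asset_dir.glob(f"*_{group}.csv") if not p.is_symlink()
    )
    removed: list[Path] = []
    for csv_path in exports[:-keep] if keep else exports:
        for victim in (
            csv_path,
            csv_path.with_suffix(".md"),
            csv_path.with_suffix(".run.json"),
        ):
            if victim.exists():
                victim.unlink()
                removed.append(victim)
    return removed
```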
## How to run
- Make sure the benches you want exist (currently `crates/viterbo/benches/poly2_bench.rs`) and hydrate Git LFS for `data/**` if needed.
- Run the benches through the wrapper (safe-wrapped):

  ```bash
  bash scripts/safe.sh --timeout 300 -- bash scripts/rust-bench.sh
  # optional envs: BENCH_EXPORT_DIR=/tmp/bench, BENCH_RUN_POSTPROCESS=1 (to immediately run the stage)
  ```

  This writes raw Criterion JSON to `target/criterion`, rsyncs the curated snapshot into `data/bench/criterion`, and leaves Cargo caches alone.
- Refresh the docs tables via the Python stage (defaults live in `configs/bench/docs_local.json`):

  ```bash
  bash scripts/safe.sh --timeout 120 -- uv run python -m viterbo.bench.stage_docs \
    --config configs/bench/docs_local.json
  # add --bench-root /custom/path or --keep 10 if you need overrides
  ```

- Commit the tiny files in `docs/assets/bench/` together with the `data/bench/criterion/**` snapshot (Git LFS handles the bulk). If you want the wrapper to handle step 3 automatically, export `BENCH_RUN_POSTPROCESS=1` when invoking `scripts/rust-bench.sh`.
`scripts/reproduce.sh` now runs both steps (bench + docs stage) unconditionally, so every thesis build and mdBook render derives from freshly generated measurements. Whenever a ticket adds a new artifact or visualization, update `reproduce.sh` in the same PR.
## Latest snapshot
The Markdown fragment below is generated by `python -m viterbo.bench.stage_docs` and pulled in verbatim, so reviewers always see the freshest numbers without copy/paste.
| bench | parameter | samples | min (ns) | mean (ns) | stddev (ns) |
|---|---|---|---|---|---|
| halfspace_intersection | 0 | 50 | 4.571 | 4.977 | 0.346 |
| halfspace_intersection | 10 | 50 | 598.399 | 671.668 | 20.930 |
| halfspace_intersection | 20 | 50 | 1106.233 | 1155.103 | 28.151 |
| halfspace_intersection | 50 | 50 | 2572.787 | 2707.089 | 67.563 |
| halfspace_intersection | 100 | 50 | 5070.971 | 5218.790 | 145.902 |
| push_forward_strict | 0 | 50 | 13.587 | 13.865 | 0.199 |
| push_forward_strict | 10 | 50 | 493.217 | 516.936 | 15.231 |
| push_forward_strict | 20 | 50 | 1710.680 | 1807.478 | 70.256 |
| push_forward_strict | 50 | 50 | 6042.619 | 6140.683 | 88.946 |
| push_forward_strict | 100 | 50 | 14986.763 | 16127.213 | 801.691 |
Updated 2025-11-11 01:41:11Z · commit 585a129 · host ab5b4864ef14 · rustc rustc 1.91.1 (ed61e7d7e 2025-11-07)
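The provenance line above is derived from the `.run.json` sidecar. A minimal sketch of how such a sidecar could be written follows; the key names echo this note, but the exact spelling and the shipped implementation (inside `viterbo.bench.stage_docs`) may differ.

```python
import json
import platform
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def write_run_sidecar(csv_path: Path, row_count: int) -> Path:
    """Write the <export>.run.json provenance sidecar next to a CSV export.

    Captures git commit, UTC timestamp, host, rustc version, and row
    count, as described above. (Sketch; field names are assumptions.)
    """

    def run(cmd: list[str]) -> str:
        try:
            return subprocess.check_output(cmd, text=True).strip()
        except (OSError, subprocess.CalledProcessError):
            return "unknown"

    meta = {
        "git_commit": run(["git", "rev-parse", "--short", "HEAD"]),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ"),
        "host": platform.node(),
        "rustc": run(["rustc", "--version"]),
        "rows": row_count,
    }
    sidecar = csv_path.with_suffix(".run.json")
    sidecar.write_text(json.dumps(meta, indent=2) + "\n")
    return sidecar
```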
## Interpretation cheat sheet
- `halfspace_intersection` grows roughly linearly with the number of halfspaces `m` once past the tiny-input regime, while `push_forward_strict` grows super-linearly and ends up roughly 3× slower at `m=100` (the gap narrows for smaller `m`) because it performs an extra affine transform before the set operation.
- For tiny polytopes (`m ≤ 10`) the kernel stays in the sub-microsecond regime, which makes it viable to run exhaustive smoke tests inside CI; by `m=100` we are in the ~5 μs (intersection) and ~15 μs (push-forward) range, which is still cheap for batched evaluation.
- The `stddev` columns are low relative to the means for larger inputs, which indicates the runs are deterministic enough that storing a single snapshot per commit is meaningful.
- If you see `samples < 100`, Criterion bailed early because the stage finished faster than the configured measurement window; rerun with `--significance-level` tweaks if you need denser samples.
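To put a number on the scaling discussion, one can fit a log-log slope between two table rows. A quick self-contained check using the mean columns from the snapshot above (pure arithmetic on the published numbers):

```python
import math

# Mean times (ns) from the snapshot table above, at m = 10 and m = 100.
MEANS = {
    "halfspace_intersection": {10: 671.668, 100: 5218.790},
    "push_forward_strict": {10: 516.936, 100: 16127.213},
}


def loglog_slope(bench: str, lo: int = 10, hi: int = 100) -> float:
    """Empirical exponent b in t ≈ a * m**b between two table rows."""
    t = MEANS[bench]
    return math.log(t[hi] / t[lo]) / math.log(hi / lo)


# halfspace_intersection ≈ 0.89 (near-linear),
# push_forward_strict    ≈ 1.49 (clearly super-linear).
```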
## Script internals (reference)
- `viterbo.bench.stage_docs` parses `estimates.json` and `sample.json` for each `<group>/<bench>/<param>` tuple, computes `min`, `mean`, and `stddev`, and copies system metadata from `git`, `platform`, and `rustc --version`.
- The output schema matches the CSV header, so CSVs remain diffable while the Markdown twin renders nicely inside mdBook; both variants share the same timestamp + provenance file.
- Symlinks are relative so Git diffs stay stable and docs can just use the include macro for the “current” snapshot. For example (shown literally, not executed): `{{# include ../assets/bench/current_<group>.md}}` (the space after `#` prevents mdBook from treating this as a real include).
- Use `uv run python -m viterbo.bench.stage_docs --config configs/bench/docs_local.json --keep 10` if you need a longer breadcrumb trail before trimming old exports.
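As a reference for the parsing step, here is a minimal sketch of summarising one `<group>/<bench>/<param>` directory. It assumes Criterion's usual layout of `new/estimates.json` (point estimates) and `new/sample.json` (`iters`/`times` arrays); verify the field names against the Criterion version in use before relying on them.

```python
import json
from pathlib import Path


def read_measurement(param_dir: Path) -> dict[str, float]:
    """Summarise one Criterion parameter directory into a table row.

    Sketch, assuming: mean/std_dev point estimates live in
    new/estimates.json, and raw per-sample (iters, times) pairs in
    new/sample.json, with per-iteration time = times[i] / iters[i].
    """
    new = param_dir / "new"
    est = json.loads((new / "estimates.json").read_text())
    sample = json.loads((new / "sample.json").read_text())
    per_iter_ns = [t / i for t, i in zip(sample["times"], sample["iters"])]
    return {
        "samples": len(per_iter_ns),
        "min": min(per_iter_ns),
        "mean": est["mean"]["point_estimate"],
        "stddev": est["std_dev"]["point_estimate"],
    }
```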