Benchmarks → Docs Pipeline

This note records the Phase 1 design + Phase 2 implementation work for surfacing Criterion micro-benchmarks inside the mdBook site. It follows the flow described in AGENTS.md: tickets → specs → code/tests → data.

Layout, naming, retention

  • Raw output lives under target/criterion/... while cargo bench runs. scripts/rust-bench.sh copies only the criterion subtree into data/bench/criterion/ (Git LFS) right after the run so we keep curated JSON without polluting data/ with arbitrary build detritus.
  • Build junk (data/bench/release, tmp/, .rustc_info.json) stays untracked via .gitignore. If Cargo drops new cache directories, add them here before running benches on a ticket.
  • Derived tables land in docs/assets/bench/ as <timestamp>_<group>.csv plus a Markdown twin for mdBook. Each CSV gets a .run.json sidecar with git_commit, UTC timestamp, host/rust info, and the row count. Small symlinks (current.csv, current_<group>.csv, current_<group>.md) point at the latest export so docs can {{#include}} a stable path.
  • Retention: the Python stage keeps the most recent five exports per group by default (value stored in configs/bench/docs_local.json). Older CSV/MD pairs + their .run.json sidecars are deleted to avoid asset bloat while still keeping short-term history for review diffs.

How to run

  1. Make sure the benches you want exist (currently crates/viterbo/benches/poly2_bench.rs) and hydrate Git LFS for data/** if needed.
  2. Run the benches through the wrapper (safe-wrapped):
    bash scripts/safe.sh --timeout 300 -- bash scripts/rust-bench.sh
    # optional envs: BENCH_EXPORT_DIR=/tmp/bench, BENCH_RUN_POSTPROCESS=1 (to immediately run the stage)
    
    This writes raw Criterion JSON to target/criterion, rsyncs the curated snapshot into data/bench/criterion, and leaves Cargo caches alone.
  3. Refresh the docs tables via the Python stage (defaults live in configs/bench/docs_local.json):
    bash scripts/safe.sh --timeout 120 -- uv run python -m viterbo.bench.stage_docs \
      --config configs/bench/docs_local.json
    # add --bench-root /custom/path or --keep 10 if you need overrides
    
  4. Commit the tiny files in docs/assets/bench/ together with the data/bench/criterion/** snapshot (Git LFS handles the bulk). If you want the wrapper to handle step 3 automatically, export BENCH_RUN_POSTPROCESS=1 when invoking scripts/rust-bench.sh.

scripts/reproduce.sh now runs both steps (bench + docs stage) unconditionally so every thesis build and mdBook render derives from freshly generated measurements. Whenever a ticket adds a new artifact or visualization, update reproduce.sh in the same PR.

Latest snapshot

The Markdown fragment below is generated by python -m viterbo.bench.stage_docs and pulled in verbatim so reviewers always see the freshest numbers without copy/paste.

| bench | parameter | samples | min (ns) | mean (ns) | stddev (ns) |
|---|---:|---:|---:|---:|---:|
| halfspace_intersection | 0 | 50 | 4.571 | 4.977 | 0.346 |
| halfspace_intersection | 10 | 50 | 598.399 | 671.668 | 20.930 |
| halfspace_intersection | 20 | 50 | 1106.233 | 1155.103 | 28.151 |
| halfspace_intersection | 50 | 50 | 2572.787 | 2707.089 | 67.563 |
| halfspace_intersection | 100 | 50 | 5070.971 | 5218.790 | 145.902 |
| push_forward_strict | 0 | 50 | 13.587 | 13.865 | 0.199 |
| push_forward_strict | 10 | 50 | 493.217 | 516.936 | 15.231 |
| push_forward_strict | 20 | 50 | 1710.680 | 1807.478 | 70.256 |
| push_forward_strict | 50 | 50 | 6042.619 | 6140.683 | 88.946 |
| push_forward_strict | 100 | 50 | 14986.763 | 16127.213 | 801.691 |

Updated 2025-11-11 01:41:11Z · commit 585a129 · host ab5b4864ef14 · rustc 1.91.1 (ed61e7d7e 2025-11-07)

Interpretation cheat sheet

  • In the snapshot above, halfspace_intersection grows roughly linearly with the number of halfspaces m, while push_forward_strict grows faster than linearly; the gap between them widens with m (push_forward_strict is actually slightly faster at m=10 but roughly 3× slower at m=100) because it performs an extra affine transform before the set operation.
  • For tiny polytopes (m ≤ 10) both kernels stay in the sub-microsecond regime, which makes it viable to run exhaustive smoke tests inside CI; by m=100 we are in the ~5 μs (intersection) and ~16 μs (push-forward) range, which is still cheap for batched evaluation.
  • The stddev columns are low relative to the mean for larger inputs, which indicates the batches are deterministic enough that storing a single snapshot per commit is meaningful.
  • If you see samples < 100, the sample count was capped so the run fits the configured measurement window; rerun with a larger --sample-size or --measurement-time (passed to cargo bench after --) if you need denser samples.
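To make the scaling claim concrete, here is a quick arithmetic check over the mean column of the snapshot table above (all values in ns, copied from the table):

```python
# Mean times (ns) per halfspace count m, copied from the snapshot table.
intersection = {0: 4.977, 10: 671.668, 20: 1155.103, 50: 2707.089, 100: 5218.790}
push_forward = {0: 13.865, 10: 516.936, 20: 1807.478, 50: 6140.683, 100: 16127.213}

# push_forward_strict relative to halfspace_intersection at each m:
# the gap widens from below 1x at m=10 to roughly 3x at m=100.
ratios = {m: round(push_forward[m] / intersection[m], 2) for m in intersection}
print(ratios)
```

Re-running this check against each fresh snapshot is a cheap way to notice when a kernel change shifts the relative cost of the two operations.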

Script internals (reference)

  • viterbo.bench.stage_docs parses estimates.json and sample.json for each <group>/<bench>/<param> tuple, computes min, mean, stddev, and copies system metadata from git, platform, and rustc --version.
  • Output schema matches the CSV header, so CSVs remain diffable while Markdown renders nicely inside mdBook; both variants share the same timestamp + provenance file.
  • Symlinks are relative so Git diffs stay stable and docs can just use the include macro for the “current” snapshot. For example (shown literally, not executed): {{# include ../assets/bench/current_<group>.md}} (note the space after # prevents mdBook from treating this as a real include).
  • Use uv run python -m viterbo.bench.stage_docs --config configs/bench/docs_local.json --keep 10 if you need a longer breadcrumb trail before trimming old exports.