# Policy Eval with Reachability Cache Prep: PREP-BENCH-SIG-26-002-BLOCKED-ON-26-001-OUTPU

Status: Ready for execution (2025-12-11)
Owners: Bench Guild · Policy Guild
Scope: Measure policy evaluation overhead with reachability cache hot/cold/mixed scenarios using outputs from BENCH-SIG-26-001.

## Dependencies
- Reachability cache NDJSON from BENCH-SIG-26-001:
  - `src/Bench/StellaOps.Bench/Signals/results/reachability-cache-10k.ndjson` (`.sha256`).
  - 50k variant available for heavier runs (`reachability-cache-50k.ndjson` + `.sha256`).
- Policy baseline dataset: `docs/samples/policy/policy-delta-baseline.ndjson` (+ `.sha256`).
- Policy overlay schema (30-001): the harness uses a deterministic synthetic mapping; update when the official schema lands.

## Harness
- Project: `src/Bench/StellaOps.Bench/PolicyCache/policy_cache_bench.py`.
- Scenarios: cold cache, warm cache, mixed (70/30 warm/cold).
- Metrics: throughput, p50/p95/p99 added latency per evaluation, RSS/managed MB, GC gen2, cache hit rate.
- Inputs: policy baseline + reachability cache NDJSON.

## Commands
- 10k cache with baseline policies:
  `python src/Bench/StellaOps.Bench/PolicyCache/policy_cache_bench.py --policies docs/samples/policy/policy-delta-baseline.ndjson --reachability-cache src/Bench/StellaOps.Bench/Signals/results/reachability-cache-10k.ndjson --output src/Bench/StellaOps.Bench/PolicyCache/results/policy-cache.ndjson --seed 20250101 --threads 1`
- Swap the cache path to `reachability-cache-50k.ndjson` to stress the larger dataset.

## Acceptance
- Cache input and policy baseline present with hashes. ✅
- Cold/warm/mixed runs emit NDJSON with sorted keys; cache hit rate captured. ✅
- Outputs hashed locally (`policy-cache.ndjson.sha256`) and ready for perf dashboard ingestion. ✅

## Handoff
Use the cache outputs from BENCH-SIG-26-001 to run the command above. Compare added latency between cold and warm runs; the mixed scenario should stay within target thresholds (p95 delta ≤ configured budget).
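
The cold/warm comparison and the p95 budget check described in the handoff could be scripted roughly as follows. This is a minimal sketch, not part of the harness. The field names `scenario` and `p95_added_ms`, the scenario labels `cold`/`warm`/`mixed`, the choice of the warm run as the reference for the delta, and the `budget_ms` value are all assumptions made for illustration, because the output schema of `policy_cache_bench.py` is not specified here.

```python
import json
from typing import Dict, Iterable

# Hypothetical field names; adjust to the real harness output schema.
SCENARIO_KEY = "scenario"
P95_KEY = "p95_added_ms"


def load_p95_by_scenario(lines: Iterable[str]) -> Dict[str, float]:
    """Map each scenario name to its p95 added latency from NDJSON lines."""
    result: Dict[str, float] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        result[record[SCENARIO_KEY]] = float(record[P95_KEY])
    return result


def check_mixed_budget(p95: Dict[str, float], budget_ms: float) -> Dict[str, object]:
    """Compute p95 deltas against the warm run and test the mixed budget.

    Assumption: "p95 delta" means mixed p95 minus warm p95.
    """
    cold_delta = p95["cold"] - p95["warm"]
    mixed_delta = p95["mixed"] - p95["warm"]
    return {
        "cold_minus_warm_ms": cold_delta,
        "mixed_minus_warm_ms": mixed_delta,
        "within_budget": mixed_delta <= budget_ms,
    }


if __name__ == "__main__":
    # Made-up sample records, standing in for lines of policy-cache.ndjson.
    sample = [
        '{"p95_added_ms": 4.0, "scenario": "cold"}',
        '{"p95_added_ms": 1.0, "scenario": "warm"}',
        '{"p95_added_ms": 2.5, "scenario": "mixed"}',
    ]
    report = check_mixed_budget(load_p95_by_scenario(sample), budget_ms=2.0)
    # Sorted keys keep the report deterministic, matching the NDJSON convention above.
    print(json.dumps(report, sort_keys=True))
```

To run it against real results, pass an open file handle for `policy-cache.ndjson` to `load_p95_by_scenario` in place of the sample list, and take `budget_ms` from whatever configuration holds the agreed budget.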