StellaOps Bot efaf3cb789

2025-12-12 09:35:37 +02:00

Bench Prep — PREP-BENCH-POLICY-20-002 (Policy delta benchmark)

Status: Ready for execution (2025-12-11)
Owners: Bench Guild · Policy Guild · Scheduler Guild
Scope: Provide deterministic inputs and harness expectations to measure delta policy evaluation vs full runs.

Goals

Compare delta evaluation (incremental changes) against full evaluation over the same dataset.
Capture throughput, latency (p50/p95/p99), and memory/GC impact under deterministic conditions.

Dataset

Baseline snapshot: docs/samples/policy/policy-delta-baseline.ndjson
- 5,000 records of { "tenant": "bench", "policyId": "pol-<0001..5000>", "package": "bench.pkg.<n>", "version": "1.0.<n>", "decision": "allow|deny", "factors": { ... } }
- Deterministic ordering; SHA256 40ca9ee15065a9e16f51a259d3feec778203ab461db2af3bf196f5fcd9f0d590 (policy-delta-baseline.ndjson.sha256).
Delta patch: docs/samples/policy/policy-delta-changes.ndjson
- 500 changes mixing updates/inserts/deletes (encoded with op: "upsert"|"delete").
- Sorted by policyId then op for deterministic replay; SHA256 7f9d7f124830b9fe4d3f232b4cc7e2e728be2ef725e8a66606b9e95682bf6318 (policy-delta-changes.ndjson.sha256).
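Before a run, the sample files can be checked against their recorded hashes. The following is a minimal sketch, not the harness's own code: it assumes each `.sha256` sidecar begins with the hex digest (the layout `sha256sum` writes), and the helper names `sha256_of` and `matches_sidecar` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(data_path: Path) -> bool:
    """Compare a file's digest with the first token of <name>.sha256.

    Assumption: the sidecar starts with the hex digest, optionally
    followed by a file name, as `sha256sum` would write it.
    """
    sidecar = data_path.with_name(data_path.name + ".sha256")
    expected = sidecar.read_text(encoding="utf-8").split()[0].lower()
    return sha256_of(data_path) == expected
```

For example, `matches_sidecar(Path("docs/samples/policy/policy-delta-baseline.ndjson"))` would return `True` only if the file still matches the digest stored next to it.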

Harness plan (implemented under `src/Bench/StellaOps.Bench/PolicyDelta/policy_delta_bench.py`)

Run 1 (Full): load baseline snapshot, evaluate full policy set; record metrics.
Run 2 (Delta): apply delta patch to in-memory store, run incremental evaluation; record metrics.
Metrics captured to NDJSON per run:
- { run: "full"|"delta", startedAtUtc, durationMs, evaluationsPerSec, p50Ms, p95Ms, p99Ms, rssMb, managedMb, gcGen2 }
Determinism:
- Use the fixed random seed 20250101 (i.e. 2025-01-01, passed as --seed 20250101) for any shuffling; pass --threads 1 to run single-threaded when reproducibility is needed.
- All timestamps in UTC ISO-8601; output NDJSON sorted by run.
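The delta replay and the latency fields of a metrics record can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code in `policy_delta_bench.py`: it assumes the in-memory store is a dict keyed by `policyId`, that an `upsert` change carries the full record alongside `op`, that percentiles use the nearest-rank method, and that `durationMs` is the sum of per-evaluation latencies in single-threaded mode. The memory and GC fields (`rssMb`, `managedMb`, `gcGen2`) are left out because the plan does not say how they are measured. All function names are hypothetical.

```python
import json
import math


def apply_delta(store: dict, changes: list) -> dict:
    """Apply upsert/delete changes to a store keyed by policyId.

    Returns a new dict; the input store is left untouched.
    """
    updated = dict(store)
    for change in changes:
        policy_id = change["policyId"]
        if change["op"] == "delete":
            updated.pop(policy_id, None)
        elif change["op"] == "upsert":
            # Assumption: the change carries the full record next to "op".
            updated[policy_id] = {k: v for k, v in change.items() if k != "op"}
        else:
            raise ValueError(f"unknown op: {change['op']!r}")
    return updated


def percentile(samples_ms: list, pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def metrics_line(run: str, started_at_utc: str, samples_ms: list) -> str:
    """Build one NDJSON line with the timing fields of a run."""
    duration_ms = sum(samples_ms)
    record = {
        "run": run,
        "startedAtUtc": started_at_utc,
        "durationMs": duration_ms,
        "evaluationsPerSec": len(samples_ms) * 1000 / duration_ms if duration_ms else 0.0,
        "p50Ms": percentile(samples_ms, 50),
        "p95Ms": percentile(samples_ms, 95),
        "p99Ms": percentile(samples_ms, 99),
    }
    # Sorted keys keep the serialized line byte-stable across replays.
    return json.dumps(record, sort_keys=True)
```

Returning a new dict from `apply_delta` keeps the baseline snapshot intact, so the full and delta runs can be repeated from the same starting state.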

Acceptance criteria

Baseline + delta sample files and SHA256 hashes present under docs/samples/policy/.
Harness reads only local files and has no network dependencies; replays on the same hardware produce consistent NDJSON.
Delta run shows reduced duration vs full run; metrics captured for both p95/p99 and throughput.
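The last criterion can be checked mechanically against the results file. The sketch below is one possible check, assuming the results NDJSON holds one record per run with the field names listed in the harness plan; `check_results` and `REQUIRED_KEYS` are hypothetical names, not part of the harness.

```python
import json

# Fields the acceptance criteria call out explicitly.
REQUIRED_KEYS = {"durationMs", "evaluationsPerSec", "p95Ms", "p99Ms"}


def check_results(ndjson_text: str) -> list:
    """Return a list of acceptance failures; an empty list means pass."""
    runs = {}
    for line in ndjson_text.splitlines():
        if line.strip():
            record = json.loads(line)
            runs[record["run"]] = record

    failures = []
    for name in ("full", "delta"):
        if name not in runs:
            failures.append(f"missing run: {name}")
            continue
        missing = sorted(REQUIRED_KEYS - set(runs[name]))
        if missing:
            failures.append(f"{name} run lacks {', '.join(missing)}")

    # Only compare durations when both runs are present and complete.
    if not failures and runs["delta"]["durationMs"] >= runs["full"]["durationMs"]:
        failures.append("delta run is not faster than full run")
    return failures
```

Returning the failures as a list rather than raising on the first one lets a CI step report every unmet criterion at once.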

Next steps

Harness CLI: `python src/Bench/StellaOps.Bench/PolicyDelta/policy_delta_bench.py --baseline docs/samples/policy/policy-delta-baseline.ndjson --delta docs/samples/policy/policy-delta-changes.ndjson --output src/Bench/StellaOps.Bench/PolicyDelta/results/policy-delta.ndjson --threads 1 --seed 20250101`
Results hashed at src/Bench/StellaOps.Bench/PolicyDelta/results/policy-delta.ndjson.sha256.
