Proof Replay / Deterministic Verdict Replay

Module

Policy

Status

IMPLEMENTED

Description

A full replay service in a dedicated module, with a determinism verifier, run manifests, and extensive E2E tests that verify byte-identical verdict replay across runs.
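
A byte-identical check is typically made by hashing a canonical serialization of the verdict and comparing digests. The sketch below illustrates that idea in Python; it is not the StellaOps determinism verifier, whose encoding is not described here. The function names, the choice of canonical JSON, and SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


def verdict_digest(verdict: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding.

    Hypothetical scheme: keys are sorted and whitespace is removed so that
    two logically equal verdicts always serialize to the same bytes.
    List order is still significant, so callers should sort lists such as
    finding IDs before hashing.
    """
    canonical = json.dumps(
        verdict,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def is_byte_identical(original: dict, replayed: dict) -> bool:
    """Two verdicts are byte-identical when their canonical digests match."""
    return verdict_digest(original) == verdict_digest(replayed)
```

Comparing digests rather than raw objects keeps the check independent of key ordering in memory while still flagging any change in value, however small.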

Implementation Details

  • ReplayEngine: src/Policy/__Libraries/StellaOps.Policy/Replay/ReplayEngine.cs (sealed class implementing IReplayEngine)
    • ReplayAsync(ReplayRequest) replays policy evaluation with frozen snapshot inputs
    • Pipeline: load snapshot -> resolve frozen inputs -> execute with frozen inputs -> compare with original -> generate delta report
    • Uses ISnapshotService for snapshot loading, IKnowledgeSourceResolver for input resolution, IVerdictComparer for comparison
    • Returns ReplayResult with MatchStatus, ReplayedVerdict, OriginalVerdict, DeltaReport, Duration
  • ReplayRequest: src/Policy/__Libraries/StellaOps.Policy/Replay/ReplayRequest.cs (sealed record)
    • ArtifactDigest, SnapshotId, OriginalVerdictId (optional, for comparison)
    • ReplayOptions: CompareWithOriginal (default true), AllowNetworkFetch (default false), GenerateDetailedReport (default true), ScoreTolerance (default 0.001)
  • ReplayResult: src/Policy/__Libraries/StellaOps.Policy/Replay/ReplayResult.cs (sealed record)
    • ReplayMatchStatus: ExactMatch, MatchWithinTolerance, Mismatch, NoComparison, ReplayFailed
    • ReplayedVerdict: VerdictId, ArtifactDigest, Decision (Pass/Fail/PassWithExceptions/Indeterminate), Score, FindingIds, KnowledgeSnapshotId
    • ReplayDeltaReport: Summary, FieldDeltas (FieldName, OriginalValue, ReplayedValue), FindingDeltas (FindingId, DeltaType, Description), SuspectedCauses
  • VerdictComparer: src/Policy/__Libraries/StellaOps.Policy/Replay/VerdictComparer.cs -- compares replayed vs original verdicts with tolerance
  • ReplayReport: src/Policy/__Libraries/StellaOps.Policy/Replay/ReplayReport.cs -- detailed replay report generation
  • KnowledgeSourceResolver: src/Policy/__Libraries/StellaOps.Policy/Replay/KnowledgeSourceResolver.cs -- resolves snapshot sources to frozen data
  • KnowledgeSnapshotManifest: src/Policy/__Libraries/StellaOps.Policy/Snapshots/KnowledgeSnapshotManifest.cs -- content-addressed snapshot used as replay input
  • SnapshotAwarePolicyEvaluator: src/Policy/__Libraries/StellaOps.Policy/Snapshots/SnapshotAwarePolicyEvaluator.cs -- evaluates using pinned snapshot inputs
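
The pipeline and status values listed above can be sketched as follows. This is a minimal Python illustration, not the C# implementation: the `replay` and `compare` functions, the in-memory snapshot dictionary, and the rule that any difference in decision or finding IDs counts as a mismatch are assumptions. The option defaults and status names are taken from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ReplayMatchStatus(Enum):
    EXACT_MATCH = "ExactMatch"
    MATCH_WITHIN_TOLERANCE = "MatchWithinTolerance"
    MISMATCH = "Mismatch"
    NO_COMPARISON = "NoComparison"
    REPLAY_FAILED = "ReplayFailed"


@dataclass(frozen=True)
class Verdict:
    decision: str
    score: float
    finding_ids: frozenset


@dataclass(frozen=True)
class ReplayOptions:
    # Defaults mirror the documented ReplayOptions values.
    compare_with_original: bool = True
    allow_network_fetch: bool = False
    generate_detailed_report: bool = True
    score_tolerance: float = 0.001


def compare(original: Verdict, replayed: Verdict, tolerance: float) -> ReplayMatchStatus:
    """Assumed comparison rule: only the score may drift within tolerance."""
    if original.decision != replayed.decision:
        return ReplayMatchStatus.MISMATCH
    if original.finding_ids != replayed.finding_ids:
        return ReplayMatchStatus.MISMATCH
    if original.score == replayed.score:
        return ReplayMatchStatus.EXACT_MATCH
    if abs(original.score - replayed.score) <= tolerance:
        return ReplayMatchStatus.MATCH_WITHIN_TOLERANCE
    return ReplayMatchStatus.MISMATCH


def replay(
    snapshots: dict,
    snapshot_id: str,
    evaluate: Callable[[dict], Verdict],
    original: Optional[Verdict],
    options: ReplayOptions = ReplayOptions(),
):
    """Load snapshot -> evaluate with frozen inputs -> compare with original."""
    snapshot = snapshots.get(snapshot_id)
    if snapshot is None:
        # A missing snapshot cannot be replayed at all.
        return ReplayMatchStatus.REPLAY_FAILED, None
    replayed = evaluate(snapshot)
    if not options.compare_with_original or original is None:
        return ReplayMatchStatus.NO_COMPARISON, replayed
    return compare(original, replayed, options.score_tolerance), replayed
```

The sketch leaves out input resolution and delta report generation; in the real engine those are handled by IKnowledgeSourceResolver and the report types listed above.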

E2E Test Plan

  • Replay verdict with same snapshot; verify MatchStatus=ExactMatch (deterministic evaluation)
  • Replay verdict with OriginalVerdictId; verify OriginalVerdict is loaded and compared
  • Replay with CompareWithOriginal=false; verify MatchStatus=NoComparison, no OriginalVerdict
  • Replay with missing snapshot; verify MatchStatus=ReplayFailed, DeltaReport contains error
  • Replay with incomplete snapshot (missing source); verify ReplayFailed with missing source names
  • Replay with score difference within ScoreTolerance=0.001; verify MatchStatus=MatchWithinTolerance
  • Replay with score difference exceeding tolerance; verify MatchStatus=Mismatch with FieldDeltas
  • Replay with finding difference; verify DeltaReport.FindingDeltas contains Added/Removed/Modified entries
  • Replay with GenerateDetailedReport=true; verify DeltaReport.SuspectedCauses is populated
  • Verify AllowNetworkFetch=false prevents network access for missing sources
  • Verify replay Duration is recorded in result
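
The finding-difference case in the plan expects Added, Removed, and Modified entries in DeltaReport.FindingDeltas. A minimal Python sketch of how such entries could be derived is shown below. The `diff_findings` function and the per-finding fingerprint it compares are hypothetical; the document only states that each delta carries FindingId, DeltaType, and Description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FindingDelta:
    finding_id: str
    delta_type: str  # "Added", "Removed", or "Modified"
    description: str


def diff_findings(original: dict, replayed: dict) -> list:
    """Classify finding differences between two verdicts.

    Both arguments map a finding ID to a fingerprint of that finding
    (a hypothetical value, for example a hash of its severity and status).
    IDs are visited in sorted order so the output itself is deterministic.
    """
    deltas = []
    for fid in sorted(original.keys() | replayed.keys()):
        if fid not in original:
            deltas.append(FindingDelta(fid, "Added", "present only in replayed verdict"))
        elif fid not in replayed:
            deltas.append(FindingDelta(fid, "Removed", "present only in original verdict"))
        elif original[fid] != replayed[fid]:
            deltas.append(FindingDelta(fid, "Modified", "fingerprint differs between runs"))
    return deltas
```

An E2E test for this case could build two verdicts that differ in one finding of each kind and assert that exactly three deltas come back, one per type.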