Immutable Advisory Feed Snapshots

Module

Replay

Status

PARTIALLY_IMPLEMENTED

Description

The replay infrastructure supports input manifests and determinism tracking, which conceptually align with point-in-time queries. However, a dedicated feed snapshotting system with per-provider immutable blobs and point-in-time advisory resolution is not implemented as described.

What's Implemented

Input Manifest Resolver: src/Replay/StellaOps.Replay.Core/InputManifestResolver.cs -- resolves input manifests that capture the exact inputs (feed data, SBOM, VEX, policy) used for a verdict, enabling replay with identical inputs. This provides partial snapshot functionality by recording what feed data was consumed.
Determinism Verifier: src/Replay/StellaOps.Replay.Core/DeterminismVerifier.cs -- verifies that replaying a verdict with the same inputs produces the same output, which indirectly validates feed data consistency.
Replay Executor: src/Replay/StellaOps.Replay.Core/ReplayExecutor.cs -- executes verdict replay using captured input manifests, consuming the recorded feed data rather than live feeds.
Policy Simulation Input Lock: src/Replay/StellaOps.Replay.Core/PolicySimulationInputLock.cs -- locks policy simulation inputs to prevent mutation during replay, ensuring deterministic execution.
Replay Job Queue: src/Replay/StellaOps.Replay.Core/ReplayJobQueue.cs -- manages replay job scheduling and execution.
Trace Anonymizer: src/Replay/StellaOps.Replay.Anonymization/TraceAnonymizer.cs -- anonymizes replay traces for sharing without exposing sensitive feed data.
Verdict Replay Endpoints: src/Replay/StellaOps.Replay.WebService/VerdictReplayEndpoints.cs -- API endpoints for triggering and querying verdict replays.
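
The idea behind the resolver and verifier above (record the exact inputs, replay them, compare outputs) can be sketched in a few lines. This is an illustrative Python sketch, not the code in DeterminismVerifier.cs: the manifest field names, the evaluate function, and the placeholder advisory ID are all assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Callable, Mapping


def canonical_digest(value: Any) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace)."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


def verify_determinism(
    evaluate: Callable[[Mapping[str, Any]], Any],
    manifest: Mapping[str, Any],
    runs: int = 3,
) -> bool:
    """Replay `evaluate` against the same recorded inputs and compare verdict digests."""
    digests = {canonical_digest(evaluate(manifest)) for _ in range(runs)}
    return len(digests) == 1


# Hypothetical input manifest: field names and values are placeholders.
manifest = {
    "feeds": {"nvd": [{"id": "CVE-0000-0001", "severity": "high"}]},
    "sbom": ["pkg:npm/example@1.0.0"],
    "policy": {"block_on": ["critical", "high"]},
}


def evaluate(inputs: Mapping[str, Any]) -> Any:
    """Toy verdict: list advisories whose severity the policy blocks."""
    blocked = sorted(
        adv["id"]
        for advisories in inputs["feeds"].values()
        for adv in advisories
        if adv["severity"] in inputs["policy"]["block_on"]
    )
    # Binding the verdict to the input digest makes any input drift visible.
    return {"blocked": blocked, "inputs": canonical_digest(inputs)}


deterministic = verify_determinism(evaluate, manifest)
```

Because the verdict embeds the digest of its inputs, two replays agree only if both the evaluation logic and the recorded feed data are unchanged, which is the sense in which determinism checks indirectly validate feed consistency.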

What's Missing

Per-Provider Feed Snapshots: No system exists to capture immutable snapshots of advisory feeds on a per-provider basis (e.g., NVD snapshot at epoch T, GHSA snapshot at epoch T). The input manifest records which feed data was used but does not create addressable, immutable blob snapshots.
Point-in-Time Advisory Resolution: No API exists to query "what was the advisory state for CVE-X at time T?" across all providers. Feed data is consumed in real-time; historical queries require replaying from input manifests.
Feed Snapshot Storage: No dedicated content-addressable storage for feed snapshots (e.g., immutable blobs with digest-based retrieval). Feed data flows through the pipeline but is not persisted as versioned snapshots.
Snapshot Epoch Registry: No registry maps epoch identifiers to feed snapshot digests, so historical feed state cannot be looked up in O(1).
Snapshot Attestation: No attestation mechanism for feed snapshots that proves the snapshot was captured at a specific time and has not been tampered with.
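
To make the missing storage, registry, and point-in-time pieces concrete, here is a minimal in-memory sketch in Python. Every name in it (SnapshotStore, EpochRegistry, capture, advisory_at) is hypothetical, epochs are assumed to be integer timestamps, and a snapshot is assumed to be a JSON object keyed by advisory ID. None of this exists in the repository.

```python
import bisect
import hashlib
import json
from typing import Dict, List, Optional, Tuple


class SnapshotStore:
    """In-memory content-addressable store: blobs are keyed by their own SHA-256 digest."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(blob).hexdigest()
        # Storing identical content twice is a no-op, so a stored blob never changes.
        self._blobs.setdefault(digest, blob)
        return digest

    def get(self, digest: str) -> bytes:
        blob = self._blobs[digest]
        # Re-hash on read so corruption or tampering is detected.
        if "sha256:" + hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError(f"digest mismatch for {digest}")
        return blob


class EpochRegistry:
    """Maps (provider, epoch) to a snapshot digest."""

    def __init__(self) -> None:
        self._by_epoch: Dict[Tuple[str, int], str] = {}
        self._epochs: Dict[str, List[int]] = {}  # sorted epochs per provider

    def register(self, provider: str, epoch: int, digest: str) -> None:
        key = (provider, epoch)
        if self._by_epoch.get(key, digest) != digest:
            raise ValueError(f"epoch {epoch} for {provider} is bound to another digest")
        if key not in self._by_epoch:
            self._by_epoch[key] = digest
            bisect.insort(self._epochs.setdefault(provider, []), epoch)

    def lookup(self, provider: str, epoch: int) -> str:
        # Exact epoch lookup: a single dictionary access.
        return self._by_epoch[(provider, epoch)]

    def latest_at(self, provider: str, at: int) -> Optional[str]:
        # Point-in-time lookup: newest epoch at or before `at`, via binary search.
        epochs = self._epochs.get(provider, [])
        idx = bisect.bisect_right(epochs, at)
        return None if idx == 0 else self._by_epoch[(provider, epochs[idx - 1])]


def capture(store: SnapshotStore, registry: EpochRegistry,
            provider: str, epoch: int, advisories: dict) -> str:
    """Serialize a provider's feed state canonically, store it, and register the epoch."""
    blob = json.dumps(advisories, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = store.put(blob)
    registry.register(provider, epoch, digest)
    return digest


def advisory_at(store: SnapshotStore, registry: EpochRegistry,
                providers: List[str], advisory_id: str, at: int) -> dict:
    """Return each provider's view of one advisory as of time `at`."""
    result = {}
    for provider in providers:
        digest = registry.latest_at(provider, at)
        if digest is None:
            continue  # provider had no snapshot yet at that time
        snapshot = json.loads(store.get(digest))
        if advisory_id in snapshot:
            result[provider] = snapshot[advisory_id]
    return result
```

In this sketch, lookup by exact epoch ID is a dictionary access, while resolving an arbitrary time T to the nearest earlier epoch costs a binary search per provider. A query such as advisory_at(store, registry, ["nvd", "ghsa"], "CVE-X", T) then answers the "advisory state for CVE-X at time T" question without replaying a manifest.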

Implementation Plan

Design a per-provider feed snapshot format (content-addressable blob with provider ID, epoch timestamp, digest)
Implement a snapshot capture service that creates immutable blobs when feed data is ingested, storing them in content-addressable storage
Build a snapshot epoch registry mapping epoch IDs to snapshot digests for all providers
Add a point-in-time advisory resolution API that resolves advisory state by looking up the appropriate epoch snapshot
Add snapshot attestation (signed digest + timestamp) for tamper-evidence
Integrate with the existing InputManifestResolver so replay can reference snapshots by epoch/digest rather than inline data
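
The attestation and manifest-integration steps might look roughly like the Python sketch below. It uses an HMAC from the standard library purely as a stand-in for a real signature scheme. A production design would more likely use asymmetric signatures and a trusted timestamp source, since a self-asserted timestamp only shows what the signer claimed. The attest and verify functions, the statement fields, and the manifest layout are all assumptions.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def _canonical(statement: Dict[str, Any]) -> bytes:
    """Canonical JSON bytes so signer and verifier hash identical input."""
    return json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")


def attest(signing_key: bytes, provider: str, digest: str, captured_at: int) -> Dict[str, Any]:
    """Sign a statement binding a snapshot digest to a provider and capture time."""
    statement = {"provider": provider, "digest": digest, "captured_at": captured_at}
    signature = hmac.new(signing_key, _canonical(statement), hashlib.sha256).hexdigest()
    return {"statement": statement, "signature": signature}


def verify(signing_key: bytes, attestation: Dict[str, Any], blob: bytes) -> bool:
    """Check that the statement is authentic and that `blob` matches the attested digest."""
    statement = attestation["statement"]
    expected = hmac.new(signing_key, _canonical(statement), hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, attestation["signature"])
    content_ok = statement["digest"] == "sha256:" + hashlib.sha256(blob).hexdigest()
    return signature_ok and content_ok


# Hypothetical snapshot blob and the attestation issued when it was captured.
blob = b'{"CVE-X":{"severity":"high"}}'
digest = "sha256:" + hashlib.sha256(blob).hexdigest()
attestation = attest(b"demo-key", "nvd", digest, captured_at=100)

# Hypothetical manifest entry: replay references the snapshot by epoch and digest
# instead of carrying the feed data inline.
manifest = {"feeds": {"nvd": {"epoch": 100, "digest": digest}}}
```

With a manifest of this shape, a replay would fetch the blob by digest, verify it against the attestation, and only then feed it to the executor, so a modified snapshot or an altered capture time fails verification before evaluation starts.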

Replay infrastructure: src/Replay/StellaOps.Replay.Core/
Feed ingestion (Concelier): src/Concelier/
Feed processing (Excititor): src/Excititor/
Determinism testing: src/__Tests/__Libraries/StellaOps.Testing.Determinism/
