Immutable Advisory Feed Snapshots

Module

Replay

Status

IMPLEMENTED

Description

The replay infrastructure supports input manifests and determinism tracking, which conceptually align with point-in-time query capability. However, a dedicated feed snapshotting system with per-provider immutable blobs and point-in-time advisory resolution is not implemented as described.

What's Implemented

  • Input Manifest Resolver: src/Replay/StellaOps.Replay.Core/InputManifestResolver.cs -- resolves input manifests that capture the exact inputs (feed data, SBOM, VEX, policy) used for a verdict, enabling replay with identical inputs. This provides partial snapshot functionality by recording what feed data was consumed.
  • Determinism Verifier: src/Replay/StellaOps.Replay.Core/DeterminismVerifier.cs -- verifies that replaying a verdict with the same inputs produces the same output, which indirectly validates feed data consistency.
  • Replay Executor: src/Replay/StellaOps.Replay.Core/ReplayExecutor.cs -- executes verdict replay using captured input manifests, consuming the recorded feed data rather than live feeds.
  • Policy Simulation Input Lock: src/Replay/StellaOps.Replay.Core/PolicySimulationInputLock.cs -- locks policy simulation inputs to prevent mutation during replay, ensuring deterministic execution.
  • Replay Job Queue: src/Replay/StellaOps.Replay.Core/ReplayJobQueue.cs -- manages replay job scheduling and execution.
  • Trace Anonymizer: src/Replay/StellaOps.Replay.Anonymization/TraceAnonymizer.cs -- anonymizes replay traces for sharing without exposing sensitive feed data.
  • Verdict Replay Endpoints: src/Replay/StellaOps.Replay.WebService/VerdictReplayEndpoints.cs -- API endpoints for triggering and querying verdict replays.
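
The relationship between an input manifest and a determinism check can be illustrated with a minimal Python sketch. It is not the code in StellaOps.Replay.Core: the function names, the manifest layout, the toy `evaluate` logic, and the sample CVE identifiers used with it are all assumptions made for illustration. The idea shown is only that a manifest records a digest per input category, and that a replay counts as deterministic when the inputs still match the manifest and re-evaluation yields the same verdict.

```python
import hashlib
import json


def canonical_digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(encoded).hexdigest()


def build_input_manifest(feed_data, sbom, vex, policy) -> dict:
    """Record one digest per input category used for a verdict (hypothetical layout)."""
    return {
        "feed": canonical_digest(feed_data),
        "sbom": canonical_digest(sbom),
        "vex": canonical_digest(vex),
        "policy": canonical_digest(policy),
    }


def evaluate(feed_data, sbom, vex, policy) -> dict:
    """Toy verdict: a pure function of its inputs (illustrative logic only)."""
    affected = sorted(
        cve
        for cve, packages in feed_data.items()
        if any(p in sbom["packages"] for p in packages)
        and cve not in vex["not_affected"]
    )
    return {"affected": affected, "pass": len(affected) <= policy["max_findings"]}


def verify_determinism(inputs: dict, recorded_manifest: dict, recorded_verdict: dict) -> bool:
    """True only if the inputs match the manifest and replay reproduces the verdict."""
    if build_input_manifest(**inputs) != recorded_manifest:
        return False  # inputs drifted since the verdict was recorded
    return canonical_digest(evaluate(**inputs)) == canonical_digest(recorded_verdict)
```

In this sketch a change to any input (for example, a feed entry added after the verdict was recorded) alters the manifest digest, so the replay is rejected before evaluation is attempted.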

What's Missing

  • Per-Provider Feed Snapshots: No system exists to capture immutable snapshots of advisory feeds on a per-provider basis (e.g., NVD snapshot at epoch T, GHSA snapshot at epoch T). The input manifest records which feed data was used but does not create addressable, immutable blob snapshots.
  • Point-in-Time Advisory Resolution: No API exists to query "what was the advisory state for CVE-X at time T?" across all providers. Feed data is consumed in real-time; historical queries require replaying from input manifests.
  • Feed Snapshot Storage: No dedicated content-addressable storage for feed snapshots (e.g., immutable blobs with digest-based retrieval). Feed data flows through the pipeline but is not persisted as versioned snapshots.
  • Snapshot Epoch Registry: No registry maps epoch identifiers to feed snapshot digests, so there is no O(1) lookup of historical feed state.
  • Snapshot Attestation: No attestation mechanism for feed snapshots that proves the snapshot was captured at a specific time and has not been tampered with.
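
To make the missing pieces concrete, the following Python sketch combines a content-addressable snapshot store, an epoch registry, and a point-in-time query. It is an in-memory illustration under stated assumptions, not a proposed design: `SnapshotStore`, `EpochRegistry`, `capture`, `advisory_state_at`, the integer epochs, and the snapshot JSON layout are all hypothetical names and shapes that do not exist in the codebase.

```python
import bisect
import hashlib
import json


def _digest(blob: bytes) -> str:
    return "sha256:" + hashlib.sha256(blob).hexdigest()


class SnapshotStore:
    """Content-addressable store: each blob is keyed by its own SHA-256 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, blob: bytes) -> str:
        digest = _digest(blob)
        self._blobs.setdefault(digest, blob)  # identical bytes are stored once
        return digest

    def get(self, digest: str) -> bytes:
        blob = self._blobs[digest]
        if _digest(blob) != digest:  # re-hash on read to detect corruption
            raise ValueError("blob does not match its digest")
        return blob


class EpochRegistry:
    """Maps (provider, epoch) to a snapshot digest; entries are write-once."""

    def __init__(self):
        self._digests = {}  # (provider, epoch) -> digest
        self._epochs = {}   # provider -> sorted list of epochs

    def register(self, provider: str, epoch: int, digest: str) -> None:
        if (provider, epoch) in self._digests:
            raise ValueError("epoch already registered for this provider")
        self._digests[(provider, epoch)] = digest
        bisect.insort(self._epochs.setdefault(provider, []), epoch)

    def providers(self):
        return sorted(self._epochs)

    def lookup(self, provider: str, epoch: int) -> str:
        return self._digests[(provider, epoch)]  # average O(1) dict lookup

    def latest_at(self, provider: str, when: int):
        """Most recent epoch at or before `when`, or None if there is none."""
        epochs = self._epochs.get(provider, [])
        i = bisect.bisect_right(epochs, when)
        return epochs[i - 1] if i else None


def capture(store, registry, provider: str, epoch: int, advisories: dict) -> str:
    """Serialize one provider's advisories canonically, store, and register them."""
    blob = json.dumps(
        {"provider": provider, "epoch": epoch, "advisories": advisories},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    digest = store.put(blob)
    registry.register(provider, epoch, digest)
    return digest


def advisory_state_at(store, registry, cve_id: str, when: int) -> dict:
    """Each provider's view of `cve_id` as of time `when`."""
    result = {}
    for provider in registry.providers():
        epoch = registry.latest_at(provider, when)
        if epoch is None:
            continue  # provider had no snapshot yet at that time
        snapshot = json.loads(store.get(registry.lookup(provider, epoch)))
        if cve_id in snapshot["advisories"]:
            result[provider] = snapshot["advisories"][cve_id]
    return result
```

Looking up a known epoch is a single dictionary access, while finding the latest epoch at or before an arbitrary time uses a binary search over that provider's sorted epochs. A real system would also need durable storage and a retention policy, which this sketch leaves out.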

Implementation Plan

  • Design a per-provider feed snapshot format (content-addressable blob with provider ID, epoch timestamp, digest)
  • Implement a snapshot capture service that creates immutable blobs when feed data is ingested, storing them in content-addressable storage
  • Build a snapshot epoch registry mapping epoch IDs to snapshot digests for all providers
  • Add a point-in-time advisory resolution API that resolves advisory state by looking up the appropriate epoch snapshot
  • Add snapshot attestation (signed digest + timestamp) for tamper-evidence
  • Integrate with the existing InputManifestResolver so replay can reference snapshots by epoch/digest rather than inline data

Related Code

  • Replay infrastructure: src/Replay/StellaOps.Replay.Core/
  • Feed ingestion (Concelier): src/Concelier/
  • Feed processing (Excititor): src/Excititor/
  • Determinism testing: src/__Tests/__Libraries/StellaOps.Testing.Determinism/
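
The attestation step in the implementation plan ("signed digest + timestamp") might look roughly like the Python sketch below. It substitutes an HMAC from the standard library for a real signature scheme so the example stays self-contained. A production attestation would more likely use asymmetric signatures and a trusted timestamp source, since a timestamp asserted by the signer alone does not prove when the snapshot was captured. The function names and statement fields are assumptions.

```python
import hashlib
import hmac
import json


def _payload(statement: dict) -> bytes:
    """Canonical bytes of the statement, so signer and verifier hash the same input."""
    return json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")


def attest_snapshot(key: bytes, provider: str, epoch: int, digest: str) -> dict:
    """Bind provider, epoch timestamp, and snapshot digest under one MAC.

    HMAC-SHA256 stands in for a real signature here (assumption for illustration).
    """
    statement = {"provider": provider, "epoch": epoch, "digest": digest}
    signature = hmac.new(key, _payload(statement), hashlib.sha256).hexdigest()
    return {"statement": statement, "signature": signature}


def verify_attestation(key: bytes, attestation: dict, blob: bytes) -> bool:
    """Check the MAC over the statement, then check the blob against the digest."""
    statement = attestation["statement"]
    expected = hmac.new(key, _payload(statement), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, attestation["signature"]):
        return False  # statement was altered or signed with a different key
    return "sha256:" + hashlib.sha256(blob).hexdigest() == statement["digest"]
```

With attestations of this shape, an input manifest could reference feed data by provider, epoch, and digest instead of carrying it inline, and replay could verify each referenced blob before using it.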