Below are internal guidelines for Stella Ops Product Managers and Development Managers for the capability: **Knowledge Snapshots / Time‑Travel Replay**. This is written as an implementable operating standard (not a concept note).

---

# Knowledge Snapshots / Time‑Travel Replay

## Product and Engineering Guidelines for Stella Ops

## 1) Purpose and value proposition

### What this capability must achieve

Enable Stella Ops to **reproduce any historical risk decision** (scan result, policy evaluation, verdict) **deterministically**, using a **cryptographically bound snapshot** of the exact knowledge inputs that were available at the time the decision was made.

### Why customers pay for it

This capability is primarily purchased for:

* **Auditability**: “Show me what you knew, when you knew it, and why the system decided pass/fail.”
* **Incident response**: reproduce prior posture using historical feeds/VEX/policies and explain deltas.
* **Air‑gapped / regulated environments**: deterministic, offline decisioning with attested knowledge state.
* **Change control**: prove whether a decision changed due to code change vs knowledge change.

### Core product promise

For a given artifact and snapshot:

* **Same inputs → same outputs** (verdict, scores, findings, evidence pointers), or Stella Ops must clearly declare the precise exceptions.

---

## 2) Definitions (PMs and engineers must align on these)

### Knowledge input

Any external or semi-external information that can influence the outcome:

* vulnerability databases and advisories (any source)
* exploit-intel signals
* VEX statements (OpenVEX, CSAF, CycloneDX VEX, etc.)
* SBOM ingestion logic and parsing rules
* package identification rules (including distro/backport logic)
* policy content and policy engine version
* scoring rules (including weights and thresholds)
* trust anchors and signature verification policy
* plugin versions and enabled capabilities
* configuration defaults and overrides that change analysis

### Knowledge Snapshot

A **sealed record** of:

1. **References** (which inputs were used), and
2. **Content** (the exact bytes used), and
3. **Execution contract** (the evaluator and ruleset versions)

### Time‑Travel Replay

Re-running evaluation of an artifact **using only** the snapshot content and the recorded execution contract, producing the same decision and explainability artifacts.

---

## 3) Product principles (non‑negotiables)

1. **Determinism is a product requirement**, not an engineering detail.
2. **Snapshots are first‑class artifacts** with explicit lifecycle (create, verify, export/import, retain, expire).
3. **The snapshot is cryptographically bound** to outcomes and evidence (tamper-evident chain).
4. **Replays must be possible offline** (when the snapshot includes content) and must fail clearly when not possible.
5. **Minimal surprise**: the UI must explain when a verdict changed due to “knowledge drift” vs “artifact drift.”
6. **Scalability by content addressing**: the platform must deduplicate knowledge content aggressively (see the sketch after this list).
7. **Backward compatibility**: old snapshots must remain replayable within a documented support window.
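To make principles 3 and 6 concrete: a content-addressed blob store keys every knowledge blob by its digest, so identical feed bytes are stored once no matter how many snapshots reference them, and any later tampering is detectable on read. The Python sketch below is a minimal illustration only; the `KnowledgeBlobStore` name, on-disk layout, and digest prefix are assumptions for this example, not a committed design.

```python
import hashlib
from pathlib import Path

class KnowledgeBlobStore:
    """Minimal content-addressed store sketch: blobs are written once, keyed by digest."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, content: bytes) -> str:
        """Store content and return its digest; identical bytes are deduplicated."""
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        path = self.root / digest.replace(":", "-")
        if not path.exists():  # dedup: a second write of the same bytes is a no-op
            path.write_bytes(content)
        return digest

    def get(self, digest: str) -> bytes:
        """Read a blob and re-verify its digest, making tampering evident on access."""
        data = (self.root / digest.replace(":", "-")).read_bytes()
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed integrity check")
        return data

# Usage: the same advisory bytes referenced by many snapshots occupy storage once.
store = KnowledgeBlobStore(Path("/tmp/stella-blob-demo"))
d1 = store.put(b'{"cve": "CVE-2025-0001", "severity": "high"}')
d2 = store.put(b'{"cve": "CVE-2025-0001", "severity": "high"}')
assert d1 == d2
```

Retention and garbage collection (section 12) can then operate on reference counts over these digests rather than on whole snapshots.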
---

## 4) Scope boundaries (what this is not)

### Non-goals (explicitly out of scope for v1 unless approved)

* Reconstructing *external internet state* beyond what is recorded (no “fetch historical CVE state from the web”).
* Guaranteeing replay across major engine rewrites without a compatibility plan.
* Storing sensitive proprietary customer code in snapshots (unless explicitly enabled).
* Replaying “live runtime signals” unless those signals were captured into the snapshot at decision time.

---

## 5) Personas and use cases (PM guidance)

### Primary personas

* **Security Governance / GRC**: needs audit packs, controls evidence, deterministic history.
* **Incident response / AppSec lead**: needs “what changed and why” quickly.
* **Platform engineering / DevOps**: needs reproducible CI gates and air‑gap workflows.
* **Procurement / regulated customers**: needs proof of process and defensible attestations.

### Must-support use cases

1. **Replay a past release gate decision** in a new environment (including offline) and get an identical outcome.
2. **Explain drift**: “This build fails today but passed last month—why?”
3. **Air‑gap export/import**: create snapshots in a connected environment, import them into a disconnected one.
4. **Audit bundle generation**: export snapshot + verdict(s) + evidence pointers.

---

## 6) Functional requirements (PM “must/should” list)

### Must

* **Snapshot creation** for every material evaluation (or for every “decision object” chosen by configuration).
* **Snapshot manifest** containing:
  * unique snapshot ID (content-addressed)
  * list of knowledge sources with hashes/digests
  * policy IDs and exact policy content hashes
  * engine version and plugin versions
  * timestamp and clock source metadata
  * trust anchor set hash and verification policy hash
* **Snapshot sealing**:
  * snapshot manifest is signed
  * signed link from verdict → snapshot ID
* **Replay**:
  * re-evaluate using only snapshot inputs
  * output must match prior results (or emit a deterministic mismatch report)
* **Export/import**:
  * portable bundle format
  * import verifies integrity and signatures before allowing use
* **Retention controls**:
  * configurable retention windows and storage quotas
  * deduplication and garbage collection

### Should

* **Partial snapshots** (reference-only) vs **full snapshots** (content included), with explicit replay guarantees.
* **Diff views**: compare two snapshots and highlight what knowledge changed.
* **Multi-snapshot replay**: run “as-of snapshot A” and “as-of snapshot B” to show drift impact.

### Could

* Snapshot “federation” for large orgs (mirrors/replication with policy controls).
* Snapshot “pinning” to releases or environments as a governance policy.

---

## 7) UX and workflow guidelines (PM + Eng)

### UI must communicate three states clearly

1. **Reproducible offline**: snapshot includes all required content.
2. **Reproducible with access**: snapshot references external sources that must be available.
3. **Not reproducible**: missing content or unsupported evaluator version.
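As one illustration of how these three states could be computed, the sketch below classifies a manifest by checking the recorded engine version and whether every referenced blob digest is available locally. The manifest field names mirror the illustrative manifest in section 8; `local_digests` and `supported_engines` are assumed inputs for the example, not an existing Stella Ops API.

```python
from enum import Enum

class ReplayReadiness(Enum):
    REPRODUCIBLE_OFFLINE = "reproducible_offline"          # all required content is in the snapshot
    REPRODUCIBLE_WITH_ACCESS = "reproducible_with_access"  # referenced sources must be reachable
    NOT_REPRODUCIBLE = "not_reproducible"                  # missing content or unsupported evaluator

def classify_replay_readiness(manifest: dict,
                              local_digests: set,
                              supported_engines: set) -> ReplayReadiness:
    """Map a snapshot manifest onto the three UI states."""
    engine = f'{manifest["engine"]["name"]}@{manifest["engine"]["version"]}'
    if engine not in supported_engines:
        return ReplayReadiness.NOT_REPRODUCIBLE

    missing = [s for s in manifest["sources"] if s["content_digest"] not in local_digests]
    if not missing:
        return ReplayReadiness.REPRODUCIBLE_OFFLINE
    if all("origin" in s for s in missing):
        # Missing blobs can be re-fetched from their recorded origin and re-verified by digest.
        return ReplayReadiness.REPRODUCIBLE_WITH_ACCESS
    return ReplayReadiness.NOT_REPRODUCIBLE

# Example: an engine that is no longer supported makes the snapshot not reproducible.
manifest = {"engine": {"name": "stella-evaluator", "version": "0.9.0"}, "sources": []}
state = classify_replay_readiness(manifest, set(), {"stella-evaluator@1.7.0"})
assert state is ReplayReadiness.NOT_REPRODUCIBLE
```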
### Required UI objects

* **Snapshot Details page**
  * snapshot ID and signature status
  * list of knowledge sources (name, version/epoch, digest, size)
  * policy bundle version, scoring rules version
  * trust anchors + verification policy digest
  * replay status: “verified reproducible / reproducible / not reproducible”
* **Verdict page**
  * links to snapshot(s)
  * “replay now” action
  * “compare to latest knowledge” action

### UX guardrails

* Never show “pass/fail” without also showing:
  * snapshot ID
  * policy ID/version
  * verification status
* When results differ on replay, show:
  * exact mismatch class (engine mismatch, missing data, nondeterminism, corrupted snapshot)
  * what input changed (if known)
  * remediation steps

---

## 8) Data model and format guidelines (Development Managers)

### Canonical objects (recommended minimum set)

* **KnowledgeSnapshotManifest (KSM)**
* **KnowledgeBlob** (content-addressed bytes)
* **KnowledgeSourceDescriptor**
* **PolicyBundle**
* **TrustBundle**
* **Verdict** (signed decision artifact)
* **ReplayReport** (records replay result and mismatches)

### Content addressing

* Use a stable hash (e.g., SHA‑256) for:
  * each knowledge blob
  * manifest
  * policy bundle
  * trust bundle
* The snapshot ID should be derived from the manifest digest.

### Example manifest shape (illustrative)

```json
{
  "snapshot_id": "ksm:sha256:…",
  "created_at": "2025-12-19T10:15:30Z",
  "engine": { "name": "stella-evaluator", "version": "1.7.0", "build": "…" },
  "plugins": [
    { "name": "pkg-id", "version": "2.3.1", "digest": "sha256:…" }
  ],
  "policy": { "bundle_id": "pol:sha256:…", "digest": "sha256:…" },
  "scoring": { "ruleset_id": "score:sha256:…", "digest": "sha256:…" },
  "trust": { "bundle_id": "trust:sha256:…", "digest": "sha256:…" },
  "sources": [
    {
      "name": "nvd",
      "epoch": "2025-12-18",
      "kind": "vuln_feed",
      "content_digest": "sha256:…",
      "licenses": ["…"],
      "origin": { "uri": "…", "retrieved_at": "…" }
    },
    { "name": "customer-vex", "kind": "vex", "content_digest": "sha256:…" }
  ],
  "environment": {
    "determinism_profile": "strict",
    "timezone": "UTC",
    "normalization": { "line_endings": "LF", "sort_order": "canonical" }
  }
}
```

### Versioning rules

* Every object is immutable once written.
* Changes create new digests; never mutate in place.
* Support schema evolution via:
  * `schema_version`
  * strict validation + migration tooling
* Keep manifests small; store large data as blobs.

---

## 9) Determinism contract (Engineering must enforce)

### Determinism requirements

* Stable ordering: sort inputs and outputs canonically.
* Stable timestamps: timestamps may exist but must not change the computed scores or verdict.
* Stable randomization: avoid RNG use; where unavoidable, record a fixed seed in the snapshot.
* Stable parsers: parser versions are pinned by digest; parsing must be deterministic.

### Allowed nondeterminism (if any) must be explicit

If you must allow nondeterminism, it must be:

* documented,
* surfaced in UI,
* included in the replay report as a “non-deterministic factor,”
* and excluded from the signed decision if it affects pass/fail.
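One way to satisfy both the content-addressing rule in section 8 (snapshot ID derived from the manifest digest) and the stable-ordering requirement above is to canonicalize the manifest before hashing. The sketch below uses sorted keys and fixed separators; a production implementation would more likely adopt an established canonicalization such as RFC 8785 (JCS), so treat the function names here as illustrative assumptions.

```python
import hashlib
import json

def canonical_manifest_bytes(manifest: dict) -> bytes:
    """Serialize a manifest deterministically: sorted keys, fixed separators, UTF-8."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def derive_snapshot_id(manifest: dict) -> str:
    """Derive the content-addressed snapshot ID from the canonical manifest digest."""
    body = {k: v for k, v in manifest.items() if k != "snapshot_id"}  # the ID cannot hash itself
    digest = hashlib.sha256(canonical_manifest_bytes(body)).hexdigest()
    return f"ksm:sha256:{digest}"

# The same logical manifest yields the same ID regardless of key order.
m1 = {"engine": {"name": "stella-evaluator", "version": "1.7.0"}, "sources": []}
m2 = {"sources": [], "engine": {"version": "1.7.0", "name": "stella-evaluator"}}
assert derive_snapshot_id(m1) == derive_snapshot_id(m2)
```

Whatever scheme is chosen, the hashed bytes and the signed bytes must be the same canonical form; otherwise the verdict → snapshot binding becomes ambiguous.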
---

## 10) Security model (Development Managers)

### Threats this feature must address

* Feed poisoning (tampered vulnerability data)
* Time-of-check/time-of-use drift (same artifact evaluated against moving feeds)
* Replay manipulation (swap snapshot content)
* “Policy drift hiding” (claiming old decision used different policies)
* Signature bypass (trust anchors altered)

### Controls required

* Sign manifests and verdicts.
* Bind verdict → snapshot ID → policy bundle hash → trust bundle hash.
* Verify on every import and on every replay invocation.
* Audit log:
  * snapshot created
  * snapshot imported
  * replay executed
  * verification failures

### Key handling

* Decide and document:
  * who signs snapshots/verdicts (service keys vs tenant keys)
  * rotation policy
  * revocation/compromise handling
* Avoid designing cryptography from scratch; use well-established signing formats and separation of duties.

---

## 11) Offline / air‑gapped requirements

### Snapshot levels (PM packaging guideline)

Offer explicit snapshot types with clear guarantees:

* **Level A: Reference-only snapshot**
  * stores hashes + source descriptors
  * replay requires access to original sources
* **Level B: Portable snapshot**
  * includes blobs necessary for replay
  * replay works offline
* **Level C: Sealed portable snapshot**
  * portable + signed + includes trust anchors
  * replay works offline and can be verified independently

Do not market air‑gap support without specifying which level is provided.

---

## 12) Performance and storage guidelines

### Principles

* Content-address knowledge blobs to maximize deduplication.
* Separate “hot” knowledge (recent epochs) from cold storage.
* Support snapshot compaction and garbage collection.

### Operational requirements

* Retention policies per tenant/project/environment.
* Quotas and alerting when snapshot storage approaches limits.
* Export bundles should be chunked/streamable for large feeds.

---

## 13) Testing and acceptance criteria

### Required test categories

1. **Golden replay tests** (see the sketch after this list)
   * same artifact + same snapshot → identical outputs
2. **Corruption tests**
   * bit flips in blobs/manifests are detected and rejected
3. **Version skew tests**
   * old snapshot + new engine should either replay deterministically or fail with a clear incompatibility report
4. **Air‑gap tests**
   * export → import → replay without network access
5. **Diff accuracy tests**
   * compare snapshots and ensure the diff identifies actual knowledge changes, not noise
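A golden replay test can be as simple as the pytest sketch below: replay each corpus case against its recorded snapshot and require byte-for-byte equality of the canonicalized verdict, never a fuzzy comparison. The `replay_harness` module, its `load_snapshot` and `replay_artifact` helpers, and the corpus layout are hypothetical pieces for illustration, not an existing Stella Ops interface.

```python
# test_golden_replay.py: illustrative sketch; the harness module and corpus layout
# below are hypothetical, not an existing Stella Ops API.
import json
from pathlib import Path

import pytest

from replay_harness import load_snapshot, replay_artifact  # hypothetical test harness

CORPUS = sorted(p for p in Path("tests/golden").glob("*") if p.is_dir())

def canonical(verdict: dict) -> bytes:
    """Canonical comparison form, mirroring the determinism contract in section 9."""
    return json.dumps(verdict, sort_keys=True, separators=(",", ":")).encode("utf-8")

@pytest.mark.parametrize("case", CORPUS, ids=lambda p: p.name)
def test_same_snapshot_same_verdict(case: Path):
    snapshot = load_snapshot(case / "snapshot.bundle")
    expected = json.loads((case / "expected_verdict.json").read_text())
    actual = replay_artifact(case / "artifact.sbom.json", snapshot)
    assert canonical(actual) == canonical(expected), (
        f"replay drifted for corpus case {case.name}; inspect the ReplayReport mismatch class"
    )
```

Corruption and version-skew tests can reuse the same corpus by deliberately flipping a byte in a bundle or replaying against an unsupported engine pin.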
### Definition of Done (DoD) for the feature

* Snapshots are created automatically according to policy.
* Snapshots can be exported and imported with verified integrity.
* Replay produces matching verdicts for a representative corpus.
* UI exposes snapshot provenance and replay status.
* Audit log records snapshot lifecycle events.
* Clear failure modes exist (missing blobs, incompatible engine, signature failure).

---

## 14) Metrics (PM ownership)

Track metrics that prove this is a moat, not a checkbox.

### Core KPIs

* **Replay success rate** (strict determinism)
* **Time to explain drift** (median time from “why changed” to root cause)
* **% verdicts with sealed portable snapshots**
* **Audit effort reduction** (customer-reported or measured via workflow steps)
* **Storage efficiency** (dedup ratio; bytes per snapshot over time)

### Guardrail metrics

* Snapshot creation latency impact on CI
* Snapshot storage growth per tenant
* Verification failure rates

---

## 15) Common failure modes (what to prevent)

1. Treating snapshots as “metadata only” and still claiming replayability.
2. Allowing a “latest feed fetch” during replay (breaks the promise).
3. Not pinning parser/policy/scoring versions, which causes silent drift.
4. Missing clear UX around replay limitations and failure reasons.
5. Overcapturing sensitive inputs (privacy and customer trust risk).
6. Underinvesting in dedup/retention (cost blowups).

---

## 16) Management checklists

### PM checklist (before commitment)

* Precisely define the “replay” guarantee level (A/B/C) for each SKU/environment.
* Define which inputs are in scope (feeds, VEX, policies, trust bundles, plugins).
* Define customer-facing workflows:
  * “replay now”
  * “compare to latest”
  * “export for audit / air-gap”
* Confirm governance outcomes:
  * audit pack integration
  * exception linkage
  * release gate linkage

### Development Manager checklist (before build)

* Establish canonical schemas and a versioning plan.
* Establish content-addressed storage + dedup plan.
* Establish signing and trust anchor strategy.
* Establish deterministic evaluation contract and test harness.
* Establish import/export packaging and verification.
* Establish retention, quotas, and GC.

---

## 17) Minimal phased delivery (recommended)

**Phase 1: Reference snapshot + verdict binding**

* Record source descriptors + hashes, policy/scoring/trust digests.
* Bind the snapshot ID into verdict artifacts.

**Phase 2: Portable snapshots**

* Store knowledge blobs locally with dedup.
* Export/import with integrity verification.

**Phase 3: Sealed portable snapshots + replay tooling**

* Sign snapshots.
* Deterministic replay pipeline + replay report.
* UI surfacing and audit logs.

**Phase 4: Snapshot diff + drift explainability**

* Compare snapshots.
* Attribute decision drift to knowledge changes vs artifact changes.

---

These guidelines can be expanded into a structured PRD (objectives, user stories, functional and non-functional requirements, security/compliance, dependencies, risks, and acceptance tests) suitable for Jira/Linear epics and engineering design review.