Below are internal guidelines for Stella Ops Product Managers and Development Managers for the capability: Knowledge Snapshots / Time-Travel Replay. This is written as an implementable operating standard (not a concept note).


Knowledge Snapshots / Time-Travel Replay

Product and Engineering Guidelines for Stella Ops

1) Purpose and value proposition

What this capability must achieve

Enable Stella Ops to reproduce any historical risk decision (scan result, policy evaluation, verdict) deterministically, using a cryptographically bound snapshot of the exact knowledge inputs that were available at the time the decision was made.

Why customers pay for it

This capability is primarily purchased for:

  • Auditability: “Show me what you knew, when you knew it, and why the system decided pass/fail.”
  • Incident response: reproduce prior posture using historical feeds/VEX/policies and explain deltas.
  • Airgapped / regulated environments: deterministic, offline decisioning with attested knowledge state.
  • Change control: prove whether a decision changed due to code change vs knowledge change.

Core product promise

For a given artifact and snapshot:

  • Same inputs → same outputs (verdict, scores, findings, evidence pointers), or Stella Ops must clearly declare the precise exceptions.

2) Definitions (PMs and engineers must align on these)

Knowledge input

Any external or semi-external information that can influence the outcome:

  • vulnerability databases and advisories (any source)
  • exploit-intel signals
  • VEX statements (OpenVEX, CSAF, CycloneDX VEX, etc.)
  • SBOM ingestion logic and parsing rules
  • package identification rules (including distro/backport logic)
  • policy content and policy engine version
  • scoring rules (including weights and thresholds)
  • trust anchors and signature verification policy
  • plugin versions and enabled capabilities
  • configuration defaults and overrides that change analysis

Knowledge Snapshot

A sealed record of:

  1. References (which inputs were used), and
  2. Content (the exact bytes used), and
  3. Execution contract (the evaluator and ruleset versions)

Time-Travel Replay

Re-running evaluation of an artifact using only the snapshot content and the recorded execution contract, producing the same decision and explainability artifacts.
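
To make this contract concrete, the hypothetical sketch below shows a replay entry point that accepts only the artifact identity and the sealed snapshot; the types and names are illustrative, not the Stella Ops API:

# Minimal sketch of the replay contract; names are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    manifest: dict   # sealed manifest: sources, policy/scoring/trust digests, engine
    blobs: dict      # content-addressed knowledge blobs: digest -> bytes

@dataclass(frozen=True)
class Verdict:
    decision: str          # "pass" or "fail"
    findings_digest: str   # digest of the canonicalized findings
    snapshot_id: str       # binds the decision to the snapshot it was computed from

def replay(artifact_digest: str, snapshot: Snapshot) -> Verdict:
    """Re-evaluate using only snapshot content and the recorded execution
    contract; no network access, no 'latest feed' lookups, deterministic."""
    raise NotImplementedError("evaluator is pinned by snapshot.manifest['engine']")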


3) Product principles (non-negotiables)

  1. Determinism is a product requirement, not an engineering detail.
  2. Snapshots are first-class artifacts with explicit lifecycle (create, verify, export/import, retain, expire).
  3. The snapshot is cryptographically bound to outcomes and evidence (tamper-evident chain).
  4. Replays must be possible offline (when the snapshot includes content) and must fail clearly when not possible.
  5. Minimal surprise: the UI must explain when a verdict changed due to “knowledge drift” vs “artifact drift.”
  6. Scalability by content addressing: the platform must deduplicate knowledge content aggressively.
  7. Backward compatibility: old snapshots must remain replayable within a documented support window.

4) Scope boundaries (what this is not)

Non-goals (explicitly out of scope for v1 unless approved)

  • Reconstructing external internet state beyond what is recorded (no “fetch historical CVE state from the web”).
  • Guaranteeing replay across major engine rewrites without a compatibility plan.
  • Storing sensitive proprietary customer code in snapshots (unless explicitly enabled).
  • Replaying “live runtime signals” unless those signals were captured into the snapshot at decision time.

5) Personas and use cases (PM guidance)

Primary personas

  • Security Governance / GRC: needs audit packs, controls evidence, deterministic history.
  • Incident response / AppSec lead: needs “what changed and why” quickly.
  • Platform engineering / DevOps: needs reproducible CI gates and airgap workflows.
  • Procurement / regulated customers: needs proof of process and defensible attestations.

Must-support use cases

  1. Replay a past release gate decision in a new environment (including offline) and get identical outcome.
  2. Explain drift: “This build fails today but passed last month—why?”
  3. Airgap export/import: create snapshots in connected environment, import to disconnected one.
  4. Audit bundle generation: export snapshot + verdict(s) + evidence pointers.

6) Functional requirements (PM “must/should” list)

Must

  • Snapshot creation for every material evaluation (or for every “decision object” chosen by configuration).

  • Snapshot manifest containing:

    • unique snapshot ID (content-addressed)
    • list of knowledge sources with hashes/digests
    • policy IDs and exact policy content hashes
    • engine version and plugin versions
    • timestamp and clock source metadata
    • trust anchor set hash and verification policy hash
  • Snapshot sealing:

    • snapshot manifest is signed
    • signed link from verdict → snapshot ID (see the sealing sketch after this list)
  • Replay:

    • re-evaluate using only snapshot inputs
    • output must match prior results (or emit a deterministic mismatch report)
  • Export/import:

    • portable bundle format
    • import verifies integrity and signatures before allowing use
  • Retention controls:

    • configurable retention windows and storage quotas
    • deduplication and garbage collection
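
A minimal sketch of the sealing requirement, assuming an Ed25519 signing key is available (key handling is covered in section 10) and using illustrative field names rather than the shipped schema:

# Sketch: seal a verdict by binding it to the snapshot ID and signing the
# canonical bytes. Assumes the 'cryptography' package; names are illustrative.
import hashlib, json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical_bytes(obj: dict) -> bytes:
    # Canonical JSON: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def seal_verdict(decision: str, findings_digest: str, snapshot_id: str,
                 policy_digest: str, key: Ed25519PrivateKey) -> dict:
    verdict = {
        "decision": decision,               # e.g. "pass" or "fail"
        "findings_digest": findings_digest, # digest of canonicalized findings
        "snapshot_id": snapshot_id,         # signed link: verdict -> snapshot
        "policy_digest": policy_digest,     # which policy content produced it
    }
    payload = canonical_bytes(verdict)
    verdict["verdict_id"] = "vrd:sha256:" + hashlib.sha256(payload).hexdigest()
    verdict["signature"] = key.sign(payload).hex()
    return verdict

# Usage (illustrative):
# sealed = seal_verdict("pass", "sha256:…", "ksm:sha256:…", "sha256:…",
#                       Ed25519PrivateKey.generate())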

Should

  • Partial snapshots (reference-only) vs full snapshots (content included), with explicit replay guarantees.
  • Diff views: compare two snapshots and highlight what knowledge changed (sketched after this list).
  • Multi-snapshot replay: run “as-of snapshot A” and “as-of snapshot B” to show drift impact.
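
A minimal sketch of the diff view, assuming manifests shaped like the example in section 8; field names are taken from that example and otherwise illustrative:

# Sketch: diff the knowledge sources of two snapshot manifests by digest.
def diff_sources(manifest_a: dict, manifest_b: dict) -> dict:
    a = {s["name"]: s.get("content_digest") for s in manifest_a.get("sources", [])}
    b = {s["name"]: s.get("content_digest") for s in manifest_b.get("sources", [])}
    return {
        "added":   sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }

# Example output: {"added": ["osv"], "removed": [], "changed": ["nvd"]} would
# mean drift between A and B came from a new source plus a newer nvd epoch.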

Could

  • Snapshot “federation” for large orgs (mirrors/replication with policy controls).
  • Snapshot “pinning” to releases or environments as a governance policy.

7) UX and workflow guidelines (PM + Eng)

UI must communicate three states clearly

  1. Reproducible offline: snapshot includes all required content.
  2. Reproducible with access: snapshot references external sources that must be available.
  3. Not reproducible: missing content or unsupported evaluator version.

Required UI objects

  • Snapshot Details page

    • snapshot ID and signature status
    • list of knowledge sources (name, version/epoch, digest, size)
    • policy bundle version, scoring rules version
    • trust anchors + verification policy digest
    • replay status: “verified reproducible / reproducible / not reproducible”
  • Verdict page

    • links to snapshot(s)
    • “replay now” action
    • “compare to latest knowledge” action

UX guardrails

  • Never show “pass/fail” without also showing:

    • snapshot ID
    • policy ID/version
    • verification status
  • When results differ on replay, show:

    • exact mismatch class (engine mismatch, missing data, nondeterminism, corrupted snapshot)
    • what input changed (if known)
    • remediation steps

8) Data model and format guidelines (Development Managers)

  • KnowledgeSnapshotManifest (KSM)
  • KnowledgeBlob (content-addressed bytes)
  • KnowledgeSourceDescriptor
  • PolicyBundle
  • TrustBundle
  • Verdict (signed decision artifact)
  • ReplayReport (records replay result and mismatches)
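
The following sketch shows one hypothetical way these objects could relate, including the mismatch classes the UI must surface (section 7). Names and fields are illustrative, not a committed schema:

# Sketch of the core objects and their relationships; illustrative only.
from dataclasses import dataclass
from enum import Enum

class MismatchClass(Enum):
    ENGINE_MISMATCH = "engine_mismatch"
    MISSING_DATA = "missing_data"
    NONDETERMINISM = "nondeterminism"
    CORRUPTED_SNAPSHOT = "corrupted_snapshot"

@dataclass(frozen=True)
class KnowledgeSourceDescriptor:
    name: str
    kind: str                 # e.g. "vuln_feed", "vex"
    content_digest: str       # points at a KnowledgeBlob

@dataclass(frozen=True)
class KnowledgeSnapshotManifest:
    snapshot_id: str          # derived from the manifest digest
    engine_version: str
    policy_digest: str
    trust_digest: str
    sources: tuple            # tuple of KnowledgeSourceDescriptor

@dataclass(frozen=True)
class ReplayReport:
    snapshot_id: str
    original_verdict_id: str
    replayed_verdict_id: str
    matched: bool
    mismatches: tuple = ()    # tuple of (MismatchClass, detail) pairs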

Content addressing

  • Use a stable hash (e.g., SHA-256) for:

    • each knowledge blob
    • manifest
    • policy bundle
    • trust bundle
  • Snapshot ID should be derived from manifest digest.
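
A minimal sketch of that derivation, assuming canonical JSON (sorted keys, compact separators, UTF-8) as the hashing convention:

# Sketch: derive a content-addressed snapshot ID from the manifest bytes.
import hashlib, json

def snapshot_id(manifest: dict) -> str:
    # The ID field itself is excluded before hashing (it cannot contain its own digest).
    body = {k: v for k, v in manifest.items() if k != "snapshot_id"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "ksm:sha256:" + hashlib.sha256(canonical).hexdigest()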

Example manifest shape (illustrative)

{
  "snapshot_id": "ksm:sha256:…",
  "created_at": "2025-12-19T10:15:30Z",
  "engine": { "name": "stella-evaluator", "version": "1.7.0", "build": "…"},
  "plugins": [
    { "name": "pkg-id", "version": "2.3.1", "digest": "sha256:…" }
  ],
  "policy": { "bundle_id": "pol:sha256:…", "digest": "sha256:…" },
  "scoring": { "ruleset_id": "score:sha256:…", "digest": "sha256:…" },
  "trust": { "bundle_id": "trust:sha256:…", "digest": "sha256:…" },
  "sources": [
    {
      "name": "nvd",
      "epoch": "2025-12-18",
      "kind": "vuln_feed",
      "content_digest": "sha256:…",
      "licenses": ["…"],
      "origin": { "uri": "…", "retrieved_at": "…" }
    },
    {
      "name": "customer-vex",
      "kind": "vex",
      "content_digest": "sha256:…"
    }
  ],
  "environment": {
    "determinism_profile": "strict",
    "timezone": "UTC",
    "normalization": { "line_endings": "LF", "sort_order": "canonical" }
  }
}

Versioning rules

  • Every object is immutable once written.

  • Changes create new digests; never mutate in place.

  • Support schema evolution via:

    • schema_version
    • strict validation + migration tooling
  • Keep manifests small; store large data as blobs.
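
A minimal sketch of write-once, content-addressed blob storage consistent with these rules; paths and layout are illustrative:

# Sketch: write-once, content-addressed blob store. Existing digests are never
# mutated; re-writing identical content is a no-op (which is also the dedup).
import hashlib
from pathlib import Path

class BlobStore:
    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        path = self.root / digest.replace(":", "_")
        if not path.exists():              # immutable: never overwrite
            path.write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        data = (self.root / digest.replace(":", "_")).read_bytes()
        # Verify on read so corruption is detected, not silently served.
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed integrity check")
        return data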


9) Determinism contract (Engineering must enforce)

Determinism requirements

  • Stable ordering: sort inputs and outputs canonically.
  • Stable timestamps: timestamps may be recorded but must not affect computed scores or the verdict.
  • Stable randomization: no RNG; if unavoidable, fixed seed recorded in snapshot.
  • Stable parsers: parser versions are pinned by digest; parsing must be deterministic.
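
A minimal sketch of what stable ordering and stable timestamps mean in practice: canonicalize findings and exclude volatile fields from the bytes that feed the verdict digest. Field names here are assumptions:

# Sketch: canonicalize findings before hashing so replays compare byte-for-byte.
# Volatile fields (wall-clock timestamps, host names) are excluded from the
# hashed payload; they may still be logged elsewhere.
import hashlib, json

VOLATILE_FIELDS = {"observed_at", "hostname", "scan_duration_ms"}  # illustrative

def canonical_findings(findings: list[dict]) -> bytes:
    cleaned = [
        {k: v for k, v in f.items() if k not in VOLATILE_FIELDS}
        for f in findings
    ]
    # Canonical sort: a stable key independent of discovery order.
    cleaned.sort(key=lambda f: (f.get("component", ""), f.get("vuln_id", "")))
    return json.dumps(cleaned, sort_keys=True, separators=(",", ":")).encode("utf-8")

def findings_digest(findings: list[dict]) -> str:
    return "sha256:" + hashlib.sha256(canonical_findings(findings)).hexdigest()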

Allowed nondeterminism (if any) must be explicit

If you must allow nondeterminism, it must be:

  • documented,
  • surfaced in UI,
  • included in replay report as “non-deterministic factor,”
  • and excluded from the signed decision if it affects pass/fail.

10) Security model (Development Managers)

Threats this feature must address

  • Feed poisoning (tampered vulnerability data)
  • Time-of-check/time-of-use drift (same artifact evaluated against moving feeds)
  • Replay manipulation (swap snapshot content)
  • “Policy drift hiding” (claiming old decision used different policies)
  • Signature bypass (trust anchors altered)

Controls required

  • Sign manifests and verdicts.

  • Bind verdict → snapshot ID → policy bundle hash → trust bundle hash.

  • Verify on every import and on every replay invocation.

  • Audit log:

    • snapshot created
    • snapshot imported
    • replay executed
    • verification failures
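
A minimal sketch of the verify-on-import control, assuming Ed25519 signatures over canonical manifest bytes and the blob digests recorded in the manifest; names are illustrative:

# Sketch: verify a snapshot bundle on import before it can be used.
# Checks (1) the manifest signature and (2) every blob digest in the manifest.
import hashlib, json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_import(manifest: dict, signature: bytes, blobs: dict,
                  trusted_key: Ed25519PublicKey) -> None:
    payload = json.dumps(manifest, sort_keys=True,
                         separators=(",", ":")).encode("utf-8")
    try:
        trusted_key.verify(signature, payload)          # tamper-evident manifest
    except InvalidSignature:
        raise ValueError("manifest signature invalid; refusing import")

    for source in manifest.get("sources", []):
        digest = source["content_digest"]
        data = blobs.get(digest)
        if data is None:
            raise ValueError(f"missing blob for {source['name']} ({digest})")
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob for {source['name']} failed integrity check")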

Key handling

  • Decide and document:

    • who signs snapshots/verdicts (service keys vs tenant keys)
    • rotation policy
    • revocation/compromise handling
  • Avoid designing cryptography from scratch; use well-established signing formats and separation of duties.


11) Offline / airgapped requirements

Snapshot levels (PM packaging guideline)

Offer explicit snapshot types with clear guarantees:

  • Level A: Reference-only snapshot

    • stores hashes + source descriptors
    • replay requires access to original sources
  • Level B: Portable snapshot

    • includes blobs necessary for replay
    • replay works offline
  • Level C: Sealed portable snapshot

    • portable + signed + includes trust anchors
    • replay works offline and can be verified independently
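
A minimal sketch of a Level C export bundle, assuming a tar-based container; the layout and file names are illustrative, not a committed packaging format:

# Sketch: pack a sealed portable snapshot (Level C) into a single tar bundle.
import io, tarfile

def write_bundle(path: str, manifest_bytes: bytes, signature: bytes,
                 trust_anchors: bytes, blobs: dict) -> None:
    def add(tar: tarfile.TarFile, name: str, data: bytes) -> None:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    with tarfile.open(path, "w:gz") as tar:
        add(tar, "manifest.json", manifest_bytes)      # sealed manifest
        add(tar, "manifest.sig", signature)            # detached signature
        add(tar, "trust/anchors.pem", trust_anchors)   # for independent verification
        for digest, data in sorted(blobs.items()):     # content-addressed blobs
            add(tar, f"blobs/{digest.replace(':', '_')}", data)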

Do not market airgap support without specifying which level is provided.


12) Performance and storage guidelines

Principles

  • Content-address knowledge blobs to maximize deduplication.
  • Separate “hot” knowledge (recent epochs) from cold storage.
  • Support snapshot compaction and garbage collection.

Operational requirements

  • Retention policies per tenant/project/environment.
  • Quotas and alerting when snapshot storage approaches limits.
  • Export bundles should be chunked/streamable for large feeds.
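
A minimal sketch of garbage collection under these requirements: a blob is deleted only when no retained manifest references it (simple mark-and-sweep; names are illustrative):

# Sketch: mark-and-sweep garbage collection for knowledge blobs.
# A blob survives as long as at least one retained manifest references it.
def collect_garbage(retained_manifests: list[dict], stored_digests: set[str],
                    delete_blob) -> int:
    referenced = {
        s["content_digest"]
        for m in retained_manifests
        for s in m.get("sources", [])
    }
    removed = 0
    for digest in sorted(stored_digests - referenced):
        delete_blob(digest)     # deletion hook, e.g. a BlobStore method
        removed += 1
    return removed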

13) Testing and acceptance criteria

Required test categories

  1. Golden replay tests

    • same artifact + same snapshot → identical outputs
  2. Corruption tests

    • bit flips in blobs/manifests are detected and rejected
  3. Version skew tests

    • old snapshot + new engine should either replay deterministically or fail with a clear incompatibility report
  4. Airgap tests

    • export → import → replay without network access
  5. Diff accuracy tests

    • compare snapshots and ensure the diff identifies actual knowledge changes, not noise
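
A minimal sketch of the golden replay check, written as a plain function so it can be wired into any test runner; the corpus shape and the injected replay_fn are assumptions:

# Sketch: golden replay test. For every entry in a fixture corpus, replaying
# the artifact against its sealed snapshot must reproduce the recorded verdict.
def run_golden_replay_tests(corpus, replay_fn) -> list[str]:
    failures = []
    for case in corpus:
        verdict = replay_fn(case["artifact_digest"], case["snapshot"])
        if verdict["verdict_id"] != case["expected_verdict_id"]:
            failures.append(
                f"{case['artifact_digest']}: got {verdict['verdict_id']}, "
                f"expected {case['expected_verdict_id']}"
            )
    return failures  # empty list == strict determinism held for the corpus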

Definition of Done (DoD) for the feature

  • Snapshots are created automatically according to policy.
  • Snapshots can be exported and imported with verified integrity.
  • Replay produces matching verdicts for a representative corpus.
  • UI exposes snapshot provenance and replay status.
  • Audit log records snapshot lifecycle events.
  • Clear failure modes exist (missing blobs, incompatible engine, signature failure).

14) Metrics (PM ownership)

Track metrics that prove this is a moat, not a checkbox.

Core KPIs

  • Replay success rate (strict determinism)
  • Time to explain drift (median time from “why changed” to root cause)
  • % verdicts with sealed portable snapshots
  • Audit effort reduction (customer-reported or measured via workflow steps)
  • Storage efficiency (dedup ratio; bytes per snapshot over time)

Guardrail metrics

  • Snapshot creation latency impact on CI
  • Snapshot storage growth per tenant
  • Verification failure rates

15) Common failure modes (what to prevent)

  1. Treating snapshots as “metadata only” and still claiming replayability.
  2. Allowing “latest feed fetch” during replay (breaks the promise).
  3. Not pinning parser/policy/scoring versions—causes silent drift.
  4. Missing clear UX around replay limitations and failure reasons.
  5. Overcapturing sensitive inputs (privacy and customer trust risk).
  6. Underinvesting in dedup/retention (cost blowups).

16) Management checklists

PM checklist (before commitment)

  • Precisely define “replay” guarantee level (A/B/C) for each SKU/environment.

  • Define which inputs are in scope (feeds, VEX, policies, trust bundles, plugins).

  • Define customer-facing workflows:

    • “replay now”
    • “compare to latest”
    • “export for audit / air-gap”
  • Confirm governance outcomes:

    • audit pack integration
    • exception linkage
    • release gate linkage

Development Manager checklist (before build)

  • Establish canonical schemas and versioning plan.
  • Establish content-addressed storage + dedup plan.
  • Establish signing and trust anchor strategy.
  • Establish deterministic evaluation contract and test harness.
  • Establish import/export packaging and verification.
  • Establish retention, quotas, and GC.

Phased delivery plan (Development Manager guidance)

Phase 1: Reference snapshot + verdict binding

  • Record source descriptors + hashes, policy/scoring/trust digests.
  • Bind snapshot ID into verdict artifacts.

Phase 2: Portable snapshots

  • Store knowledge blobs locally with dedup.
  • Export/import with integrity verification.

Phase 3: Sealed portable snapshots + replay tooling

  • Sign snapshots.
  • Deterministic replay pipeline + replay report.
  • UI surfacing and audit logs.

Phase 4: Snapshot diff + drift explainability

  • Compare snapshots.
  • Attribute decision drift to knowledge changes vs artifact changes.
