Below are internal guidelines for Stella Ops Product Managers and Development Managers for the capability: Knowledge Snapshots / Time-Travel Replay. This is written as an implementable operating standard (not a concept note).


Knowledge Snapshots / Time-Travel Replay

Product and Engineering Guidelines for Stella Ops

1) Purpose and value proposition

What this capability must achieve

Enable Stella Ops to reproduce any historical risk decision (scan result, policy evaluation, verdict) deterministically, using a cryptographically bound snapshot of the exact knowledge inputs that were available at the time the decision was made.

Why customers pay for it

This capability is primarily purchased for:

  • Auditability: “Show me what you knew, when you knew it, and why the system decided pass/fail.”
  • Incident response: reproduce prior posture using historical feeds/VEX/policies and explain deltas.
  • Airgapped / regulated environments: deterministic, offline decisioning with attested knowledge state.
  • Change control: prove whether a decision changed due to code change vs knowledge change.

Core product promise

For a given artifact and snapshot:

  • Same inputs → same outputs (verdict, scores, findings, evidence pointers), or Stella Ops must clearly declare the precise exceptions.

2) Definitions (PMs and engineers must align on these)

Knowledge input

Any external or semi-external information that can influence the outcome:

  • vulnerability databases and advisories (any source)
  • exploit-intel signals
  • VEX statements (OpenVEX, CSAF, CycloneDX VEX, etc.)
  • SBOM ingestion logic and parsing rules
  • package identification rules (including distro/backport logic)
  • policy content and policy engine version
  • scoring rules (including weights and thresholds)
  • trust anchors and signature verification policy
  • plugin versions and enabled capabilities
  • configuration defaults and overrides that change analysis

Knowledge Snapshot

A sealed record of:

  1. References (which inputs were used), and
  2. Content (the exact bytes used), and
  3. Execution contract (the evaluator and ruleset versions)

Time-Travel Replay

Re-running evaluation of an artifact using only the snapshot content and the recorded execution contract, producing the same decision and explainability artifacts.
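
To make this contract concrete, the hypothetical sketch below shows a replay entry point that accepts only the artifact identity and the sealed snapshot; the types and names are illustrative, not the Stella Ops API:

# Minimal sketch of the replay contract; names are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    manifest: dict   # sealed manifest: sources, policy/scoring/trust digests, engine
    blobs: dict      # content-addressed knowledge blobs: digest -> bytes

@dataclass(frozen=True)
class Verdict:
    decision: str          # "pass" or "fail"
    findings_digest: str   # digest of the canonicalized findings
    snapshot_id: str       # binds the decision to the snapshot it was computed from

def replay(artifact_digest: str, snapshot: Snapshot) -> Verdict:
    """Re-evaluate using only snapshot content and the recorded execution
    contract; no network access, no 'latest feed' lookups, deterministic."""
    raise NotImplementedError("evaluator is pinned by snapshot.manifest['engine']")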


3) Product principles (non-negotiables)

  1. Determinism is a product requirement, not an engineering detail.
  2. Snapshots are first-class artifacts with explicit lifecycle (create, verify, export/import, retain, expire).
  3. The snapshot is cryptographically bound to outcomes and evidence (tamper-evident chain).
  4. Replays must be possible offline (when the snapshot includes content) and must fail clearly when not possible.
  5. Minimal surprise: the UI must explain when a verdict changed due to “knowledge drift” vs “artifact drift.”
  6. Scalability by content addressing: the platform must deduplicate knowledge content aggressively.
  7. Backward compatibility: old snapshots must remain replayable within a documented support window.

4) Scope boundaries (what this is not)

Non-goals (explicitly out of scope for v1 unless approved)

  • Reconstructing external internet state beyond what is recorded (no “fetch historical CVE state from the web”).
  • Guaranteeing replay across major engine rewrites without a compatibility plan.
  • Storing sensitive proprietary customer code in snapshots (unless explicitly enabled).
  • Replaying “live runtime signals” unless those signals were captured into the snapshot at decision time.

5) Personas and use cases (PM guidance)

Primary personas

  • Security Governance / GRC: needs audit packs, controls evidence, deterministic history.
  • Incident response / AppSec lead: needs “what changed and why” quickly.
  • Platform engineering / DevOps: needs reproducible CI gates and airgap workflows.
  • Procurement / regulated customers: needs proof of process and defensible attestations.

Must-support use cases

  1. Replay a past release gate decision in a new environment (including offline) and get identical outcome.
  2. Explain drift: “This build fails today but passed last month—why?”
  3. Airgap export/import: create snapshots in connected environment, import to disconnected one.
  4. Audit bundle generation: export snapshot + verdict(s) + evidence pointers.

6) Functional requirements (PM “must/should” list)

Must

  • Snapshot creation for every material evaluation (or for every “decision object” chosen by configuration).

  • Snapshot manifest containing:

    • unique snapshot ID (content-addressed)
    • list of knowledge sources with hashes/digests
    • policy IDs and exact policy content hashes
    • engine version and plugin versions
    • timestamp and clock source metadata
    • trust anchor set hash and verification policy hash
  • Snapshot sealing:

    • snapshot manifest is signed
    • signed link from verdict → snapshot ID (see the sealing sketch after this list)
  • Replay:

    • re-evaluate using only snapshot inputs
    • output must match prior results (or emit a deterministic mismatch report)
  • Export/import:

    • portable bundle format
    • import verifies integrity and signatures before allowing use
  • Retention controls:

    • configurable retention windows and storage quotas
    • deduplication and garbage collection
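
A minimal sketch of the sealing requirement, assuming an Ed25519 signing key is available (key handling is covered in section 10) and using illustrative field names rather than the shipped schema:

# Sketch: seal a verdict by binding it to the snapshot ID and signing the
# canonical bytes. Assumes the 'cryptography' package; names are illustrative.
import hashlib, json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical_bytes(obj: dict) -> bytes:
    # Canonical JSON: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")

def seal_verdict(decision: str, findings_digest: str, snapshot_id: str,
                 policy_digest: str, key: Ed25519PrivateKey) -> dict:
    verdict = {
        "decision": decision,               # e.g. "pass" or "fail"
        "findings_digest": findings_digest, # digest of canonicalized findings
        "snapshot_id": snapshot_id,         # signed link: verdict -> snapshot
        "policy_digest": policy_digest,     # which policy content produced it
    }
    payload = canonical_bytes(verdict)
    verdict["verdict_id"] = "vrd:sha256:" + hashlib.sha256(payload).hexdigest()
    verdict["signature"] = key.sign(payload).hex()
    return verdict

# Usage (illustrative):
# sealed = seal_verdict("pass", "sha256:…", "ksm:sha256:…", "sha256:…",
#                       Ed25519PrivateKey.generate())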

Should

  • Partial snapshots (reference-only) vs full snapshots (content included), with explicit replay guarantees.
  • Diff views: compare two snapshots and highlight what knowledge changed (sketched after this list).
  • Multi-snapshot replay: run “as-of snapshot A” and “as-of snapshot B” to show drift impact.
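
A minimal sketch of the diff view, assuming manifests shaped like the example in section 8; field names are taken from that example and otherwise illustrative:

# Sketch: diff the knowledge sources of two snapshot manifests by digest.
def diff_sources(manifest_a: dict, manifest_b: dict) -> dict:
    a = {s["name"]: s.get("content_digest") for s in manifest_a.get("sources", [])}
    b = {s["name"]: s.get("content_digest") for s in manifest_b.get("sources", [])}
    return {
        "added":   sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }

# Example output: {"added": ["osv"], "removed": [], "changed": ["nvd"]} would
# mean drift between A and B came from a new source plus a newer nvd epoch.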

Could

  • Snapshot “federation” for large orgs (mirrors/replication with policy controls).
  • Snapshot “pinning” to releases or environments as a governance policy.

7) UX and workflow guidelines (PM + Eng)

UI must communicate three states clearly

  1. Reproducible offline: snapshot includes all required content.
  2. Reproducible with access: snapshot references external sources that must be available.
  3. Not reproducible: missing content or unsupported evaluator version.

Required UI objects

  • Snapshot Details page

    • snapshot ID and signature status
    • list of knowledge sources (name, version/epoch, digest, size)
    • policy bundle version, scoring rules version
    • trust anchors + verification policy digest
    • replay status: “verified reproducible / reproducible / not reproducible”
  • Verdict page

    • links to snapshot(s)
    • “replay now” action
    • “compare to latest knowledge” action

UX guardrails

  • Never show “pass/fail” without also showing:

    • snapshot ID
    • policy ID/version
    • verification status
  • When results differ on replay, show:

    • exact mismatch class (engine mismatch, missing data, nondeterminism, corrupted snapshot)
    • what input changed (if known)
    • remediation steps

8) Data model and format guidelines (Development Managers)

  • KnowledgeSnapshotManifest (KSM)
  • KnowledgeBlob (content-addressed bytes)
  • KnowledgeSourceDescriptor
  • PolicyBundle
  • TrustBundle
  • Verdict (signed decision artifact)
  • ReplayReport (records replay result and mismatches)
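
The following sketch shows one hypothetical way these objects could relate, including the mismatch classes the UI must surface (section 7). Names and fields are illustrative, not a committed schema:

# Sketch of the core objects and their relationships; illustrative only.
from dataclasses import dataclass
from enum import Enum

class MismatchClass(Enum):
    ENGINE_MISMATCH = "engine_mismatch"
    MISSING_DATA = "missing_data"
    NONDETERMINISM = "nondeterminism"
    CORRUPTED_SNAPSHOT = "corrupted_snapshot"

@dataclass(frozen=True)
class KnowledgeSourceDescriptor:
    name: str
    kind: str                 # e.g. "vuln_feed", "vex"
    content_digest: str       # points at a KnowledgeBlob

@dataclass(frozen=True)
class KnowledgeSnapshotManifest:
    snapshot_id: str          # derived from the manifest digest
    engine_version: str
    policy_digest: str
    trust_digest: str
    sources: tuple            # tuple of KnowledgeSourceDescriptor

@dataclass(frozen=True)
class ReplayReport:
    snapshot_id: str
    original_verdict_id: str
    replayed_verdict_id: str
    matched: bool
    mismatches: tuple = ()    # tuple of (MismatchClass, detail) pairs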

Content addressing

  • Use a stable hash (e.g., SHA-256) for:

    • each knowledge blob
    • manifest
    • policy bundle
    • trust bundle
  • Snapshot ID should be derived from manifest digest.
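
A minimal sketch of that derivation, assuming canonical JSON (sorted keys, compact separators, UTF-8) as the hashing convention:

# Sketch: derive a content-addressed snapshot ID from the manifest bytes.
import hashlib, json

def snapshot_id(manifest: dict) -> str:
    # The ID field itself is excluded before hashing (it cannot contain its own digest).
    body = {k: v for k, v in manifest.items() if k != "snapshot_id"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "ksm:sha256:" + hashlib.sha256(canonical).hexdigest()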

Example manifest shape (illustrative)

{
  "snapshot_id": "ksm:sha256:…",
  "created_at": "2025-12-19T10:15:30Z",
  "engine": { "name": "stella-evaluator", "version": "1.7.0", "build": "…"},
  "plugins": [
    { "name": "pkg-id", "version": "2.3.1", "digest": "sha256:…" }
  ],
  "policy": { "bundle_id": "pol:sha256:…", "digest": "sha256:…" },
  "scoring": { "ruleset_id": "score:sha256:…", "digest": "sha256:…" },
  "trust": { "bundle_id": "trust:sha256:…", "digest": "sha256:…" },
  "sources": [
    {
      "name": "nvd",
      "epoch": "2025-12-18",
      "kind": "vuln_feed",
      "content_digest": "sha256:…",
      "licenses": ["…"],
      "origin": { "uri": "…", "retrieved_at": "…" }
    },
    {
      "name": "customer-vex",
      "kind": "vex",
      "content_digest": "sha256:…"
    }
  ],
  "environment": {
    "determinism_profile": "strict",
    "timezone": "UTC",
    "normalization": { "line_endings": "LF", "sort_order": "canonical" }
  }
}

Versioning rules

  • Every object is immutable once written.

  • Changes create new digests; never mutate in place.

  • Support schema evolution via:

    • schema_version
    • strict validation + migration tooling
  • Keep manifests small; store large data as blobs.
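
A minimal sketch of write-once, content-addressed blob storage consistent with these rules; paths and layout are illustrative:

# Sketch: write-once, content-addressed blob store. Existing digests are never
# mutated; re-writing identical content is a no-op (which is also the dedup).
import hashlib
from pathlib import Path

class BlobStore:
    def __init__(self, root: str):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        path = self.root / digest.replace(":", "_")
        if not path.exists():              # immutable: never overwrite
            path.write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        data = (self.root / digest.replace(":", "_")).read_bytes()
        # Verify on read so corruption is detected, not silently served.
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob {digest} failed integrity check")
        return data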


9) Determinism contract (Engineering must enforce)

Determinism requirements

  • Stable ordering: sort inputs and outputs canonically.
  • Stable timestamps: timestamps may be recorded but must not affect computed scores or the verdict.
  • Stable randomization: no RNG; if unavoidable, fixed seed recorded in snapshot.
  • Stable parsers: parser versions are pinned by digest; parsing must be deterministic.
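
A minimal sketch of what stable ordering and stable timestamps mean in practice: canonicalize findings and exclude volatile fields from the bytes that feed the verdict digest. Field names here are assumptions:

# Sketch: canonicalize findings before hashing so replays compare byte-for-byte.
# Volatile fields (wall-clock timestamps, host names) are excluded from the
# hashed payload; they may still be logged elsewhere.
import hashlib, json

VOLATILE_FIELDS = {"observed_at", "hostname", "scan_duration_ms"}  # illustrative

def canonical_findings(findings: list[dict]) -> bytes:
    cleaned = [
        {k: v for k, v in f.items() if k not in VOLATILE_FIELDS}
        for f in findings
    ]
    # Canonical sort: a stable key independent of discovery order.
    cleaned.sort(key=lambda f: (f.get("component", ""), f.get("vuln_id", "")))
    return json.dumps(cleaned, sort_keys=True, separators=(",", ":")).encode("utf-8")

def findings_digest(findings: list[dict]) -> str:
    return "sha256:" + hashlib.sha256(canonical_findings(findings)).hexdigest()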

Allowed nondeterminism (if any) must be explicit

If you must allow nondeterminism, it must be:

  • documented,
  • surfaced in UI,
  • included in replay report as “non-deterministic factor,”
  • and excluded from the signed decision if it affects pass/fail.

10) Security model (Development Managers)

Threats this feature must address

  • Feed poisoning (tampered vulnerability data)
  • Time-of-check/time-of-use drift (same artifact evaluated against moving feeds)
  • Replay manipulation (swap snapshot content)
  • “Policy drift hiding” (claiming old decision used different policies)
  • Signature bypass (trust anchors altered)

Controls required

  • Sign manifests and verdicts.

  • Bind verdict → snapshot ID → policy bundle hash → trust bundle hash.

  • Verify on every import and on every replay invocation.

  • Audit log:

    • snapshot created
    • snapshot imported
    • replay executed
    • verification failures
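
A minimal sketch of the verify-on-import control, assuming Ed25519 signatures over canonical manifest bytes and the blob digests recorded in the manifest; names are illustrative:

# Sketch: verify a snapshot bundle on import before it can be used.
# Checks (1) the manifest signature and (2) every blob digest in the manifest.
import hashlib, json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

def verify_import(manifest: dict, signature: bytes, blobs: dict,
                  trusted_key: Ed25519PublicKey) -> None:
    payload = json.dumps(manifest, sort_keys=True,
                         separators=(",", ":")).encode("utf-8")
    try:
        trusted_key.verify(signature, payload)          # tamper-evident manifest
    except InvalidSignature:
        raise ValueError("manifest signature invalid; refusing import")

    for source in manifest.get("sources", []):
        digest = source["content_digest"]
        data = blobs.get(digest)
        if data is None:
            raise ValueError(f"missing blob for {source['name']} ({digest})")
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError(f"blob for {source['name']} failed integrity check")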

Key handling

  • Decide and document:

    • who signs snapshots/verdicts (service keys vs tenant keys)
    • rotation policy
    • revocation/compromise handling
  • Avoid designing cryptography from scratch; use well-established signing formats and separation of duties.


11) Offline / airgapped requirements

Snapshot levels (PM packaging guideline)

Offer explicit snapshot types with clear guarantees:

  • Level A: Reference-only snapshot

    • stores hashes + source descriptors
    • replay requires access to original sources
  • Level B: Portable snapshot

    • includes blobs necessary for replay
    • replay works offline
  • Level C: Sealed portable snapshot

    • portable + signed + includes trust anchors
    • replay works offline and can be verified independently
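
A minimal sketch of a Level C export bundle, assuming a tar-based container; the layout and file names are illustrative, not a committed packaging format:

# Sketch: pack a sealed portable snapshot (Level C) into a single tar bundle.
import io, tarfile

def write_bundle(path: str, manifest_bytes: bytes, signature: bytes,
                 trust_anchors: bytes, blobs: dict) -> None:
    def add(tar: tarfile.TarFile, name: str, data: bytes) -> None:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

    with tarfile.open(path, "w:gz") as tar:
        add(tar, "manifest.json", manifest_bytes)      # sealed manifest
        add(tar, "manifest.sig", signature)            # detached signature
        add(tar, "trust/anchors.pem", trust_anchors)   # for independent verification
        for digest, data in sorted(blobs.items()):     # content-addressed blobs
            add(tar, f"blobs/{digest.replace(':', '_')}", data)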

Do not market airgap support without specifying which level is provided.


12) Performance and storage guidelines

Principles

  • Content-address knowledge blobs to maximize deduplication.
  • Separate “hot” knowledge (recent epochs) from cold storage.
  • Support snapshot compaction and garbage collection.

Operational requirements

  • Retention policies per tenant/project/environment.
  • Quotas and alerting when snapshot storage approaches limits.
  • Export bundles should be chunked/streamable for large feeds.
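
A minimal sketch of garbage collection under these requirements: a blob is deleted only when no retained manifest references it (simple mark-and-sweep; names are illustrative):

# Sketch: mark-and-sweep garbage collection for knowledge blobs.
# A blob survives as long as at least one retained manifest references it.
def collect_garbage(retained_manifests: list[dict], stored_digests: set[str],
                    delete_blob) -> int:
    referenced = {
        s["content_digest"]
        for m in retained_manifests
        for s in m.get("sources", [])
    }
    removed = 0
    for digest in sorted(stored_digests - referenced):
        delete_blob(digest)     # deletion hook, e.g. a BlobStore method
        removed += 1
    return removed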

13) Testing and acceptance criteria

Required test categories

  1. Golden replay tests

    • same artifact + same snapshot → identical outputs
  2. Corruption tests

    • bit flips in blobs/manifests are detected and rejected
  3. Version skew tests

    • old snapshot + new engine should either replay deterministically or fail with a clear incompatibility report
  4. Airgap tests

    • export → import → replay without network access
  5. Diff accuracy tests

    • compare snapshots and ensure the diff identifies actual knowledge changes, not noise
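
A minimal sketch of the golden replay check, written as a plain function so it can be wired into any test runner; the corpus shape and the injected replay_fn are assumptions:

# Sketch: golden replay test. For every entry in a fixture corpus, replaying
# the artifact against its sealed snapshot must reproduce the recorded verdict.
def run_golden_replay_tests(corpus, replay_fn) -> list[str]:
    failures = []
    for case in corpus:
        verdict = replay_fn(case["artifact_digest"], case["snapshot"])
        if verdict["verdict_id"] != case["expected_verdict_id"]:
            failures.append(
                f"{case['artifact_digest']}: got {verdict['verdict_id']}, "
                f"expected {case['expected_verdict_id']}"
            )
    return failures  # empty list == strict determinism held for the corpus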

Definition of Done (DoD) for the feature

  • Snapshots are created automatically according to policy.
  • Snapshots can be exported and imported with verified integrity.
  • Replay produces matching verdicts for a representative corpus.
  • UI exposes snapshot provenance and replay status.
  • Audit log records snapshot lifecycle events.
  • Clear failure modes exist (missing blobs, incompatible engine, signature failure).

14) Metrics (PM ownership)

Track metrics that prove this is a moat, not a checkbox.

Core KPIs

  • Replay success rate (strict determinism)
  • Time to explain drift (median time from “why changed” to root cause)
  • % verdicts with sealed portable snapshots
  • Audit effort reduction (customer-reported or measured via workflow steps)
  • Storage efficiency (dedup ratio; bytes per snapshot over time)

Guardrail metrics

  • Snapshot creation latency impact on CI
  • Snapshot storage growth per tenant
  • Verification failure rates

15) Common failure modes (what to prevent)

  1. Treating snapshots as “metadata only” and still claiming replayability.
  2. Allowing “latest feed fetch” during replay (breaks the promise).
  3. Not pinning parser/policy/scoring versions—causes silent drift.
  4. Missing clear UX around replay limitations and failure reasons.
  5. Overcapturing sensitive inputs (privacy and customer trust risk).
  6. Underinvesting in dedup/retention (cost blowups).

16) Management checklists

PM checklist (before commitment)

  • Precisely define “replay” guarantee level (A/B/C) for each SKU/environment.

  • Define which inputs are in scope (feeds, VEX, policies, trust bundles, plugins).

  • Define customer-facing workflows:

    • “replay now”
    • “compare to latest”
    • “export for audit / air-gap”
  • Confirm governance outcomes:

    • audit pack integration
    • exception linkage
    • release gate linkage

Development Manager checklist (before build)

  • Establish canonical schemas and versioning plan.
  • Establish content-addressed storage + dedup plan.
  • Establish signing and trust anchor strategy.
  • Establish deterministic evaluation contract and test harness.
  • Establish import/export packaging and verification.
  • Establish retention, quotas, and GC.

Phased delivery plan (Development Manager guidance)

Phase 1: Reference snapshot + verdict binding

  • Record source descriptors + hashes, policy/scoring/trust digests.
  • Bind snapshot ID into verdict artifacts.

Phase 2: Portable snapshots

  • Store knowledge blobs locally with dedup.
  • Export/import with integrity verification.

Phase 3: Sealed portable snapshots + replay tooling

  • Sign snapshots.
  • Deterministic replay pipeline + replay report.
  • UI surfacing and audit logs.

Phase 4: Snapshot diff + drift explainability

  • Compare snapshots.
  • Attribute decision drift to knowledge changes vs artifact changes.
