Here’s a compact, practical way to add two high‑leverage capabilities to your scanner: **DSSE‑signed path witnesses** and **Smart‑Diff × Reachability**—what they are, why they matter, and exactly how to implement them in Stella Ops without ceremony.

---

# 1) DSSE‑signed path witnesses (entrypoint → calls → sink)

**What it is (in plain terms):**
When you flag a CVE as “reachable,” also emit a tiny, human‑readable proof: the **exact path** from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the **vulnerable sink**. Wrap that proof in a **DSSE** envelope and sign it. Anyone can verify the witness later—offline—without rerunning analysis.

**Why it matters:**

* Turns red flags into **auditable evidence** (quiet‑by‑design).
* Lets CI/CD, auditors, and customers **verify** findings independently.
* Enables **deterministic replay** and provenance chains (ties nicely to in‑toto/SLSA).

**Minimal JSON witness (stable, vendor‑neutral):**

```json
{
  "witness_schema": "stellaops.witness.v1",
  "artifact": { "sbom_digest": "sha256:...", "component_purl": "pkg:nuget/Example@1.2.3" },
  "vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "≤1.2.3" },
  "entrypoint": { "kind": "http", "name": "GET /billing/pay" },
  "path": [
    {"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
    {"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
    {"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
  ],
  "sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
  "evidence": {
    "callgraph_digest": "sha256:...",
    "build_id": "dotnet:RID:linux-x64:sha256:...",
    "analysis_config_digest": "sha256:..."
  },
  "observed_at": "2025-12-18T00:00:00Z"
}
```

**Wrap in DSSE (payloadType & payload are required):**

```json
{
  "payloadType": "application/vnd.stellaops.witness+json",
  "payload": "base64(JSON_above)",
  "signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}
```
**.NET 10 signing (Ed25519):**

```csharp
// .NET has no built-in Ed25519 signer; NSec.Cryptography is one common
// choice; swap in your own Ed25519 helper if you already have one.
using System.Linq;
using System.Text;
using System.Text.Json;
using NSec.Cryptography;

var payloadType = "application/vnd.stellaops.witness+json";
var payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
var dsse = new {
    payloadType,
    payload = Convert.ToBase64String(payloadBytes),
    signatures = new[] { new { keyid = keyId, sig = Convert.ToBase64String(Sign(Pae(payloadType, payloadBytes), privateKey)) } }
};

// Per the DSSE spec the signature covers PAE(payloadType, payload),
// not the raw payload bytes.
static byte[] Pae(string type, byte[] body) =>
    Encoding.UTF8.GetBytes($"DSSEv1 {Encoding.UTF8.GetByteCount(type)} {type} {body.Length} ")
        .Concat(body).ToArray();

static byte[] Sign(byte[] data, byte[] privateKey)
{
    var alg = SignatureAlgorithm.Ed25519;
    using var key = Key.Import(alg, privateKey, KeyBlobFormat.RawPrivateKey);
    return alg.Sign(key, data);
}
```
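A matching verifier for the CLI/UI (a sketch under the same NSec assumption; it expects a raw 32‑byte Ed25519 public key and reuses the `Pae` helper from the signing snippet):

```csharp
using NSec.Cryptography;

static bool VerifyWitness(string payloadType, byte[] payload, byte[] signature, byte[] publicKeyBytes)
{
    var alg = SignatureAlgorithm.Ed25519;
    var publicKey = PublicKey.Import(alg, publicKeyBytes, KeyBlobFormat.RawPublicKey);
    // Recompute the DSSE pre-authentication encoding and check the envelope signature offline.
    return alg.Verify(publicKey, Pae(payloadType, payload), signature);
}
```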
**Where to emit:**

* **Scanner.Worker**: after reachability confirms `reachable=true`, emit witness → **Attestor** signs → **Authority** stores (Postgres) → optional Rekor‑style mirror.
* Expose `/witness/{findingId}` for download & independent verification.

---

# 2) Smart‑Diff × Reachability (incremental, low‑noise updates)

**What it is:**
On **SBOM/VEX/dependency** deltas, don’t rescan everything. Update only **affected regions** of the call graph and recompute reachability **just for changed nodes/edges**.

**Why it matters:**

* **Order‑of‑magnitude faster** incremental scans.
* Fewer flaky diffs; triage stays focused on **meaningful risk change**.
* Perfect for PR gating: “what changed” → “what became reachable/unreachable.”

**Core idea (graph reachability):**

* Maintain a per‑service **call graph** `G = (V, E)` with **entrypoint set** `S`.
* On diff: compute changed nodes/edges ΔV/ΔE.
* Run **incremental BFS/DFS** from impacted nodes to sinks (forward or backward), reusing memoized results.
* Recompute only **frontiers** touched by Δ.

**Minimal tables (Postgres):**

```sql
-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT, symbol TEXT, file TEXT, line INT,
  hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT, sink_id BIGINT,
  path JSONB, reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);
```

**Incremental algorithm (pseudocode):**

```text
Input: ΔSBOM, ΔDeps, ΔCode → ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e ∈ Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stopping early on unchanged subgraphs
     Update cg_reach_cache; if the state flips, emit a new/updated DSSE witness
```

**.NET 10 reachability sketch (fast & local):**

```csharp
HashSet<int> impactSet = ComputeImpact(deltaNodes, deltaEdges);
foreach (var e in Intersect(Entrypoints, Ancestors(impactSet)))
{
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
    }
}
```

**CI/PR flow:**

1. Build → SBOM diff → dependency diff → call‑graph delta.
2. Run incremental reachability.
3. If any `unreachable→reachable` transitions: **fail the gate**, attach DSSE witnesses.
4. If `reachable→unreachable`: auto‑close prior findings (and archive the prior witness).

---

# UX hooks (quick wins)

* In the findings list, add a **“Show Witness”** button → a modal renders the signed path (entrypoint→…→sink) plus a one‑click **“Verify Signature”**.
* In PR checks, summarize only **state flips** with tiny links: “+2 reachable (view witness)” / “−1 (now unreachable)”.

---

# Minimal tasks to get this live

* **Scanner.Worker**: build call‑graph extraction (per language), add an incremental graph store and reachability cache.
* **Attestor**: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
* **Authority**: the tables above + witness storage + retrieval API.
* **Router/CI plugin**: PR annotation with **state flips** and links to witnesses.
* **UI**: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.
---

Below is a concrete, buildable blueprint for an **advanced reachability analysis engine** inside Stella Ops. I’m going to assume your “Stella Ops” components are roughly:

* **Scanner.Worker**: runs analyses in CI / on artifacts
* **Authority**: stores graphs/findings/witnesses
* **Attestor**: signs DSSE envelopes (Ed25519)
* (optional) **SurfaceBuilder**: background worker that computes “vuln surfaces” for packages

The key advance is: **don’t treat a CVE as “a package”**. Treat it as a **set of trigger methods** (public API) that can reach the vulnerable code inside the dependency—computed by “Smart‑Diff” once, reused everywhere.

---

## 0) Define the contract (precision/soundness) up front

If you don’t write this down, you’ll fight false positives/negatives forever.

### What Stella Ops will guarantee (first release)

* **Whole-program static call graph** (app + selected dependency assemblies)
* **Context-insensitive** analysis (fast), with a **path witness** extracted (shortest path)
* **Dynamic dispatch handled** with CHA/RTA (+ DI hints), with explicit uncertainty flags
* **Reflection handled best-effort** (constant-string resolution), otherwise “unknown edge”

### What it will NOT guarantee (first release)

* Perfect handling of reflection / `dynamic` / runtime codegen
* Perfect delegate/event resolution across complex flows
* Full taint/dataflow reachability (you can add that later)

This is fine. The major value is: “**we can show you the call path**” and “**we can prove the vuln is triggered by calling these library APIs**”.

---

## 1) The big idea: “Vuln surfaces” (Smart-Diff → triggers)

### Problem

CVE feeds typically say “package X version range Y is vulnerable” but rarely say *which methods*. If you only do package-level reachability, noise is huge.

### Solution

For each CVE+package, compute a **vulnerability surface**:

* **Candidate sinks** = methods changed between vulnerable and fixed versions (diff at IL level)
* **Trigger methods** = *public/exported* methods in the vulnerable version that can reach those changed methods internally

Then your service scan becomes:

> “Can any entrypoint reach any trigger method?”

This is both faster and more precise.

---

## 2) Data model (Authority / Postgres)

You already had call graph tables; here’s a concrete schema that supports:

* graph snapshots
* incremental updates
* vuln surfaces
* a reachability cache
* DSSE witnesses

### 2.1 Graph tables

```sql
CREATE TABLE cg_snapshots (
  snapshot_id   BIGSERIAL PRIMARY KEY,
  service       TEXT NOT NULL,
  build_id      TEXT NOT NULL,
  graph_digest  TEXT NOT NULL,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(service, build_id)
);

CREATE TABLE cg_nodes (
  node_id      BIGSERIAL PRIMARY KEY,
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  method_key   TEXT NOT NULL,            -- stable key (see below)
  asm_name     TEXT,
  type_name    TEXT,
  method_name  TEXT,
  file_path    TEXT,
  line_start   INT,
  il_hash      TEXT,                     -- normalized IL hash for diffing
  flags        INT NOT NULL DEFAULT 0,   -- bitflags: has_reflection, compiler_generated, etc.
  UNIQUE(snapshot_id, method_key)
);

CREATE TABLE cg_edges (
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  src_node_id  BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  dst_node_id  BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind         SMALLINT NOT NULL,        -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
  PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);

CREATE TABLE cg_entrypoints (
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  node_id      BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind         TEXT NOT NULL,            -- http, grpc, cli, job, etc.
  name         TEXT NOT NULL,            -- GET /foo, "Main", etc.
  PRIMARY KEY(snapshot_id, node_id, kind, name)
);
```

### 2.2 Vuln surface tables (Smart‑Diff artifacts)

```sql
CREATE TABLE vuln_surfaces (
  surface_id     BIGSERIAL PRIMARY KEY,
  ecosystem      TEXT NOT NULL,          -- nuget
  package        TEXT NOT NULL,
  cve_id         TEXT NOT NULL,
  vuln_version   TEXT NOT NULL,          -- a representative vulnerable version
  fixed_version  TEXT NOT NULL,
  surface_digest TEXT NOT NULL,
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);

CREATE TABLE vuln_surface_sinks (
  surface_id       BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  sink_method_key  TEXT NOT NULL,
  reason           TEXT NOT NULL,        -- changed|added|removed|heuristic
  PRIMARY KEY(surface_id, sink_method_key)
);

CREATE TABLE vuln_surface_triggers (
  surface_id          BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  trigger_method_key  TEXT NOT NULL,
  sink_method_key     TEXT NOT NULL,
  internal_path       JSONB,             -- optional: library-internal witness path
  PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);
```

### 2.3 Reachability cache & witnesses

```sql
CREATE TABLE reach_findings (
  finding_id             BIGSERIAL PRIMARY KEY,
  snapshot_id            BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  cve_id                 TEXT NOT NULL,
  ecosystem              TEXT NOT NULL,
  package                TEXT NOT NULL,
  package_version        TEXT NOT NULL,
  reachable              BOOLEAN NOT NULL,
  reachable_entrypoints  INT NOT NULL DEFAULT 0,
  updated_at             TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(snapshot_id, cve_id, package, package_version)
);

CREATE TABLE reach_witnesses (
  witness_id     BIGSERIAL PRIMARY KEY,
  finding_id     BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
  entry_node_id  BIGINT REFERENCES cg_nodes(node_id),
  dsse_envelope  JSONB NOT NULL,
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

---

## 3) Stable identity: MethodKey + IL hash

### 3.1 MethodKey (must be stable across builds)

Use a normalized string like:

```
{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)
```

Examples:

* `MyApp|BillingController|Pay(System.String)`
* `LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)`

A key builder for this format is sketched below.
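A minimal builder with Mono.Cecil (a sketch; this `MethodKey` helper is what the later snippets call, and the scope-name fallback is an assumption):

```csharp
using System.Linq;
using Mono.Cecil;

static class MethodKey
{
    // Build-stable key; generic methods use the definition (erased) form
    // so every instantiation maps to one key.
    public static string From(MethodReference m)
    {
        var def = m.GetElementMethod();                     // strip generic instantiation
        var arity = def.HasGenericParameters ? "`" + def.GenericParameters.Count : "";
        var ps = string.Join(",", def.Parameters.Select(p => p.ParameterType.FullName));
        var asm = def.DeclaringType.Scope?.Name ?? "?";     // assembly (or module) scope name
        return $"{asm}|{def.DeclaringType.FullName}|{def.Name}{arity}({ps})";
    }
}
```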
### 3.2 Normalized IL hash (for smart-diff + incremental graph updates)

Raw IL bytes aren’t stable (metadata tokens change). Normalize:

* opcode names
* branch targets by *instruction index*, not offset
* method operands by **resolved MethodKey**
* string operands by literal or hashed literal
* type operands by full name

Then hash `SHA256(normalized_bytes)`.
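A sketch of that normalization with Mono.Cecil (this is the `ILFingerprint` helper the extraction snippets assume; the operand handling is illustrative, not exhaustive):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using Mono.Cecil;
using Mono.Cecil.Cil;

static string ILFingerprint(MethodDefinition method)
{
    var body = method.Body.Instructions;
    // Branch targets are encoded by instruction index, not byte offset.
    var index = new Dictionary<Instruction, int>();
    for (int i = 0; i < body.Count; i++) index[body[i]] = i;

    var sb = new StringBuilder();
    foreach (var ins in body)
    {
        sb.Append(ins.OpCode.Name);
        switch (ins.Operand)
        {
            case Instruction t: sb.Append(' ').Append(index[t]); break;                              // branch
            case Instruction[] ts: sb.Append(' ').AppendJoin(',', ts.Select(t => index[t])); break;  // switch
            case MethodReference mr: sb.Append(' ').Append(MethodKey.From(mr)); break;
            case FieldReference fr: sb.Append(' ').Append(fr.FullName); break;
            case TypeReference tr: sb.Append(' ').Append(tr.FullName); break;
            case string s: sb.Append(" str:").Append(s); break;
            case null: break;
            default: sb.Append(' ').Append(ins.Operand); break;
        }
        sb.Append('\n');
    }
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString())));
}
```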
---

## 4) Call graph extraction for .NET (concrete, doable)

### Tooling choice

Start with **Mono.Cecil** (MIT license, easy IL traversal). You can later swap to `System.Reflection.Metadata` for speed.

### 4.1 Build process (Scanner.Worker)

1. `dotnet restore` (use your locked restore)
2. `dotnet build -c Release /p:DebugType=portable /p:DebugSymbols=true`
3. Collect:

   * app assemblies: `bin/Release/**/publish/*.dll` or the build output
   * `.pdb` files for sequence points (file/line for witnesses)

### 4.2 Cecil loader

```csharp
var rp = new ReaderParameters {
    ReadSymbols = true,
    SymbolReaderProvider = new PortablePdbReaderProvider()
};

var asm = AssemblyDefinition.ReadAssembly(dllPath, rp);
```

### 4.3 Node extraction (methods)

Walk all types, including nested:

```csharp
IEnumerable<TypeDefinition> AllTypes(ModuleDefinition m)
{
    var stack = new Stack<TypeDefinition>(m.Types);
    while (stack.Count > 0)
    {
        var t = stack.Pop();
        yield return t;
        foreach (var nt in t.NestedTypes) stack.Push(nt);
    }
}

foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
    var key = MethodKey.From(method);                 // your normalizer
    var (file, line) = PdbFirstSequencePoint(method);
    var ilHash = method.HasBody ? ILFingerprint(method) : null;

    // store node (method_key, file, line, il_hash, flags...)
}
```

### 4.4 Edge extraction (direct calls)

```csharp
foreach (var method in type.Methods.Where(m => m.HasBody))
{
    var srcKey = MethodKey.From(method);
    foreach (var ins in method.Body.Instructions)
    {
        if (ins.Operand is MethodReference mr)
        {
            if (ins.OpCode.Code is Code.Call or Code.Callvirt or Code.Newobj)
            {
                var dstKey = MethodKey.From(mr);      // important: stable even if not resolved
                edges.Add(new Edge(srcKey, dstKey, kind: CallKind.Direct));
            }
            if (ins.OpCode.Code is Code.Ldftn or Code.Ldvirtftn)
            {
                // delegate capture (handled later)
            }
        }
    }
}
```

---

## 5) Advanced precision: dynamic dispatch + DI + async/await

If you stop at direct edges only, you’ll miss many real paths.

### 5.1 Async/await mapping (critical for readable witnesses)

Async methods compile into a state machine’s `MoveNext()`. You want edges attributed back to the original method.

In Cecil:

* Check for `AsyncStateMachineAttribute` on a method
* It references a state machine type
* Find that type’s `MoveNext` method
* Map `MoveNextKey -> OriginalMethodKey`

Then, while extracting edges:

```csharp
srcKey = MoveNextToOriginal.TryGetValue(srcKey, out var original) ? original : srcKey;
```

Do the same for iterator state machines.
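A sketch of building that map with Cecil (covers both async and iterator state machines; `AllTypes`/`MethodKey` are the helpers defined above):

```csharp
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;

var moveNextToOriginal = new Dictionary<string, string>();
foreach (var type in AllTypes(asm.MainModule))
foreach (var method in type.Methods)
{
    // The compiler stamps the original method with a state-machine attribute
    // whose single ctor argument is the generated state machine type.
    var attr = method.CustomAttributes.FirstOrDefault(a =>
        a.AttributeType.FullName is "System.Runtime.CompilerServices.AsyncStateMachineAttribute"
                                  or "System.Runtime.CompilerServices.IteratorStateMachineAttribute");
    if (attr?.ConstructorArguments[0].Value is not TypeReference smType) continue;

    var moveNext = smType.Resolve()?.Methods.FirstOrDefault(m => m.Name == "MoveNext");
    if (moveNext != null)
        moveNextToOriginal[MethodKey.From(moveNext)] = MethodKey.From(method);
}
```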
### 5.2 Virtual/interface dispatch (CHA/RTA)

You need two maps:

1. a **type hierarchy / interface implementation map**
2. an **override map** from “declared method” → “implementation method(s)”

**Build the override map**

```csharp
// For each method, Cecil exposes method.Overrides for explicit implementations.
overrideMap[MethodKey.From(overrideRef)] = MethodKey.From(methodDef);
```

**CHA**: for a `callvirt` to virtual method `T.M`, add edges to overrides in derived classes.
**RTA**: restrict to derived classes that are actually instantiated.

How to get instantiated types:

* look for `newobj` instructions and add the created type to `InstantiatedTypes`
* plus DI registrations (below)

### 5.3 DI hints (Microsoft.Extensions.DependencyInjection)

You will see calls like:

* `ServiceCollectionServiceExtensions.AddTransient<TService, TImpl>(...)`

In IL these are generic method calls. Detect them and record `TService -> TImpl` as “instantiated” (see the sketch below). This massively improves RTA for modern .NET apps.
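A minimal detector (a sketch; `instantiatedTypes` and `serviceToImpl` are assumed collections feeding the RTA step, and the registration names are the real Microsoft.Extensions.DependencyInjection extension methods):

```csharp
using System.Collections.Generic;
using Mono.Cecil;

var registrations = new HashSet<string> { "AddTransient", "AddScoped", "AddSingleton" };

foreach (var ins in method.Body.Instructions)
{
    // Generic registration calls carry both type arguments in the IL.
    if (ins.Operand is GenericInstanceMethod gim
        && registrations.Contains(gim.Name)
        && gim.GenericArguments.Count == 2)
    {
        var service = gim.GenericArguments[0];
        var impl = gim.GenericArguments[1];
        instantiatedTypes.Add(impl.FullName);            // feeds RTA
        serviceToImpl[service.FullName] = impl.FullName; // dispatch hint: service -> impl
    }
}
```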
### 5.4 Delegates/lambdas (good-enough approach)

Implement intraprocedural tracking:

* when you see `ldftn SomeMethod`, then `newobj Action::.ctor`, then `stloc.s X`
* store `delegateTargets[local X] += SomeMethod`
* when you see `ldloc.s X` and later `callvirt Invoke`, add edges to the targets

This makes Minimal API entrypoint discovery work too.

### 5.5 Reflection (best-effort)

Implement only high-signal heuristics:

* `typeof(T).GetMethod("Foo")` with constant `"Foo"`
* `GetType().GetMethod("Foo")` with constant `"Foo"` (type unknown → mark uncertain)

If resolved, add an edge with `kind=reflection_guess`.
If not, set the node flag `has_reflection = true` and show “may be incomplete” in results.
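A sketch of the first heuristic (the fixed walk-back window and the `edges`/`nodeFlags`/`CallKind` collections are illustrative assumptions, not a complete dataflow analysis):

```csharp
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Pattern: ldtoken T; call Type.GetTypeFromHandle; ldstr "Foo"; callvirt Type::GetMethod.
var il = method.Body.Instructions;
for (int i = 0; i < il.Count; i++)
{
    if (il[i].Operand is not MethodReference mr
        || mr.Name != "GetMethod"
        || mr.DeclaringType.FullName != "System.Type") continue;

    string? name = null; TypeReference? target = null;
    for (int j = i - 1; j >= 0 && j >= i - 6; j--)   // small walk-back window
    {
        if (name == null && il[j].OpCode.Code == Code.Ldstr) name = (string)il[j].Operand;
        if (target == null && il[j].OpCode.Code == Code.Ldtoken) target = il[j].Operand as TypeReference;
    }

    var dst = (name != null) ? target?.Resolve()?.Methods.FirstOrDefault(m => m.Name == name) : null;
    if (dst != null)
        edges.Add(new Edge(srcKey, MethodKey.From(dst), kind: CallKind.ReflectionGuess));
    else
        nodeFlags[srcKey] |= NodeFlags.HasReflection; // unresolved: mark uncertain
}
```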
---

## 6) Entrypoint detection (concrete detectors)

### 6.1 MVC controllers

Detect:

* types deriving from `Microsoft.AspNetCore.Mvc.ControllerBase`
* methods that are:

  * public
  * not `[NonAction]`
  * annotated with `[HttpGet]`, `[HttpPost]`, `[Route]`, etc.

Extract the route template from the attributes’ ctor arguments.

Store in `cg_entrypoints`:

* kind = `http`
* name = `GET /billing/pay` (compose verb + template)

### 6.2 Minimal APIs

Scan `Program.Main` IL:

* find calls to `MapGet`, `MapPost`, ...
* extract the route string from the preceding `ldstr`
* resolve the handler method via delegate tracking (`ldftn`)

Entry:

* kind = `http`
* name = `GET /foo`

### 6.3 CLI

Find the assembly entry point method (`asm.EntryPoint`) or `static Main`.
Entry:

* kind = `cli`
* name = `Main`

Start here. Add gRPC/jobs later.

---

## 7) Smart-Diff SurfaceBuilder (the “advanced” part)

This is what makes your reachability actually meaningful for CVEs.

### 7.1 SurfaceBuilder inputs

From your vuln ingestion pipeline:

* ecosystem = nuget
* package = `LibXYZ`
* affected range = `<= 1.2.3`
* fixed version = `1.2.4`
* CVE id

### 7.2 Choose a vulnerable version to diff

Pick the **highest affected version below the fix**.

* fixed = 1.2.4
* vulnerable representative = 1.2.3

(If multiple fixed versions exist, build multiple surfaces.)

### 7.3 Download both packages

Use NuGet.Protocol to download the `.nupkg`, unzip, and pick the TFMs you care about (often `netstandard2.0` is safest). Compute fingerprints for each assembly.

### 7.4 Compute method fingerprints

For each method:

* MethodKey
* Normalized IL hash

### 7.5 Diff

```
ChangedMethods = { k | hashVuln[k] != hashFixed[k] } ∪ added ∪ removed
```

Store these as `vuln_surface_sinks` with reason.
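In C#, that set computation is a few lines (a sketch; `hashVuln`/`hashFixed` are the MethodKey → IL-hash maps from 7.4):

```csharp
using System.Collections.Generic;
using System.Linq;

// MethodKey -> reason, ready to insert into vuln_surface_sinks.
var sinks = new Dictionary<string, string>();
foreach (var (key, hash) in hashVuln)
{
    if (!hashFixed.TryGetValue(key, out var fixedHash)) sinks[key] = "removed";
    else if (fixedHash != hash) sinks[key] = "changed";
}
foreach (var key in hashFixed.Keys.Where(k => !hashVuln.ContainsKey(k)))
    sinks[key] = "added";
```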
### 7.6 Build the internal library call graph

Same Cecil extraction, but only over the package assemblies.
Now compute triggers:

**Reverse BFS from sinks**:

* Start from all sink method keys
* Walk predecessors
* When you encounter a **public/exported method**, record it as a trigger

Also store one internal path for each trigger → sink (for witnesses).

### 7.7 Add interface/base declarations as triggers

Important: your app might call a library via an interface method signature, not the concrete implementation.

For each trigger implementation method:

* for each `method.Overrides` entry, add the overridden method key as an additional trigger

This reduces dependence on perfect dispatch expansion during app scanning.

### 7.8 Persist the surface

Store:

* the sinks set
* the triggers set
* internal witness paths (optional but highly valuable)

Now you’ve converted a “version range” CVE into “these specific library APIs are dangerous”.

---

## 8) Reachability engine (fast, witness-producing)

### 8.1 In-memory graph format (CSR)

Don’t BFS off dictionaries; you’ll die on perf.

Build integer indices:

* `method_key -> nodeIndex (0..N-1)`
* store arrays:

  * `predOffsets[N+1]`
  * `preds[edgeCount]`

Construction (sketched in code below):

1. count predecessors per node
2. prefix sum to offsets
3. fill preds
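Those three steps in C# (a sketch; `edges` is assumed to be a list of `(src, dst)` node-index pairs):

```csharp
// Build a predecessor CSR: preds of node v live in preds[predOffsets[v] .. predOffsets[v+1]).
int n = nodeCount;
var predOffsets = new int[n + 1];
foreach (var (_, dst) in edges) predOffsets[dst + 1]++;            // 1) count preds per node
for (int i = 0; i < n; i++) predOffsets[i + 1] += predOffsets[i];  // 2) prefix sum to offsets
var preds = new int[edges.Count];
var cursor = (int[])predOffsets.Clone();
foreach (var (src, dst) in edges) preds[cursor[dst]++] = src;      // 3) fill preds
```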
### 8.2 Reverse BFS from sinks

This computes:

* `visited[node]` = can reach a sink
* `parent[node]` = next node toward a sink (for path reconstruction)

```csharp
public sealed class ReachabilityEngine
{
    public ReachabilityResult Compute(
        Graph g,
        ReadOnlySpan<int> entrypoints,
        ReadOnlySpan<int> sinks)
    {
        var visitedMark = g.VisitMark; // int[] length N (reused across runs)
        var parent = g.Parent;         // int[] length N (reused)
        g.RunId++;

        var q = new IntQueue(capacity: g.NodeCount);
        var sinkSet = new BitSet(g.NodeCount);
        foreach (var s in sinks)
        {
            sinkSet.Set(s);
            visitedMark[s] = g.RunId;
            parent[s] = s;
            q.Enqueue(s);
        }

        while (q.TryDequeue(out var v))
        {
            var start = g.PredOffsets[v];
            var end = g.PredOffsets[v + 1];
            for (int i = start; i < end; i++)
            {
                var p = g.Preds[i];
                if (visitedMark[p] == g.RunId) continue;
                visitedMark[p] = g.RunId;
                parent[p] = v;
                q.Enqueue(p);
            }
        }

        // Collect reachable entrypoints and paths
        var results = new List<EntryWitness>();
        foreach (var e in entrypoints)
        {
            if (visitedMark[e] != g.RunId) continue;
            var path = ReconstructPath(e, parent, sinkSet);
            results.Add(new EntryWitness(e, path));
        }

        return new ReachabilityResult(results);
    }

    private static int[] ReconstructPath(int entry, int[] parent, BitSet sinks)
    {
        var path = new List<int>(32);
        int cur = entry;
        path.Add(cur);

        // follow parent pointers until a sink
        for (int guard = 0; guard < 10_000; guard++)
        {
            if (sinks.Get(cur)) break;
            var nxt = parent[cur];
            if (nxt == cur || nxt < 0) break; // safety
            cur = nxt;
            path.Add(cur);
        }
        return path.ToArray();
    }
}
```
### 8.3 Producing the witness

For each node index in the path:

* method_key
* file_path / line_start (if known)
* optional flags (reflection_guess edge, dispatch edge)

Then attach:

* vuln id, package, version
* entrypoint kind/name
* graph digest + config digest
* surface digest
* timestamp

Send the JSON to Attestor for DSSE signing, and store the envelope in Authority.

---

## 9) Scaling: don’t do BFS 500 times if you can avoid it

### 9.1 First-line scaling (usually enough)

* Group vulnerabilities by package/version → surfaces reused
* Only run reachability for vulns where:

  * the dependency is present, AND
  * a surface exists OR fallback mode applies
* Limit witnesses per vuln (top 3)

In practice, with N~50k nodes and E~200k edges, a reverse BFS is fast in C# if done with arrays.

### 9.2 Incremental Smart-Diff × Reachability (your “low noise” killer feature)

#### Step A: compute the graph delta between snapshots

Use `il_hash` per method to detect changed nodes:

* added / removed / changed nodes
* edges updated only for changed nodes

#### Step B: decide which vulnerabilities need recompute

Store a cached reverse-reachable set per vuln surface if you want (bitset), OR just use a cheaper heuristic.

Recompute for a vulnerability if:

* the sink set changed (new surface or version changed), OR
* any changed node is on any previously stored witness path, OR
* entrypoints changed, OR
* impacted nodes touch any trigger node’s predecessors (use a small localized search)

A practical approach:

* store all node IDs that appear in any witness path for that vuln
* if the delta touches any of those nodes/edges, recompute
* otherwise reuse the cached result

This yields a massive win on PR scans where most code is unchanged.

#### Step C: “Impact frontier” recompute (optional)

If you want something more advanced:

* compute `ImpactSet = ΔNodes ∪ endpoints(ΔEdges)`
* run reverse BFS **starting from ImpactSet ∩ ReverseReachSet** and update visited marks

This is trickier to implement correctly (dynamic graph), so I’d ship the heuristic first.

---

## 10) Practical fallback modes (don’t block shipping)

You won’t have surfaces for every CVE on day 1. Handle this gracefully:

### Mode 1: Surface-based reachability (best)

* sink = trigger methods from the surface
* result: “reachable” with a path

### Mode 2: Package API usage (good fallback)

* sink = *any* method in that package that is called by the app
* result: “package reachable” (lower confidence), still provide a path to the callsite

### Mode 3: Dependency present only (SBOM level)

* no call graph needed
* result: “present” only

Your UI can show confidence tiers:

* **Confirmed reachable (surface)**
* **Likely reachable (package API)**
* **Present only (SBOM)**

---

## 11) Integration points inside Stella Ops

### Scanner.Worker (per build)

1. Build/collect assemblies + pdb
2. `CallGraphBuilder` → nodes/edges/entrypoints + graph_digest
3. Load the SBOM vulnerabilities list
4. For each vuln:

   * resolve surface triggers; if missing → enqueue a SurfaceBuilder job + fallback mode
   * run the reachability BFS
   * for each reachable entrypoint: emit a DSSE witness
5. Persist findings/witnesses

### SurfaceBuilder (async worker)

* triggered by “surface missing” events or a nightly preload of top packages
* computes each surface once, stores it forever

### Authority

* stores graphs, surfaces, findings, witnesses
* provides retrieval APIs for UI/CI

---

## 12) What to implement first (in the order that produces value fastest)

### Week 1–2 scope (realistic, shippable)

1. Cecil call graph extraction (direct calls)
2. MVC + Minimal API entrypoints
3. Reverse BFS reachability with path witnesses
4. DSSE witness signing + storage
5. SurfaceBuilder v1:

   * IL hash per method
   * changed methods as sinks
   * triggers via internal reverse BFS
6. UI: “Show Witness” + “Verify Signature”

### Next increment (precision upgrades)

7. async/await mapping to original methods
8. RTA + DI registration hints
9. delegate tracking for Minimal API handlers (if not already)
10. interface override triggers in the surface builder

### Later (if you want “attackability”, not just “reachability”)

11. taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
12. sanitizer modeling & parameter constraints

---

## 13) Common failure modes and how to harden

### MethodKey mismatches (surface vs app call)

* Ensure both are generated from the same normalization rules
* For generic methods, prefer **definition** keys (strip instantiation)
* Store both “exact” and “erased generic” variants if needed

### Multi-target frameworks

* SurfaceBuilder: compute triggers for each TFM, union them
* App scan: choose the TFM closest to the build RID, but allow fallback to the union

### Huge graphs

* Drop `System.*` nodes/edges unless:

  * the vuln is in System.* (rare, but handle separately)
* Deduplicate nodes by MethodKey across assemblies where safe
* Use CSR arrays + pooled queues

### Reflection-heavy projects

* Mark analysis confidence lower
* Include “unknown edges present” in the finding metadata
* Still produce a witness path up to the reflective callsite

---

If you want, I can also paste a **complete Cecil-based CallGraphBuilder class** (nodes+edges+PDB lines), plus the **SurfaceBuilder** that downloads NuGet packages and generates `vuln_surface_triggers` end-to-end.
---

Here’s a compact, practical blueprint for bringing **EPSS** into your stack without chaos: a **3‑layer ingestion model** that keeps raw data, produces clean probabilities, and emits “signal‑ready” events your risk engine can use immediately.

---

# Why this matters (super short)

* **EPSS** = predicted probability a vuln will be exploited soon.
* Mixing the “raw EPSS feed” directly into decisions makes audits, rollbacks, and model upgrades painful.
* A **layered model** lets you **version probability evolution**, compare vendors, and train **meta‑predictors on deltas** (how risk changes over time), not just on snapshots.

---

# The three layers (and how they map to Stella Ops)

1. **Raw feed layer (immutable)**

   * **Goal:** Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
   * **Stella modules:** `Concelier` (preserve‑prune source) writes; `Authority` handles signatures/hashes.
   * **Storage:** `postgres.epss_raw` (partitioned by day); a blob column for the untouched payload; SHA‑256 of the source file.
   * **Why:** Full provenance + deterministic replay.

2. **Normalized probabilistic layer**

   * **Goal:** Clean, typed tables keyed by `cve_id`, with **probability, percentile, model_version, asof_ts**.
   * **Stella modules:** `Excititor` (transform); `Policy Engine` reads.
   * **Storage:** `postgres.epss_prob` with a **unique key** `(cve_id, model_version, asof_ts)` and computed **delta fields** vs the previous `asof_ts`.
   * **Extras:** Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side‑by‑side.

3. **Signal‑ready layer (risk engine contracts)**

   * **Goal:** Pre‑chewed “events” your **Signals/Router** can route instantly.
   * **What’s inside:** Only the fields needed for gating and UI: `cve_id`, `prob_now`, `prob_delta`, `percentile`, `risk_band`, `explain_hash`.
   * **Emit:** `first_signal`, `risk_increase`, `risk_decrease`, `quieted` with **idempotent event keys**.
   * **Stella modules:** `Signals` publishes, `Router` fans out, `Timeline` records; `Notify` handles subscriptions.

---

# Minimal Postgres schema (ready to paste)

```sql
-- 1) Raw (immutable)
create table epss_raw (
  id bigserial primary key,
  source_uri text not null,
  ingestion_ts timestamptz not null default now(),
  asof_date date not null,
  payload jsonb not null,
  payload_sha256 bytea not null
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
  id bigserial primary key,
  cve_id text not null,
  model_version text not null,      -- e.g., 'EPSS-4.0-Falcon-2025-12'
  asof_ts timestamptz not null,
  probability double precision not null,
  percentile double precision,
  features jsonb,                   -- optional: normalized features used
  unique (cve_id, model_version, asof_ts)
);
-- delta against the prior point (materialized view or nightly job)
create materialized view epss_prob_delta as
select p.*,
       p.probability - lag(p.probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
from epss_prob p;

-- 3) Signal-ready
create table epss_signal (
  signal_id bigserial primary key,
  cve_id text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  prob_delta double precision,
  risk_band text not null,          -- e.g., 'LOW/MED/HIGH/CRITICAL'
  model_version text not null,
  explain_hash bytea not null,      -- hash of inputs -> deterministic
  unique (cve_id, model_version, asof_ts)
);
```

---

# C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)

```csharp
// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate) {
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);
    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u,@d,@p::jsonb,@s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion) {
    var raws = await pg.QueryAsync<string>("select payload from epss_raw where asof_date=@d", new { d = asOfDate });
    foreach (var payload in raws) {
        foreach (var row in ParseCsvOrJson(payload)) {
            await pg.ExecuteAsync(
                @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                  values (@cve,@mv,@ts,@prob,@pct,@feat)
                  on conflict do nothing",
                new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold) {
    var rows = await pg.QueryAsync(@"select cve_id, asof_ts, probability,
        probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
        from epss_prob where model_version=@mv", new { mv = modelVersion });

    foreach (var r in rows) {
        var band = Band(r.probability); // map to LOW/MED/HIGH/CRITICAL
        if (Math.Abs(r.prob_delta ?? 0) >= deltaThreshold) {
            var explainHash = DeterministicExplainHash(r);
            await pg.ExecuteAsync(@"insert into epss_signal
                (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
                values (@c,@t,@p,@d,@b,@mv,@h)
                on conflict do nothing",
                new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = r.prob_delta, b = band, mv = modelVersion, h = explainHash });

            await bus.PublishAsync("risk.epss.delta", new {
                cve = r.cve_id, ts = r.asof_ts, prob = r.probability, delta = r.prob_delta, band, model = modelVersion, explain = Convert.ToHexString(explainHash)
            });
        }
    }
}
```

---

# Versioning & experiments (the secret sauce)

* **Model namespace:** `EPSS‑4.0‑<regressor‑name>‑<date>` so you can run multiple variants in parallel.
* **Delta‑training:** Train a small meta‑predictor on **Δprobability** to forecast **“risk jumps in the next N days.”**
* **A/B in production:** Route `model_version=x` to 50% of projects; compare **MTTA to patch** and **false‑alarm rate**.

---

# Policy & UI wiring (quick contracts)

**Policy gates** (OPA/Rego or internal rules):

* Block if `risk_band ∈ {HIGH, CRITICAL}` **AND** `prob_delta >= 0.1` in the last 72h.
* Soften if the asset is not reachable or is mitigated by VEX.

**UI (Evidence pane):**

* Show a **sparkline of EPSS over time**, highlighting the last Δ.
* A “Why now?” button reveals **explain_hash** → the deterministic evidence payload.

---

# Ops & reliability

* Daily ingestion with **idempotent** runs (raw SHA guard).
* Backfills: re‑normalize from `epss_raw` for any new model without re‑downloading.
* **Deterministic replay:** export `(raw, transform code hash, model_version)` alongside results.

---

If you want, I can drop this as a ready‑to‑run **.sql + .csproj** seed with a tiny CLI (`ingest`, `normalize`, `emit`) tailored to your `Postgres + Valkey` profile.
---

Below is a “do this, then this” implementation guide for a **layered EPSS pipeline** inside **Stella Ops**, with concrete schemas, job boundaries, idempotency rules, and the tricky edge cases (model-version shifts, noise control, backfills).

I’ll assume:

* **Postgres** is your system of record and **Valkey** is available for caching,
* you run **.NET workers** (like `StellaOps.Scanner.Worker.DotNet`),
* the Stella modules you referenced map roughly like this:

  * **Concelier** = ingest + preserve/prune raw sources
  * **Authority** = provenance (hashes, immutability, signature-like guarantees)
  * **Excititor** = transform/normalize
  * **Signals / Router / Timeline / Notify** = event pipeline + audit trail + subscriptions

I’ll anchor the EPSS feed details to FIRST’s docs:

* The data feed fields are `cve`, `epss`, `percentile` and are refreshed daily. ([FIRST][1])
* Historical daily `.csv.gz` files exist at `https://epss.empiricalsecurity.com/epss_scores-YYYY-mm-dd.csv.gz`. ([FIRST][1])
* The API base is `https://api.first.org/data/v1/epss` and supports per-CVE and time-series queries. ([FIRST][2])
* FIRST notes model-version shifts (v2/v3/v4) and that the daily files include a leading `#` comment indicating model version/publish date (important for delta correctness). ([FIRST][1])
* FIRST’s guidance: use **probability** as the primary score and **show percentile alongside it**; the raw feeds provide both as decimals 0–1. ([FIRST][3])

---

## 0) Target architecture and data contracts

### The 3 layers and what must be true in each

1. **Raw layer (immutable)**

   * You can replay exactly what you ingested, byte-for-byte.
   * Contains: file bytes or an object-store pointer, headers (ETag, Last-Modified), SHA-256, the parsed “header comment” (the `# …` line), ingestion status.

2. **Normalized probability layer (typed, queryable, historical)**

   * One row per `(model_name, asof_date, cve_id)`.
   * Contains: `epss` probability (0–1), `percentile` (0–1), `model_version` (from the file header comment if available).
   * Built for joins against the vulnerability inventory and for time series.

3. **Signal-ready layer (risk engine contract)**

   * Contains only actionable changes (crossing thresholds, jumps, newly scored, etc.), ideally scoped to **observed CVEs** in your environment to avoid noise.
   * Events are idempotent, audit-friendly, and versioned.

---

## 1) Data source choice and acquisition strategy

### Prefer the daily bulk `.csv.gz` over paging the API for full refresh

* FIRST explicitly documents the “ALL CVEs for a date” bulk file URL pattern. ([FIRST][2])
* The API is great for:

  * “give me EPSS for this CVE list”
  * “give me the last 30 days’ time series for CVE X” ([FIRST][2])

**Recommendation**

* A daily job pulls the bulk file for the “latest available date”.
* A separate on-demand endpoint uses the API time series for UI convenience (optional).

### Robust “latest available date” probing

Because the “current day” file may not be published when your cron fires:

Algorithm (see the sketch below):

1. Let `d0 = UtcToday`.
2. For `d in [d0, d0-1, d0-2, d0-3]`:

   * Try `GET https://epss.empiricalsecurity.com/epss_scores-{d:yyyy-MM-dd}.csv.gz`
   * If HTTP 200: ingest that as `asof_date = d` and stop.
3. If none succeed: fail the job with a clear message + an alert.

This avoids timezone and publishing-time ambiguity.
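A minimal sketch of that probe (the URL pattern comes from the FIRST docs cited above; everything else is illustrative):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static async Task<(DateOnly AsOf, byte[] Gz)?> ProbeLatestAsync(HttpClient http)
{
    var d0 = DateOnly.FromDateTime(DateTime.UtcNow);
    for (int back = 0; back <= 3; back++)
    {
        var d = d0.AddDays(-back);
        var uri = $"https://epss.empiricalsecurity.com/epss_scores-{d:yyyy-MM-dd}.csv.gz";
        using var resp = await http.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead);
        if (resp.IsSuccessStatusCode)
            return (d, await resp.Content.ReadAsByteArrayAsync()); // ingest as asof_date = d
    }
    return null; // caller fails the job with a clear message + alert
}
```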
---

## 2) Layer 1: Raw feed (Concelier + Authority)

### 2.1 Schema for raw + lineage

Use a dedicated schema `epss` so the pipeline is easy to reason about.

```sql
create schema if not exists epss;

-- Immutable file-level record
create table if not exists epss.raw_file (
  raw_id bigserial primary key,
  source_uri text not null,
  asof_date date not null,
  fetched_at timestamptz not null default now(),

  http_etag text,
  http_last_modified timestamptz,
  content_len bigint,

  content_sha256 bytea not null,

  -- first non-empty comment lines like "# model=... date=..."
  header_comment text,
  model_version text,
  model_published_on date,

  -- storage: either inline bytea OR an object storage pointer
  storage_kind text not null default 'pg_bytea',  -- 'pg_bytea' | 's3' | 'fs'
  storage_ref text,
  content_gz bytea,                               -- nullable if stored externally

  parse_status text not null default 'pending',   -- pending|parsed|failed
  parse_error text,

  unique (source_uri, asof_date, content_sha256)
);

create index if not exists ix_epss_raw_file_asof on epss.raw_file(asof_date);
create index if not exists ix_epss_raw_file_status on epss.raw_file(parse_status);
```

**Why store `model_version` here?**
FIRST warns that model updates cause “major shifts” and that the daily files include a `#` comment with the model version/publish date. If you ignore this, your delta logic will misfire on model-change days. ([FIRST][1])

### 2.2 Raw ingestion idempotency rules

A run is “already ingested” if:

* a row exists for `(source_uri, asof_date)` with the same `content_sha256`, OR
* you implement “single truth per day” and treat any new sha for the same date as “replace” (rare, but it can happen).

Recommended:

* **Treat as replace only if** you’re confident the source can republish the same date. If not, keep both but mark the superseded one.

### 2.3 Raw ingestion implementation details (.NET)

**Key constraints**

* Download as a stream (`ResponseHeadersRead`)
* Compute SHA-256 while streaming
* Store the bytes or stream them into object storage
* Capture the ETag/Last-Modified headers if present

Pseudo-implementation structure:

* `EpssFetchJob`

  * `ProbeLatestDateAsync()`
  * `DownloadAsync(uri)`
  * `ExtractHeaderCommentAsync(gzipStream)` (read the first few lines after decompression)
  * `InsertRawFileRecord(...)` (Concelier + Authority)

**Header comment extraction**
FIRST indicates files may start with `# ... model version ... publish date ...`. ([FIRST][1])
So do:

* Decompress
* Read lines until you find the first non-empty non-`#` line (that’s likely the CSV header / first row)
* Save the concatenated `#` lines as `header_comment`
* Regex best-effort parse:

  * `model_version`: something like `v2025.03.14`
  * `model_published_on`: `YYYY-MM-DD`

If parsing fails, still store `header_comment` (sketch below).
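A sketch of that extraction (the regexes are best-effort assumptions about the comment format, matching the examples above):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text.RegularExpressions;

static (string Header, string? ModelVersion, DateOnly? PublishedOn) ReadHeader(Stream gzBytes)
{
    using var gz = new GZipStream(gzBytes, CompressionMode.Decompress, leaveOpen: true);
    using var reader = new StreamReader(gz);

    var comments = new List<string>();
    string? line;
    while ((line = reader.ReadLine()) != null && line.StartsWith('#'))
        comments.Add(line);                                   // stop at the CSV header / first row

    var header = string.Join("\n", comments);
    var mv = Regex.Match(header, @"v\d{4}\.\d{2}\.\d{2}");    // e.g. v2025.03.14 (assumed shape)
    var dt = Regex.Match(header, @"\d{4}-\d{2}-\d{2}");       // publish date
    return (header,
            mv.Success ? mv.Value : null,
            dt.Success ? DateOnly.Parse(dt.Value) : null);
}
```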
### 2.4 Pruning raw (Concelier “preserve-prune”)

Define a retention policy:

* Keep **raw bytes** 90–180 days (cheap enough; each `.csv.gz` is usually a few to tens of MB)
* Keep **metadata** forever (tiny, essential for audits)

Nightly cleanup job:

* delete `content_gz` or the external object for `raw_file` rows older than retention
* keep the row but set `storage_kind='pruned'`, `content_gz=null`, `storage_ref=null`

---

## 3) Layer 2: Normalized probability tables (Excititor)

### 3.1 Core normalized table design

Requirements:

* Efficient time series per CVE
* Efficient “latest score per CVE”
* Efficient joins to “observed vulnerabilities” tables

#### Daily score table (partitioned)

```sql
create table if not exists epss.daily_score (
  model_name text not null,       -- 'FIRST_EPSS'
  asof_date date not null,
  cve_id text not null,
  epss double precision not null,
  percentile double precision,
  model_version text,             -- from the raw header if available
  raw_id bigint references epss.raw_file(raw_id),
  loaded_at timestamptz not null default now(),

  -- Guards
  constraint ck_epss_range check (epss >= 0.0 and epss <= 1.0),
  constraint ck_percentile_range check (percentile is null or (percentile >= 0.0 and percentile <= 1.0)),

  primary key (model_name, asof_date, cve_id)
) partition by range (asof_date);

-- Example monthly partitions (create via a migration script generator)
create table if not exists epss.daily_score_2025_12
  partition of epss.daily_score for values from ('2025-12-01') to ('2026-01-01');

create index if not exists ix_epss_daily_score_cve on epss.daily_score (model_name, cve_id, asof_date desc);
create index if not exists ix_epss_daily_score_epss on epss.daily_score (model_name, asof_date, epss desc);
create index if not exists ix_epss_daily_score_pct on epss.daily_score (model_name, asof_date, percentile desc);
```

**Field semantics**

* `epss` is the probability of exploitation in the next 30 days, 0–1. ([FIRST][1])
* `percentile` is the relative rank among all scored vulnerabilities. ([FIRST][1])

### 3.2 Maintain a “latest” table for fast joins

Don’t compute “latest” via window functions in hot paths (policy evaluation / scoring). Materialize it.

```sql
create table if not exists epss.latest_score (
  model_name text not null,
  cve_id text not null,
  asof_date date not null,
  epss double precision not null,
  percentile double precision,
  model_version text,
  updated_at timestamptz not null default now(),
  primary key (model_name, cve_id)
);

create index if not exists ix_epss_latest_epss on epss.latest_score(model_name, epss desc);
create index if not exists ix_epss_latest_pct on epss.latest_score(model_name, percentile desc);
```

Update logic (after loading a day):

* Upsert each CVE (or do a set-based upsert):

  * `asof_date` should only move forward
  * if a backfill loads an older day, do not overwrite latest

### 3.3 Delta table for change detection

Store deltas per day (this powers signals and “sparkline deltas”).

```sql
create table if not exists epss.daily_delta (
  model_name text not null,
  asof_date date not null,
  cve_id text not null,

  epss double precision not null,
  prev_asof_date date,
  prev_epss double precision,
  epss_delta double precision,

  percentile double precision,
  prev_percentile double precision,
  percentile_delta double precision,

  model_version text,
  prev_model_version text,
  is_model_change boolean not null default false,

  created_at timestamptz not null default now(),
  primary key (model_name, asof_date, cve_id)
);

create index if not exists ix_epss_daily_delta_cve on epss.daily_delta(model_name, cve_id, asof_date desc);
create index if not exists ix_epss_daily_delta_delta on epss.daily_delta(model_name, asof_date, epss_delta desc);
```

**Model update handling**

* On a model-version change day (v3→v4 etc.), many deltas will jump.
* FIRST explicitly warns about model shifts. ([FIRST][1])

So:

* detect if today’s `model_version != previous_day.model_version`
* set `is_model_change = true`
* optionally **suppress delta-based signals** that day (or emit a separate “MODEL_UPDATED” event)

### 3.4 Normalization job mechanics

Implement `EpssNormalizeJob`:

1. Select `raw_file` rows where `parse_status='pending'`.
2. Decompress `content_gz` or fetch from the object store.
3. Parse the CSV:

   * skip `#` comment lines
   * expect columns `cve,epss,percentile` (FIRST documents these fields). ([FIRST][1])
4. Validate:

   * CVE format: `^CVE-\d{4}-\d{4,}$`
   * numeric parse for epss/percentile
   * range checks 0–1
5. Load into Postgres fast:

   * Use `COPY` (binary import) into a **staging table** `epss.stage_score`
   * Then do a set-based insert into `epss.daily_score`
6. Update `epss.raw_file.parse_status='parsed'` or `'failed'`.

#### Staging table pattern

```sql
create unlogged table if not exists epss.stage_score (
  model_name text not null,
  asof_date date not null,
  cve_id text not null,
  epss double precision not null,
  percentile double precision,
  model_version text,
  raw_id bigint not null
);
```

In the job:

* `truncate epss.stage_score;`
* `COPY epss.stage_score FROM STDIN (FORMAT BINARY)`
* Then (transactionally):

  * `delete from epss.daily_score where model_name=@m and asof_date=@d;` *(idempotency for reruns)*
  * `insert into epss.daily_score (...) select ... from epss.stage_score;`

This avoids `ON CONFLICT` overhead and guarantees deterministic reruns.
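The binary COPY step with Npgsql might look like this (a sketch; `rows` is the validated parse output from step 4, and the column order must match the staging table):

```csharp
using Npgsql;
using NpgsqlTypes;

await using var importer = await conn.BeginBinaryImportAsync(
    "COPY epss.stage_score (model_name, asof_date, cve_id, epss, percentile, model_version, raw_id) " +
    "FROM STDIN (FORMAT BINARY)");

foreach (var row in rows)
{
    await importer.StartRowAsync();
    await importer.WriteAsync("FIRST_EPSS", NpgsqlDbType.Text);
    await importer.WriteAsync(row.AsOfDate, NpgsqlDbType.Date);
    await importer.WriteAsync(row.CveId, NpgsqlDbType.Text);
    await importer.WriteAsync(row.Epss, NpgsqlDbType.Double);
    if (row.Percentile is double pct) await importer.WriteAsync(pct, NpgsqlDbType.Double);
    else await importer.WriteNullAsync();
    await importer.WriteAsync(row.ModelVersion, NpgsqlDbType.Text);
    await importer.WriteAsync(row.RawId, NpgsqlDbType.Bigint);
}

await importer.CompleteAsync(); // without Complete, the COPY is rolled back
```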
|
||||
|
||||
### 3.5 Delta + latest materialization job
|
||||
|
||||
Implement `EpssMaterializeJob` after successful daily_score insert.
|
||||
|
||||
**Compute previous available date**
|
||||
|
||||
```sql
|
||||
-- previous date available for that model_name
|
||||
select max(asof_date)
|
||||
from epss.daily_score
|
||||
where model_name = @model
|
||||
and asof_date < @asof_date;
|
||||
```
|
||||
|
||||
**Populate delta (set-based)**
|
||||
|
||||
```sql
|
||||
insert into epss.daily_delta (
|
||||
model_name, asof_date, cve_id,
|
||||
epss, prev_asof_date, prev_epss, epss_delta,
|
||||
percentile, prev_percentile, percentile_delta,
|
||||
model_version, prev_model_version, is_model_change
|
||||
)
|
||||
select
|
||||
cur.model_name,
|
||||
cur.asof_date,
|
||||
cur.cve_id,
|
||||
cur.epss,
|
||||
prev.asof_date as prev_asof_date,
|
||||
prev.epss as prev_epss,
|
||||
cur.epss - prev.epss as epss_delta,
|
||||
cur.percentile,
|
||||
prev.percentile as prev_percentile,
|
||||
(cur.percentile - prev.percentile) as percentile_delta,
|
||||
cur.model_version,
|
||||
prev.model_version,
|
||||
(cur.model_version is not null and prev.model_version is not null and cur.model_version <> prev.model_version) as is_model_change
|
||||
from epss.daily_score cur
|
||||
left join epss.daily_score prev
|
||||
on prev.model_name = cur.model_name
|
||||
and prev.asof_date = @prev_asof_date
|
||||
and prev.cve_id = cur.cve_id
|
||||
where cur.model_name = @model
|
||||
and cur.asof_date = @asof_date;
|
||||
```
|
||||
|
||||
**Update latest_score (set-based upsert)**
|
||||
|
||||
```sql
|
||||
insert into epss.latest_score(model_name, cve_id, asof_date, epss, percentile, model_version)
|
||||
select model_name, cve_id, asof_date, epss, percentile, model_version
|
||||
from epss.daily_score
|
||||
where model_name=@model and asof_date=@asof_date
|
||||
on conflict (model_name, cve_id) do update
|
||||
set asof_date = excluded.asof_date,
|
||||
epss = excluded.epss,
|
||||
percentile = excluded.percentile,
|
||||
model_version = excluded.model_version,
|
||||
updated_at = now()
|
||||
where epss.latest_score.asof_date < excluded.asof_date;
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4) Layer 3: Signal-ready output (Signals + Router + Timeline + Notify)

### 4.1 Decide what “signal” means in Stella Ops

You do **not** want to emit 300k events daily.

You want “actionable” events, ideally:

* only for CVEs that are **observed** in your tenant’s environment, and
* only when something meaningful happens.

Examples:

* Risk band changes (based on percentile or probability)
* ΔEPSS crosses a threshold (e.g., jump ≥ 0.05)
* Newly scored CVEs that are present in the environment
* Model version change day → one summary event instead of 300k deltas

### 4.2 Risk band mapping (internal heuristic)

FIRST explicitly does **not** “officially bin” EPSS scores; binning is subjective. ([FIRST][3])
But operationally you’ll want bands. Use config-driven thresholds.

Default band function based on percentile:

* `CRITICAL` if `percentile >= 0.995`
* `HIGH` if `percentile >= 0.99`
* `MEDIUM` if `percentile >= 0.90`
* else `LOW`

Store these in config per tenant/policy pack (a sketch of the mapping follows).
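In C#, the band function is a one-liner over those thresholds (the `BandThresholds` record and names here are illustrative, not an existing Stella Ops type):

```csharp
// Config-driven band mapping; defaults mirror the thresholds above.
public sealed record BandThresholds(double Critical = 0.995, double High = 0.99, double Medium = 0.90);

public static string Band(double percentile, BandThresholds t) =>
    percentile >= t.Critical ? "CRITICAL" :
    percentile >= t.High     ? "HIGH" :
    percentile >= t.Medium   ? "MEDIUM" : "LOW";
```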

### 4.3 Signal table for idempotency + audit

```sql
create table if not exists epss.signal (
  signal_id        bigserial primary key,
  tenant_id        uuid not null,
  model_name       text not null,
  asof_date        date not null,
  cve_id           text not null,

  event_type       text not null,  -- 'RISK_BAND_UP' | 'RISK_SPIKE' | 'MODEL_UPDATED' | ...
  risk_band        text,
  epss             double precision,
  epss_delta       double precision,
  percentile       double precision,
  percentile_delta double precision,

  is_model_change  boolean not null default false,

  -- deterministic idempotency key
  dedupe_key       text not null,
  payload          jsonb not null,

  created_at       timestamptz not null default now(),

  unique (tenant_id, dedupe_key)
);

create index if not exists ix_epss_signal_tenant_date on epss.signal(tenant_id, asof_date desc);
create index if not exists ix_epss_signal_cve on epss.signal(tenant_id, cve_id, asof_date desc);
```

**Dedupe key pattern**
Make it deterministic:

```
dedupe_key = $"{model_name}:{asof_date:yyyy-MM-dd}:{cve_id}:{event_type}:{band_before}->{band_after}"
```

### 4.4 Signal generation job

Implement `EpssSignalJob(tenant)`:

1. Get the tenant’s **observed CVEs** from your vuln inventory (whatever your table is; call it `vuln.instance`):

   * only open/unremediated vulns
   * optionally only “reachable” or “internet exposed” assets

2. Join against today’s `epss.daily_delta` (or `epss.daily_score` if you skipped delta):

Pseudo-SQL:

```sql
select d.*
from epss.daily_delta d
join vuln.observed_cve oc
  on oc.tenant_id = @tenant
 and oc.cve_id = d.cve_id
where d.model_name=@model
  and d.asof_date=@asof_date;
```

3. Suppress noise (see the evaluation sketch after this list):

   * if `is_model_change=true`, skip “delta spike” events and instead emit one `MODEL_UPDATED` summary event per tenant (and maybe per policy domain).
   * else evaluate:

     * `abs(epss_delta) >= delta_threshold`
     * band change
     * percentile crosses a cutoff

4. Insert into `epss.signal` with the dedupe key, then publish to the Signals bus:

   * topic: `signals.epss`
   * payload includes `tenant_id`, `cve_id`, `asof_date`, `epss`, `percentile`, deltas, band, and an `evidence` block.

5. Timeline + Notify:

   * Timeline: record the event (what changed, when, data source sha)
   * Notify: notify subscribed channels (Slack/email/etc.) based on tenant policy
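A sketch of the step-3 evaluation; the `DeltaRow`/`SignalPolicy` shapes are assumptions over `epss.daily_delta` plus tenant config, and `Band` is the helper sketched in 4.2:

```csharp
public sealed record DeltaRow(
    string CveId, double Epss, double EpssDelta,
    double Percentile, double? PrevPercentile, bool IsModelChange);
public sealed record SignalPolicy(double DeltaThreshold, BandThresholds Bands);

public static IEnumerable<string> Evaluate(DeltaRow d, SignalPolicy p)
{
    if (d.IsModelChange) yield break;              // rolled up into one MODEL_UPDATED summary instead

    if (Math.Abs(d.EpssDelta) >= p.DeltaThreshold)
        yield return "RISK_SPIKE";

    if (d.PrevPercentile is double prev &&
        Band(prev, p.Bands) != Band(d.Percentile, p.Bands))
        yield return d.Percentile > prev ? "RISK_BAND_UP" : "RISK_BAND_DOWN";
}
```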

### 4.5 Evidence payload structure

Keep evidence deterministic + replayable:

```json
{
  "source": {
    "provider": "FIRST",
    "feed": "epss_scores-YYYY-MM-DD.csv.gz",
    "asof_date": "2025-12-17",
    "raw_sha256": "…",
    "model_version": "v2025.03.14",
    "header_comment": "# ... "
  },
  "metrics": {
    "epss": 0.153,
    "percentile": 0.92,
    "epss_delta": 0.051,
    "percentile_delta": 0.03
  },
  "decision": {
    "event_type": "RISK_SPIKE",
    "thresholds": {
      "delta_threshold": 0.05,
      "critical_percentile": 0.995
    }
  }
}
```

This aligns with FIRST’s recommendation to present probability with percentile when possible. ([FIRST][3])
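One way to keep that evidence replayable is to hash its canonical serialization. A sketch, assuming the payload is a POCO whose property order is fixed by declaration (which System.Text.Json preserves, so the bytes are stable across runs):

```csharp
using System.Security.Cryptography;
using System.Text.Json;

// Deterministic digest over the evidence payload; store it next to the signal row.
public static string EvidenceDigest<T>(T evidence)
{
    byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(evidence); // compact, no indentation
    return "sha256:" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
}
```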
---

## 5) Integration points inside Stella Ops

### 5.1 Policy Engine usage

Policy Engine should **only** read from Layer 2 (normalized) and Layer 3 (signals), never raw.

Patterns:

* For gating decisions: query `epss.latest_score` for each CVE in a build/image/SBOM scan result.
* For “why was this blocked?”: show evidence that references `raw_sha256` and `model_version`.

### 5.2 Vuln scoring pipeline

When you compute the “Stella Risk Score” for a vuln instance:

* Join `vuln_instance.cve_id` → `epss.latest_score`
* Combine with CVSS, KEV, exploit maturity, asset exposure, etc.
* EPSS alone is **threat likelihood**, not impact; FIRST explicitly says it’s not a complete picture of risk. ([FIRST][4])

### 5.3 UI display

Recommended UI string (per FIRST guidance):

* Show **probability** as a percent + show percentile:

  * `15.3% (92nd percentile)` ([FIRST][3])

For sparklines:

* Use the `epss.daily_score` time series for the last N days
* Annotate model-version change days (vertical marker)

---

## 6) Operational hardening

### 6.1 Scheduling

* Run daily at a fixed time in UTC.
* Probe up to 3 days back for the latest file.

### 6.2 Exactly-once semantics

Use three safeguards:

1. `epss.raw_file` uniqueness on `(source_uri, asof_date, sha256)`
2. Transactional load:

   * delete existing `daily_score` for that `(model_name, asof_date)`
   * insert freshly parsed rows

3. Advisory lock per `(model_name, asof_date)` to prevent concurrent loads (usage from .NET shown below):

   * `pg_advisory_xact_lock(hashtext(model_name), (asof_date - date '1970-01-01'))` (the date subtraction yields an `int` day number; Postgres has no direct `date::int` cast)
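Taking that lock from .NET could look like this (a sketch assuming Npgsql 6+, where `DateOnly` maps to `date`; `conn` and `tx` come from the load transaction):

```csharp
// Must run inside the load transaction: the lock releases at COMMIT/ROLLBACK.
await using var cmd = new NpgsqlCommand(
    "select pg_advisory_xact_lock(hashtext(@model), (@asof - date '1970-01-01'))", conn, tx);
cmd.Parameters.AddWithValue("model", modelName);
cmd.Parameters.AddWithValue("asof", asOfDate); // DateOnly -> date
await cmd.ExecuteNonQueryAsync(ct);
```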

### 6.3 Monitoring (must-have metrics)

Emit metrics per job stage:

* download success/failure
* bytes downloaded
* sha256 computed
* rows parsed
* parse error count
* rows inserted into `daily_score`
* delta rows created
* signal events emitted
* “model version changed” boolean

Alert conditions:

* no new asof_date ingested for > 48 hours
* parse failure
* row count drops by > X% from the previous day (data anomaly)

### 6.4 Backfills

Implement `epss backfill --from 2021-04-14 --to 2025-12-17`:

* Fetch raw files for each day
* Normalize into `daily_score`
* Materialize latest and delta
* **Disable signals** during bulk backfill (or route to a “silent” topic) to avoid spamming.

FIRST notes historical data begins 2021-04-14. ([FIRST][1])
---

## 7) Reference .NET job skeletons

### Job boundaries

* `EpssFetchJob` → writes `epss.raw_file`
* `EpssNormalizeJob` → fills `epss.daily_score`
* `EpssMaterializeJob` → updates `epss.daily_delta` and `epss.latest_score`
* `EpssSignalJob` → per-tenant emission into `epss.signal` + bus publish

### Performance notes

* Use `GZipStream` + `StreamReader` line-by-line, never the full file in memory (sketch below)
* Use `NpgsqlBinaryImporter` for `COPY` into staging
* Use set-based SQL for delta/latest
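A streaming-parse sketch of that first note; the `EpssRow` record is illustrative, and the `#`-comment and `cve,epss,percentile` column conventions follow the published CSV layout:

```csharp
using System.Globalization;
using System.IO.Compression;

public sealed record EpssRow(string CveId, double Epss, double Percentile);

static async IAsyncEnumerable<EpssRow> ParseFeedAsync(string csvGzPath)
{
    await using var file = File.OpenRead(csvGzPath);
    await using var gzip = new GZipStream(file, CompressionMode.Decompress);
    using var reader = new StreamReader(gzip);

    string? line;
    while ((line = await reader.ReadLineAsync()) is not null)
    {
        if (line.StartsWith('#')) continue;                // header comment: model version + score date
        var cols = line.Split(',');
        if (cols.Length < 3 || cols[0] == "cve") continue; // skip the column-header row
        yield return new EpssRow(
            cols[0],
            double.Parse(cols[1], CultureInfo.InvariantCulture),
            double.Parse(cols[2], CultureInfo.InvariantCulture));
    }
}
```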

---

## 8) The “gotchas” that make or break EPSS pipelines

1. **Model version changes create false deltas**
   Store `model_version` and mark `is_model_change`. FIRST explicitly warns about score shifts on model updates and notes v4 began publishing on 2025‑03‑17. ([FIRST][1])

2. **Percentile is relative; probability is primary**
   Probability should remain your canonical numeric score; percentile provides context for humans. ([FIRST][3])

3. **Don’t emit global events**
   Restrict signals to observed CVEs per tenant/environment.

4. **Keep raw provenance**
   Your audit story depends on storing:

   * exact source URI, as-of date, sha256, header comment

---

## 9) Minimal “definition of done” checklist

### Data correctness

* [ ] For a known CVE, `epss.latest_score` matches the daily file for the latest asof_date
* [ ] `epss` and `percentile` ranges enforced (0–1)
* [ ] Model version extracted when present; otherwise stored as null but header_comment preserved
* [ ] Delta rows created and `is_model_change` flips on version changes

### Operational

* [ ] Daily job retries on transient HTTP failures
* [ ] Alert if no new asof_date in 48h
* [ ] Raw bytes retention + metadata retention

### Product

* [ ] UI displays `probability% (percentile)` per FIRST recommendation ([FIRST][3])
* [ ] Signal events link to evidence (raw sha, model version, asof date)
* [ ] Policy Engine consumes `latest_score` only (never raw)

---

If you want, I can also provide:

* a **migration script generator** that auto-creates monthly partitions for `epss.daily_score`,
* an example **Valkey caching strategy** (`epss:latest:{cve}` with a 48h TTL, warmed only for observed CVEs),
* and a concrete **“observed CVE” join contract** (what columns to expose from your vuln inventory so EPSS signals stay noise-free).

[1]: https://www.first.org/epss/data_stats "Exploit Prediction Scoring System (EPSS)"
[2]: https://www.first.org/epss/api "Exploit Prediction Scoring System (EPSS)"
[3]: https://www.first.org/epss/articles/prob_percentile_bins "Exploit Prediction Scoring System (EPSS)"
[4]: https://www.first.org/epss/faq "EPSS Frequently Asked Questions"
@@ -0,0 +1,444 @@

# ARCHIVED ADVISORY

> **Status:** Archived
> **Archived Date:** 2025-12-18
> **Implementation Sprints:**
> - `SPRINT_3700_0001_0001_witness_foundation.md` - BLAKE3 + Witness Schema
> - `SPRINT_3700_0002_0001_vuln_surfaces_core.md` - Vuln Surface Builder
> - `SPRINT_3700_0003_0001_trigger_extraction.md` - Trigger Method Extraction
> - `SPRINT_3700_0004_0001_reachability_integration.md` - Reachability Integration
> - `SPRINT_3700_0005_0001_witness_ui_cli.md` - Witness UI/CLI
> - `SPRINT_3700_0006_0001_incremental_cache.md` - Incremental Cache
>
> **Gap Analysis:** See `C:\Users\vlindos\.claude\plans\lexical-knitting-map.md`

---
# 2) Smart-Diff x Reachability (incremental, low-noise updates)
**What it is:**
On **SBOM/VEX/dependency** deltas, don't rescan everything. Update only **affected regions** of the call graph and recompute reachability **just for changed nodes/edges**.
**Why it matters:**

* **Order-of-magnitude faster** incremental scans.
* Fewer flaky diffs; triage stays focused on **meaningful risk change**.
* Perfect for PR gating: "what changed" -> "what became reachable/unreachable."

**Core idea (graph reachability):**

* Maintain a per-service **call graph** `G = (V, E)` with **entrypoint set** `S`.
* On diff: compute changed nodes/edges ΔV/ΔE.
* Run **incremental BFS/DFS** from impacted nodes to sinks (forward or backward), reusing memoized results.
* Recompute only **frontiers** touched by Δ.

**Minimal tables (Postgres):**

```sql
-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT, symbol TEXT, file TEXT, line INT,
  hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT, sink_id BIGINT,
  path JSONB, reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);
```

**Incremental algorithm (pseudocode):**

```text
Input: ΔSBOM, ΔDeps, ΔCode -> ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e in Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stop early on unchanged subgraphs
     Update cg_reach_cache; if state flips, emit new/updated DSSE witness
```

**.NET 10 reachability sketch (fast & local):**

```csharp
// Recompute reachability only for entrypoints whose downstream subgraph was touched.
HashSet<int> impactSet = ComputeImpact(deltaNodes, deltaEdges);
foreach (var e in Intersect(Entrypoints, Ancestors(impactSet)))
{
    // Bounded search from entrypoint e toward the affected sinks, reusing the memoized cache.
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path); // state flip: refresh the signed witness
    }
}
```

**CI/PR flow:**

1. Build -> SBOM diff -> Dependency diff -> Call-graph delta.
2. Run incremental reachability.
3. If any `unreachable->reachable` transitions: **fail gate**, attach DSSE witnesses.
4. If `reachable->unreachable`: auto-close prior findings (and archive the prior witness).

---

# UX hooks (quick wins)

* In the findings list, add a **"Show Witness"** button -> modal renders the signed path (entrypoint->...->sink) + a **"Verify Signature"** one-click.
* In PR checks, summarize only **state flips** with tiny links: "+2 reachable (view witness)" / "-1 (now unreachable)".

---

# Minimal tasks to get this live

* **Scanner.Worker**: build call-graph extraction (per language), add incremental graph store, reachability cache.
* **Attestor**: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
* **Authority**: tables above + witness storage + retrieval API.
* **Router/CI plugin**: PR annotation with **state flips** and links to witnesses.
* **UI**: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.

Below is a concrete, buildable blueprint for an **advanced reachability analysis engine** inside Stella Ops. I'm going to assume your "Stella Ops" components are roughly:

* **Scanner.Worker**: runs analyses in CI / on artifacts
* **Authority**: stores graphs/findings/witnesses
* **Attestor**: signs DSSE envelopes (Ed25519)
* (optional) **SurfaceBuilder**: background worker that computes "vuln surfaces" for packages

The key advance is: **don't treat a CVE as "a package"**. Treat it as a **set of trigger methods** (public API) that can reach the vulnerable code inside the dependency, computed by "Smart-Diff" once and reused everywhere.

---

## 0) Define the contract (precision/soundness) up front

If you don't write this down, you'll fight false positives/negatives forever.

### What Stella Ops will guarantee (first release)

* **Whole-program static call graph** (app + selected dependency assemblies)
* **Context-insensitive** (fast), **path witness** extracted (shortest path)
* **Dynamic dispatch handled** with CHA/RTA (+ DI hints), with explicit uncertainty flags
* **Reflection handled best-effort** (constant-string resolution), otherwise "unknown edge"

### What it will NOT guarantee (first release)

* Perfect handling of reflection / `dynamic` / runtime codegen
* Perfect delegate/event resolution across complex flows
* Full taint/dataflow reachability (you can add later)

This is fine. The major value is: "**we can show you the call path**" and "**we can prove the vuln is triggered by calling these library APIs**".

---

## 1) The big idea: "Vuln surfaces" (Smart-Diff -> triggers)

### Problem

CVE feeds typically say "package X version range Y is vulnerable" but rarely say *which methods*. If you only do package-level reachability, noise is huge.

### Solution

For each CVE+package, compute a **vulnerability surface**:

* **Candidate sinks** = methods changed between vulnerable and fixed versions (diff at IL level)
* **Trigger methods** = *public/exported* methods in the vulnerable version that can reach those changed methods internally

Then your service scan becomes:

> "Can any entrypoint reach any trigger method?"

This is both faster and more precise.
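Computing triggers then reduces to a reverse BFS from the changed methods; a sketch over an in-memory callers map (all inputs here are assumed shapes, not existing tables):

```csharp
// callersOf: callee -> direct callers inside the library;
// isPublicApi marks exported methods (the candidate triggers).
static HashSet<string> ComputeTriggers(
    IReadOnlyDictionary<string, List<string>> callersOf,
    IEnumerable<string> changedSinks,
    Func<string, bool> isPublicApi)
{
    var triggers = new HashSet<string>();
    var seen = new HashSet<string>(changedSinks);
    var queue = new Queue<string>(seen);
    while (queue.Count > 0)
    {
        var m = queue.Dequeue();
        if (isPublicApi(m)) triggers.Add(m);   // reachable-from-outside: record as a trigger
        if (!callersOf.TryGetValue(m, out var callers)) continue;
        foreach (var c in callers)
            if (seen.Add(c)) queue.Enqueue(c); // walk upward toward the public surface
    }
    return triggers;
}
```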

---

## 2) Data model (Authority / Postgres)

You already had call-graph tables; here's a concrete schema that supports:

* graph snapshots
* incremental updates
* vuln surfaces
* reachability cache
* DSSE witnesses

### 2.1 Graph tables

```sql
CREATE TABLE cg_snapshots (
  snapshot_id   BIGSERIAL PRIMARY KEY,
  service       TEXT NOT NULL,
  build_id      TEXT NOT NULL,
  graph_digest  TEXT NOT NULL,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(service, build_id)
);

CREATE TABLE cg_nodes (
  node_id      BIGSERIAL PRIMARY KEY,
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  method_key   TEXT NOT NULL,          -- stable key (see below)
  asm_name     TEXT,
  type_name    TEXT,
  method_name  TEXT,
  file_path    TEXT,
  line_start   INT,
  il_hash      TEXT,                   -- normalized IL hash for diffing
  flags        INT NOT NULL DEFAULT 0, -- bitflags: has_reflection, compiler_generated, etc.
  UNIQUE(snapshot_id, method_key)
);

CREATE TABLE cg_edges (
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  src_node_id  BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  dst_node_id  BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind         SMALLINT NOT NULL,      -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
  PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);

CREATE TABLE cg_entrypoints (
  snapshot_id  BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  node_id      BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind         TEXT NOT NULL,          -- http, grpc, cli, job, etc.
  name         TEXT NOT NULL,          -- GET /foo, "Main", etc.
  PRIMARY KEY(snapshot_id, node_id, kind, name)
);
```

### 2.2 Vuln surface tables (Smart-Diff artifacts)

```sql
CREATE TABLE vuln_surfaces (
  surface_id     BIGSERIAL PRIMARY KEY,
  ecosystem      TEXT NOT NULL,   -- nuget
  package        TEXT NOT NULL,
  cve_id         TEXT NOT NULL,
  vuln_version   TEXT NOT NULL,   -- a representative vulnerable version
  fixed_version  TEXT NOT NULL,
  surface_digest TEXT NOT NULL,
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);

CREATE TABLE vuln_surface_sinks (
  surface_id      BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  sink_method_key TEXT NOT NULL,
  reason          TEXT NOT NULL,   -- changed|added|removed|heuristic
  PRIMARY KEY(surface_id, sink_method_key)
);

CREATE TABLE vuln_surface_triggers (
  surface_id         BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  trigger_method_key TEXT NOT NULL,
  sink_method_key    TEXT NOT NULL,
  internal_path      JSONB,          -- optional: library-internal witness path
  PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);
```

### 2.3 Reachability cache & witnesses

```sql
CREATE TABLE reach_findings (
  finding_id            BIGSERIAL PRIMARY KEY,
  snapshot_id           BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  cve_id                TEXT NOT NULL,
  ecosystem             TEXT NOT NULL,
  package               TEXT NOT NULL,
  package_version       TEXT NOT NULL,
  reachable             BOOLEAN NOT NULL,
  reachable_entrypoints INT NOT NULL DEFAULT 0,
  updated_at            TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(snapshot_id, cve_id, package, package_version)
);

CREATE TABLE reach_witnesses (
  witness_id    BIGSERIAL PRIMARY KEY,
  finding_id    BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
  entry_node_id BIGINT REFERENCES cg_nodes(node_id),
  dsse_envelope JSONB NOT NULL,
  created_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

---

## 3) Stable identity: MethodKey + IL hash

### 3.1 MethodKey (must be stable across builds)

Use a normalized string like:

```
{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)
```

Examples:

* `MyApp|BillingController|Pay(System.String)`
* `LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)`

### 3.2 Normalized IL hash (for smart-diff + incremental graph updates)

Raw IL bytes aren't stable (metadata tokens change). Normalize:

* opcode names
* branch targets by *instruction index*, not offset
* method operands by **resolved MethodKey**
* string operands by literal or hashed literal
* type operands by full name

Then hash `SHA256(normalized_bytes)`.
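A minimal normalization sketch with Mono.Cecil; the `methodKey` delegate is assumed to implement the 3.1 format, and switch-table plus a few other operand kinds are collapsed via `ToString()` for brevity:

```csharp
using System.Security.Cryptography;
using System.Text;
using Mono.Cecil;
using Mono.Cecil.Cil;

static string NormalizedIlHash(MethodDefinition method, Func<MethodReference, string> methodKey)
{
    if (!method.HasBody) return "";

    var instructions = method.Body.Instructions;
    var indexOf = new Dictionary<Instruction, int>();
    for (int i = 0; i < instructions.Count; i++) indexOf[instructions[i]] = i;

    var sb = new StringBuilder();
    foreach (var ins in instructions)
    {
        sb.Append(ins.OpCode.Name).Append(' ');
        sb.Append(ins.Operand switch
        {
            Instruction target => $"@{indexOf[target]}",   // branch target by instruction index, not offset
            MethodReference m  => methodKey(m),            // resolved MethodKey (3.1 format)
            TypeReference t    => t.FullName,
            string s           => $"\"{s}\"",
            null               => "",
            var other          => other.ToString()         // switch tables, fields, etc. (simplified)
        });
        sb.Append('\n');
    }
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString())));
}
```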

---

*[Remainder of advisory truncated for brevity - see original file for full content]*

---

## 12) What to implement first (in the order that produces value fastest)

### Week 1-2 scope (realistic, shippable)

1. Cecil call graph extraction (direct calls)
2. MVC + Minimal API entrypoints
3. Reverse BFS reachability with path witnesses
4. DSSE witness signing + storage
5. SurfaceBuilder v1:

   * IL hash per method
   * changed methods as sinks
   * triggers via internal reverse BFS

6. UI: "Show Witness" + "Verify Signature"

### Next increment (precision upgrades)

7. async/await mapping to original methods
8. RTA + DI registration hints
9. delegate tracking for Minimal API handlers (if not already)
10. interface override triggers in surface builder

### Later (if you want "attackability", not just "reachability")

11. taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
12. sanitizer modeling & parameter constraints

---

## 13) Common failure modes and how to harden

### MethodKey mismatches (surface vs app call)

* Ensure both are generated from the same normalization rules
* For generic methods, prefer **definition** keys (strip instantiation)
* Store both "exact" and "erased generic" variants if needed

### Multi-target frameworks

* SurfaceBuilder: compute triggers for each TFM, union them
* App scan: choose the TFM closest to the build RID, but allow fallback to the union

### Huge graphs

* Drop `System.*` nodes/edges unless the vuln is in System.* itself (rare, but handle separately)
* Deduplicate nodes by MethodKey across assemblies where safe
* Use CSR arrays + pooled queues

### Reflection-heavy projects

* Mark analysis confidence lower
* Include "unknown edges present" in finding metadata
* Still produce a witness path up to the reflective callsite

---

If you want, I can also paste a **complete Cecil-based CallGraphBuilder class** (nodes+edges+PDB lines), plus the **SurfaceBuilder** that downloads NuGet packages and generates `vuln_surface_triggers` end-to-end.

@@ -0,0 +1,197 @@

# ARCHIVED ADVISORY

> **Archived**: 2025-12-18
> **Status**: IMPLEMENTED
> **Analysis**: Plan file `C:\Users\vlindos\.claude\plans\quizzical-hugging-hearth.md`
>
> ## Implementation Summary
>
> This advisory was analyzed and merged into the existing EPSS implementation plan:
>
> - **Master Plan**: `IMPL_3410_epss_v4_integration_master_plan.md` updated with raw + signal layer schemas
> - **Sprint**: `SPRINT_3413_0001_0001_epss_live_enrichment.md` created with 30 tasks (original 14 + 16 from advisory)
> - **Migrations Created**:
>   - `011_epss_raw_layer.sql` - Full JSONB payload storage (~5GB/year)
>   - `012_epss_signal_layer.sql` - Tenant-scoped signals with dedupe_key and explain_hash
>
> ## Gap Analysis Result
>
> | Advisory Proposal | Decision | Rationale |
> |-------------------|----------|-----------|
> | Raw feed layer (Layer 1) | IMPLEMENTED | Full JSONB storage for deterministic replay |
> | Normalized layer (Layer 2) | ALIGNED | Already existed in IMPL_3410 |
> | Signal-ready layer (Layer 3) | IMPLEMENTED | Tenant-scoped signals, model change detection |
> | Multi-model support | DEFERRED | No customer demand |
> | Meta-predictor training | SKIPPED | Out of scope (ML complexity) |
> | A/B testing | SKIPPED | Infrastructure overhead |
>
> ## Key Enhancements Implemented
>
> 1. **Raw Feed Layer** (`epss_raw` table) - Stores full CSV payload as JSONB for replay
> 2. **Signal-Ready Layer** (`epss_signal` table) - Tenant-scoped actionable events
> 3. **Model Version Change Detection** - Suppresses noisy deltas on model updates
> 4. **Explain Hash** - Deterministic SHA-256 for audit trail
> 5. **Risk Band Mapping** - CRITICAL/HIGH/MEDIUM/LOW based on percentile

---

# Original Advisory Content

Here's a compact, practical blueprint for bringing **EPSS** into your stack without chaos: a **3-layer ingestion model** that keeps raw data, produces clean probabilities, and emits "signal-ready" events your risk engine can use immediately.

---

# Why this matters (super short)

* **EPSS** = predicted probability a vuln will be exploited soon.
* Mixing the "raw EPSS feed" directly into decisions makes audits, rollbacks, and model upgrades painful.
* A **layered model** lets you **version probability evolution**, compare vendors, and train **meta-predictors on deltas** (how risk changes over time), not just on snapshots.

---

# The three layers (and how they map to Stella Ops)

1. **Raw feed layer (immutable)**

   * **Goal:** Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
   * **Stella modules:** `Concelier` (preserve-prune source) writes; `Authority` handles signatures/hashes.
   * **Storage:** `postgres.epss_raw` (partitioned by day); blob column for the untouched payload; SHA-256 of the source file.
   * **Why:** Full provenance + deterministic replay.

2. **Normalized probabilistic layer**

   * **Goal:** Clean, typed tables keyed by `cve_id`, with **probability, percentile, model_version, asof_ts**.
   * **Stella modules:** `Excititor` (transform); `Policy Engine` reads.
   * **Storage:** `postgres.epss_prob` with a **surrogate key** `(cve_id, model_version, asof_ts)` and computed **delta fields** vs the previous `asof_ts`.
   * **Extras:** Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side-by-side.

3. **Signal-ready layer (risk engine contracts)**

   * **Goal:** Pre-chewed "events" your **Signals/Router** can route instantly.
   * **What's inside:** Only the fields needed for gating and UI: `cve_id`, `prob_now`, `prob_delta`, `percentile`, `risk_band`, `explain_hash`.
   * **Emit:** `first_signal`, `risk_increase`, `risk_decrease`, `quieted` with **idempotent event keys**.
   * **Stella modules:** `Signals` publishes, `Router` fans out, `Timeline` records; `Notify` handles subscriptions.

---

# Minimal Postgres schema (ready to paste)

```sql
-- 1) Raw (immutable)
create table epss_raw (
  id bigserial primary key,
  source_uri text not null,
  ingestion_ts timestamptz not null default now(),
  asof_date date not null,
  payload jsonb not null,
  payload_sha256 bytea not null
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
  id bigserial primary key,
  cve_id text not null,
  model_version text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  percentile double precision,
  features jsonb,
  unique (cve_id, model_version, asof_ts)
);

-- 3) Signal-ready
create table epss_signal (
  signal_id bigserial primary key,
  cve_id text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  prob_delta double precision,
  risk_band text not null,
  model_version text not null,
  explain_hash bytea not null,
  unique (cve_id, model_version, asof_ts)
);
```

---

# C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)

```csharp
// Skeletons assume injected Dapper access (`pg`), an HttpClient (`http`), and a message bus (`bus`).

// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate) {
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);
    // Note: the ::jsonb cast requires the payload to already be JSON
    // (e.g., CSV rows pre-encoded as a JSON array).
    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u,@d,@p::jsonb,@s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion) {
    var raws = await pg.QueryAsync<string>("select payload from epss_raw where asof_date=@d", new { d = asOfDate });
    foreach (var payload in raws) {
        foreach (var row in ParseCsvOrJson(payload)) {
            await pg.ExecuteAsync(
                @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                  values (@cve,@mv,@ts,@prob,@pct,@feat)
                  on conflict do nothing",
                new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold) {
    var rows = await pg.QueryAsync(@"select cve_id, asof_ts, probability,
        probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
        from epss_prob where model_version=@mv", new { mv = modelVersion });

    foreach (var r in rows) {
        var band = Band(r.probability);
        if (Math.Abs(r.prob_delta ?? 0) >= deltaThreshold) {
            var explainHash = DeterministicExplainHash(r);
            await pg.ExecuteAsync(@"insert into epss_signal
                (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
                values (@c,@t,@p,@d,@b,@mv,@h)
                on conflict do nothing",
                new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = r.prob_delta, b = band, mv = modelVersion, h = explainHash });

            await bus.PublishAsync("risk.epss.delta", new {
                cve = r.cve_id, ts = r.asof_ts, prob = r.probability, delta = r.prob_delta, band, model = modelVersion, explain = Convert.ToHexString(explainHash)
            });
        }
    }
}
```

---

# Versioning & experiments (the secret sauce)

* **Model namespace:** `EPSS-4.0-<regressor-name>-<date>` so you can run multiple variants in parallel.
* **Delta-training:** Train a small meta-predictor on **delta-probability** to forecast **"risk jumps in the next N days."**
* **A/B in production:** Route `model_version=x` to 50% of projects; compare **MTTA to patch** and **false-alarm rate**.

---

# Policy & UI wiring (quick contracts)

**Policy gates** (OPA/Rego or internal rules; a C# predicate sketch follows):

* Block if `risk_band in {HIGH, CRITICAL}` **AND** `prob_delta >= 0.1` in the last 72h.
* Soften if the asset is not reachable or is mitigated by VEX.
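A sketch of that gate as a C# predicate (the `EpssSignalRow` shape is illustrative, mirroring the `epss_signal` columns):

```csharp
public sealed record EpssSignalRow(string RiskBand, double ProbDelta, DateTimeOffset AsOfTs);

// Evaluates one signal row against the gate above; reachability/VEX softening is applied separately.
public static bool ShouldBlock(EpssSignalRow s, DateTimeOffset now) =>
    (s.RiskBand is "HIGH" or "CRITICAL")
    && s.ProbDelta >= 0.1
    && now - s.AsOfTs <= TimeSpan.FromHours(72);
```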
**UI (Evidence pane):**

* Show a **sparkline of EPSS over time**, highlight the last delta.
* A "Why now?" button reveals the **explain_hash** -> deterministic evidence payload.

---

# Ops & reliability

* Daily ingestion with **idempotent** runs (raw SHA guard).
* Backfills: re-normalize from `epss_raw` for any new model without re-downloading.
* **Deterministic replay:** export `(raw, transform code hash, model_version)` alongside results.