save progress
@@ -0,0 +1,444 @@
# ARCHIVED ADVISORY

> **Status:** Archived
> **Archived Date:** 2025-12-18
> **Implementation Sprints:**
> - `SPRINT_3700_0001_0001_witness_foundation.md` - BLAKE3 + Witness Schema
> - `SPRINT_3700_0002_0001_vuln_surfaces_core.md` - Vuln Surface Builder
> - `SPRINT_3700_0003_0001_trigger_extraction.md` - Trigger Method Extraction
> - `SPRINT_3700_0004_0001_reachability_integration.md` - Reachability Integration
> - `SPRINT_3700_0005_0001_witness_ui_cli.md` - Witness UI/CLI
> - `SPRINT_3700_0006_0001_incremental_cache.md` - Incremental Cache
>
> **Gap Analysis:** See `C:\Users\vlindos\.claude\plans\lexical-knitting-map.md`

---

Here's a compact, practical way to add two high-leverage capabilities to your scanner: **DSSE-signed path witnesses** and **Smart-Diff x Reachability**. This covers what they are, why they matter, and exactly how to implement them in Stella Ops without ceremony.

---

# 1) DSSE-signed path witnesses (entrypoint -> calls -> sink)

**What it is (in plain terms):**
When you flag a CVE as "reachable," also emit a tiny, human-readable proof: the **exact path** from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the **vulnerable sink**. Wrap that proof in a **DSSE** envelope and sign it. Anyone can verify the witness later, offline, without rerunning the analysis.

**Why it matters:**

* Turns red flags into **auditable evidence** (quiet-by-design).
* Lets CI/CD, auditors, and customers **verify** findings independently.
* Enables **deterministic replay** and provenance chains (ties nicely to in-toto/SLSA).

**Minimal JSON witness (stable, vendor-neutral):**

```json
{
  "witness_schema": "stellaops.witness.v1",
  "artifact": { "sbom_digest": "sha256:...", "component_purl": "pkg:nuget/Example@1.2.3" },
  "vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "<=1.2.3" },
  "entrypoint": { "kind": "http", "name": "GET /billing/pay" },
  "path": [
    {"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
    {"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
    {"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
  ],
  "sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
  "evidence": {
    "callgraph_digest": "sha256:...",
    "build_id": "dotnet:RID:linux-x64:sha256:...",
    "analysis_config_digest": "sha256:..."
  },
  "observed_at": "2025-12-18T00:00:00Z"
}
```

**Wrap in DSSE (`payloadType`, `payload`, and at least one signature are required; `keyid` is optional)**

```json
{
  "payloadType": "application/vnd.stellaops.witness+json",
  "payload": "base64(JSON_above)",
  "signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}
```
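
One detail the envelope alone does not show: DSSE signatures are computed over a pre-authentication encoding (PAE) of the payload type and payload, not over the raw payload bytes. The sketch below is a minimal, language-neutral illustration in Python. The names `pae` and `build_envelope` are hypothetical, the sorted-key JSON serialization is an assumption made here to keep the signed bytes deterministic, and `sign` stands in for whatever Ed25519 signer the Attestor uses.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (len(type_bytes), type_bytes, len(payload), payload)


def build_envelope(witness: dict, payload_type: str, keyid: str, sign) -> dict:
    """Serialize the witness, sign PAE(type, payload), and wrap it in an envelope.

    `sign` is a caller-supplied callable (bytes -> bytes), e.g. an Ed25519 signer.
    """
    # Assumption: sorted keys and compact separators, so equal witnesses sign equally.
    payload = json.dumps(witness, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(payload_type, payload))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
        ],
    }
```

A verifier reverses the steps: base64-decode `payload`, rebuild the PAE from `payloadType` and the decoded bytes, and check each signature against it.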
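
**.NET 10 signing/verifying (Ed25519)**

```csharp
using System.Text;
using System.Text.Json;
// Assumption: Ed25519 comes from the NSec.Cryptography package; swap in your own helper.
using NSec.Cryptography;

const string payloadType = "application/vnd.stellaops.witness+json";
var payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
var dsse = new {
    payloadType,
    payload = Convert.ToBase64String(payloadBytes),
    signatures = new[] { new { keyid = keyId, sig = Convert.ToBase64String(Sign(Pae(payloadType, payloadBytes), privateKey)) } }
};

// DSSE signs the pre-authentication encoding (PAE), not the raw payload:
// "DSSEv1" SP len(type) SP type SP len(body) SP body
byte[] Pae(string type, byte[] body)
{
    var typeLength = Encoding.UTF8.GetByteCount(type);
    var header = Encoding.UTF8.GetBytes($"DSSEv1 {typeLength} {type} {body.Length} ");
    return [.. header, .. body];
}

byte[] Sign(byte[] data, byte[] privateKey)
{
    // privateKey is the raw 32-byte Ed25519 seed
    using var key = Key.Import(SignatureAlgorithm.Ed25519, privateKey, KeyBlobFormat.RawPrivateKey);
    return SignatureAlgorithm.Ed25519.Sign(key, data);
}
```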

**Where to emit:**

* **Scanner.Worker**: after reachability confirms `reachable=true`, emit witness -> **Attestor** signs -> **Authority** stores (Postgres) -> optional Rekor-style mirror.
* Expose `/witness/{findingId}` for download & independent verification.

---

# 2) Smart-Diff x Reachability (incremental, low-noise updates)

**What it is:**
On **SBOM/VEX/dependency** deltas, don't rescan everything. Update only **affected regions** of the call graph and recompute reachability **just for changed nodes/edges**.

**Why it matters:**

* **Order-of-magnitude faster** incremental scans.
* Fewer flaky diffs; triage stays focused on **meaningful risk change**.
* Perfect for PR gating: "what changed" -> "what became reachable/unreachable."

**Core idea (graph-reachability):**

* Maintain a per-service **call graph** `G = (V, E)` with **entrypoint set** `S`.
* On diff: compute changed nodes/edges ΔV/ΔE.
* Run **incremental BFS/DFS** from impacted nodes to sinks (forward or backward), reusing memoized results.
* Recompute only **frontiers** touched by Δ.

**Minimal tables (Postgres):**

```sql
-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT, symbol TEXT, file TEXT, line INT,
  hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT, sink_id BIGINT,
  path JSONB, reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);
```

**Incremental algorithm (pseudocode):**

```text
Input: ΔSBOM, ΔDeps, ΔCode -> ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e in Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stop early on unchanged subgraphs
     Update cg_reach_cache; if state flips, emit new/updated DSSE witness
```
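
To make the three steps concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not the scanner's implementation: the graph is a plain dict of successor sets with the delta already applied, the function names are hypothetical, and it reruns a full forward BFS for each affected entrypoint instead of stopping early on unchanged subgraphs.

```python
from collections import deque


def reverse_graph(graph):
    """Build predecessor sets from successor sets."""
    rev = {}
    for src, dsts in graph.items():
        rev.setdefault(src, set())
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev


def bfs(adj, starts):
    """Return every node reachable from `starts` by following `adj`."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def incremental_update(graph, entrypoints, sinks, cache, delta_nodes, delta_edges):
    """Recompute reachability only for entrypoints upstream of the change.

    `cache` maps (entry, sink) -> bool and is updated in place.
    Returns only the pairs whose state flipped.
    """
    impact = set(delta_nodes)
    for src, dst in delta_edges:
        impact.update((src, dst))
    ancestors = bfs(reverse_graph(graph), impact)
    flips = {}
    for entry in sorted(entrypoints & ancestors):
        reached = bfs(graph, [entry])
        for sink in sorted(sinks):
            now = sink in reached
            if cache.get((entry, sink), False) != now:
                flips[(entry, sink)] = now
            cache[(entry, sink)] = now
    return flips
```

Entrypoints that are not ancestors of the impact set are never visited, which is where the savings over a full rescan come from.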

**.NET 10 reachability sketch (fast & local):**

```csharp
// Nodes touched by the delta, plus their immediate neighbors.
// Node ids are long to match the BIGSERIAL/BIGINT columns above.
HashSet<long> impactSet = ComputeImpact(deltaNodes, deltaEdges);

// Only entrypoints upstream of the impact set can change state.
foreach (var e in Intersect(Entrypoints, Ancestors(impactSet)))
{
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
    }
}
```

**CI/PR flow:**

1. Build -> SBOM diff -> Dependency diff -> Call-graph delta.
2. Run incremental reachability.
3. If any `unreachable->reachable` transitions: **fail gate**, attach DSSE witnesses.
4. If `reachable->unreachable`: auto-close prior findings (and archive prior witness).
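
The gate decision in steps 3 and 4 reduces to comparing two reachability maps. The Python sketch below illustrates this under assumptions: findings are keyed by an opaque id, a finding missing from a map counts as unreachable, and the names `classify_flips` and `gate` are hypothetical. The summary string mirrors the PR annotation style used in this advisory.

```python
def classify_flips(before, after):
    """Compare two maps of finding id -> reachable (bool)."""
    newly_reachable = sorted(
        k for k, v in after.items() if v and not before.get(k, False)
    )
    now_unreachable = sorted(
        k for k, v in before.items() if v and not after.get(k, False)
    )
    return newly_reachable, now_unreachable


def gate(before, after):
    """Fail the gate on any unreachable->reachable flip; list findings to auto-close."""
    newly, gone = classify_flips(before, after)
    return {
        "passed": not newly,
        "attach_witnesses": newly,  # new findings that need a DSSE witness
        "auto_close": gone,  # prior findings whose witness gets archived
        "summary": f"+{len(newly)} reachable / -{len(gone)} (now unreachable)",
    }
```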

---

# UX hooks (quick wins)

* In findings list, add a **"Show Witness"** button -> modal renders the signed path (entrypoint->...->sink) + **"Verify Signature"** one-click.
* In PR checks, summarize only **state flips** with tiny links: "+2 reachable (view witness)" / "-1 (now unreachable)".

---

# Minimal tasks to get this live

* **Scanner.Worker**: build call-graph extraction (per language), add incremental graph store, reachability cache.
* **Attestor**: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
* **Authority**: tables above + witness storage + retrieval API.
* **Router/CI plugin**: PR annotation with **state flips** and links to witnesses.
* **UI**: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.

Below is a concrete, buildable blueprint for an **advanced reachability analysis engine** inside Stella Ops. I'm going to assume your "Stella Ops" components are roughly:

* **Scanner.Worker**: runs analyses in CI / on artifacts
* **Authority**: stores graphs/findings/witnesses
* **Attestor**: signs DSSE envelopes (Ed25519)
* (optional) **SurfaceBuilder**: background worker that computes "vuln surfaces" for packages

The key advance is: **don't treat a CVE as "a package"**. Treat it as a **set of trigger methods** (public API) that can reach the vulnerable code inside the dependency, computed by "Smart-Diff" once and reused everywhere.

---

## 0) Define the contract (precision/soundness) up front

If you don't write this down, you'll fight false positives/negatives forever.

### What Stella Ops will guarantee (first release)

* **Whole-program static call graph** (app + selected dependency assemblies)
* **Context-insensitive** (fast), **path witness** extracted (shortest path)
* **Dynamic dispatch handled** with CHA/RTA (+ DI hints), with explicit uncertainty flags
* **Reflection handled best-effort** (constant-string resolution), otherwise "unknown edge"

### What it will NOT guarantee (first release)

* Perfect handling of reflection / `dynamic` / runtime codegen
* Perfect delegate/event resolution across complex flows
* Full taint/dataflow reachability (you can add later)

This is fine. The major value is: "**we can show you the call path**" and "**we can prove the vuln is triggered by calling these library APIs**".

---

## 1) The big idea: "Vuln surfaces" (Smart-Diff -> triggers)

### Problem

CVE feeds typically say "package X version range Y is vulnerable" but rarely say *which methods*. If you only do package-level reachability, noise is huge.

### Solution

For each CVE+package, compute a **vulnerability surface**:

* **Candidate sinks** = methods changed between vulnerable and fixed versions (diff at IL level)
* **Trigger methods** = *public/exported* methods in the vulnerable version that can reach those changed methods internally

Then your service scan becomes:

> "Can any entrypoint reach any trigger method?"

This is both faster and more precise.
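
The two definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not the SurfaceBuilder itself. The inputs are assumptions: each version is a dict of method key -> IL hash, `calls` is the vulnerable version's internal call graph, and the function names are hypothetical. Methods that exist only in the fixed version are ignored because they cannot be called in the vulnerable one.

```python
from collections import deque


def candidate_sinks(vuln_hashes, fixed_hashes):
    """Methods of the vulnerable version that the fix changed or removed."""
    return {m for m, h in vuln_hashes.items() if fixed_hashes.get(m) != h}


def trigger_methods(calls, public_methods, sinks):
    """Public methods of the vulnerable version that can reach any sink.

    Walks the library's internal call graph backwards from the sinks.
    """
    callers = {}
    for src, dsts in calls.items():
        for dst in dsts:
            callers.setdefault(dst, set()).add(src)
    seen = set(sinks)
    queue = deque(sinks)
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    # A sink that is itself public counts as its own trigger.
    return seen & set(public_methods)
```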

---

## 2) Data model (Authority / Postgres)

You already had call graph tables; here's a concrete schema that supports:

* graph snapshots
* incremental updates
* vuln surfaces
* reachability cache
* DSSE witnesses

### 2.1 Graph tables

```sql
CREATE TABLE cg_snapshots (
  snapshot_id BIGSERIAL PRIMARY KEY,
  service TEXT NOT NULL,
  build_id TEXT NOT NULL,
  graph_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(service, build_id)
);

CREATE TABLE cg_nodes (
  node_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  method_key TEXT NOT NULL, -- stable key (see below)
  asm_name TEXT,
  type_name TEXT,
  method_name TEXT,
  file_path TEXT,
  line_start INT,
  il_hash TEXT, -- normalized IL hash for diffing
  flags INT NOT NULL DEFAULT 0, -- bitflags: has_reflection, compiler_generated, etc.
  UNIQUE(snapshot_id, method_key)
);

CREATE TABLE cg_edges (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  src_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  dst_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind SMALLINT NOT NULL, -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
  PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);

CREATE TABLE cg_entrypoints (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind TEXT NOT NULL, -- http, grpc, cli, job, etc.
  name TEXT NOT NULL, -- GET /foo, "Main", etc.
  PRIMARY KEY(snapshot_id, node_id, kind, name)
);
```

### 2.2 Vuln surface tables (Smart-Diff artifacts)

```sql
CREATE TABLE vuln_surfaces (
  surface_id BIGSERIAL PRIMARY KEY,
  ecosystem TEXT NOT NULL, -- nuget
  package TEXT NOT NULL,
  cve_id TEXT NOT NULL,
  vuln_version TEXT NOT NULL, -- a representative vulnerable version
  fixed_version TEXT NOT NULL,
  surface_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);

CREATE TABLE vuln_surface_sinks (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  sink_method_key TEXT NOT NULL,
  reason TEXT NOT NULL, -- changed|added|removed|heuristic
  PRIMARY KEY(surface_id, sink_method_key)
);

CREATE TABLE vuln_surface_triggers (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  trigger_method_key TEXT NOT NULL,
  sink_method_key TEXT NOT NULL,
  internal_path JSONB, -- optional: library internal witness path
  PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);
```

### 2.3 Reachability cache & witnesses

```sql
CREATE TABLE reach_findings (
  finding_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  cve_id TEXT NOT NULL,
  ecosystem TEXT NOT NULL,
  package TEXT NOT NULL,
  package_version TEXT NOT NULL,
  reachable BOOLEAN NOT NULL,
  reachable_entrypoints INT NOT NULL DEFAULT 0,
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(snapshot_id, cve_id, package, package_version)
);

CREATE TABLE reach_witnesses (
  witness_id BIGSERIAL PRIMARY KEY,
  finding_id BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
  entry_node_id BIGINT REFERENCES cg_nodes(node_id),
  dsse_envelope JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

---

## 3) Stable identity: MethodKey + IL hash

### 3.1 MethodKey (must be stable across builds)

Use a normalized string like:

```
{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)
```

Examples:

* `MyApp|BillingController|Pay(System.String)`
* `LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)`
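
As a quick check of the format, the following Python sketch builds keys from already-resolved names. The function name `method_key` is hypothetical, and one detail is an assumption inferred from the examples: the backtick arity suffix is omitted for non-generic methods.

```python
def method_key(assembly, declaring_type, method, param_types, generic_arity=0):
    """Build a normalized MethodKey string from already-resolved names."""
    # Assumption: non-generic methods carry no arity suffix, as in the examples.
    arity = f"`{generic_arity}" if generic_arity else ""
    return f"{assembly}|{declaring_type}|{method}{arity}({','.join(param_types)})"
```

Both the surface builder and the app scanner would need to call the same function (or an exact port of it), since any difference in normalization makes trigger keys fail to match call-site keys.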

### 3.2 Normalized IL hash (for smart-diff + incremental graph updates)

Raw IL bytes aren't stable (metadata tokens change). Normalize:

* opcode names
* branch targets by *instruction index*, not offset
* method operands by **resolved MethodKey**
* string operands by literal or hashed literal
* type operands by full name

Then hash `SHA256(normalized_bytes)`.
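
A minimal Python sketch of that normalization follows. The instruction tuple shape `(offset, opcode, operand_kind, operand)` is a hypothetical stand-in for whatever the IL reader produces, method and type operands are assumed to arrive already resolved to MethodKeys and full names, and string operands use the hashed-literal option. The function name and the `sha256:` prefix are likewise assumptions for illustration.

```python
import hashlib


def normalized_il_hash(instructions):
    """Hash a method body after replacing build-specific operands.

    Operand kinds: "branch" (target byte offset), "method" (resolved MethodKey),
    "type" (full name), "string" (literal), or None for no operand.
    """
    # Map byte offsets to instruction indices so layout shifts do not matter.
    index_of = {offset: i for i, (offset, *_rest) in enumerate(instructions)}
    lines = []
    for _offset, opcode, kind, operand in instructions:
        if kind == "branch":
            operand = f"@{index_of[operand]}"  # instruction index, not offset
        elif kind == "string":
            operand = hashlib.sha256(operand.encode("utf-8")).hexdigest()
        elif kind is None:
            operand = ""
        lines.append(f"{opcode} {kind or '-'} {operand}")
    normalized = "\n".join(lines).encode("utf-8")
    return "sha256:" + hashlib.sha256(normalized).hexdigest()
```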

---

*[Remainder of advisory truncated for brevity - see original file for full content]*

---

## 12) What to implement first (in the order that produces value fastest)

### Week 1-2 scope (realistic, shippable)

1. Cecil call graph extraction (direct calls)
2. MVC + Minimal API entrypoints
3. Reverse BFS reachability with path witnesses
4. DSSE witness signing + storage
5. SurfaceBuilder v1:

   * IL hash per method
   * changed methods as sinks
   * triggers via internal reverse BFS
6. UI: "Show Witness" + "Verify Signature"

### Next increment (precision upgrades)

7. async/await mapping to original methods
8. RTA + DI registration hints
9. delegate tracking for Minimal API handlers (if not already)
10. interface override triggers in surface builder

### Later (if you want "attackability", not just "reachability")

11. taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
12. sanitizer modeling & parameter constraints

---

## 13) Common failure modes and how to harden

### MethodKey mismatches (surface vs app call)

* Ensure both are generated from the same normalization rules
* For generic methods, prefer **definition** keys (strip instantiation)
* Store both "exact" and "erased generic" variants if needed

### Multi-target frameworks

* SurfaceBuilder: compute triggers for each TFM, union them
* App scan: choose TFM closest to build RID, but allow fallback to union

### Huge graphs

* Drop `System.*` nodes/edges unless:

  * the vuln is in System.* (rare, but handle separately)
* Deduplicate nodes by MethodKey across assemblies where safe
* Use CSR arrays + pooled queues
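
For the CSR suggestion, here is a small Python sketch of the layout and a BFS over it. It assumes nodes are already numbered `0..num_nodes-1`, and the function names are hypothetical. It illustrates only the array layout: queue pooling is a runtime-level optimization that this sketch does not attempt.

```python
from collections import deque


def build_csr(num_nodes, edges):
    """Pack (src, dst) edges into CSR form.

    Successors of node n are targets[row_offsets[n]:row_offsets[n + 1]].
    """
    row_offsets = [0] * (num_nodes + 1)
    for src, _dst in edges:
        row_offsets[src + 1] += 1
    for n in range(num_nodes):
        row_offsets[n + 1] += row_offsets[n]
    targets = [0] * len(edges)
    cursor = row_offsets[:-1]  # next free slot per source node (slice is a copy)
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return row_offsets, targets


def shortest_path(row_offsets, targets, start, goal):
    """BFS over the CSR arrays; returns the node list from start to goal, or None."""
    parent = [-1] * (len(row_offsets) - 1)
    parent[start] = start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = [node]
            while node != start:
                node = parent[node]
                path.append(node)
            return path[::-1]
        for i in range(row_offsets[node], row_offsets[node + 1]):
            nxt = targets[i]
            if parent[nxt] == -1:
                parent[nxt] = node
                queue.append(nxt)
    return None
```

Two flat integer arrays replace per-node collections, so a graph with millions of edges stays contiguous in memory and the BFS inner loop is a simple index scan.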

### Reflection heavy projects

* Mark analysis confidence lower
* Include "unknown edges present" in finding metadata
* Still produce a witness path up to the reflective callsite

---

If you want, I can also paste a **complete Cecil-based CallGraphBuilder class** (nodes+edges+PDB lines), plus the **SurfaceBuilder** that downloads NuGet packages and generates `vuln_surface_triggers` end-to-end.