
StellaOps Bot 7d5250238c save progress

2025-12-18 09:53:46 +02:00


ARCHIVED ADVISORY

Status: Archived

Archived Date: 2025-12-18

Implementation Sprints:

SPRINT_3700_0001_0001_witness_foundation.md - BLAKE3 + Witness Schema

SPRINT_3700_0002_0001_vuln_surfaces_core.md - Vuln Surface Builder

SPRINT_3700_0003_0001_trigger_extraction.md - Trigger Method Extraction

SPRINT_3700_0004_0001_reachability_integration.md - Reachability Integration

SPRINT_3700_0005_0001_witness_ui_cli.md - Witness UI/CLI

SPRINT_3700_0006_0001_incremental_cache.md - Incremental Cache

Gap Analysis: See C:\Users\vlindos\.claude\plans\lexical-knitting-map.md

Here's a compact, practical way to add two high-leverage capabilities to your scanner: DSSE-signed path witnesses and Smart-Diff x Reachability. This covers what they are, why they matter, and exactly how to implement them in Stella Ops without ceremony.

1) DSSE-signed path witnesses (entrypoint -> calls -> sink)

What it is (in plain terms): When you flag a CVE as "reachable," also emit a tiny, human-readable proof: the exact path from a real entrypoint (e.g., HTTP route, CLI verb, cron) through functions/methods to the vulnerable sink. Wrap that proof in a DSSE envelope and sign it. Anyone can verify the witness later, offline, without rerunning the analysis.

Why it matters:

Turns red flags into auditable evidence (quiet-by-design).
Lets CI/CD, auditors, and customers verify findings independently.
Enables deterministic replay and provenance chains (ties nicely to in-toto/SLSA).

Minimal JSON witness (stable, vendor-neutral):

{
  "witness_schema": "stellaops.witness.v1",
  "artifact": { "sbom_digest": "sha256:...", "component_purl": "pkg:nuget/Example@1.2.3" },
  "vuln": { "id": "CVE-2024-XXXX", "source": "NVD", "range": "<=1.2.3" },
  "entrypoint": { "kind": "http", "name": "GET /billing/pay" },
  "path": [
    {"symbol": "BillingController.Pay()", "file": "BillingController.cs", "line": 42},
    {"symbol": "PaymentsService.Authorize()", "file": "PaymentsService.cs", "line": 88},
    {"symbol": "LibXYZ.Parser.Parse()", "file": "Parser.cs", "line": 17}
  ],
  "sink": { "symbol": "LibXYZ.Parser.Parse()", "type": "deserialization" },
  "evidence": {
    "callgraph_digest": "sha256:...",
    "build_id": "dotnet:RID:linux-x64:sha256:...",
    "analysis_config_digest": "sha256:..."
  },
  "observed_at": "2025-12-18T00:00:00Z"
}

Wrap in DSSE (payloadType & payload are required)

{
  "payloadType": "application/vnd.stellaops.witness+json",
  "payload": "base64(JSON_above)",
  "signatures": [{ "keyid": "attestor-stellaops-ed25519", "sig": "base64(...)" }]
}

.NET 10 signing/verifying (Ed25519)

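using System.Text;
using System.Text.Json;

// DSSE signs the pre-authentication encoding (PAE), not the raw payload:
// "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body (lengths are ASCII decimal byte counts).
static byte[] Pae(string payloadType, byte[] payload)
{
    var typeLen = Encoding.UTF8.GetByteCount(payloadType);
    var header = Encoding.UTF8.GetBytes($"DSSEv1 {typeLen} {payloadType} {payload.Length} ");
    return [.. header, .. payload];
}

const string payloadType = "application/vnd.stellaops.witness+json";
var payloadBytes = JsonSerializer.SerializeToUtf8Bytes(witnessJsonObj);
var dsse = new {
  payloadType,
  payload = Convert.ToBase64String(payloadBytes),
  signatures = new [] { new { keyid = keyId, sig = Convert.ToBase64String(Sign(Pae(payloadType, payloadBytes), privateKey)) } }
};

// This sketch does not assume a built-in Ed25519 type in System.Security.Cryptography.
// Plug in your Ed25519 helper here (for example one backed by NSec or BouncyCastle).
byte[] Sign(byte[] data, byte[] privateKey) =>
    throw new NotImplementedException("wire up your Ed25519 signer");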
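On the verifying side, the check mirrors signing: decode the payload, rebuild the DSSE pre-authentication encoding (PAE), and verify each signature over that encoding. The following is a minimal sketch under stated assumptions, not a definitive verifier: the `Dsse` class and the `verifySig` callback are hypothetical names, and the callback stands in for whatever Ed25519 library you use. It also assumes standard (not URL-safe) base64 in the envelope.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Text;
using System.Text.Json;

// Demo: PAE over the example values used in the DSSE specification.
var pae = Dsse.Pae("http://example.com/HelloWorld", Encoding.UTF8.GetBytes("hello world"));
Console.WriteLine(Encoding.UTF8.GetString(pae));
// prints: DSSEv1 29 http://example.com/HelloWorld 11 hello world

public static class Dsse
{
    // Pre-authentication encoding: "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body.
    public static byte[] Pae(string payloadType, byte[] payload)
    {
        var typeLen = Encoding.UTF8.GetByteCount(payloadType);
        var header = Encoding.UTF8.GetBytes(
            FormattableString.Invariant($"DSSEv1 {typeLen} {payloadType} {payload.Length} "));
        return header.Concat(payload).ToArray();
    }

    // Returns the decoded payload if at least one signature verifies, otherwise null.
    // verifySig(keyId, message, signature) is a hypothetical hook for your Ed25519 library.
    public static byte[]? VerifyEnvelope(string envelopeJson, Func<string, byte[], byte[], bool> verifySig)
    {
        using var doc = JsonDocument.Parse(envelopeJson);
        var root = doc.RootElement;
        var payloadType = root.GetProperty("payloadType").GetString()!;
        var payload = Convert.FromBase64String(root.GetProperty("payload").GetString()!);
        var message = Pae(payloadType, payload);

        foreach (var s in root.GetProperty("signatures").EnumerateArray())
        {
            var keyId = s.GetProperty("keyid").GetString() ?? "";
            var sig = Convert.FromBase64String(s.GetProperty("sig").GetString()!);
            if (verifySig(keyId, message, sig)) return payload;
        }
        return null;
    }
}
```

Only parse the witness JSON after `VerifyEnvelope` returns a non-null payload, so that unverified bytes never reach the UI or the PR gate.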

Where to emit:

Scanner.Worker: after reachability confirms reachable=true, emit witness -> Attestor signs -> Authority stores (Postgres) -> optional Rekor-style mirror.
Expose /witness/{findingId} for download & independent verification.

2) Smart-Diff x Reachability (incremental, low-noise updates)

What it is: On SBOM/VEX/dependency deltas, don't rescan everything. Update only affected regions of the call graph and recompute reachability just for changed nodes/edges.

Why it matters:

Order-of-magnitude faster incremental scans.
Fewer flaky diffs; triage stays focused on meaningful risk change.
Perfect for PR gating: "what changed" -> "what became reachable/unreachable."

Core idea (graph-reachability):

Maintain a per-service call graph G = (V, E) with entrypoint set S.
On diff: compute changed nodes/edges ΔV/ΔE.
Run incremental BFS/DFS from impacted nodes to sinks (forward or backward), reusing memoized results.
Recompute only frontiers touched by Δ.

Minimal tables (Postgres):

-- Nodes (functions/methods)
CREATE TABLE cg_nodes(
  id BIGSERIAL PRIMARY KEY,
  service TEXT, symbol TEXT, file TEXT, line INT,
  hash TEXT, UNIQUE(service, hash)
);
-- Edges (calls)
CREATE TABLE cg_edges(
  src BIGINT REFERENCES cg_nodes(id),
  dst BIGINT REFERENCES cg_nodes(id),
  kind TEXT, PRIMARY KEY(src, dst)
);
-- Entrypoints & Sinks
CREATE TABLE cg_entrypoints(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY);
CREATE TABLE cg_sinks(node_id BIGINT REFERENCES cg_nodes(id) PRIMARY KEY, sink_type TEXT);

-- Memoized reachability cache
CREATE TABLE cg_reach_cache(
  entry_id BIGINT, sink_id BIGINT,
  path JSONB, reachable BOOLEAN,
  updated_at TIMESTAMPTZ,
  PRIMARY KEY(entry_id, sink_id)
);

Incremental algorithm (pseudocode):

Input: ΔSBOM, ΔDeps, ΔCode -> ΔNodes, ΔEdges
1) Apply Δ to cg_nodes/cg_edges
2) ImpactSet = neighbors(ΔNodes ∪ endpoints(ΔEdges))
3) For each e in Entrypoints ∩ ancestors(ImpactSet):
     Recompute forward search to affected sinks, stop early on unchanged subgraphs
     Update cg_reach_cache; if state flips, emit new/updated DSSE witness

.NET 10 reachability sketch (fast & local):

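// Node ids are BIGINT in the tables above, so use long rather than int.
// ComputeImpact, Ancestors, Intersect, BoundedReach, and affectedSinks are assumed helpers/inputs.
HashSet<long> impactSet = ComputeImpact(deltaNodes, deltaEdges);
foreach (var e in Intersect(Entrypoints, Ancestors(impactSet)))
{
    var res = BoundedReach(e, affectedSinks, graph, cache);
    foreach (var r in res.Changed)   // only (entry, sink) pairs whose state flipped
    {
        cache.Upsert(e, r.Sink, r.Path, r.Reachable);
        if (r.Reachable) EmitDsseWitness(e, r.Sink, r.Path);
    }
}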
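To make the incremental step concrete, here is a small self-contained sketch of the same idea with the helpers filled in. It is one possible reading of the algorithm, not the engine's actual implementation: `IncrementalReach` and `Recompute` are hypothetical names, the graph is held in memory as plain dictionaries, and the cache and witness emission are left out. Only entrypoints that are ancestors of the impact set are re-evaluated.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Demo graph: 1 -> 2 -> 3 (sink), 4 -> 5. Assume edge 2 -> 3 was just added.
var edges = new (long Src, long Dst)[] { (1, 2), (2, 3), (4, 5) };
var result = IncrementalReach.Recompute(
    edges, entrypoints: new long[] { 1, 4 }, sinks: new long[] { 3 }, impacted: new long[] { 2, 3 });
foreach (var (entry, path) in result)
    Console.WriteLine($"{entry}: {(path is null ? "unreachable" : string.Join(" -> ", path))}");
// prints: 1: 1 -> 2 -> 3   (entrypoint 4 is skipped: it cannot reach the impacted nodes)

public static class IncrementalReach
{
    public static Dictionary<long, List<long>?> Recompute(
        IEnumerable<(long Src, long Dst)> edges, IEnumerable<long> entrypoints,
        IEnumerable<long> sinks, IEnumerable<long> impacted)
    {
        var fwd = new Dictionary<long, List<long>>();
        var rev = new Dictionary<long, List<long>>();
        foreach (var (s, d) in edges)
        {
            Add(fwd, s, d);
            Add(rev, d, s);
        }

        // Ancestors of the impact set via reverse BFS (the impacted nodes are included).
        var ancestors = new HashSet<long>(impacted);
        var queue = new Queue<long>(ancestors);
        while (queue.Count > 0)
            foreach (var p in Neighbors(rev, queue.Dequeue()))
                if (ancestors.Add(p)) queue.Enqueue(p);

        var sinkSet = new HashSet<long>(sinks);
        var results = new Dictionary<long, List<long>?>();
        foreach (var e in entrypoints.Where(ancestors.Contains))
            results[e] = ShortestPathToSink(fwd, e, sinkSet);
        return results;
    }

    // Forward BFS with parent pointers; returns the shortest entry -> sink path, or null.
    static List<long>? ShortestPathToSink(Dictionary<long, List<long>> fwd, long entry, HashSet<long> sinks)
    {
        var parent = new Dictionary<long, long> { [entry] = entry };
        var queue = new Queue<long>();
        queue.Enqueue(entry);
        while (queue.Count > 0)
        {
            var n = queue.Dequeue();
            if (sinks.Contains(n))
            {
                var path = new List<long> { n };
                while (n != entry) { n = parent[n]; path.Add(n); }
                path.Reverse();
                return path;
            }
            foreach (var m in Neighbors(fwd, n))
                if (parent.TryAdd(m, n)) queue.Enqueue(m);
        }
        return null;
    }

    static void Add(Dictionary<long, List<long>> g, long k, long v)
    {
        if (!g.TryGetValue(k, out var list)) g[k] = list = new List<long>();
        list.Add(v);
    }

    static IEnumerable<long> Neighbors(Dictionary<long, List<long>> g, long n) =>
        g.TryGetValue(n, out var list) ? list : Enumerable.Empty<long>();
}
```

The returned path is what would go into the witness `path` array; a null result for a previously reachable pair is the "now unreachable" flip.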

CI/PR flow:

Build -> SBOM diff -> Dependency diff -> Call-graph delta.
Run incremental reachability.
If any unreachable->reachable transitions: fail gate, attach DSSE witnesses.
If reachable->unreachable: auto-close prior findings (and archive prior witness).
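The gate decision in the flow above comes down to diffing two reachability maps. A rough sketch, assuming results are keyed by (entry, sink) pairs and that a pair missing from a map counts as unreachable; `PrGate`, `Flips`, and `Diff` are illustrative names, not existing Stella Ops types:

```csharp
using System;
using System.Collections.Generic;

var previous = new Dictionary<(long Entry, long Sink), bool> { [(1, 9)] = false, [(2, 9)] = true };
var current = new Dictionary<(long Entry, long Sink), bool> { [(1, 9)] = true, [(2, 9)] = false, [(3, 9)] = true };
var flips = PrGate.Diff(previous, current);
Console.WriteLine($"+{flips.NewlyReachable.Count} reachable / -{flips.NowUnreachable.Count} (now unreachable)");
// prints: +2 reachable / -1 (now unreachable)
Console.WriteLine(flips.NewlyReachable.Count > 0 ? "gate: FAIL" : "gate: PASS");
// prints: gate: FAIL

public static class PrGate
{
    public record Flips(List<(long Entry, long Sink)> NewlyReachable, List<(long Entry, long Sink)> NowUnreachable);

    // Assumption of this sketch: a pair missing from a map is treated as unreachable.
    public static Flips Diff(
        IReadOnlyDictionary<(long Entry, long Sink), bool> previous,
        IReadOnlyDictionary<(long Entry, long Sink), bool> current)
    {
        var up = new List<(long Entry, long Sink)>();
        var down = new List<(long Entry, long Sink)>();
        foreach (var (key, now) in current)
        {
            var before = previous.TryGetValue(key, out var b) && b;
            if (now && !before) up.Add(key);     // unreachable -> reachable: fail the gate
            if (!now && before) down.Add(key);   // reachable -> unreachable: auto-close
        }
        // Pairs that disappeared entirely also count as "now unreachable".
        foreach (var (key, before) in previous)
            if (before && !current.ContainsKey(key)) down.Add(key);
        return new Flips(up, down);
    }
}
```

Each entry in `NewlyReachable` would get a DSSE witness attached to the PR check; each entry in `NowUnreachable` would close the prior finding and archive its witness.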

UX hooks (quick wins)

In findings list, add a "Show Witness" button -> modal renders the signed path (entrypoint->...->sink) + "Verify Signature" one-click.
In PR checks, summarize only state flips with tiny links: "+2 reachable (view witness)" / "-1 (now unreachable)".

Minimal tasks to get this live

Scanner.Worker: build call-graph extraction (per language), add incremental graph store, reachability cache.
Attestor: DSSE signing endpoint + key management (Ed25519 by default; PQC mode later).
Authority: tables above + witness storage + retrieval API.
Router/CI plugin: PR annotation with state flips and links to witnesses.
UI: witness modal + signature verify.

If you want, I can draft the exact Postgres migrations, the C# repositories, and a tiny verifier CLI that checks DSSE signatures and prints the call path.

Below is a concrete, buildable blueprint for an advanced reachability analysis engine inside Stella Ops. I'm going to assume your "Stella Ops" components are roughly:

Scanner.Worker: runs analyses in CI / on artifacts
Authority: stores graphs/findings/witnesses
Attestor: signs DSSE envelopes (Ed25519)
(optional) SurfaceBuilder: background worker that computes "vuln surfaces" for packages

The key advance is: don't treat a CVE as "a package". Treat it as a set of trigger methods (public API) that can reach the vulnerable code inside the dependency. That set is computed by "Smart-Diff" once and reused everywhere.

0) Define the contract (precision/soundness) up front

If you don't write this down, you'll fight false positives/negatives forever.

What Stella Ops will guarantee (first release)

Whole-program static call graph (app + selected dependency assemblies)
Context-insensitive (fast), path witness extracted (shortest path)
Dynamic dispatch handled with CHA/RTA (+ DI hints), with explicit uncertainty flags
Reflection handled best-effort (constant-string resolution), otherwise "unknown edge"

What it will NOT guarantee (first release)

Perfect handling of reflection / dynamic / runtime codegen
Perfect delegate/event resolution across complex flows
Full taint/dataflow reachability (you can add later)

This is fine. The major value is: "we can show you the call path" and "we can prove the vuln is triggered by calling these library APIs".

1) The big idea: "Vuln surfaces" (Smart-Diff -> triggers)

Problem

CVE feeds typically say "package X version range Y is vulnerable" but rarely say which methods. If you only do package-level reachability, noise is huge.

Solution

For each CVE+package, compute a vulnerability surface:

Candidate sinks = methods changed between vulnerable and fixed versions (diff at IL level)
Trigger methods = public/exported methods in the vulnerable version that can reach those changed methods internally

Then your service scan becomes:

"Can any entrypoint reach any trigger method?"

This is both faster and more precise.
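As a rough illustration of the surface computation described above: diff the per-method IL hashes of the vulnerable and fixed versions to get candidate sinks, then walk the vulnerable version's internal call graph backwards to find the public methods that can reach them. This is a sketch under stated assumptions, not the SurfaceBuilder itself. `VulnSurface`, `Surface`, and `Compute` are hypothetical names, inputs are plain in-memory collections, and only changed or removed methods are treated as sinks (methods added by the fix do not exist in the vulnerable graph).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical library: public Parse() calls internal ReadToken(), whose IL changed in the fix.
var vulnerable = new Dictionary<string, string> { ["Lib|P|Parse()"] = "h1", ["Lib|P|ReadToken()"] = "h2", ["Lib|P|Version()"] = "h3" };
var patched = new Dictionary<string, string> { ["Lib|P|Parse()"] = "h1", ["Lib|P|ReadToken()"] = "h2-fixed", ["Lib|P|Version()"] = "h3" };
var calls = new[] { ("Lib|P|Parse()", "Lib|P|ReadToken()") };
var publicMethods = new HashSet<string> { "Lib|P|Parse()", "Lib|P|Version()" };

var surface = VulnSurface.Compute(vulnerable, patched, calls, publicMethods);
Console.WriteLine($"sinks: {string.Join(", ", surface.Sinks)}");       // sinks: Lib|P|ReadToken()
Console.WriteLine($"triggers: {string.Join(", ", surface.Triggers)}"); // triggers: Lib|P|Parse()

public static class VulnSurface
{
    public record Surface(HashSet<string> Sinks, HashSet<string> Triggers);

    public static Surface Compute(
        IReadOnlyDictionary<string, string> vulnHashes,    // method key -> normalized IL hash
        IReadOnlyDictionary<string, string> fixedHashes,
        IEnumerable<(string Caller, string Callee)> vulnCalls,
        ISet<string> publicMethods)
    {
        // Candidate sinks: methods of the vulnerable version that the fix changed or removed.
        var sinks = vulnHashes
            .Where(kv => !fixedHashes.TryGetValue(kv.Key, out var h) || h != kv.Value)
            .Select(kv => kv.Key)
            .ToHashSet();

        // Reverse adjacency of the vulnerable version's internal call graph.
        var callers = vulnCalls.ToLookup(e => e.Callee, e => e.Caller);

        // Reverse BFS from the sinks collects every method that can reach one.
        var reaches = new HashSet<string>(sinks);
        var queue = new Queue<string>(sinks);
        while (queue.Count > 0)
            foreach (var caller in callers[queue.Dequeue()])
                if (reaches.Add(caller)) queue.Enqueue(caller);

        // Triggers: the public subset (a changed public method is both sink and trigger).
        var triggers = reaches.Where(publicMethods.Contains).ToHashSet();
        return new Surface(sinks, triggers);
    }
}
```

`Version()` is public but never reaches the changed method, so an app that only calls `Version()` would not be flagged even though it depends on the vulnerable package.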

2) Data model (Authority / Postgres)

You already had call graph tables; here's a concrete schema that supports:

graph snapshots
incremental updates
vuln surfaces
reachability cache
DSSE witnesses

2.1 Graph tables

CREATE TABLE cg_snapshots (
  snapshot_id BIGSERIAL PRIMARY KEY,
  service TEXT NOT NULL,
  build_id TEXT NOT NULL,
  graph_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(service, build_id)
);

CREATE TABLE cg_nodes (
  node_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  method_key TEXT NOT NULL,              -- stable key (see below)
  asm_name TEXT,
  type_name TEXT,
  method_name TEXT,
  file_path TEXT,
  line_start INT,
  il_hash TEXT,                          -- normalized IL hash for diffing
  flags INT NOT NULL DEFAULT 0,          -- bitflags: has_reflection, compiler_generated, etc.
  UNIQUE(snapshot_id, method_key)
);

CREATE TABLE cg_edges (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  src_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  dst_node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind SMALLINT NOT NULL,                -- 0=call,1=newobj,2=dispatch,3=delegate,4=reflection_guess,...
  PRIMARY KEY(snapshot_id, src_node_id, dst_node_id, kind)
);

CREATE TABLE cg_entrypoints (
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  node_id BIGINT REFERENCES cg_nodes(node_id) ON DELETE CASCADE,
  kind TEXT NOT NULL,                    -- http, grpc, cli, job, etc.
  name TEXT NOT NULL,                    -- GET /foo, "Main", etc.
  PRIMARY KEY(snapshot_id, node_id, kind, name)
);

2.2 Vuln surface tables (Smart-Diff artifacts)

CREATE TABLE vuln_surfaces (
  surface_id BIGSERIAL PRIMARY KEY,
  ecosystem TEXT NOT NULL,               -- nuget
  package TEXT NOT NULL,
  cve_id TEXT NOT NULL,
  vuln_version TEXT NOT NULL,            -- a representative vulnerable version
  fixed_version TEXT NOT NULL,
  surface_digest TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(ecosystem, package, cve_id, vuln_version, fixed_version)
);

CREATE TABLE vuln_surface_sinks (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  sink_method_key TEXT NOT NULL,
  reason TEXT NOT NULL,                  -- changed|added|removed|heuristic
  PRIMARY KEY(surface_id, sink_method_key)
);

CREATE TABLE vuln_surface_triggers (
  surface_id BIGINT REFERENCES vuln_surfaces(surface_id) ON DELETE CASCADE,
  trigger_method_key TEXT NOT NULL,
  sink_method_key TEXT NOT NULL,
  internal_path JSONB,                   -- optional: library internal witness path
  PRIMARY KEY(surface_id, trigger_method_key, sink_method_key)
);

2.3 Reachability cache & witnesses

CREATE TABLE reach_findings (
  finding_id BIGSERIAL PRIMARY KEY,
  snapshot_id BIGINT REFERENCES cg_snapshots(snapshot_id) ON DELETE CASCADE,
  cve_id TEXT NOT NULL,
  ecosystem TEXT NOT NULL,
  package TEXT NOT NULL,
  package_version TEXT NOT NULL,
  reachable BOOLEAN NOT NULL,
  reachable_entrypoints INT NOT NULL DEFAULT 0,
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE(snapshot_id, cve_id, package, package_version)
);

CREATE TABLE reach_witnesses (
  witness_id BIGSERIAL PRIMARY KEY,
  finding_id BIGINT REFERENCES reach_findings(finding_id) ON DELETE CASCADE,
  entry_node_id BIGINT REFERENCES cg_nodes(node_id),
  dsse_envelope JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

3) Stable identity: MethodKey + IL hash

3.1 MethodKey (must be stable across builds)

Use a normalized string like:

{AssemblyName}|{DeclaringTypeFullName}|{MethodName}`{GenericArity}({ParamType1},{ParamType2},...)

Examples:

MyApp|BillingController|Pay(System.String)
LibXYZ|LibXYZ.Parser|Parse(System.ReadOnlySpan<System.Byte>)
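A small formatter for this key might look like the sketch below. It assumes the generic-arity suffix is dropped for non-generic methods, which is what the examples suggest but the format line does not state outright. `MethodKey.Build` and the `LibXYZ.Util` / `Map` names in the second call are made up for illustration.

```csharp
using System;

Console.WriteLine(MethodKey.Build("MyApp", "BillingController", "Pay", 0, "System.String"));
// prints: MyApp|BillingController|Pay(System.String)
Console.WriteLine(MethodKey.Build("LibXYZ", "LibXYZ.Util", "Map", 2, "System.Object"));
// prints: LibXYZ|LibXYZ.Util|Map`2(System.Object)

public static class MethodKey
{
    // Assumption: the `{GenericArity} suffix is omitted when the method is not generic,
    // matching the examples in the text.
    public static string Build(string assembly, string declaringType, string method,
                               int genericArity, params string[] parameterTypes)
    {
        var arity = genericArity > 0 ? $"`{genericArity}" : "";
        return $"{assembly}|{declaringType}|{method}{arity}({string.Join(",", parameterTypes)})";
    }
}
```

Whatever rules you settle on, generate keys for both the app graph and the vuln surface through the same function so they cannot drift apart.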

3.2 Normalized IL hash (for smart-diff + incremental graph updates)

Raw IL bytes aren't stable (metadata tokens change). Normalize:

opcode names
branch targets by instruction index, not offset
method operands by resolved MethodKey
string operands by literal or hashed literal
type operands by full name

Then hash SHA256(normalized_bytes).
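For illustration, hashing an already-normalized method body could look like this. It is a sketch only: the `Instr` record and `IlHash.Compute` are hypothetical, the normalization itself (resolving tokens to MethodKeys, offsets to instruction indexes) is assumed to have happened upstream, and the length-prefixed text encoding is just one unambiguous serialization among many.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// The same method body from two builds: once normalized, the instruction lists are equal.
var build1 = new[] { new Instr("ldarg.0", ""), new Instr("call", "LibXYZ|LibXYZ.Parser|Parse(System.String)"), new Instr("ret", "") };
var build2 = new[] { new Instr("ldarg.0", ""), new Instr("call", "LibXYZ|LibXYZ.Parser|Parse(System.String)"), new Instr("ret", "") };
Console.WriteLine(IlHash.Compute(build1) == IlHash.Compute(build2)); // prints: True

// One normalized instruction: opcode name plus an already-resolved operand
// (MethodKey, type full name, string literal, or branch target index as text).
public record Instr(string OpCode, string Operand);

public static class IlHash
{
    public static string Compute(IReadOnlyList<Instr> body)
    {
        var sb = new StringBuilder();
        foreach (var i in body)
        {
            // Length-prefix each field so field boundaries are unambiguous.
            sb.Append(i.OpCode.Length).Append(':').Append(i.OpCode)
              .Append(i.Operand.Length).Append(':').Append(i.Operand).Append('\n');
        }
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Two methods with equal hashes are treated as unchanged by both the smart-diff and the incremental graph update, so any operand you forget to normalize shows up as a spurious "changed" method.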

[Remainder of advisory truncated for brevity - see original file for full content]

12) What to implement first (in the order that produces value fastest)

Week 1-2 scope (realistic, shippable)

Cecil call graph extraction (direct calls)
MVC + Minimal API entrypoints
Reverse BFS reachability with path witnesses
DSSE witness signing + storage
SurfaceBuilder v1:
- IL hash per method
- changed methods as sinks
- triggers via internal reverse BFS
UI: "Show Witness" + "Verify Signature"

Next increment (precision upgrades)

async/await mapping to original methods
RTA + DI registration hints
delegate tracking for Minimal API handlers (if not already)
interface override triggers in surface builder

Later (if you want "attackability", not just "reachability")

taint/dataflow for top sink classes (deserialization, path traversal, SQL, command exec)
sanitizer modeling & parameter constraints

13) Common failure modes and how to harden

MethodKey mismatches (surface vs app call)

Ensure both are generated from the same normalization rules
For generic methods, prefer definition keys (strip instantiation)
Store both "exact" and "erased generic" variants if needed

Multi-target frameworks

SurfaceBuilder: compute triggers for each TFM, union them
App scan: choose TFM closest to build RID, but allow fallback to union

Huge graphs

Drop System.* nodes/edges unless:
- the vuln is in System.* (rare, but handle separately)
Deduplicate nodes by MethodKey across assemblies where safe
Use CSR arrays + pooled queues
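For the CSR suggestion, here is a minimal sketch of a compressed sparse row call graph with a BFS over it. It assumes nodes have already been renumbered to dense ints 0..N-1; the `CsrGraph` type is hypothetical, and queue pooling is left out to keep the example short.

```csharp
using System;
using System.Collections.Generic;

var g = CsrGraph.Build(nodeCount: 4, edges: new[] { (0, 1), (0, 2), (2, 3) });
Console.WriteLine(string.Join(",", g.Reachable(0))); // prints: True,True,True,True
Console.WriteLine(string.Join(",", g.Reachable(2))); // prints: False,False,True,True

public sealed class CsrGraph
{
    readonly int[] _offsets;  // callees of node n live in _targets[_offsets[n] .. _offsets[n + 1])
    readonly int[] _targets;

    CsrGraph(int[] offsets, int[] targets) { _offsets = offsets; _targets = targets; }

    public static CsrGraph Build(int nodeCount, IReadOnlyList<(int Src, int Dst)> edges)
    {
        var offsets = new int[nodeCount + 1];
        foreach (var (s, _) in edges) offsets[s + 1]++;                    // out-degree per node
        for (int i = 0; i < nodeCount; i++) offsets[i + 1] += offsets[i];  // prefix sum

        var targets = new int[edges.Count];
        var cursor = (int[])offsets.Clone();                               // next free slot per node
        foreach (var (s, d) in edges) targets[cursor[s]++] = d;
        return new CsrGraph(offsets, targets);
    }

    // Plain BFS; returns a flag per node telling whether it is reachable from start.
    public bool[] Reachable(int start)
    {
        var seen = new bool[_offsets.Length - 1];
        var queue = new Queue<int>();
        seen[start] = true;
        queue.Enqueue(start);
        while (queue.Count > 0)
        {
            int n = queue.Dequeue();
            for (int i = _offsets[n]; i < _offsets[n + 1]; i++)
            {
                int m = _targets[i];
                if (!seen[m]) { seen[m] = true; queue.Enqueue(m); }
            }
        }
        return seen;
    }
}
```

Two flat int arrays replace one list object per node, which is where the memory and cache-locality win comes from on graphs with millions of edges.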

Reflection heavy projects

Mark analysis confidence lower
Include "unknown edges present" in finding metadata
Still produce a witness path up to the reflective callsite

If you want, I can also paste a complete Cecil-based CallGraphBuilder class (nodes+edges+PDB lines), plus the SurfaceBuilder that downloads NuGet packages and generates vuln_surface_triggers end-to-end.
