Remove global.json and add extensive documentation for SBOM-first supply chain spine, diff-aware releases, binary intelligence graph, reachability proofs, smart-diff evidence, risk budget visualization, and weighted confidence for VEX sources. Introduce solution file for Concelier web service project.

2025-12-26 11:27:18 +02:00
parent 4f6dd4de83
commit e95eff2542
12 changed files with 695 additions and 144227 deletions
--- a/docs/product-advisories/26-Dec-2026
+++ b/docs/product-advisories/26-Dec-2026
@@ -0,0 +1,145 @@
+Here’s a compact blueprint for a **binary‑level knowledge base** that maps ELF Build‑IDs / PE signatures to vulnerable functions, patch lineage, and reachability hints—so your scanner can act like a provenance‑aware “binary oracle,” not just a CVE lookup.
+
+---
+
+# Why this matters (in plain terms)
+
+* **Same version ≠ same risk.** Distros (and vendors) frequently **backport** fixes without bumping the upstream version number. Only the **binary** tells the truth.
+* **Function‑level matching** turns noisy “package has CVE” into precise “this exact function range is vulnerable in your binary.”
+* **Reachability hints** cut triage noise by ranking first the vulnerabilities whose code paths can actually be reached at runtime.
+
+---
+
+# Minimal starter schema (MVP)
+
+Keep it tiny so it grows with real evidence:
+
+**artifacts**
+
+* `id (pk)`
+* `platform` (linux, windows)
+* `format` (ELF, PE)
+* `build_id` (ELF `.note.gnu.build-id`), `pdb_guid` / `pe_imphash` (Windows)
+* `sha256` (whole‑file)
+* `compiler_fingerprint` (e.g., `gcc-13.2`, `msvc-19.39`)
+* `source_hint` (optional: package name/version, if known)
+
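As an illustration of where `build_id` comes from, here is a minimal sketch that pulls a GNU build ID out of raw ELF note bytes. It assumes the caller has already located the note section or `PT_NOTE` segment, that notes are 4-byte aligned, and that the file is little-endian unless stated otherwise. The helper name `build_id_from_notes` is hypothetical and not part of Scanner.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used for GNU build IDs


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed note alignment)."""
    return (n + 3) & ~3


def build_id_from_notes(notes: bytes, little_endian: bool = True) -> Optional[str]:
    """Return the build ID as lowercase hex, or None if no such note exists.

    `notes` is assumed to hold the raw bytes of an ELF note section or
    PT_NOTE segment; locating that region in the file is out of scope here.
    """
    header = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, note_type = struct.unpack_from(header, notes, offset)
        name_start = offset + 12
        desc_start = name_start + _align4(namesz)
        desc_end = desc_start + descsz
        if desc_end > len(notes):
            break  # truncated note: stop rather than read past the buffer
        name = notes[name_start:name_start + namesz]
        if name == b"GNU\x00" and note_type == NT_GNU_BUILD_ID:
            return notes[desc_start:desc_end].hex()
        offset = desc_start + _align4(descsz)
    return None
```

The hex string returned here is what would be stored in `artifacts.build_id`. A production indexer would also need to walk the ELF headers to find the note region and handle other alignments.
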
+**symbols**
+
+* `artifact_id (fk)`
+* `symbol_name`
+* `addr_start`, `addr_end` (or RVA for PE)
+* `section`, `file_offset` (optional)
+
+**vuln_segments**
+
+* `id (pk)`
+* `cve_id` (CVE‑YYYY‑NNNN)
+* `function_signature` (normalized name + arity)
+* `byte_sig` (short stable pattern around the vulnerable hunk)
+* `patch_sig` (pattern from fixed hunk)
+* `evidence_ref` (link to patch diff, commit, or NVD note)
+* `backport_flag` (bool)
+* `introduced_in`, `fixed_in` (semver-ish text; note “backport” when used)
+
+**matches**
+
+* `artifact_id (fk)`, `vuln_segment_id (fk)`
+* `match_type` (`byte`, `range`, `symbol`)
+* `confidence` (0–1)
+* `explain` (why we think this matches)
+
+**reachability_hints**
+
+* `artifact_id (fk)`, `symbol_name`
+* `hint_type` (`imported`, `exported`, `hot`, `ebpf_seen`, `graph_core`)
+* `weight` (0–100)
+
+---
+
+# How the oracle answers “Am I affected?”
+
+1. **Identify**: Look up by Build‑ID / PE signature; fall back to file hash.
+2. **Locate**: Map symbols → address ranges; scan for `byte_sig`/`patch_sig`.
+3. **Decide**:
+
+   * if `patch_sig` is present ⇒ **Not affected (backported)**; this check takes precedence over the rest.
+   * if `byte_sig` is present and reachable (weighted) ⇒ **Affected (prioritized)**.
+   * if `byte_sig` is present but unreachable ⇒ **Affected (low priority)**.
+   * if neither signature is present ⇒ **Unknown**.
+4. **Explain**: Attach `evidence_ref`, the exact offsets, and the reason (match_type + reachability).
+
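The Locate/Decide steps above can be sketched roughly as follows. This is a simplified illustration, not the Scanner implementation: it searches the whole blob with `bytes.find` instead of per-symbol ranges, and the `Verdict` type, the `reach_threshold` default of 50, and the "strongest hint wins" aggregation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Verdict:
    status: str            # "not_affected", "affected", or "unknown"
    priority: str          # "high", "low", or "n/a"
    offset: Optional[int]  # offset of the signature that decided the verdict
    reason: str


def decide(blob: bytes, byte_sig: bytes, patch_sig: bytes,
           hint_weights: Iterable[int], reach_threshold: int = 50) -> Verdict:
    """Apply the four decision rules in order; patch_sig is checked first."""
    # Empty signatures are skipped, because bytes.find(b"") would match at 0.
    patch_at = blob.find(patch_sig) if patch_sig else -1
    if patch_at >= 0:
        return Verdict("not_affected", "n/a", patch_at, "patch_sig present")

    vuln_at = blob.find(byte_sig) if byte_sig else -1
    if vuln_at < 0:
        return Verdict("unknown", "n/a", None, "no signature matched")

    # Hypothetical aggregation: the strongest single hint decides reachability.
    best = max(hint_weights, default=0)
    if best >= reach_threshold:
        return Verdict("affected", "high", vuln_at,
                       f"byte_sig present, reachability weight {best}")
    return Verdict("affected", "low", vuln_at,
                   f"byte_sig present, reachability weight {best} below threshold")
```

The `offset` and `reason` fields correspond to the Explain step. A real matcher would also carry `evidence_ref` and `match_type` through to the output.
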
+---
+
+# Ingestion pipeline (no humans in the loop)
+
+* **Fingerprinting**: extract Build‑ID / PE GUID; compute `sha256`.
+* **Symbol map**: parse DWARF/PDB if present; otherwise fall back to the ELF symbol tables (`.symtab`/`.dynsym`) or PE exports.
+* **Patch intelligence**: auto‑diff upstream commits (plus major distros) → synthesize short **byte signatures** around changed hunks (stable across relocations).
+* **Evidence links**: store URLs/commit IDs for cross‑audit.
+* **Noise control**: accept a vuln signature only if it matches at least N independent binaries across distros (N ≥ 3; tunable).
+
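One possible reading of the noise-control rule, sketched below: a candidate signature is accepted only when it matched enough distinct binaries. The advisory does not say how "independent" or "across distros" should be measured, so counting distinct SHA-256 values and requiring at least two distros (`min_distros`) are assumptions of this example, as is the function name.

```python
from typing import Iterable, Tuple


def accept_signature(hits: Iterable[Tuple[str, str]],
                     min_binaries: int = 3, min_distros: int = 2) -> bool:
    """Decide whether a candidate signature has enough independent support.

    Each hit is a (sha256_hex, distro) pair for a binary the signature matched.
    Duplicate hashes count once, so re-scanning the same file adds no support.
    """
    binaries = set()
    distros = set()
    for sha256_hex, distro in hits:
        binaries.add(sha256_hex.lower())
        distros.add(distro.lower())
    return len(binaries) >= min_binaries and len(distros) >= min_distros
```

Both thresholds are parameters so the rule stays tunable, as the bullet above requires.
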
+---
+
+# Deterministic verdicts (fit to Stella Ops)
+
+* **Inputs**: `(artifact fingerprint, vuln_segments@version, reachability@policy)`
+* **Output**: **Signed OCI attestation** “verdict.json” (same inputs → same verdict).
+* **Replay**: keep rule bundle & feed hashes for audit.
+* **Backport precedence**: `patch_sig` beats package version claims every time.
+
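The "same inputs → same verdict" property depends on a stable serialization. The sketch below shows one way to get it with canonical JSON (sorted keys, fixed separators, no timestamps) and a SHA-256 digest over the inputs. The field names and `build_verdict` are illustrative assumptions, and signing and OCI packaging are left out.

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal inputs
    always produce identical bytes."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=True).encode("utf-8")


def build_verdict(fingerprint: str, segments_version: str,
                  policy_hash: str, status: str) -> dict:
    """Assemble a verdict document that embeds a digest of its own inputs."""
    inputs = {
        "artifact_fingerprint": fingerprint,
        "vuln_segments_version": segments_version,
        "reachability_policy": policy_hash,
    }
    digest = hashlib.sha256(canonical_bytes(inputs)).hexdigest()
    return {
        "inputs": inputs,
        "inputs_digest": "sha256:" + digest,
        "status": status,
    }
```

Because the document carries no wall-clock data, replaying with the same rule bundle and feed hashes yields byte-identical output from `canonical_bytes(build_verdict(...))`, which is what a signature would then cover.
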
+---
+
+# Fast path to MVP (2 sprints)
+
+* Add a **Build‑ID/PE indexer** to Scanner.
+* Teach Feedser/Vexer to ingest `vuln_segments` (with `byte_sig`/`patch_sig`).
+* Implement matching + verdict attestation; surface **“Backported & Safe”** vs **“Affected & Reachable”** badges in UI.
+* Seed the DB with 10 high‑impact CVEs drawn from widely deployed projects (OpenSSL, zlib, xz, glibc, libxml2, curl, musl, busybox, OpenSSH, sudo).
+
+---
+
+# Example: SQL skeleton (Postgres)
+
+```sql
+-- One row per scanned binary; sha256 is the fallback identity.
+create table artifacts(
+  id bigserial primary key,
+  platform text, format text,
+  build_id text, pdb_guid text, pe_imphash text,
+  sha256 bytea not null unique,
+  compiler_fingerprint text, source_hint text
+);
+
+-- Function symbols and their address ranges (RVA for PE).
+create table symbols(
+  artifact_id bigint not null references artifacts(id),
+  symbol_name text, addr_start bigint, addr_end bigint,
+  section text, file_offset bigint
+);
+
+-- Vulnerable and fixed byte patterns per CVE and function.
+create table vuln_segments(
+  id bigserial primary key,
+  cve_id text, function_signature text,
+  byte_sig bytea, patch_sig bytea,
+  evidence_ref text, backport_flag boolean,
+  introduced_in text, fixed_in text
+);
+
+-- Signature hits; confidence is constrained to the documented 0–1 range.
+create table matches(
+  artifact_id bigint not null references artifacts(id),
+  vuln_segment_id bigint not null references vuln_segments(id),
+  match_type text,
+  confidence real check (confidence between 0 and 1),
+  explain text
+);
+
+-- Reachability evidence per symbol; weight is constrained to 0–100.
+create table reachability_hints(
+  artifact_id bigint not null references artifacts(id),
+  symbol_name text, hint_type text,
+  weight int check (weight between 0 and 100)
+);
+```
+
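To show how the Identify step might query this schema, here is a small sketch that looks an artifact up by Build‑ID and falls back to the whole‑file hash. It uses an in-memory SQLite table as a stand-in for Postgres (so `integer`/`blob` replace `bigserial`/`bytea`). The function name and the "lowest id wins" tie-break for duplicate Build‑IDs are assumptions of the example.

```python
import sqlite3
from typing import Optional

# SQLite rendering of the `artifacts` table, for illustration only.
ARTIFACTS_DDL = """
create table artifacts(
  id integer primary key,
  platform text, format text,
  build_id text, pdb_guid text, pe_imphash text,
  sha256 blob not null unique,
  compiler_fingerprint text, source_hint text
)
"""


def find_artifact_id(conn: sqlite3.Connection, build_id: Optional[str],
                     sha256: bytes) -> Optional[int]:
    """Look up by Build-ID first, then fall back to the whole-file hash."""
    if build_id:
        row = conn.execute(
            "select id from artifacts where build_id = ? order by id limit 1",
            (build_id,)).fetchone()
        if row is not None:
            return row[0]
    row = conn.execute(
        "select id from artifacts where sha256 = ?", (sha256,)).fetchone()
    return row[0] if row is not None else None
```

Against Postgres the same two queries would apply with the driver's own placeholder syntax. An index on `build_id` would likely be worth adding once the table grows.
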
+---
+
+Possible follow-ups:
+
+* add a small **.NET 10** matcher (ELF/PE parsers + byte‑window scanner),
+* wire verdicts as **OCI attestations** into the current pipeline,
+* prepare the first **10 CVE byte/patch signatures** to seed the DB.