Remove global.json and add extensive documentation for SBOM-first supply chain spine, diff-aware releases, binary intelligence graph, reachability proofs, smart-diff evidence, risk budget visualization, and weighted confidence for VEX sources. Introduce solution file for Concelier web service project.

StellaOps Bot
2025-12-26 11:27:18 +02:00
parent 4f6dd4de83
commit e95eff2542
12 changed files with 695 additions and 144227 deletions

@@ -0,0 +1,145 @@
Here's a compact blueprint for a **binary-level knowledge base** that maps ELF Build-IDs / PE signatures to vulnerable functions, patch lineage, and reachability hints, so your scanner can act like a provenance-aware “binary oracle,” not just a CVE lookup.
---
# Why this matters (in plain terms)
* **Same version ≠ same risk.** Distros (and vendors) frequently **backport** fixes without bumping versions. Only the **binary** tells the truth.
* **Function-level matching** turns noisy “package has CVE” into precise “this exact function range is vulnerable in your binary.”
* **Reachability hints** cut triage noise by ranking vulnerabilities by whether a code path can actually reach them at runtime.
---
# Minimal starter schema (MVP)
Keep it tiny so it grows with real evidence:
**artifacts**
* `id (pk)`
* `platform` (linux, windows)
* `format` (ELF, PE)
* `build_id` (ELF `.note.gnu.build-id`), `pdb_guid` / `pe_imphash` (Windows)
* `sha256` (whole-file)
* `compiler_fingerprint` (e.g., `gcc-13.2`, `msvc-19.39`)
* `source_hint` (optional: package name/version if known)
**symbols**
* `artifact_id (fk)`
* `symbol_name`
* `addr_start`, `addr_end` (or RVA for PE)
* `section`, `file_offset` (optional)
**vuln_segments**
* `id (pk)`
* `cve_id` (`CVE-YYYY-NNNN`)
* `function_signature` (normalized name + arity)
* `byte_sig` (short stable pattern around the vulnerable hunk)
* `patch_sig` (pattern from fixed hunk)
* `evidence_ref` (link to patch diff, commit, or NVD note)
* `backport_flag` (bool)
* `introduced_in`, `fixed_in` (semver-ish text; note “backport” when used)
**matches**
* `artifact_id (fk)`, `vuln_segment_id (fk)`
* `match_type` (`byte`, `range`, `symbol`)
* `confidence` (0–1)
* `explain` (why we think this matches)
**reachability_hints**
* `artifact_id (fk)`, `symbol_name`
* `hint_type` (`imported`, `exported`, `hot`, `ebpf_seen`, `graph_core`)
* `weight` (0–100)
---
# How the oracle answers “Am I affected?”
1. **Identify**: Look up by Build-ID / PE signature; fall back to file hash.
2. **Locate**: Map symbols → address ranges; scan for `byte_sig`/`patch_sig`.
3. **Decide**:
   * if `patch_sig` present ⇒ **Not affected (backported)**.
   * if `byte_sig` present and reachable (weighted) ⇒ **Affected (prioritized)**.
   * if only `byte_sig` present, unreachable ⇒ **Affected (low priority)**.
   * if neither ⇒ **Unknown**.
4. **Explain**: Attach `evidence_ref`, the exact offsets, and the reason (match_type + reachability).
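
The decision step above can be sketched as a small pure function. This is a minimal illustration, not the project's implementation: the `Scan` and `Verdict` types and the `REACHABLE_THRESHOLD` cut-off are hypothetical names, and the blueprint only says reachability is "weighted" without fixing a threshold.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    NOT_AFFECTED_BACKPORTED = "not_affected_backported"
    AFFECTED_PRIORITIZED = "affected_prioritized"
    AFFECTED_LOW_PRIORITY = "affected_low_priority"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Scan:
    """Result of scanning one artifact for one vuln_segment (hypothetical shape)."""
    patch_sig_found: bool
    byte_sig_found: bool
    reachability_weight: int  # 0-100, aggregated from reachability_hints


# Assumption: the blueprint does not define a cut-off for "reachable".
REACHABLE_THRESHOLD = 50


def decide(scan: Scan, threshold: int = REACHABLE_THRESHOLD) -> Verdict:
    # The patched hunk wins over everything else (backport precedence).
    if scan.patch_sig_found:
        return Verdict.NOT_AFFECTED_BACKPORTED
    if scan.byte_sig_found:
        if scan.reachability_weight >= threshold:
            return Verdict.AFFECTED_PRIORITIZED
        return Verdict.AFFECTED_LOW_PRIORITY
    # Neither signature was found: no evidence either way.
    return Verdict.UNKNOWN
```

Checking `patch_sig` first means a binary that happens to contain both patterns is still reported as backported, which is how the blueprint words its precedence rule.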
---
# Ingestion pipeline (no humans in the loop)
* **Fingerprinting**: extract Build-ID / PE GUID; compute `sha256`.
* **Symbol map**: parse DWARF/PDB if present; else fall back to heuristics (ELF `.symtab`/`.dynsym`, PE export table).
* **Patch intelligence**: auto-diff upstream commits (plus major distros) → synthesize short **byte signatures** around changed hunks (stable across relocations).
* **Evidence links**: store URLs/commit IDs for cross-audit.
* **Noise control**: only accept a vuln signature if it hits N≥3 independent binaries across distros (tunable).
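
Two of the pipeline steps above lend themselves to a short sketch: matching a byte signature that stays stable across relocations, and the N≥3 acceptance gate. The masked-pattern format (a `mask` byte of `0xFF` means "must match", `0x00` means "wildcard") is an assumption for illustration; the blueprint does not specify how signatures are encoded. Treating "independent binaries" as distinct `sha256` values is likewise an assumption.

```python
from typing import Iterable, Optional, Tuple


def find_signature(data: bytes, pattern: bytes, mask: bytes) -> Optional[int]:
    """Return the first offset where pattern matches data under mask, else None.

    Masked-out bytes (mask == 0x00) are ignored, so relocated operands such as
    call targets do not break the match.
    """
    if len(pattern) != len(mask) or not pattern:
        raise ValueError("pattern and mask must be non-empty and the same length")
    n = len(pattern)
    for off in range(len(data) - n + 1):
        if all((data[off + i] & mask[i]) == (pattern[i] & mask[i]) for i in range(n)):
            return off
    return None


def accept_signature(hits: Iterable[Tuple[str, str]], min_binaries: int = 3) -> bool:
    """hits are (distro, sha256_hex) pairs where the signature matched.

    Assumption: "independent" means distinct file hashes.
    """
    distinct = {sha for _distro, sha in hits}
    return len(distinct) >= min_binaries
```

For example, the pattern `e8 ?? ?? ?? ?? c3` (a relative call followed by `ret`) would be written as `pattern=b"\xe8\x00\x00\x00\x00\xc3"` with `mask=b"\xff\x00\x00\x00\x00\xff"`. The linear scan is fine for a sketch; a production matcher would likely restrict the search to the symbol's address range.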
---
# Deterministic verdicts (fit to StellaOps)
* **Inputs**: `(artifact fingerprint, vuln_segments@version, reachability@policy)`
* **Output**: **Signed OCI attestation** “verdict.json” (same inputs → same verdict).
* **Replay**: keep rule bundle & feed hashes for audit.
* **Backport precedence**: `patch_sig` beats package version claims every time.
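
One way to get the "same inputs → same verdict" property is to build the verdict document only from the declared inputs and serialize it canonically. The sketch below assumes a hypothetical `verdict.json` layout (the field names are invented for illustration) and leaves signing and OCI packaging out entirely.

```python
import hashlib
import json


def canonical_json(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so bytes are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verdict_document(artifact_fingerprint: str,
                     vuln_segments_version: str,
                     reachability_policy: str,
                     status: str) -> dict:
    # Only the three declared inputs feed the digest; no timestamps or host
    # details, which would make replays produce different bytes.
    inputs = {
        "artifact_fingerprint": artifact_fingerprint,
        "vuln_segments_version": vuln_segments_version,
        "reachability_policy": reachability_policy,
    }
    return {
        "inputs": inputs,
        "inputs_digest": "sha256:" + hashlib.sha256(canonical_json(inputs)).hexdigest(),
        "status": status,
    }
```

Storing `inputs_digest` next to the rule bundle and feed hashes gives an auditor a single value to compare when replaying a verdict.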
---
# Fast path to MVP (2 sprints)
* Add a **Build-ID/PE indexer** to Scanner.
* Teach Feedser/Vexer to ingest `vuln_segments` (with `byte_sig`/`patch_sig`).
* Implement matching + verdict attestation; surface **“Backported & Safe”** vs **“Affected & Reachable”** badges in UI.
* Seed DB with 10 high-impact CVEs (OpenSSL, zlib, xz, glibc, libxml2, curl, musl, busybox, OpenSSH, sudo).
---
# Example: SQL skeleton (Postgres)
```sql
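-- One row per scanned binary, keyed by its fingerprints.
create table artifacts(
  id bigserial primary key,
  platform text,                  -- linux, windows
  format text,                    -- ELF, PE
  build_id text,                  -- ELF .note.gnu.build-id
  pdb_guid text,
  pe_imphash text,
  sha256 bytea not null unique,   -- whole-file hash
  compiler_fingerprint text,
  source_hint text
);

-- Symbol name to address range (RVA for PE).
create table symbols(
  artifact_id bigint references artifacts(id),
  symbol_name text,
  addr_start bigint,
  addr_end bigint,
  section text,
  file_offset bigint
);

-- Vulnerable and patched byte patterns per CVE and function.
create table vuln_segments(
  id bigserial primary key,
  cve_id text,
  function_signature text,
  byte_sig bytea,                 -- pattern around the vulnerable hunk
  patch_sig bytea,                -- pattern from the fixed hunk
  evidence_ref text,
  backport_flag boolean,
  introduced_in text,
  fixed_in text
);

-- Where a vuln_segment was found in an artifact, and why.
create table matches(
  artifact_id bigint references artifacts(id),
  vuln_segment_id bigint references vuln_segments(id),
  match_type text,                -- byte, range, symbol
  confidence real,                -- 0 to 1
  explain text
);

-- Per-symbol reachability evidence.
create table reachability_hints(
  artifact_id bigint references artifacts(id),
  symbol_name text,
  hint_type text,                 -- imported, exported, hot, ebpf_seen, graph_core
  weight int                      -- 0 to 100
);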
```
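
As a rough illustration of the "Identify" step against this schema, here is a lookup that tries the Build-ID first and falls back to the whole-file hash. It uses Python's built-in `sqlite3` with a cut-down `artifacts` table purely so the sketch runs standalone; that is a stand-in for Postgres, and `open_demo_db` and `find_artifact_id` are hypothetical helper names.

```python
import sqlite3
from typing import Optional


def open_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for the Postgres schema (only the columns used here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table artifacts("
        " id integer primary key,"
        " build_id text,"
        " sha256 blob not null unique)"
    )
    return conn


def find_artifact_id(conn: sqlite3.Connection,
                     build_id: Optional[str],
                     sha256: bytes) -> Optional[int]:
    # Prefer the Build-ID when the binary carries one.
    if build_id:
        row = conn.execute(
            "select id from artifacts where build_id = ? order by id limit 1",
            (build_id,),
        ).fetchone()
        if row:
            return row[0]
    # Fall back to the whole-file hash.
    row = conn.execute(
        "select id from artifacts where sha256 = ?", (sha256,)
    ).fetchone()
    return row[0] if row else None
```

With Postgres the same two queries apply; only the driver and its placeholder syntax would change.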
---
If you want, I can:
* drop in a tiny **.NET 10** matcher (ELF/PE parsers + byte-window scanner),
* wire verdicts as **OCI attestations** in your current pipeline,
* and prep the first **10 CVE byte/patch signatures** to seed the DB.