
Codex Assistant 8f0320edd5 product advisories add change contiang folder

2026-01-08 09:06:03 +02:00


Here’s a compact, practical blueprint for a binary‑fingerprint store + trust‑scoring engine that lets you quickly tell whether a system binary is patched, backported, or risky—even fully offline.

Why this matters (plain English)

Package versions lie (backports!). Instead of trusting names like libssl 1.1.1k, we trust what’s inside: build IDs, section hashes, compiler metadata, and signed provenance. With that, we can answer: Is this exact binary known‑good, known‑bad, or unknown—on this distro, on this date, with these patches?

Core concept

Binary Fingerprint = tuple of:
- Build‑ID (ELF) or debug GUID/age (PE), if present.
- Section‑level hashes (e.g., .text, .rodata, selected function ranges).
- Compiler/Linker metadata (vendor/version, LTO flags, PIE/RELRO, sanitizer bits).
- Symbol graph sketch (optional, min‑hash of exported symbol names + sizes).
- Feature toggles (FIPS mode, CET/CFI present, Fortify level, RELRO type, SSP).
Provenance Chain (who built it): Upstream → Distro vendor (with patchset) → Local rebuild.
Trust Score: combines provenance weight + cryptographic attestations + “golden set” matches + observed patch deltas.

Minimal architecture (fits Stella Ops style)

Ingesters
- ingester.distro: walks repo mirrors or local systems, extracts ELF/PE, computes fingerprints, captures package→file mapping, vendor patch metadata (changelog, source SRPM diffs).
- ingester.upstream: indexes upstream releases, commit tags, and official build artifacts.
- ingester.local: indexes CI outputs (your own builds), in‑toto/DSSE attestations if available.
Fingerprint Store (offline‑ready)
- Primary DB: PostgreSQL (authoritative).
- Accelerator: Valkey (ephemeral) for fast lookup by Build‑ID and section hash prefixes.
- Bundle Export: signed, chunked SQLite/Parquet packs for air‑gapped sites.
Trust Engine
- Scores (0–100) per binary instance using:
  - Provenance weight (Upstream signed > Distro signed > Local unsigned).
  - Attestation presence/quality (in‑toto/DSSE, reproducible build stamp).
  - Patch alignment vs Golden Set (reference fingerprints for “fixed” and “vulnerable” builds).
  - Hardening baseline (RELRO/PIE/SSP/CET/CFI).
  - Divergence penalty (unexpected section deltas vs vendor‑declared patch).
- Emits Verdict: Patched, Likely Patched (Backport), Unpatched, Unknown, with rationale.
Query APIs
- /lookup/by-buildid/{id}
- /lookup/by-hash/{algo}/{prefix}
- /classify (batch): accepts an SBOM file list or live filesystem scan.
- /explain/{fingerprint}: returns diff vs Golden Set and the proof trail.
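
The /lookup/by-hash/{algo}/{prefix} route implies a prefix search over hex digests. The sketch below shows one way that lookup could behave, assuming a sorted in-memory list stands in for the Valkey/PostgreSQL index. PrefixIndex and its methods are hypothetical names, not part of any existing Stella Ops API.

```python
import bisect


class PrefixIndex:
    """Sorted in-memory index of hex digests, searchable by prefix (illustrative only)."""

    def __init__(self):
        self._keys = []    # hex digests, kept sorted for binary search
        self._values = {}  # digest -> fingerprint id

    def add(self, digest_hex, fp_id):
        digest_hex = digest_hex.lower()
        if digest_hex not in self._values:
            bisect.insort(self._keys, digest_hex)
        self._values[digest_hex] = fp_id

    def lookup(self, prefix):
        """Return (digest, fp_id) pairs whose digest starts with the prefix."""
        prefix = prefix.lower()
        start = bisect.bisect_left(self._keys, prefix)
        hits = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # sorted order: no later key can match
            hits.append((key, self._values[key]))
        return hits
```

A short prefix can return several candidates, so a caller would normally confirm the hit against the full digest before trusting it.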

Data model (tables you can lift into Postgres)

artifact (artifact_id PK, file_sha256, size, mime, elf_machine, pe_machine, ts, signers[])
fingerprint (fp_id PK, artifact_id, build_id, text_hash, rodata_hash, sym_sketch, compiler_vendor, compiler_ver, lto, pie, relro, ssp, cfi, cet, flags jsonb)
provenance (prov_id PK, fp_id, origin ENUM('upstream','distro','local'), vendor, distro, release, package, version, source_commit, patchset jsonb, attestation_hash, attestation_quality_score)
golden_set (golden_id PK, package, cve, status ENUM('fixed','vulnerable'), fp_ref, method ENUM('vendor-advisory','diff-sig','function-patch'), notes)
trust_score (fp_id, score int, verdict, reasons jsonb, computed_at)

Indexes: (build_id), (text_hash), (rodata_hash), (package, version), GIN on patchset, reasons.
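
To make the join between fingerprint and golden_set concrete, here is a reduced sketch of the model. It uses SQLite (via Python's sqlite3 module) rather than PostgreSQL so that it runs standalone, which is an assumption made for illustration: enums become CHECK constraints, only a subset of the columns is kept, and the GIN indexes are left out. open_store and golden_status_by_build_id are hypothetical helper names.

```python
import sqlite3

# Reduced schema: a subset of the columns listed above, in SQLite syntax.
SCHEMA = """
CREATE TABLE artifact (
    artifact_id INTEGER PRIMARY KEY,
    file_sha256 TEXT NOT NULL,
    size        INTEGER NOT NULL
);
CREATE TABLE fingerprint (
    fp_id       INTEGER PRIMARY KEY,
    artifact_id INTEGER NOT NULL REFERENCES artifact(artifact_id),
    build_id    TEXT,
    text_hash   TEXT,
    rodata_hash TEXT
);
CREATE TABLE golden_set (
    golden_id INTEGER PRIMARY KEY,
    package   TEXT NOT NULL,
    cve       TEXT NOT NULL,
    status    TEXT NOT NULL CHECK (status IN ('fixed', 'vulnerable')),
    fp_ref    INTEGER NOT NULL REFERENCES fingerprint(fp_id),
    method    TEXT NOT NULL
              CHECK (method IN ('vendor-advisory', 'diff-sig', 'function-patch'))
);
CREATE INDEX idx_fp_build_id  ON fingerprint(build_id);
CREATE INDEX idx_fp_text_hash ON fingerprint(text_hash);
"""


def open_store(path=":memory:"):
    """Create the reduced schema in a new SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def golden_status_by_build_id(conn, build_id):
    """Return (cve, status, method) rows for Golden Set entries matching a Build-ID."""
    return conn.execute(
        "SELECT g.cve, g.status, g.method "
        "FROM fingerprint f JOIN golden_set g ON g.fp_ref = f.fp_id "
        "WHERE f.build_id = ?",
        (build_id,),
    ).fetchall()
```

In PostgreSQL the enum columns would more likely be declared with CREATE TYPE ... AS ENUM, and patchset/reasons would be jsonb columns with GIN indexes as listed above.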

How detection works (fast path)

Exact match: Build‑ID hit → join golden_set → return verdict + reason.
Near match (backport mode): no Build‑ID match → compare .text/.rodata and function‑range hashes against both the “fixed” and “vulnerable” Golden Set fingerprints:
- If patched function ranges match, mark Likely Patched (Backport).
- If vulnerable function ranges match, mark Unpatched.
Heuristic fallback: symbol sketch + compiler metadata + hardening flags narrow the candidate set; compute targeted function hashes only (don’t hash the whole file).
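
The first two tiers can be sketched as a small decision function. This is a minimal illustration under stated assumptions: GoldenEntry and classify are hypothetical names, function-range hashes are treated as opaque strings, a near match means every hash recorded for a Golden Set entry appears in the observed binary, and the heuristic fallback tier is not implemented.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class GoldenEntry:
    cve: str
    status: str                               # "fixed" or "vulnerable"
    build_id: Optional[str] = None
    func_hashes: FrozenSet[str] = frozenset()  # hashes of the patched/vulnerable ranges


def classify(build_id, func_hashes, golden):
    """Return (verdict, reason) for one binary against a list of GoldenEntry."""
    observed = frozenset(func_hashes)
    golden = list(golden)

    # Tier 1: exact Build-ID hit.
    if build_id:
        for g in golden:
            if g.build_id == build_id:
                verdict = "Patched" if g.status == "fixed" else "Unpatched"
                return verdict, f"build-id match on {g.cve} ({g.status})"

    # Tier 2: near match on function-range hashes, fixed entries first,
    # following the order in the text above.
    tiers = (("fixed", "Likely Patched (Backport)"), ("vulnerable", "Unpatched"))
    for status, verdict in tiers:
        for g in golden:
            if g.status == status and g.func_hashes and g.func_hashes <= observed:
                return verdict, f"function-range match on {g.cve} ({status})"

    # Tier 3 (heuristic fallback) is out of scope for this sketch.
    return "Unknown", "no Golden Set match"
```

Returning the reason string alongside the verdict keeps the result explainable, which is what the /explain route needs downstream.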

Building the “Golden Set”

Sources:
- Vendor advisories (per‑CVE “fixed in” builds).
- Upstream tags containing the fix commit.
- Distro SRPM diffs for backports (extract exact hunk regions; compute function‑range hashes pre/post).
Store both:
- “Fixed” fingerprints (post‑patch).
- “Vulnerable” fingerprints (pre‑patch).
Annotate evidence method:
- vendor-advisory (strong), diff-sig (strong if clean hunk), function-patch (targeted).
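
One way to derive the pre/post pair from a backport diff is to hash each function in both builds and keep only the functions whose hash changed, since unchanged functions cannot tell the two builds apart. The sketch below assumes the function bytes have already been extracted and normalized; changed_function_hashes is a hypothetical helper, and functions that exist in only one build are ignored here.

```python
import hashlib


def changed_function_hashes(pre, post):
    """Compare two builds given as {function_name: bytes}.

    Returns (vulnerable, fixed): two dicts mapping function name to SHA-256 hex,
    restricted to functions present in both builds whose bytes differ.
    """
    vulnerable, fixed = {}, {}
    for name in sorted(pre.keys() & post.keys()):
        h_pre = hashlib.sha256(pre[name]).hexdigest()
        h_post = hashlib.sha256(post[name]).hexdigest()
        if h_pre != h_post:
            vulnerable[name] = h_pre   # pre-patch fingerprint
            fixed[name] = h_post       # post-patch fingerprint
    return vulnerable, fixed
```

The two returned dicts map directly onto a “vulnerable” and a “fixed” golden_set row for the same CVE.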

Trust scoring (example)

Base by provenance:
- Upstream + signed + reproducible: +40
- Distro signed with changelog & SRPM diff: +30
- Local unsigned: +10
Attestations:
- Valid DSSE + in‑toto chain: +20
- Reproducible build proof: +10
Golden Set alignment:
- Matches “fixed”: +20
- Matches “vulnerable”: −40
- Partial (patched functions match, rest differs): +10
Hardening:
- PIE/RELRO/SSP/CET/CFI each +2 (cap +10)
Divergence penalties:
- Unexplained text‑section drift −10
- Suspicious toolchain fingerprint −5

Verdict bands: ≥80 Patched, 65–79 Likely Patched (Backport), 35–64 Unknown, <35 Unpatched.
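
The example weights and bands translate directly into a scoring function. In this sketch the Evidence fields, the provenance keys, and the clamping of the result to 0–100 are assumptions added for illustration; the numbers themselves are the ones listed above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

PROVENANCE_BASE = {
    "upstream_signed_reproducible": 40,
    "distro_signed": 30,
    "local_unsigned": 10,
}
GOLDEN_DELTA = {"fixed": 20, "vulnerable": -40, "partial": 10, None: 0}
HARDENING_FLAGS = frozenset({"PIE", "RELRO", "SSP", "CET", "CFI"})


@dataclass(frozen=True)
class Evidence:
    provenance: str                       # key of PROVENANCE_BASE
    dsse_chain_valid: bool = False
    reproducible_proof: bool = False
    golden_match: Optional[str] = None    # "fixed", "vulnerable", "partial" or None
    hardening: FrozenSet[str] = frozenset()
    text_drift: bool = False
    suspicious_toolchain: bool = False


def trust_score(ev):
    score = PROVENANCE_BASE[ev.provenance]
    score += 20 if ev.dsse_chain_valid else 0
    score += 10 if ev.reproducible_proof else 0
    score += GOLDEN_DELTA[ev.golden_match]
    score += min(10, 2 * len(ev.hardening & HARDENING_FLAGS))  # +2 each, cap +10
    score -= 10 if ev.text_drift else 0
    score -= 5 if ev.suspicious_toolchain else 0
    return max(0, min(100, score))  # clamping is an assumption of this sketch


def verdict(score):
    if score >= 80:
        return "Patched"
    if score >= 65:
        return "Likely Patched (Backport)"
    if score >= 35:
        return "Unknown"
    return "Unpatched"
```

Working the numbers through is a useful check on the weights: a distro‑signed build that matches a “fixed” fingerprint with all five hardening flags scores 30 + 20 + 10 = 60, which lands in Unknown unless an attestation is also present, so the example weights may need tuning before use.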

CLI outline (Stella Ops‑style)

# Index a filesystem or package repo
stella-fp index /usr/bin /lib --out fp.db --bundle out.bundle.parquet

# Score a host (offline)
stella-fp classify --fp-store fp.db --golden golden.db --out verdicts.json

# Explain a result
stella-fp explain --fp <fp_id> --golden golden.db

# Maintain Golden Set
stella-fp golden add --package openssl --cve CVE-2023-XXXX --status fixed --from-srpm path.src.rpm
stella-fp golden add --package openssl --cve CVE-2023-XXXX --status vulnerable --from-upstream v1.1.1k

Implementation notes (ELF/PE)

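ELF: read Build‑ID from .note.gnu.build-id; hash .text and selected function ranges (use DWARF/eh_frame or symbol table when present; otherwise lightweight linear‑sweep with sanity checks). Record RELRO from the PT_GNU_RELRO program header (full RELRO also needs BIND_NOW in the dynamic section) and PIE from the ET_DYN file type plus the DF_1_PIE dynamic flag.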
PE: use Debug Directory (GUID/age) and Section Table; capture CFG/ASLR/NX/GS flags.
Function‑range hashing: normalize NOPs/padding, zero relocation slots, mask address‑relative operands (keeps hashes stable across vendor rebuilds).
Performance: cache per‑section hash; only compute function hashes when near‑match needs confirmation.
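
Two of these steps are small enough to sketch without a disassembler: pulling the Build‑ID out of the raw bytes of an ELF note section, and hashing a function range with its relocation slots zeroed. The helper names parse_build_id and hash_function_range are hypothetical, the caller is assumed to have already located the note bytes and the relocation offsets, and NOP normalization and operand masking are left out because they need real instruction decoding.

```python
import hashlib
import struct

NT_GNU_BUILD_ID = 3  # note type used by .note.gnu.build-id


def _align4(n):
    return (n + 3) & ~3


def parse_build_id(note, little_endian=True):
    """Scan raw ELF note bytes and return the GNU Build-ID as hex, or None."""
    fmt = "<III" if little_endian else ">III"
    off = 0
    while off + 12 <= len(note):
        namesz, descsz, ntype = struct.unpack_from(fmt, note, off)
        off += 12
        name = note[off:off + namesz]
        off += _align4(namesz)          # name is padded to 4 bytes
        desc = note[off:off + descsz]
        off += _align4(descsz)          # descriptor is padded to 4 bytes
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00" and len(desc) == descsz:
            return desc.hex()
    return None


def hash_function_range(code, reloc_slots=()):
    """SHA-256 of a function's bytes with each (offset, size) relocation slot zeroed."""
    buf = bytearray(code)
    for offset, size in reloc_slots:
        if offset < 0 or size < 0 or offset + size > len(buf):
            raise ValueError("relocation slot outside function range")
        buf[offset:offset + size] = bytes(size)
    return hashlib.sha256(bytes(buf)).hexdigest()
```

Zeroing the slots means two builds that differ only in linked addresses hash identically, which is the stability property the function‑range hashing note is after.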

How this plugs into your world

Sbomer/Vexer: attach trust scores & verdicts to components in CycloneDX/SPDX; emit VEX statements like “Fixed by backport: evidence=diff‑sig, source=Astra/Red Hat SRPM.”
Feedser: when CVE feed says “vulnerable by version,” override with binary proof from Golden Set.
Policy Engine: gate deployments on verdict ∈ {Patched, Likely Patched (Backport)}, which under the example verdict bands is the same as score ≥ 65.
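
A deployment gate over classify output could look like the sketch below. The record shape (dicts with "path", "verdict" and "score" keys) is an assumption about what verdicts.json might contain, and gate is a hypothetical function name.

```python
ALLOWED_VERDICTS = {"Patched", "Likely Patched (Backport)"}


def gate(results, min_score=65):
    """Return (ok, blocked): ok is True only if every record passes the policy."""
    blocked = [
        r for r in results
        if not (r["verdict"] in ALLOWED_VERDICTS or r["score"] >= min_score)
    ]
    return (not blocked, blocked)
```

Returning the blocked records, not just a boolean, lets the pipeline report which binaries failed and why.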

Next steps you can action today

Create the schemas above in Postgres; scaffold a small stella-fp Go/.NET tool to compute fingerprints for /bin, /lib* on one or two reference hosts (e.g., one Debian and one Alpine).
Hand‑curate a pilot Golden Set for 3 noisy CVEs (OpenSSL, glibc, curl). Store both pre/post patch fingerprints and 2–3 backported vendor builds each.
Wire a classify step into your CI/CD and surface the verdict + rationale in your VEX output.

If you want, I can drop in starter code (C#/.NET 10) for the fingerprint extractor and the Postgres schema migration, plus a tiny “function‑range hasher” that masks relocations and normalizes padding.
