feat: Add new provenance and crypto registry documentation

- Introduced attestation inventory and subject-rekor mapping files for tracking Docker packages.
- Added a comprehensive crypto registry decision document outlining defaults and required follow-ups.
- Created an offline feeds manifest for bundling air-gap resources.
- Implemented a script to generate and update binary manifests for curated binaries.
- Added a verification script to ensure binary artefacts are located in approved directories.
- Defined new schemas for AdvisoryEvidenceBundle, OrchestratorEnvelope, ScannerReportReadyPayload, and ScannerScanCompletedPayload.
- Established project files for StellaOps.Orchestrator.Schemas and StellaOps.PolicyAuthoritySignals.Contracts.
- Updated vendor manifest to track pinned binaries for integrity.
2025-11-18 23:47:13 +02:00
parent d3ecd7f8e6
commit e91da22836
44 changed files with 6793 additions and 99 deletions
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
@@ -0,0 +1,927 @@
+
+Here’s a crisp idea that could give Stella Ops a real moat: **binary‑level reachability**—linking CVEs directly to the exact functions and offsets inside compiled artifacts (ELF/PE/Mach‑O), not just to packages.
+
+---
+
+### Why this matters (quick background)
+
+* **Package‑level flags are noisy.** Most scanners say “vuln in `libX v1.2`,” but that library might be present and never executed.
+* **Language‑level call graphs help** (when you have source or rich metadata), but containers often ship only **stripped binaries**.
+* **Binary reachability** answers: *Is the vulnerable function actually in this image? Is its code path reachable from the entrypoints we observed or can construct?*
+
+---
+
+### The missing layer: Symbolization
+
+Build a **symbolization layer** that normalizes debug and symbol info across platforms:
+
+* **Inputs**: DWARF (ELF/Mach‑O), PDB (PE/Windows), symtabs, exported symbols, `.eh_frame`, and (when stripped) heuristic signatures (e.g., function byte‑hashes, CFG fingerprints).
+* **Outputs**: a source‑agnostic map: `{binary → sections → functions → (addresses, ranges, hashes, demangled names, inlined frames)}`.
+* **Normalization**: Put everything into a common schema (e.g., `Stella.Symbolix.v1`) so higher layers don’t care if it came from DWARF or PDB.
+
+---
+
+### End‑to‑end reachability (binary‑first, source‑agnostic)
+
+1. **Acquire & parse**
+
+   * Detect format (ELF/PE/Mach‑O), parse headers, sections, symbol tables.
+   * If debug info present: parse DWARF/PDB; else fall back to disassembly + function boundary recovery.
+2. **Function catalog**
+
+   * Assign stable IDs per function: `(imageHash, textSectionHash, startVA, size, fnHashXX)`.
+   * Record x‑refs (calls/jumps), imports/exports, PLT/IAT edges.
+3. **Entrypoint discovery**
+
+   * Docker entry, process launch args, service scripts; infer likely mains (Go `main.main`, .NET hostfxr path, JVM launcher, etc.).
+4. **Call‑graph build (binary CFG)**
+
+   * Build inter/intra‑procedural graph (direct + resolved indirect via IAT/PLT). Keep “unknown‑target” edges for conservative safety.
+5. **CVE→function linking**
+
+   * Maintain a **signature bank** per CVE advisory: vulnerable function names, file paths, and—crucially—**byte‑sequence or basic‑block fingerprints** for patched vs vulnerable versions (works even when stripped).
+6. **Reachability analysis**
+
+   * Is the vulnerable function present? Is there a path from any entrypoint to it (under conservative assumptions)? Tag as `Present+Reachable`, `Present+Uncertain`, or `Absent`.
+7. **Runtime confirmation (optional, when users allow)**
+
+   * Lightweight probes (eBPF on Linux, ETW on Windows, perf/JFR/EventPipe) capture function hits; cross‑check with the static result to upgrade confidence.
+
+---
+
+### Minimal component plan (drop into Stella Ops)
+
+* **Scanner.Symbolizer**
+  Parsers: ELF/DWARF (libdw or pure‑managed reader), PE/PDB (DIA/LLVM PDB), Mach‑O/dSYM.
+  Output: `Symbolix.v1` blobs stored in OCI layer cache.
+* **Scanner.CFG**
+  Lifts functions to a normalized IR (capstone/iced‑x86 for decode) → builds CFG & call graph.
+* **Advisory.FingerprintBank**
+  Ingests CSAF/OpenVEX plus curated fingerprints (fn names, block hashes, patch diff markers). Versioned, signed, air‑gap‑syncable.
+* **Reachability.Engine**
+  Joins (`Symbolix` + `CFG` + `FingerprintBank`) → emits `ReachabilityEvidence` with lattice states for VEX.
+* **VEXer.Adapter**
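+  Emits **OpenVEX** statements with `status: affected/not_affected`, mapping the internal reasons `function_not_present`, `function_not_reachable`, and `mitigated_at_runtime` to the OpenVEX justifications `vulnerable_code_not_present`, `vulnerable_code_not_in_execute_path`, and `inline_mitigations_already_exist`, and attaching Evidence URIs.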
+* **Console UX**
+  “Why not affected?” panel showing entrypoint→…→function path (or absence), with byte‑hash proof.
+
+---
+
+### Data model sketch (concise)
+
+* `ImageFunction { id, name?, startVA, size, fnHash, sectionHash, demangled?, provenance:{DWARF|PDB|Heuristic} }`
+* `Edge { srcFnId, dstFnId, kind:{direct|plt|iat|indirect?} }`
+* `CveSignature { cveId, fnName?, libHints[], blockFingerprints[], versionRanges }`
+* `Evidence { cveId, imageId, functionMatches[], reachable: bool?, confidence:{low|med|high}, method:{static|runtime|hybrid} }`
+
+---
+
+### Practical phases (8–10 weeks of focused work)
+
+1. **P0**: ELF/DWARF symbolizer + basic function catalog; link a handful of CVEs by function name only; emit OpenVEX `not_affected` with justification `vulnerable_code_not_present` (internal reason `function_not_present`).
+2. **P1**: CFG builder (direct calls) + PLT/IAT resolution; simple reachability; first fingerprints for top 50 CVEs in glibc, openssl, curl, zlib.
+3. **P2**: Stripped‑binary heuristics (block hashing) + Go/Rust name demangling; Windows PDB ingestion for PE.
+4. **P3**: Runtime probes (opt‑in) + confidence upgrade logic; Console path explorer; evidence signing (DSSE).
+
+---
+
+### KPIs to prove the moat
+
+* **Noise cut**: % reduction in “affected” flags after reachability (target 40–70% on typical containers).
+* **Precision**: Ground‑truth validation vs PoC images (TP/FP/FN on presence & reachability).
+* **Coverage**: % images where we can make a determination without source (goal: >80%).
+* **Latency**: Added scan time per image (<15s typical with caches).
+
+---
+
+### Risks & how to handle them
+
+* **Stripped binaries** → mitigate with block‑hash fingerprints & library‑version heuristics.
+* **Obfuscated/packed code** → mark `Uncertain`; allow user‑supplied hints; prefer runtime confirmation.
+* **Advisory inconsistency** → keep our own curated CVE→function fingerprint bank; sign & version it.
+* **Platform spread** → start Linux/ELF, then Windows/PDB, then Mach‑O.
+
+---
+
+### Why competitors struggle
+
+Most tools stop at packages because binary CFG + fingerprint curation is hard and expensive. Shipping a **source‑agnostic reachability engine** tied to signed evidence in VEX would set Stella Ops apart—especially in offline/air‑gapped and sovereign contexts you already target.
+
+If you want, I can draft:
+
+* the `Symbolix.v1` protobuf,
+* a tiny PoC (ELF→functions→match CVE with a block fingerprint),
+* and the OpenVEX emission snippet your VEXer can produce.
+
+---
+
+Below is a detailed architecture plan for implementing reachability and call-graph analysis in Stella Ops, covering JavaScript, Python, PHP, and binaries, and integrating with your existing Scanner / Concelier / VEXer stack.
+
+I will assume:
+
+* .NET 10 for core services.
+* Scanner is the place where all “trust algebra / lattice” runs (per your standing rule).
+* Concelier and VEXer remain “preserve/prune” layers and do not run lattice logic.
+* Output must be JSON-centric with PURLs and OpenVEX.
+
+---
+
+## 1. Scope & Objectives
+
+### 1.1 Primary goals
+
+1. From an OCI image, build:
+
+   * A **library-level usage graph** (which libraries are used by which entrypoints).
+   * A **function-level call graph** for JS / Python / PHP / binaries.
+2. Map CVEs (from Concelier) to:
+
+   * Concrete **components** (PURLs) in the SBOM.
+   * Concrete **functions / entrypoints / code regions** inside those components.
+3. Perform **reachability analysis** to classify each vulnerability as:
+
+   * `present + reachable`
+   * `present + not_reachable`
+   * `function_not_present` (no vulnerable symbol)
+   * `unknown` (dynamic features or unresolved calls prevent a determination)
+4. Emit:
+
+   * **Structured JSON** with PURLs and call-graph nodes/edges (“reachability evidence”).
+   * **OpenVEX** documents with appropriate `status`/`justification`.
+
+### 1.2 Non-goals (for now)
+
+* Full dynamic analysis of the running container (eBPF, ptrace, etc.) – leave as Phase 3+ optional add-on.
+* Perfect call graph precision for dynamic languages (aim for safe, conservative approximations).
+* Automatic “fix recommendations” (handled by other Stella Ops agents later).
+
+---
+
+## 2. High-Level Architecture
+
+### 2.1 Major components
+
+Within Stella Ops:
+
+* **Scanner.WebService**
+
+  * User-facing API.
+  * Orchestrates full scan (SBOM, CVEs, reachability).
+  * Hosts the **Lattice/Policy engine** that merges evidence and produces decisions.
+* **Scanner.Worker**
+
+  * Runs per-image analysis jobs.
+  * Invokes analyzers (JS, Python, PHP, Binary) inside its own container context.
+* **Scanner.Reachability Core Library**
+
+  * Unified IR for call graphs and reachability evidence.
+  * Interfaces for language and binary analyzers.
+  * Graph algorithms (BFS/DFS, lattice evaluation, entrypoint expansion).
+* **Language Analyzers**
+
+  * `Scanner.Analyzers.JavaScript`
+  * `Scanner.Analyzers.Python`
+  * `Scanner.Analyzers.Php`
+  * `Scanner.Analyzers.Binary`
+* **Symbolization & CFG (for binaries)**
+
+  * `Scanner.Symbolization` (ELF, PE, Mach-O parsers, DWARF/PDB)
+  * `Scanner.Cfg` (CFG + call graph for binaries)
+* **Vulnerability Signature Bank**
+
+  * `Concelier.Signatures` (curated CVE→function/library fingerprints).
+  * Exposed to Scanner as **offline bundle**.
+* **VEXer**
+
+  * `Vexer.Adapter.Reachability` – transforms reachability evidence into OpenVEX.
+
+### 2.2 Data flow (logical)
+
+```mermaid
+flowchart LR
+  A[OCI Image / Tar] --> B[Scanner.Worker: Extract FS]
+  B --> C["SBOM Engine (CycloneDX/SPDX)"]
+  C --> D["Vuln Match (Concelier feeds)"]
+  B --> E1[JS Analyzer]
+  B --> E2[Python Analyzer]
+  B --> E3[PHP Analyzer]
+  B --> E4[Binary Analyzer + Symbolizer/CFG]
+
+  D --> F[Reachability Orchestrator]
+  E1 --> F
+  E2 --> F
+  E3 --> F
+  E4 --> F
+  F --> G["Lattice/Policy Engine (Scanner.WebService)"]
+  G --> H[Reachability Evidence JSON]
+  G --> I[VEXer: OpenVEX]
+  G --> J["Graph/Cartographer (optional)"]
+```
+
+---
+
+## 3. Data Model & JSON Contracts
+
+### 3.1 Core IR types (Scanner.Reachability)
+
+Define in a central assembly, e.g. `StellaOps.Scanner.Reachability`:
+
+```csharp
+public record ComponentRef(
+    string Purl,
+    string? BomRef,
+    string? Name,
+    string? Version);
+
+public enum SymbolKind { Function, Method, Constructor, Lambda, Import, Export }
+
+public record SymbolId(
+    string Language,       // "js", "python", "php", "binary"
+    string ComponentPurl,  // SBOM component PURL or "" for app code
+    string LogicalName,    // e.g., "server.js:handleLogin"
+    string? FilePath,
+    int? Line);
+
+public record CallGraphNode(
+    string Id,                 // stable id, e.g., hash(SymbolId)
+    SymbolId Symbol,
+    SymbolKind Kind,
+    bool IsEntrypoint);
+
+public enum CallEdgeKind { Direct, Indirect, Dynamic, External, Ffi }
+
+public record CallGraphEdge(
+    string FromNodeId,
+    string ToNodeId,
+    CallEdgeKind Kind);
+
+public record CallGraph(
+    string GraphId,
+    IReadOnlyList<CallGraphNode> Nodes,
+    IReadOnlyList<CallGraphEdge> Edges);
+```
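+
+The `CallGraphNode.Id` comment above suggests `hash(SymbolId)`. One possible derivation is sketched below. The `NodeIds` helper, the `node:` prefix, the 16-byte truncation, and the decision to leave `Line` out of the hash are assumptions made for illustration; none of them is prescribed by the plan.
+
+```csharp
+#nullable enable
+using System;
+using System.Security.Cryptography;
+using System.Text;
+
+// Same shape as SymbolId in section 3.1, repeated so the sketch compiles on its own.
+public record SymbolId(
+    string Language,
+    string ComponentPurl,
+    string LogicalName,
+    string? FilePath,
+    int? Line);
+
+// Hypothetical helper, not an existing Stella Ops type.
+public static class NodeIds
+{
+    public static string For(SymbolId symbol)
+    {
+        // Assumption: Line is excluded so the id survives edits that only shift lines.
+        string key = string.Join(
+            "\n",
+            symbol.Language,
+            symbol.ComponentPurl,
+            symbol.LogicalName,
+            symbol.FilePath ?? string.Empty);
+
+        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(key));
+
+        // Keep the first 16 bytes (32 hex characters) to make ids shorter.
+        return "node:" + Convert.ToHexString(digest, 0, 16).ToLowerInvariant();
+    }
+}
+```
+
+Under this scheme two symbols that differ only in `Line` share an ID, which helps cache reuse but would merge same-named functions in one file; whether that trade-off is acceptable is a design decision left open here.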
+
+### 3.2 Vulnerability mapping
+
+```csharp
+public record VulnerabilitySignature(
+    string Source,             // "csaf", "nvd", "vendor"
+    string Id,                 // "CVE-2023-12345"
+    IReadOnlyList<string> Purls,
+    IReadOnlyList<string> TargetSymbolPatterns, // glob-like or regex
+    IReadOnlyList<string>? FilePathPatterns,
+    IReadOnlyList<string>? BlockFingerprints    // for binaries, optional
+);
+```
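+
+`TargetSymbolPatterns` is described as "glob-like or regex" without fixing a dialect. The sketch below assumes the simplest glob reading (`*` matches any run of characters, `?` matches exactly one) and translates it to an anchored regular expression; `SymbolPatternMatcher` is a hypothetical name.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+using System.Text.RegularExpressions;
+
+// Hypothetical helper; the glob dialect is an assumption, not part of the plan.
+public static class SymbolPatternMatcher
+{
+    // Escape everything, then re-enable the two glob wildcards.
+    public static Regex Compile(string glob)
+    {
+        string pattern = "^"
+            + Regex.Escape(glob).Replace(@"\*", ".*").Replace(@"\?", ".")
+            + "$";
+        return new Regex(pattern, RegexOptions.CultureInvariant);
+    }
+
+    // True when the symbol's logical name matches at least one pattern.
+    public static bool MatchesAny(string logicalName, IEnumerable<string> targetSymbolPatterns) =>
+        targetSymbolPatterns.Any(p => Compile(p).IsMatch(logicalName));
+}
+```
+
+For example, the pattern `express/lib/router/*:handle_request` would match the logical name `express/lib/router/layer.js:handle_request`. A production matcher would probably cache compiled patterns per signature bundle rather than recompiling on every call.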
+
+### 3.3 Reachability evidence
+
+```csharp
+public enum ReachabilityStatus
+{
+    PresentReachable,
+    PresentNotReachable,
+    FunctionNotPresent,
+    Unknown
+}
+
+public record ReachabilityEvidence
+(
+    string ImageRef,
+    string VulnId,               // CVE or advisory id
+    ComponentRef Component,
+    ReachabilityStatus Status,
+    double Confidence,           // 0..1
+    string Method,               // "static-callgraph", "binary-fingerprint", etc.
+    IReadOnlyList<string> EntrypointNodeIds,
+    IReadOnlyList<IReadOnlyList<string>>? ExamplePaths // optional list of node-paths
+);
+```
+
+### 3.4 JSON structure (external)
+
+Minimal external JSON (what you store / expose):
+
+```json
+{
+  "image": "registry.example.com/app:1.2.3",
+  "components": [
+    {
+      "purl": "pkg:npm/express@4.18.0",
+      "bomRef": "component-1"
+    }
+  ],
+  "callGraphs": [
+    {
+      "graphId": "js-main",
+      "language": "js",
+      "nodes": [ /* CallGraphNode */ ],
+      "edges": [ /* CallGraphEdge */ ]
+    }
+  ],
+  "reachability": [
+    {
+      "vulnId": "CVE-2023-12345",
+      "componentPurl": "pkg:npm/express@4.18.0",
+      "status": "PresentReachable",
+      "confidence": 0.92,
+      "entrypoints": [ "node:..." ],
+      "paths": [
+        ["node:entry", "node:routeHandler", "node:vulnFn"]
+      ]
+    }
+  ]
+}
+```
+
+---
+
+## 4. Scanner-Side Architecture
+
+### 4.1 Project layout (suggested)
+
+```text
+src/
+  Scanner/
+    StellaOps.Scanner.WebService/
+    StellaOps.Scanner.Worker/
+    StellaOps.Scanner.Core/        # shared scan domain
+    StellaOps.Scanner.Reachability/
+    StellaOps.Scanner.Symbolization/
+    StellaOps.Scanner.Cfg/
+    StellaOps.Scanner.Analyzers.JavaScript/
+    StellaOps.Scanner.Analyzers.Python/
+    StellaOps.Scanner.Analyzers.Php/
+    StellaOps.Scanner.Analyzers.Binary/
+```
+
+### 4.2 API surface (Scanner.WebService)
+
+* `POST /api/scan/image`
+
+  * Request: `{ "imageRef": "...", "profile": { "reachability": true, ... } }`
+  * Returns: scan id.
+* `GET /api/scan/{id}/reachability`
+
+  * Returns: `ReachabilityEvidence[]`, plus call graph summary (optional).
+* `GET /api/scan/{id}/vex`
+
+  * Returns: OpenVEX with statuses based on reachability lattice.
+
+### 4.3 Worker orchestration
+
+`StellaOps.Scanner.Worker`:
+
+1. Receives scan job with `imageRef`.
+
+2. Extracts filesystem (layered rootfs) under `/mnt/scans/{scanId}/rootfs`.
+
+3. Invokes SBOM generator (CycloneDX/SPDX).
+
+4. Invokes Concelier via offline feeds to get:
+
+   * Component vulnerabilities (CVE list per PURL).
+   * Vulnerability signatures (fingerprints).
+
+5. Builds a `ReachabilityPlan`:
+
+   ```csharp
+   public record ReachabilityPlan(
+       IReadOnlyList<ComponentRef> Components,
+       IReadOnlyList<VulnerabilitySignature> Vulns,
+       IReadOnlyList<AnalyzerTarget> AnalyzerTargets // files/dirs grouped by language
+   );
+   ```
+
+6. For each language target, dispatch analyzer:
+
+   * JavaScript: `IReachabilityAnalyzer` implementation for JS.
+   * Python: likewise.
+   * PHP: likewise.
+   * Binary: symbolizer + CFG.
+
+7. Collects call graphs from each analyzer and merges them into a single IR (or separate per-language graphs with shared IDs).
+
+8. Sends merged graphs + vuln list to **Reachability Engine** (Scanner.Reachability).
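+
+Step 5 references `AnalyzerTarget` without defining it. The following sketch shows one way the "files/dirs grouped by language" idea could look. The record shape, the `TargetPlanner` name, the extension table, and the magic-byte checks (ELF and PE only, Mach-O omitted) are all assumptions.
+
+```csharp
+#nullable enable
+using System;
+using System.Collections.Generic;
+using System.IO;
+using System.Linq;
+
+// Hypothetical shape; the plan does not define AnalyzerTarget.
+public record AnalyzerTarget(string Language, IReadOnlyList<string> Paths);
+
+public static class TargetPlanner
+{
+    // Returns "js", "python", "php", "binary", or null when no analyzer applies.
+    public static string? Classify(string path, ReadOnlySpan<byte> header)
+    {
+        switch (Path.GetExtension(path).ToLowerInvariant())
+        {
+            case ".js": case ".mjs": case ".cjs": return "js";
+            case ".py": return "python";
+            case ".php": return "php";
+        }
+
+        // ELF files start with 0x7F 'E' 'L' 'F'; PE files start with 'M' 'Z'.
+        if (header.Length >= 4 && header[0] == 0x7F && header[1] == (byte)'E'
+            && header[2] == (byte)'L' && header[3] == (byte)'F') return "binary";
+        if (header.Length >= 2 && header[0] == (byte)'M' && header[1] == (byte)'Z') return "binary";
+        return null;
+    }
+
+    // Groups files per language; ordinal sorting keeps the plan deterministic.
+    public static IReadOnlyList<AnalyzerTarget> Group(IEnumerable<(string Path, byte[] Header)> files) =>
+        files.Select(f => (Lang: Classify(f.Path, f.Header), f.Path))
+             .Where(x => x.Lang is not null)
+             .GroupBy(x => x.Lang!, x => x.Path)
+             .OrderBy(g => g.Key, StringComparer.Ordinal)
+             .Select(g => new AnalyzerTarget(g.Key, g.OrderBy(p => p, StringComparer.Ordinal).ToList()))
+             .ToList();
+}
+```
+
+The caller is assumed to supply the first few bytes of each file, so the classifier itself never touches the mounted rootfs and stays easy to unit-test.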
+
+---
+
+## 5. Language Analyzers (JS / Python / PHP)
+
+All analyzers implement a common interface:
+
+```csharp
+public interface IReachabilityAnalyzer
+{
+    string Language { get; } // "js", "python", "php"
+
+    Task<CallGraph> AnalyzeAsync(AnalyzerContext context, CancellationToken ct);
+}
+
+public record AnalyzerContext(
+    string RootFsPath,
+    IReadOnlyList<ComponentRef> Components,
+    IReadOnlyList<VulnerabilitySignature> Vulnerabilities,
+    IReadOnlyDictionary<string, string> Env,   // container env, entrypoint, etc.
+    string? EntrypointCommand                  // container CMD/ENTRYPOINT
+);
+```
+
+### 5.1 JavaScript (Node.js focus)
+
+**Inputs:**
+
+* `/app` tree inside container (or discovered via SBOM).
+* `package.json` files.
+* Container entrypoint (e.g., `["node", "server.js"]`).
+
+**Core steps:**
+
+1. Identify **app root**:
+
+   * Heuristics: directory containing `package.json` that owns the entry script.
+2. Parse:
+
+   * All `.js`, `.mjs`, `.cjs` in app and `node_modules` for vulnerable PURLs.
+   * Use a parsing frontend (e.g., Tree-sitter via .NET binding, or Node+AST-as-JSON).
+3. Build module graph:
+
+   * `require`, `import`, `export`.
+4. Function-level graph:
+
+   * For each function/method, create `CallGraphNode`.
+   * For each `callExpression`, create `CallGraphEdge` (try to resolve callee).
+5. Entrypoints:
+
+   * Main script in CMD/ENTRYPOINT.
+   * HTTP route handlers (for express/koa) detected by patterns (e.g., `app.get("/...")`).
+6. Map vulnerable symbols:
+
+   * From `VulnerabilitySignature.TargetSymbolPatterns` (e.g., `express/lib/router/layer.js:handle_request`).
+   * Identify nodes whose `SymbolId` matches patterns.
+
+**Output:**
+
+* `CallGraph` for JS with:
+
+  * `IsEntrypoint = true` for main and detected handlers.
+  * Node attributes include file path, line, component PURL.
+
+### 5.2 Python
+
+**Inputs:**
+
+* Site-packages paths from SBOM.
+* Entrypoint script (CMD/ENTRYPOINT).
+* Framework heuristics (Django, Flask) from environment variables or common entrypoints.
+
+**Core steps:**
+
+1. Discover the Python interpreter chain (not needed for purely static analysis, but useful for heuristics).
+2. Parse `.py` files of:
+
+   * App code.
+   * Vulnerable packages (per PURL).
+3. Build module import graph (`import`, `from x import y`).
+4. Function-level graph:
+
+   * Nodes for functions, methods, class constructors.
+   * Edges for call expressions; conservative for dynamic calls.
+5. Entrypoints:
+
+   * Main script.
+   * WSGI callable (e.g., `application` in `wsgi.py`).
+   * Django URLconf -> view functions.
+6. Map vulnerable symbols using `TargetSymbolPatterns` like `django.middleware.security.SecurityMiddleware.__call__`.
+
+### 5.3 PHP
+
+**Inputs:**
+
+* Web root (from container image or conventional paths `/var/www/html`, `/app/public`, etc.).
+* Composer metadata (`composer.json`, `vendor/`).
+* Web server config if present (optional).
+
+**Core steps:**
+
+1. Discover front controllers (e.g., `index.php`, `public/index.php`).
+2. Parse PHP files (again, via Tree-sitter or any suitable parser).
+3. Resolve include/require chains to build file-level inclusion graph.
+4. Build function/method graph:
+
+   * Functions, methods, class constructors.
+   * Calls with best-effort resolution for namespaced functions.
+5. Entrypoints:
+
+   * Front controllers and router entrypoints (e.g., Symfony, Laravel detection).
+6. Map vulnerable symbols (e.g., functions in certain vendor packages, particular methods).
+
+---
+
+## 6. Binary Analyzer & Symbolizer
+
+Project: `StellaOps.Scanner.Analyzers.Binary` + `Symbolization` + `Cfg`.
+
+### 6.1 Inputs
+
+* All binaries and shared libraries in:
+
+  * `/usr/lib`, `/lib`, `/app/bin`, etc.
+* SBOM link: each binary mapped to its component PURL when possible.
+* Vulnerability signatures for native libs: function names, symbol names, fingerprints.
+
+### 6.2 Symbolization
+
+Module: `StellaOps.Scanner.Symbolization`
+
+* Detect format: ELF, PE, Mach-O.
+* For ELF/Mach-O:
+
+  * Parse symbol tables (`.symtab`, `.dynsym`).
+  * Parse DWARF (if present) to map functions to source files/lines.
+* For PE:
+
+  * Parse PDB (if present) or export table.
+* For stripped binaries:
+
+  * Run function boundary recovery (linear sweep + heuristic).
+  * Compute block/fn-level hashes for fingerprinting.
+
+Output:
+
+```csharp
+public record ImageFunction(
+    string ImageId,      // e.g., SHA256 of file
+    ulong StartVa,
+    uint Size,
+    string? SymbolName,  // demangled if possible
+    string FnHash,       // stable hash of bytes / CFG
+    string? SourceFile,
+    int? SourceLine);
+```
+
+### 6.3 CFG + Call graph
+
+Module: `StellaOps.Scanner.Cfg`
+
+* Disassemble `.text` using Capstone/Iced.x86.
+* Build basic blocks and CFG.
+* Identify:
+
+  * Direct calls (resolved).
+  * PLT/IAT indirections to shared libraries.
+* Build `CallGraph` for binary functions:
+
+  * Entrypoints: `main`, exported functions, Go `main.main`, etc.
+  * Map application functions to library functions via PLT/IAT edges.
+
+### 6.4 Linking vulnerabilities
+
+* For each vulnerability affecting a native library (e.g., OpenSSL):
+
+  * Map to candidate binaries via SBOM + PURL.
+  * Within library image, find `ImageFunction`s matching:
+
+    * `SymbolName` patterns.
+    * `FnHash` / `BlockFingerprints` (for precise detection).
+* Determine reachability:
+
+  * Starting from application entrypoints, traverse call graph to see if calls to vulnerable library function occur.
+
+---
+
+## 7. Reachability Engine & Lattice (Scanner.WebService)
+
+Project: `StellaOps.Scanner.Reachability`
+
+### 7.1 Inputs to engine
+
+* Combined `CallGraph[]` (per language + binary).
+* Vulnerability list (CVE, GHSA, etc.) with affected PURLs.
+* Vulnerability signatures.
+* Entrypoint hints:
+
+  * Container CMD/ENTRYPOINT.
+  * Detected HTTP handlers, WSGI/ASGI entrypoints, etc.
+
+### 7.2 Algorithm steps
+
+1. **Entrypoint expansion**
+
+   * Identify all `CallGraphNode` with `IsEntrypoint=true`.
+   * Add language-specific “framework entrypoints” (e.g., Express route dispatch, Django URL dispatch) when detected.
+
+2. **Graph traversal**
+
+   * For each entrypoint node:
+
+     * BFS/DFS through edges.
+     * Maintain `reachable` bit on each node.
+   * For dynamic edges:
+
+     * Conservative: if target cannot be resolved, mark affected path as partially unknown and downgrade confidence.
+
+3. **Vuln symbol resolution**
+
+   * For each vulnerability:
+
+     * For each vulnerable component PURL found in SBOM:
+
+       * Find candidate nodes whose `SymbolId` matches `TargetSymbolPatterns` / binary fingerprints.
+   * If none found:
+
+     * `FunctionNotPresent`, reported with low confidence when the component's version range indicates it is vulnerable but no matching symbol can be found.
+   * If found:
+
+     * Check `reachable` bit:
+
+       * If reachable by at least one entrypoint, `PresentReachable`.
+       * Else, `PresentNotReachable`.
+
+4. **Confidence computation**
+
+   * Start from:
+
+     * `1.0` for direct match with explicit function name & static call.
+     * Lower for:
+
+       * Heuristic framework entrypoints.
+       * Dynamic calls.
+       * Fingerprint-only matches on stripped binaries.
+   * Example rule-of-thumb:
+
+     * direct static path only: 0.95–1.0.
+     * dynamic edges but symbol found: 0.7–0.9.
+     * symbol not found but version says vulnerable: 0.4–0.6.
+
+5. **Lattice merge**
+
+   * Represent each CVE+component pair as a lattice element with states: `{affected, not_affected, unknown}`.
+   * Reachability engine produces a **local state**:
+
+     * `PresentReachable` → candidate `affected`.
+     * `PresentNotReachable` or `FunctionNotPresent` → candidate `not_affected`.
+     * `Unknown` → `unknown`.
+   * Merge with:
+
+     * Upstream vendor VEX (from Concelier).
+     * Policy overrides (e.g., “treat certain CVEs as affected unless vendor says otherwise”).
+   * Final state computed here (Scanner.WebService), not in Concelier or VEXer.
+
+6. **Evidence output**
+
+   * For each vulnerability:
+
+     * Emit `ReachabilityEvidence` with:
+
+       * Status.
+       * Confidence.
+       * Method.
+       * Example entrypoint paths (for UX and audit).
+   * Persist this evidence alongside regular scan results.
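+
+Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration rather than the engine itself: the simplified `Edge` record, the `ReachabilitySketch` class, and the concrete confidence values (picked from inside the rule-of-thumb ranges) are assumptions, and the unresolved-edge flag is tracked for the whole traversal instead of per path.
+
+```csharp
+using System.Collections.Generic;
+using System.Linq;
+
+public enum ReachabilityStatus { PresentReachable, PresentNotReachable, FunctionNotPresent, Unknown }
+
+// Simplified edge: Resolved = false stands in for a dynamic call with an unknown target.
+public record Edge(string From, string To, bool Resolved = true);
+
+public static class ReachabilitySketch
+{
+    // BFS from all entrypoints; also reports whether any unresolved edge was seen.
+    public static (HashSet<string> Reachable, bool SawUnresolved) Traverse(
+        IEnumerable<string> entrypoints, IEnumerable<Edge> edges)
+    {
+        var adjacency = edges.ToLookup(e => e.From);
+        var reachable = new HashSet<string>(entrypoints);
+        var queue = new Queue<string>(reachable);
+        bool sawUnresolved = false;
+
+        while (queue.Count > 0)
+        {
+            foreach (var edge in adjacency[queue.Dequeue()])
+            {
+                if (!edge.Resolved) { sawUnresolved = true; continue; }
+                if (reachable.Add(edge.To)) queue.Enqueue(edge.To);
+            }
+        }
+        return (reachable, sawUnresolved);
+    }
+
+    // Maps candidate vulnerable nodes to a status plus an illustrative confidence.
+    public static (ReachabilityStatus Status, double Confidence) Classify(
+        IReadOnlyCollection<string> vulnerableNodeIds, HashSet<string> reachable, bool sawUnresolved)
+    {
+        // No matching symbol although the version range says vulnerable: low confidence.
+        if (vulnerableNodeIds.Count == 0) return (ReachabilityStatus.FunctionNotPresent, 0.5);
+
+        // Dynamic edges downgrade confidence, as described in step 2.
+        double confidence = sawUnresolved ? 0.8 : 0.95;
+        return vulnerableNodeIds.Any(id => reachable.Contains(id))
+            ? (ReachabilityStatus.PresentReachable, confidence)
+            : (ReachabilityStatus.PresentNotReachable, confidence);
+    }
+}
+```
+
+A policy that prefers `Unknown` over `PresentNotReachable` whenever unresolved edges exist would be a stricter alternative; the sketch keeps the status from step 3 and only lowers confidence.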
+
+---
+
+## 8. Integration with SBOM & VEX
+
+### 8.1 SBOM annotation
+
+* Extend SBOM documents (CycloneDX / SPDX) with extra properties:
+
+  * CycloneDX:
+
+    * `component.properties`:
+
+      * `stellaops:reachability:status` = `present_reachable|present_not_reachable|function_not_present|unknown`
+      * `stellaops:reachability:confidence` = `0.0-1.0`
+  * SPDX:
+
+    * `Annotation` or `ExternalRef` with similar metadata.
+
+### 8.2 OpenVEX generation
+
+Module: `StellaOps.Vexer.Adapter.Reachability`
+
+* For each `(vuln, component)` pair:
+
+  * Map to VEX statement:
+
+    * If `PresentReachable`:
+
+      * `status: affected`
+      * No `justification` (OpenVEX reserves it for `not_affected`); supply an `action_statement` instead.
+    * If `PresentNotReachable`:
+
+      * `status: not_affected`
+      * `justification: vulnerable_code_not_in_execute_path`
+    * If `FunctionNotPresent`:
+
+      * `status: not_affected`
+      * `justification: vulnerable_code_not_present` (or `component_not_present` when the component itself is absent)
+    * If `Unknown`:
+
+      * `status: under_investigation` (configurable).
+
+* Attach evidence via:
+
+  * `impact_statement` / `status_notes` fields (link to internal evidence JSON or audit link).
+
+* VEXer does not recalculate reachability; it uses the already computed decision + evidence.
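+
+The status mapping could be expressed as a small pure function, sketched below. `VexDecision` and `VexMapping` are hypothetical names; the status and justification strings are the labels defined by the OpenVEX specification, which differ from the internal status names used in this plan.
+
+```csharp
+#nullable enable
+
+public enum ReachabilityStatus { PresentReachable, PresentNotReachable, FunctionNotPresent, Unknown }
+
+// Hypothetical carrier for the two OpenVEX fields the adapter has to fill in.
+public record VexDecision(string Status, string? Justification);
+
+public static class VexMapping
+{
+    public static VexDecision ToOpenVex(ReachabilityStatus status) => status switch
+    {
+        // "affected" statements carry no justification in OpenVEX.
+        ReachabilityStatus.PresentReachable =>
+            new VexDecision("affected", null),
+        ReachabilityStatus.PresentNotReachable =>
+            new VexDecision("not_affected", "vulnerable_code_not_in_execute_path"),
+        ReachabilityStatus.FunctionNotPresent =>
+            new VexDecision("not_affected", "vulnerable_code_not_present"),
+        // Unknown (and any future value) defaults to under_investigation.
+        _ => new VexDecision("under_investigation", null),
+    };
+}
+```
+
+Keeping the mapping free of side effects fits the rule that VEXer only translates decisions already made in Scanner.WebService; making the `Unknown` branch configurable would mean passing a policy object in, which the sketch leaves out.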
+
+---
+
+## 9. Executable Containers & Offline Operation
+
+### 9.1 Executable containers
+
+* Analyzers run inside a dedicated Scanner worker container that has:
+
+  * .NET 10 runtime.
+  * Language runtimes if needed for parsing (Node, Python, PHP), or Tree-sitter-based parsing.
+* Target image filesystem is mounted read-only under `/mnt/rootfs`.
+* No network access (offline/air-gap).
+* This satisfies “we will use executable containers” while keeping separation between:
+
+  * Target image (mount only).
+  * Analyzer container (StellaOps code).
+
+### 9.2 Offline signature bundles
+
+* Concelier periodically exports:
+
+  * Vulnerability database (CSAF/NVD).
+  * Vulnerability Signature Bank.
+* Bundles are:
+
+  * DSSE-signed.
+  * Versioned (e.g., `signatures-2025-11-01.tar.zst`).
+* Scanner uses:
+
+  * The bundle digest as part of the **Scan Manifest** for deterministic replay.
+
+---
+
+## 10. Determinism & Caching
+
+### 10.1 Layer-level caching
+
+* Key: `layerDigest + analyzerVersion + signatureBundleVersion`.
+* Cache artifacts:
+
+  * CallGraph(s) per layer (for JS/Python/PHP code present in that layer).
+  * Symbolization results per binary file hash.
+* For images sharing layers:
+
+  * Merge cached graphs instead of re-analyzing.
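+
+One way to model the composite cache key is a `record struct`, which provides value equality without extra code. The sketch below is an in-memory illustration only: `LayerCacheKey` and `LayerGraphCache` are hypothetical names, and a real cache would presumably persist artifacts rather than hold them in a dictionary.
+
+```csharp
+using System;
+using System.Collections.Concurrent;
+
+// The three key parts named above; record structs compare by value.
+public readonly record struct LayerCacheKey(
+    string LayerDigest,
+    string AnalyzerVersion,
+    string SignatureBundleVersion);
+
+// Hypothetical in-memory cache, generic over whatever graph type is stored.
+public sealed class LayerGraphCache<TGraph>
+{
+    private readonly ConcurrentDictionary<LayerCacheKey, TGraph> _entries = new();
+
+    // Runs the analyzer on a cache miss. Under contention the factory may run
+    // more than once for the same key, so it should be free of side effects.
+    public TGraph GetOrAnalyze(LayerCacheKey key, Func<LayerCacheKey, TGraph> analyze) =>
+        _entries.GetOrAdd(key, analyze);
+}
+```
+
+Because any change to the analyzer version or signature bundle version produces a different key, stale graphs are never served; they simply stop being looked up.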
+
+### 10.2 Deterministic scan manifest
+
+For each scan, produce:
+
+```json
+{
+  "imageRef": "registry/app:1.2.3",
+  "imageDigest": "sha256:...",
+  "scannerVersion": "1.4.0",
+  "analyzerVersions": {
+    "js": "1.0.0",
+    "python": "1.0.0",
+    "php": "1.0.0",
+    "binary": "1.0.0"
+  },
+  "signatureBundleDigest": "sha256:...",
+  "callGraphDigest": "sha256:...",
+  "reachabilityEvidenceDigest": "sha256:..."
+}
+```
+
+This manifest can be signed (Authority module) and used for audits and replay. `callGraphDigest` is the hash of the call graphs' canonical JSON serialization.
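+
+A digest is only reproducible if the JSON is canonicalized before hashing. The sketch below shows a simplified canonical form (object keys sorted ordinally, no insignificant whitespace, array order preserved) hashed with SHA-256. `CanonicalJson` is a hypothetical helper, and this is not a full implementation of a published canonicalization scheme; number and string normalization in particular are left as they appear in the input.
+
+```csharp
+using System;
+using System.IO;
+using System.Linq;
+using System.Security.Cryptography;
+using System.Text.Json;
+
+// Hypothetical helper for producing values like "callGraphDigest".
+public static class CanonicalJson
+{
+    public static string Digest(string json)
+    {
+        using var document = JsonDocument.Parse(json);
+        using var buffer = new MemoryStream();
+        using (var writer = new Utf8JsonWriter(buffer)) // default options: not indented
+        {
+            Write(document.RootElement, writer);
+        }
+        byte[] hash = SHA256.HashData(buffer.ToArray());
+        return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
+    }
+
+    private static void Write(JsonElement element, Utf8JsonWriter writer)
+    {
+        switch (element.ValueKind)
+        {
+            case JsonValueKind.Object:
+                writer.WriteStartObject();
+                // Sort keys so property order in the input does not affect the hash.
+                foreach (var property in element.EnumerateObject()
+                             .OrderBy(p => p.Name, StringComparer.Ordinal))
+                {
+                    writer.WritePropertyName(property.Name);
+                    Write(property.Value, writer);
+                }
+                writer.WriteEndObject();
+                break;
+            case JsonValueKind.Array:
+                writer.WriteStartArray();
+                foreach (var item in element.EnumerateArray()) Write(item, writer);
+                writer.WriteEndArray();
+                break;
+            default:
+                element.WriteTo(writer); // strings, numbers, true/false/null
+                break;
+        }
+    }
+}
+```
+
+With this approach, two serializations of the same call graph that differ only in key order or whitespace yield the same digest, while reordering array elements changes it, so node and edge lists would need a stable sort before serialization.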
+
+---
+
+## 11. Implementation Roadmap (Phased)
+
+### Phase 0 – Infrastructure & Binary presence
+
+**Duration:** 1 sprint
+
+* Set up `Scanner.Reachability` core types and interfaces.
+* Implement:
+
+  * Basic Symbolizer for ELF + DWARF.
+  * Binary function catalog without CFG.
+* Link a small set of CVEs to binary function presence via `SymbolName`.
+* Expose minimal evidence:
+
+  * `PresentReachable`/`FunctionNotPresent` based only on presence (with no call graph yet, a present function is conservatively reported as `PresentReachable`).
+* Integrate with VEXer to emit `not_affected` statements with the `vulnerable_code_not_present` justification when the function is absent.
+
+**Success criteria:**
+
+* For selected demo images with known vulnerable/patched OpenSSL, the scanner can:
+
+  * Distinguish images where vulnerable function is present vs. absent.
+  * Emit OpenVEX with correct `not_affected` when patched.
+
+---
+
+### Phase 1 – JS/Python/PHP call graphs & basic reachability
+
+**Duration:** 1–2 sprints
+
+* Implement:
+
+  * `Scanner.Analyzers.JavaScript` with module + function call graph.
+  * `Scanner.Analyzers.Python` and `Scanner.Analyzers.Php` with basic graphs.
+* Entrypoint detection:
+
+  * JS: main script from CMD, basic HTTP handlers.
+  * Python: main script + Django/Flask heuristics.
+  * PHP: front controllers.
+* Implement core reachability algorithm (BFS/DFS).
+* Implement simple `VulnerabilitySignature` that uses function names and file paths.
+* Hook lattice engine in Scanner.WebService and integrate with:
+
+  * Concelier vulnerability feeds.
+  * VEXer.
+
+**Success criteria:**
+
+* For demo apps (Node, Django, Laravel):
+
+  * Identify vulnerable functions and mark them reachable/unreachable.
+  * Demonstrate noise reduction (some CVEs flagged as `not_affected`).
+
+---
+
+### Phase 2 – Binary CFG & Fingerprinting, Improved Confidence
+
+**Duration:** 1–2 sprints
+
+* Extend Symbolizer & CFG for:
+
+  * Stripped binaries (function hashing).
+  * Shared libraries (PLT/IAT resolution).
+* Implement `VulnerabilitySignature.BlockFingerprints` to distinguish patched vs vulnerable binary functions.
+* Refine confidence scoring:
+
+  * Use fingerprint match quality.
+  * Consider presence/absence of debug info.
+* Expand coverage:
+
+  * glibc, curl, zlib, OpenSSL, libxml2, etc.
+
+**Success criteria:**
+
+* For curated images:
+
+  * Confirm ability to differentiate patched vs vulnerable versions even when binaries are stripped.
+  * Reachability reflects true call paths across app→lib boundaries.
+
+---
+
+### Phase 3 – Runtime hooks (optional), UX, and Hardening
+
+**Duration:** 2+ sprints
+
+* Add opt-in runtime confirmation:
+
+  * eBPF probes for function hits (Linux).
+  * Map runtime addresses back to `ImageFunction` via symbolization.
+* Enhance console UX:
+
+  * Path explorer UI: show entrypoint → … → vulnerable function path.
+  * Evidence view with hash-based proofs.
+* Hardening:
+
+  * Performance optimization for large images (parallel analysis, caching).
+  * Conservative fallbacks for dynamic language features.
+
+**Success criteria:**
+
+* For selected environments where runtime is allowed:
+
+  * Static reachability is confirmed by runtime traces in the majority of cases.
+  * No significant performance regression on typical images.
+
+---
+
+## 12. How this satisfies your initial bullets
+
+From your initial requirements:
+
+1. **JavaScript, Python, PHP, binary**
+   → Dedicated analyzers per language + binary symbolization/CFG, unified in `Scanner.Reachability`.
+
+2. **Executable containers**
+   → Analyzers run inside Scanner’s worker container, mounting the target image rootfs; no network access.
+
+3. **Libraries usage call graph**
+   → Call graphs map from entrypoints → app code → library functions; SBOM + PURLs tie functions to libraries.
+
+4. **Reachability analysis**
+   → BFS/DFS from entrypoints over per-language and binary graphs, with lattice-based merging in `Scanner.WebService`.
+
+5. **JSON + PURLs**
+   → All evidence is JSON with PURL-tagged components; SBOM is annotated, and VEX statements reference those PURLs.
+
+---
+
+If you like, the next step can be for me to draft concrete C# interface definitions (including some initial Tree-sitter integration stubs for JS/Python/PHP) and a skeleton of the `ReachabilityPlan` and `ReachabilityEngine` classes that you can drop into the monorepo.