Here’s a compact blueprint for bringing **stripped ELF binaries** into StellaOps’s **call-graph + reachability scoring**: from raw bytes → neutral JSON → deterministic scoring.

---

# Why this matters (quick)

Even when symbols are missing, you can still (1) recover functions, (2) build a call graph, and (3) decide if a vulnerable function is *actually* reachable from the binary’s entrypoints. This feeds StellaOps’s deterministic scoring/lattice engine so VEX decisions are evidence-backed, not guesswork.

---

# High-level pipeline

1. **Ingest**
   * Accept: ELF (static/dynamic), PIE, musl/glibc, multiple arches (x86_64, aarch64, armhf, riscv64).
   * Normalize: compute file hash set (SHA-256, BLAKE3), note `PT_DYNAMIC`, `DT_NEEDED`, interpreter, RPATH/RUNPATH.

2. **Symbolization (best-effort)**
   * **If DWARF present**: read `.debug_*` (function names, inlines, CU boundaries, ranges).
   * **If stripped**:
     * Use disassembler to **discover functions** (prologue patterns, xref-to-targets, thunk detection).
     * Derive **synthetic names**: `sub_`, `plt_` (from dynamic symbol table if available), `extern@libc.so.6:memcpy`.
     * Lift exported dynsyms and PLT stubs even when local symbols are removed.
     * Recover **string-referenced names** (e.g., Go/Python/C++ RTTI/Itanium mangling where present).

3. **Disassembly & IR**
   * Disassemble to basic blocks; lift to a neutral IR (SSA-like) sufficient for:
     * Call edges (direct `call`/`bl`).
     * **Indirect calls** via GOT, vtables, function pointers (approximate with points-to sets).
     * Tailcalls, thunks, PLT interposition.

4. **Call-graph build**
   * Start from **entrypoints**:
     * ELF entry (`_start`), constructors (`.init_array`), exported API (public symbols), `main` (if recoverable).
     * Optional: **entry-trace** (cmd-line + env + loader path) from container image to seed realistic roots.
   * Build **CG** with:
     * Direct edges: precise.
     * Indirect edges: conservative, with **evidence tags** (GOT target set, vtable class set, signature match).
     * Record **inter-module edges** to shared libs (soname + version) with relocation evidence.

5. **Reachability scoring (deterministic)**
   * Input: list of vulnerable functions/paths (from CSAF/CVE KB) normalized to **function-level identifiers** (soname!symbol or hash-based if unnamed).
   * Compute **reachability** from roots → target:
     * `REACHABLE_CONFIRMED` (path with only precise edges),
     * `REACHABLE_POSSIBLE` (path contains conservative edges),
     * `NOT_REACHABLE_FOUNDATION` (no path in current graph).
   * Add **confidence** derived from edge evidence + relocation proof.
   * Emit **proof trails** (the exact path: nodes, edges, evidence).

6. **Neutral JSON intermediate (NJIF)**
   * Stored in cache; signed for deterministic replay.
   * Consumed by StellaOps.Policy/Lattice to merge with VEX.

---

# Neutral JSON Intermediate Format (NJIF)

```json
{
  "artifact": {
    "path": "/work/bin/app",
    "hashes": {"sha256": "…", "blake3": "…"},
    "arch": "x86_64",
    "elf": {
      "type": "ET_DYN",
      "interpreter": "/lib64/ld-linux-x86-64.so.2",
      "needed": ["libc.so.6", "libssl.so.3"],
      "rpath": [],
      "runpath": []
    }
  },
  "symbols": {
    "exported": [
      {"id": "libc.so.6!memcpy", "kind": "dynsym", "addr": "0x0", "plt": true}
    ],
    "functions": [
      {"id": "sub_401000", "addr": "0x401000", "size": 112, "name_hint": null, "from": "disasm"},
      {"id": "main", "addr": "0x4023d0", "size": 348, "from": "dwarf|heuristic"}
    ]
  },
  "cfg": [
    {"func": "main", "blocks": [
      {"b": "0x4023d0", "succ": ["0x402415"], "calls": [{"type": "direct", "target": "sub_401000"}]},
      {"b": "0x402415", "succ": ["0x402440"], "calls": [{"type": "plt", "target": "libc.so.6!memcpy"}]}
    ]}
  ],
  "cg": {
    "nodes": [
      {"id": "main", "evidence": ["dwarf|heuristic"]},
      {"id": "sub_401000"},
      {"id": "libc.so.6!memcpy", "external": true, "lib": "libc.so.6"}
    ],
    "edges": [
      {"from": "main", "to": "sub_401000", "kind": "direct"},
      {"from": "main", "to": "libc.so.6!memcpy", "kind": "plt", "evidence": ["reloc@GOT"]}
    ],
    "roots": ["_start", "init_array[]", "main"]
  },
  "reachability": [
    {
      "target": "libssl.so.3!SSL_free",
      "status": "NOT_REACHABLE_FOUNDATION",
      "path": []
    },
    {
      "target": "libc.so.6!memcpy",
      "status": "REACHABLE_CONFIRMED",
      "path": ["main", "libc.so.6!memcpy"],
      "confidence": 0.98,
      "evidence": ["plt", "dynsym", "reloc"]
    }
  ],
  "provenance": {
    "toolchain": {
      "disasm": "ghidra_headless|radare2|llvm-objdump",
      "version": "…"
    },
    "scan_manifest_hash": "…",
    "timestamp_utc": "2025-11-16T00:00:00Z"
  }
}
```

---

# Practical extractors (headless/CLI)

* **DWARF**: `llvm-dwarfdump`/`eu-readelf` for quick CU/function ranges; fall back to the disassembler.
* **Disassembly/CFG/CG** (choose one or more; wrap with a stable adapter):
  * **Ghidra Headless API**: recover functions, basic blocks, references, PLT/GOT, vtables; export via a custom headless script to NJIF.
  * **radare2 / rizin**: `aaa`, `agCd`, `aflj`, `agj` to export functions/graphs (the `j` variants emit JSON; `agCd` emits the call graph as DOT).
  * **Binary Ninja headless** (if license permits) for cleaner IL and indirect-call modeling.
  * **angr** for path-sensitive refinement on tricky indirect calls (optional, gated by budget).

**Adapter principle:** All tools output a **small, consistent NJIF** so the scoring engine and lattice logic never depend on any single RE tool.

---

# Indirect call modeling (concise rules)

* **PLT/GOT**: edge from caller → `soname!symbol` with evidence: `plt`, `reloc@GOT`.
* **Function pointers**: if a store to a pointer is found and targets a known function set `{f1…fk}`, add edges with `kind: "indirect"`, `evidence: ["xref-store", "sig-compatible"]`.
* **Virtual calls / vtables**: class-method set from RTTI/vtable scans; mark edges `evidence: ["vtable-match"]`.
* **Tailcalls**: treat as edges, not fallthrough.

Each conservative step lowers **confidence**, but keeps determinism: the rules and their hashes are in the scan manifest.

---

# Deterministic scoring (plug into Stella’s lattice)

* **Inputs**: NJIF, CVE→function mapping (`soname!symbol` or function hash), policy knobs.
* **States**: `{NOT_OBSERVED < POSSIBLE < REACHABLE_CONFIRMED}` with **monotone** merge (never oscillates).
* **Confidence**: product of edge evidences (configurable weights): `direct=1.0, plt=0.98, vtable=0.85, funcptr=0.7`.
* **Output**: OpenVEX/CSAF annotations + human proof path; signed with DSSE to preserve replayability.

---

# Minimal Ghidra headless skeleton (exporter idea)

```bash
analyzeHeadless /work/gh_proj MyProj -import app -scriptPath scripts \
  -postScript ExportNjif.java /out/app.njif.json
```

```java
// ExportNjif.java (outline)
import ghidra.app.script.GhidraScript;
import ghidra.program.model.listing.Function;

public class ExportNjif extends GhidraScript {
  @Override
  public void run() throws Exception {
    // Output path is the argument that follows the script name on the command line.
    String outPath = getScriptArgs()[0];
    // Walk all functions in ascending address order.
    for (Function fn : currentProgram.getFunctionManager().getFunctions(true)) {
      // collect functions, blocks, calls, externs/PLT
      // map non-named functions to sub_
      // detect PLT thunks → dynsym names
    }
    // write NJIF JSON to outPath deterministically (sorted keys, stable ordering)
  }
}
```

---

# Integration points in StellaOps

* **Scanner.Analyzers.Binary.Elf**
  * `ElfNormalizer` → hashes, dynamic deps.
  * `Symbolizer` → DWARF reader + HeuristicDisasm (via tool adapter).
  * `CgBuilder` → NJIF builder/merger (multi-module).
  * `ReachabilityEngine` → path search + confidence math.
  * `Emitter` → NJIF cache + VEX/CSAF notes.
* **Scheduler**: memoize by `(hashes, toolchain_version, ruleset_hash)` to ensure replayable results.
* **Authority**: sign NJIF + scoring outputs; store manifests (feeds, rule weights, tool versions).

---

# Test fixtures (suggested)

* Tiny ELF zoo: statically linked, PIE, stripped/non-stripped, C++ with vtables, musl vs glibc.
* Known CVE libs (e.g., `libssl`, `zlib`) with versioned symbols to validate soname!symbol mapping.
* Synthetic binaries with function-pointer tables to validate conservative edges.

---

If you want, I can generate:

* A ready-to-run **Ghidra headless exporter** (Java) that writes NJIF exactly like above.
* A small **.NET parser** that ingests NJIF and emits StellaOps reachability + OpenVEX notes.
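
As a rough illustration of the reachability rules in pipeline step 5 and the confidence weights in the deterministic-scoring section, here is a minimal Python sketch. It is not the StellaOps implementation: the function names (`reachable`, `classify`, `best_confidence`), the use of `vtable`/`funcptr` as edge `kind` values, the fallback weight for unknown kinds, and the `handler` node in the example graph are all assumptions made for illustration.

```python
import heapq
from collections import deque

# Weights copied from the "Confidence" bullet; using them as edge kinds is an assumption.
WEIGHTS = {"direct": 1.0, "plt": 0.98, "vtable": 0.85, "funcptr": 0.7}
UNKNOWN_KIND_WEIGHT = 0.5  # hypothetical fallback, not from the blueprint
PRECISE = {"direct", "plt"}


def reachable(cg, target, allowed=None):
    """BFS from cg["roots"]; optionally follow only edges whose kind is in `allowed`."""
    adj = {}
    for e in cg["edges"]:
        if allowed is None or e["kind"] in allowed:
            adj.setdefault(e["from"], []).append(e["to"])
    seen, queue = set(cg["roots"]), deque(cg["roots"])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def classify(cg, target):
    """Map a target to one of the three statuses used in the NJIF examples."""
    if reachable(cg, target, PRECISE):
        return "REACHABLE_CONFIRMED"      # some path uses only precise edges
    if reachable(cg, target):
        return "REACHABLE_POSSIBLE"       # every path needs a conservative edge
    return "NOT_REACHABLE_FOUNDATION"     # no path in the current graph


def best_confidence(cg, target):
    """Largest product of edge weights over any root-to-target path (0.0 if none)."""
    adj = {}
    for e in cg["edges"]:
        w = WEIGHTS.get(e["kind"], UNKNOWN_KIND_WEIGHT)
        adj.setdefault(e["from"], []).append((e["to"], w))
    best = {r: 1.0 for r in cg["roots"]}
    heap = [(-1.0, r) for r in cg["roots"]]
    heapq.heapify(heap)
    while heap:  # Dijkstra on max-product; valid because every weight is <= 1
        neg, node = heapq.heappop(heap)
        conf = -neg
        if conf < best.get(node, 0.0):
            continue
        if node == target:
            return conf
        for nxt, w in adj.get(node, []):
            if conf * w > best.get(nxt, 0.0):
                best[nxt] = conf * w
                heapq.heappush(heap, (-best[nxt], nxt))
    return 0.0


EXAMPLE_CG = {
    "roots": ["_start", "main"],
    "edges": [
        {"from": "main", "to": "sub_401000", "kind": "direct"},
        {"from": "main", "to": "libc.so.6!memcpy", "kind": "plt"},
        {"from": "sub_401000", "to": "handler", "kind": "funcptr"},  # hypothetical node
    ],
}

for t in ["libc.so.6!memcpy", "handler", "libssl.so.3!SSL_free"]:
    print(t, classify(EXAMPLE_CG, t), best_confidence(EXAMPLE_CG, t))
```

Running two plain graph searches (precise edges only, then all edges) keeps the status decision independent of the weight table, so changing weights in policy can alter confidence but never the status.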
Below is a full architecture plan for implementing **stripped-ELF binary reachability** (call graph + NJIF + deterministic scoring, with a hook for patch-oracles) inside **StellaOps**. I will assume .NET 10, existing microservice split (Scanner.WebService, Scanner.Worker, Concelier, Excitior, Authority, Scheduler, Sbomer, Signals), and your standing rule: **all lattice logic runs in Scanner.WebService**.

---

## 1. Scope, Objectives, Non-Goals

### 1.1 Objectives

1. **Recover function-level call graphs from ELF binaries**, including **stripped** ones:
   * Support ET_EXEC / ET_DYN / PIE, static & dynamic linking.
   * Support at least **x86_64, aarch64** in v1, later armhf, riscv64.

2. **Produce a neutral, deterministic JSON representation (NJIF)**:
   * Tool-agnostic: can be generated from Ghidra, radare2/rizin, Binary Ninja, angr, etc.
   * Stable identifiers and schema so downstream services don’t depend on a specific RE engine.

3. **Compute function-level reachability for vulnerabilities**:
   * Given CVE → `soname!symbol` (and later function-hash) mappings from Concelier,
   * Decide `REACHABLE_CONFIRMED` / `REACHABLE_POSSIBLE` / `NOT_REACHABLE_FOUNDATION` with evidence and confidence.

4. **Integrate with StellaOps lattice and VEX outputs**:
   * Lattice logic runs in **Scanner.WebService**.
   * Results flow into Excitior (VEX) and Sbomer (SBOM annotations), preserving provenance.

5. **Enable deterministic replay**:
   * Every analysis run is tied to a **Scan Manifest**: tool versions, ruleset hashes, policy hashes, container image digests.

### 1.2 Non-Goals (v1)

* No dynamic runtime probes (EventPipe/JFR) in this phase.
* No full decompilation; we only need enough IR for calls/edges.
* No aggressive path-sensitive analysis (symbolic execution) in v1; that can be a v2 enhancement.

---

## 2. High-Level System Architecture

### 2.1 Components

* **Scanner.WebService (existing)**
  * REST/gRPC API for scans.
  * Orchestrates analysis jobs via Scheduler.
  * Hosts **Lattice & Reachability Engine** for all artifact types.
  * Reads NJIF results, merges with Concelier function mappings and policies.

* **Scanner.Worker (existing, extended)**
  * Executes **Binary Analyzer Pipelines**.
  * Invokes RE tools (Ghidra, rizin, etc.) in dedicated containers.
  * Produces NJIF and persists it.

* **Binary Tools Containers (new)**
  * `stellaops-tools-ghidra:`
  * `stellaops-tools-rizin:`
  * Optionally `stellaops-tools-angr` for advanced passes.
  * Pinned versions, no network access (for determinism & air-gap).

* **Storage & Metadata**
  * **DB (PostgreSQL)**: scan records, NJIF metadata, reachability summaries.
  * **Object store** (MinIO/S3/Filesystem): NJIF JSON blobs, tool logs.
  * **Authority**: DSSE signatures for Scan Manifest, NJIF, and reachability outputs.

* **Concelier**
  * Provides **CVE → component → function symbol/hashes** resolution.
  * Exposes “Link-Not-Merge” graph of advisory, component, and function nodes.

* **Excitior (VEX)**
  * Consumes Scanner.WebService reachability states.
  * Emits OpenVEX/CSAF with properly justified statuses.

* **UnknownsRegistry (future)**
  * Receives unresolvable call edges / ambiguous functions from the analyzer,
  * Feeds them into “adaptive security” workflows.

### 2.2 End-to-End Flow (Binary / Image Scan)

1. Client requests scan (binary or container image) via **Scanner.WebService**.
2. WebService:
   * Extracts binaries from OCI layers (if scanning image),
   * Registers **Scan Manifest**,
   * Submits a job to Scheduler (queue: `binary-elfflow`).
3. Scanner.Worker dequeues the job:
   * Detects ELF binaries,
   * Runs **Binary Analyzer Pipeline** for each unique binary hash.
4. Worker uses tools containers:
   * Ghidra/rizin → CFG, function discovery, call graph,
   * Converts to **NJIF**.
5. Worker persists NJIF + metadata; marks analysis complete.
6. Scanner.WebService picks up NJIF:
   * Fetches advisory function mappings from Concelier,
   * Runs **Reachability & Lattice scoring**,
   * Updates scan results and triggers Excitior / Sbomer.

All steps are deterministic given:

* Input artifact,
* Tool container digests,
* Ruleset/policy versions.

---

## 3. Binary Analyzer Subsystem (Scanner.Worker)

Introduce a dedicated module:

* `StellaOps.Scanner.Analyzers.Binary.Elf`

### 3.1 Internal Layers

1. **ElfDetector**
   * Inspects files in a scan:
     * Magic `0x7f 'E' 'L' 'F'`,
     * Confirms architecture via ELF header.
   * Produces `BinaryArtifact` records with:
     * `hashes` (SHA-256, BLAKE3),
     * `path` in container,
     * `arch`, `endianness`.

2. **ElfNormalizer**
   * Uses a lightweight library (e.g., ELFSharp) to extract:
     * `ElfType` (ET_EXEC, ET_DYN),
     * interpreter (`PT_INTERP`),
     * `DT_NEEDED` list,
     * RPATH/RUNPATH,
     * presence/absence of DWARF sections.
   * Emits a normalized `ElfMetadata` DTO.

3. **Symbolization Layer**
   * Sub-components:
     * `DwarfSymbolReader`: if DWARF present, read CU, function ranges, names, inlines.
     * `DynsymReader`: parse `.dynsym`, `.plt`, exported symbols.
     * `HeuristicFunctionFinder`:
       * For stripped binaries:
         * Use disassembler xrefs, prologue patterns, return instructions, call-targets.
         * Recognize PLT thunks → `soname!symbol`.
   * Consolidates into `FunctionSymbol` entities:
     * `id` (e.g., `main`, `sub_401000`, `libc.so.6!memcpy`),
     * `addr`, `size`, `is_external`, `from` (`dwarf`, `dynsym`, `heuristic`).

4. **Disassembly & IR Layer**
   * Abstraction: `IDisassemblyAdapter`:
     * `Task AnalyzeAsync(BinaryArtifact, ElfMetadata, ScanManifest)`
   * Implementations:
     * `GhidraDisassemblyAdapter`:
       * Invokes headless Ghidra in container,
       * Receives machine-readable JSON (script-produced),
       * Extracts functions, basic blocks, calls, GOT/PLT info, vtables.
     * `RizinDisassemblyAdapter` (backup/fallback).
   * Produces:
     * `BasicBlock` objects,
     * `Instruction` metadata where needed for calls,
     * `CallSite` records (direct, PLT, indirect).

5. **Call-Graph Builder**
   * Consumes `FunctionSymbol` + `CallSite` sets.
   * Identifies **roots**:
     * `_start`, `.init_array` entries,
     * `main` (if present),
     * Exported API functions for shared libs.
   * Creates `CallGraph`:
     * Nodes: functions (`FunctionNode`),
     * Edges: `CallEdge` with:
       * `kind`: `direct`, `plt`, `indirect-funcptr`, `indirect-vtable`, `tailcall`,
       * `evidence`: tags like `["reloc@GOT", "sig-match", "vtable-class"]`.

6. **Evidence & Confidence Annotator**
   * For each edge, computes a **local confidence**:
     * `direct`: 1.0
     * `plt`: 0.98
     * `indirect-funcptr`: 0.7
     * `indirect-vtable`: 0.85
   * For each path later, Scanner.WebService composes these.

7. **NJIF Serializer**
   * Transforms domain objects into **NJIF JSON**:
     * Sorted keys, stable ordering for determinism.
   * Writes:
     * `artifact`, `elf`, `symbols`, `cfg`, `cg`, and partial `reachability: []` (filled by WebService).
   * Stores in object store, returns location + hash to DB.

8. **Unknowns Reporting**
   * Any unresolved:
     * Indirect call with empty target set,
     * Function region not mapped to symbol,
   * Logged as `UnknownEvidence` records and optionally published to **UnknownsRegistry** stream.

---

## 4. NJIF Data Model (Neutral JSON Intermediate Format)

Define a stable schema with a top-level `njif_schema_version` field.

### 4.1 Top-Level Shape

```json
{
  "njif_schema_version": "1.0.0",
  "artifact": { ... },
  "symbols": { ... },
  "cfg": [ ... ],
  "cg": { ... },
  "reachability": [ ... ],
  "provenance": { ... }
}
```

### 4.2 Key Sections

1. `artifact`
   * `path`, `hashes`, `arch`, `elf.type`, `interpreter`, `needed`, `rpath`, `runpath`.

2. `symbols`
   * `exported`: external/dynamic symbols, especially PLT:
     * `id`, `kind`, `plt`, `lib`.
   * `functions`:
     * `id` (synthetic or real name),
     * `addr`, `size`, `from` (source of naming info),
     * `name_hint` (optional).

3. `cfg`
   * Per-function basic block CFG plus call sites:
     * Blocks with `succ`, `calls` entries.
   * Sufficient for future static checks, not full IR.

4. `cg`
   * `nodes`: function nodes with evidence tags.
   * `edges`: call edges with:
     * `from`, `to`, `kind`, `evidence`.
   * `roots`: entrypoints for reachability algorithms.

5. `reachability`
   * Initially empty from Worker.
   * Populated in Scanner.WebService as:

   ```json
   {
     "target": "libssl.so.3!SSL_free",
     "status": "REACHABLE_CONFIRMED",
     "path": ["_start", "main", "libssl.so.3!SSL_free"],
     "confidence": 0.93,
     "evidence": ["plt", "dynsym", "reloc"]
   }
   ```

6. `provenance`
   * `toolchain`:
     * `disasm`: `"ghidra_headless:10.4"`, etc.
   * `scan_manifest_hash`,
   * `timestamp_utc`.

### 4.3 Persisting NJIF

* Object store (versioned path):
  * `njif/{sha256}/njif-v1.json`
* DB table `binary_njif`:
  * `binary_hash`, `njif_hash`, `schema_version`, `toolchain_digest`, `scan_manifest_id`.

---

## 5. Reachability & Lattice Integration (Scanner.WebService)

### 5.1 Inputs

* **NJIF** for each binary (possibly multiple binaries per container).
* Concelier’s **CVE → (component, function)** resolution:
  * `component_id` → `soname!symbol` sets, and where available, function hashes.
* Scanner’s existing **lattice policies**:
  * States: e.g. `NOT_OBSERVED < POSSIBLE < REACHABLE_CONFIRMED`.
  * Merge rules are monotone.

### 5.2 Reachability Engine

New service module:

* `StellaOps.Scanner.Domain.Reachability`
  * `INjifRepository` (reads NJIF JSON),
  * `IFunctionMappingResolver` (Concelier adapter),
  * `IReachabilityCalculator`.

Algorithm per target function:

1. Resolve vulnerable function(s):
   * From Concelier: `soname!symbol` and/or `func_hash`.
   * Map to NJIF `symbols.exported` or `symbols.functions`.

2. For each binary:
   * Use `cg.roots` as entry set.
   * BFS/DFS along edges until:
     * Reaching target node(s),
     * Or graph fully explored.

3. For each successful path:
   * Collect edges’ `confidence` weights, compute path confidence:
     * e.g., product of edge confidences or a log/additive scheme.

4. Aggregate result:
   * If ≥ 1 path with only `direct/plt` edges:
     * `status = REACHABLE_CONFIRMED`.
   * Else if only paths with indirect edges:
     * `status = REACHABLE_POSSIBLE`.
   * Else:
     * `status = NOT_REACHABLE_FOUNDATION`.

5. Emit `reachability` entry back into NJIF (or as separate DB table) and into scan result graph.

### 5.3 Lattice & VEX

* Lattice computation is done per `(CVE, component, binary)` triple:
  * Input: reachability status + other signals.
* Resulting state is:
  * Exposed to **Excitior** as a set of **evidence-annotated VEX facts**.
* Excitior translates:
  * `NOT_REACHABLE_FOUNDATION` → likely `not_affected` with justification `vulnerable_code_not_in_execute_path` (the OpenVEX/CSAF label; CycloneDX calls the same idea `code_not_reachable`).
  * `REACHABLE_CONFIRMED` → `affected` or “present_and_exploitable” (depending on overall policy).

---

## 6. Patch-Oracle Extension (Advanced, but Architected Now)

While not strictly required for v1, we should reserve architecture hooks.

### 6.1 Concept

* Given:
  * A **vulnerable** library build (or binary),
  * A **patched** build.
* Run analyzers on both; produce NJIF for each.
* Compare call graphs & function bodies (e.g., hash of normalized bytes):
  * Identify **changed functions** and potentially changed code regions.
* Concelier links those function IDs to specific CVEs (via vendor patch metadata).
* These become authoritative “patched function sets” (the **patch oracle**).

### 6.2 Integration Points

Add a module:

* `StellaOps.Scanner.Analysis.PatchOracle`
  * Input: pair of artifact hashes (old, new) + NJIF.
  * Output: list of `FunctionPatchRecord`:
    * `function_id`, `binary_hash_old`, `binary_hash_new`, `change_kind` (`added`, `modified`, `deleted`).

Concelier:

* Ingests `FunctionPatchRecord` via internal API and updates advisory graph:
  * CVE → function set derived from real patch.

Reachability Engine:

* Uses patch-derived function sets instead of or in addition to symbol mapping from vendor docs.

---

## 7. Persistence, Determinism, Caching

### 7.1 Scan Manifest

For every scan job, create:

* `scan_manifest`:
  * Input artifact hashes,
  * List of binaries,
  * Tool container digests (Ghidra, rizin, etc.),
  * Ruleset/policy/lattice hashes,
  * Time, user, and config flags.

Authority signs this manifest with DSSE.

### 7.2 Binary Analysis Cache

Key: `(binary_hash, arch, toolchain_digest, njif_schema_version)`.

* If present:
  * Skip re-running Ghidra/rizin; reuse NJIF.
* If absent:
  * Run analysis, then cache NJIF.

This provides deterministic replay and prevents re-analysis across scans and across customers (if allowed by tenancy model).

---

## 8. APIs & Integration Contracts

### 8.1 Scanner.WebService External API (REST)

1. `POST /api/scans/images`
   * Existing; extended to flag: `includeBinaryReachability: true`.
2. `POST /api/scans/binaries`
   * Upload a standalone ELF; returns `scan_id`.
3. `GET /api/scans/{scanId}/reachability`
   * Returns list of `(cve_id, component, binary_path, function_id, status, confidence, path)`.

No path versioning; idempotent and additive (new fields appear, old ones remain valid).

### 8.2 Internal APIs

* **Worker ↔ Object Store**:
  * `PUT /binary-njif/{sha256}/njif-v1.json`.
* **WebService ↔ Worker (via Scheduler)**:
  * Job payload includes:
    * `scan_manifest_id`,
    * `binary_hashes`,
    * `analysis_profile` (`default`, `deep`).
* **WebService ↔ Concelier**:
  * `POST /internal/functions/resolve`:
    * Input: `(cve_id, component_ids[])`,
    * Output: `soname!symbol[]`, optional `func_hash[]`.
* **WebService ↔ Excitior**:
  * Existing VEX ingestion extended with **reachability evidence** fields.

---

## 9. Observability, Security, Resource Model

### 9.1 Observability

* **Metrics**:
  * Analysis duration per binary,
  * NJIF size,
  * Cache hit ratio,
  * Reachability evaluation time per CVE.
* **Logs**:
  * Ghidra/rizin container logs stored alongside NJIF,
  * Unknowns logs for unresolved call targets.
* **Tracing**:
  * Each scan/analysis annotated with `scan_manifest_id` to allow end-to-end trace.

### 9.2 Security

* Tools containers:
  * No outbound network.
  * Limited to read-only artifact mount + write-only result mount.
* Binary content:
  * Treated as confidential; stored encrypted at rest if your global policy requires it.
* DSSE:
  * Authority signs:
    * Scan Manifest,
    * NJIF blob hash,
    * Reachability summary.
  * Enables “Proof-of-Integrity Graph” linkage later.

### 9.3 Resource Model

* ELF analysis can be heavy; design for:
  * Separate **worker queue** and autoscaling group for binary analysis.
  * Configurable max concurrency and per-job CPU/memory limits.
* Deep analysis (indirect calls, vtables) can be toggled via `analysis_profile`.

---

## 10. Implementation Roadmap

A pragmatic, staged plan:

### Phase 0 – Foundations (1–2 sprints)

* Create `StellaOps.Scanner.Analyzers.Binary.Elf` project.
* Implement:
  * `ElfDetector`, `ElfNormalizer`.
  * DB tables: `binary_artifacts`, `binary_njif`.
* Integrate with Scheduler and Worker pipeline.

### Phase 1 – Non-stripped ELF + NJIF v1 (2–3 sprints)

* Implement **DWARF + dynsym symbolization**.
* Implement **GhidraDisassemblyAdapter** for x86_64.
* Build **CallGraphBuilder** (direct + PLT calls).
* Implement NJIF serializer v1; store in object store.
* Basic reachability engine in WebService:
  * Only direct and PLT edges,
  * Only for DWARF-named functions.
* Integrate with Concelier function mapping via `soname!symbol`.

### Phase 2 – Stripped ELF Support (2–3 sprints)

* Implement `HeuristicFunctionFinder` for function discovery in stripped binaries.
* Extend Ghidra script to mark PLT/GOT, vtables, function pointers.
* Call graph: add:
  * `indirect-funcptr`, `indirect-vtable`, `tailcall` edges.
* Evidence tagging and local confidence scoring.
* Extend reachability engine to:
  * Distinguish `REACHABLE_CONFIRMED` vs `REACHABLE_POSSIBLE`.
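
To make the "prologue patterns" part of the Phase 2 `HeuristicFunctionFinder` concrete, here is a deliberately naive Python sketch that scans a raw `.text` blob for the classic x86_64 frame setup (`push rbp; mov rbp, rsp`). The function name `find_function_candidates`, the record shape, and the example bytes are assumptions for illustration; a real adapter would rely on the disassembler's xrefs and call targets, since byte matching alone yields both false positives and misses (frame-pointer-less code, for instance).

```python
# Classic x86_64 frame setup: push rbp (0x55); mov rbp, rsp (0x48 0x89 0xe5).
PROLOGUE = bytes([0x55, 0x48, 0x89, 0xE5])


def find_function_candidates(text: bytes, base_addr: int) -> list:
    """Return candidate function starts found in a .text blob, in address order.

    `base_addr` is the virtual address at which `text` is mapped.
    """
    candidates = []
    pos = text.find(PROLOGUE)
    while pos != -1:
        addr = base_addr + pos
        candidates.append({
            "id": f"sub_{addr:x}",   # synthetic name derived from the address
            "addr": hex(addr),
            "from": "heuristic",
        })
        pos = text.find(PROLOGUE, pos + 1)
    return candidates


# Hypothetical bytes: NOP padding, a tiny function, more padding, a second function.
EXAMPLE_TEXT = (
    b"\x90" * 16
    + PROLOGUE + b"\xc3"          # function at offset 0x10
    + b"\x90" * 11
    + PROLOGUE + b"\x5d\xc3"      # function at offset 0x20
)

for fn in find_function_candidates(EXAMPLE_TEXT, 0x401000):
    print(fn["id"], fn["addr"])
```

Candidates found this way would carry `from: "heuristic"` so that downstream confidence scoring can treat them differently from DWARF- or dynsym-derived functions.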
### Phase 3 – Multi-Arch & Performance (2–3 sprints)

* Add support for **aarch64** (Ghidra language, appropriate calling conventions).
* Optimize:
  * Binary analysis cache,
  * Tool container lifecycle,
  * Concurrent analysis.
* Add Unknowns reporting and hook it up to UnknownsRegistry (if already implemented).

### Phase 4 – Patch-Oracle Pilot (2–3 sprints)

* Implement PatchOracle module:
  * Compare old/new NJIFs,
  * Detect changed functions.
* Integrate with Concelier’s advisory graph.
* Start validating against curated CVE/patch datasets.

### Phase 5 – Hardening & Documentation

* Golden fixtures:
  * Small ELF zoo (stripped/non-stripped, PIE, static, C++, vtables).
  * Known vulnerable libs (e.g., OpenSSL, zlib) to confirm correct function mapping.
* Add CLI/demo in `StellaOps.Scanner.Cli`:
  * `stellaops scan-binary --file app --show-reachability`.
* Customer-facing and internal docs:
  * NJIF schema,
  * API usage,
  * Limitations and interpretation guidelines.

---

If you want, as a next step I can take this plan and:

* Break it into **epics / tickets** (SCAN-BINARY-xxx) with clear DoD per phase, or
* Draft the **Ghidra headless Java script** and the **.NET NJIF model classes** so your agents can plug them straight into the Scanner repo.
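
For the binary analysis cache and the "sorted keys, stable ordering" requirement on NJIF, a small Python sketch of one possible approach follows. The helper names (`normalize`, `canonical_njif_bytes`, `njif_hash`, `cache_key`), the choice of SHA-256 over a NUL-joined string for the key, and the sort order for nodes and edges are assumptions, not something the plan specifies.

```python
import hashlib
import json


def normalize(njif: dict) -> dict:
    """Sort call-graph arrays; sort_keys below orders object keys but not lists."""
    out = dict(njif)
    if "cg" in njif:
        cg = dict(njif["cg"])
        cg["nodes"] = sorted(cg.get("nodes", []), key=lambda n: n["id"])
        cg["edges"] = sorted(
            cg.get("edges", []), key=lambda e: (e["from"], e["to"], e["kind"])
        )
        out["cg"] = cg
    return out


def canonical_njif_bytes(njif: dict) -> bytes:
    """Serialize so that equal content always produces identical bytes."""
    text = json.dumps(
        normalize(njif), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def njif_hash(njif: dict) -> str:
    """SHA-256 of the canonical form, e.g. for a column like binary_njif.njif_hash."""
    return hashlib.sha256(canonical_njif_bytes(njif)).hexdigest()


def cache_key(binary_hash: str, arch: str, toolchain_digest: str, schema_version: str) -> str:
    """Digest of (binary_hash, arch, toolchain_digest, njif_schema_version)."""
    parts = [binary_hash, arch, toolchain_digest, schema_version]
    # NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


# Hypothetical values purely for demonstration.
print(cache_key("sha256:aaaa", "x86_64", "sha256:tool-1", "1.0.0"))
```

One consequence worth noting: any field that varies between otherwise identical runs (such as `provenance.timestamp_utc`) would change `njif_hash`, so a real implementation would need to decide whether such fields are hashed or kept outside the signed payload.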