feat: Add comprehensive documentation for binary reachability with PURL-resolved edges

- Introduced a detailed specification for encoding binary reachability that integrates call graphs with SBOMs.
- Defined a minimal data model including nodes, edges, and SBOM components.
- Outlined a step-by-step guide for building the reachability graph in a C#-centric manner.
- Established core domain models, including enumerations for binary formats and symbol kinds.
- Created a public API for the binary reachability service, including methods for graph building and serialization.
- Specified SBOM component resolution and binary parsing abstractions for PE, ELF, and Mach-O formats.
- Enhanced symbol normalization and digesting processes to ensure deterministic signatures.
- Included error handling, logging, and a high-level test plan to ensure robustness and correctness.
- Added non-functional requirements to guide performance, memory usage, and thread safety.
2025-11-20 23:16:02 +02:00
parent f0e74d2ee8
commit 522fff73cd
12 changed files with 4974 additions and 10 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -162,16 +162,19 @@ You will be explicitly told which role you are acting in. Your behavior must cha
 Your goals:
-1. Review new advisory files against:
+1. Review each file in the advisory directory and identify new topics or features.
-
+2. Then determine whether the topic is relevant by:
-   * Archived advisories: `docs/product-advisories/archive/*.md`.
+  2. 1. Go through the files one by one and extract the essentials first: themes, topics, architecture decisions.
-   * Implementation plans: `docs/implplan/SPRINT_*.md`.
+  2. 2. Then read each of the archive/*.md files and check whether the topic has already been advised. If it exists or is close, ignore the topic from the new advisory; otherwise keep it.
-   * Historical tasks: `docs/implplan/archived/all-tasks.md`.
+  2. 3. Check the relevant module docs: `docs/modules/<module>/*arch*.md` for compatibility or contradictions.
-2. Identify new topics or features that require implementation.
+  2. 4. Implementation plans: `docs/implplan/SPRINT_*.md`.
-3. For genuinely new items (not already implemented or planned):
+  2. 5. Historical tasks: `docs/implplan/archived/all-tasks.md`.
-
+  2. 6. For each new topic, check the SPRINT*.md files and src/* (in the corresponding modules) for an existing implementation of the same topic. If one is the same or close, ignore the topic; otherwise keep it.
-   * Check the relevant module docs: `docs/modules/<module>/*arch*.md` for compatibility or contradictions.
+  2. 7. If the topic is still genuinely new and makes sense for the product, keep it.
-   * If contradictions arise, you must surface and discuss them with the requester (in prose) and propose alignments.
+3. When all files and all genuinely new topics have been processed, present a report. The report must include:
  - all topics
  - what are the new things
  - what could be contradicting existing tasks or implementations but might make sense to implement
 4. Once scope is agreed, hand over to your **project manager** role (4.2) to define implementation sprints and tasks.
 5. **Advisory and design decision sync**:
--- a/docs/product-advisories/17-Nov-2026
+++ b/docs/product-advisories/17-Nov-2026
--- a/docs/product-advisories/17-Nov-2026
+++ b/docs/product-advisories/17-Nov-2026
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
--- a/docs/product-advisories/18-Nov-2026
+++ b/docs/product-advisories/18-Nov-2026
--- a/docs/product-advisories/20-Nov-2026
+++ b/docs/product-advisories/20-Nov-2026
--- a/docs/product-advisories/20-Nov-2026
+++ b/docs/product-advisories/20-Nov-2026
@@ -0,0 +1,768 @@
 Here’s a quick, practical heads‑up about **binary initialization routines** and why they matter for reachability and vuln triage.
 ---
 ### What’s happening before `main()`
 In ELF binaries/shared objects, the runtime linker runs **constructors** *before* `main()`:
 * `.preinit_array` → runs first (rare; honored only in the main executable, not in shared objects)
 * `.init_array` → common place for constructors (ordered by index)
 * Legacy sections: `.init` (function) and `.ctors` (older toolchains)
 * On exit you also have `.fini_array` / `.fini`
 These constructors can:
 * Register signal/atexit handlers
 * Start threads, open sockets/files, tweak `LD_PRELOAD` hooks
 * Call library code you assumed was only used later
 So if you’re doing **call‑graph reachability** for vulnerability impact, starting from only `main()` (or exported APIs) can **miss real edges** that execute at load time.
 ---
 ### What to model (synthetic roots)
 Treat the following as **synthetic entry points** in your graph:
 1. All function pointers in `.preinit_array`
 2. All function pointers in `.init_array`
 3. The symbol `_init` (if present) and legacy `.ctors` entries
 4. For completeness on teardown paths: `.fini_array`, `_fini`
 5. **Dependency constructors**: if `DT_NEEDED` libs have their own constructors, they’re roots too (even if you never call them explicitly)
 For any dynamically linked binary (PIE or not), remember that every loaded **dependency’s** init arrays run at program start or inside `dlopen()`, so model those edges across DSOs.
 ---
 ### How to extract quickly
 * **Static parse**: read `PT_DYNAMIC`, then `DT_PREINIT_ARRAY`, `DT_INIT_ARRAY`, their sizes; iterate pointers and add edges to your graph.
 * **Symbol fallback**: if `DT_INIT`/`_init` exists, add it as a root.
 * **Ordering**: preserve index order inside arrays (it can matter).
 * **Relocations**: resolve `R_X86_64_RELATIVE` (etc.) so pointers point to the real code addresses.
 Mini‑C example (constructor runs pre‑main):
 ```c
 static void __attribute__((constructor)) boot(void) {
  // vulnerable call here executes before main()
 }
 int main(){ return 0; }
 ```
 ---
 ### For Stella Ops (binary reachability)
 * **Graph seeds**: `roots = { init arrays of main ELF + all DT_NEEDED DSOs }`
 * **Policy**: mark edges from these roots as `phase=load` vs `phase=runtime`, so your explainer can say “reachable at load time.”
 * **PURLs**: attach edges to the package/node that owns the constructor symbol (DSO package purl), not just the main app.
 * **Attestation**: store the discovered root list (addresses + resolved symbols + DSO soname) in your deterministic scan manifest, so audits can replay it.
 * **Heuristics**: if `dlopen()` is detected statically (strings/symbols), add a potential root “DLOPEN_INIT[*]” bucket for libs found under common plugin dirs.
 ---
 ### Quick checklist
 * [ ] Parse `.preinit_array`, `.init_array`, `.init` (and legacy `.ctors`)
 * [ ] Resolve relocations; preserve order
 * [ ] Seed graph with these as **synthetic roots**
 * [ ] Include constructors of every `DT_NEEDED` DSO
 * [ ] Tag edges as `phase=load` for prioritization/explainability
 * [ ] Persist root list in the scan’s evidence bundle
 If you want, I can drop in a tiny .NET/ELF parser snippet or a Rust routine that walks `DT_INIT_ARRAY` and returns symbol‑resolved roots next.
 Here’s a concrete, C#‑oriented spec you can hand to a developer to implement ELF init/constructor discovery and plug it into a reachability engine like Stella Ops.
 I’ll structure it like an internal design doc:
 1. Goal / requirements
 2. Public API (what the rest of the system calls)
 3. ELF parsing details (minimal, but correct)
 4. Constructor / init routine discovery algorithm
 5. Symbol resolution (best-effort names)
 6. Dynamic deps (DT_NEEDED) and load-time roots
 7. Integration with the call graph / reachability
 8. Attestation / evidence output
 9. Implementation details and edge cases
 10. Testing strategy
 11. Suggested C# project layout
 ---
 ## 1. Goal / Requirements
 **Business goal**
 When scanning ELF binaries and shared libraries, we must model functions that run **before `main()`** or at **library load/unload** as *synthetic entry points* in the call graph:
 * `.preinit_array` (pre‑init constructors)
 * `.init_array` (constructors)
 * Legacy constructs:
  * `.ctors` array
  * `_init` (via `DT_INIT`)
 * For teardown (optional but recommended):
  * `.fini_array`
  * `_fini` (via `DT_FINI`)
 **We must:**
 * Discover all these routines in:
  * The main executable
  * All its `DT_NEEDED` shared libraries (and any DSOs subsequently loaded, if we scan them)
 * Represent them as **roots** in the reachability graph:
  * `phase = Load` for preinit/init/constructors
  * `phase = Unload` for finalizers
 * Resolve each routine to:
  * Owning binary path and SONAME
  * Virtual address in the ELF
  * Best‑effort symbol name (`_ZN...`, `my_ctor`, etc.)
  * Order/index within its array (to preserve call order)
 * Emit a structured **evidence/attestation** record so scans are replayable.
 ---
 ## 2. Public API (C#)
 ### 2.1 Data model
 Create a small domain model in a library, e.g. `StellaOps.ElfInit`:
 ```csharp
 namespace StellaOps.ElfInit;
 public enum InitRoutineKind
 {
    PreInitArray,
    InitArray,
    LegacyCtorsSection,
    LegacyInitSymbol,
    FiniArray,
    LegacyFiniSymbol
 }
 public enum InitPhase
 {
    Load,
    Unload
 }
 public sealed record InitRoutineRoot(
    string BinaryPath,           // Full path on disk
    string? Soname,              // From DT_SONAME if present
    InitRoutineKind Kind,
    InitPhase Phase,
    ulong VirtualAddress,        // VA within this ELF
    ulong? FileOffset,           // File offset (if resolved), null if unknown
    string? SymbolName,          // Best-effort name from symbol table
    int? ArrayIndex              // Index for array-based roots
 );
 ```
 ### 2.2 Discovery service
 Public entry point that other components use:
 ```csharp
 public interface IInitRoutineDiscovery
 {
    /// <summary>
    /// Discover load/unload routines (constructors) in a single ELF file
    /// and, optionally, in its DT_NEEDED dependencies.
    /// </summary>
    InitDiscoveryResult Discover(string elfPath, InitDiscoveryOptions options);
 }
 public sealed record InitDiscoveryOptions
 {
    /// <summary>
    /// If true, also discover init routines in DT_NEEDED shared libraries
    /// (using IElfDependencyResolver to locate them on disk).
    /// </summary>
    public bool IncludeDependencies { get; init; } = true;
    /// <summary>
    /// If true, include fini routines (.fini_array, DT_FINI, etc.)
    /// as unload-phase roots.
    /// </summary>
    public bool IncludeUnloadPhase { get; init; } = true;
 }
 public sealed record InitDiscoveryResult(
    IReadOnlyList<InitRoutineRoot> Roots,
    IReadOnlyList<InitRoutineError> Errors // non-fatal problems per binary
 );
 public sealed record InitRoutineError(
    string BinaryPath,
    string Message,
    Exception? Exception = null
 );
 ```
 ### 2.3 Dependency resolution
 We don’t hard‑code how to find `DT_NEEDED` libraries on disk. Define an abstraction:
 ```csharp
 public interface IElfDependencyResolver
 {
    /// <summary>
    /// Resolve SONAME (e.g. "libc.so.6") to a local file path.
    /// Returns null if not found.
    /// </summary>
    string? ResolveLibrary(string soname, string referencingBinaryPath);
 }
 ```
 The implementation can respect `LD_LIBRARY_PATH`, typical system dirs, container images, etc., but that’s outside this spec.
 `IInitRoutineDiscovery` will depend on:
 * `IElfParser`
 * `IElfDependencyResolver`
 * `ISymbolResolver` (symbol tables)
 ---
 ## 3. ELF Parsing Spec (C#‑friendly)
 You can either use a NuGet ELF library or implement a minimal in‑house parser. This spec assumes a **minimal custom parser** that supports:
 * ELF64, little‑endian
 * ET_EXEC, ET_DYN
 * x86‑64 (`e_machine == EM_X86_64`) as v1; keep architecture pluggable for later
 ### 3.1 Core types
 Create an internal parser namespace, e.g. `StellaOps.Elf`:
 ```csharp
 internal sealed class ElfFile
 {
    public string Path { get; }
    public ElfClass ElfClass { get; }
    public ElfEndianness Endianness { get; }
    public ElfHeader Header { get; }
    public IReadOnlyList<ProgramHeader> ProgramHeaders { get; }
    public IReadOnlyList<SectionHeader> SectionHeaders { get; }
    public DynamicSection? Dynamic { get; }
    public ReadOnlyMemory<byte> RawBytes { get; }
    // Helper: mapping VA -> file offset using PT_LOAD segments
    public bool TryMapVaToFileOffset(ulong virtualAddress, out ulong fileOffset);
 }
 internal enum ElfClass { Elf32, Elf64 }
 internal enum ElfEndianness { Little, Big }
 // Fill out ElfHeader / ProgramHeader / SectionHeader / DynamicEntry types
 ```
 Implementation notes:
 * Read ELF header:
  * Validate magic: `0x7F 'E' 'L' 'F'`
  * `EI_CLASS` → 32/64‑bit
  * `EI_DATA` → endianness
 * Read **program headers** (`e_phoff`, `e_phnum`).
  * Identify `PT_LOAD` (for VA→file mapping).
  * Identify `PT_DYNAMIC` (for `DynamicSection`).
 * Read **section headers** (`e_shoff`, `e_shnum`).
  * Identify sections by name: `.preinit_array`, `.init_array`, `.fini_array`, `.ctors`.
  * You need the section name string table `.shstrtab` to decode names.
 ### 3.2 Dynamic section parsing
 Define dynamic section model:
 ```csharp
 internal sealed class DynamicSection
 {
    public IReadOnlyList<DynamicEntry> Entries { get; }
    public ulong? InitFunction { get; }          // DT_INIT
    public ulong? FiniFunction { get; }          // DT_FINI
    public ulong? InitArrayAddress { get; }      // DT_INIT_ARRAY
    public ulong? InitArraySize { get; }         // DT_INIT_ARRAYSZ
    public ulong? FiniArrayAddress { get; }      // DT_FINI_ARRAY
    public ulong? FiniArraySize { get; }         // DT_FINI_ARRAYSZ
    public ulong? PreInitArrayAddress { get; }   // DT_PREINIT_ARRAY
    public ulong? PreInitArraySize { get; }      // DT_PREINIT_ARRAYSZ
    public string? Soname { get; }               // DT_SONAME (decoded via DT_STRTAB)
    public IReadOnlyList<string> Needed { get; } // DT_NEEDED list
    public ulong? StrTabAddress { get; }
    public ulong? SymTabAddress { get; }
    public ulong? StrTabSize { get; }
 }
 ```
 Implementation details:
 * Dynamic entries are at `PT_DYNAMIC.p_offset`, each `Elf64_Dyn`:
  * `d_tag` (signed 64‑bit)
  * `d_un` union (`d_val` or `d_ptr`, treat as `ulong`)
 * Tags of interest (values are from ELF spec):
  * `DT_NULL = 0`
  * `DT_NEEDED = 1`
  * `DT_STRTAB = 5`
  * `DT_SYMTAB = 6`
  * `DT_STRSZ = 10`
  * `DT_INIT = 12`
  * `DT_FINI = 13`
  * `DT_SONAME = 14`
  * `DT_INIT_ARRAY = 25`
  * `DT_FINI_ARRAY = 26`
  * `DT_INIT_ARRAYSZ = 27`
  * `DT_FINI_ARRAYSZ = 28`
  * `DT_PREINIT_ARRAY = 32`
  * `DT_PREINIT_ARRAYSZ = 33`
 * To decode SONAME and NEEDED:
  * Use `DT_STRTAB` as base VA of the dynamic string table.
  * Map VA to file offset with `TryMapVaToFileOffset`.
  * For each `DT_NEEDED` / `DT_SONAME`, treat `d_val` as an offset into that string table; read a null‑terminated UTF‑8 C‑string.
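 The tag walk above can be sketched in C as follows. This is a minimal illustration, assuming ELF64 little-endian input already loaded into memory; `scan_dynamic`, `init_array_info`, and the `TAG_*` names are hypothetical and not part of the C# API in this spec:
 ```c
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 /* Tag values from the ELF specification (same numbers as the list above). */
 enum { TAG_NULL = 0, TAG_INIT_ARRAY = 25, TAG_INIT_ARRAYSZ = 27 };

 /* Hypothetical result type: which of the two tags were seen, and their values. */
 struct init_array_info {
     int has_addr, has_size;
     uint64_t addr, size;
 };

 /* Decode a little-endian 64-bit value without assuming host byte order. */
 static uint64_t load_le64(const uint8_t *p)
 {
     uint64_t v = 0;
     for (int i = 7; i >= 0; i--)
         v = (v << 8) | p[i];
     return v;
 }

 /* Walk Elf64_Dyn entries (16 bytes each: d_tag, then d_un) until DT_NULL
  * or the end of the buffer, recording DT_INIT_ARRAY / DT_INIT_ARRAYSZ. */
 static struct init_array_info scan_dynamic(const uint8_t *dyn, size_t len)
 {
     assert(dyn != NULL || len == 0);
     struct init_array_info info = {0, 0, 0, 0};
     for (size_t off = 0; off + 16 <= len; off += 16) {
         int64_t tag = (int64_t)load_le64(dyn + off);
         uint64_t val = load_le64(dyn + off + 8);
         if (tag == TAG_NULL)
             break; /* terminator: anything after it is not part of the table */
         if (tag == TAG_INIT_ARRAY) {
             info.addr = val;
             info.has_addr = 1;
         } else if (tag == TAG_INIT_ARRAYSZ) {
             info.size = val;
             info.has_size = 1;
         }
     }
     return info;
 }
 ```
 A caller would pass the bytes found at `PT_DYNAMIC.p_offset`; the same loop extends to the other tags in the list by adding cases.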
 ---
 ## 4. Constructor & Init Routine Discovery
 We now define the algorithm implemented by `InitRoutineDiscovery` for a **single ELF file**.
 High‑level steps:
 1. Parse `ElfFile`.
 2. Parse `DynamicSection`.
 3. Resolve:
   * Pre‑init array (`DT_PREINIT_ARRAY`, `.preinit_array`)
   * Init array (`DT_INIT_ARRAY`, `.init_array`)
   * Legacy `.ctors`
   * `_init`, `_fini` via `DT_INIT`/`DT_FINI`
   * Fini array (`DT_FINI_ARRAY`, `.fini_array`)
 4. For each VA, optionally resolve symbol name.
 5. Build `InitRoutineRoot` entries.
 ### 4.1 Pointer size & endianness
 * For ELF64:
  * Pointer size = 8 bytes.
 * For ELF32:
  * Pointer size = 4 bytes (if/when you support it).
 * Use `BinaryPrimitives.ReadUInt64LittleEndian` or `ReadUInt64BigEndian` depending on `ElfEndianness` (and `ReadUInt32LittleEndian` / `ReadUInt32BigEndian` for ELF32).
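 The same decoding rule, shown as a small C sketch that assembles the value byte by byte so it does not depend on the host's byte order. The name `read_elf_pointer` and its parameters are illustrative only:
 ```c
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 /* Hypothetical helper: decode one pointer-sized entry from raw ELF bytes.
  * ptr_size is 8 for ELF64 and 4 for ELF32; little_endian mirrors EI_DATA. */
 static uint64_t read_elf_pointer(const uint8_t *bytes, size_t ptr_size,
                                  int little_endian)
 {
     assert(bytes != NULL && (ptr_size == 4 || ptr_size == 8));
     uint64_t value = 0;
     for (size_t i = 0; i < ptr_size; i++) {
         /* Little-endian: byte i is the i-th least significant byte.
          * Big-endian: byte i is the i-th most significant byte. */
         size_t shift = little_endian ? i : (ptr_size - 1 - i);
         value |= (uint64_t)bytes[i] << (8 * shift);
     }
     return value;
 }
 ```
 Widening 32-bit entries to `uint64_t` lets the rest of the pipeline use a single address type for both ELF classes.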
 ### 4.2 Mapping VA → file offset
 `ElfFile.TryMapVaToFileOffset`:
 * Iterate `ProgramHeaders` with `p_type == PT_LOAD`.
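 * If `virtualAddress` in `[p_vaddr, p_vaddr + p_filesz)` (use `p_filesz`, not `p_memsz`: bytes past `p_filesz`, such as `.bss`, have no backing in the file):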
  * `fileOffset = p_offset + (virtualAddress - p_vaddr)`
 * Return false if no matching segment.
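 A minimal C sketch of this mapping, under the assumption that the caller has already filtered the program headers down to `PT_LOAD` entries. The `load_segment` struct and `va_to_file_offset` are hypothetical names. The range check is bounded by `p_filesz` so that addresses with no bytes in the file are rejected:
 ```c
 #include <assert.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdint.h>

 /* Illustrative subset of an Elf64_Phdr, for PT_LOAD segments only. */
 struct load_segment {
     uint64_t p_offset; /* file offset of the segment */
     uint64_t p_vaddr;  /* virtual address of the segment */
     uint64_t p_filesz; /* bytes of the segment present in the file */
 };

 /* Map a virtual address to a file offset; returns false when no
  * PT_LOAD segment backs that address with file bytes. */
 static bool va_to_file_offset(const struct load_segment *segs, size_t count,
                               uint64_t va, uint64_t *file_offset)
 {
     assert(file_offset != NULL);
     for (size_t i = 0; i < count; i++) {
         const struct load_segment *s = &segs[i];
         /* Written as a subtraction so p_vaddr + p_filesz cannot overflow. */
         if (va >= s->p_vaddr && va - s->p_vaddr < s->p_filesz) {
             *file_offset = s->p_offset + (va - s->p_vaddr);
             return true;
         }
     }
     return false;
 }
 ```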
 ### 4.3 Reading init arrays
 Generic helper:
 ```csharp
 internal static IReadOnlyList<ulong> ReadPointerArray(
    ElfFile elf,
    ulong arrayVa,
    ulong arrayBytes)
 {
    var results = new List<ulong>();
    if (!elf.TryMapVaToFileOffset(arrayVa, out var fileOffset))
        return results;
    int pointerSize = elf.ElfClass == ElfClass.Elf64 ? 8 : 4;
    int count = (int)(arrayBytes / (ulong)pointerSize);
    var span = elf.RawBytes.Span;
    for (int i = 0; i < count; i++)
    {
        ulong offset = fileOffset + (ulong)(i * pointerSize);
        if (offset + (ulong)pointerSize > (ulong)span.Length)
            break;
        ulong pointerValue = elf.Endianness switch
        {
            ElfEndianness.Little when pointerSize == 8
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt64LittleEndian(span[(int)offset..]),
            ElfEndianness.Little
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt32LittleEndian(span[(int)offset..]),
            ElfEndianness.Big when pointerSize == 8
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt64BigEndian(span[(int)offset..]),
            _ // Big, 32-bit
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt32BigEndian(span[(int)offset..]),
        };
        if (pointerValue != 0)
            results.Add(pointerValue);
    }
    return results;
 }
 ```
 Apply to:
 * Pre‑init: if `Dynamic.PreInitArrayAddress` and `Dynamic.PreInitArraySize` present.
 * Init: if `Dynamic.InitArrayAddress` and `Dynamic.InitArraySize` present.
 * Fini: if `Dynamic.FiniArrayAddress` and `Dynamic.FiniArraySize` present.
 ### 4.4 Legacy `.ctors` section
 Fallback for older toolchains:
 * Find section with `Name == ".ctors"`.
 * Its contents are just an array of pointers (same pointer size as ELF).
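 * Some toolchains include a sentinel at the beginning or end (GCC's startup files typically place `-1` at the start and `0` at the end). Treat:
  * `0` or `0xFFFFFFFFFFFFFFFF` (for 64‑bit) as sentinel; skip them.
 * `.ctors` entries are traditionally executed in reverse order (last entry first), so record the index but do not assume it equals call order.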
 * Use similar `ReadPointerArray` logic but starting from `sh_offset` rather than a VA.
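 The sentinel handling can be sketched in C like this. It assumes the section has already been decoded into 64-bit values (32-bit entries zero-extended); `filter_ctors` is a hypothetical name:
 ```c
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 /* Copy .ctors entries into out[], skipping the 0 and all-ones sentinels.
  * ptr_size is 4 or 8; the all-ones sentinel is 0xFFFFFFFF or
  * 0xFFFFFFFFFFFFFFFF accordingly. out[] must have room for count entries.
  * Returns the number of real constructor pointers written. */
 static size_t filter_ctors(const uint64_t *entries, size_t count,
                            size_t ptr_size, uint64_t *out)
 {
     assert(ptr_size == 4 || ptr_size == 8);
     const uint64_t all_ones = (ptr_size == 8) ? UINT64_MAX : UINT32_MAX;
     size_t kept = 0;
     for (size_t i = 0; i < count; i++) {
         if (entries[i] == 0 || entries[i] == all_ones)
             continue; /* sentinel, not a function pointer */
         out[kept++] = entries[i];
     }
     return kept;
 }
 ```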
 ### 4.5 `_init` / `_fini` functions
 * `Dynamic.InitFunction` (from `DT_INIT`) is a single VA.
 * `Dynamic.FiniFunction` (from `DT_FINI`) likewise.
 Even if arrays exist, these may also be present; treat them as **independent roots**.
 ---
 ## 5. Symbol Resolution (best‑effort names)
 Define interface:
 ```csharp
 public interface ISymbolResolver
 {
    /// <summary>
    /// Find the symbol whose address matches `virtualAddress` exactly,
    /// or, if not found, the closest preceding symbol (with an offset).
    /// </summary>
    SymbolInfo? ResolveSymbol(ElfFile elf, ulong virtualAddress);
 }
 public sealed record SymbolInfo(
    string Name,
    ulong Value,
    ulong Size
 );
 ```
 Implementation sketch:
 * Use `.dynsym` (dynamic symbol table), and `.symtab` (full symbol table) if available.
 * Each symbol entry includes:
  * Name offset in string table
  * Value (VA)
  * Size
  * Type/binding (function, object, etc.)
 * Build an in‑memory index (e.g. sorted by `Value`) per ELF file.
 * `ResolveSymbol`:
  * Prefer exact match of `Value`.
  * If none, find symbol with largest `Value` less than `virtualAddress` and treat as “nearest symbol + offset”.
  * You can show just `Name` or `Name+0xOFFSET` in explanations; for `InitRoutineRoot` we store plain `Name`.
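 One way to implement the "exact match, else nearest preceding symbol" lookup is a binary search over a table sorted by address. The C sketch below assumes such a sorted table has already been built from `.dynsym`/`.symtab`; the `symbol` struct and `resolve_symbol` are hypothetical names:
 ```c
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 struct symbol {
     const char *name;
     uint64_t value; /* start VA */
     uint64_t size;
 };

 /* syms[] must be sorted by ascending value. Returns the symbol with the
  * largest value <= va (so an exact match wins when one exists), or NULL
  * when va lies below every symbol. *offset receives va - value. */
 static const struct symbol *resolve_symbol(const struct symbol *syms,
                                            size_t count, uint64_t va,
                                            uint64_t *offset)
 {
     assert(offset != NULL);
     size_t lo = 0, hi = count;
     while (lo < hi) {
         size_t mid = lo + (hi - lo) / 2;
         if (syms[mid].value <= va)
             lo = mid + 1; /* candidate is at mid or to its right */
         else
             hi = mid;
     }
     if (lo == 0)
         return NULL;
     *offset = va - syms[lo - 1].value;
     return &syms[lo - 1];
 }
 ```
 A stricter variant could also reject matches where the offset exceeds the symbol's `size`; this sketch leaves that decision to the caller.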
 ---
 ## 6. Dynamic Dependencies & Load-Time Roots
 When `InitDiscoveryOptions.IncludeDependencies == true`:
 1. For root ELF:
   * Discover its roots as above.
 2. For each `neededSoname` in `Dynamic.Needed`:
   * Ask `IElfDependencyResolver.ResolveLibrary(neededSoname, rootElfPath)`.
   * If it returns a path not yet processed:
     * Parse this ELF and recursively discover its roots.
 3. Return a **flat list** of all `InitRoutineRoot` objects, but with their own `BinaryPath`/`Soname`.
 Important: **We do not implicitly model `dlopen()`** at this stage. That’s separate:
 * As an optional heuristic, if the binary imports `dlopen`, tag those DSOs so later we can add “potential plugin load” roots. You can park this as a TODO in the comments.
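 The recursive walk in steps 1 to 3 needs a visited set so that shared or mutually dependent libraries are processed once. A minimal C sketch, assuming dependency resolution has already turned each `DT_NEEDED` name into an index in an in-memory table (the `binary` struct, `MAX_DEPS`, and `collect_binaries` are all hypothetical):
 ```c
 #include <assert.h>
 #include <stddef.h>

 #define MAX_DEPS 8

 /* Hypothetical view of already-resolved binaries: needed[] holds indices
  * into the same table, and -1 terminates the list. */
 struct binary {
     const char *path;
     int needed[MAX_DEPS];
 };

 /* Depth-first walk from root. Each binary is visited once, even when
  * several libraries depend on it or the dependency graph has a cycle.
  * Appends visited indices to order[] and returns the new count. */
 static size_t collect_binaries(const struct binary *table, int root,
                                unsigned char *visited, int *order,
                                size_t count)
 {
     assert(root >= 0);
     if (visited[root])
         return count;
     visited[root] = 1;
     order[count++] = root;
     for (int i = 0; i < MAX_DEPS && table[root].needed[i] >= 0; i++)
         count = collect_binaries(table, table[root].needed[i],
                                  visited, order, count);
     return count;
 }
 ```
 Init-routine discovery would then run once per index in `order[]`, producing the flat root list.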
 ---
 ## 7. Call Graph / Reachability Integration
 This depends on your existing modeling, but here’s a generic spec a C# dev can follow.
 Assume there is an internal model:
 ```csharp
 public sealed class CallGraph
 {
    public Node GetOrCreateNode(string binaryPath, ulong virtualAddress, string? symbolName);
    public Node GetOrCreateSyntheticRoot(string rootId, string description);
    public void AddEdge(Node from, Node to, CallEdgeMetadata metadata);
 }
 public sealed record CallEdgeMetadata(
    string EdgeKind,   // e.g. "loader-init"
    InitPhase Phase,   // Load / Unload
    InitRoutineKind InitKind,
    int? ArrayIndex
 );
 ```
 ### 7.1 Synthetic loader node
 Create a single graph node representing the dynamic loader / program start:
 ```csharp
 var loaderNode = callGraph.GetOrCreateSyntheticRoot(
    "LOADER",
    "ELF dynamic loader / process start"
 );
 ```
 ### 7.2 Adding edges for each root
 For each `InitRoutineRoot root`:
 1. Get or create a node for the target function:
   ```csharp
   var target = callGraph.GetOrCreateNode(
       root.BinaryPath,
       root.VirtualAddress,
       root.SymbolName
   );
   ```
 2. Add edge from loader:
   ```csharp
   callGraph.AddEdge(
       loaderNode,
       target,
       new CallEdgeMetadata(
           EdgeKind: "loader-init",
           Phase: root.Phase,
           InitKind: root.Kind,
           ArrayIndex: root.ArrayIndex
       )
   );
   ```
 3. Optional: If you model **per‑library** loader nodes, you can add:
   * `LOADER -> libLoaderNode`
   * `libLoaderNode -> each constructor`
   but that’s a nice‑to‑have, not required.
 ### 7.3 Phases
 * For `.preinit_array`, `.init_array`, `.ctors`, `_init`:
  * `Phase = InitPhase.Load`
 * For `.fini_array`, `_fini`:
  * `Phase = InitPhase.Unload`
 This allows downstream UI to say e.g.:
 > This vulnerable function is reachable at **load time** via constructor `foo()` in `libbar.so`.
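 The kind-to-phase rule is a pure function of the routine kind. A C sketch mirroring the C# enums from section 2.1 (the `KIND_*` / `PHASE_*` spellings and `phase_for_kind` are illustrative):
 ```c
 #include <assert.h>

 enum init_routine_kind {
     KIND_PREINIT_ARRAY,
     KIND_INIT_ARRAY,
     KIND_LEGACY_CTORS_SECTION,
     KIND_LEGACY_INIT_SYMBOL,
     KIND_FINI_ARRAY,
     KIND_LEGACY_FINI_SYMBOL
 };

 enum init_phase { PHASE_LOAD, PHASE_UNLOAD };

 /* Finalizers run at unload; every other kind runs at load time. */
 static enum init_phase phase_for_kind(enum init_routine_kind kind)
 {
     assert(kind <= KIND_LEGACY_FINI_SYMBOL);
     switch (kind) {
     case KIND_FINI_ARRAY:
     case KIND_LEGACY_FINI_SYMBOL:
         return PHASE_UNLOAD;
     default:
         return PHASE_LOAD;
     }
 }
 ```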
 ---
 ## 8. Attestation / Evidence Output
 We want deterministic, auditable output per scan.
 Define a JSON schema (C# record) stored alongside other scan artifacts:
 ```csharp
 public sealed record InitRoutineEvidence(
    string ScannerVersion,
    DateTimeOffset ScanTimeUtc,
    IReadOnlyList<InitRoutineEvidenceEntry> Entries
 );
 public sealed record InitRoutineEvidenceEntry(
    string BinaryPath,
    string? Soname,
    InitRoutineKind Kind,
    InitPhase Phase,
    ulong VirtualAddress,
    ulong? FileOffset,
    string? SymbolName,
    int? ArrayIndex
 );
 ```
 Implementation details:
 * After `IInitRoutineDiscovery.Discover` completes:
  * Convert each `InitRoutineRoot` to `InitRoutineEvidenceEntry`.
  * Serialize with `System.Text.Json` using one fixed property naming convention (camelCase or snake_case) and a stable entry order, so identical inputs yield identical output apart from `ScanTimeUtc`.
 * Store the evidence file e.g. `init_roots.json` inside the scan’s result directory.
 ---
 ## 9. Implementation Details & Edge Cases
 ### 9.1 Architectures
 First version:
 * Support:
  * `ElfClass.Elf64`
  * `ElfEndianness.Little`
  * `EM_X86_64`
 * For anything else:
  * Log an `InitRoutineError` and skip (but don’t hard‑fail the whole scan).
 Design the parser so architecture is an enum:
 ```csharp
 internal enum ElfMachine : ushort
 {
    X86_64 = 62,
    // others later
 }
 ```
 ### 9.2 Relocations (simplification)
 Real loaders apply relocations to constructor arrays; some pointers may be stored as relative relocations.
 For **v1 implementation**:
 * Assume that:
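  * Array entries are already absolute VAs in the ELF’s address space (typical for non‑PIE executables). For PIE/DSO output this can fail: some linkers leave the on‑disk slot as `0` and carry the target only in the relocation addend, in which case v1 reports no roots for that array.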
 * If you need better fidelity later:
  * Parse `.rela.dyn` / `.rel.dyn`.
  * Apply `R_X86_64_RELATIVE` relocations whose `r_offset` falls within the array’s address range:
    * Effective address = (base address + addend); if you treat base as 0, you get a VA that’s correct **within the file** (relative).
 Document this as a TODO so later you can extend without breaking the API.
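 A possible shape for that later extension, sketched in C under the assumptions stated above: x86-64 only, `RELA`-style relocations, 8-byte array slots, and load base treated as 0. The `rela64` struct mirrors the three fields of `Elf64_Rela`; `apply_relative_relocs` is a hypothetical name:
 ```c
 #include <assert.h>
 #include <stddef.h>
 #include <stdint.h>

 #define RELOC_X86_64_RELATIVE 8u /* numeric value of R_X86_64_RELATIVE */

 /* Same three fields as Elf64_Rela. */
 struct rela64 {
     uint64_t r_offset; /* VA of the slot being relocated */
     uint64_t r_info;   /* low 32 bits: relocation type */
     int64_t r_addend;  /* for RELATIVE: target VA relative to load base */
 };

 /* Overwrite slots[] (the array located at VA array_va) with the addend of
  * every RELATIVE relocation that targets one of its slots. With base 0 the
  * results are file-relative VAs. Returns how many slots were patched. */
 static size_t apply_relative_relocs(const struct rela64 *relocs,
                                     size_t nrelocs, uint64_t array_va,
                                     uint64_t *slots, size_t nslots)
 {
     assert(slots != NULL || nslots == 0);
     size_t patched = 0;
     for (size_t i = 0; i < nrelocs; i++) {
         if ((relocs[i].r_info & 0xffffffffu) != RELOC_X86_64_RELATIVE)
             continue;
         if (relocs[i].r_offset < array_va)
             continue;
         uint64_t delta = relocs[i].r_offset - array_va;
         if (delta % 8 != 0 || delta / 8 >= nslots)
             continue; /* misaligned or outside the array */
         slots[delta / 8] = (uint64_t)relocs[i].r_addend;
         patched++;
     }
     return patched;
 }
 ```
 Running this after the raw array read and before zero entries are dropped would let relocated slots replace the placeholder values.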
 ### 9.3 Error handling
 * All parsing errors **must be non‑fatal** to the overall scan:
  * Record `InitRoutineError` with `BinaryPath`, message, and exception.
  * Continue with other binaries.
 * If a binary is not ELF or has invalid magic:
  * Return no roots, but optionally log a low‑severity error.
 ---
 ## 10. Testing Strategy
 ### 10.1 Unit tests with synthetic ELF fixtures
 Create a small test project `StellaOps.ElfInit.Tests` with known ELF files checked into test resources:
 * Binaries compiled with small C programs like:
  ```c
  static void __attribute__((constructor)) c1(void) {}
  static void __attribute__((constructor)) c2(void) {}
  static void __attribute__((destructor)) d1(void) {}
  int main() { return 0; }
  ```
 * Variants:
  * Using `.ctors` (old GCC flags) for legacy coverage.
  * Shared library with `__attribute__((constructor))` and `DT_NEEDED` from a main binary.
  * Binary with no constructors (expect zero roots).
 Assertions:
 * The count of `InitRoutineRoot` matches expected.
 * `Kind` and `Phase` are correct.
 * `ArrayIndex` is correctly ordered: 0,1,2 …
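 * `SymbolName` contains the expected function names (`c1`, `c2`, `d1`; C names are not mangled). These functions are `static`, so they appear only in `.symtab`, and only if the binary is not stripped.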
 * For dependencies:
  * Discover roots in `libfoo.so` when main depends on it via `DT_NEEDED`.
 ### 10.2 Integration tests with call graph
 * Given a small binary and a known vulnerable function reachable from a constructor:
  * Run full pipeline.
  * Assert that the vulnerable function is marked reachable from synthetic `LOADER` node via the constructor.
 ### 10.3 Fuzz / robustness
 * Run the discovery on:
  * Random non‑ELF files.
  * Truncated ELF files.
  * Very large binaries.
 * Ensure no unhandled exceptions; only `InitRoutineError` entries.
 ---
 ## 11. Suggested C# Project Layout
 ```text
 src/
  StellaOps.ElfInit/
    IInitRoutineDiscovery.cs
    InitRoutineModels.cs
    InitRoutineDiscovery.cs
    IElfDependencyResolver.cs
    ISymbolResolver.cs
    Evidence/
      InitRoutineEvidence.cs
    Elf/
      ElfFile.cs
      ElfParser.cs
      ElfHeader.cs
      ProgramHeader.cs
      SectionHeader.cs
      DynamicSection.cs
      VaMapper.cs
      PointerArrayReader.cs
 tests/
  StellaOps.ElfInit.Tests/
    Resources/
      sample_no_ctor
      sample_init_array
      sample_preinit_init_fini
      sample_with_deps_main
      libsample_ctor.so
    InitRoutineDiscoveryTests.cs
 ```
 ---
 If you’d like, I can next:
 * Draft `InitRoutineDiscovery` in C# with full method bodies, or
 * Provide a minimal `ElfFile`/`ElfParser` implementation skeleton you can fill in.
--- a/docs/product-advisories/20-Nov-2026
+++ b/docs/product-advisories/20-Nov-2026
--- a/docs/product-advisories/20-Nov-2026
+++ b/docs/product-advisories/20-Nov-2026