feat: Add comprehensive documentation for binary reachability with PURL-resolved edges

- Introduced a detailed specification for encoding binary reachability that integrates call graphs with SBOMs.
- Defined a minimal data model including nodes, edges, and SBOM components.
- Outlined a step-by-step guide for building the reachability graph in a C#-centric manner.
- Established core domain models, including enumerations for binary formats and symbol kinds.
- Created a public API for the binary reachability service, including methods for graph building and serialization.
- Specified SBOM component resolution and binary parsing abstractions for PE, ELF, and Mach-O formats.
- Enhanced symbol normalization and digesting processes to ensure deterministic signatures.
- Included error handling, logging, and a high-level test plan to ensure robustness and correctness.
- Added non-functional requirements to guide performance, memory usage, and thread safety.
2025-11-20 23:16:02 +02:00
parent f0e74d2ee8
commit 522fff73cd
12 changed files with 4974 additions and 10 deletions

@@ -162,16 +162,19 @@ You will be explicitly told which role you are acting in. Your behavior must cha
 Your goals:
-1. Review new advisory files against:
-   * Archived advisories: `docs/product-advisories/archive/*.md`.
-   * Implementation plans: `docs/implplan/SPRINT_*.md`.
-   * Historical tasks: `docs/implplan/archived/all-tasks.md`.
-2. Identify new topics or features that require implementation.
-3. For genuinely new items (not already implemented or planned):
-   * Check the relevant module docs: `docs/modules/<module>/*arch*.md` for compatibility or contradictions.
-   * If contradictions arise, you must surface and discuss them with the requester (in prose) and propose alignments.
+1. Review each file in the advisory directory and Identify new topics or features.
+2. Then determine whether the topic is relevant by:
+2. 1. Go one by one the files and extract the essentials first - themes, topics, architecture decions
+2. 2. Then read each of the archive/*.md files and seek if these are already had been advised. If it exists or it is close - then ignore the topic from the new advisory. Else keep it.
+2. 3. Check the relevant module docs: `docs/modules/<module>/*arch*.md` for compatibility or contradictions.
+2. 4. Implementation plans: `docs/implplan/SPRINT_*.md`.
+2. 5. Historical tasks: `docs/implplan/archived/all-tasks.md`.
+2. 4. For all of the new topics - then go in SPRINT*.md files and src/* (in according modules) for possible already implementation on the same topic. If same or close - ignore it. Otherwise keep it.
+2. 5. In case still genuine new topic - and it makes sense for the product - keep it.
+3. When done for all files and all new genuine topics - present a report. Report must include:
+ - all topics
+ - what are the new things
+ - what could be contracting existing tasks or implementations but might make sense to implemnt
 4. Once scope is agreed, hand over to your **project manager** role (4.2) to define implementation sprints and tasks.
 5. **Advisory and design decision sync**:

@@ -0,0 +1,768 @@
Here's a quick, practical heads-up about **binary initialization routines** and why they matter for reachability and vuln triage.
---
### What's happening before `main()`
In ELF binaries/shared objects, the runtime linker runs **constructors** *before* `main()`:
* `.preinit_array` → runs first (rare; honored only in the main executable, not in shared objects)
* `.init_array` → common place for constructors (ordered by index)
* Legacy sections: `.init` (function) and `.ctors` (older toolchains)
* On exit you also have `.fini_array` / `.fini`
These constructors can:
* Register signal/atexit handlers
* Start threads, open sockets/files, tweak `LD_PRELOAD` hooks
* Call library code you assumed was only used later
So if you're doing **call-graph reachability** for vulnerability impact, starting from only `main()` (or exported APIs) can **miss real edges** that execute at load time.
---
### What to model (synthetic roots)
Treat the following as **synthetic entry points** in your graph:
1. All function pointers in `.preinit_array`
2. All function pointers in `.init_array`
3. The symbol `_init` (if present) and legacy `.ctors` entries
4. For completeness on teardown paths: `.fini_array`, `_fini`
5. **Dependency constructors**: if `DT_NEEDED` libs have their own constructors, they're roots too (even if you never call them explicitly)
For PIE/DSO builds, remember that every loaded **dependency's** init arrays run as part of `dlopen()`/program start; model those edges across DSOs.
---
### How to extract quickly
* **Static parse**: read `PT_DYNAMIC`, then `DT_PREINIT_ARRAY`, `DT_INIT_ARRAY`, their sizes; iterate pointers and add edges to your graph.
* **Symbol fallback**: if `DT_INIT`/`_init` exists, add it as a root.
* **Ordering**: preserve index order inside arrays (it can matter).
* **Relocations**: resolve `R_X86_64_RELATIVE` (etc.) so pointers point to the real code addresses.
Mini-C example (constructor runs pre-main):
```c
static void __attribute__((constructor)) boot(void) {
    // vulnerable call here executes before main()
}

int main() { return 0; }
```
---
### For StellaOps (binary reachability)
* **Graph seeds**: `roots = { init arrays of main ELF + all DT_NEEDED DSOs }`
* **Policy**: mark edges from these roots as `phase=load` vs `phase=runtime`, so your explainer can say “reachable at load time.”
* **PURLs**: attach edges to the package/node that owns the constructor symbol (DSO package purl), not just the main app.
* **Attestation**: store the discovered root list (addresses + resolved symbols + DSO soname) in your deterministic scan manifest, so audits can replay it.
* **Heuristics**: if `dlopen()` is detected statically (strings/symbols), add a potential root “DLOPEN_INIT[*]” bucket for libs found under common plugin dirs.
---
### Quick checklist
* [ ] Parse `.preinit_array`, `.init_array`, `.init` (and legacy `.ctors`)
* [ ] Resolve relocations; preserve order
* [ ] Seed graph with these as **synthetic roots**
* [ ] Include constructors of every `DT_NEEDED` DSO
* [ ] Tag edges as `phase=load` for prioritization/explainability
* [ ] Persist root list in the scan's evidence bundle
If you want, I can drop in a tiny .NET/ELF parser snippet or a Rust routine that walks `DT_INIT_ARRAY` and returns symbol-resolved roots next.
Here's a concrete, C#-oriented spec you can hand to a developer to implement ELF init/constructor discovery and plug it into a reachability engine like StellaOps.
I'll structure it like an internal design doc:
1. Goal / requirements
2. Public API (what the rest of the system calls)
3. ELF parsing details (minimal, but correct)
4. Constructor / init routine discovery algorithm
5. Symbol resolution (best-effort names)
6. Dynamic deps (DT_NEEDED) and load-time roots
7. Integration with the call graph / reachability
8. Attestation / evidence output
9. Implementation details and edge cases
10. Testing strategy
11. Suggested project layout
---
## 1. Goal / Requirements
**Business goal**
When scanning ELF binaries and shared libraries, we must model functions that run **before `main()`** or at **library load/unload** as *synthetic entry points* in the call graph:
* `.preinit_array` (pre-init constructors)
* `.init_array` (constructors)
* Legacy constructs:
  * `.ctors` array
  * `_init` (via `DT_INIT`)
* For teardown (optional but recommended):
  * `.fini_array`
  * `_fini` (via `DT_FINI`)
**We must:**
* Discover all these routines in:
  * The main executable
  * All its `DT_NEEDED` shared libraries (and any DSOs subsequently loaded, if we scan them)
* Represent them as **roots** in the reachability graph:
  * `phase = Load` for pre-init/init/constructors
  * `phase = Unload` for finalizers
* Resolve each routine to:
  * Owning binary path and SONAME
  * Virtual address in the ELF
  * Best-effort symbol name (`_ZN...`, `my_ctor`, etc.)
  * Order/index within its array (to preserve call order)
* Emit a structured **evidence/attestation** record so scans are replayable.
---
## 2. Public API (C#)
### 2.1 Data model
Create a small domain model in a library, e.g. `StellaOps.ElfInit`:
```csharp
namespace StellaOps.ElfInit;

public enum InitRoutineKind
{
    PreInitArray,
    InitArray,
    LegacyCtorsSection,
    LegacyInitSymbol,
    FiniArray,
    LegacyFiniSymbol
}

public enum InitPhase
{
    Load,
    Unload
}

public sealed record InitRoutineRoot(
    string BinaryPath,        // Full path on disk
    string? Soname,           // From DT_SONAME if present
    InitRoutineKind Kind,
    InitPhase Phase,
    ulong VirtualAddress,     // VA within this ELF
    ulong? FileOffset,        // File offset (if resolved), null if unknown
    string? SymbolName,       // Best-effort name from symbol table
    int? ArrayIndex           // Index for array-based roots
);
```
### 2.2 Discovery service
Public entry point that other components use:
```csharp
public interface IInitRoutineDiscovery
{
    /// <summary>
    /// Discover load/unload routines (constructors) in a single ELF file
    /// and, optionally, in its DT_NEEDED dependencies.
    /// </summary>
    InitDiscoveryResult Discover(string elfPath, InitDiscoveryOptions options);
}

public sealed record InitDiscoveryOptions
{
    /// <summary>
    /// If true, also discover init routines in DT_NEEDED shared libraries
    /// (using IElfDependencyResolver to locate them on disk).
    /// </summary>
    public bool IncludeDependencies { get; init; } = true;

    /// <summary>
    /// If true, include fini routines (.fini_array, DT_FINI, etc.)
    /// as unload-phase roots.
    /// </summary>
    public bool IncludeUnloadPhase { get; init; } = true;
}

public sealed record InitDiscoveryResult(
    IReadOnlyList<InitRoutineRoot> Roots,
    IReadOnlyList<InitRoutineError> Errors // non-fatal problems per binary
);

public sealed record InitRoutineError(
    string BinaryPath,
    string Message,
    Exception? Exception = null
);
```
### 2.3 Dependency resolution
We don't hardcode how to find `DT_NEEDED` libraries on disk. Define an abstraction:
```csharp
public interface IElfDependencyResolver
{
    /// <summary>
    /// Resolve SONAME (e.g. "libc.so.6") to a local file path.
    /// Returns null if not found.
    /// </summary>
    string? ResolveLibrary(string soname, string referencingBinaryPath);
}
```
The implementation can respect `LD_LIBRARY_PATH`, typical system dirs, container images, etc., but that's outside this spec.
`IInitRoutineDiscovery` will depend on:
* `IElfParser`
* `IElfDependencyResolver`
* `ISymbolResolver` (symbol tables)
---
## 3. ELF Parsing Spec (C#-friendly)
You can either use a NuGet ELF library or implement a minimal in-house parser. This spec assumes a **minimal custom parser** that supports:
* ELF64, little-endian
* ET_EXEC, ET_DYN
* x86-64 (`e_machine == EM_X86_64`) as v1; keep architecture pluggable for later
### 3.1 Core types
Create an internal parser namespace, e.g. `StellaOps.Elf`:
```csharp
internal sealed class ElfFile
{
    public string Path { get; }
    public ElfClass ElfClass { get; }
    public ElfEndianness Endianness { get; }
    public ElfHeader Header { get; }
    public IReadOnlyList<ProgramHeader> ProgramHeaders { get; }
    public IReadOnlyList<SectionHeader> SectionHeaders { get; }
    public DynamicSection? Dynamic { get; }
    public ReadOnlyMemory<byte> RawBytes { get; }

    // Helper: mapping VA -> file offset using PT_LOAD segments
    public bool TryMapVaToFileOffset(ulong virtualAddress, out ulong fileOffset);
}

internal enum ElfClass { Elf32, Elf64 }
internal enum ElfEndianness { Little, Big }

// Fill out ElfHeader / ProgramHeader / SectionHeader / DynamicEntry types
```
Implementation notes:
* Read ELF header:
  * Validate magic: `0x7F 'E' 'L' 'F'`
  * `EI_CLASS` → 32/64-bit
  * `EI_DATA` → endianness
* Read **program headers** (`e_phoff`, `e_phnum`).
  * Identify `PT_LOAD` (for VA→file mapping).
  * Identify `PT_DYNAMIC` (for `DynamicSection`).
* Read **section headers** (`e_shoff`, `e_shnum`).
  * Identify sections by name: `.preinit_array`, `.init_array`, `.fini_array`, `.ctors`.
  * You need the section name string table `.shstrtab` to decode names.
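The header checks above can be sketched roughly as follows. This is a minimal illustration that assumes the whole file is already in memory; `ElfIdent` and `TryReadIdent` are hypothetical names, not part of the spec, and the enums are redeclared as public only so the snippet stands alone.

```csharp
using System;

public enum ElfClass { Elf32, Elf64 }
public enum ElfEndianness { Little, Big }

public static class ElfIdent
{
    // e_ident layout: bytes 0-3 magic, byte 4 EI_CLASS, byte 5 EI_DATA.
    public static bool TryReadIdent(
        ReadOnlySpan<byte> file, out ElfClass elfClass, out ElfEndianness endianness)
    {
        elfClass = default;
        endianness = default;

        // e_ident is 16 bytes; anything shorter cannot be a valid ELF file.
        if (file.Length < 16)
            return false;
        if (file[0] != 0x7F || file[1] != (byte)'E' || file[2] != (byte)'L' || file[3] != (byte)'F')
            return false;

        switch (file[4]) // EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
        {
            case 1: elfClass = ElfClass.Elf32; break;
            case 2: elfClass = ElfClass.Elf64; break;
            default: return false;
        }
        switch (file[5]) // EI_DATA: 1 = ELFDATA2LSB, 2 = ELFDATA2MSB
        {
            case 1: endianness = ElfEndianness.Little; break;
            case 2: endianness = ElfEndianness.Big; break;
            default: return false;
        }
        return true;
    }
}
```

Returning `false` instead of throwing keeps non-ELF inputs cheap to reject, which fits the non-fatal error policy described later in this spec.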
### 3.2 Dynamic section parsing
Define dynamic section model:
```csharp
internal sealed class DynamicSection
{
    public IReadOnlyList<DynamicEntry> Entries { get; }
    public ulong? InitFunction { get; }           // DT_INIT
    public ulong? FiniFunction { get; }           // DT_FINI
    public ulong? InitArrayAddress { get; }       // DT_INIT_ARRAY
    public ulong? InitArraySize { get; }          // DT_INIT_ARRAYSZ
    public ulong? FiniArrayAddress { get; }       // DT_FINI_ARRAY
    public ulong? FiniArraySize { get; }          // DT_FINI_ARRAYSZ
    public ulong? PreInitArrayAddress { get; }    // DT_PREINIT_ARRAY
    public ulong? PreInitArraySize { get; }       // DT_PREINIT_ARRAYSZ
    public string? Soname { get; }                // DT_SONAME (decoded via DT_STRTAB)
    public IReadOnlyList<string> Needed { get; }  // DT_NEEDED list
    public ulong? StrTabAddress { get; }          // DT_STRTAB
    public ulong? SymTabAddress { get; }          // DT_SYMTAB
    public ulong? StrTabSize { get; }             // DT_STRSZ
}
```
Implementation details:
* Dynamic entries are at `PT_DYNAMIC.p_offset`, each `Elf64_Dyn`:
  * `d_tag` (signed 64-bit)
  * `d_un` union (`d_val` or `d_ptr`, treat as `ulong`)
* Tags of interest (values are from ELF spec):
  * `DT_NULL = 0`
  * `DT_NEEDED = 1`
  * `DT_STRTAB = 5`
  * `DT_SYMTAB = 6`
  * `DT_STRSZ = 10`
  * `DT_INIT = 12`
  * `DT_FINI = 13`
  * `DT_SONAME = 14`
  * `DT_INIT_ARRAY = 25`
  * `DT_FINI_ARRAY = 26`
  * `DT_INIT_ARRAYSZ = 27`
  * `DT_FINI_ARRAYSZ = 28`
  * `DT_PREINIT_ARRAY = 32`
  * `DT_PREINIT_ARRAYSZ = 33`
* To decode SONAME and NEEDED:
  * Use `DT_STRTAB` as base VA of the dynamic string table.
  * Map VA to file offset with `TryMapVaToFileOffset`.
  * For each `DT_NEEDED` / `DT_SONAME`, treat `d_val` as an offset into that string table; read a null-terminated UTF-8 C string.
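As a rough sketch of the dynamic-section walk described above, assuming little-endian ELF64 and that the `PT_DYNAMIC` bytes and the dynamic string table have already been sliced out of the file: `DynamicSketch`, `ReadEntries` and `ReadString` are hypothetical helper names chosen for illustration.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Text;

public static class DynamicSketch
{
    public const long DtNull = 0, DtNeeded = 1, DtSoname = 14;

    // Reads (d_tag, d_val) pairs from a little-endian Elf64 PT_DYNAMIC blob,
    // stopping at DT_NULL or when fewer than 16 bytes remain.
    public static List<(long Tag, ulong Value)> ReadEntries(ReadOnlySpan<byte> dynamic)
    {
        var entries = new List<(long Tag, ulong Value)>();
        for (int pos = 0; pos + 16 <= dynamic.Length; pos += 16)
        {
            long tag = BinaryPrimitives.ReadInt64LittleEndian(dynamic.Slice(pos, 8));
            ulong value = BinaryPrimitives.ReadUInt64LittleEndian(dynamic.Slice(pos + 8, 8));
            if (tag == DtNull)
                break;
            entries.Add((tag, value));
        }
        return entries;
    }

    // Reads the NUL-terminated UTF-8 string at the given offset of the string table.
    // Returns null when the offset is out of range or the string is unterminated.
    public static string? ReadString(ReadOnlySpan<byte> strtab, ulong offset)
    {
        if (offset >= (ulong)strtab.Length)
            return null;
        var tail = strtab.Slice((int)offset);
        int end = tail.IndexOf((byte)0);
        if (end < 0)
            return null;
        return Encoding.UTF8.GetString(tail.Slice(0, end));
    }
}
```

A caller would filter the entries for `DtNeeded` and `DtSoname` and pass each `Value` to `ReadString`; a `null` result is a natural place to record a non-fatal `InitRoutineError`.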
---
## 4. Constructor & Init Routine Discovery
We now define the algorithm implemented by `InitRoutineDiscovery` for a **single ELF file**.
High-level steps:
1. Parse `ElfFile`.
2. Parse `DynamicSection`.
3. Resolve:
   * Pre-init array (`DT_PREINIT_ARRAY`, `.preinit_array`)
   * Init array (`DT_INIT_ARRAY`, `.init_array`)
   * Legacy `.ctors`
   * `_init`, `_fini` via `DT_INIT`/`DT_FINI`
   * Fini array (`DT_FINI_ARRAY`, `.fini_array`)
4. For each VA, optionally resolve symbol name.
5. Build `InitRoutineRoot` entries.
### 4.1 Pointer size & endianness
* For ELF64:
  * Pointer size = 8 bytes.
* For ELF32:
  * Pointer size = 4 bytes (if/when you support it).
* Use `BinaryPrimitives.ReadUInt64LittleEndian` or `ReadUInt64BigEndian` depending on `ElfEndianness` (and the `ReadUInt32*` variants for ELF32).
### 4.2 Mapping VA → file offset
`ElfFile.TryMapVaToFileOffset`:
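* Iterate `ProgramHeaders` with `p_type == PT_LOAD`.
* If `virtualAddress` in `[p_vaddr, p_vaddr + p_filesz)`:
  * `fileOffset = p_offset + (virtualAddress - p_vaddr)`
  * Use `p_filesz` rather than `p_memsz` as the bound: bytes between the two (e.g. `.bss`) exist only in memory and have no file backing.
* Return false if no matching segment.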
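A minimal sketch of this mapping, assuming the `PT_LOAD` headers have been reduced to a small value type. `LoadSegment` and `VaMapperSketch` are hypothetical names; the sketch bounds each segment by its file size because addresses past that point have no bytes in the file to read.

```csharp
using System.Collections.Generic;

// Hypothetical reduction of a PT_LOAD program header: p_vaddr, p_offset, p_filesz.
public readonly record struct LoadSegment(ulong VirtualAddress, ulong FileOffset, ulong FileSize);

public static class VaMapperSketch
{
    public static bool TryMapVaToFileOffset(
        IReadOnlyList<LoadSegment> loadSegments, ulong virtualAddress, out ulong fileOffset)
    {
        foreach (var seg in loadSegments)
        {
            // Compare via subtraction so p_vaddr + p_filesz cannot overflow.
            if (virtualAddress >= seg.VirtualAddress &&
                virtualAddress - seg.VirtualAddress < seg.FileSize)
            {
                fileOffset = seg.FileOffset + (virtualAddress - seg.VirtualAddress);
                return true;
            }
        }
        fileOffset = 0;
        return false;
    }
}
```

A linear scan is usually enough here, since ELF files rarely carry more than a handful of `PT_LOAD` segments.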
### 4.3 Reading init arrays
Generic helper:
```csharp
internal static IReadOnlyList<ulong> ReadPointerArray(
    ElfFile elf,
    ulong arrayVa,
    ulong arrayBytes)
{
    var results = new List<ulong>();

    // The array must be backed by file bytes; otherwise there is nothing to read.
    if (!elf.TryMapVaToFileOffset(arrayVa, out var fileOffset))
        return results;

    int pointerSize = elf.ElfClass == ElfClass.Elf64 ? 8 : 4;
    int count = (int)(arrayBytes / (ulong)pointerSize);
    var span = elf.RawBytes.Span;

    for (int i = 0; i < count; i++)
    {
        ulong offset = fileOffset + (ulong)(i * pointerSize);

        // Stop at the end of the file so truncated inputs do not throw.
        if (offset + (ulong)pointerSize > (ulong)span.Length)
            break;

        ulong pointerValue = elf.Endianness switch
        {
            ElfEndianness.Little when pointerSize == 8
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt64LittleEndian(span[(int)offset..]),
            ElfEndianness.Little
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt32LittleEndian(span[(int)offset..]),
            ElfEndianness.Big when pointerSize == 8
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt64BigEndian(span[(int)offset..]),
            _ // Big, 32-bit
                => System.Buffers.Binary.BinaryPrimitives.ReadUInt32BigEndian(span[(int)offset..]),
        };

        // Zero slots are skipped, so positions in the result can differ
        // from the slot index in the on-disk array.
        if (pointerValue != 0)
            results.Add(pointerValue);
    }

    return results;
}
```
Apply to:
* Preinit: if `Dynamic.PreInitArrayAddress` and `Dynamic.PreInitArraySize` present.
* Init: if `Dynamic.InitArrayAddress` and `Dynamic.InitArraySize` present.
* Fini: if `Dynamic.FiniArrayAddress` and `Dynamic.FiniArraySize` present.
### 4.4 Legacy `.ctors` section
Fallback for older toolchains:
* Find section with `Name == ".ctors"`.
* Its contents are just an array of pointers (same pointer size as ELF).
* Some compilers include a sentinel `-1` or `0` at beginning or end. Treat:
  * `0` or `0xFFFFFFFFFFFFFFFF` (for 64-bit) as sentinel; skip them.
* Use similar `ReadPointerArray` logic but starting from `sh_offset` rather than a VA.
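The sentinel handling can be sketched as a small filter over the raw `.ctors` values. `CtorsSketch.FilterCtors` is a hypothetical helper; treating `0xFFFFFFFF` as the all-ones sentinel for 32-bit files is an assumption that extends the 64-bit rule stated above.

```csharp
using System.Collections.Generic;

public static class CtorsSketch
{
    // Drops the 0 and all-ones sentinel values some toolchains place
    // at the start or end of .ctors, keeping the remaining order intact.
    public static List<ulong> FilterCtors(IEnumerable<ulong> rawEntries, int pointerSize)
    {
        ulong allOnes = pointerSize == 8 ? ulong.MaxValue : uint.MaxValue;
        var result = new List<ulong>();
        foreach (var entry in rawEntries)
        {
            if (entry != 0 && entry != allOnes)
                result.Add(entry);
        }
        return result;
    }
}
```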
### 4.5 `_init` / `_fini` functions
* `Dynamic.InitFunction` (from `DT_INIT`) is a single VA.
* `Dynamic.FiniFunction` (from `DT_FINI`) likewise.
Even if arrays exist, these may also be present; treat them as **independent roots**.
---
## 5. Symbol Resolution (best-effort names)
Define interface:
```csharp
public interface ISymbolResolver
{
    /// <summary>
    /// Find the symbol whose address matches `virtualAddress` exactly,
    /// or, if not found, the closest preceding symbol (with an offset).
    /// </summary>
    SymbolInfo? ResolveSymbol(ElfFile elf, ulong virtualAddress);
}

public sealed record SymbolInfo(
    string Name,
    ulong Value,
    ulong Size
);
```
Implementation sketch:
* Use `.dynsym` (dynamic symbol table), and `.symtab` (full symbol table) if available.
* Each symbol entry includes:
  * Name offset in string table
  * Value (VA)
  * Size
  * Type/binding (function, object, etc.)
* Build an in-memory index (e.g. sorted by `Value`) per ELF file.
* `ResolveSymbol`:
  * Prefer exact match of `Value`.
  * If none, find symbol with largest `Value` less than `virtualAddress` and treat as “nearest symbol + offset”.
* You can show just `Name` or `Name+0xOFFSET` in explanations; for `InitRoutineRoot` we store plain `Name`.
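One way to sketch the sorted index and the exact-or-nearest lookup is a binary search over symbol values. `SymbolIndex` is a hypothetical class, and `SymbolInfo` is redeclared here only so the snippet stands alone; a real resolver would also filter by symbol type.

```csharp
using System;
using System.Collections.Generic;

public sealed record SymbolInfo(string Name, ulong Value, ulong Size);

public sealed class SymbolIndex
{
    private readonly SymbolInfo[] _sorted;
    private readonly ulong[] _keys;

    public SymbolIndex(IEnumerable<SymbolInfo> symbols)
    {
        var list = new List<SymbolInfo>(symbols);
        list.Sort((a, b) => a.Value.CompareTo(b.Value));
        _sorted = list.ToArray();
        _keys = Array.ConvertAll(_sorted, s => s.Value);
    }

    // Returns the exact match, or the nearest preceding symbol,
    // together with the offset of the address into that symbol.
    public (SymbolInfo Symbol, ulong Offset)? Resolve(ulong virtualAddress)
    {
        int i = Array.BinarySearch(_keys, virtualAddress);
        if (i < 0)
            i = ~i - 1; // ~i is the index of the first key greater than the address
        if (i < 0)
            return null; // the address precedes every known symbol
        return (_sorted[i], virtualAddress - _sorted[i].Value);
    }
}
```

The returned offset makes it easy to render `Name+0xOFFSET` in explanations while still storing the plain name on the root.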
---
## 6. Dynamic Dependencies & Load-Time Roots
When `InitDiscoveryOptions.IncludeDependencies == true`:
1. For root ELF:
   * Discover its roots as above.
2. For each `neededSoname` in `Dynamic.Needed`:
   * Ask `IElfDependencyResolver.ResolveLibrary(neededSoname, rootElfPath)`.
   * If it returns a path not yet processed:
     * Parse this ELF and recursively discover its roots.
3. Return a **flat list** of all `InitRoutineRoot` objects, but with their own `BinaryPath`/`Soname`.
Important: **We do not implicitly model `dlopen()`** at this stage. That's separate:
* As an optional heuristic, if the binary imports `dlopen`, tag those DSOs so later we can add “potential plugin load” roots. You can park this as a TODO in the comments.
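The dependency walk can be sketched as below. Note that this illustration swaps the recursion for a work queue with a visited set, which handles cyclic `DT_NEEDED` graphs and deep chains without growing the call stack. `DependencyWalkSketch` and its two delegate parameters are hypothetical stand-ins for the parser and `IElfDependencyResolver`.

```csharp
using System;
using System.Collections.Generic;

public static class DependencyWalkSketch
{
    // getNeeded: returns the DT_NEEDED sonames of the binary at the given path.
    // resolve:   maps (soname, referencing binary path) to a file path, or null.
    public static List<string> CollectBinaries(
        string rootPath,
        Func<string, IReadOnlyList<string>> getNeeded,
        Func<string, string, string?> resolve)
    {
        var visited = new HashSet<string>(StringComparer.Ordinal);
        var order = new List<string>();
        var queue = new Queue<string>();

        visited.Add(rootPath);
        queue.Enqueue(rootPath);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            order.Add(current);

            foreach (var soname in getNeeded(current))
            {
                var path = resolve(soname, current);
                // Unresolved libraries are skipped; each path is processed once.
                if (path != null && visited.Add(path))
                    queue.Enqueue(path);
            }
        }
        return order;
    }
}
```

Init routine discovery would then run once per returned path, and the results would be concatenated into the flat root list.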
---
## 7. Call Graph / Reachability Integration
This depends on your existing modeling, but here's a generic spec a C# dev can follow.
Assume there is an internal model:
```csharp
public sealed class CallGraph
{
    public Node GetOrCreateNode(string binaryPath, ulong virtualAddress, string? symbolName);
    public Node GetOrCreateSyntheticRoot(string rootId, string description);
    public void AddEdge(Node from, Node to, CallEdgeMetadata metadata);
}

public sealed record CallEdgeMetadata(
    string EdgeKind,          // e.g. "loader-init"
    InitPhase Phase,          // Load / Unload
    InitRoutineKind InitKind,
    int? ArrayIndex
);
```
### 7.1 Synthetic loader node
Create a single graph node representing the dynamic loader / program start:
```csharp
var loaderNode = callGraph.GetOrCreateSyntheticRoot(
    "LOADER",
    "ELF dynamic loader / process start"
);
```
### 7.2 Adding edges for each root
For each `InitRoutineRoot root`:
1. Get or create a node for the target function:
```csharp
var target = callGraph.GetOrCreateNode(
    root.BinaryPath,
    root.VirtualAddress,
    root.SymbolName
);
```
2. Add edge from loader:
```csharp
callGraph.AddEdge(
    loaderNode,
    target,
    new CallEdgeMetadata(
        EdgeKind: "loader-init",
        Phase: root.Phase,
        InitKind: root.Kind,
        ArrayIndex: root.ArrayIndex
    )
);
```
3. Optional: If you model **per-library** loader nodes, you can add:
   * `LOADER -> libLoaderNode`
   * `libLoaderNode -> each constructor`
   but that's a nice-to-have, not required.
### 7.3 Phases
* For `.preinit_array`, `.init_array`, `.ctors`, `_init`:
  * `Phase = InitPhase.Load`
* For `.fini_array`, `_fini`:
  * `Phase = InitPhase.Unload`
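Deriving the phase from the routine kind in one place keeps the two from drifting apart. A minimal sketch, with `PhaseMapping.PhaseOf` as a hypothetical helper and the enums redeclared so the snippet stands alone:

```csharp
using System;

public enum InitRoutineKind
{
    PreInitArray, InitArray, LegacyCtorsSection, LegacyInitSymbol, FiniArray, LegacyFiniSymbol
}

public enum InitPhase { Load, Unload }

public static class PhaseMapping
{
    // Everything that runs before main() is Load; finalizers are Unload.
    public static InitPhase PhaseOf(InitRoutineKind kind) => kind switch
    {
        InitRoutineKind.PreInitArray or
        InitRoutineKind.InitArray or
        InitRoutineKind.LegacyCtorsSection or
        InitRoutineKind.LegacyInitSymbol => InitPhase.Load,

        InitRoutineKind.FiniArray or
        InitRoutineKind.LegacyFiniSymbol => InitPhase.Unload,

        _ => throw new ArgumentOutOfRangeException(nameof(kind))
    };
}
```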
This allows downstream UI to say e.g.:
> This vulnerable function is reachable at **load time** via constructor `foo()` in `libbar.so`.
---
## 8. Attestation / Evidence Output
We want deterministic, auditable output per scan.
Define a JSON schema (C# record) stored alongside other scan artifacts:
```csharp
public sealed record InitRoutineEvidence(
    string ScannerVersion,
    DateTimeOffset ScanTimeUtc,
    IReadOnlyList<InitRoutineEvidenceEntry> Entries
);

public sealed record InitRoutineEvidenceEntry(
    string BinaryPath,
    string? Soname,
    InitRoutineKind Kind,
    InitPhase Phase,
    ulong VirtualAddress,
    ulong? FileOffset,
    string? SymbolName,
    int? ArrayIndex
);
```
Implementation details:
* After `IInitRoutineDiscovery.Discover` completes:
  * Convert each `InitRoutineRoot` to `InitRoutineEvidenceEntry`.
  * Serialize with `System.Text.Json` (property names in camelCase or snake_case; choose a stable convention).
* Store the evidence file e.g. `init_roots.json` inside the scan's result directory.
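A rough sketch of the serialization step, assuming camelCase is the chosen convention and that entries are sorted before writing so the output does not depend on discovery order. `EvidenceWriterSketch` and the sort keys are illustrative choices, not part of the spec; the envelope record with `ScannerVersion` and `ScanTimeUtc` is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

public enum InitRoutineKind
{
    PreInitArray, InitArray, LegacyCtorsSection, LegacyInitSymbol, FiniArray, LegacyFiniSymbol
}

public enum InitPhase { Load, Unload }

public sealed record InitRoutineEvidenceEntry(
    string BinaryPath, string? Soname, InitRoutineKind Kind, InitPhase Phase,
    ulong VirtualAddress, ulong? FileOffset, string? SymbolName, int? ArrayIndex);

public static class EvidenceWriterSketch
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        WriteIndented = true,
        // Write enum names rather than numbers so the file stays readable.
        Converters = { new JsonStringEnumConverter() }
    };

    public static string Serialize(IEnumerable<InitRoutineEvidenceEntry> entries)
    {
        // Assumed ordering: binary, kind, array index, then address.
        var ordered = entries
            .OrderBy(e => e.BinaryPath, StringComparer.Ordinal)
            .ThenBy(e => e.Kind)
            .ThenBy(e => e.ArrayIndex ?? -1)
            .ThenBy(e => e.VirtualAddress)
            .ToList();
        return JsonSerializer.Serialize(ordered, Options);
    }
}
```

Sorting with an ordinal comparer avoids culture-dependent ordering, which matters if the evidence file is hashed or diffed across machines.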
---
## 9. Implementation Details & Edge Cases
### 9.1 Architectures
First version:
* Support:
  * `ElfClass.Elf64`
  * `ElfEndianness.Little`
  * `EM_X86_64`
* For anything else:
  * Log an `InitRoutineError` and skip (but don't hard-fail the whole scan).
Design the parser so architecture is an enum:
```csharp
internal enum ElfMachine : ushort
{
    X86_64 = 62,
    // others later
}
```
### 9.2 Relocations (simplification)
Real loaders apply relocations to constructor arrays; some pointers may be stored as relative relocations.
For **v1 implementation**:
* Assume that:
  * Array entries are already absolute VAs in the ELF's address space (which is typical for non-PIE or when link-time addresses are used).
* If you need better fidelity later:
  * Parse `.rela.dyn` / `.rel.dyn`.
  * Apply `R_X86_64_RELATIVE` relocations whose `r_offset` falls within the array's address range:
    * Effective address = (base address + addend); if you treat base as 0, you get a VA that's correct **within the file** (relative).
Document this as a TODO so later you can extend without breaking the API.
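For the later, higher-fidelity pass, the relocation step could look roughly like this. It is a sketch that assumes ELF64 with `Elf64_Rela` entries already parsed and a load base of 0; `Rela` and `RelativeRelocSketch` are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory form of Elf64_Rela: r_offset, r_info, r_addend.
public readonly record struct Rela(ulong Offset, ulong Info, long Addend);

public static class RelativeRelocSketch
{
    private const uint R_X86_64_RELATIVE = 8;

    // Returns a copy of the array slots with R_X86_64_RELATIVE relocations applied,
    // using base address 0 so results are VAs relative to the file.
    public static ulong[] ResolveArray(ulong arrayVa, ulong[] rawSlots, IEnumerable<Rela> relocations)
    {
        var resolved = (ulong[])rawSlots.Clone();
        ulong arrayBytes = (ulong)rawSlots.Length * 8;

        foreach (var r in relocations)
        {
            uint type = (uint)(r.Info & 0xFFFFFFFF); // ELF64_R_TYPE
            if (type != R_X86_64_RELATIVE)
                continue;
            if (r.Offset < arrayVa || r.Offset - arrayVa >= arrayBytes)
                continue; // relocation targets something outside this array

            ulong delta = r.Offset - arrayVa;
            if (delta % 8 != 0)
                continue; // not aligned to a pointer slot

            resolved[delta / 8] = unchecked((ulong)r.Addend); // base (0) + addend
        }
        return resolved;
    }
}
```

Because the function takes and returns plain arrays, it can be slotted in after `ReadPointerArray` without changing the public API.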
### 9.3 Error handling
* All parsing errors **must be non-fatal** to the overall scan:
  * Record `InitRoutineError` with `BinaryPath`, message, and exception.
  * Continue with other binaries.
* If a binary is not ELF or has invalid magic:
  * Return no roots, but optionally log a low-severity error.
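The non-fatal policy above amounts to a per-binary try/catch around discovery. A minimal sketch, where `SafeScanSketch.ScanAll` and the `scanOne` delegate are hypothetical and `InitRoutineError` is redeclared so the snippet stands alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record InitRoutineError(string BinaryPath, string Message, Exception? Exception = null);

public static class SafeScanSketch
{
    // Runs scanOne for each path; a failure is recorded and the loop continues.
    public static List<T> ScanAll<T>(
        IEnumerable<string> paths,
        Func<string, IEnumerable<T>> scanOne,
        List<InitRoutineError> errors)
    {
        var roots = new List<T>();
        foreach (var path in paths)
        {
            try
            {
                // Materialize first so a failure midway adds nothing for this binary.
                var found = scanOne(path).ToList();
                roots.AddRange(found);
            }
            catch (Exception ex)
            {
                errors.Add(new InitRoutineError(path, ex.Message, ex));
            }
        }
        return roots;
    }
}
```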
---
## 10. Testing Strategy
### 10.1 Unit tests with synthetic ELF fixtures
Create a small test project `StellaOps.ElfInit.Tests` with known ELF files checked into test resources:
* Binaries compiled with small C programs like:
```c
static void __attribute__((constructor)) c1(void) {}
static void __attribute__((constructor)) c2(void) {}
static void __attribute__((destructor)) d1(void) {}
int main() { return 0; }
```
* Variants:
  * Using `.ctors` (old GCC flags) for legacy coverage.
  * Shared library with `__attribute__((constructor))` and `DT_NEEDED` from a main binary.
  * Binary with no constructors (expect zero roots).
Assertions:
* The count of `InitRoutineRoot` matches expected.
* `Kind` and `Phase` are correct.
* `ArrayIndex` is correctly ordered: 0,1,2 …
* `SymbolName` contains expected mangled function names (if compiler doesn't drop them).
* For dependencies:
  * Discover roots in `libfoo.so` when main depends on it via `DT_NEEDED`.
### 10.2 Integration tests with call graph
* Given a small binary and a known vulnerable function reachable from a constructor:
  * Run full pipeline.
  * Assert that the vulnerable function is marked reachable from synthetic `LOADER` node via the constructor.
### 10.3 Fuzz / robustness
* Run the discovery on:
  * Random non-ELF files.
  * Truncated ELF files.
  * Very large binaries.
* Ensure no unhandled exceptions; only `InitRoutineError` entries.
---
## 11. Suggested C# Project Layout
```text
src/
  StellaOps.ElfInit/
    IInitRoutineDiscovery.cs
    InitRoutineModels.cs
    InitRoutineDiscovery.cs
    IElfDependencyResolver.cs
    ISymbolResolver.cs
    Evidence/
      InitRoutineEvidence.cs
    Elf/
      ElfFile.cs
      ElfParser.cs
      ElfHeader.cs
      ProgramHeader.cs
      SectionHeader.cs
      DynamicSection.cs
      VaMapper.cs
      PointerArrayReader.cs
tests/
  StellaOps.ElfInit.Tests/
    Resources/
      sample_no_ctor
      sample_init_array
      sample_preinit_init_fini
      sample_with_deps_main
      libsample_ctor.so
    InitRoutineDiscoveryTests.cs
```
---
If you'd like, I can next:
* Draft `InitRoutineDiscovery` in C# with full method bodies, or
* Provide a minimal `ElfFile`/`ElfParser` implementation skeleton you can fill in.