up
StellaOps Bot
2025-12-13 18:08:55 +02:00
parent 6e45066e37
commit f1a39c4ce3
234 changed files with 24038 additions and 6910 deletions

@@ -0,0 +1,301 @@
# CONTRACT-BUILDID-PROPAGATION-401: Build-ID and Code-ID Propagation
> **Status:** Published
> **Version:** 1.0.0
> **Published:** 2025-12-13
> **Owners:** Scanner Guild, Signals Guild, BE-Base Platform Guild
> **Unblocks:** SCANNER-BUILDID-401-035, SCANNER-INITROOT-401-036, and downstream tasks
## Overview
This contract defines how GNU build-id (ELF), PE GUID, and Mach-O UUID propagate through the reachability pipeline from Scanner to SBOM, Signals, and runtime facts. It ensures consistent identification of binaries across components for deterministic symbol resolution and replay.

---
## 1. Build-ID Sources and Formats
### 1.1 Per-Format Extraction
| Binary Format | Build-ID Source | Prefix | Example |
|---------------|-----------------|--------|---------|
| ELF | `.note.gnu.build-id` | `gnu-build-id:` | `gnu-build-id:5f0c7c3cab2eb9bc...` |
| PE (Windows) | Debug GUID from the CodeView record in the PE debug directory | `pe-guid:` | `pe-guid:12345678-1234-1234-1234-123456789abc` |
| Mach-O | `LC_UUID` load command | `macho-uuid:` | `macho-uuid:12345678123412341234123456789abc` |
### 1.2 Canonical Format
```
build_id = "{prefix}{hex_lowercase}"
```
- Hex encoding: lowercase, no separators (except PE GUID retains dashes)
- Minimum length: 16 bytes (32 hex chars) for ELF/Mach-O
- PE GUID: Standard GUID format with dashes
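The canonical format above can be sketched as a small helper. This is an illustration under stated assumptions, not the Scanner implementation: the function name `canonical_build_id` and the short format keys (`elf`, `pe`, `macho`) are hypothetical, and the PE branch assumes the GUID is supplied in its 16-byte on-disk (little-endian field) layout.

```python
import uuid

# Prefixes taken from the table in section 1.1.
PREFIXES = {"elf": "gnu-build-id:", "pe": "pe-guid:", "macho": "macho-uuid:"}


def canonical_build_id(binary_format: str, raw: bytes) -> str:
    """Render raw identifier bytes in the canonical form of section 1.2."""
    prefix = PREFIXES[binary_format]
    if binary_format == "pe":
        # Assumption: raw is the 16-byte GUID as stored on disk (little-endian fields).
        if len(raw) != 16:
            raise ValueError("PE GUID must be exactly 16 bytes")
        return prefix + str(uuid.UUID(bytes_le=raw))  # lowercase, dashes retained
    if len(raw) < 16:
        raise ValueError("ELF/Mach-O identifier must be at least 16 bytes")
    return prefix + raw.hex()  # lowercase hex, no separators
```

For example, `canonical_build_id("macho", bytes(16))` yields `macho-uuid:` followed by 32 zeros.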
### 1.3 Fallback When Build-ID Absent
When build-id is not present (stripped binaries, older toolchains):
```json
{
  "build_id": null,
  "build_id_fallback": {
    "method": "file_hash",
    "value": "sha256:...",
    "confidence": 0.7
  }
}
```
**Fallback chain:**
1. `file_hash` - SHA-256 of entire binary file (confidence: 0.7)
2. `code_section_hash` - SHA-256 of .text section (confidence: 0.6)
3. `path_hash` - SHA-256 of file path (confidence: 0.3, last resort)
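A minimal sketch of the fallback chain follows. The function name and the convention of passing `None` for an input that is unavailable are assumptions made for illustration; the contract itself does not say how a producer decides that a step cannot be used. Method names and confidence values come from the list above.

```python
import hashlib
from typing import Optional


def build_id_fallback(file_bytes: Optional[bytes],
                      text_section: Optional[bytes],
                      path: str) -> dict:
    """Pick the first usable fallback in the order file_hash, code_section_hash, path_hash."""
    if file_bytes is not None:
        method, data, confidence = "file_hash", file_bytes, 0.7
    elif text_section is not None:
        method, data, confidence = "code_section_hash", text_section, 0.6
    else:
        # Last resort: hash the path string itself (assumed UTF-8 encoded).
        method, data, confidence = "path_hash", path.encode("utf-8"), 0.3
    return {
        "build_id": None,
        "build_id_fallback": {
            "method": method,
            "value": "sha256:" + hashlib.sha256(data).hexdigest(),
            "confidence": confidence,
        },
    }
```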
---
## 2. Code-ID for Name-less Symbols
### 2.1 Purpose
`code_id` provides stable identification for symbols in stripped binaries where the symbol name is unavailable.
### 2.2 Format
```
code_id = "code:{lang}:{base64url_sha256}"
```
**Canonical tuple for binary symbols:**
```
{format}\0{build_id_or_file_hash}\0{section}\0{addr}\0{size}\0{code_block_hash}
```
### 2.3 Code Block Hash
For stripped functions, compute the hash of the function's code bytes:
```
code_block_hash = "sha256:" + hex(SHA256(code_bytes[addr:addr+size]))
```
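The two formulas in this section can be combined into a short sketch. Several details are assumptions because the contract does not spell them out: that `code_id` is the unpadded base64url SHA-256 of the NUL-joined canonical tuple, that `addr` is rendered as lowercase hex and `size` as decimal, and that `addr` is an offset into `code_bytes`. The function names are hypothetical.

```python
import base64
import hashlib


def code_block_hash(code_bytes: bytes, addr: int, size: int) -> str:
    """sha256 over code_bytes[addr:addr+size], as in section 2.3."""
    return "sha256:" + hashlib.sha256(code_bytes[addr:addr + size]).hexdigest()


def code_id(lang: str, fmt: str, build_id_or_file_hash: str,
            section: str, addr: int, size: int, block_hash: str) -> str:
    """Derive a code_id from the canonical tuple of section 2.2."""
    # Assumption: addr is rendered as 0x-prefixed lowercase hex, size as decimal.
    fields = [fmt, build_id_or_file_hash, section, f"0x{addr:x}", str(size), block_hash]
    digest = hashlib.sha256("\0".join(fields).encode("utf-8")).digest()
    # Assumption: padding is dropped, since the validation regex has no '='.
    token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"code:{lang}:{token}"
```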
---
## 3. Cross-RID (Runtime Identifier) Mapping
### 3.1 Problem Statement
Different platform builds (linux-x64, win-x64, osx-arm64) of the same source code produce different binaries with different build-ids. Runtime facts from one platform must map to the correct binary variant.
### 3.2 Variant Group
Binaries from the same source are grouped by source digest:
```json
{
  "variant_group": {
    "source_digest": "sha256:...",
    "variants": [
      {
        "rid": "linux-x64",
        "build_id": "gnu-build-id:aaa...",
        "file_hash": "sha256:..."
      },
      {
        "rid": "win-x64",
        "build_id": "pe-guid:bbb...",
        "file_hash": "sha256:..."
      },
      {
        "rid": "osx-arm64",
        "build_id": "macho-uuid:ccc...",
        "file_hash": "sha256:..."
      }
    ]
  }
}
```
### 3.3 Runtime Fact Correlation
When Signals ingests runtime facts:
1. Extract `build_id` from runtime event
2. Look up variant group containing this build_id
3. Correlate with richgraph nodes having matching `build_id`
4. If no match, fall back to `code_id` + `code_block_hash` matching
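The four correlation steps can be sketched as follows. This is not the Signals implementation: the function name, the in-memory lists, and the return shape are hypothetical, and the sketch assumes the runtime event carries `code_id` and `code_block_hash` under `symbol` for the fallback step, which the event schema in this contract does not show.

```python
from typing import Optional


def correlate(event: dict, variant_groups: list, nodes: list) -> Optional[dict]:
    """Match a runtime event to richgraph nodes by build_id, then by code_id."""
    build_id = event["binary"].get("build_id")          # step 1
    if build_id is not None:
        group = next(                                    # step 2
            (g for g in variant_groups
             if any(v["build_id"] == build_id for v in g["variants"])),
            None,
        )
        matches = [n for n in nodes if n.get("build_id") == build_id]  # step 3
        if matches:
            return {
                "matched_by": "build_id",
                "source_digest": group["source_digest"] if group else None,
                "nodes": matches,
            }
    # Step 4: fall back to code_id + code_block_hash (fields assumed on the event).
    symbol = event.get("symbol", {})
    cid, block = symbol.get("code_id"), symbol.get("code_block_hash")
    matches = [n for n in nodes
               if cid is not None
               and n.get("code_id") == cid
               and n.get("code_block_hash") == block]
    if matches:
        return {"matched_by": "code_id", "source_digest": None, "nodes": matches}
    return None
```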
---
## 4. SBOM Integration
### 4.1 CycloneDX 1.6 Properties
Build-ID propagates to SBOM via component properties:
```json
{
  "type": "library",
  "name": "libssl.so.3",
  "version": "3.0.11",
  "properties": [
    {"name": "stellaops:build-id", "value": "gnu-build-id:5f0c7c3c..."},
    {"name": "stellaops:code-id", "value": "code:binary:abc123..."},
    {"name": "stellaops:file-hash", "value": "sha256:..."}
  ]
}
```
### 4.2 SPDX 3.0 Integration
Build-ID maps to SPDX external references:
```json
{
  "spdxId": "SPDXRef-libssl",
  "externalRef": {
    "referenceCategory": "PERSISTENT-ID",
    "referenceType": "gnu-build-id",
    "referenceLocator": "gnu-build-id:5f0c7c3c..."
  }
}
```
---
## 5. Signals Runtime Facts Schema
### 5.1 Runtime Event with Build-ID
```json
{
  "event_type": "function_hit",
  "timestamp": "2025-12-13T10:00:00Z",
  "binary": {
    "path": "/usr/lib/x86_64-linux-gnu/libssl.so.3",
    "build_id": "gnu-build-id:5f0c7c3c...",
    "file_hash": "sha256:..."
  },
  "symbol": {
    "name": "SSL_read",
    "address": "0x12345678",
    "symbol_id": "sym:binary:..."
  },
  "context": {
    "pid": 12345,
    "container_id": "abc123..."
  }
}
```
### 5.2 Ingestion Endpoint
```
POST /signals/runtime-facts
Content-Type: application/x-ndjson
Content-Encoding: gzip

{"event_type":"function_hit","binary":{"build_id":"gnu-build-id:..."},...}
{"event_type":"function_hit","binary":{"build_id":"gnu-build-id:..."},...}
```
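A client might prepare the request body for this endpoint roughly as shown here. This is a sketch using only the Python standard library; the helper name `encode_runtime_facts` is hypothetical, and sending the request (host, authentication, HTTP client) is deliberately left out because the contract does not specify it.

```python
import gzip
import json

# Headers as listed for the ingestion endpoint.
HEADERS = {
    "Content-Type": "application/x-ndjson",
    "Content-Encoding": "gzip",
}


def encode_runtime_facts(events: list) -> bytes:
    """Serialize events as NDJSON (one JSON object per line) and gzip the result."""
    lines = [json.dumps(event, separators=(",", ":")) for event in events]
    return gzip.compress(("\n".join(lines) + "\n").encode("utf-8"))
```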
---
## 6. RichGraph Integration
### 6.1 Node with Build-ID
```json
{
  "id": "sym:binary:...",
  "symbol_id": "sym:binary:...",
  "lang": "binary",
  "kind": "function",
  "display": "SSL_read",
  "build_id": "gnu-build-id:5f0c7c3c...",
  "code_id": "code:binary:...",
  "code_block_hash": "sha256:...",
  "purl": "pkg:deb/debian/libssl3@3.0.11"
}
```
### 6.2 CAS Evidence Storage
```
cas://binary/
  by-build-id/{build_id}/     # Index by build-id
    graph.json                # Associated graph
    symbols.json              # Symbol table
  by-code-id/{code_id}/       # Index by code-id
    block.bin                 # Code block bytes
    disasm.json               # Disassembly
```
---
## 7. Implementation Requirements
### 7.1 Scanner Changes
| Component | Change | Priority |
|-----------|--------|----------|
| ELF parser | Extract `.note.gnu.build-id` | P0 |
| PE parser | Extract Debug GUID | P0 |
| Mach-O parser | Extract `LC_UUID` | P0 |
| RichGraphBuilder | Populate `build_id` field on nodes | P0 |
| SBOM emitters | Add `stellaops:build-id` property | P1 |
### 7.2 Signals Changes
| Component | Change | Priority |
|-----------|--------|----------|
| Runtime facts ingestion | Parse and index `build_id` | P0 |
| Scoring service | Correlate by `build_id` then `code_id` | P0 |
| Store repository | Add `build_id` index | P1 |
### 7.3 CLI/UI Changes
| Component | Change | Priority |
|-----------|--------|----------|
| `stella graph explain` | Show build_id in output | P1 |
| UI symbol drawer | Display build_id with copy button | P1 |
---
## 8. Validation Rules
1. `build_id` must match regex: `^(gnu-build-id|pe-guid|macho-uuid):[a-f0-9-]+$`
2. `code_id` must match regex: `^code:[a-z]+:[A-Za-z0-9_-]+$`
3. When `build_id` is null, `build_id_fallback` must be present
4. `code_block_hash` required when `build_id` is null and symbol is stripped
5. Variant group `source_digest` must be consistent across all variants
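Rules 1 to 4 can be checked with a few lines of code. The sketch below is illustrative only: the record shape and the `stripped` flag are hypothetical, and rule 5 is not covered because it depends on how variant groups are stored.

```python
import re

# Patterns copied from rules 1 and 2.
BUILD_ID_RE = re.compile(r"^(gnu-build-id|pe-guid|macho-uuid):[a-f0-9-]+$")
CODE_ID_RE = re.compile(r"^code:[a-z]+:[A-Za-z0-9_-]+$")


def validate_record(record: dict) -> list:
    """Return a list of rule violations; an empty list means the record passes."""
    errors = []
    build_id = record.get("build_id")
    if build_id is None:
        if not record.get("build_id_fallback"):
            errors.append("rule 3: build_id is null but build_id_fallback is missing")
        # 'stripped' is a hypothetical flag marking a name-less symbol.
        if record.get("stripped") and not record.get("code_block_hash"):
            errors.append("rule 4: code_block_hash required for stripped symbol")
    elif not BUILD_ID_RE.fullmatch(build_id):
        errors.append("rule 1: build_id does not match the required pattern")
    code_id = record.get("code_id")
    if code_id is not None and not CODE_ID_RE.fullmatch(code_id):
        errors.append("rule 2: code_id does not match the required pattern")
    return errors
```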
---
## 9. Test Fixtures
Location: `tests/Binary/fixtures/build-id/`
| Fixture | Description |
|---------|-------------|
| `elf-with-buildid/` | ELF binary with GNU build-id |
| `elf-stripped/` | ELF stripped, fallback to code-id |
| `pe-with-guid/` | PE binary with Debug GUID |
| `macho-with-uuid/` | Mach-O binary with LC_UUID |
| `variant-group/` | Same source, multiple RIDs |
---
## 10. Related Contracts
- [richgraph-v1](./richgraph-v1.md) - Graph schema with build_id field
- [Binary Reachability](../reachability/binary-reachability-schema.md) - Binary evidence schema
- [Symbol Manifest](../specs/SYMBOL_MANIFEST_v1.md) - Symbol identification
---
## Changelog
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-13 | Scanner Guild | Initial contract for build-id propagation |

@@ -0,0 +1,326 @@
# CONTRACT-INIT-ROOTS-401: Init-Section Synthetic Roots
> **Status:** Published
> **Version:** 1.0.0
> **Published:** 2025-12-13
> **Owners:** Scanner Guild, Policy Guild, Signals Guild
> **Unblocks:** SCANNER-INITROOT-401-036, EDGE-BUNDLE-401-054, and downstream tasks
## Overview
This contract defines how ELF/PE/Mach-O initialization sections (`.init_array`, `.ctors`, `DT_INIT`, etc.) are modeled as synthetic roots in reachability graphs. These roots represent code that executes during program load, before `main()`, and must be included in reachability analysis for complete vulnerability assessment.

---
## 1. Init-Section Categories
### 1.1 ELF Init Sections
| Section/Tag | Phase | Order | Description |
|-------------|-------|-------|-------------|
| `.preinit_array` / `DT_PREINIT_ARRAY` | `preinit` | 0-N | Executed before dynamic linker init |
| `.init` / `DT_INIT` | `init` | 0 | Single init function |
| `.init_array` / `DT_INIT_ARRAY` | `init` | 1-N | Array of init function pointers |
| `.ctors` | `init` | after init_array | Legacy C++ constructors |
| `.fini` / `DT_FINI` | `fini` | 0 | Single cleanup function |
| `.fini_array` / `DT_FINI_ARRAY` | `fini` | 1-N | Array of cleanup function pointers |
| `.dtors` | `fini` | after fini_array | Legacy C++ destructors |
### 1.2 PE Init Sections
| Mechanism | Phase | Order | Description |
|-----------|-------|-------|-------------|
| `DllMain` (DLL_PROCESS_ATTACH) | `init` | 0 | DLL initialization |
| TLS callbacks | `init` | 1-N | Thread-local storage callbacks |
| C++ global constructors | `init` | after TLS | Via CRT init table |
| `DllMain` (DLL_PROCESS_DETACH) | `fini` | 0 | DLL cleanup |
### 1.3 Mach-O Init Sections
| Section | Phase | Order | Description |
|---------|-------|-------|-------------|
| `__mod_init_func` | `init` | 0-N | Module init functions |
| `__mod_term_func` | `fini` | 0-N | Module termination functions |
---
## 2. Synthetic Root Schema
### 2.1 Root Object in richgraph-v1
```json
{
  "roots": [
    {
      "id": "root:init:1:sym:binary:abc123...",
      "phase": "init",
      "source": "init_array",
      "order": 1,
      "target_id": "sym:binary:abc123...",
      "binary_path": "/usr/lib/libfoo.so.1",
      "build_id": "gnu-build-id:5f0c7c3c..."
    }
  ]
}
```
### 2.2 Root ID Format
```
root:{phase}:{order}:{target_symbol_id}
```
**Examples:**
- `root:preinit:0:sym:binary:abc...` - First preinit function
- `root:init:0:sym:binary:def...` - DT_INIT function
- `root:init:1:sym:binary:ghi...` - First init_array entry
- `root:main:0:sym:binary:jkl...` - main() function
- `root:fini:0:sym:binary:mno...` - DT_FINI function
### 2.3 Phase Enumeration
| Phase | Numeric Order | Execution Time |
|-------|---------------|----------------|
| `load` | 0 | Dynamic linker resolution |
| `preinit` | 1 | Before dynamic init |
| `init` | 2 | During initialization |
| `main` | 3 | Program entry (main) |
| `fini` | 4 | During termination |
---
## 3. Root Discovery Algorithm
### 3.1 ELF Root Discovery
```
1. Parse .dynamic section for DT_PREINIT_ARRAY, DT_INIT, DT_INIT_ARRAY
2. For each array:
   a. Read function pointer addresses
   b. Resolve to symbol (if available) or emit unknown
   c. Create root with phase + order
3. Find _start, main, _init, _fini symbols and add as roots
4. Sort roots by (phase, order, target_id) for determinism
```
### 3.2 Handling Unresolved Targets
When an init array contains an address that has no matching symbol:
```json
{
  "roots": [
    {
      "id": "root:init:2:unknown:0x12345678",
      "phase": "init",
      "source": "init_array",
      "order": 2,
      "target_id": "unknown:0x12345678",
      "resolved": false,
      "reason": "No symbol at address 0x12345678"
    }
  ],
  "unknowns": [
    {
      "id": "unknown:0x12345678",
      "type": "unresolved_init_target",
      "address": "0x12345678",
      "source": "init_array[2]"
    }
  ]
}
```
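The resolve-or-emit-unknown step can be sketched as below. It assumes the function pointer addresses have already been read from the array and that a symbol lookup table is available; the function name, the `first_order` parameter, and the choice to reuse the order value as the index in `source` (mirroring the example above) are all hypothetical.

```python
def roots_from_init_array(addresses: list, symbols: dict, first_order: int = 1) -> dict:
    """Turn init_array entries into synthetic roots, recording unresolved targets."""
    roots, unknowns = [], []
    for order, addr in enumerate(addresses, start=first_order):
        symbol_id = symbols.get(addr)
        if symbol_id is not None:
            roots.append({
                "id": f"root:init:{order}:{symbol_id}",
                "phase": "init",
                "source": "init_array",
                "order": order,
                "target_id": symbol_id,
            })
            continue
        # No symbol at this address: emit an unresolved root plus an unknown entry.
        target = f"unknown:0x{addr:x}"
        roots.append({
            "id": f"root:init:{order}:{target}",
            "phase": "init",
            "source": "init_array",
            "order": order,
            "target_id": target,
            "resolved": False,
            "reason": f"No symbol at address 0x{addr:x}",
        })
        unknowns.append({
            "id": target,
            "type": "unresolved_init_target",
            "address": f"0x{addr:x}",
            "source": f"init_array[{order}]",  # assumption: index mirrors order
        })
    return {"roots": roots, "unknowns": unknowns}
```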
---
## 4. DT_NEEDED Dependency Modeling
### 4.1 Purpose
`DT_NEEDED` entries specify shared library dependencies. Each dependency's init code is modeled as running before the init code of the binary that depends on it.
### 4.2 Schema
```json
{
  "dependencies": [
    {
      "id": "dep:libssl.so.3",
      "name": "libssl.so.3",
      "source": "DT_NEEDED",
      "order": 0,
      "resolved_path": "/usr/lib/x86_64-linux-gnu/libssl.so.3",
      "resolved_build_id": "gnu-build-id:abc..."
    }
  ]
}
```
### 4.3 Init Order with Dependencies
```
1. libssl.so.3 preinit → init
2. libcrypto.so.3 preinit → init
3. libc.so.6 preinit → init
4. main_binary preinit → init → main
```
---
## 5. Patch Oracle Integration
### 5.1 Oracle Expected Roots
```json
{
  "expected_roots": [
    {
      "id": "root:init:*:sym:binary:*",
      "phase": "init",
      "source": "init_array",
      "required": true,
      "reason": "Init function must be detected for CVE-2023-XXXX"
    }
  ]
}
```
### 5.2 Oracle Forbidden Roots
```json
{
  "forbidden_roots": [
    {
      "id": "root:preinit:*:*",
      "phase": "preinit",
      "reason": "Preinit code should not exist after patch"
    }
  ]
}
```
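One way to evaluate expected and forbidden root patterns is shell-style wildcard matching on the root ID, combined with equality checks on any other fields the pattern sets. The sketch below takes that approach with `fnmatch` from the Python standard library. It is an assumption about the matching semantics, not a description of `PatchOracleComparer.cs`, and treating a missing `required` key as `true` is also an assumption.

```python
from fnmatch import fnmatchcase


def _matches(root: dict, pattern: dict) -> bool:
    """A root matches when its id fits the wildcard and phase/source agree if given."""
    if not fnmatchcase(root["id"], pattern["id"]):
        return False
    return all(root.get(key) == pattern[key]
               for key in ("phase", "source") if key in pattern)


def check_oracle(roots: list, expected_roots: list, forbidden_roots: list) -> dict:
    """Report required patterns with no matching root and roots hitting a forbidden pattern."""
    missing = [p["id"] for p in expected_roots
               if p.get("required", True) and not any(_matches(r, p) for r in roots)]
    forbidden = [r["id"] for r in roots
                 if any(_matches(r, p) for p in forbidden_roots)]
    return {"missing": missing, "forbidden": forbidden,
            "passed": not missing and not forbidden}
```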
---
## 6. Policy Integration
### 6.1 Reachability State with Init Roots
When evaluating reachability:
1. If vulnerable function is reachable from `main` → `REACHABLE`
2. If vulnerable function is reachable from `init` roots → `REACHABLE_INIT`
3. If vulnerable function is reachable only from `fini` → `REACHABLE_FINI`
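The precedence among these three rules can be sketched as a small classifier over the set of phases from which the vulnerable function is reachable. The function name is hypothetical, and the `UNREACHABLE` and `UNCLASSIFIED` results are placeholders for cases this contract does not define (no reachable phase, or only `load`/`preinit`).

```python
def reachability_state(reachable_phases: set) -> str:
    """Apply rules 1-3 in order; later rules only apply when earlier ones do not."""
    if "main" in reachable_phases:
        return "REACHABLE"
    if "init" in reachable_phases:
        return "REACHABLE_INIT"
    if reachable_phases == {"fini"}:
        return "REACHABLE_FINI"
    if not reachable_phases:
        return "UNREACHABLE"   # hypothetical placeholder, not defined by the contract
    return "UNCLASSIFIED"      # hypothetical placeholder for load/preinit-only cases
```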
### 6.2 Policy DSL Extensions
```yaml
# Require init-phase reachability for not_affected
rules:
  - name: init-reachability-required
    condition: |
      vuln.phase_reachable.includes("init") and
      reachability.confidence >= 0.8
    action: require_evidence
  - name: init-only-lower-severity
    condition: |
      reachability.reachable_phases == ["init"] and
      not reachability.reachable_phases.includes("main")
    action: reduce_severity
    severity_adjustment: -1
```
---
## 7. Evidence Requirements
### 7.1 Init Root Evidence Bundle
```json
{
  "root_evidence": {
    "root_id": "root:init:0:sym:binary:...",
    "extraction_method": "dynamic_section",
    "source_offset": "0x1234",
    "target_address": "0x5678",
    "target_symbol": "frame_dummy",
    "evidence_hash": "sha256:...",
    "evidence_uri": "cas://binary/roots/sha256:..."
  }
}
```
### 7.2 CAS Storage Layout
```
cas://reachability/roots/{graph_hash}/
  init.json                   # All init-phase roots
  fini.json                   # All fini-phase roots
  dependencies.json           # DT_NEEDED graph
  evidence/
    root:{id}.json            # Per-root evidence
```
---
## 8. Determinism Rules
### 8.1 Root Ordering
Roots are sorted by:
1. Phase (numeric: load=0, preinit=1, init=2, main=3, fini=4)
2. Order within phase (numeric)
3. Target ID (string, ordinal)
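The three-level ordering maps directly onto a tuple sort key. A minimal sketch, assuming roots are dictionaries with the `phase`, `order`, and `target_id` fields from the root schema. Python compares strings by code point, which agrees with an ordinal comparison for ASCII identifiers. The helper names are hypothetical, and `root_id` follows the `root:{phase}:{order}:{target_symbol_id}` format of section 2.2.

```python
# Numeric phase order from the phase enumeration.
PHASE_ORDER = {"load": 0, "preinit": 1, "init": 2, "main": 3, "fini": 4}


def root_id(phase: str, order: int, target_id: str) -> str:
    """Build a root ID in the root:{phase}:{order}:{target_id} format."""
    return f"root:{phase}:{order}:{target_id}"


def sort_roots(roots: list) -> list:
    """Sort by phase (numeric), then order within phase, then target ID."""
    return sorted(
        roots,
        key=lambda r: (PHASE_ORDER[r["phase"]], r["order"], r["target_id"]),
    )
```

For example, a `fini` root with order 0 sorts after every `init` root regardless of order, because phase is compared first.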
### 8.2 Root ID Canonicalization
```
root_id = "root:" + phase + ":" + order + ":" + target_id
```
All components are lowercase and contain no whitespace.

---
## 9. Implementation Status
| Component | Location | Status |
|-----------|----------|--------|
| ELF init parser | `NativeCallgraphBuilder.cs` | Implemented |
| Root model | `NativeSyntheticRoot` | Implemented |
| richgraph-v1 roots | `RichGraph.cs` | Implemented |
| Patch oracle roots | `PatchOracleComparer.cs` | Implemented |
| Policy integration | - | Pending |
| DT_NEEDED graph | - | Pending |
---
## 10. Test Fixtures
Location: `tests/Binary/fixtures/init-roots/`
| Fixture | Description |
|---------|-------------|
| `elf-simple-init/` | Binary with single init function |
| `elf-init-array/` | Binary with multiple init_array entries |
| `elf-preinit/` | Binary with preinit_array |
| `elf-ctors/` | Binary with .ctors section |
| `elf-stripped-init/` | Stripped binary with init |
| `pe-dllmain/` | PE DLL with DllMain |
| `pe-tls-callbacks/` | PE with TLS callbacks |
---
## 11. Related Contracts
- [richgraph-v1](./richgraph-v1.md) - Root schema in graphs
- [Build-ID Propagation](./buildid-propagation.md) - Binary identification
- [Patch Oracles](../reachability/patch-oracles.md) - Oracle validation
---
## Changelog
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-13 | Scanner Guild | Initial contract for init-section roots |

@@ -0,0 +1,317 @@
# DECISION-NATIVE-TOOLCHAIN-401: Native Lifter and Demangler Selection
> **Status:** Published
> **Version:** 1.0.0
> **Published:** 2025-12-13
> **Owners:** Scanner Guild, Platform Guild
> **Unblocks:** SCANNER-NATIVE-401-015, SCAN-REACH-401-009
## Decision Summary
This document records the decisions for native binary analysis toolchain selection, enabling implementation of native symbol extraction, callgraph generation, and demangling for ELF/PE/Mach-O binaries.

---
## 1. Component Decisions
### 1.1 ELF Parser
**Decision:** Use custom pure-C# ELF parser
**Rationale:**
- No native dependencies, portable across platforms
- Already implemented in `StellaOps.Scanner.Analyzers.Native`
- Sufficient for symbol table, dynamic section, and relocation parsing
- Avoids licensing complexity of external libraries
**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/Elf/`
### 1.2 PE Parser
**Decision:** Use custom pure-C# PE parser
**Rationale:**
- No native dependencies
- Already implemented in `StellaOps.Scanner.Analyzers.Native`
- Handles import/export tables, Debug directory
- Compatible with air-gapped deployment
**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/Pe/`
### 1.3 Mach-O Parser
**Decision:** Use custom pure-C# Mach-O parser
**Rationale:**
- Consistent with ELF/PE approach
- No native dependencies
- Sufficient for symbol table and load commands
**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/MachO/`
### 1.4 Symbol Demangler
**Decision:** Use per-language managed demanglers with native fallback
| Language | Primary Demangler | Fallback |
|----------|-------------------|----------|
| C++ (Itanium ABI) | `Demangler.Net` (NuGet) | llvm-cxxfilt via P/Invoke |
| C++ (MSVC) | `UnDecorateSymbolName` wrapper | None (Windows-specific) |
| Rust | `rustc-demangle` port | rustfilt via P/Invoke |
| Swift | `swift-demangle` port | None |
| D | `dlang-demangler` port | None |
**Rationale:**
- Managed demanglers provide determinism and portability
- Native fallback only for edge cases
- No runtime dependency on external tools
**NuGet packages:**
```xml
<PackageReference Include="Demangler.Net" Version="1.0.0" />
```
### 1.5 Disassembler (Optional, for heuristic analysis)
**Decision:** Use Iced (x86/x64) + Capstone.NET (ARM/others)
| Architecture | Library | NuGet Package |
|--------------|---------|---------------|
| x86/x64 | Iced | `Iced` |
| ARM/ARM64 | Capstone.NET | `Capstone.NET` |
| Other | Skip disassembly | N/A |
**Rationale:**
- Iced is pure managed, no native deps for x86
- Capstone.NET wraps Capstone with native lib
- Disassembly is optional for heuristic edge detection
### 1.6 Callgraph Extraction
**Decision:** Static analysis only (no dynamic execution)
**Methods:**
1. Relocation-based: Extract call targets from relocations
2. Import/Export: Map import references to exports
3. Symbol-based: Direct and indirect call targets from symbol table
4. CFG heuristics: Basic block boundary detection (x86 only)
**No dynamic analysis:** Static-only extraction avoids the risks of executing untrusted binaries and keeps the analyzer portable.

---
## 2. CI Toolchain Requirements
### 2.1 Build Requirements
| Component | Requirement | Notes |
|-----------|-------------|-------|
| .NET SDK | 10.0+ | Required for all builds |
| Native libs (optional) | Capstone 4.0+ | Only for ARM disassembly |
| Test binaries | Pre-built fixtures | No compiler dependency in CI |
### 2.2 Test Fixture Strategy
**Decision:** Ship pre-built binary fixtures, not source + compiler
**Rationale:**
- Deterministic: Same binary hash every run
- No compiler dependency in CI
- Smaller CI image footprint
- Cross-platform: Same fixtures on all runners
**Fixture locations:**
```
tests/Binary/fixtures/
  elf-x86_64/
    binary.elf                # Pre-built
    expected.json             # Expected graph
    expected-hashes.txt       # Determinism check
  pe-x64/
    binary.exe
    expected.json
  macho-arm64/
    binary.dylib
    expected.json
```
### 2.3 Fixture Generation (Offline)
Fixtures are generated offline by maintainers:
```bash
# Generate ELF fixture (run once, commit result)
cd tools/fixtures
./generate-elf-fixture.sh
# Verify hashes match
./verify-fixtures.sh
```
---
## 3. Demangling Contract
### 3.1 Output Format
Demangled names follow this format:
```json
{
  "symbol": {
    "mangled": "_ZN4Curl7Session4readEv",
    "demangled": "Curl::Session::read()",
    "source": "itanium-abi",
    "confidence": 1.0
  }
}
```
### 3.2 Demangling Sources
| Source | Description | Confidence |
|--------|-------------|------------|
| `itanium-abi` | Itanium C++ ABI (GCC/Clang) | 1.0 |
| `msvc` | Microsoft Visual C++ | 1.0 |
| `rust` | Rust mangling | 1.0 |
| `swift` | Swift mangling | 1.0 |
| `fallback` | Native tool fallback | 0.9 |
| `heuristic` | Pattern-based guess | 0.6 |
| `none` | No demangling available | 0.3 |
### 3.3 Failed Demangling
When demangling fails:
```json
{
  "symbol": {
    "mangled": "_Z15unknown_format",
    "demangled": null,
    "source": "none",
    "confidence": 0.3,
    "demangling_error": "Unrecognized mangling scheme"
  }
}
```
---
## 4. Callgraph Edge Types
### 4.1 Edge Type Enumeration
| Type | Description | Confidence |
|------|-------------|------------|
| `call` | Direct call instruction | 1.0 |
| `plt` | PLT/GOT indirect call | 0.95 |
| `indirect` | Indirect call (vtable, function pointer) | 0.6 |
| `init_array` | From init_array to function | 1.0 |
| `tls_callback` | TLS callback invocation | 1.0 |
| `exception` | Exception handler target | 0.8 |
| `switch` | Switch table target | 0.7 |
| `heuristic` | CFG-based heuristic | 0.4 |
### 4.2 Unknown Targets
When a call target cannot be resolved:
```json
{
  "unknowns": [
    {
      "id": "unknown:call:0x12345678",
      "type": "unresolved_call_target",
      "source_id": "sym:binary:abc...",
      "call_site": "0x12345678",
      "reason": "Indirect call through register"
    }
  ]
}
```
---
## 5. Performance Constraints
### 5.1 Size Limits
| Metric | Limit | Action on Exceed |
|--------|-------|------------------|
| Binary size | 100 MB | Warn, proceed |
| Symbol count | 1M symbols | Chunk processing |
| Edge count | 10M edges | Chunk output |
| Memory usage | 4 GB | Stream processing |
### 5.2 Timeout Constraints
| Operation | Timeout | Action on Exceed |
|-----------|---------|------------------|
| ELF parse | 60s | Fail with partial |
| Demangle all | 120s | Truncate results |
| CFG analysis | 300s | Skip heuristics |
| Total analysis | 600s | Fail gracefully |
---
## 6. Integration Points
### 6.1 Scanner Plugin Interface
```csharp
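using System.IO;
using System.Threading;
using System.Threading.Tasks;

public interface INativeAnalyzer : IAnalyzerPlugin
{
    // Analyzes one native binary and returns the observation document.
    Task<NativeObservationDocument> AnalyzeAsync(
        Stream binaryStream,
        NativeAnalyzerOptions options,
        CancellationToken ct);
}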
```
### 6.2 RichGraph Integration
Native analysis results feed into RichGraph:
```
NativeObservation → NativeReachabilityGraph → RichGraph nodes/edges
```
### 6.3 Signals Integration
Native symbols with runtime hits:
```
Signals runtime-facts + RichGraph → ReachabilityFact with confidence
```
---
## 7. Implementation Checklist
| Task | Status | Owner |
|------|--------|-------|
| ELF parser | Done | Scanner Guild |
| PE parser | Done | Scanner Guild |
| Mach-O parser | In Progress | Scanner Guild |
| C++ demangler | Done | Scanner Guild |
| Rust demangler | Pending | Scanner Guild |
| Callgraph builder | Done | Scanner Guild |
| Test fixtures | Partial | QA Guild |
| CI integration | Pending | DevOps Guild |
---
## 8. Related Documents
- [richgraph-v1 Contract](./richgraph-v1.md)
- [Build-ID Propagation](./buildid-propagation.md)
- [Init-Section Roots](./init-section-roots.md)
- [Binary Reachability Schema](../reachability/binary-reachability-schema.md)
---
## Changelog
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-13 | Platform Guild | Initial toolchain decision |