# DECISION-NATIVE-TOOLCHAIN-401: Native Lifter and Demangler Selection

> **Status:** Published
> **Version:** 1.0.0
> **Published:** 2025-12-13
> **Owners:** Scanner Guild, Platform Guild
> **Unblocks:** SCANNER-NATIVE-401-015, SCAN-REACH-401-009

## Decision Summary

This document records the toolchain decisions for native binary analysis. They enable implementation of native symbol extraction, callgraph generation, and demangling for ELF, PE, and Mach-O binaries.

---

## 1. Component Decisions

### 1.1 ELF Parser

**Decision:** Use a custom pure-C# ELF parser

**Rationale:**
- No native dependencies; portable across platforms
- Already implemented in `StellaOps.Scanner.Analyzers.Native`
- Sufficient for symbol table, dynamic section, and relocation parsing
- Avoids the licensing complexity of external libraries

**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/Elf/`

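To illustrate why a managed parser is feasible without native dependencies, the sketch below reads the 16-byte ELF identification block. `ElfIdent` and `ElfIdentReader` are hypothetical names invented for this example and are not the types in `StellaOps.Scanner.Analyzers.Native`; the byte offsets follow the ELF specification.

```csharp
using System;

// Hypothetical illustration; not the parser shipped in StellaOps.Scanner.Analyzers.Native.
public readonly record struct ElfIdent(bool Is64Bit, bool IsLittleEndian, byte OsAbi);

public static class ElfIdentReader
{
    // e_ident occupies the first 16 bytes of every ELF file.
    public static bool TryRead(ReadOnlySpan<byte> header, out ElfIdent ident)
    {
        ident = default;
        if (header.Length < 16) return false;

        // EI_MAG0..EI_MAG3 must be 0x7F 'E' 'L' 'F'.
        if (header[0] != 0x7F || header[1] != (byte)'E' ||
            header[2] != (byte)'L' || header[3] != (byte)'F')
            return false;

        byte elfClass = header[4]; // EI_CLASS: 1 = 32-bit, 2 = 64-bit
        byte data = header[5];     // EI_DATA: 1 = little-endian, 2 = big-endian
        if (elfClass is not (1 or 2) || data is not (1 or 2)) return false;

        // EI_OSABI lives at offset 7.
        ident = new ElfIdent(elfClass == 2, data == 1, header[7]);
        return true;
    }
}
```

Everything after the identification block (section headers, symbol tables, relocations) is read with the width and byte order discovered here.
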
### 1.2 PE Parser

**Decision:** Use a custom pure-C# PE parser

**Rationale:**
- No native dependencies
- Already implemented in `StellaOps.Scanner.Analyzers.Native`
- Handles import/export tables and the Debug directory
- Compatible with air-gapped deployment

**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/Pe/`

### 1.3 Mach-O Parser

**Decision:** Use a custom pure-C# Mach-O parser

**Rationale:**
- Consistent with the ELF/PE approach
- No native dependencies
- Sufficient for symbol table and load commands

**Implementation:** `src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Native/Internal/MachO/`

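With three managed parsers, something has to pick the right one for a given file. A minimal sketch of magic-number detection follows; `BinaryFormat` and `BinaryFormatSniffer` are hypothetical names, and how the real analyzer dispatches is not described in this document. The magic values themselves come from the ELF, PE/COFF, and Mach-O specifications.

```csharp
using System;
using System.Buffers.Binary;

public enum BinaryFormat { Unknown, Elf, Pe, MachO }

// Hypothetical helper for illustration only.
public static class BinaryFormatSniffer
{
    public static BinaryFormat Detect(ReadOnlySpan<byte> data)
    {
        if (data.Length < 4) return BinaryFormat.Unknown;

        uint magicBe = BinaryPrimitives.ReadUInt32BigEndian(data);
        if (magicBe == 0x7F454C46) return BinaryFormat.Elf; // 0x7F 'E' 'L' 'F'

        // Thin Mach-O: MH_MAGIC / MH_MAGIC_64 stored in either byte order.
        // Fat binaries (0xCAFEBABE) are left out because that magic collides
        // with Java class files and needs a second check.
        if (magicBe is 0xFEEDFACE or 0xFEEDFACF or 0xCEFAEDFE or 0xCFFAEDFE)
            return BinaryFormat.MachO;

        // PE: "MZ" DOS header, then e_lfanew at offset 0x3C points at "PE\0\0".
        if (data[0] == (byte)'M' && data[1] == (byte)'Z' && data.Length >= 0x40)
        {
            uint peOffset = BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(0x3C));
            if (peOffset <= (uint)data.Length - 4 &&
                BinaryPrimitives.ReadUInt32LittleEndian(data.Slice((int)peOffset)) == 0x00004550)
                return BinaryFormat.Pe;
        }

        return BinaryFormat.Unknown;
    }
}
```
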
### 1.4 Symbol Demangler

**Decision:** Use per-language managed demanglers with native fallback

| Language | Primary Demangler | Fallback |
|----------|-------------------|----------|
| C++ (Itanium ABI) | `Demangler.Net` (NuGet) | llvm-cxxfilt via P/Invoke |
| C++ (MSVC) | `UnDecorateSymbolName` wrapper | None (Windows-specific) |
| Rust | `rustc-demangle` port | rustfilt via P/Invoke |
| Swift | `swift-demangle` port | None |
| D | `dlang-demangler` port | None |

**Rationale:**
- Managed demanglers provide determinism and portability
- Native fallback only for edge cases
- No runtime dependency on external tools

**NuGet packages:**
```xml
<PackageReference Include="Demangler.Net" Version="1.0.0" />
```

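The primary/fallback arrangement in the table can be sketched as a chain that tries each demangler in order. `ISymbolDemangler`, `ManglingScheme`, and `DemanglerChain` are hypothetical names for this example; the real interfaces are not shown in this document. The prefix checks use well-known mangling prefixes (`_Z` for Itanium, `?` for MSVC, `_R` for Rust v0, `$s` or `_$s` for Swift) and return the source names used in section 3.2.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical abstraction over one demangler implementation.
public interface ISymbolDemangler
{
    string Source { get; } // e.g. "itanium-abi" or "fallback"
    bool TryDemangle(string mangled, out string demangled);
}

public static class ManglingScheme
{
    // Prefix-based guess. Legacy Rust symbols also start with "_ZN",
    // so this sketch reports them as Itanium.
    public static string Detect(string symbol)
    {
        if (symbol.StartsWith("_R", StringComparison.Ordinal)) return "rust";
        if (symbol.StartsWith("_Z", StringComparison.Ordinal)) return "itanium-abi";
        if (symbol.StartsWith("?", StringComparison.Ordinal)) return "msvc";
        if (symbol.StartsWith("$s", StringComparison.Ordinal) ||
            symbol.StartsWith("_$s", StringComparison.Ordinal)) return "swift";
        return "none";
    }
}

public sealed class DemanglerChain
{
    private readonly IReadOnlyList<ISymbolDemangler> _demanglers;

    // Pass the primary demangler first, then any fallbacks.
    public DemanglerChain(params ISymbolDemangler[] demanglers) => _demanglers = demanglers;

    public (string? Demangled, string Source) Demangle(string mangled)
    {
        foreach (var demangler in _demanglers)
        {
            if (demangler.TryDemangle(mangled, out var result))
                return (result, demangler.Source);
        }
        return (null, "none"); // nothing recognized the symbol
    }
}
```

Keeping the chain order fixed is one way to keep output deterministic: the same symbol is always handled by the first demangler that accepts it.
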
### 1.5 Disassembler (Optional, for heuristic analysis)

**Decision:** Use Iced (x86/x64) + Capstone.NET (ARM/others)

| Architecture | Library | NuGet Package |
|--------------|---------|---------------|
| x86/x64 | Iced | `Iced` |
| ARM/ARM64 | Capstone.NET | `Capstone.NET` |
| Other | Skip disassembly | N/A |

**Rationale:**
- Iced is pure managed code, so x86/x64 needs no native dependencies
- Capstone.NET wraps the native Capstone library
- Disassembly is optional and used only for heuristic edge detection

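The table reduces to a small selection function. The sketch below is illustrative only: the enum and class names are hypothetical, and treating a missing Capstone native library as "skip disassembly" is an assumption based on disassembly being optional.

```csharp
public enum CpuArchitecture { X86, X64, Arm, Arm64, Other }

public enum DisassemblerBackend { None, Iced, Capstone }

// Hypothetical selector mirroring the architecture table.
public static class DisassemblerSelector
{
    public static DisassemblerBackend Select(CpuArchitecture arch, bool capstoneNativeLibAvailable) =>
        arch switch
        {
            // Iced is fully managed, so it is always usable for x86/x64.
            CpuArchitecture.X86 or CpuArchitecture.X64 => DisassemblerBackend.Iced,

            // Capstone needs its native library; without it, fall through to None (assumption).
            CpuArchitecture.Arm or CpuArchitecture.Arm64 when capstoneNativeLibAvailable
                => DisassemblerBackend.Capstone,

            _ => DisassemblerBackend.None,
        };
}
```
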
### 1.6 Callgraph Extraction

**Decision:** Static analysis only (no dynamic execution)

**Methods:**
1. Relocation-based: Extract call targets from relocations
2. Import/Export: Map import references to exports
3. Symbol-based: Direct and indirect call targets from the symbol table
4. CFG heuristics: Basic block boundary detection (x86 only)

**No dynamic analysis:** Avoids execution risks and keeps the analysis portable.

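Method 2 above (import/export mapping) can be sketched as a dictionary lookup across the binaries in a scan. The record and class names are hypothetical, and the "first exporter wins" rule is a simplification chosen for this example; real loaders apply search-order rules that are not modeled here.

```csharp
using System;
using System.Collections.Generic;

public sealed record BinarySymbols(
    string Name,
    IReadOnlyCollection<string> Imports,
    IReadOnlyCollection<string> Exports);

public sealed record ImportEdge(string FromBinary, string ToBinary, string Symbol);

// Hypothetical resolver for illustration only.
public static class ImportExportResolver
{
    public static List<ImportEdge> Resolve(
        IReadOnlyList<BinarySymbols> binaries,
        out List<(string Binary, string Symbol)> unresolved)
    {
        // Index every exported symbol by name; the first exporter seen wins.
        var exporters = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (var binary in binaries)
            foreach (var export in binary.Exports)
                exporters.TryAdd(export, binary.Name);

        var edges = new List<ImportEdge>();
        unresolved = new List<(string Binary, string Symbol)>();

        foreach (var binary in binaries)
        {
            foreach (var import in binary.Imports)
            {
                if (exporters.TryGetValue(import, out var target))
                    edges.Add(new ImportEdge(binary.Name, target, import));
                else
                    unresolved.Add((binary.Name, import)); // candidate "unknown" entry
            }
        }
        return edges;
    }
}
```
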
---

## 2. CI Toolchain Requirements

### 2.1 Build Requirements

| Component | Requirement | Notes |
|-----------|-------------|-------|
| .NET SDK | 10.0+ | Required for all builds |
| Native libs (optional) | Capstone 4.0+ | Only for ARM disassembly |
| Test binaries | Pre-built fixtures | No compiler dependency in CI |

### 2.2 Test Fixture Strategy

**Decision:** Ship pre-built binary fixtures, not source + compiler

**Rationale:**
- Deterministic: Same binary hash every run
- No compiler dependency in CI
- Smaller CI image footprint
- Cross-platform: Same fixtures on all runners

**Fixture locations:**
```
tests/Binary/fixtures/
  elf-x86_64/
    binary.elf          # Pre-built
    expected.json       # Expected graph
    expected-hashes.txt # Determinism check
  pe-x64/
    binary.exe
    expected.json
  macho-arm64/
    binary.dylib
    expected.json
```

### 2.3 Fixture Generation (Offline)

Fixtures are generated offline by maintainers:

```bash
# Generate ELF fixture (run once, commit result)
cd tools/fixtures
./generate-elf-fixture.sh

# Verify hashes match
./verify-fixtures.sh
```

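The determinism check behind `expected-hashes.txt` could look roughly like the sketch below. The file format is not specified in this document, so the example assumes `sha256sum`-style lines (`<hex digest>  <relative path>`); `FixtureHashCheck` is a hypothetical name and does not describe what `verify-fixtures.sh` actually does.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical verifier; assumes sha256sum-style "<hex digest>  <relative path>" lines.
public static class FixtureHashCheck
{
    public static string Sha256Hex(string path)
    {
        using var stream = File.OpenRead(path);
        return Convert.ToHexString(SHA256.HashData(stream)).ToLowerInvariant();
    }

    // Returns true only if every listed file matches its recorded digest.
    public static bool Verify(string fixtureDir, string hashesFile)
    {
        foreach (var line in File.ReadLines(hashesFile))
        {
            int space = line.IndexOf(' ');
            if (space <= 0) continue; // skip blank or malformed lines

            string expected = line[..space];
            string relativePath = line[(space + 1)..].Trim();

            string actual = Sha256Hex(Path.Combine(fixtureDir, relativePath));
            if (!string.Equals(actual, expected, StringComparison.OrdinalIgnoreCase))
                return false;
        }
        return true;
    }
}
```

A test could call `FixtureHashCheck.Verify` on each fixture directory before comparing analyzer output against `expected.json`, so a silently modified binary fails fast.
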
---

## 3. Demangling Contract

### 3.1 Output Format

Demangled names follow this format:

```json
{
  "symbol": {
    "mangled": "_ZN4Curl7Session4readEv",
    "demangled": "Curl::Session::read()",
    "source": "itanium-abi",
    "confidence": 1.0
  }
}
```

### 3.2 Demangling Sources

| Source | Description | Confidence |
|--------|-------------|------------|
| `itanium-abi` | Itanium C++ ABI (GCC/Clang) | 1.0 |
| `msvc` | Microsoft Visual C++ | 1.0 |
| `rust` | Rust mangling | 1.0 |
| `swift` | Swift mangling | 1.0 |
| `fallback` | Native tool fallback | 0.9 |
| `heuristic` | Pattern-based guess | 0.6 |
| `none` | No demangling available | 0.3 |

### 3.3 Failed Demangling

When demangling fails, `demangled` is `null`, `source` is `none`, and the reason is reported in `demangling_error`:

```json
{
  "symbol": {
    "mangled": "_Z15unknown_format",
    "demangled": null,
    "source": "none",
    "confidence": 0.3,
    "demangling_error": "Unrecognized mangling scheme"
  }
}
```

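One way to model the contract in sections 3.1 to 3.3 is a record with explicit JSON property names and a lookup for the confidence table. This is a sketch under assumptions: `DemangledSymbol` and `DemanglingContract` are hypothetical names, and omitting `demangling_error` on success is inferred from the two examples rather than stated as a rule.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical model of the "symbol" object.
public sealed record DemangledSymbol(
    [property: JsonPropertyName("mangled")] string Mangled,
    [property: JsonPropertyName("demangled")] string? Demangled,
    [property: JsonPropertyName("source")] string Source,
    [property: JsonPropertyName("confidence")] double Confidence,
    [property: JsonPropertyName("demangling_error"),
               JsonIgnore(Condition = JsonIgnoreCondition.WhenWritingNull)]
    string? DemanglingError = null);

public static class DemanglingContract
{
    // Confidence per source, copied from the table in section 3.2.
    private static readonly Dictionary<string, double> ConfidenceBySource = new()
    {
        ["itanium-abi"] = 1.0, ["msvc"] = 1.0, ["rust"] = 1.0, ["swift"] = 1.0,
        ["fallback"] = 0.9, ["heuristic"] = 0.6, ["none"] = 0.3,
    };

    public static DemangledSymbol Success(string mangled, string demangled, string source) =>
        new(mangled, demangled, source, ConfidenceBySource[source]);

    public static DemangledSymbol Failure(string mangled, string error) =>
        new(mangled, null, "none", ConfidenceBySource["none"], error);

    // Wraps the record in the outer { "symbol": ... } object.
    public static string ToJson(DemangledSymbol symbol) =>
        JsonSerializer.Serialize(new Dictionary<string, DemangledSymbol> { ["symbol"] = symbol });
}
```

Note that `System.Text.Json` writes a whole-number `double` such as `1.0` as `1`, so byte-for-byte agreement with the examples above would need a custom converter.
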
---

## 4. Callgraph Edge Types

### 4.1 Edge Type Enumeration

| Type | Description | Confidence |
|------|-------------|------------|
| `call` | Direct call instruction | 1.0 |
| `plt` | PLT/GOT indirect call | 0.95 |
| `indirect` | Indirect call (vtable, function pointer) | 0.6 |
| `init_array` | From init_array to function | 1.0 |
| `tls_callback` | TLS callback invocation | 1.0 |
| `exception` | Exception handler target | 0.8 |
| `switch` | Switch table target | 0.7 |
| `heuristic` | CFG-based heuristic | 0.4 |

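In code, the table maps naturally onto an enum plus a lookup that returns the wire name and confidence together. The names `NativeEdgeType` and `NativeEdgeTypes` are hypothetical; only the wire names and confidence values come from the table.

```csharp
using System;

public enum NativeEdgeType
{
    Call, Plt, Indirect, InitArray, TlsCallback, Exception, Switch, Heuristic
}

// Hypothetical lookup mirroring the edge type table.
public static class NativeEdgeTypes
{
    public static (string WireName, double Confidence) Describe(NativeEdgeType type) => type switch
    {
        NativeEdgeType.Call => ("call", 1.0),
        NativeEdgeType.Plt => ("plt", 0.95),
        NativeEdgeType.Indirect => ("indirect", 0.6),
        NativeEdgeType.InitArray => ("init_array", 1.0),
        NativeEdgeType.TlsCallback => ("tls_callback", 1.0),
        NativeEdgeType.Exception => ("exception", 0.8),
        NativeEdgeType.Switch => ("switch", 0.7),
        NativeEdgeType.Heuristic => ("heuristic", 0.4),
        _ => throw new ArgumentOutOfRangeException(nameof(type)),
    };
}
```

Keeping both values in one switch means a new edge type cannot be added with a wire name but no confidence.
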
### 4.2 Unknown Targets

When a call target cannot be resolved, the call site is recorded under `unknowns`:

```json
{
  "unknowns": [
    {
      "id": "unknown:call:0x12345678",
      "type": "unresolved_call_target",
      "source_id": "sym:binary:abc...",
      "call_site": "0x12345678",
      "reason": "Indirect call through register"
    }
  ]
}
```

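A small factory keeps the `id` and `call_site` fields consistent with each other. The sketch below is hypothetical: `UnknownCallTarget` is an invented name, and zero-padding the address to eight hex digits is an assumption drawn from the single example above.

```csharp
using System.Text.Json.Serialization;

// Hypothetical model of one entry in the "unknowns" array.
public sealed record UnknownCallTarget(
    [property: JsonPropertyName("id")] string Id,
    [property: JsonPropertyName("type")] string Type,
    [property: JsonPropertyName("source_id")] string SourceId,
    [property: JsonPropertyName("call_site")] string CallSite,
    [property: JsonPropertyName("reason")] string Reason)
{
    public static UnknownCallTarget ForCallSite(string sourceId, ulong address, string reason)
    {
        // Lowercase hex, at least 8 digits (assumed formatting).
        string site = $"0x{address:x8}";
        return new UnknownCallTarget(
            $"unknown:call:{site}", "unresolved_call_target", sourceId, site, reason);
    }
}
```
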
---

## 5. Performance Constraints

### 5.1 Size Limits

| Metric | Limit | Action on Exceed |
|--------|-------|------------------|
| Binary size | 100 MB | Warn, proceed |
| Symbol count | 1M symbols | Chunk processing |
| Edge count | 10M edges | Chunk output |
| Memory usage | 4 GB | Stream processing |

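The size limits can be evaluated up front to decide which mitigations apply. The sketch below is illustrative: the type names are hypothetical, and interpreting MB and GB as binary units (1024-based) is an assumption the table does not settle.

```csharp
using System;

[Flags]
public enum LimitAction
{
    None = 0,
    Warn = 1,
    ChunkProcessing = 2,
    ChunkOutput = 4,
    StreamProcessing = 8,
}

// Hypothetical evaluator mirroring the size limit table.
public static class SizeLimits
{
    public const long MaxBinaryBytes = 100L * 1024 * 1024;      // 100 MB (binary units assumed)
    public const long MaxSymbols = 1_000_000;
    public const long MaxEdges = 10_000_000;
    public const long MaxMemoryBytes = 4L * 1024 * 1024 * 1024; // 4 GB (binary units assumed)

    // Several limits can be exceeded at once, so the result is a flag set.
    public static LimitAction Evaluate(long binaryBytes, long symbols, long edges, long memoryBytes)
    {
        var action = LimitAction.None;
        if (binaryBytes > MaxBinaryBytes) action |= LimitAction.Warn;
        if (symbols > MaxSymbols) action |= LimitAction.ChunkProcessing;
        if (edges > MaxEdges) action |= LimitAction.ChunkOutput;
        if (memoryBytes > MaxMemoryBytes) action |= LimitAction.StreamProcessing;
        return action;
    }
}
```
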
### 5.2 Timeout Constraints

| Operation | Timeout | Action on Exceed |
|-----------|---------|------------------|
| ELF parse | 60s | Fail with partial |
| Demangle all | 120s | Truncate results |
| CFG analysis | 300s | Skip heuristics |
| Total analysis | 600s | Fail gracefully |

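Per-operation timeouts compose well with a linked `CancellationTokenSource`: each stage gets its own deadline while the caller's token (for example, the 600s overall budget) still applies. The helper below is a sketch; `StageTimeouts` is a hypothetical name, and returning a caller-supplied fallback on timeout is one possible reading of actions such as "Skip heuristics".

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for running one analysis stage under its own timeout.
public static class StageTimeouts
{
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> stage,
        TimeSpan timeout,
        T onTimeout,
        CancellationToken ct)
    {
        // The linked source is cancelled by the caller's token or by the stage deadline.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);
        try
        {
            return await stage(cts.Token);
        }
        catch (OperationCanceledException) when (!ct.IsCancellationRequested)
        {
            // Only the stage deadline fired: degrade instead of failing the whole analysis.
            return onTimeout;
        }
    }
}
```

If the caller's token is cancelled, the exception filter does not match and the cancellation propagates, which keeps the overall budget authoritative.
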
---

## 6. Integration Points

### 6.1 Scanner Plugin Interface

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Scanner plugin contract for native binary analysis.
public interface INativeAnalyzer : IAnalyzerPlugin
{
    // Analyzes one binary read from binaryStream; ct cancels the operation.
    Task<NativeObservationDocument> AnalyzeAsync(
        Stream binaryStream,
        NativeAnalyzerOptions options,
        CancellationToken ct);
}
```

### 6.2 RichGraph Integration

Native analysis results feed into RichGraph:

```
NativeObservation → NativeReachabilityGraph → RichGraph nodes/edges
```

### 6.3 Signals Integration

Native symbols with runtime hits are combined with RichGraph to produce reachability facts:

```
Signals runtime-facts + RichGraph → ReachabilityFact with confidence
```

---

## 7. Implementation Checklist

| Task | Status | Owner |
|------|--------|-------|
| ELF parser | Done | Scanner Guild |
| PE parser | Done | Scanner Guild |
| Mach-O parser | In Progress | Scanner Guild |
| C++ demangler | Done | Scanner Guild |
| Rust demangler | Pending | Scanner Guild |
| Callgraph builder | Done | Scanner Guild |
| Test fixtures | Partial | QA Guild |
| CI integration | Pending | DevOps Guild |

---

## 8. Related Documents

- [richgraph-v1 Contract](./richgraph-v1.md)
- [Build-ID Propagation](./buildid-propagation.md)
- [Init-Section Roots](./init-section-roots.md)
- [Binary Reachability Schema](../reachability/binary-reachability-schema.md)

---

## Changelog

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-13 | Platform Guild | Initial toolchain decision |