git.stella-ops.org/docs/features/checked/binaryindex/disassembly-and-binary-analysis-pipeline.md
2026-02-12 10:27:23 +02:00

# Disassembly and binary analysis pipeline
## Module
BinaryIndex
## Status
VERIFIED
## Description
Pluggable disassembly framework for binary analysis, with Ghidra integration (BSim and version tracking).
## Implementation Details
- **Modules**:
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Abstractions/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/`
- **Key Classes**:
  - `DisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyService.cs`) - core disassembly orchestrator
  - `HybridDisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/HybridDisassemblyService.cs`) - multi-backend hybrid disassembly with quality-based plugin selection
  - `DisassemblyPluginRegistry` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyPluginRegistry.cs`) - manages registered disassembly plugins
  - `BinaryFormatDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/BinaryFormatDetector.cs`) - detects ELF/PE/Mach-O format from binary headers
  - `B2R2DisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2DisassemblyPlugin.cs`) - B2R2 backend with architecture mapping, instruction mapping, and operand parsing
  - `B2R2LowUirLiftingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LowUirLiftingService.cs`) - lifts machine code to LowUIR intermediate representation with SSA transformation
  - `B2R2LifterPool` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPool.cs`) - object pool for B2R2 lifter instances with warm preloading
  - `IcedDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/IcedDisassemblyPlugin.cs`) - Iced x86/x64 disassembler plugin
  - `GhidraDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/Services/GhidraDisassemblyPlugin.cs`) - Ghidra integration
  - `GhidraDecompilerAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/GhidraDecompilerAdapter.cs`) - Ghidra decompilation with AST comparison
- **Abstractions**: `IDisassemblyPlugin`, `IDisassemblyPluginRegistry`, `IDisassemblyService` with models for `BinaryFormat`, `CpuArchitecture`, `DisassembledInstruction`, `InstructionKind`, etc.
- **Decompiler**: Full AST comparison engine with a recursive parser, a code normalizer, and semantic equivalence checking
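
As an illustration of the header-based detection attributed to `BinaryFormatDetector` above, the following is a minimal Python sketch of magic-byte classification. It is not the C# implementation: the `detect_format` function and the enum members are hypothetical names chosen for this example, and only the well-known magic values (ELF `\x7fELF`, DOS `MZ` plus the `PE\0\0` signature located via the offset at `0x3C`, and the thin Mach-O magics) are taken from the public format definitions. Fat/universal Mach-O files are deliberately left out of the sketch.

```python
import struct
from enum import Enum


class BinaryFormat(Enum):
    # Member names are assumptions made for this sketch.
    UNKNOWN = "unknown"
    ELF = "elf"
    PE = "pe"
    MACHO = "macho"


# Thin Mach-O magics as they appear in the first four bytes of a file:
# 32-bit and 64-bit, in both byte orders.
_MACHO_MAGICS = {
    bytes.fromhex("feedface"),
    bytes.fromhex("feedfacf"),
    bytes.fromhex("cefaedfe"),
    bytes.fromhex("cffaedfe"),
}


def detect_format(data: bytes) -> BinaryFormat:
    """Classify a binary by its header bytes (illustrative sketch only)."""
    if data[:4] == b"\x7fELF":
        return BinaryFormat.ELF
    if data[:4] in _MACHO_MAGICS:
        return BinaryFormat.MACHO
    # PE: the DOS header starts with "MZ"; the 32-bit little-endian value
    # at offset 0x3C points to the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return BinaryFormat.PE
    return BinaryFormat.UNKNOWN
```

A caller would pass the first bytes of a file, for example `detect_format(open(path, "rb").read(4096))`; truncated or unrecognized input falls through to `UNKNOWN` rather than raising.
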
## E2E Test Plan
- [ ] Load an x86-64 ELF binary via `HybridDisassemblyService` and verify disassembly produces valid instructions
- [ ] Verify `BinaryFormatDetector` correctly identifies ELF, PE, and Mach-O formats
- [ ] Verify B2R2 plugin handles architecture mapping for x86, x64, ARM, AArch64
- [ ] Verify B2R2 LowUIR lifting produces valid IR with SSA form
- [ ] Verify Iced plugin disassembles x86/x64 instructions correctly
- [ ] Verify `B2R2LifterPool` warm preloading and pool size management
- [ ] Verify Ghidra decompiler adapter produces comparable ASTs via `AstComparisonEngine`
- [ ] Verify hybrid disassembly quality scoring selects the best plugin for each binary
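
The last check in the plan above concerns quality-based plugin selection. The source does not describe how `HybridDisassemblyService` computes quality, so the sketch below assumes a simple metric (the fraction of input bytes that decoded into instructions) purely to show the shape of such a selection. `DisassemblyResult`, `select_best`, and `min_quality` are hypothetical names for this Python illustration, not the project's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class DisassemblyResult:
    plugin_name: str
    decoded_bytes: int  # bytes covered by successfully decoded instructions
    total_bytes: int    # size of the code region handed to the plugin

    @property
    def quality(self) -> float:
        # Assumed metric: fraction of the region that decoded cleanly.
        return self.decoded_bytes / self.total_bytes if self.total_bytes else 0.0


# A "plugin" here is just a callable; returning None means "unsupported input".
Plugin = Callable[[bytes], Optional[DisassemblyResult]]


def select_best(
    code: bytes,
    plugins: Sequence[Plugin],
    min_quality: float = 0.5,
) -> Optional[DisassemblyResult]:
    """Run every plugin and keep the highest-quality result above a floor."""
    results = [r for p in plugins if (r := p(code)) is not None]
    acceptable = [r for r in results if r.quality >= min_quality]
    return max(acceptable, key=lambda r: r.quality, default=None)
```

A test for this check could register two stub plugins with known decode coverage and assert that the higher-scoring one is returned, and that `None` is returned when no plugin reaches the floor.
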
## Verification
- Tier 0/1/2 artifacts: `docs/qa/feature-checks/runs/binaryindex/disassembly-and-binary-analysis-pipeline/run-001/`.
- Result: verified.
- Evidence summary:
  - `tier1-test-disassembly.log`: Passed 45/45.
  - `tier1-test-ghidra-retest.log`: Passed 122/122.
  - `tier1-test-decompiler-retest.log`: Passed 35/35.
  - `tier2-test-disassembly.log`: Passed 45/45.
  - `tier2-test-ghidra.log`: Passed 122/122.
  - `tier2-test-decompiler.log`: Passed 35/35.
- Note: the initial Ghidra and Decompiler checks run with `--no-build` failed with `Invalid TargetPath`; both were rerun with a build, and the final passing evidence was captured from the reruns.