# Disassembly and binary analysis pipeline

## Module
BinaryIndex

## Status
VERIFIED

## Description
Pluggable disassembly framework with Ghidra integration (BSim and version tracking) that provides binary analysis capabilities.

## Implementation Details
- **Modules**:
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Abstractions/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/`
  - `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/`
- **Key Classes**:
  - `DisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyService.cs`) - core disassembly orchestrator
  - `HybridDisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/HybridDisassemblyService.cs`) - multi-backend hybrid disassembly with quality-based plugin selection
  - `DisassemblyPluginRegistry` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyPluginRegistry.cs`) - manages registered disassembly plugins
  - `BinaryFormatDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/BinaryFormatDetector.cs`) - detects ELF/PE/Mach-O format from binary headers
  - `B2R2DisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2DisassemblyPlugin.cs`) - B2R2 backend with architecture mapping, instruction mapping, operand parsing
  - `B2R2LowUirLiftingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LowUirLiftingService.cs`) - lifts machine code to LowUIR intermediate representation with SSA transformation
  - `B2R2LifterPool` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPool.cs`) - object pool for B2R2 lifter instances with warm preloading
  - `IcedDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/IcedDisassemblyPlugin.cs`) - Iced x86/x64 disassembler plugin
  - `GhidraDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/Services/GhidraDisassemblyPlugin.cs`) - Ghidra integration
  - `GhidraDecompilerAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/GhidraDecompilerAdapter.cs`) - Ghidra decompilation with AST comparison
- **Abstractions**: `IDisassemblyPlugin`, `IDisassemblyPluginRegistry`, and `IDisassemblyService`, with models such as `BinaryFormat`, `CpuArchitecture`, `DisassembledInstruction`, and `InstructionKind`.
- **Decompiler**: Full AST comparison engine with a recursive parser, a code normalizer, and semantic equivalence checking
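
The header-based detection attributed to `BinaryFormatDetector` can be illustrated with the well-known magic numbers of each format. This is a minimal sketch in Python, not the C# implementation: the `detect_format` function and the Python `BinaryFormat` enum are hypothetical names chosen for illustration, and the real class may inspect more of the header than this.

```python
import struct
from enum import Enum


class BinaryFormat(Enum):
    """Illustrative stand-in for the `BinaryFormat` model (hypothetical values)."""
    ELF = "elf"
    PE = "pe"
    MACHO = "macho"
    UNKNOWN = "unknown"


# Mach-O magic numbers as they appear in the first four bytes on disk:
# thin 32-bit and 64-bit images in either byte order, plus fat (universal) headers.
_MACHO_MAGICS = {
    bytes.fromhex("feedface"), bytes.fromhex("cefaedfe"),
    bytes.fromhex("feedfacf"), bytes.fromhex("cffaedfe"),
    bytes.fromhex("cafebabe"), bytes.fromhex("bebafeca"),
}


def detect_format(data: bytes) -> BinaryFormat:
    """Classify a binary by its leading magic bytes (sketch, assumptions above)."""
    # ELF: 0x7F followed by the ASCII letters "ELF".
    if data[:4] == b"\x7fELF":
        return BinaryFormat.ELF
    if data[:4] in _MACHO_MAGICS:
        return BinaryFormat.MACHO
    # PE: DOS stub starts with "MZ"; the 32-bit little-endian field at 0x3C
    # points at the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return BinaryFormat.PE
    return BinaryFormat.UNKNOWN
```

Note that `0xCAFEBABE` is shared with Java class files, so a production detector would likely need a further check on the fat header before reporting Mach-O.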
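
One way to picture the quality-based plugin selection mentioned for `HybridDisassemblyService` is sketched below. Everything here is an assumption made for illustration: the `Plugin` and `DisassemblyResult` types, the priority ordering, the "fraction of bytes decoded" quality metric, and the `min_quality` threshold are hypothetical and are not taken from the C# source.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Optional


@dataclass
class DisassemblyResult:
    plugin_name: str
    decoded_bytes: int  # bytes covered by successfully decoded instructions
    total_bytes: int

    @property
    def quality(self) -> float:
        # Hypothetical metric: share of the input that decoded cleanly.
        return self.decoded_bytes / self.total_bytes if self.total_bytes else 0.0


@dataclass
class Plugin:
    name: str
    architectures: FrozenSet[str]
    priority: int  # higher priority is tried first
    disassemble: Callable[[bytes], int]  # returns the decoded byte count


def select_best(plugins: Iterable[Plugin], arch: str, code: bytes,
                min_quality: float = 0.9) -> Optional[DisassemblyResult]:
    """Try plugins in priority order and accept the first result that meets
    the threshold; otherwise fall back to the highest-quality result seen."""
    best: Optional[DisassemblyResult] = None
    for plugin in sorted(plugins, key=lambda p: -p.priority):
        if arch not in plugin.architectures:
            continue  # plugin cannot handle this architecture
        result = DisassemblyResult(plugin.name, plugin.disassemble(code), len(code))
        if result.quality >= min_quality:
            return result
        if best is None or result.quality > best.quality:
            best = result
    return best  # None when no plugin supports the architecture
```

With this policy, a high-priority plugin that decodes only half of the input is passed over in favor of a lower-priority plugin that decodes all of it, which is the behavior the hybrid approach is meant to provide.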
## E2E Test Plan
- [ ] Load an x86-64 ELF binary via `HybridDisassemblyService` and verify disassembly produces valid instructions
- [ ] Verify `BinaryFormatDetector` correctly identifies ELF, PE, and Mach-O formats
- [ ] Verify B2R2 plugin handles architecture mapping for x86, x64, ARM, AArch64
- [ ] Verify B2R2 LowUIR lifting produces valid IR with SSA form
- [ ] Verify Iced plugin disassembles x86/x64 instructions correctly
- [ ] Verify `B2R2LifterPool` warm preloading and pool size management
- [ ] Verify Ghidra decompiler adapter produces comparable ASTs via `AstComparisonEngine`
- [ ] Verify hybrid disassembly quality scoring selects the best plugin for each binary
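
The pool checks in the plan above (warm preloading and pool size management) can be pictured with a generic bounded object pool. This is a sketch under stated assumptions: `WarmPool`, `warm_count`, `max_size`, and the `created` counter are hypothetical names, and the behavior of `B2R2LifterPool` itself is not confirmed by this document beyond "object pool with warm preloading".

```python
import queue
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class WarmPool(Generic[T]):
    """Bounded pool that creates `warm_count` instances up front (hypothetical)."""

    def __init__(self, factory: Callable[[], T], max_size: int, warm_count: int = 0):
        if max_size < 1 or not 0 <= warm_count <= max_size:
            raise ValueError("need max_size >= 1 and 0 <= warm_count <= max_size")
        self._factory = factory
        self._idle: "queue.LifoQueue[T]" = queue.LifoQueue(maxsize=max_size)
        self.created = 0  # illustration only; not synchronized across threads
        for _ in range(warm_count):
            self._idle.put_nowait(self._create())  # warm preloading

    def _create(self) -> T:
        self.created += 1
        return self._factory()

    def acquire(self) -> T:
        """Reuse an idle instance if one exists, otherwise create a new one."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            return self._create()

    def release(self, item: T) -> None:
        """Return an instance; extras beyond `max_size` are simply dropped."""
        try:
            self._idle.put_nowait(item)
        except queue.Full:
            pass

    @property
    def idle_count(self) -> int:
        return self._idle.qsize()
```

A test in this style would assert that the pool holds `warm_count` idle instances right after construction, that acquiring beyond the warm set creates new instances on demand, and that releasing more than `max_size` instances never grows the idle set past the limit.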
## Verification
- Tier 0/1/2 artifacts: `docs/qa/feature-checks/runs/binaryindex/disassembly-and-binary-analysis-pipeline/run-001/`.
- Result: verified.
- Evidence summary:
  - `tier1-test-disassembly.log`: Passed 45/45.
  - `tier1-test-ghidra-retest.log`: Passed 122/122.
  - `tier1-test-decompiler-retest.log`: Passed 35/35.
  - `tier2-test-disassembly.log`: Passed 45/45.
  - `tier2-test-ghidra.log`: Passed 122/122.
  - `tier2-test-decompiler.log`: Passed 35/35.
- Note: the initial Ghidra/Decompiler `--no-build` checks produced `Invalid TargetPath`; the checks were rerun with a build, and the final passing evidence was captured.