# Semantic Analysis Library (IR Lifting and Function Fingerprinting)
## Module
BinaryIndex
## Status
IMPLEMENTED
## Description
Provides semantic binary analysis for function-level binary comparison: IR lifting, key-semantics graph extraction, function fingerprint generation, semantic matching, and call n-gram generation.
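To make the call n-gram idea concrete, here is a minimal Python sketch of call-sequence n-grams. It is illustrative only and is not the `CallNgramGenerator` implementation: the function name `call_ngrams`, the default `n = 2`, and the use of callee names as tokens are all assumptions.

```python
from collections import Counter
from typing import Sequence


def call_ngrams(calls: Sequence[str], n: int = 2) -> Counter:
    """Count every contiguous run of n callee names, in call order.

    Hypothetical helper for illustration; not part of the library.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty range, hence an empty Counter.
    return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))
```

For the call sequence `malloc, memcpy, free`, the bigrams are `(malloc, memcpy)` and `(memcpy, free)`. Counting occurrences instead of keeping a plain set preserves how often a call pattern repeats, which can matter when two functions call the same helpers with different frequency.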
## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
- **Key Classes**:
  - `IrLiftingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/IrLiftingService.cs`) - lifts disassembled instructions to deterministic IR/SSA models (with B2R2-specific lifting types available under `Lifting/`)
  - `SemanticFingerprintGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs`) - generates `SemanticFingerprint` using Weisfeiler-Lehman graph hashing (`KsgWeisfeilerLehmanV1` algorithm)
  - `SemanticGraphExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticGraphExtractor.cs`) - extracts key-semantics graphs (KSG) from lifted IR
  - `SemanticMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticMatcher.cs`) - matches semantic fingerprints for similarity scoring
  - `CallNgramGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs`) - call-sequence n-gram fingerprinting
  - `WeisfeilerLehmanHasher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/WeisfeilerLehmanHasher.cs`) - WL graph hash implementation
  - `GraphCanonicalizer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/GraphCanonicalizer.cs`) - graph canonicalization for deterministic hashing
- **Models**: `FingerprintModels` (`SemanticFingerprint`, `SemanticFingerprintOptions`, `SemanticFingerprintAlgorithm`), `GraphModels` (`KeySemanticsGraph`), `IrModels` (`LiftedFunction`, `IrStatement`)
- **Interfaces**: `IIrLiftingService`, `ISemanticFingerprintGenerator`, `ISemanticGraphExtractor`, `ISemanticMatcher`
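The Weisfeiler-Lehman idea behind the fingerprint can be sketched in a few lines. The Python below is a generic WL refinement over a labeled directed graph, not the `WeisfeilerLehmanHasher` or `KsgWeisfeilerLehmanV1` code: the function `wl_graph_hash`, the SHA-256 digest, the three-iteration default, and the graph encoding (a node-label map plus an edge list) are assumptions chosen for illustration.

```python
import hashlib
from typing import Dict, List, Tuple


def _digest(text: str) -> str:
    """Stable hex digest, so results do not depend on Python's hash seed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def wl_graph_hash(
    labels: Dict[str, str],
    edges: List[Tuple[str, str]],
    iterations: int = 3,
) -> str:
    """Hash a labeled directed graph so the result ignores node ids.

    Hypothetical helper for illustration; not part of the library.
    """
    preds: Dict[str, List[str]] = {node: [] for node in labels}
    succs: Dict[str, List[str]] = {node: [] for node in labels}
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)

    # Round 0: a node's color depends only on its own label.
    colors = {node: _digest(label) for node, label in labels.items()}
    history = sorted(colors.values())

    for _ in range(iterations):
        new_colors = {}
        for node in labels:
            # Sorting neighbor colors removes any dependence on edge order.
            incoming = ",".join(sorted(colors[p] for p in preds[node]))
            outgoing = ",".join(sorted(colors[s] for s in succs[node]))
            new_colors[node] = _digest(
                colors[node] + "|in:" + incoming + "|out:" + outgoing
            )
        colors = new_colors
        # Sorting node colors removes any dependence on node ids.
        history.extend(sorted(colors.values()))

    return _digest(";".join(history))
```

Because every step sorts before hashing, renaming nodes or reordering edges leaves the result unchanged, while changing a label or an edge changes it. WL refinement does not separate every pair of non-isomorphic graphs (regular graphs with identical labels are a known weak spot), so equal hashes indicate likely, not guaranteed, structural equivalence.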
## E2E Test Plan
- [ ] Lift a binary function to IR via `IrLiftingService` and verify IR structure contains valid statements
- [ ] Generate a semantic fingerprint via `SemanticFingerprintGenerator` and verify the hash is deterministic (identical across repeated runs on the same input)
- [ ] Extract a key-semantics graph via `SemanticGraphExtractor` and verify node/edge structure
- [ ] Match two fingerprints of the same function built with different compilers via `SemanticMatcher` and verify a high similarity score
- [ ] Verify that the Weisfeiler-Lehman graph hash differs for structurally different functions
- [ ] Verify `GraphCanonicalizer` produces consistent canonical forms for isomorphic graphs
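For the matching item in the checklist above, "high similarity" needs a concrete score before it can be asserted. One common choice is Jaccard similarity over sets of feature hashes; the sketch below assumes that representation and a hypothetical threshold of 0.8. How `SemanticMatcher` actually scores fingerprints is not stated here, so treat this as a stand-in for shaping the test, not as the library's behavior.

```python
from typing import FrozenSet


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two feature-hash sets, in the range [0.0, 1.0].

    Hypothetical scoring function for illustration; not part of the library.
    """
    if not a and not b:
        return 1.0  # Two empty fingerprints are treated as identical.
    return len(a & b) / len(a | b)


def is_match(a: FrozenSet[str], b: FrozenSet[str], threshold: float = 0.8) -> bool:
    """Return True when the score reaches the (assumed) match threshold."""
    return similarity(a, b) >= threshold


# Two builds of the same function sharing 9 of their 10 features each.
gcc_build = frozenset(f"f{i}" for i in range(10))
clang_build = frozenset([f"f{i}" for i in range(9)] + ["g"])
unrelated = frozenset({"x", "y", "z"})
```

With these sample sets, the two builds share 9 features out of 11 distinct ones, so they clear the 0.8 threshold, while the unrelated set shares none. A real test would replace the hand-built sets with fingerprints produced from actual binaries and pick the threshold empirically.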