Semantic Analysis Library (IR Lifting and Function Fingerprinting)
Module
BinaryIndex
Status
IMPLEMENTED
Description
Semantic binary analysis with IR lifting, function fingerprint generation, semantic matching, graph extraction, and call n-gram generation for function-level binary comparison.
Implementation Details
- Modules: src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/
- Key Classes:
  - IrLiftingService (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/IrLiftingService.cs) - lifts machine code to intermediate representation using B2R2
  - SemanticFingerprintGenerator (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs) - generates SemanticFingerprint using Weisfeiler-Lehman graph hashing (KsgWeisfeilerLehmanV1 algorithm)
  - SemanticGraphExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticGraphExtractor.cs) - extracts key-semantics graphs (KSG) from lifted IR
  - SemanticMatcher (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticMatcher.cs) - matches semantic fingerprints for similarity scoring
  - CallNgramGenerator (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs) - call-sequence n-gram fingerprinting
  - WeisfeilerLehmanHasher (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/WeisfeilerLehmanHasher.cs) - WL graph hash implementation
  - GraphCanonicalizer (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/GraphCanonicalizer.cs) - graph canonicalization for deterministic hashing
- Models: FingerprintModels (SemanticFingerprint, SemanticFingerprintOptions, SemanticFingerprintAlgorithm), GraphModels (KeySemanticsGraph), IrModels (LiftedFunction, IrStatement)
- Interfaces: IIrLiftingService, ISemanticFingerprintGenerator, ISemanticGraphExtractor, ISemanticMatcher
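For readers unfamiliar with Weisfeiler-Lehman graph hashing, the idea can be sketched in a few lines. This is an illustrative Python sketch of the general technique, not the code in WeisfeilerLehmanHasher.cs: the function name `wl_hash`, the graph encoding (a dict of node labels plus an edge list), the iteration count, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def wl_hash(nodes, edges, iterations=3):
    """Hash a labeled directed graph with Weisfeiler-Lehman relabeling.

    nodes: dict mapping node id -> string label (hypothetical encoding)
    edges: list of (src, dst) pairs
    """
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)

    labels = dict(nodes)
    for _ in range(iterations):
        new_labels = {}
        for node in nodes:
            # Combine the node's own label with the sorted labels of its
            # successors; sorting makes the result order-independent.
            neighbor_labels = sorted(labels[m] for m in successors[node])
            payload = labels[node] + "|" + ",".join(neighbor_labels)
            new_labels[node] = hashlib.sha256(payload.encode()).hexdigest()[:16]
        labels = new_labels

    # Hashing the sorted multiset of final labels removes any dependence
    # on node ids or insertion order.
    combined = "\n".join(sorted(labels.values()))
    return hashlib.sha256(combined.encode()).hexdigest()


# Two isomorphic graphs with different node ids, and one that differs in shape.
g1 = ({0: "load", 1: "add", 2: "store"}, [(0, 1), (1, 2)])
g2 = ({"a": "store", "b": "load", "c": "add"}, [("c", "a"), ("b", "c")])
g3 = ({0: "load", 1: "add", 2: "store"}, [(0, 1), (0, 2)])

same = wl_hash(*g1) == wl_hash(*g2)
different = wl_hash(*g1) != wl_hash(*g3)
```

Because only labels and adjacency feed the hash, renaming nodes leaves it unchanged, which is the property the canonicalization step is described as supporting.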
E2E Test Plan
- Lift a binary function to IR via IrLiftingService and verify IR structure contains valid statements
- Generate a semantic fingerprint via SemanticFingerprintGenerator and verify hash is deterministic
- Extract a key-semantics graph via SemanticGraphExtractor and verify node/edge structure
- Match two fingerprints of the same function (different compilers) via SemanticMatcher and verify high similarity
- Verify Weisfeiler-Lehman graph hash produces different hashes for structurally different functions
- Verify GraphCanonicalizer produces consistent canonical forms for isomorphic graphs
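The similarity check in the plan above could be phrased along the following lines. This is a minimal Python sketch that assumes call-sequence n-grams compared by Jaccard similarity; how SemanticMatcher actually scores fingerprints is not stated here, and the helper names `call_ngrams` and `jaccard`, the n-gram size, and the sample call sequences are hypothetical.

```python
def call_ngrams(calls, n=2):
    """Return the set of length-n windows over a call sequence."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical call sequences: the same function built by two compilers,
# where one build has an extra logging call, plus an unrelated function.
build_a = ["open", "read", "parse", "validate", "write", "close"]
build_b = ["open", "read", "parse", "validate", "log", "write", "close"]
unrelated = ["socket", "bind", "listen", "accept"]

same_function_score = jaccard(call_ngrams(build_a), call_ngrams(build_b))
unrelated_score = jaccard(call_ngrams(build_a), call_ngrams(unrelated))
```

A test in this style would assert that the same-function score clears a chosen threshold while the unrelated score stays well below it; the threshold itself would need to be tuned against real binaries.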