# ELF Normalization and Delta Hashing

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Low-entropy delta signatures over ELF segments with normalization (relocation zeroing, NOP canonicalization, jump table rewriting). The segment-level normalization and hashing are not yet implemented; the existing delta signatures operate at function level only.

## What's Implemented
- **Delta Signature Infrastructure**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/` - function-level delta signatures with V1 and V2 predicates exist
  - `DeltaSignatureGenerator` - generates delta signatures (function-level, not ELF-segment-level)
  - `DeltaSignatureMatcher` - matches delta signatures
  - `CfgExtractor` - extracts control flow graphs
  - `IrDiffGenerator` - IR-level diff generation
- **Binary Diff Engine**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/PatchDiffEngine.cs` - byte-level and function-level diffing
- **ELF Feature Extraction**: `ElfFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/`) - extracts Build-ID and section info from ELF binaries
- **Disassembly**: `B2R2DisassemblyPlugin`, `HybridDisassemblyService` - multi-backend disassembly infrastructure

## What's Missing
- ELF segment-level normalization (relocation zeroing to eliminate position-dependent bytes)
- NOP canonicalization (normalizing NOP sled variations across compilers)
- Jump table rewriting (normalizing indirect jump table entries)
- Low-entropy delta hashing over normalized ELF segments (delta-sig currently operates at function level, not segment level)
- Segment-aware normalization that handles the `.text`, `.rodata`, and `.data` sections separately

## Implementation Plan
- Add an ELF segment normalization pass to `ElfFeatureExtractor` or to a new `ElfNormalizer` class
- Implement relocation zeroing: identify and zero out position-dependent bytes (GOT/PLT entries, absolute addresses)
- Implement NOP canonicalization: normalize all NOP variants to a canonical form
- Implement jump table rewriting: normalize indirect jump table entries
- Add segment-level delta hashing on the normalized output
- Integrate with the existing `DeltaSignatureGenerator` for hybrid function+segment signatures
- Add tests using known ELF binaries with position-dependent variations

## Related Documentation
- Current delta-sig: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- ELF extraction: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs`
- Disassembly: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`
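## Illustrative Sketch

The relocation zeroing, NOP canonicalization, and segment hashing steps from the Implementation Plan can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's C#, and it is not the planned `ElfNormalizer`. All function names here (`parse_rela64`, `zero_relocations`, `canonicalize_nops`, `segment_digest`) are hypothetical. SHA-256 is an assumed stand-in, since the hash is not specified above. The relocation table covers only a handful of x86-64 types. NOP padding spans are assumed to come from a disassembler, because scanning raw bytes for NOP patterns could misread operand bytes. Jump table rewriting is omitted.

```python
import hashlib
import struct

# Patched field width in bytes for a few x86-64 relocation types (psABI numbering).
# Assumption: a real pass would cover every type the target architecture defines.
RELOC_WIDTH = {
    1: 8,   # R_X86_64_64
    2: 4,   # R_X86_64_PC32
    4: 4,   # R_X86_64_PLT32
    8: 8,   # R_X86_64_RELATIVE
    10: 4,  # R_X86_64_32
}


def parse_rela64(rela: bytes) -> list[tuple[int, int]]:
    """Return (offset, width) pairs from raw little-endian Elf64_Rela entries."""
    spans = []
    for r_offset, r_info, _addend in struct.iter_unpack("<QQq", rela):
        width = RELOC_WIDTH.get(r_info & 0xFFFFFFFF)  # ELF64_R_TYPE(r_info)
        if width is not None:  # unknown types are skipped in this sketch
            spans.append((r_offset, width))
    return spans


def zero_relocations(data: bytes, spans, base: int = 0) -> bytes:
    """Zero each relocated field; `base` is the section's start address or offset."""
    out = bytearray(data)
    for offset, width in spans:
        start = offset - base
        if 0 <= start and start + width <= len(out):  # ignore out-of-section entries
            out[start:start + width] = bytes(width)
    return bytes(out)


def canonicalize_nops(code: bytes, nop_spans) -> bytes:
    """Rewrite each (start, length) padding span as 0x90 bytes, preserving length."""
    out = bytearray(code)
    for start, length in nop_spans:
        if 0 <= start and start + length <= len(out):
            out[start:start + length] = b"\x90" * length
    return bytes(out)


def segment_digest(name: str, normalized: bytes) -> str:
    """Hash one normalized section; the name is mixed in to keep sections distinct."""
    h = hashlib.sha256()
    h.update(name.encode() + b"\0")
    h.update(normalized)
    return h.hexdigest()


# Two hypothetical builds of `movabs rax, <addr>; ret` followed by 3 bytes of padding.
# They differ only in the relocated address and in how the compiler encoded the padding.
build_a = b"\x48\xb8" + (0x401000).to_bytes(8, "little") + b"\xc3" + b"\x0f\x1f\x00"
build_b = b"\x48\xb8" + (0x7F0000).to_bytes(8, "little") + b"\xc3" + b"\x90\x90\x90"
rela = struct.pack("<QQq", 2, 1, 0)   # one R_X86_64_64 entry patching offset 2
nop_spans = [(11, 3)]                 # padding location, as a disassembler would report it

norm_a = canonicalize_nops(zero_relocations(build_a, parse_rela64(rela)), nop_spans)
norm_b = canonicalize_nops(zero_relocations(build_b, parse_rela64(rela)), nop_spans)
assert segment_digest(".text", norm_a) == segment_digest(".text", norm_b)
```

Each section is hashed on its own, so `.text`, `.rodata`, and `.data` can use different normalization rules before hashing. How the resulting digests would be combined with the function-level output of `DeltaSignatureGenerator` is left open here.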