Function-Range Hashing and Symbol Mapping

Module

BinaryIndex

Status

IMPLEMENTED

Description

Disassembles binaries through multiple backends (Iced, B2R2) and normalizes each function's range before hashing, so that the resulting fingerprints can serve as symbol-level binary proof.
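
The idea of hashing a normalized function range can be illustrated with a minimal Python sketch. It is not the StellaOps implementation (which is C#), and every name in it (`Instruction`, `normalize`, `hash_function_range`) is hypothetical. It assumes a backend has already decoded the function into mnemonic and operand strings, and that normalization means dropping the load address and masking hex literals; the actual normalization rules are not described in this document.

```python
import hashlib
import re
from dataclasses import dataclass


# Hypothetical instruction shape; real backends (Iced, B2R2) expose richer types.
@dataclass(frozen=True)
class Instruction:
    address: int
    mnemonic: str
    operands: tuple[str, ...]


_HEX_LITERAL = re.compile(r"0x[0-9a-f]+")


def normalize(instr: Instruction) -> str:
    """Ignore the load address and mask hex literals (assumed normalization rule)."""
    ops = ",".join(_HEX_LITERAL.sub("IMM", op.lower()) for op in instr.operands)
    return f"{instr.mnemonic.lower()} {ops}".strip()


def hash_function_range(instructions: list[Instruction]) -> str:
    """SHA-256 over the normalized instruction text, one instruction per line."""
    digest = hashlib.sha256()
    for instr in instructions:
        digest.update(normalize(instr).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# The same function decoded at two different load addresses.
original = [
    Instruction(0x1000, "MOV", ("rax", "0x10")),
    Instruction(0x1003, "call", ("0x401000",)),
    Instruction(0x1008, "ret", ()),
]
relocated = [
    Instruction(0x7000, "mov", ("rax", "0x10")),
    Instruction(0x7003, "call", ("0x7f0000",)),
    Instruction(0x7008, "ret", ()),
]
```

Under these assumptions `hash_function_range(original)` equals `hash_function_range(relocated)`, because neither the instruction addresses nor the call target contribute to the digest. Masking every literal is deliberately coarse: two functions that differ only in a constant would also collide, so a real normalizer would need to be more selective.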

Implementation Details

  • Modules: src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/, src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/
  • Key Classes:
    • IFunctionFingerprintExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/IFunctionFingerprintExtractor.cs) - extracts function-range fingerprints from disassembled binaries
    • FunctionDiffer (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionDiffer.cs) - compares function fingerprints with semantic analysis support; computes call-graph edge diffs
    • FunctionRenameDetector (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionRenameDetector.cs) - detects renamed functions by comparing fingerprint similarity
    • PatchDiffEngine (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/PatchDiffEngine.cs) - builder-level patch diff engine
    • FingerprintClaimModels (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/FingerprintClaimModels.cs) - FingerprintClaim and FingerprintClaimEvidence records
  • Models: FingerprintModels (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/Models/FingerprintModels.cs) - FunctionFingerprint with hash, size, call edges
  • Disassembly Backends: B2R2DisassemblyPlugin (ARM/x86/x64/AArch64), IcedDisassemblyPlugin (x86/x64)
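
To make the roles of FunctionDiffer and FunctionRenameDetector concrete, the following Python sketch shows one way such a comparison could work. It is an illustration under stated assumptions, not the project's C# API: the `FunctionFingerprint` fields mirror the "hash, size, call edges" description above, while `diff_functions`, `similarity`, `detect_renames`, the Jaccard measure over call edges, and the 0.8 threshold are all hypothetical choices.

```python
from dataclasses import dataclass


# Hypothetical mirror of the FunctionFingerprint model (hash, size, call edges).
@dataclass(frozen=True)
class FunctionFingerprint:
    name: str
    hash: str
    size: int
    call_edges: frozenset[str] = frozenset()


def diff_functions(old, new):
    """Classify functions by name as added, removed, or modified (hash changed)."""
    old_by_name = {f.name: f for f in old}
    new_by_name = {f.name: f for f in new}
    shared = old_by_name.keys() & new_by_name.keys()
    return {
        "added": sorted(new_by_name.keys() - old_by_name.keys()),
        "removed": sorted(old_by_name.keys() - new_by_name.keys()),
        "modified": sorted(
            n for n in shared if old_by_name[n].hash != new_by_name[n].hash
        ),
    }


def similarity(a, b):
    """Assumed measure: 1.0 on identical hash, else Jaccard overlap of call edges."""
    if a.hash == b.hash:
        return 1.0
    union = a.call_edges | b.call_edges
    if not union:
        return 0.0
    return len(a.call_edges & b.call_edges) / len(union)


def detect_renames(old, new, threshold=0.8):
    """Greedily pair each removed function with its most similar added function."""
    delta = diff_functions(old, new)
    removed = [f for f in old if f.name in delta["removed"]]
    added = [f for f in new if f.name in delta["added"]]
    renames, taken = {}, set()
    for gone in removed:
        candidates = [c for c in added if c.name not in taken]
        best = max(candidates, key=lambda c: similarity(gone, c), default=None)
        if best is not None and similarity(gone, best) >= threshold:
            renames[gone.name] = best.name
            taken.add(best.name)
    return renames


old_build = [
    FunctionFingerprint("parse", "h1", 40, frozenset({"read"})),
    FunctionFingerprint("old_sum", "h2", 20),
    FunctionFingerprint("gone", "h3", 10, frozenset({"x"})),
]
new_build = [
    FunctionFingerprint("parse", "h1b", 44, frozenset({"read"})),
    FunctionFingerprint("new_sum", "h2", 20),
    FunctionFingerprint("fresh", "h4", 12, frozenset({"y"})),
]
```

With this sample data, `diff_functions(old_build, new_build)` reports `parse` as modified, and `detect_renames` pairs `old_sum` with `new_sum` because their hashes match, while `gone` and `fresh` stay unpaired. The greedy pairing keeps the sketch short; an optimal assignment would be a reasonable alternative when many candidates score close to the threshold.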

E2E Test Plan

  • Extract function fingerprints from an ELF binary and verify hash consistency for identical functions
  • Verify function-range normalization produces the same hash across compiler optimization levels when function logic is identical
  • Verify FunctionDiffer correctly identifies added, removed, and modified functions
  • Verify FunctionRenameDetector matches renamed functions based on a fingerprint similarity threshold
  • Verify FingerprintClaim evidence links correctly to Build-ID and function IDs
  • Verify multi-backend consistency: same binary produces matching fingerprints via B2R2 and Iced
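
The multi-backend consistency check in the last item could be expressed as a small comparison helper. The Python sketch below is hypothetical: `compare_backends` and the sample maps are invented for illustration, and it assumes each backend's output has already been reduced to a mapping from function name to fingerprint hash. Since IcedDisassemblyPlugin covers only x86/x64, such a check only makes sense for binaries of those architectures.

```python
def compare_backends(first: dict[str, str], second: dict[str, str]) -> dict[str, list[str]]:
    """Report functions whose hashes differ or that only one backend found."""
    shared = first.keys() & second.keys()
    return {
        "mismatched": sorted(n for n in shared if first[n] != second[n]),
        "only_first": sorted(first.keys() - second.keys()),
        "only_second": sorted(second.keys() - first.keys()),
    }


# Invented sample data standing in for fingerprints from two backends.
b2r2_hashes = {"main": "aa", "helper": "bb", "init": "cc"}
iced_hashes = {"main": "aa", "helper": "b0", "fini": "dd"}

report = compare_backends(b2r2_hashes, iced_hashes)
consistent = not any(report.values())
```

A test would assert that all three lists are empty; here `report` flags `helper` as mismatched and lists `init` and `fini` as found by only one backend, so `consistent` is false.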