git.stella-ops.org/docs/features/unimplemented/binaryindex/binary-identity-extraction.md
2026-02-12 10:27:23 +02:00

Binary Identity Extraction (Build-ID Based)

Module

BinaryIndex

Status

PARTIALLY_IMPLEMENTED

Description

Extracts a stable identity for ELF binaries from their Build-IDs and symbol observations, with ground-truth validation and SBOM stability verification.

Implementation Details

  • Modules: src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/, src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/
  • Key Classes:
    • BinaryIdentityService (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/BinaryIdentityService.cs) - main service for extracting binary identity from ELF/PE/Mach-O binaries
    • ElfFeatureExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs) - extracts Build-ID, symbol tables, and section info from ELF binaries
    • PeFeatureExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/PeFeatureExtractor.cs) - extracts CodeView GUID from Windows PE binaries
    • MachoFeatureExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/MachoFeatureExtractor.cs) - extracts LC_UUID from Mach-O binaries
    • StreamGuard (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/StreamGuard.cs) - safe stream handling for non-seekable streams
  • Interfaces: IBinaryFeatureExtractor (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/IBinaryFeatureExtractor.cs)
  • Models: BinaryIdentity (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/BinaryIdentity.cs)

E2E Test Plan

  • Submit an ELF binary with a known Build-ID and verify the extracted identity matches
  • Submit a Windows PE binary and verify CodeView GUID extraction via PeFeatureExtractor
  • Submit a Mach-O binary and verify LC_UUID extraction via MachoFeatureExtractor
  • Verify that non-seekable streams are handled correctly via StreamGuard
  • Verify that binaries without Build-IDs fall back to symbol-based identification
  • Verify extracted identities are persisted and queryable through BinaryVulnerabilityService
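
The non-seekable stream case in the plan above could be exercised along the following lines. This is a sketch in Python under stated assumptions: `ensure_seekable`, `NonSeekable`, and the 64 MiB cap are hypothetical names and values chosen for illustration, and the buffering strategy shown is one plausible approach, not a description of what StreamGuard actually does.

```python
import io
from typing import BinaryIO

MAX_BUFFER_BYTES = 64 * 1024 * 1024  # hypothetical cap, not taken from the source


def ensure_seekable(stream: BinaryIO, max_bytes: int = MAX_BUFFER_BYTES) -> BinaryIO:
    """Return the stream itself if seekable, else a bounded in-memory copy."""
    if stream.seekable():
        return stream
    buf = io.BytesIO()
    remaining = max_bytes + 1  # read one extra byte to detect oversize input
    while remaining > 0:
        chunk = stream.read(min(65536, remaining))
        if not chunk:
            break
        buf.write(chunk)
        remaining -= len(chunk)
    if buf.tell() > max_bytes:
        raise ValueError("stream exceeds buffer limit")
    buf.seek(0)
    return buf


class NonSeekable(io.RawIOBase):
    """Test fixture: a forward-only stream, similar to a network or pipe source."""

    def __init__(self, data: bytes) -> None:
        super().__init__()
        self._inner = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, b) -> int:
        chunk = self._inner.read(len(b))
        b[:len(chunk)] = chunk
        return len(chunk)
```

A test would wrap a known binary in `NonSeekable`, pass it through `ensure_seekable`, and check that the extractor can read the magic bytes, seek back to offset 0, and read the full content again.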

Verification Outcome (run-001)

  • Tier 0/1/2 artifacts: docs/qa/feature-checks/runs/binaryindex/binary-identity-extraction/run-001/
  • Result: the implementation does not reach parity with the documented claims.
  • Missing behavior:
    • When a Build-ID is missing, the fallback path uses a file hash rather than the symbol-observation-based identity that the description claims.
    • Ground-truth validation and SBOM stability verification are not implemented in the documented extraction flow.
    • Existing behavioral tests do not explicitly prove PE CodeView GUID / Mach-O LC_UUID extraction semantics.
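
The first gap matters because the two fallback keys behave differently when a binary's bytes change but its symbols do not. The sketch below contrasts them in Python. Everything in it is an assumption made for illustration: the function names, the key prefixes, and in particular the symbol scheme (a hash over the sorted, de-duplicated symbol names), which the source does not specify.

```python
import hashlib
from typing import Iterable, Optional


def file_hash_identity(data: bytes) -> str:
    """Fallback described in run-001: identity is a hash of the whole file."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def symbol_identity(symbols: Iterable[str]) -> str:
    """Hypothetical symbol-based key: hash of sorted, de-duplicated names."""
    h = hashlib.sha256()
    for name in sorted(set(symbols)):
        h.update(name.encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    return "symbols:" + h.hexdigest()


def resolve_identity(
    build_id: Optional[str], data: bytes, symbols: Iterable[str]
) -> str:
    """Prefer the Build-ID, then symbols, then the file hash as a last resort."""
    if build_id:
        return "build-id:" + build_id
    syms = list(symbols)
    if syms:
        return symbol_identity(syms)
    return file_hash_identity(data)
```

Under this scheme, two builds with identical symbol sets but different padding or timestamps would share a symbol-based key, while their file hashes would differ. Whether that is the intended semantics of "symbol observations" is for the feature owners to confirm.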