save checkpoint: save features
@@ -0,0 +1,37 @@
# Binary Identity Extraction (Build-ID Based)

## Module
BinaryIndex

## Status
PARTIALLY_IMPLEMENTED

## Description
Binary identity extraction using Build-IDs and symbol observations for ELF binary identification, with ground-truth validation and SBOM stability verification.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`
- **Key Classes**:
  - `BinaryIdentityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/BinaryIdentityService.cs`) - main service for extracting binary identity from ELF/PE/Mach-O binaries
  - `ElfFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs`) - extracts Build-ID, symbol tables, and section info from ELF binaries
  - `PeFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/PeFeatureExtractor.cs`) - extracts CodeView GUID from Windows PE binaries
  - `MachoFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/MachoFeatureExtractor.cs`) - extracts LC_UUID from Mach-O binaries
  - `StreamGuard` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/StreamGuard.cs`) - safe stream handling for non-seekable streams
- **Interfaces**: `IBinaryFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/IBinaryFeatureExtractor.cs`)
- **Models**: `BinaryIdentity` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/BinaryIdentity.cs`)
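
For orientation, the Build-ID that `ElfFeatureExtractor` is described as extracting is stored in an ELF note of type `NT_GNU_BUILD_ID` (3) with the owner name `GNU`. The following is a minimal Python sketch of walking a note blob, not the StellaOps implementation. The function name `find_gnu_build_id` is hypothetical, and the sketch assumes the caller has already located the note bytes (for example the contents of `.note.gnu.build-id`), little-endian encoding, and 4-byte note alignment.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used by GNU toolchains for the Build-ID


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4 (assumed note padding)."""
    return (n + 3) & ~3


def find_gnu_build_id(note_data: bytes) -> Optional[str]:
    """Return the Build-ID as lowercase hex, or None if no such note exists.

    Hypothetical helper: note_data is the raw content of a note
    section or PT_NOTE segment, assumed little-endian.
    """
    offset = 0
    while offset + 12 <= len(note_data):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, note_type = struct.unpack_from("<III", note_data, offset)
        name_start = offset + 12
        desc_start = name_start + _align4(namesz)
        desc_end = desc_start + descsz
        if desc_end > len(note_data):
            return None  # truncated or malformed note
        name = note_data[name_start:name_start + namesz]
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return note_data[desc_start:desc_end].hex()
        # Advance past the padded descriptor to the next note.
        offset = desc_start + _align4(descsz)
    return None
```

A caller would pass the note bytes and receive a hex string suitable for comparison against a known Build-ID; a `None` result is the case where a fallback identity would be needed.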
## E2E Test Plan
- [ ] Submit an ELF binary with a known Build-ID and verify the extracted identity matches
- [ ] Submit a Windows PE binary and verify CodeView GUID extraction via `PeFeatureExtractor`
- [ ] Submit a Mach-O binary and verify LC_UUID extraction via `MachoFeatureExtractor`
- [ ] Verify that non-seekable streams are handled correctly via `StreamGuard`
- [ ] Verify that binaries without Build-IDs fall back to symbol-based identification
- [ ] Verify extracted identities are persisted and queryable through `BinaryVulnerabilityService`
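
The non-seekable stream item above can be reasoned about with a small buffering wrapper. The Python sketch below is an assumption about what "safe handling" could mean, not the actual `StreamGuard` code: the name `ensure_seekable` and the `max_bytes` limit are hypothetical. It copies a non-seekable stream into memory so that a format parser can seek freely, and refuses inputs larger than the limit.

```python
import io
from typing import BinaryIO


def ensure_seekable(stream: BinaryIO, max_bytes: int = 64 * 1024 * 1024) -> BinaryIO:
    """Return a seekable view of stream (hypothetical helper).

    Seekable streams are returned unchanged. Non-seekable streams are
    copied into an in-memory buffer, up to max_bytes.
    """
    if stream.seekable():
        return stream
    buffer = io.BytesIO()
    while True:
        chunk = stream.read(64 * 1024)
        if not chunk:
            break  # end of stream
        buffer.write(chunk)
        if buffer.tell() > max_bytes:
            raise ValueError("stream exceeds the configured buffering limit")
    buffer.seek(0)  # rewind so parsing starts at the first byte
    return buffer
```

A test for this item would feed the same binary once as a seekable file and once through a pipe-like stream, then assert that both yield the same extracted identity.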
## Verification Outcome (run-001)
- Tier 0/1/2 artifacts: `docs/qa/feature-checks/runs/binaryindex/binary-identity-extraction/run-001/`
- Result: not implemented at claim parity.
- Missing behavior:
  - Build-ID-missing fallback path uses file hash, not symbol-observation-based identity as claimed.
  - Ground-truth validation and SBOM stability verification are not implemented in the documented extraction flow.
  - Existing behavioral tests do not explicitly prove PE CodeView GUID / Mach-O LC_UUID extraction semantics.
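
To make the first gap concrete, the Python sketch below contrasts the two fallback strategies. It is illustrative only: `derive_identity` and the symbol-digest scheme (SHA-256 over sorted, de-duplicated symbol names) are hypothetical and are not taken from `BinaryIdentityService`. The observed behavior in run-001 corresponds to skipping the middle branch and going straight from a missing Build-ID to the file hash.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def derive_identity(
    build_id: Optional[str],
    symbols: Iterable[str],
    file_bytes: bytes,
) -> Tuple[str, str]:
    """Return (identity kind, identity value) for a binary.

    Hypothetical ordering: Build-ID first, then a digest of observed
    symbol names, then a hash of the whole file as a last resort.
    """
    if build_id:
        return ("build-id", build_id.lower())
    # Sort and de-duplicate so the digest does not depend on symbol order.
    names = sorted(set(symbols))
    if names:
        digest = hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()
        return ("symbols", digest)
    return ("file-sha256", hashlib.sha256(file_bytes).hexdigest())
```

Under this scheme, two builds that export the same symbols share a symbol-based identity even if their bytes differ, which is the property a file hash cannot provide.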