save checkpoint
@@ -0,0 +1,29 @@
# Function-Range Hashing and Symbol Mapping

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Multi-backend disassembly (Iced, B2R2) with function-range normalization for symbol-level binary proof.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/`
- **Key Classes**:
  - `IFunctionFingerprintExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/IFunctionFingerprintExtractor.cs`) - extracts function-range fingerprints from disassembled binaries
  - `FunctionDiffer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionDiffer.cs`) - compares function fingerprints with semantic analysis support; computes call-graph edge diffs
  - `FunctionRenameDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionRenameDetector.cs`) - detects renamed functions by comparing fingerprint similarity
  - `PatchDiffEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/PatchDiffEngine.cs`) - builder-level patch diff engine
  - `FingerprintClaimModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/FingerprintClaimModels.cs`) - `FingerprintClaim` and `FingerprintClaimEvidence` records
- **Models**: `FingerprintModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/Models/FingerprintModels.cs`) - `FunctionFingerprint` with hash, size, call edges
- **Disassembly Backends**: `B2R2DisassemblyPlugin` (ARM/x86/x64/AArch64), `IcedDisassemblyPlugin` (x86/x64)

## E2E Test Plan
- [ ] Extract function fingerprints from an ELF binary and verify hash consistency for identical functions
- [ ] Verify function-range normalization produces same hash across compiler optimization levels when function logic is identical
- [ ] Verify `FunctionDiffer` correctly identifies added, removed, and modified functions
- [ ] Verify `FunctionRenameDetector` matches renamed functions based on fingerprint similarity threshold
- [ ] Verify `FingerprintClaim` evidence links correctly to Build-ID and function IDs
- [ ] Verify multi-backend consistency: same binary produces matching fingerprints via B2R2 and Iced
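
The idea of function-range normalization can be illustrated with a minimal Python sketch. It is not the `IFunctionFingerprintExtractor` implementation: the function name `fingerprint_range`, the choice of SHA-256, and the `masked` parameter (byte spans that a disassembler reports as address-dependent) are all assumptions made for illustration.

```python
import hashlib

def fingerprint_range(image: bytes, start: int, end: int, masked=()) -> str:
    """Hash image[start:end] after zeroing address-dependent byte spans.

    `masked` holds (offset, length) pairs relative to `start`. Which bytes to
    mask is an assumption here; in practice a disassembler would report them.
    """
    body = bytearray(image[start:end])
    for offset, length in masked:
        stop = min(offset + length, len(body))  # clamp so the buffer never grows
        body[offset:stop] = bytes(max(stop - offset, 0))
    return hashlib.sha256(bytes(body)).hexdigest()

# Same x86 function body (push, call rel32, ret); only the call displacement differs.
build_a = b"\x55\xe8\x10\x00\x00\x00\xc3"
build_b = b"\x55\xe8\x40\x00\x00\x00\xc3"
same = fingerprint_range(build_a, 0, 7, [(2, 4)]) == fingerprint_range(build_b, 0, 7, [(2, 4)])
```

With the displacement bytes masked, both builds hash identically; without masking, the hashes differ even though the function logic is the same.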
@@ -0,0 +1,29 @@
# Golden Corpus Bundle Export/Import Service

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Import/export services for golden corpus bundles with standalone verification support, enabling offline corpus distribution and validation. The known features list already includes "Offline Corpus Bundle Export/Import"; this feature differs in providing reproducible bundle management with trust-profile-aware verification specific to the golden corpus.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/`
- **Key Classes**:
  - `ServiceCollectionExtensions.AddCorpusBundleExport()` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ServiceCollectionExtensions.cs`) - registers export services
  - `ServiceCollectionExtensions.AddCorpusBundleImport()` (same file) - registers import services
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - uses imported bundles for validation runs
  - `GroundTruthCorpusBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/GroundTruthCorpusBuilder.cs`) - builds training corpus with export support in JsonLines and Json formats
- **Interfaces**: `ICorpusBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/ICorpusBuilder.cs`) - `ExportAsync()` with format selection
- **Export Formats**: `CorpusExportFormat.JsonLines`, `CorpusExportFormat.Json`
- **Source**: SPRINT_20260121_035_BinaryIndex_golden_corpus_connectors_cli.md

## E2E Test Plan
- [ ] Export a golden corpus bundle and verify the output file contains all function fingerprints and metadata
- [ ] Import the exported bundle and verify all entries are restored correctly
- [ ] Verify round-trip: export then import and verify validation results match
- [ ] Verify JsonLines export format produces one record per line
- [ ] Verify Json export format produces a single valid JSON document
- [ ] Verify offline verification works with imported bundles without network access
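
The difference between the two export formats and the round-trip check from the test plan can be sketched in a few lines of Python. The helper names `export_corpus` and `import_corpus` and the record fields are hypothetical; only the format names mirror `CorpusExportFormat`. The real `ExportAsync()` signature is not shown in this document.

```python
import json

def export_corpus(records, fmt: str) -> str:
    """Serialize records as JSON Lines (one object per line) or one JSON array."""
    if fmt == "JsonLines":
        return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)
    if fmt == "Json":
        return json.dumps(list(records), sort_keys=True)
    raise ValueError(f"unsupported export format: {fmt}")

def import_corpus(text: str, fmt: str) -> list:
    """Inverse of export_corpus for the same format."""
    if fmt == "JsonLines":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt == "Json":
        return json.loads(text)
    raise ValueError(f"unsupported export format: {fmt}")

# Hypothetical corpus records, for illustration only.
records = [
    {"function": "ssl_read", "hash": "ab12"},
    {"function": "ssl_write", "hash": "cd34"},
]
jsonl = export_corpus(records, "JsonLines")
```

Sorting keys keeps the output byte-stable across runs, which matters if the exported bundle is itself hashed or signed.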
@@ -0,0 +1,27 @@
# Golden Corpus KPI Regression Service

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
KPI regression tracking service for golden corpus validation, including SBOM hash stability validation, regression detection across corpus runs, and automated KPI reporting. The known features list has "Golden Corpus" and "Golden Set" entries, but no dedicated KPI regression service for tracking validation quality over time.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/`
- **Key Classes**:
  - `KpiRegressionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/KpiRegressionService.cs`) - detects accuracy regressions across validation runs by comparing KPI metrics over time; uses `TimeProvider` for testable timestamps
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - produces validation run results consumed by KPI regression tracking
- **Interfaces**: `IKpiRegressionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/IKpiRegressionService.cs`)
- **Validation Metrics**: precision, recall, F1 score, false positive rate tracked per validation run
- **Source**: SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md

## E2E Test Plan
- [ ] Run two validation passes with different accuracy and verify `KpiRegressionService` detects the regression
- [ ] Verify KPI metrics (precision, recall, F1) are computed correctly from validation run results
- [ ] Verify no regression is reported when accuracy improves between runs
- [ ] Verify SBOM hash stability check flags unstable hash generation
- [ ] Verify regression alerts include the specific metrics that degraded
- [ ] Verify `TimeProvider` injection allows deterministic testing of time-based regression windows
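
The tracked metrics follow the standard definitions, so a small Python sketch can show both the computation and one plausible regression rule. The function names, the dictionary keys, and the `tolerance` parameter are assumptions for illustration; the document does not state how `KpiRegressionService` decides that a metric has degraded.

```python
def kpis(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "false_positive_rate": fpr}

def regressions(baseline: dict, current: dict, tolerance: float = 0.01) -> list:
    """Names of metrics that got worse by more than `tolerance` (assumed rule)."""
    lower_is_better = {"false_positive_rate"}
    degraded = []
    for name, old in baseline.items():
        delta = current[name] - old
        worse = delta > tolerance if name in lower_is_better else delta < -tolerance
        if worse:
            degraded.append(name)
    return sorted(degraded)

run_1 = kpis(tp=90, fp=10, fn=10, tn=90)
run_2 = kpis(tp=80, fp=20, fn=20, tn=80)
degraded = regressions(run_1, run_2)
```

Note that the false positive rate is inverted relative to the other three metrics: an increase is a regression, which is why the sketch treats it separately.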
@@ -0,0 +1,29 @@
# Golden Corpus Validation Harness

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Validation harness infrastructure for running golden corpus tests against binary index results, comparing expected vs actual outcomes. While "Validation Harness and Reproducibility Verification" is in the known list, this is a distinct BinaryIndex-specific validation harness with its own abstraction layer.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation.Abstractions/`
- **Key Classes**:
  - `ValidationHarness` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/ValidationHarness.cs`) - main harness with `IMatcherAdapterFactory` integration for pluggable matching strategies
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - orchestrates reproducible-build validation runs with `ValidationRunContext`
  - `CallGraphMatcherAdapter` and other `MatcherAdapters` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/Matchers/MatcherAdapters.cs`) - adapters for different matching strategies
- **Interfaces**: `IValidationHarness` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation.Abstractions/IValidationHarness.cs`)
- **Models**: `ValidationRun` with `CorpusSnapshotId` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation.Abstractions/ValidationRun.cs`)
- **Registration**: `ValidationServiceCollectionExtensions.AddValidationHarness()` and `AddCorpusBundleExport/Import`
- **Source**: SPRINT_20260121_034_BinaryIndex_golden_corpus_foundation.md

## E2E Test Plan
- [ ] Run validation harness against a golden set and verify expected vs actual outcomes are compared
- [ ] Verify pluggable matcher adapters: run with call-graph matcher and verify correct results
- [ ] Verify validation run produces a `ValidationRun` with correct `CorpusSnapshotId`
- [ ] Verify validation attestor generates valid attestation predicates from validation run results
- [ ] Verify report generator produces deterministic reports from validation runs
- [ ] Verify validation results feed into KPI regression service for tracking
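
The expected-vs-actual comparison with a pluggable matcher can be sketched as follows. This is a simplified Python illustration, not the `ValidationHarness` API: `run_validation`, `exact_name_matcher`, and the case fields are hypothetical, and a plain callable stands in for whatever `IMatcherAdapterFactory` produces.

```python
def run_validation(cases, matcher):
    """Run `matcher` on every golden case and record where it disagrees."""
    failures = []
    for case in cases:
        actual = matcher(case["input"])
        if actual != case["expected"]:
            failures.append({"id": case["id"],
                             "expected": case["expected"],
                             "actual": actual})
    return {"total": len(cases),
            "passed": len(cases) - len(failures),
            "failures": failures}

def exact_name_matcher(function_name):
    """Toy matching strategy: a fixed set of functions known to be patched."""
    known_patched = {"ssl_read", "inflate"}
    return "patched" if function_name in known_patched else "unknown"

cases = [
    {"id": "c1", "input": "ssl_read", "expected": "patched"},
    {"id": "c2", "input": "memcpy", "expected": "patched"},
]
report = run_validation(cases, exact_name_matcher)
```

Because the harness only depends on the matcher's call signature, swapping in a different strategy requires no change to the comparison loop.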
@@ -0,0 +1,31 @@
# Golden Set for Patch Validation (in BinaryIndex)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Golden set analysis pipeline and API controller for curated binary patch validation test cases.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`, `src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/`
- **Key Classes**:
  - `GoldenSetAnalysisPipeline` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/GoldenSetAnalysisPipeline.cs`) - runs validation analysis against golden set definitions
  - `GoldenSetController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/GoldenSetController.cs`) - REST API for golden set CRUD operations with filtering, pagination, and ordering
    - `POST /api/v1/golden-sets` - create golden set definitions
    - `GET /api/v1/golden-sets` - list with status/component/tag filters
    - `GET /api/v1/golden-sets/{id}` - get by ID
  - `GoldenSetValidator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Validation/GoldenSetValidator.cs`) - validates golden set definitions
- **Interfaces**: `IGoldenSetStore`, `IGoldenSetValidator`
- **Models**: `GoldenSetDefinition`, `GoldenSetListQuery`, `GoldenSetListResponse`, `GoldenSetCreateRequest/Response`
- **Enums**: `GoldenSetStatus` (Draft, Active, etc.), `GoldenSetOrderBy`

## E2E Test Plan
- [ ] Create a golden set via `POST /api/v1/golden-sets` and verify it is stored with Draft status
- [ ] List golden sets with component filter and verify only matching sets are returned
- [ ] Get golden set by ID and verify all fields including metadata are returned
- [ ] Run golden set analysis pipeline against a known binary pair and verify patch validation result
- [ ] Verify golden set validation rejects definitions with invalid CVE references
- [ ] Verify pagination and ordering work correctly with multiple golden sets
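
The list endpoint combines filtering, ordering, and pagination, and the order in which these are applied affects the result. A minimal in-memory Python sketch follows; `list_golden_sets`, its parameter names, and the record fields are hypothetical and do not reflect the actual `GoldenSetListQuery` shape.

```python
def list_golden_sets(items, status=None, component=None, tag=None,
                     order_by="id", descending=False, offset=0, limit=50):
    """Filter first, then order, then slice out one page."""
    matches = [
        g for g in items
        if (status is None or g["status"] == status)
        and (component is None or g["component"] == component)
        and (tag is None or tag in g["tags"])
    ]
    matches.sort(key=lambda g: g[order_by], reverse=descending)
    # `total` counts all matches so a client can compute the number of pages.
    return {"total": len(matches), "items": matches[offset:offset + limit]}

# Hypothetical golden set records, for illustration only.
store = [
    {"id": "gs-3", "status": "Draft", "component": "openssl", "tags": ["tls"]},
    {"id": "gs-1", "status": "Active", "component": "openssl", "tags": ["tls", "heap"]},
    {"id": "gs-2", "status": "Active", "component": "zlib", "tags": ["heap"]},
]
page = list_golden_sets(store, component="openssl", limit=1)
```

Slicing after sorting keeps pages stable: the same query with `offset=1` returns the next item in the same ordering rather than an arbitrary one.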
@@ -0,0 +1,33 @@
# Golden Set Schema and Management

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Full golden set management library with authoring, configuration, serialization, storage, validation, and migration support.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/`
- **Key Classes**:
  - **Authoring**: `GoldenSetExtractor`, `GoldenSetEnrichmentService`, `GoldenSetReviewService`, `UpstreamCommitAnalyzer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Authoring/`)
  - **Source Extractors**: `NvdGoldenSetExtractor`, `FunctionHintExtractor`, `CweToSinkMapper` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Authoring/Extractors/`)
  - **Configuration**: `GoldenSetOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Configuration/`)
  - **Models**: `GoldenSetDefinition` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Models/`)
  - **Serialization**: `GoldenSetYamlSerializer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Serialization/`)
  - **Storage**: `PostgresGoldenSetStore` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Storage/`), `IGoldenSetStore`
  - **Validation**: `GoldenSetValidator`, `ICveValidator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Validation/`)
  - **Services**: `SinkRegistry`, `ISinkRegistry` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Services/`)
  - **Registration**: `GoldenSetServiceCollectionExtensions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Extensions/`)

## E2E Test Plan
- [ ] Author a golden set from NVD data via `NvdGoldenSetExtractor` and verify extracted CVE entries
- [ ] Enrich golden set with function hints via `FunctionHintExtractor` and verify hint annotations
- [ ] Map CWEs to sink functions via `CweToSinkMapper` and verify correct mappings
- [ ] Serialize golden set to YAML via `GoldenSetYamlSerializer` and verify round-trip fidelity
- [ ] Store golden set in PostgreSQL via `PostgresGoldenSetStore` and verify retrieval
- [ ] Validate golden set definition via `GoldenSetValidator` and verify errors for invalid entries
- [ ] Verify `SinkRegistry` maintains the sink function catalog
- [ ] Verify review workflow via `GoldenSetReviewService` transitions (Draft -> Review -> Approved)
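
The review workflow named in the last test item (Draft -> Review -> Approved) is a linear state machine. The Python sketch below models only that path; the `advance` helper is hypothetical, and any further transitions that `GoldenSetReviewService` may support (for example rejection back to Draft) are not described in this document and are deliberately left out.

```python
# Only the transitions named in the test plan; anything else is rejected.
TRANSITIONS = {"Draft": "Review", "Review": "Approved"}

def advance(status: str) -> str:
    """Return the next review status, or raise if none is defined."""
    try:
        return TRANSITIONS[status]
    except KeyError:
        raise ValueError(f"no transition defined from {status!r}") from None

status = "Draft"
history = [status]
while status in TRANSITIONS:
    status = advance(status)
    history.append(status)
```

Encoding the allowed transitions as data makes the end-to-end check straightforward: a test can assert both the happy path and that skipping or repeating a step fails.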
@@ -0,0 +1,30 @@
# Ground-Truth Corpus Infrastructure (Symbol Source Abstractions)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Abstraction layer for symbol source connectors, validation harness, KPI computation, and security pair tracking for the ground-truth corpus infrastructure.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/`
- **Key Classes**:
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - orchestrates ground truth validation with `ValidationRunContext`
  - `KpiRegressionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/KpiRegressionService.cs`) - KPI computation and regression tracking
  - `GroundTruthProvenanceResolver` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Provenance/GroundTruthProvenanceResolver.cs`) - resolves symbol provenance from ground truth data
  - `GroundTruthCorpusBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/GroundTruthCorpusBuilder.cs`) - builds training corpus from ground truth pairs
  - Corpus connectors: `AlpineCorpusConnector`, `DebianCorpusConnector`, `RpmCorpusConnector` - distro-specific symbol sources
  - Library connectors: `CurlCorpusConnector`, `GlibcCorpusConnector`, `OpenSslCorpusConnector`, `ZlibCorpusConnector`
- **Interfaces**: `IBinaryCorpusConnector`, `ILibraryCorpusConnector`, `ICorpusSnapshotRepository`, `ISymbolProvenanceResolver`, `IKpiRegressionService`
- **Registration**: `ServiceCollectionExtensions` with `AddCorpusBundleExport/Import` methods

## E2E Test Plan
- [ ] Connect to a corpus source via library connector and verify binary extraction works
- [ ] Resolve symbol provenance for a known function via `GroundTruthProvenanceResolver`
- [ ] Build a ground truth corpus for ML training via `GroundTruthCorpusBuilder`
- [ ] Track KPI metrics across multiple validation runs and verify regression detection
- [ ] Verify corpus snapshot repository persists and retrieves snapshots with correct IDs
- [ ] Verify security pair tracking (vulnerable/fixed binary pairs) across corpus connectors
@@ -0,0 +1,30 @@
# ML Function Embedding Service (CodeBERT/ONNX Inference)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
ONNX-based function embedding inference service for binary function matching using CodeBERT-derived models. Includes training corpus schema, embedding generation pipeline, and ensemble integration with existing matchers. No direct match in known features list.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/`
- **Key Classes**:
  - `IEmbeddingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/IEmbeddingService.cs`) - generates `FunctionEmbedding` from binary functions; supports batch generation, similarity computation, and nearest-neighbor search
  - `InMemoryEmbeddingIndex` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/InMemoryEmbeddingIndex.cs`) - in-memory vector index for fast embedding similarity search with cosine similarity
  - `MlEmbeddingMatcherAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/MlEmbeddingMatcherAdapter.cs`) - adapts ML embeddings for ensemble decision engine
  - `GroundTruthCorpusBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/GroundTruthCorpusBuilder.cs`) - builds training corpus from ground truth data with JsonLines/Json export
  - `ICorpusBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/ICorpusBuilder.cs`) - training corpus building interface with `CorpusExportFormat` enum
  - `FunctionEmbedding` - vector embedding record for binary functions
- **Integration**: `FunctionAnalysisBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs`) passes ML embeddings into ensemble scoring
- **Registration**: `TrainingServiceCollectionExtensions` for DI setup

## E2E Test Plan
- [ ] Generate a function embedding from a known binary function and verify vector dimensions are correct
- [ ] Compute similarity between embeddings of identical functions (compiled with different flags) and verify high similarity
- [ ] Add embeddings to `InMemoryEmbeddingIndex` and verify nearest-neighbor search returns correct matches
- [ ] Build a training corpus from ground truth pairs via `GroundTruthCorpusBuilder`
- [ ] Verify `MlEmbeddingMatcherAdapter` integrates with ensemble decision engine
- [ ] Verify batch embedding generation processes multiple functions efficiently
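
Cosine-similarity nearest-neighbor search over an in-memory index can be sketched in plain Python. The `EmbeddingIndex` class and its methods are hypothetical stand-ins, not the `InMemoryEmbeddingIndex` API, and the two-dimensional vectors are toy values; real embeddings would come from the ONNX model.

```python
import math

def cosine(a, b) -> float:
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class EmbeddingIndex:
    """Brute-force index: scores every stored vector against the query."""

    def __init__(self):
        self._vectors = {}

    def add(self, key, vector):
        self._vectors[key] = list(vector)

    def nearest(self, query, k=1):
        scored = [(cosine(query, v), key) for key, v in self._vectors.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(key, score) for score, key in scored[:k]]

index = EmbeddingIndex()
index.add("ssl_read", [1.0, 0.0])
index.add("inflate", [0.0, 1.0])
best_key, best_score = index.nearest([0.9, 0.1], k=1)[0]
```

A linear scan is adequate for small corpora; it is shown here only because it makes the similarity ranking explicit.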
@@ -0,0 +1,28 @@
# Reproducible Build Verification

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Reproducible build backend supports local rebuilds with air-gap bundle support for verifying binary provenance.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`, `src/BinaryIndex/StellaOps.BinaryIndex.Worker/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/`
- **Key Classes**:
  - `ReproducibleBuildJob` (`src/BinaryIndex/StellaOps.BinaryIndex.Worker/Jobs/ReproducibleBuildJob.cs`) - worker job that executes reproducible builds using `IFunctionFingerprintExtractor`, `IPatchDiffEngine`, and `IFingerprintClaimRepository`
  - `ReproducibleBuildJob` (builders) (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/ReproducibleBuildJobTypes.cs`) - builder-level reproducible build job with options
  - `ReproducibleBuildOptions` - configuration for build verification parameters
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - validates reproducible build outputs
  - `FingerprintClaim` / `FingerprintClaimEvidence` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/FingerprintClaimModels.cs`) - claims produced from build verification
- **Interfaces**: `IReproducibleBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/IReproducibleBuilder.cs`), `IReproducibleBuildJob`

## E2E Test Plan
- [ ] Submit a source package and verify reproducible build produces matching binary fingerprints
- [ ] Verify `FingerprintClaim` is generated with correct `FingerprintClaimEvidence` linking to Build-ID
- [ ] Verify build verification with non-matching binaries produces a failed verification result
- [ ] Verify air-gap bundle support: import build inputs from bundle and verify build completes offline
- [ ] Verify `ReproducibleBuildOptions` configuration controls build behavior
- [ ] Verify build job integrates with `IPatchDiffEngine` for post-build comparison
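
The core check behind the first and third test items, comparing per-function fingerprints of a reference binary against a local rebuild, can be sketched in Python. The `verify_rebuild` function and its result fields are hypothetical; the sketch assumes fingerprints are already available as a name-to-hash mapping and says nothing about how `ReproducibleBuildJob` actually obtains or stores them.

```python
def verify_rebuild(reference: dict, rebuilt: dict) -> dict:
    """Compare name -> fingerprint maps of a reference binary and a rebuild."""
    shared = reference.keys() & rebuilt.keys()
    mismatched = sorted(n for n in shared if reference[n] != rebuilt[n])
    missing = sorted(reference.keys() - rebuilt.keys())   # absent from rebuild
    extra = sorted(rebuilt.keys() - reference.keys())     # only in rebuild
    return {
        "verified": not (mismatched or missing or extra),
        "mismatched": mismatched,
        "missing": missing,
        "extra": extra,
    }

# Hypothetical fingerprints, for illustration only.
reference = {"ssl_read": "ab12", "ssl_write": "cd34"}
good = verify_rebuild(reference, {"ssl_read": "ab12", "ssl_write": "cd34"})
bad = verify_rebuild(reference, {"ssl_read": "ab12", "ssl_write": "ffff", "debug_hook": "0000"})
```

Reporting which functions mismatched, rather than a single pass/fail flag, gives a failed verification result something actionable to attach as evidence.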
@@ -0,0 +1,27 @@
# SBOM Bom-Ref Linkage in Binary Function Identity

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Extended function identity model (`SymbolSignatureV2`) with SBOM bom-ref linkage following the format `module:bom-ref:offset:canonical-IR-hash`. Includes the `IBomRefResolver` interface for resolving binary artifacts to SBOM component references with graceful fallback.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- **Key Classes**:
  - `DeltaSigPredicateV2` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicateV2.cs`) - V2 predicate including SBOM bom-ref linkage in function identity records
  - `DeltaSigVexBridge` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/VexIntegration/DeltaSigVexBridge.cs`) - VEX bridge uses symbol provenance (which includes SBOM refs) to enrich VEX observations
  - `GroundTruthProvenanceResolver` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Provenance/GroundTruthProvenanceResolver.cs`) - enriches function matches with `SymbolProvenance` including source references
  - `Models.cs` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Models.cs`) - `SymbolMatchResult` with `SymbolProvenance` property for bom-ref linkage
- **Interfaces**: `ISymbolProvenanceResolver` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Provenance/ISymbolProvenanceResolver.cs`) - resolves `SymbolProvenanceV2` with batch lookup support
- **Source**: SPRINT_20260118_026_BinaryIndex_deltasig_enhancements.md

## E2E Test Plan
- [ ] Resolve a binary function to its SBOM bom-ref via `ISymbolProvenanceResolver` and verify the linkage format
- [ ] Verify `DeltaSigPredicateV2` includes bom-ref linkage in function identity records
- [ ] Verify `DeltaSigVexBridge` includes provenance source from SBOM in VEX observations
- [ ] Verify batch lookup via `BatchLookupAsync` resolves multiple symbols efficiently
- [ ] Verify graceful fallback when SBOM bom-ref is not available (function identity still works without it)
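
The identity format `module:bom-ref:offset:canonical-IR-hash` has one practical wrinkle: bom-refs are often package URLs, which themselves contain colons. The Python sketch below shows one way to build and parse such a string. Several details are assumptions not confirmed by this document: the hexadecimal offset encoding, an empty bom-ref segment as the fallback representation, and module names that contain no colon. The helper names are hypothetical.

```python
def format_identity(module: str, bom_ref, offset: int, ir_hash: str) -> str:
    """Build `module:bom-ref:offset:canonical-IR-hash`.

    Assumption: a missing bom-ref is written as an empty segment, so the
    identity still has four parts when no SBOM component could be resolved.
    """
    return f"{module}:{bom_ref or ''}:{offset:#x}:{ir_hash}"

def parse_identity(identity: str) -> dict:
    """Split from both ends so colons inside the bom-ref survive."""
    module, rest = identity.split(":", 1)          # module assumed colon-free
    bom_ref, offset, ir_hash = rest.rsplit(":", 2)
    return {"module": module, "bom_ref": bom_ref or None,
            "offset": int(offset, 16), "ir_hash": ir_hash}

# Hypothetical values, for illustration only.
linked = format_identity("libssl.so.3", "pkg:deb/debian/openssl@3.0.11", 0x1A40, "ab12")
unlinked = format_identity("libssl.so.3", None, 0x1A40, "ab12")
```

Splitting the module from the left and the offset and hash from the right leaves whatever remains in the middle as the bom-ref, however many colons it contains.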
@@ -0,0 +1,28 @@
# Scanner Integration for Binary Analysis

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Binary vulnerability analysis integrated into the scanner worker pipeline with patch verification and build provenance reproducibility verification.

## Implementation Details
- **Modules**: `src/BinaryIndex/`, `src/Scanner/`
- **Key Classes**:
  - `BinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/Services/BinaryVulnerabilityService.cs`) - core binary vulnerability detection service used by scanner pipeline; queries `ICorpusQueryService` for function matches
  - `CachedBinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/CachedBinaryVulnerabilityService.cs`) - cached decorator with `LookupByDeltaSignatureAsync` for scanner integration
  - `ResolutionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Resolution/ResolutionService.cs`) - resolves whether a CVE is fixed based on binary-level evidence
  - `ReproducibleBuildJob` (`src/BinaryIndex/StellaOps.BinaryIndex.Worker/Jobs/ReproducibleBuildJob.cs`) - worker job for build provenance verification
  - `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs`) - multi-tier matching for scanner-detected vulnerabilities
- **Integration Points**: Scanner pipeline calls `IBinaryVulnerabilityService` to enrich findings with binary-level patch verification

## E2E Test Plan
- [ ] Trigger a scanner scan on a container with known binaries and verify binary analysis runs automatically
- [ ] Verify scanner findings are enriched with binary-level patch status (Fixed, Vulnerable, Unknown)
- [ ] Verify `CachedBinaryVulnerabilityService` caches scanner lookups for performance
- [ ] Verify binary analysis results are included in scanner output findings
- [ ] Verify build provenance verification runs as a background worker job
- [ ] Verify ensemble decision engine produces consistent results when called from scanner pipeline
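
The cached-decorator pattern used for scanner lookups can be shown with a short Python sketch. Both classes are hypothetical stand-ins: `SlowLookup` plays the role of the underlying vulnerability service, and `CachedLookup` wraps it behind the same interface. The real `CachedBinaryVulnerabilityService` presumably also deals with expiry and asynchronous calls, which this sketch omits.

```python
class SlowLookup:
    """Stand-in for the underlying service; counts how often it is hit."""

    def __init__(self, data):
        self._data = data
        self.calls = 0

    def lookup(self, build_id):
        self.calls += 1
        return self._data.get(build_id, "Unknown")

class CachedLookup:
    """Decorator with the same `lookup` interface that memoizes results."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def lookup(self, build_id):
        if build_id not in self._cache:
            self._cache[build_id] = self._inner.lookup(build_id)
        return self._cache[build_id]

backend = SlowLookup({"build-1": "Fixed", "build-2": "Vulnerable"})
service = CachedLookup(backend)
results = [service.lookup("build-1"), service.lookup("build-1"), service.lookup("build-3")]
```

Because the decorator exposes the same method as the wrapped service, callers such as a scanner pipeline do not need to know whether caching is enabled.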
29
docs/features/checked/binaryindex/static-to-binary-braid.md
Normal file
@@ -0,0 +1,29 @@
# Static-to-Binary Braid (Build-Time Function Proof)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Full binary analysis pipeline with function fingerprinting, delta signatures, multi-backend disassembly (Iced, B2R2), normalization, and semantic analysis for build-time function proof.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/`
- **Key Classes**:
  - `PatchDiffEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/PatchDiffEngine.cs`) - orchestrates build-time function proof by comparing pre/post binaries
  - `DeltaSigServiceV2` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSigServiceV2.cs`) - V2 delta-sig with IR diff support
  - `SemanticFingerprintGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs`) - semantic function fingerprinting
  - `HybridDisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/HybridDisassemblyService.cs`) - multi-backend disassembly
  - `CodeNormalizer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/CodeNormalizer.cs`) - normalizes decompiled code for comparison
  - `SemanticEquivalence` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/SemanticEquivalence.cs`) - semantic equivalence checking between code versions
  - `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs`) - combines all matching tiers for final proof verdict

## E2E Test Plan
- [ ] Submit source-to-binary pair and verify function-level proof linking source functions to binary symbols
- [ ] Verify multi-backend disassembly: same binary analyzed by Iced and B2R2 produces compatible fingerprints
- [ ] Verify delta-sig generation creates build-time proof of which functions changed
- [ ] Verify semantic analysis identifies equivalent functions across different compiler outputs
- [ ] Verify code normalization strips compiler-specific artifacts for fair comparison
- [ ] Verify ensemble decision produces final proof verdict combining all evidence tiers
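
Combining several matching tiers into one verdict is commonly done with a weighted average, and the Python sketch below shows that approach. It is an illustration only: the tier names, the weights, the threshold, and the verdict labels are all hypothetical, and this document does not describe how `EnsembleDecisionEngine` actually weighs its evidence.

```python
# Hypothetical tier weights; the real engine's tiers and weights are not documented here.
WEIGHTS = {"fingerprint": 0.5, "semantic": 0.3, "embedding": 0.2}

def ensemble_verdict(scores: dict, weights: dict = WEIGHTS, threshold: float = 0.8) -> dict:
    """Weighted average over the tiers that produced a score.

    Tiers without a score are skipped and the remaining weights are
    renormalized, so a missing tier neither helps nor hurts the verdict.
    """
    used = {tier: w for tier, w in weights.items() if tier in scores}
    total = sum(used.values())
    if total == 0:
        return {"score": 0.0, "verdict": "unknown"}
    score = sum(scores[tier] * w for tier, w in used.items()) / total
    return {"score": score, "verdict": "match" if score >= threshold else "no-match"}

strong = ensemble_verdict({"fingerprint": 1.0, "semantic": 0.8})
weak = ensemble_verdict({"fingerprint": 0.2, "semantic": 0.3, "embedding": 0.1})
empty = ensemble_verdict({})
```

Renormalizing over the available tiers is one design choice among several; an alternative would treat a missing tier as a score of zero, which penalizes incomplete evidence.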