save checkpoint: save features
# Ensemble decision engine for multi-tier matching
## Module
BinaryIndex
## Status
PARTIALLY_IMPLEMENTED
## Description
The ensemble decision engine combines signals from multiple matching tiers (range match, Build-ID, fingerprint) into a single vulnerability classification, using configurable, tunable weights.
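
As a rough illustration of how weighted tiers might be combined into one decision, the following Python sketch computes a weighted average of per-tier scores and compares it against a threshold. The tier names mirror the description above; the weight values, the `ensemble_score` and `classify` helpers, and the 0.5 threshold are assumptions made for illustration and are not the actual `EnsembleDecisionEngine` or `EnsembleOptions` API (which is C#).

```python
from typing import Dict

# Hypothetical weights; the real defaults live in EnsembleOptions and may differ.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "range_match": 0.2,
    "build_id": 0.5,
    "fingerprint": 0.3,
}


def ensemble_score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the tier scores (each in [0, 1]) that are present.

    Weights are renormalized over the available tiers, so a missing tier
    does not drag the combined score toward zero.
    """
    available = {tier: w for tier, w in weights.items() if tier in signals and w > 0}
    total = sum(available.values())
    if total == 0:
        return 0.0
    return sum(signals[tier] * w for tier, w in available.items()) / total


def classify(signals: Dict[str, float],
             weights: Dict[str, float] = DEFAULT_WEIGHTS,
             threshold: float = 0.5) -> str:
    """Map the combined score to a label using an assumed threshold."""
    score = ensemble_score(signals, weights)
    return "vulnerable" if score >= threshold else "not_vulnerable"
```

Renormalizing over the tiers that actually produced a signal is one possible design choice; an engine could equally treat a missing tier as a score of zero, which would make the result more conservative.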
## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/`
- **Key Classes**:
  - `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs`) - combines multiple matching signals with configurable weights into a final vulnerability classification decision
  - `FunctionAnalysisBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs`) - builds function analysis inputs including optional ML embeddings
  - `WeightTuningService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/WeightTuningService.cs`) - tunes ensemble weights based on golden set validation results
  - `EnsembleOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs`) - configurable weights and thresholds for matching tiers
  - `MlEmbeddingMatcherAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/MlEmbeddingMatcherAdapter.cs`) - adapts ML function embeddings for ensemble use
- **Interfaces**: `IEnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs`)
- **Registration**: `EnsembleServiceCollectionExtensions.AddBinarySimilarityServices()` for full pipeline setup
- **Benchmarks**: `EnsembleAccuracyBenchmarks`, `EnsembleLatencyBenchmarks` (`src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Benchmarks/`)
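
One plausible way to tune weights against a golden set is an exhaustive grid search over weight vectors that sum to 1, keeping the vector with the highest accuracy. The Python sketch below illustrates that idea only; it does not reflect the actual `WeightTuningService` algorithm, and the `accuracy` and `tune_weights` helpers, the grid step, and the threshold are all assumptions.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Dict[str, float], bool]  # (per-signal scores, expected match)


def accuracy(weights: Dict[str, float], golden: Sequence[Sample], threshold: float) -> float:
    """Fraction of golden-set samples classified correctly under the given weights."""
    if not golden:
        raise ValueError("golden set must not be empty")
    correct = 0
    for signals, expected in golden:
        score = sum(w * signals.get(name, 0.0) for name, w in weights.items())
        if (score >= threshold) == expected:
            correct += 1
    return correct / len(golden)


def tune_weights(names: List[str], golden: Sequence[Sample],
                 steps: int = 10, threshold: float = 0.5) -> Tuple[Dict[str, float], float]:
    """Try every weight vector on a 1/steps grid whose components sum to 1."""
    best_weights: Dict[str, float] = {}
    best_acc = -1.0
    for combo in product(range(steps + 1), repeat=len(names)):
        if sum(combo) != steps:  # integer check avoids float rounding issues
            continue
        weights = {name: c / steps for name, c in zip(names, combo)}
        acc = accuracy(weights, golden, threshold)
        if acc > best_acc:  # strict comparison keeps the first best vector found
            best_weights, best_acc = weights, acc
    return best_weights, best_acc
```

The search space grows quickly with the number of signals, so a grid search like this is only practical for a handful of weights; a production tuner would likely use a smarter optimizer.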
## E2E Test Plan
- [ ] Submit a binary with known vulnerability and verify ensemble produces correct classification
- [ ] Verify weight tuning: set the instruction weight to 0.6 and confirm that classification outcomes change
- [ ] Verify multi-tier integration: Build-ID match, fingerprint match, and ML embedding all contribute to score
- [ ] Verify `FunctionAnalysisBuilder` correctly assembles all matching dimensions
- [ ] Verify `WeightTuningService` optimizes weights based on golden set validation accuracy
- [ ] Run accuracy benchmark and verify F1 score meets minimum threshold
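
For the F1 check in the last item, F1 is the harmonic mean of precision and recall, which simplifies to 2·TP / (2·TP + FP + FN). A minimal Python sketch of such a check follows, assuming boolean ground-truth and predicted labels. The `f1_score` and `meets_threshold` helpers are hypothetical, and `min_f1` is left as a parameter because the source does not state the required threshold.

```python
from typing import Sequence


def f1_score(expected: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when that denominator is zero."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    pairs = list(zip(expected, predicted))
    tp = sum(1 for e, p in pairs if e and p)        # true positives
    fp = sum(1 for e, p in pairs if not e and p)    # false positives
    fn = sum(1 for e, p in pairs if e and not p)    # false negatives
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def meets_threshold(expected: Sequence[bool], predicted: Sequence[bool],
                    min_f1: float) -> bool:
    """True when the F1 score reaches the caller-supplied minimum."""
    return f1_score(expected, predicted) >= min_f1
```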
## Verification
- Run: `run-001` (2026-02-11 UTC).
- Tier 1/2 builds and tests passed (`37/37`), but the parity review found a contract mismatch and missing test coverage for key claims.
- The ensemble signal model currently exposes syntactic, semantic, embedding, and exact-hash signals, whereas the feature contract claims range-match, Build-ID, and fingerprint tiers (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs:232`).
- `FunctionAnalysisBuilder` explicitly retains a simplified semantic-graph path when binary data is unavailable (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs:87`).
- No direct tests were found for `FunctionAnalysisBuilder` or `MlEmbeddingMatcherAdapter` in `src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Ensemble.Tests`.