Ensemble decision engine for multi-tier matching
Module
BinaryIndex
Status
IMPLEMENTED
Description
The ensemble decision engine combines multiple matching tiers (range match, Build-ID, fingerprint) into a single vulnerability classification, using configurable weights that can be tuned.
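The weighted combination described above can be sketched roughly as follows. This is an illustration in Python, not the C# code in EnsembleDecisionEngine.cs: the tier keys, the weight values, the 0.5 threshold, and the names EnsembleWeights, combine_scores, and classify are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EnsembleWeights:
    # Hypothetical weights and threshold; the project's actual values are
    # defined by EnsembleOptions and are not shown in this document.
    weights: Dict[str, float] = field(default_factory=lambda: {
        "range_match": 0.2,
        "build_id": 0.5,
        "fingerprint": 0.3,
    })
    threshold: float = 0.5


def combine_scores(scores: Dict[str, float], opts: EnsembleWeights) -> float:
    """Weighted average over the tiers that actually produced a score."""
    used = {tier: w for tier, w in opts.weights.items() if tier in scores}
    total = sum(used.values())
    if total == 0:
        return 0.0  # no tier reported anything
    return sum(scores[tier] * w for tier, w in used.items()) / total


def classify(scores: Dict[str, float], opts: EnsembleWeights) -> str:
    """Map the combined score to a label using the assumed threshold."""
    combined = combine_scores(scores, opts)
    return "vulnerable" if combined >= opts.threshold else "not_vulnerable"
```

Renormalizing over the tiers that reported a score is one possible design choice: a binary without a Build-ID is then judged on the remaining tiers instead of being penalized for the missing signal.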
Implementation Details
- Modules: src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/
- Key Classes:
  - EnsembleDecisionEngine (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs) - combines multiple matching signals with configurable weights into a final vulnerability classification decision
  - FunctionAnalysisBuilder (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs) - builds function analysis inputs, including optional ML embeddings
  - WeightTuningService (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/WeightTuningService.cs) - tunes ensemble weights based on golden set validation results
  - EnsembleOptions (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs) - configurable weights and thresholds for matching tiers
  - MlEmbeddingMatcherAdapter (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/MlEmbeddingMatcherAdapter.cs) - adapts ML function embeddings for ensemble use
- Interfaces: IEnsembleDecisionEngine (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs)
- Registration: EnsembleServiceCollectionExtensions.AddBinarySimilarityServices() for full pipeline setup
- Benchmarks: EnsembleAccuracyBenchmarks, EnsembleLatencyBenchmarks (src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Benchmarks/)
E2E Test Plan
- Submit a binary with a known vulnerability and verify that the ensemble produces the correct classification
- Verify weight tuning: adjust instruction weight to 0.6 and verify it changes classification outcomes
- Verify multi-tier integration: Build-ID match, fingerprint match, and ML embedding all contribute to score
- Verify FunctionAnalysisBuilder correctly assembles all matching dimensions
- Verify WeightTuningService optimizes weights based on golden set validation accuracy
- Run accuracy benchmark and verify F1 score meets minimum threshold
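The weight-tuning and F1 checks in the plan above could be prototyped along these lines. This is a sketch under stated assumptions, not the algorithm used by WeightTuningService: it assumes a golden set of (per-tier scores, ground-truth label) pairs and swaps in a plain grid search over weights that sum to 1. The names f1_score, predict, and tune_weights, the 0.1 step, and the 0.5 threshold are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# (per-tier scores, ground truth "is vulnerable"); an assumed golden-set shape.
Sample = Tuple[Dict[str, float], bool]


def f1_score(predictions: List[bool], labels: List[bool]) -> float:
    """F1 = harmonic mean of precision and recall; 0.0 when there are no true positives."""
    tp = sum(p and l for p, l in zip(predictions, labels))
    fp = sum(p and not l for p, l in zip(predictions, labels))
    fn = sum(l and not p for p, l in zip(predictions, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict(scores: Dict[str, float], weights: Dict[str, float], threshold: float) -> bool:
    """Weighted average of tier scores; a missing tier counts as 0 in this simplification."""
    total = sum(weights.values())
    if total == 0:
        return False
    combined = sum(scores.get(t, 0.0) * w for t, w in weights.items()) / total
    return combined >= threshold


def tune_weights(golden: List[Sample], tiers: List[str],
                 step: float = 0.1, threshold: float = 0.5):
    """Grid-search weight vectors summing to 1 and keep the first one with the best F1."""
    n = round(1 / step)
    labels = [label for _, label in golden]
    best_weights, best_f1 = None, -1.0
    for combo in product(range(n + 1), repeat=len(tiers)):
        if sum(combo) != n:
            continue  # only consider weights that sum to exactly 1
        weights = {t: c / n for t, c in zip(tiers, combo)}
        preds = [predict(s, weights, threshold) for s, _ in golden]
        score = f1_score(preds, labels)
        if score > best_f1:
            best_weights, best_f1 = weights, score
    return best_weights, best_f1
```

An E2E check could then assert that the returned F1 is at or above the chosen minimum, and that changing a single weight (for example to 0.6) flips at least one prediction on the golden set.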