Ensemble decision engine for multi-tier matching
Module
BinaryIndex
Status
PARTIALLY_IMPLEMENTED
Description
The ensemble decision engine combines multiple matching tiers (range match, Build-ID, fingerprint) with configurable weight tuning to classify vulnerabilities.
Implementation Details
- Modules: src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/
- Key Classes:
  - EnsembleDecisionEngine (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs) - combines multiple matching signals with configurable weights into a final vulnerability classification decision
  - FunctionAnalysisBuilder (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs) - builds function analysis inputs including optional ML embeddings
  - WeightTuningService (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/WeightTuningService.cs) - tunes ensemble weights based on golden set validation results
  - EnsembleOptions (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs) - configurable weights and thresholds for matching tiers
  - MlEmbeddingMatcherAdapter (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/MlEmbeddingMatcherAdapter.cs) - adapts ML function embeddings for ensemble use
- Interfaces: IEnsembleDecisionEngine (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs)
- Registration: EnsembleServiceCollectionExtensions.AddBinarySimilarityServices() for full pipeline setup
- Benchmarks: EnsembleAccuracyBenchmarks, EnsembleLatencyBenchmarks (src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Benchmarks/)
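The idea of combining weighted signals into one decision can be sketched as below. This is a minimal illustration in Python, not the EnsembleDecisionEngine code: the signal names (syntactic, semantic, embedding, exact-hash) follow the Verification notes, while the weight values, the threshold, the exact-hash short-circuit, and the names Weights, ensemble_score, and classify are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    # Hypothetical default weights; the real values live in EnsembleOptions.
    syntactic: float = 0.25
    semantic: float = 0.35
    embedding: float = 0.40


SIGNAL_NAMES = ("syntactic", "semantic", "embedding")


def ensemble_score(signals: dict, weights: Weights, exact_hash_match: bool = False) -> float:
    """Weighted average of the available similarity signals, each in [0, 1]."""
    # Assumption: an exact hash match is treated as conclusive on its own.
    if exact_hash_match:
        return 1.0
    # Only signals that are present contribute; weights are renormalized
    # so that a missing signal (e.g. no ML embedding) does not lower the score.
    pairs = [(getattr(weights, name), signals[name])
             for name in SIGNAL_NAMES if name in signals]
    total_weight = sum(w for w, _ in pairs)
    if total_weight == 0:
        return 0.0
    return sum(w * s for w, s in pairs) / total_weight


def classify(score: float, threshold: float = 0.85) -> str:
    # Hypothetical threshold; a real engine may use several confidence bands.
    return "match" if score >= threshold else "no_match"
```

Renormalizing over the signals that are present is one possible design choice; an engine could instead treat a missing signal as zero, which penalizes functions analyzed without embeddings.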
E2E Test Plan
- Submit a binary with known vulnerability and verify ensemble produces correct classification
- Verify weight tuning: adjust instruction weight to 0.6 and verify it changes classification outcomes
- Verify multi-tier integration: Build-ID match, fingerprint match, and ML embedding all contribute to score
- Verify FunctionAnalysisBuilder correctly assembles all matching dimensions
- Verify WeightTuningService optimizes weights based on golden set validation accuracy
- Run accuracy benchmark and verify F1 score meets minimum threshold
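The weight-tuning and F1 checks in the plan above can be pictured with a small grid search over a golden set. This is a hedged sketch in Python, not the WeightTuningService algorithm: the grid-search strategy, the three-signal tuple layout, the 0.5 threshold, and the names f1_score and tune_weights are assumptions for illustration only.

```python
from itertools import product


def f1_score(predicted, expected):
    """F1 of boolean predictions against boolean golden labels."""
    tp = sum(p and e for p, e in zip(predicted, expected))
    fp = sum(p and not e for p, e in zip(predicted, expected))
    fn = sum(e and not p for p, e in zip(predicted, expected))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_weights(golden, threshold=0.5, steps=10):
    """Try every weight triple on a grid that sums to 1; keep the best F1.

    golden: list of ((syntactic, semantic, embedding), is_match) pairs.
    Returns (best_weights, best_f1).
    """
    expected = [label for _, label in golden]
    best_weights, best_f1 = None, -1.0
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue
        weights = (i / steps, j / steps, (steps - i - j) / steps)
        predicted = [
            sum(w * s for w, s in zip(weights, signals)) >= threshold
            for signals, _ in golden
        ]
        score = f1_score(predicted, expected)
        if score > best_f1:
            best_weights, best_f1 = weights, score
    return best_weights, best_f1
```

An E2E check along these lines would run the tuner on the golden set and assert that the returned F1 is at or above the configured minimum, without asserting specific weight values.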
Verification
- Run: run-001 (2026-02-11 UTC).
- Tier 1/2 builds and tests passed (37/37), but parity review found contract mismatch and missing coverage for key claims.
- Ensemble signal model currently exposes syntactic, semantic, embedding, and exact-hash signals, but the feature contract claims range-match, Build-ID, and fingerprint tiers (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs:232).
- FunctionAnalysisBuilder explicitly retains a simplified semantic-graph path when binary data is unavailable (src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs:87).
- No direct tests were found for FunctionAnalysisBuilder or MlEmbeddingMatcherAdapter in src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Ensemble.Tests.