save checkpoint: save features
@@ -1,28 +0,0 @@
# Binary Call-Graph Extraction and Reachability Analysis

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Binary call-graph extraction with BinaryCallGraphExtractor, reachability lifting via BinaryReachabilityLifter, dedicated BinaryIndex analysis module, and CLI binary commands.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- **Key Classes**:
  - `ReachGraphBinaryReachabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/ReachGraphBinaryReachabilityService.cs`) - binary-level reachability integration with ReachGraph
  - `TaintGateExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/TaintGateExtractor.cs`) - extracts taint gates (bounds checks, null checks, auth checks, permission checks, type checks) from binary call paths
  - `CfgExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/CfgExtractor.cs`) - control flow graph extraction from disassembled binaries
  - `CallNgramGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs`) - generates call-sequence n-grams from lifted IR for call graph analysis
  - `CallGraphMatcherAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/Matchers/MatcherAdapters.cs`) - adapter for call graph matching in validation harness
- **Interfaces**: `ICallNgramGenerator`, `IBinaryFeatureExtractor`

## E2E Test Plan
- [ ] Submit an ELF binary and verify call-graph extraction produces a valid set of function nodes and edges
- [ ] Verify `TaintGateExtractor` classifies conditions correctly (bounds check, null check, auth check, permission check, type check)
- [ ] Verify `CfgExtractor` produces control flow graphs from disassembled functions
- [ ] Verify `CallNgramGenerator` generates n-grams (n=2,3,4) from lifted function IR and computes Jaccard similarity
- [ ] Verify `ReachGraphBinaryReachabilityService` integrates with the ReachGraph module for function-level exploitability assessment
- [ ] Verify call-graph-based reachability results feed into the ensemble decision engine
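
The reachability step over an extracted call graph can be illustrated with a breadth-first traversal. This is a minimal Python sketch, not the C# implementation in `ReachGraphBinaryReachabilityService`; the function name `reachable_functions`, the edge-list shape, and the sample symbol names are all assumptions made for illustration.

```python
from collections import deque


def reachable_functions(edges, entry_points):
    """Return every function reachable from the entry points via call edges.

    `edges` is an iterable of (caller, callee) pairs; this shape is an
    assumption for the sketch, not the BinaryIndex data model.
    """
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, set()).add(callee)

    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for callee in graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: `legacy_decode` is only called from dead code.
edges = [
    ("main", "parse_input"),
    ("parse_input", "copy_buffer"),
    ("unused_helper", "legacy_decode"),
]
reachable = reachable_functions(edges, ["main"])
```

A vulnerable function that is absent from the returned set has no call path from the chosen entry points, which is the condition the test plan describes as lowering exploitability.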
@@ -1,29 +0,0 @@
# Binary Identity Extraction (Build-ID Based)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Binary identity extraction using Build-IDs and symbol observations for ELF binary identification, with ground-truth validation and SBOM stability verification.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`
- **Key Classes**:
  - `BinaryIdentityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/BinaryIdentityService.cs`) - main service for extracting binary identity from ELF/PE/Mach-O binaries
  - `ElfFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs`) - extracts Build-ID, symbol tables, and section info from ELF binaries
  - `PeFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/PeFeatureExtractor.cs`) - extracts CodeView GUID from Windows PE binaries
  - `MachoFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/MachoFeatureExtractor.cs`) - extracts LC_UUID from Mach-O binaries
  - `StreamGuard` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/StreamGuard.cs`) - safe stream handling for non-seekable streams
- **Interfaces**: `IBinaryFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/IBinaryFeatureExtractor.cs`)
- **Models**: `BinaryIdentity` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/BinaryIdentity.cs`)

## E2E Test Plan
- [ ] Submit an ELF binary with a known Build-ID and verify the extracted identity matches
- [ ] Submit a Windows PE binary and verify CodeView GUID extraction via `PeFeatureExtractor`
- [ ] Submit a Mach-O binary and verify LC_UUID extraction via `MachoFeatureExtractor`
- [ ] Verify that non-seekable streams are handled correctly via `StreamGuard`
- [ ] Verify that binaries without Build-IDs fall back to symbol-based identification
- [ ] Verify extracted identities are persisted and queryable through `BinaryVulnerabilityService`
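
For readers unfamiliar with ELF Build-IDs, the sketch below shows how the identifier is laid out inside a `.note.gnu.build-id` section: a 12-byte header (name size, descriptor size, type), the owner name `GNU\0`, then the ID bytes, each padded to 4 bytes. It is written in Python for illustration only and does not reflect how `ElfFeatureExtractor` is implemented; `parse_build_id` is a hypothetical helper and the input is assumed to be the raw note section bytes.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type used by GNU toolchains for the Build-ID


def parse_build_id(note_data: bytes, little_endian: bool = True):
    """Return the Build-ID as a hex string, or None if no such note exists."""
    fmt = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(note_data):
        namesz, descsz, note_type = struct.unpack_from(fmt, note_data, offset)
        offset += 12
        name = note_data[offset:offset + namesz]
        offset += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = note_data[offset:offset + descsz]
        offset += (descsz + 3) & ~3  # descriptor is padded the same way
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None


# Synthetic note with a 4-byte ID (real Build-IDs are usually 20 bytes).
sample_note = (
    struct.pack("<III", 4, 4, NT_GNU_BUILD_ID)
    + b"GNU\x00"
    + bytes.fromhex("deadbeef")
)
build_id = parse_build_id(sample_note)
```

When the function returns `None`, a caller would need another identity source, which corresponds to the symbol-based fallback listed in the test plan.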
@@ -1,28 +0,0 @@
# Binary Intelligence Graph / Binary Identity Indexing

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Complete BinaryIndex module with binary identity indexing, ELF feature extraction, vulnerability fingerprint matching, and reachability status tracking. Advisory marked as SUPERSEDED by this implementation.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`
- **Key Classes**:
  - `BinaryIdentityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/BinaryIdentityService.cs`) - binary identity management
  - `ElfFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs`) - ELF feature extraction
  - `BinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/Services/BinaryVulnerabilityService.cs`) - vulnerability matching with Build-ID catalog lookups
  - `SignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/SignatureMatcher.cs`) - signature-based vulnerability fingerprint matching
  - `ReachGraphBinaryReachabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/ReachGraphBinaryReachabilityService.cs`) - reachability status tracking
- **Models**: `BinaryIdentity`, `FixModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/`)
- **Persistence**: `IBinaryVulnAssertionRepository`, `IBinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/`)

## E2E Test Plan
- [ ] Verify end-to-end flow: submit binary, extract identity, index in the graph, and query by Build-ID
- [ ] Verify vulnerability fingerprint matching via `SignatureMatcher` returns correct match scores
- [ ] Verify reachability status tracking integrates with ReachGraph
- [ ] Verify `BinaryVulnerabilityService` correctly maps match methods (buildid_catalog, delta_signature, etc.)
- [ ] Verify binary identity indexing supports multi-tenant contexts via `ITenantContext`
@@ -1,28 +0,0 @@
# Binary Proof Verification Pipeline

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Full binary proof verification with ground truth sources (buildinfo, debuginfod, reproducible builds), validation, and golden set testing.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation.Abstractions/`
- **Key Classes**:
  - `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - orchestrates reproducible-build-based validation runs
  - `ValidationHarness` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/ValidationHarness.cs`) - main validation harness with matcher adapter factory integration
  - `KpiRegressionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/KpiRegressionService.cs`) - KPI regression detection across validation runs
  - `GroundTruthProvenanceResolver` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Provenance/GroundTruthProvenanceResolver.cs`) - resolves symbol provenance from ground truth sources
- **Interfaces**: `IValidationHarness` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation.Abstractions/IValidationHarness.cs`), `IKpiRegressionService`, `ISymbolProvenanceResolver`
- **Registration**: `ServiceCollectionExtensions.AddCorpusBundleExport/Import` for bundle exchange

## E2E Test Plan
- [ ] Run a validation harness against a known binary pair and verify proof correctness
- [ ] Verify ground truth resolution from buildinfo sources produces correct provenance data
- [ ] Verify KPI regression service detects accuracy drops between validation runs
- [ ] Verify golden set validation produces deterministic, reproducible results
- [ ] Verify corpus bundle export/import round-trips correctly
- [ ] Verify validation run attestor generates valid attestation predicates with corpus snapshot IDs
@@ -1,26 +0,0 @@
# Binary Reachability Analysis

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Binary-level reachability analysis integrating with the ReachGraph and taint gate extraction for function-level exploitability assessment.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`
- **Key Classes**:
  - `ReachGraphBinaryReachabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/ReachGraphBinaryReachabilityService.cs`) - connects binary analysis to the ReachGraph module for function-level reachability
  - `TaintGateExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/TaintGateExtractor.cs`) - identifies taint gate types (BoundsCheck, NullCheck, AuthCheck, PermissionCheck, TypeCheck) from condition strings
  - `SignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/SignatureMatcher.cs`) - matches vulnerability signatures at the binary level
- **Models**: `AnalysisResultModels`, `FingerprintModels`, `SignatureIndexModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/Models/`)
- **Interfaces**: defined in `Interfaces.cs`, implementations in `Implementations.cs`

## E2E Test Plan
- [ ] Submit a binary with a known vulnerable function and verify reachability analysis identifies it as reachable from entry points
- [ ] Verify `TaintGateExtractor` correctly classifies all gate types (bounds, null, auth, permission, type checks)
- [ ] Verify that unreachable vulnerable functions reduce the exploitability score
- [ ] Verify integration between `ReachGraphBinaryReachabilityService` and the ReachGraph module
- [ ] Verify that taint gate presence between entry point and vulnerable function is reflected in the analysis result
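
Classifying a condition string into one of the five gate types named above can be pictured as ordered pattern matching. The Python sketch below is only an illustration of that idea: the gate type names come from this document, but the regular expressions, the matching order, and the function `classify_condition` are assumptions and do not describe the actual heuristics in `TaintGateExtractor`.

```python
import re
from enum import Enum


class TaintGateType(Enum):
    BOUNDS_CHECK = "BoundsCheck"
    NULL_CHECK = "NullCheck"
    AUTH_CHECK = "AuthCheck"
    PERMISSION_CHECK = "PermissionCheck"
    TYPE_CHECK = "TypeCheck"


# Hypothetical patterns, checked in order; the first match wins.
_PATTERNS = [
    (TaintGateType.NULL_CHECK, re.compile(r"(==|!=)\s*(null|nullptr|NULL|0x0)\b")),
    (TaintGateType.AUTH_CHECK, re.compile(r"auth|login|credential", re.I)),
    (TaintGateType.PERMISSION_CHECK, re.compile(r"perm|privilege|is_admin", re.I)),
    (TaintGateType.TYPE_CHECK, re.compile(r"typeof|instanceof|type_id", re.I)),
    (TaintGateType.BOUNDS_CHECK,
     re.compile(r"(<=?|>=?)\s*\w*(len|size|count|max|limit)\w*", re.I)),
]


def classify_condition(condition: str):
    """Return the first gate type whose pattern matches, or None."""
    for gate_type, pattern in _PATTERNS:
        if pattern.search(condition):
            return gate_type
    return None


gate = classify_condition("index < buf_len")
```

Returning `None` for unrecognised conditions keeps ordinary branches from being reported as gates; whether the real extractor behaves this way is not stated in this document.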
@@ -1,30 +0,0 @@
# Binary Resolution API with Cache Layer

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
REST API endpoints (`POST /api/v1/resolve/vuln` and `/vuln/batch`) for querying whether a CVE is resolved through binary-level backport detection. Includes Valkey-backed response caching, rate limiting middleware, and telemetry instrumentation.

## Implementation Details
- **Modules**: `src/BinaryIndex/StellaOps.BinaryIndex.WebService/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/`
- **Key Classes**:
  - `ResolutionController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/ResolutionController.cs`) - REST API controller with `POST /api/v1/resolve/vuln` and `/vuln/batch` endpoints
  - `ResolutionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Resolution/ResolutionService.cs`) - core resolution logic
  - `CachedResolutionService` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Services/CachedResolutionService.cs`) - decorator adding Valkey-backed caching around ResolutionService
  - `ResolutionCacheService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/ResolutionCacheService.cs`) - Valkey cache operations for resolution results
  - `RateLimitingMiddleware` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Middleware/RateLimitingMiddleware.cs`) - per-tenant rate limiting with X-RateLimit headers
  - `ResolutionTelemetry` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Telemetry/ResolutionTelemetry.cs`) - OpenTelemetry metrics for resolution requests, cache hits, rate limits
- **Contracts**: `VulnResolutionRequest/Response`, `ResolutionMatchTypes` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Contracts/Resolution/VulnResolutionContracts.cs`)
- **Cache Options**: `BinaryCacheOptions`, `CacheOptionsValidation` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/`)

## E2E Test Plan
- [ ] Send `POST /api/v1/resolve/vuln` with a known CVE and package purl, verify resolution response contains match type (BuildId, DeltaSignature, etc.)
- [ ] Send batch request to `/api/v1/resolve/vuln/batch` with multiple packages and verify all are resolved
- [ ] Verify cache hit: send same request twice and confirm second response comes from cache (check telemetry counters)
- [ ] Verify rate limiting: exceed the configured request limit and confirm 429 response with X-RateLimit headers
- [ ] Verify telemetry: confirm resolution metrics are emitted (request count, cache hit ratio, latency histogram)
- [ ] Verify disabled rate limiting mode passes requests through without headers
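
The caching-decorator arrangement described for `CachedResolutionService` can be sketched as a wrapper that consults a cache before delegating to the inner resolver. The Python below is an illustration under stated assumptions: an in-memory dictionary stands in for Valkey, and the class `CachedResolver`, the TTL default, the cache key, the sample CVE and purl, and the response shape are all hypothetical.

```python
import time


class CachedResolver:
    """Wraps a resolver callable and serves repeated requests from a cache."""

    def __init__(self, inner, ttl_seconds=300.0, clock=time.monotonic):
        self._inner = inner
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # in-memory stand-in for the Valkey cache
        self.hits = 0
        self.misses = 0

    def resolve(self, cve_id, purl):
        key = (cve_id, purl)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        result = self._inner(cve_id, purl)
        self._entries[key] = (now + self._ttl, result)
        return result


calls = []


def fake_resolver(cve_id, purl):
    # Placeholder for the core resolution logic; the fields are illustrative.
    calls.append((cve_id, purl))
    return {"cve": cve_id, "package": purl, "status": "Fixed"}


resolver = CachedResolver(fake_resolver)
first = resolver.resolve("CVE-2024-0001", "pkg:deb/debian/openssl@3.0.11")
second = resolver.resolve("CVE-2024-0001", "pkg:deb/debian/openssl@3.0.11")
```

The `hits` and `misses` counters play the role of the telemetry counters that the cache-hit test case inspects.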
@@ -1,30 +0,0 @@
# Binary Symbol Table Diff Engine

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Symbol table comparison between binary versions tracking exported/imported symbol changes, version map diffs, GOT/PLT table modifications, and ABI compatibility assessment. Produces content-addressed diff IDs for deterministic reporting.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/`
- **Key Classes**:
  - `SymbolTableDiffAnalyzer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/SymbolTableDiffAnalyzer.cs`) - computes diffs between symbol tables with `ComputeDiffAsync` and `AssessAbiCompatibility`
  - `SymbolTableDiff` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/SymbolTableDiff.cs`) - diff result model with added/removed/changed symbols
  - `VersionMapDiff` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/VersionMapDiff.cs`) - tracks changes in ELF version maps
  - `AbiCompatibility` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/AbiCompatibility.cs`) - ABI compatibility assessment (FullyCompatible, Warnings, Incompatible)
  - `DynamicLinkingDiff` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/DynamicLinkingDiff.cs`) - GOT/PLT table modification tracking
  - `NameDemangler` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/NameDemangler.cs`) - C++ symbol name demangling
- **Interfaces**: `ISymbolTableDiffAnalyzer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/SymbolDiff/ISymbolTableDiffAnalyzer.cs`)
- **Registration**: `SymbolDiffServiceExtensions` for DI setup

## E2E Test Plan
- [ ] Compute diff between two ELF binaries with known symbol changes and verify added/removed symbols are correctly identified
- [ ] Verify `AssessAbiCompatibility` returns `FullyCompatible` when only symbols are added
- [ ] Verify `AssessAbiCompatibility` returns `Incompatible` when exported symbols are removed
- [ ] Verify version map diff detection for ELF version script changes
- [ ] Verify C++ symbol demangling produces human-readable names via `NameDemangler`
- [ ] Verify content-addressed diff IDs are deterministic for identical inputs
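
A content-addressed diff ID can be made deterministic by hashing a canonical serialization of the diff. The Python sketch below combines that with the two compatibility rules stated in the test plan (additions only yield `FullyCompatible`, removed exports yield `Incompatible`). It is an assumption-laden illustration: `diff_symbols`, the payload fields, and the `sha256:` prefix are hypothetical, and the rule that produces `Warnings` is not described in this document, so it is omitted.

```python
import hashlib
import json


def diff_symbols(old_exports, new_exports):
    """Diff two exported-symbol lists and derive a deterministic diff ID."""
    old_set, new_set = set(old_exports), set(new_exports)
    added = sorted(new_set - old_set)
    removed = sorted(old_set - new_set)

    # Only the two rules named in the test plan are modelled here.
    compatibility = "Incompatible" if removed else "FullyCompatible"

    payload = {"added": added, "removed": removed, "compatibility": compatibility}
    # Sorted keys and fixed separators give a byte-identical serialization
    # for identical diffs, regardless of input order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    diff_id = "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {**payload, "diff_id": diff_id}


additive = diff_symbols(["SSL_read", "SSL_write"], ["SSL_write", "SSL_read", "SSL_peek"])
breaking = diff_symbols(["SSL_read", "SSL_write"], ["SSL_read"])
```

Sorting the symbol lists before hashing is what makes the ID independent of the order in which symbols were read from each binary.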
@@ -1,28 +0,0 @@
# Binary-to-VEX Claim Auto-Generation (VexBridge Library)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Automated generation of VEX claims from binary fingerprint match results. The VexBridge library translates binary match evidence into DSSE-signed VEX statements with confidence scores, enabling automated VEX claim production from binary analysis without manual triage.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.VexBridge/`
- **Key Classes**:
  - `VexEvidenceGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.VexBridge/VexEvidenceGenerator.cs`) - generates VEX observations from `BinaryVulnMatch` results; maps `FixState` to `VexClaimStatus` (Fixed -> NotAffected, Vulnerable -> Affected, Unknown -> UnderInvestigation)
  - `BinaryMatchEvidenceSchema` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.VexBridge/BinaryMatchEvidenceSchema.cs`) - defines evidence schema with match type constants (BuildId, DeltaSignature, etc.)
  - `VexBridgeOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.VexBridge/VexBridgeOptions.cs`) - configuration for confidence thresholds
  - `DeltaSigVexBridge` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/VexIntegration/DeltaSigVexBridge.cs`) - bridges delta-signature analysis results into VEX observations with provenance data
- **Interfaces**: `IVexEvidenceGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.VexBridge/IVexEvidenceGenerator.cs`), `IDeltaSigVexBridge`

## E2E Test Plan
- [ ] Generate a VEX claim from a `Fixed` binary match and verify status is `NotAffected` with justification `VulnerableCodeNotPresent`
- [ ] Generate a VEX claim from a `Vulnerable` match and verify status is `Affected`
- [ ] Generate a VEX claim from an `Unknown` match and verify status is `UnderInvestigation`
- [ ] Verify confidence threshold enforcement: low-confidence matches below threshold are rejected
- [ ] Verify Build-ID references are included in VEX evidence when present
- [ ] Verify `DeltaSigVexBridge` produces VEX observations with symbol provenance metadata
- [ ] Verify generated VEX statements include correct DSSE evidence references
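
The `FixState` to `VexClaimStatus` mapping and the confidence gate can be summarised in a few lines. This Python sketch mirrors the mapping and the `VulnerableCodeNotPresent` justification stated in this document; the function `to_vex_claim`, the claim dictionary shape, and the 0.7 default threshold are assumptions, and DSSE signing is left out entirely.

```python
# Mapping as stated in the implementation details above.
FIX_STATE_TO_VEX_STATUS = {
    "Fixed": "NotAffected",
    "Vulnerable": "Affected",
    "Unknown": "UnderInvestigation",
}


def to_vex_claim(cve_id, fix_state, confidence, min_confidence=0.7):
    """Build a claim dict, or return None when confidence is below threshold.

    The threshold value and the dict fields are hypothetical.
    """
    if confidence < min_confidence:
        return None
    status = FIX_STATE_TO_VEX_STATUS[fix_state]
    claim = {"vulnerability": cve_id, "status": status, "confidence": confidence}
    if status == "NotAffected":
        claim["justification"] = "VulnerableCodeNotPresent"
    return claim


fixed_claim = to_vex_claim("CVE-2024-0001", "Fixed", confidence=0.95)
rejected = to_vex_claim("CVE-2024-0001", "Vulnerable", confidence=0.30)
```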
@@ -1,27 +0,0 @@
# BinaryIndex Ops CLI Commands (stella binary ops)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
CLI commands for BinaryIndex ops: health, bench, cache, config subcommands with JSON/table output and BinaryIndex base URL configuration. Also adds --semantic flag to deltasig extract/author/match commands.

## Implementation Details
- **Modules**: `src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/`, `src/Cli/`
- **Key Classes**:
  - `BinaryIndexOpsController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/BinaryIndexOpsController.cs`) - serves health, bench, cache stats, and config endpoints consumed by CLI
  - `BinaryIndexOpsHealthResponse` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/BinaryIndexOpsModels.cs`) - health response model with lifter warmness, component versions
  - `BinaryIndexOpsOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/BinaryIndexOpsModels.cs`) - ops configuration with redacted keys and bench rate limits
  - `B2R2LifterPool` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPool.cs`) - lifter pool stats reported via ops health endpoint
- **Source**: SPRINT_20260112_006_CLI_binaryindex_ops_cli.md

## E2E Test Plan
- [ ] Run `stella binary ops health` and verify JSON output includes lifter warmness and version info
- [ ] Run `stella binary ops bench` and verify latency measurement results are returned
- [ ] Run `stella binary ops cache` and verify Valkey hit/miss statistics are reported
- [ ] Run `stella binary ops config` and verify effective configuration is returned with secrets redacted
- [ ] Run `stella deltasig extract --semantic` and verify semantic flag is passed through
- [ ] Verify table output format renders correctly for all subcommands
@@ -1,27 +0,0 @@
# BinaryIndex Ops Endpoints (Health, Bench, Cache Stats, Config)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Ops endpoints for BinaryIndex: health (lifter warmness), bench/run (latency measurement), cache stats (Valkey hit/miss), and effective config with deterministic JSON responses.

## Implementation Details
- **Modules**: `src/BinaryIndex/StellaOps.BinaryIndex.WebService/`
- **Key Classes**:
  - `BinaryIndexOpsController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/BinaryIndexOpsController.cs`) - exposes `GET /ops/health`, bench, cache stats, and config endpoints; integrates with `B2R2LifterPool` and `FunctionIrCacheService`
  - `B2R2LifterPool` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPool.cs`) - provides pool stats (warm ISAs, pool sizes, acquire timeouts)
  - `FunctionIrCacheService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/FunctionIrCacheService.cs`) - Valkey-based function IR cache with hit/miss reporting
  - `B2R2LifterPoolOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPoolOptions.cs`) - pool configuration (MaxPoolSizePerIsa, EnableWarmPreload, AcquireTimeout)
  - `BinaryIndexOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/BinaryIndexOptions.cs`) - top-level options with B2R2Pool, SemanticLifting sections
- **Source**: SPRINT_20260112_004_BINIDX_b2r2_lowuir_perf_cache.md

## E2E Test Plan
- [ ] Call `GET /ops/health` and verify response includes lifter pool warmness state and component versions
- [ ] Call bench endpoint and verify deterministic latency measurement JSON
- [ ] Call cache stats endpoint and verify Valkey hit/miss counts and cache key count
- [ ] Call config endpoint and verify effective configuration is returned with secrets redacted
- [ ] Verify all ops responses use deterministic JSON serialization (consistent key ordering)
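
Deterministic JSON with secret redaction, as required of the config endpoint, amounts to masking the configured keys and then serializing with a fixed key order. The Python sketch below illustrates the idea only; `render_ops_config`, the mask string, and the sample key names `ConnectionString` and `Ttl` are assumptions rather than the actual controller behaviour.

```python
import json


def render_ops_config(config, redacted_keys, mask="***"):
    """Mask redacted keys at any depth, then serialize with sorted keys."""

    def redact(value):
        if isinstance(value, dict):
            return {
                key: (mask if key in redacted_keys else redact(item))
                for key, item in value.items()
            }
        if isinstance(value, list):
            return [redact(item) for item in value]
        return value

    # sort_keys gives consistent key ordering regardless of insertion order.
    return json.dumps(redact(config), sort_keys=True, separators=(",", ":"))


# Hypothetical effective configuration; only MaxPoolSizePerIsa is a name
# taken from this document.
config = {
    "Cache": {"Ttl": 60, "ConnectionString": "valkey://user:secret@host"},
    "B2R2Pool": {"MaxPoolSizePerIsa": 4},
}
rendered = render_ops_config(config, redacted_keys={"ConnectionString"})
# rendered == '{"B2R2Pool":{"MaxPoolSizePerIsa":4},"Cache":{"ConnectionString":"***","Ttl":60}}'
```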
@@ -1,30 +0,0 @@
# BinaryIndex User Configuration System

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Comprehensive user configuration for B2R2 lifter pooling, LowUIR enablement, Valkey function cache behavior, PostgreSQL persistence, with ops endpoints for health/bench/cache/config and redaction rules for operator visibility.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/`
- **Key Classes**:
  - `BinaryIndexOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/BinaryIndexOptions.cs`) - top-level config with sections for B2R2Pool, SemanticLifting, cache, persistence
  - `B2R2PoolOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPoolOptions.cs`) - MaxPoolSizePerIsa (1-64), EnableWarmPreload, AcquireTimeout, EnableMetrics, WarmPreloadIsas
  - `SemanticLiftingOptions` - B2R2Version, Enabled flag, function limits
  - `BinaryCacheOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/BinaryCacheOptions.cs`) - Valkey cache configuration
  - `CacheOptionsValidation` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/CacheOptionsValidation.cs`) - validates cache config at startup
  - `FunctionIrCacheOptions` - function IR cache TTL and size limits
  - `BinaryIndexOpsOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Configuration/BinaryIndexOpsModels.cs`) - redacted keys list for operator visibility, bench rate limits
- **Source**: SPRINT_20260112_007_BINIDX_binaryindex_user_config.md

## E2E Test Plan
- [ ] Configure B2R2 pool with custom MaxPoolSizePerIsa and verify pool initializes with correct size
- [ ] Configure SemanticLifting as disabled and verify LowUIR lifting is skipped
- [ ] Configure Valkey cache options and verify function IR cache respects TTL settings
- [ ] Verify configuration binding from `StellaOps:BinaryIndex:*` config sections
- [ ] Verify redacted keys do not appear in ops config endpoint responses
- [ ] Verify CacheOptionsValidation rejects invalid configuration at startup
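
Startup validation of pool options can be pictured as a function that collects every violated constraint before the service starts. The Python sketch below models the option names listed above; the 1-64 range for `MaxPoolSizePerIsa` comes from this document, while the default values, the positive-timeout rule, and the function `validate_pool_options` are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class B2R2PoolOptions:
    # Field names follow the options listed above; defaults are hypothetical.
    max_pool_size_per_isa: int = 4
    enable_warm_preload: bool = True
    acquire_timeout_seconds: float = 5.0
    enable_metrics: bool = True
    warm_preload_isas: list = field(default_factory=list)


def validate_pool_options(options: B2R2PoolOptions):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if not 1 <= options.max_pool_size_per_isa <= 64:
        errors.append("MaxPoolSizePerIsa must be between 1 and 64")
    if options.acquire_timeout_seconds <= 0:
        # Assumed rule; the document does not state timeout constraints.
        errors.append("AcquireTimeout must be positive")
    return errors


valid_errors = validate_pool_options(B2R2PoolOptions(max_pool_size_per_isa=8))
invalid_errors = validate_pool_options(B2R2PoolOptions(max_pool_size_per_isa=128))
```

Collecting all errors rather than failing on the first one lets an operator fix a misconfigured section in a single pass.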
@@ -1,30 +0,0 @@
# Byte-Level Binary Diffing with Rolling Hash Windows

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Byte-level binary comparison using rolling hash windows that identifies exactly which byte ranges changed between binary versions. Produces binary proof snippets with section analysis and privacy controls to strip raw bytes. Supports stream and file-based comparison.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/`
- **Key Classes**:
  - `PatchDiffEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/PatchDiffEngine.cs`) - core diffing engine computing byte-level differences between binary versions using function fingerprints
  - `FunctionDiffer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionDiffer.cs`) - function-level comparison with semantic analysis option and call-graph edge diffing
  - `FunctionRenameDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionRenameDetector.cs`) - detects function renames between versions using fingerprint similarity
  - `VerdictCalculator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/VerdictCalculator.cs`) - computes patch verification verdicts from diff results
  - `InMemoryDiffResultStore` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/Storage/InMemoryDiffResultStore.cs`) - stores diff results with content-addressed IDs
- **Models**: `PatchDiffModels`, `DiffEvidenceModels`, `BinaryReference` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/Models/`)
- **Interfaces**: `IPatchDiffEngine`, `IDiffResultStore` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/`)
- **Source**: SPRINT_20260112_200_004_CHGTRC_byte_diffing.md

## E2E Test Plan
- [ ] Submit two binary versions and verify byte-range differences are identified with correct offsets
- [ ] Verify section analysis identifies which ELF sections changed (.text, .data, .rodata)
- [ ] Verify privacy controls strip raw bytes from proof snippets when configured
- [ ] Verify `FunctionRenameDetector` correctly identifies renamed functions between versions
- [ ] Verify `VerdictCalculator` produces correct patch verification verdict (patched vs unpatched)
- [ ] Verify diff results are stored with deterministic content-addressed IDs
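
To make the rolling-hash idea concrete, the Python sketch below slides a fixed-size window over each binary, updates a polynomial hash incrementally, and reports the byte ranges of the new version whose windows do not occur anywhere in the old one. It is a simplified illustration, not the algorithm used by `PatchDiffEngine`: the base, modulus, window size, and both function names are assumptions, and the reported ranges over-approximate the changed bytes by up to one window on each side.

```python
BASE = 257
MOD = (1 << 61) - 1


def rolling_hashes(data: bytes, window: int):
    """Yield (offset, hash) for each window, updating the hash incrementally."""
    if len(data) < window:
        return
    high = pow(BASE, window - 1, MOD)  # weight of the byte leaving the window
    h = 0
    for byte in data[:window]:
        h = (h * BASE + byte) % MOD
    yield 0, h
    for i in range(window, len(data)):
        # Drop the outgoing byte, shift, and add the incoming byte.
        h = ((h - data[i - window] * high) * BASE + data[i]) % MOD
        yield i - window + 1, h


def changed_ranges(old: bytes, new: bytes, window: int = 4):
    """Return merged [start, end) ranges in `new` with windows unseen in `old`."""
    known = {h for _, h in rolling_hashes(old, window)}
    ranges = []
    for offset, h in rolling_hashes(new, window):
        if h in known:
            continue
        end = offset + window
        if ranges and offset <= ranges[-1][1]:
            ranges[-1] = (ranges[-1][0], end)  # extend the previous range
        else:
            ranges.append((offset, end))
    return ranges


old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBXBCCCCDDDD"  # single byte changed at offset 6
diff = changed_ranges(old, new)
```

Because only hashes are compared, a report built from these ranges can omit the raw bytes, which is in the spirit of the privacy control described above.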
@@ -1,27 +0,0 @@
# Call-Ngram Fingerprinting for Binary Similarity Analysis

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Call-sequence n-gram extraction from lifted IR for improved cross-compiler binary similarity matching. Generates n-grams (n=2,3,4) from function call sequences and integrates into the semantic fingerprint pipeline with configurable dimension weights (instruction 0.4, CFG 0.3, call-ngram 0.2, semantic 0.1).

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
- **Key Classes**:
  - `CallNgramGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs`) - generates `CallNgramFingerprint` from `LiftedFunction` call sequences; computes Jaccard similarity between fingerprints
  - `CallNgramFingerprint` (record in same file) - contains n-gram hash sets and metadata; has `Empty` sentinel for functions without calls
- **Interfaces**: `ICallNgramGenerator` (defined in `CallNgramGenerator.cs`) - `Generate(LiftedFunction)` and `ComputeSimilarity(CallNgramFingerprint, CallNgramFingerprint)`
- **Integration**: Used by `EnsembleDecisionEngine` and `FunctionAnalysisBuilder` as one of the matching dimensions with 0.2 default weight
- **Source**: SPRINT_20260118_026_BinaryIndex_deltasig_enhancements.md

## E2E Test Plan
- [ ] Generate call-ngram fingerprint from a function with known call sequences and verify correct n-gram extraction (n=2,3,4)
- [ ] Compute similarity between identical call sequences and verify similarity = 1.0
- [ ] Compute similarity between disjoint call sequences and verify similarity = 0.0
- [ ] Verify `CallNgramFingerprint.Empty` is returned for functions without call instructions
- [ ] Verify call-ngram dimension integrates into ensemble scoring with configurable weight (default 0.2)
- [ ] Verify cross-compiler similarity: same source compiled with GCC vs Clang should produce similar call n-grams
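
The n-gram extraction, Jaccard similarity, and weighted ensemble described in this section can be sketched as follows. The n-gram sizes and the four default weights come from this document; everything else in this Python illustration is an assumption, including the function names, the truncated SHA-256 hashing of each n-gram, the score dictionary keys, and the choice to return 0.0 when both fingerprints are empty.

```python
import hashlib

# Default dimension weights as stated in the description above.
DEFAULT_WEIGHTS = {"instruction": 0.4, "cfg": 0.3, "call_ngram": 0.2, "semantic": 0.1}


def call_ngrams(call_sequence, sizes=(2, 3, 4)):
    """Hash every contiguous run of n callee names for each n in `sizes`."""
    grams = set()
    for n in sizes:
        for i in range(len(call_sequence) - n + 1):
            gram = "\x00".join(call_sequence[i:i + n])
            grams.add(hashlib.sha256(gram.encode("utf-8")).hexdigest()[:16])
    return frozenset(grams)


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; 0.0 if both are empty (assumed)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ensemble_score(scores, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-dimension similarity scores in [0, 1]."""
    return sum(weight * scores.get(name, 0.0) for name, weight in weights.items())


same = jaccard(call_ngrams(["open", "read", "close"]), call_ngrams(["open", "read", "close"]))
disjoint = jaccard(call_ngrams(["open", "read", "close"]), call_ngrams(["socket", "send", "recv"]))
```

With the default weights, a perfect call-ngram match contributes at most 0.2 to the ensemble score, so it can support but not dominate a match decision.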
@@ -1,34 +0,0 @@
# Corpus Ingestion and Query Services

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Corpus ingestion and query services with distro-specific connectors for Alpine, Debian, and RPM package ecosystems.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Alpine/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Rpm/`
- **Key Classes**:
  - `CorpusIngestionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Services/CorpusIngestionService.cs`) - orchestrates binary ingestion into the corpus
  - `CorpusQueryService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Services/CorpusQueryService.cs`) - queries corpus for function fingerprints and binary metadata
  - `BatchFingerprintPipeline` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Services/BatchFingerprintPipeline.cs`) - batch fingerprint extraction from corpus binaries
  - `FunctionClusteringService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Services/FunctionClusteringService.cs`) - clusters similar functions across corpus
  - `CveFunctionMappingUpdater` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Services/CveFunctionMappingUpdater.cs`) - maps CVEs to affected functions
  - `AlpineCorpusConnector` / `AlpinePackageExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Alpine/`)
  - `DebianCorpusConnector` / `DebianPackageExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/`)
  - `RpmCorpusConnector` / `RpmPackageExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Rpm/`)
  - Library-specific connectors: `CurlCorpusConnector`, `GlibcCorpusConnector`, `OpenSslCorpusConnector`, `ZlibCorpusConnector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Connectors/`)
- **Interfaces**: `ICorpusIngestionService`, `ICorpusQueryService`, `IBinaryCorpusConnector`, `ILibraryCorpusConnector`, `ICorpusRepository`, `ICorpusSnapshotRepository`
- **Models**: `FunctionCorpusModels`, `CorpusQuery`, `CorpusSnapshot`

## E2E Test Plan
- [ ] Ingest a Debian package via `DebianCorpusConnector` and verify binary fingerprints are stored
- [ ] Ingest an Alpine APK via `AlpineCorpusConnector` and verify secfixes extraction via `ApkBuildSecfixesExtractor`
- [ ] Ingest an RPM package via `RpmCorpusConnector` and verify changelog extraction via `SrpmChangelogExtractor`
- [ ] Query corpus for a known function fingerprint via `CorpusQueryService` and verify match
- [ ] Run `BatchFingerprintPipeline` on a corpus snapshot and verify all binaries are fingerprinted
- [ ] Verify `CveFunctionMappingUpdater` creates correct CVE-to-function mappings
- [ ] Verify corpus snapshot creation with deterministic snapshot IDs
@@ -1,37 +0,0 @@
|
||||
# Cross-Distro Golden Set for Backport Validation
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Golden set infrastructure exists in BinaryIndex with analysis pipeline and API. The advisory's detailed curated test cases (OpenSSL Heartbleed, sudo Baron Samedit, etc.) and specific database schema may not be fully populated yet.
|
||||
|
||||
## What's Implemented
|
||||
- **Golden Set Infrastructure**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/` - full authoring, validation, storage, serialization
|
||||
- `GoldenSetExtractor`, `NvdGoldenSetExtractor` - extraction from NVD data
|
||||
- `GoldenSetEnrichmentService` - enriches golden sets with function hints
|
||||
- `GoldenSetValidator`, `ICveValidator` - validation pipeline
|
||||
- `PostgresGoldenSetStore` - PostgreSQL storage
|
||||
- `GoldenSetYamlSerializer` - YAML serialization
|
||||
- **Analysis Pipeline**: `GoldenSetAnalysisPipeline` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`) - runs analysis against golden set definitions
|
||||
- **API Controller**: `GoldenSetController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/`) - CRUD and listing endpoints
|
||||
- **Corpus Connectors**: Alpine (`AlpineCorpusConnector`), Debian (`DebianCorpusConnector`), RPM (`RpmCorpusConnector`) for cross-distro support
|
||||
- **Validation Harness**: `ValidationHarness` and `ValidationHarnessService` for running golden set tests
|
||||
|
||||
## What's Missing
|
||||
- Curated cross-distro test cases for high-impact CVEs (OpenSSL Heartbleed CVE-2014-0160, sudo Baron Samedit CVE-2021-3156, etc.) may not be fully populated in the golden set database
|
||||
- Cross-distro coverage matrix (Alpine vs Debian vs RHEL backport variations for the same CVE) may still need to be populated
|
||||
- Automated golden set population pipeline from NVD for new CVEs
|
||||
|
||||
## Implementation Plan
|
||||
- Populate golden set database with curated cross-distro test cases for high-impact CVEs
|
||||
- Validate backport detection accuracy across Alpine, Debian, and RHEL for each curated CVE
|
||||
- Build automated pipeline to generate cross-distro golden set entries from NVD advisories
|
||||
- Add cross-distro regression test suite using existing `ValidationHarness` infrastructure
|
||||
|
||||
## Related Documentation
|
||||
- Golden set schema: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Models/GoldenSetDefinition.cs`
|
||||
- Authoring workflow: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GoldenSet/Authoring/`
|
||||
@@ -1,30 +0,0 @@
|
||||
# Delta signature matching and patch coverage analysis
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Delta signature matching traces symbol-level changes between vulnerable and fixed builds. PatchCoverageController exposes an API for patch coverage assessment.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`, `src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/`
|
||||
- **Key Classes**:
|
||||
- `DeltaSignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSignatureMatcher.cs`) - matches delta signatures against target binaries
|
||||
- `DeltaSignatureGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSignatureGenerator.cs`) - generates delta signatures from binary pairs
|
||||
- `DeltaSigService` / `DeltaSigServiceV2` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`) - service layer for delta signature operations (V2 adds IR diffs)
|
||||
- `PatchCoverageController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/PatchCoverageController.cs`) - REST API for patch coverage queries using `IDeltaSignatureRepository`
|
||||
- `SymbolChangeTracer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/SymbolChangeTracer.cs`) - traces symbol-level changes between builds
|
||||
- `DeltaScopePolicyGate` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Policy/DeltaScopePolicyGate.cs`) - policy gate for delta scope enforcement
|
||||
- **Interfaces**: `IDeltaSigService`, `IDeltaSignatureGenerator`, `IDeltaSignatureMatcher`, `ISymbolChangeTracer`
|
||||
- **IR Diff**: `IrDiffGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/IrDiff/`) - generates IR-level diffs between function versions
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Generate a delta signature from known vulnerable/fixed binary pair and verify signature captures changed functions
|
||||
- [ ] Match the generated delta signature against a target binary and verify correct patch status detection
|
||||
- [ ] Query `PatchCoverageController` API for patch coverage and verify coverage percentage
|
||||
- [ ] Verify `SymbolChangeTracer` identifies added, removed, and modified symbols
|
||||
- [ ] Verify `DeltaScopePolicyGate` enforces delta scope policies
|
||||
- [ ] Verify IR-level diff generation captures semantic function changes beyond byte-level diffs
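
As a rough illustration of the added/removed/modified classification checked in the plan above, the sketch below compares two symbol-to-hash maps. It is a simplified stand-in, not the `SymbolChangeTracer` implementation; the function name and the per-symbol hash inputs are assumptions.

```python
from typing import Dict, List, NamedTuple


class SymbolChanges(NamedTuple):
    added: List[str]
    removed: List[str]
    modified: List[str]


def trace_symbol_changes(before: Dict[str, str], after: Dict[str, str]) -> SymbolChanges:
    """Classify symbols by comparing per-symbol hashes of two builds."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    # A symbol present in both builds is "modified" when its hash differs.
    modified = sorted(
        name for name in before.keys() & after.keys() if before[name] != after[name]
    )
    return SymbolChanges(added, removed, modified)


vulnerable = {"parse_header": "a1", "read_body": "b2", "legacy_copy": "c3"}
fixed = {"parse_header": "a9", "read_body": "b2", "checked_copy": "d4"}
print(trace_symbol_changes(vulnerable, fixed))
```

Matching by name alone would report a renamed function as one removal plus one addition, so a real tracer needs additional evidence to pair such symbols.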
|
||||
@@ -1,30 +0,0 @@
|
||||
# Delta-Signature Predicates (Function-Level Binary Diffs)
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Function-level delta signature predicates (v1 and v2) with signature generation, matching, and symbol change tracing. V2 adds symbol provenance and IR diffs, which capture semantic function changes that the byte-level hunks proposed in the advisory would not.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
|
||||
- **Key Classes**:
|
||||
- `DeltaSigPredicate` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicate.cs`) - V1 predicate for attestation
|
||||
- `DeltaSigPredicateV2` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicateV2.cs`) - V2 predicate with symbol provenance and IR diff support
|
||||
- `DeltaSigPredicateConverter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigPredicateConverter.cs`) - converts between predicate versions
|
||||
- `DeltaSigAttestorIntegration` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Attestation/DeltaSigAttestorIntegration.cs`) - integrates delta-sig predicates with the Attestor module
|
||||
- `GroundTruthProvenanceResolver` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/Provenance/GroundTruthProvenanceResolver.cs`) - enriches matches with symbol provenance data
|
||||
- `CfgExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/CfgExtractor.cs`) - extracts control flow graphs for delta-sig generation
|
||||
- **Models**: `Models.cs` in DeltaSig namespace - function match records, signature models
|
||||
- **VEX Integration**: `DeltaSigVexBridge` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/VexIntegration/`)
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Generate a V1 delta-sig predicate and verify it contains function-level diff data
|
||||
- [ ] Generate a V2 delta-sig predicate and verify it includes symbol provenance and IR diff metadata
|
||||
- [ ] Convert between V1 and V2 predicates via `DeltaSigPredicateConverter` and verify data fidelity
|
||||
- [ ] Verify `DeltaSigAttestorIntegration` produces valid attestation predicates for the Attestor module
|
||||
- [ ] Verify `GroundTruthProvenanceResolver` enriches function matches with provenance sources
|
||||
- [ ] Verify V2 predicates flow into VEX observations via `DeltaSigVexBridge`
|
||||
@@ -1,36 +0,0 @@
|
||||
# Disassembly and binary analysis pipeline
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Pluggable disassembly framework with Ghidra integration (BSim + version tracking) for binary analysis capabilities.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Abstractions/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/`
|
||||
- **Key Classes**:
|
||||
- `DisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyService.cs`) - core disassembly orchestrator
|
||||
- `HybridDisassemblyService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/HybridDisassemblyService.cs`) - multi-backend hybrid disassembly with quality-based plugin selection
|
||||
- `DisassemblyPluginRegistry` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/DisassemblyPluginRegistry.cs`) - manages registered disassembly plugins
|
||||
- `BinaryFormatDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly/BinaryFormatDetector.cs`) - detects ELF/PE/Mach-O format from binary headers
|
||||
- `B2R2DisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2DisassemblyPlugin.cs`) - B2R2 backend with architecture mapping, instruction mapping, operand parsing
|
||||
- `B2R2LowUirLiftingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LowUirLiftingService.cs`) - lifts machine code to LowUIR intermediate representation with SSA transformation
|
||||
- `B2R2LifterPool` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/B2R2LifterPool.cs`) - object pool for B2R2 lifter instances with warm preloading
|
||||
- `IcedDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.Iced/IcedDisassemblyPlugin.cs`) - Iced x86/x64 disassembler plugin
|
||||
- `GhidraDisassemblyPlugin` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/Services/GhidraDisassemblyPlugin.cs`) - Ghidra integration
|
||||
- `GhidraDecompilerAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/GhidraDecompilerAdapter.cs`) - Ghidra decompilation with AST comparison
|
||||
- **Abstractions**: `IDisassemblyPlugin`, `IDisassemblyPluginRegistry`, `IDisassemblyService` with models for `BinaryFormat`, `CpuArchitecture`, `DisassembledInstruction`, `InstructionKind`, etc.
|
||||
- **Decompiler**: Full AST comparison engine with recursive parser, code normalizer, semantic equivalence checking
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Load an x86-64 ELF binary via `HybridDisassemblyService` and verify disassembly produces valid instructions
|
||||
- [ ] Verify `BinaryFormatDetector` correctly identifies ELF, PE, and Mach-O formats
|
||||
- [ ] Verify B2R2 plugin handles architecture mapping for x86, x64, ARM, AArch64
|
||||
- [ ] Verify B2R2 LowUIR lifting produces valid IR with SSA form
|
||||
- [ ] Verify Iced plugin disassembles x86/x64 instructions correctly
|
||||
- [ ] Verify `B2R2LifterPool` warm preloading and pool size management
|
||||
- [ ] Verify Ghidra decompiler adapter produces comparable ASTs via `AstComparisonEngine`
|
||||
- [ ] Verify hybrid disassembly quality scoring selects the best plugin for each binary
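
Format detection from binary headers, as exercised by the `BinaryFormatDetector` check above, typically relies on well-known magic numbers. The sketch below is a hedged approximation, not the actual detector: `detect_format` is a hypothetical name, and fat/universal Mach-O files are deliberately left out.

```python
import struct

# Thin Mach-O magic numbers, 32-bit and 64-bit, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
}


def detect_format(data: bytes) -> str:
    """Classify a binary as 'elf', 'pe', 'macho', or 'unknown' from its header."""
    if data[:4] == b"\x7fELF":
        return "elf"
    if data[:4] in MACHO_MAGICS:
        return "macho"
    if data[:2] == b"MZ" and len(data) >= 0x40:
        # The DOS header stores the offset of the PE signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "pe"
    return "unknown"


print(detect_format(b"\x7fELF\x02\x01\x01" + b"\x00" * 9))
```

Checking the `PE\0\0` signature rather than only `MZ` avoids classifying plain DOS executables as PE files.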
|
||||
@@ -1,41 +0,0 @@
|
||||
# ELF Normalization and Delta Hashing
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
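Low-entropy delta signatures over ELF segments with normalization (relocation zeroing, NOP canonicalization, jump table rewriting). The segment-level normalization and hashing are not yet implemented; the existing delta signatures operate at function level.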
|
||||
|
||||
## What's Implemented
|
||||
- **Delta Signature Infrastructure**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/` - function-level delta signatures with V1 and V2 predicates exist
|
||||
- `DeltaSignatureGenerator` - generates delta signatures (function-level, not ELF-segment-level)
|
||||
- `DeltaSignatureMatcher` - matches delta signatures
|
||||
- `CfgExtractor` - extracts control flow graphs
|
||||
- `IrDiffGenerator` - IR-level diff generation
|
||||
- **Binary Diff Engine**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/PatchDiffEngine.cs` - byte-level and function-level diffing
|
||||
- **ELF Feature Extraction**: `ElfFeatureExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/`) - extracts Build-ID and section info from ELF binaries
|
||||
- **Disassembly**: `B2R2DisassemblyPlugin`, `HybridDisassemblyService` - multi-backend disassembly infrastructure
|
||||
|
||||
## What's Missing
|
||||
- ELF segment-level normalization (relocation zeroing to eliminate position-dependent bytes)
|
||||
- NOP canonicalization (normalizing NOP sled variations across compilers)
|
||||
- Jump table rewriting (normalizing indirect jump table entries)
|
||||
- Low-entropy delta hashing over normalized ELF segments (currently delta-sig operates at function level, not segment level)
|
||||
- Segment-aware normalization that handles .text, .rodata, .data sections separately
|
||||
|
||||
## Implementation Plan
|
||||
- Add ELF segment normalization pass to `ElfFeatureExtractor` or new `ElfNormalizer` class
|
||||
- Implement relocation zeroing: identify and zero-out position-dependent bytes (GOT/PLT entries, absolute addresses)
|
||||
- Implement NOP canonicalization: normalize all NOP variants to canonical form
|
||||
- Implement jump table rewriting: normalize indirect jump table entries
|
||||
- Add segment-level delta hashing on normalized output
|
||||
- Integrate with existing `DeltaSignatureGenerator` for hybrid function+segment signatures
|
||||
- Add tests using known ELF binaries with position-dependent variations
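
The NOP canonicalization step in the plan above could look roughly like the following sketch. It assumes the instructions have already been split by a disassembler (scanning raw bytes could match NOP patterns inside operands), covers only the recommended x86/x86-64 multi-byte NOP encodings, and uses the hypothetical name `canonicalize_nops`.

```python
from typing import Iterable

# Recommended NOP encodings for x86/x86-64, one per length from 1 to 9 bytes.
X86_NOPS = {
    bytes.fromhex("90"),
    bytes.fromhex("6690"),
    bytes.fromhex("0f1f00"),
    bytes.fromhex("0f1f4000"),
    bytes.fromhex("0f1f440000"),
    bytes.fromhex("660f1f440000"),
    bytes.fromhex("0f1f8000000000"),
    bytes.fromhex("0f1f840000000000"),
    bytes.fromhex("660f1f840000000000"),
}


def canonicalize_nops(instructions: Iterable[bytes]) -> bytes:
    """Rewrite every recognized NOP as single-byte NOPs of the same total length."""
    out = bytearray()
    for insn in instructions:
        if insn in X86_NOPS:
            out += b"\x90" * len(insn)  # same length, so later offsets are unchanged
        else:
            out += insn
    return bytes(out)


gcc_like = [bytes.fromhex("6690"), bytes.fromhex("c3")]
clang_like = [bytes.fromhex("90"), bytes.fromhex("90"), bytes.fromhex("c3")]
print(canonicalize_nops(gcc_like) == canonicalize_nops(clang_like))
```

Preserving the byte length keeps branch targets and section offsets valid, which matters if the normalized bytes are hashed per segment afterwards.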
|
||||
|
||||
## Related Documentation
|
||||
- Current delta-sig: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
|
||||
- ELF extraction: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/ElfFeatureExtractor.cs`
|
||||
- Disassembly: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Disassembly.B2R2/`
|
||||
@@ -1,30 +0,0 @@
|
||||
# Ensemble decision engine for multi-tier matching
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Ensemble decision engine combines multiple matching tiers (range match, Build-ID, fingerprint) with configurable weight tuning for vulnerability classification.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/`
|
||||
- **Key Classes**:
|
||||
- `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs`) - combines multiple matching signals with configurable weights into a final vulnerability classification decision
|
||||
- `FunctionAnalysisBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs`) - builds function analysis inputs including optional ML embeddings
|
||||
- `WeightTuningService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/WeightTuningService.cs`) - tunes ensemble weights based on golden set validation results
|
||||
- `EnsembleOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/Models.cs`) - configurable weights and thresholds for matching tiers
|
||||
- `MlEmbeddingMatcherAdapter` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/Training/MlEmbeddingMatcherAdapter.cs`) - adapts ML function embeddings for ensemble use
|
||||
- **Interfaces**: `IEnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs`)
|
||||
- **Registration**: `EnsembleServiceCollectionExtensions.AddBinarySimilarityServices()` for full pipeline setup
|
||||
- **Benchmarks**: `EnsembleAccuracyBenchmarks`, `EnsembleLatencyBenchmarks` (`src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Benchmarks/`)
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Submit a binary with known vulnerability and verify ensemble produces correct classification
|
||||
- [ ] Verify weight tuning: adjust instruction weight to 0.6 and verify it changes classification outcomes
|
||||
- [ ] Verify multi-tier integration: Build-ID match, fingerprint match, and ML embedding all contribute to score
|
||||
- [ ] Verify `FunctionAnalysisBuilder` correctly assembles all matching dimensions
|
||||
- [ ] Verify `WeightTuningService` optimizes weights based on golden set validation accuracy
|
||||
- [ ] Run accuracy benchmark and verify F1 score meets minimum threshold
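
To make the weight-tuning checks above concrete, here is a minimal sketch of a weighted ensemble score. It assumes a weighted average renormalized over whichever signals are present; the real `EnsembleDecisionEngine` combination rule and the signal names used here are assumptions.

```python
from typing import Dict


def ensemble_score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the available signals, renormalized over their weights."""
    used = [name for name in signals if name in weights]
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        return 0.0  # no usable signal
    return sum(signals[name] * weights[name] for name in used) / total_weight


# Hypothetical weights; only the 0.2 call-ngram default and the 0.6 instruction
# value appear in the test plans, the rest is illustrative.
weights = {"instruction": 0.6, "call_ngram": 0.2, "semantic": 0.2}
print(ensemble_score({"instruction": 1.0, "call_ngram": 0.0}, weights))
```

Renormalizing means a missing dimension (for example, no ML embedding) does not drag the score toward zero; whether that is desirable depends on how absent evidence should be treated.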
|
||||
@@ -1,28 +0,0 @@
|
||||
# Known-build binary catalog (Build-ID + hash-based binary identity)
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
BinaryIdentity model and vulnerability assertion repository implement the binary-key-based catalog using Build-ID and file SHA256 as primary keys.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`
|
||||
- **Key Classes**:
|
||||
- `BinaryIdentity` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/BinaryIdentity.cs`) - core model with Build-ID, file SHA256, symbol tables as primary keys
|
||||
- `BinaryIdentityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/BinaryIdentityService.cs`) - manages binary identity lifecycle
|
||||
- `BinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/Services/BinaryVulnerabilityService.cs`) - vulnerability assertion repository with Build-ID catalog lookups and match method mapping (buildid_catalog, delta_signature, etc.)
|
||||
- `CachedBinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/CachedBinaryVulnerabilityService.cs`) - cached decorator with `LookupByDeltaSignatureAsync`
|
||||
- **Interfaces**: `IBinaryVulnerabilityService`, `IBinaryVulnAssertionRepository` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Services/`)
|
||||
- **Models**: `FixModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Core/Models/`) - `FixState`, `FixStatusResult`, `MatchMethod`, `MatchEvidence`
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Register a binary identity with known Build-ID and verify it is stored in the catalog
|
||||
- [ ] Query the catalog by Build-ID and verify the correct binary identity is returned
|
||||
- [ ] Query by file SHA256 hash and verify the correct binary identity is returned
|
||||
- [ ] Assert a vulnerability against a binary identity and verify the assertion is persisted
|
||||
- [ ] Verify `CachedBinaryVulnerabilityService` caches lookups and returns cached results on repeat queries
|
||||
- [ ] Verify match method mapping: `buildid_catalog` maps to `MatchMethod.BuildIdCatalog`
|
||||
@@ -1,28 +0,0 @@
|
||||
# Local Mirror Layer for Corpus Sources
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Local mirror service for caching and serving corpus data from remote sources, supporting offline operation.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Alpine/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Rpm/`
|
||||
- **Key Classes**:
|
||||
- `DebianMirrorPackageSource` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/DebianMirrorPackageSource.cs`) - mirrors Debian package repositories for offline access
|
||||
- `DebianCorpusConnector` with `ICorpusSnapshotRepository` - creates snapshots of remote corpus state for local use
|
||||
- `AlpineCorpusConnector` with snapshot support - caches Alpine APK package data locally
|
||||
- `RpmCorpusConnector` - caches RPM package data for offline operation
|
||||
- `ICorpusSnapshotRepository` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/ICorpusSnapshotRepository.cs`) - persists corpus snapshots for offline retrieval
|
||||
- **Interfaces**: `IDebianPackageSource`, `IAlpinePackageSource`, `IRpmPackageSource` - distro-specific package source abstractions
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Fetch packages from Debian mirror source and verify local cache is populated
|
||||
- [ ] Disconnect network and verify cached corpus data is still accessible
|
||||
- [ ] Create a corpus snapshot and verify it captures the complete state of remote data
|
||||
- [ ] Verify Alpine APK packages are cached locally via `AlpineCorpusConnector`
|
||||
- [ ] Verify RPM packages are cached locally via `RpmCorpusConnector`
|
||||
- [ ] Verify snapshot-based queries return consistent results when the remote source changes
|
||||
@@ -1,25 +0,0 @@
|
||||
# Patch Coverage Tracking
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Dedicated patch coverage API endpoint for tracking which CVE patches are covered in binary analysis.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/`
|
||||
- **Key Classes**:
|
||||
- `PatchCoverageController` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Controllers/PatchCoverageController.cs`) - REST API controller for patch coverage queries using `IDeltaSignatureRepository`
|
||||
- `DeltaSignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSignatureMatcher.cs`) - matches delta signatures to assess patch coverage
|
||||
- `DeltaSigService` / `DeltaSigServiceV2` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`) - service layer for delta-sig operations
|
||||
- **Interfaces**: `IDeltaSignatureRepository` - repository for persisted delta signatures used by patch coverage queries
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Query patch coverage API for a known CVE and verify coverage status (covered/not covered)
|
||||
- [ ] Verify patch coverage percentage calculation: submit binaries with partial patch coverage
|
||||
- [ ] Verify that delta signatures for the CVE fix are used to determine coverage
|
||||
- [ ] Verify API returns correct coverage for batch queries across multiple CVEs
|
||||
- [ ] Verify coverage tracking updates when new delta signatures are added
|
||||
@@ -1,31 +0,0 @@
|
||||
# PatchDiffEngine (Binary Pre/Post Patch Comparison for Fix Verification)
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Compares pre-patch and post-patch binaries at multiple levels (BasicBlock, CFG, StringRefs, Semantic/KSG fingerprints) to determine if a vulnerability has been remediated. Produces structured verification results with confidence scores based on match depth. Core verification logic for the Golden Set Diff Layer.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`
|
||||
- **Key Classes**:
|
||||
- `PatchDiffEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/PatchDiffEngine.cs`) - core engine comparing pre/post binaries using `ISignatureMatcher`, `IFunctionFingerprintExtractor`, and `IFunctionDiffer`; produces `PatchDiffResult` with confidence scores
|
||||
- `PatchDiffEngine` (builders) (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/PatchDiffEngine.cs`) - builder-level diff engine
|
||||
- `FunctionDiffer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionDiffer.cs`) - function-level comparison with semantic analysis, call-graph edge diffing, and string reference comparison
|
||||
- `FunctionRenameDetector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/FunctionRenameDetector.cs`) - detects renamed functions between versions
|
||||
- `VerdictCalculator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/VerdictCalculator.cs`) - computes fix verification verdict from diff results
|
||||
- **Models**: `PatchDiffResult`, `PatchDiffModels`, `DiffEvidenceModels`, `DiffOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/Models/`)
|
||||
- **Storage**: `IDiffResultStore`, `InMemoryDiffResultStore` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Diff/Storage/`)
|
||||
- **Source**: SPRINT_20260110_012_004_BINDEX_golden_set_diff_verify.md
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Submit pre-patch and post-patch binaries for a known CVE fix and verify the diff result shows patch applied
|
||||
- [ ] Verify multi-level comparison: BasicBlock, CFG, StringRefs, and semantic fingerprints all contribute to confidence
|
||||
- [ ] Verify `FunctionDiffer` with `IncludeSemanticAnalysis=true` computes semantic similarity
|
||||
- [ ] Verify `FunctionRenameDetector` handles renamed functions between versions
|
||||
- [ ] Verify `VerdictCalculator` produces correct verdict (Fixed, Vulnerable, Unknown) based on diff evidence
|
||||
- [ ] Verify `NoPatchDetected` result is returned when binaries are identical
|
||||
- [ ] Verify diff results are persistable via `IDiffResultStore` with content-addressed IDs
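
A content-addressed ID, as mentioned in the last check above, is usually derived by hashing a canonical serialization of the result. The sketch below assumes canonical JSON plus SHA-256; the scheme used by `IDiffResultStore` is not specified here, and `content_address` is a hypothetical helper.

```python
import hashlib
import json


def content_address(result: dict) -> str:
    """Derive an ID from the content itself: equal content yields an equal ID."""
    # Sorted keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = content_address({"verdict": "Fixed", "confidence": 0.92})
second = content_address({"confidence": 0.92, "verdict": "Fixed"})
print(first == second)
```

Because the ID depends only on content, storing the same diff result twice is naturally idempotent.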
|
||||
@@ -1,29 +0,0 @@
|
||||
# Reproducible Distro Build Pipeline (Container-Based Builders)
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Container-based reproducible build pipeline for Alpine, Debian, and RHEL packages. Rebuilds upstream source packages in isolated containers to produce reference binaries for function-level fingerprint comparison, enabling backport detection by comparing distro-patched binaries against unpatched originals.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`, `src/BinaryIndex/StellaOps.BinaryIndex.Worker/`
|
||||
- **Key Classes**:
|
||||
- `ReproducibleBuildJob` (`src/BinaryIndex/StellaOps.BinaryIndex.Worker/Jobs/ReproducibleBuildJob.cs`) - background worker job using `IFunctionFingerprintExtractor` and `IPatchDiffEngine` to rebuild packages and compare fingerprints
|
||||
- `ReproducibleBuildOptions` - build configuration (timeout, container images, source package locations)
|
||||
- `IReproducibleBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/IReproducibleBuilder.cs`) - abstraction for container-based builds
|
||||
- `BuilderOptions` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/BuilderOptions.cs`) - builder configuration
|
||||
- `GuidProvider` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/GuidProvider.cs`) - deterministic GUID generation for reproducibility
|
||||
- **Integration**: Uses `IFingerprintClaimRepository` to store build verification claims; integrates with `IPatchDiffEngine` for post-build binary comparison
|
||||
- **Source**: SPRINT_1227_0002_0001_LB_reproducible_builders.md
|
||||
|
||||
## E2E Test Plan
|
||||
- [ ] Trigger a reproducible build for a Debian package and verify reference binaries are produced
|
||||
- [ ] Compare distro-patched binary against unpatched original and verify fingerprint differences
|
||||
- [ ] Verify container isolation: build runs in isolated container with controlled environment
|
||||
- [ ] Verify `FingerprintClaim` records are generated with build provenance evidence
|
||||
- [ ] Verify `GuidProvider` produces deterministic GUIDs for identical build inputs
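
Deterministic GUIDs for identical build inputs, as checked above, can be obtained with name-based UUIDs. This sketch uses UUID version 5 under an arbitrary example namespace; how `GuidProvider` actually derives its values is not documented here, so the namespace string and field layout are assumptions.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
BUILD_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "builders.example.invalid")


def build_guid(distro: str, package: str, version: str) -> uuid.UUID:
    """Return the same UUID for the same build inputs on every run."""
    name = "\n".join([distro, package, version])  # newline keeps fields unambiguous
    return uuid.uuid5(BUILD_NAMESPACE, name)


print(build_guid("debian", "openssl", "3.0.11-1") == build_guid("debian", "openssl", "3.0.11-1"))
```

Unlike random GUIDs, name-based GUIDs let two independent rebuilds of the same package produce identical identifiers, which keeps build outputs comparable.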
|
||||
- [ ] Verify backport detection: distro-patched binary with backported fix is correctly identified
|
||||
@@ -1,31 +0,0 @@
|
||||
# Semantic Analysis Library (IR Lifting and Function Fingerprinting)
|
||||
|
||||
## Module
|
||||
BinaryIndex
|
||||
|
||||
## Status
|
||||
IMPLEMENTED
|
||||
|
||||
## Description
|
||||
Semantic binary analysis with IR lifting, function fingerprint generation, semantic matching, graph extraction, and call n-gram generation for function-level binary comparison.
|
||||
|
||||
## Implementation Details
|
||||
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
|
||||
- **Key Classes**:
|
||||
- `IrLiftingService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/IrLiftingService.cs`) - lifts machine code to intermediate representation using B2R2
|
||||
- `SemanticFingerprintGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs`) - generates `SemanticFingerprint` using Weisfeiler-Lehman graph hashing (KsgWeisfeilerLehmanV1 algorithm)
|
||||
- `SemanticGraphExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticGraphExtractor.cs`) - extracts key-semantics graphs (KSG) from lifted IR
|
||||
- `SemanticMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticMatcher.cs`) - matches semantic fingerprints for similarity scoring
|
||||
- `CallNgramGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs`) - call-sequence n-gram fingerprinting
|
||||
- `WeisfeilerLehmanHasher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/WeisfeilerLehmanHasher.cs`) - WL graph hash implementation
|
||||
- `GraphCanonicalizer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/Internal/GraphCanonicalizer.cs`) - graph canonicalization for deterministic hashing
|
||||
- **Models**: `FingerprintModels` (SemanticFingerprint, SemanticFingerprintOptions, SemanticFingerprintAlgorithm), `GraphModels` (KeySemanticsGraph), `IrModels` (LiftedFunction, IrStatement)
|
||||
- **Interfaces**: `IIrLiftingService`, `ISemanticFingerprintGenerator`, `ISemanticGraphExtractor`, `ISemanticMatcher`
|
||||
|
||||
## E2E Test Plan
- [ ] Lift a binary function to IR via `IrLiftingService` and verify IR structure contains valid statements
- [ ] Generate a semantic fingerprint via `SemanticFingerprintGenerator` and verify hash is deterministic
- [ ] Extract a key-semantics graph via `SemanticGraphExtractor` and verify node/edge structure
- [ ] Match two fingerprints of the same function built with different compilers via `SemanticMatcher` and verify high similarity
- [ ] Verify the Weisfeiler-Lehman graph hash produces different hashes for structurally different functions
- [ ] Verify `GraphCanonicalizer` produces consistent canonical forms for isomorphic graphs
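For the call-sequence n-gram fingerprinting attributed to `CallNgramGenerator`, the general technique can be sketched as follows. This is a generic illustration, not the project's implementation: the function names, the default `n=3`, the SHA-256 digest, and the Jaccard comparison are assumptions.

```python
import hashlib


def call_ngrams(call_sequence, n=3):
    """Return the sliding-window n-grams of a function's callee sequence."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(call_sequence[i:i + n]) for i in range(len(call_sequence) - n + 1)]


def ngram_fingerprint(call_sequence, n=3):
    """Hash each n-gram so fingerprints can be compared as sets."""
    return {
        hashlib.sha256("\x00".join(gram).encode("utf-8")).hexdigest()[:16]
        for gram in call_ngrams(call_sequence, n)
    }


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty fingerprints count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

A sequence shorter than `n` yields an empty fingerprint, so very small functions carry no signal under this scheme and would need a different fingerprint.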
@@ -1,30 +0,0 @@
# Symbol Change Tracking in Binary Diffs (SymbolChangeTracer)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Extends BinaryIndex DeltaSignature module to track which specific symbols changed between binary versions (not just whether they match). Adds change metadata to SymbolMatchResult and provides detailed CFG hash and instruction hash comparison for symbol-level binary change forensics.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- **Key Classes**:
- `SymbolChangeTracer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/SymbolChangeTracer.cs`) - traces symbol-level changes between binary versions with detailed CFG hash and instruction hash comparison
- `DeltaSignatureGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSignatureGenerator.cs`) - generates delta signatures capturing symbol change metadata
- `DeltaSignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/DeltaSignatureMatcher.cs`) - matches signatures with change tracking awareness
- `CfgExtractor` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/CfgExtractor.cs`) - extracts CFG for hash comparison
- `IrDiffGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/IrDiff/IrDiffGenerator.cs`) - generates IR-level diffs for detailed change analysis
- **Interfaces**: `ISymbolChangeTracer` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/ISymbolChangeTracer.cs`)
- **Models**: `SymbolMatchResult` with change metadata in `Models.cs`
- **Source**: SPRINT_20260112_200_003_BINDEX_symbol_tracking.md

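The comparison described above (CFG hash first, then instruction hash) can be sketched in Python. This is an illustrative model only: `SymbolHashes`, `classify_symbol`, `trace_changes`, and the category strings are hypothetical names and do not reflect the actual `SymbolChangeTracer` or `SymbolMatchResult` API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SymbolHashes:
    """Hypothetical per-symbol record holding the two hashes being compared."""
    cfg_hash: str
    instruction_hash: str


def classify_symbol(old: Optional[SymbolHashes], new: Optional[SymbolHashes]) -> str:
    """Classify one symbol by comparing its hashes across two binary versions."""
    if old is None and new is None:
        raise ValueError("symbol is absent from both versions")
    if old is None:
        return "added"
    if new is None:
        return "removed"
    # A differing CFG hash means the control flow itself changed.
    if old.cfg_hash != new.cfg_hash:
        return "modified:control-flow"
    # Same control flow but different instructions, e.g. a changed constant.
    if old.instruction_hash != new.instruction_hash:
        return "modified:instructions"
    return "unchanged"


def trace_changes(old_symbols: Dict[str, SymbolHashes],
                  new_symbols: Dict[str, SymbolHashes]) -> Dict[str, str]:
    """Return a change category for every symbol seen in either version."""
    names = sorted(set(old_symbols) | set(new_symbols))
    return {name: classify_symbol(old_symbols.get(name), new_symbols.get(name))
            for name in names}
```

Checking the CFG hash before the instruction hash is a design choice of this sketch: it reports the coarser structural change when both differ.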
## E2E Test Plan
- [ ] Compare two binary versions with known symbol changes and verify `SymbolChangeTracer` identifies which symbols changed
- [ ] Verify CFG hash comparison detects control flow changes in modified functions
- [ ] Verify instruction hash comparison detects instruction-level changes
- [ ] Verify `SymbolMatchResult` includes change metadata (added, removed, modified symbols)
- [ ] Verify IR-level diff captures semantic changes beyond byte-level differences
- [ ] Verify unchanged symbols are correctly identified as stable between versions
@@ -1,28 +0,0 @@
# Symbol Source Connectors (Debuginfod, Buildinfo, Ddeb, SecDb)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Four symbol source connector implementations (Debuginfod, Debian Buildinfo, Ubuntu Ddeb, Alpine SecDb), each with plugin registration and configuration support.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Alpine/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Rpm/`
- **Key Classes**:
- **Alpine SecDb**: `AlpineCorpusConnector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Alpine/AlpineCorpusConnector.cs`) - connects to Alpine security database; `ApkBuildSecfixesExtractor` - extracts secfixes from APK build files
- **Debian Buildinfo**: `DebianCorpusConnector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Debian/DebianCorpusConnector.cs`) - connects to Debian buildinfo sources; `DebianMirrorPackageSource` - mirrors Debian repositories
- **RPM**: `RpmCorpusConnector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus.Rpm/RpmCorpusConnector.cs`) - connects to RPM repositories; `SrpmChangelogExtractor` - extracts changelogs from source RPMs
- **Library-specific**: `CurlCorpusConnector`, `GlibcCorpusConnector`, `OpenSslCorpusConnector`, `ZlibCorpusConnector` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Connectors/`)
- **Interfaces**: `IBinaryCorpusConnector`, `ILibraryCorpusConnector`, `IAlpinePackageSource`, `IDebianPackageSource`, `IRpmPackageSource`
- **Package Extractors**: `AlpinePackageExtractor`, `DebianPackageExtractor`, `RpmPackageExtractor` - extract binaries from packages using `IBinaryFeatureExtractor`

## E2E Test Plan
- [ ] Connect via `AlpineCorpusConnector` and verify secfixes data is extracted from APK builds
- [ ] Connect via `DebianCorpusConnector` and verify buildinfo data is retrieved from Debian mirrors
- [ ] Connect via `RpmCorpusConnector` and verify RPM changelog extraction works
- [ ] Verify library-specific connectors (OpenSSL, glibc, curl, zlib) retrieve correct binary versions
- [ ] Verify all connectors produce `CorpusSnapshot` with consistent snapshot IDs
- [ ] Verify package extractors use `IBinaryFeatureExtractor` to extract identity features from packages
@@ -1,29 +0,0 @@
# Validation Harness and Reproducibility Verification

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Validation harness with determinism validation, SBOM stability checking, and reproducible build verification. Includes local rebuild backend and bundle export/import.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`
- **Key Classes**:
- `ValidationHarness` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Validation/ValidationHarness.cs`) - main validation harness with `IMatcherAdapterFactory` for pluggable matching
- `ValidationHarnessService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ValidationHarnessService.cs`) - reproducible-build validation with `ValidationRunContext`
- `ReproducibleBuildJob` (`src/BinaryIndex/StellaOps.BinaryIndex.Worker/Jobs/ReproducibleBuildJob.cs`) - local rebuild backend
- `KpiRegressionService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/Services/KpiRegressionService.cs`) - SBOM stability and KPI regression tracking
- **Bundle Export/Import**: `ServiceCollectionExtensions.AddCorpusBundleExport/Import` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.GroundTruth.Reproducible/ServiceCollectionExtensions.cs`)
- **Interfaces**: `IValidationHarness`, `IKpiRegressionService`, `IReproducibleBuildJob`
- **Registration**: `ValidationServiceCollectionExtensions.AddValidationHarness()`

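One common way to check the determinism property described above is to run the same operation several times and compare digests of a canonical serialization. The sketch below shows that idea in Python; `canonical_digest` and `is_deterministic` are hypothetical helpers, and the canonical JSON encoding is an assumption rather than the format `ValidationHarness` uses.

```python
import hashlib
import json


def canonical_digest(result) -> str:
    """Digest a JSON-serialisable result in a key-order-independent form."""
    payload = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_deterministic(run, inputs, repetitions=3) -> bool:
    """Call run(inputs) repeatedly and report whether every digest agrees."""
    digests = {canonical_digest(run(inputs)) for _ in range(repetitions)}
    return len(digests) == 1
```

Sorting keys before hashing means dictionary ordering alone never counts as instability, while list ordering still does, so a run that emits matches in a varying order is flagged.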
## E2E Test Plan
- [ ] Run validation harness and verify deterministic results for identical inputs
- [ ] Verify SBOM stability checking detects unstable hash generation
- [ ] Verify reproducible build verification: rebuild from source and compare against original binary
- [ ] Verify bundle export produces a self-contained archive importable on air-gapped systems
- [ ] Verify bundle import restores corpus data and enables offline validation
- [ ] Verify KPI regression tracking across multiple validation harness runs
@@ -1,29 +0,0 @@
# Vulnerable Binaries Database (BinaryIndex Module)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Dedicated BinaryIndex module with web service, worker, and library structure for binary vulnerability detection independent of package metadata.

## Implementation Details
- **Modules**: `src/BinaryIndex/StellaOps.BinaryIndex.WebService/`, `src/BinaryIndex/StellaOps.BinaryIndex.Worker/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`
- **Key Classes**:
- **Web Service**: `ResolutionController` (`Controllers/ResolutionController.cs`) - vulnerability resolution API; `GoldenSetController` - golden set management API; `PatchCoverageController` - patch coverage API; `BinaryIndexOpsController` - ops health/bench/cache endpoints
- **Worker**: `ReproducibleBuildJob` (`Jobs/ReproducibleBuildJob.cs`) - background worker for build verification
- **Persistence**: `BinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/Services/BinaryVulnerabilityService.cs`) - vulnerability detection service with match method mapping and corpus query integration
- **Cache**: `CachedBinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Cache/CachedBinaryVulnerabilityService.cs`) - Valkey-backed caching layer
- **Analysis**: `SignatureMatcher`, `TaintGateExtractor`, `ReachGraphBinaryReachabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`)
- **Ensemble**: `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/`) - multi-tier vulnerability classification
- **Program Entry**: `Program.cs` (`src/BinaryIndex/StellaOps.BinaryIndex.WebService/Program.cs`) - configures services, resolution caching, rate limiting

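The caching layer listed above follows the familiar read-through pattern, sketched here in Python. An in-memory dictionary stands in for Valkey, and `ReadThroughCache`, its `lookup` callback, and the hit/miss counters are hypothetical; the real `CachedBinaryVulnerabilityService` may handle expiry, serialization, and key layout differently.

```python
class ReadThroughCache:
    """Serve repeated lookups from memory and fall back to a slower backend."""

    def __init__(self, lookup):
        self._lookup = lookup   # slow backend call, e.g. a database query
        self._entries = {}      # in-memory stand-in for the external cache
        self.hits = 0
        self.misses = 0

    def get(self, build_id):
        """Return the cached result for build_id, populating it on a miss."""
        if build_id in self._entries:
            self.hits += 1
            return self._entries[build_id]
        self.misses += 1
        value = self._lookup(build_id)
        self._entries[build_id] = value
        return value
```

Counting hits and misses gives a test something concrete to assert on, which is usually more reliable than timing a repeated query.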
## E2E Test Plan
- [ ] Query the database for a known vulnerable binary (by Build-ID) and verify vulnerability is detected
- [ ] Submit a binary for analysis and verify detection works independent of package metadata
- [ ] Verify web service endpoints are accessible: resolution, golden set, patch coverage, ops
- [ ] Verify worker job processes reproducible build verification in the background
- [ ] Verify cached lookups improve performance on repeated queries
- [ ] Verify ensemble decision engine combines all matching signals for final vulnerability classification
@@ -1,30 +0,0 @@
# Vulnerable Code Fingerprint Matching (CFG + Basic Block + String Refs Ensemble)

## Module
BinaryIndex

## Status
IMPLEMENTED

## Description
Function-level vulnerability detection independent of package metadata using an ensemble of fingerprint algorithms: basic block hashing, control flow graph fingerprinting, and string reference fingerprinting. Combined generator provides multi-algorithm similarity matching with configurable thresholds. Includes pre-seeded fingerprints for high-impact CVEs in OpenSSL, glibc, zlib, and curl.

## Implementation Details
- **Modules**: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/`, `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/`
- **Key Classes**:
- `SignatureMatcher` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/SignatureMatcher.cs`) - matches vulnerability signatures using fingerprint index
- `EnsembleDecisionEngine` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/EnsembleDecisionEngine.cs`) - combines CFG, basic block, string ref, and ML embedding fingerprints with configurable weights
- `FunctionAnalysisBuilder` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/FunctionAnalysisBuilder.cs`) - assembles multi-algorithm fingerprint inputs
- `SemanticFingerprintGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs`) - KSG-based semantic fingerprinting
- `CallNgramGenerator` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/CallNgramGenerator.cs`) - call-sequence fingerprinting
- `BinaryVulnerabilityService` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Persistence/Services/BinaryVulnerabilityService.cs`) - vulnerability lookup with pre-seeded fingerprints
- **Models**: `SignatureIndexModels` (`src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Analysis/Models/`) - fingerprint index models
- **Source**: SPRINT_20251226_013_BINIDX_fingerprint_factory.md

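Combining several fingerprint similarities with configurable weights and a threshold can be sketched as a weighted average. The Python below is a generic illustration, not the `EnsembleDecisionEngine` logic: the signal names (`cfg`, `basic_block`, `string_refs`), the example weights, and the renormalisation over available signals are assumptions.

```python
def ensemble_score(signals, weights):
    """Weighted average of per-algorithm similarities, each in [0, 1].

    Signals without a configured weight are ignored, and the weights of the
    signals that are present are renormalised so the score stays in [0, 1].
    """
    used = [name for name in signals if name in weights]
    total_weight = sum(weights[name] for name in used)
    if total_weight <= 0:
        return 0.0
    return sum(signals[name] * weights[name] for name in used) / total_weight


def is_match(signals, weights, threshold=0.8):
    """Report a match when the combined score reaches the threshold."""
    return ensemble_score(signals, weights) >= threshold
```

For example, with weights `{"cfg": 0.4, "basic_block": 0.4, "string_refs": 0.2}` and similarities `0.9`, `0.8`, and `0.5`, the combined score is 0.78, which falls just under a 0.8 threshold but passes at 0.75. Shifting weight toward the weakest signal lowers the score, which is the kind of effect a weight-tuning test would observe.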
## E2E Test Plan
- [ ] Match a known vulnerable function (e.g., OpenSSL Heartbleed) against pre-seeded fingerprints and verify detection
- [ ] Verify multi-algorithm ensemble: CFG fingerprint + basic block hash + string refs all contribute to match score
- [ ] Verify configurable threshold: adjust threshold to 0.8 and verify borderline matches are excluded
- [ ] Verify pre-seeded fingerprints exist for high-impact CVEs (OpenSSL, glibc, zlib, curl)
- [ ] Verify false positive rate: submit clean binary functions and verify no false matches
- [ ] Verify `EnsembleDecisionEngine` weight tuning affects match outcomes