# Sprint 20260105_001_001_BINDEX - Semantic Diffing Phase 1: IR-Level Semantic Analysis

## Topic & Scope

Enhance the BinaryIndex module to leverage B2R2's Intermediate Representation (IR) for semantic-level function comparison, moving beyond instruction-byte normalization to true semantic matching that is resilient to compiler optimizations, instruction reordering, and register allocation differences.

**Advisory Reference:** Product advisory on semantic diffing breakthrough capabilities (Jan 2026)

**Key Insight:** The current implementation normalizes instruction bytes and computes CFG hashes, but does not lift to B2R2's LowUIR/SSA form for semantic analysis. This limits accuracy on optimized/obfuscated binaries by an estimated 15-20%.

**Working directory:** `src/BinaryIndex/`

**Evidence:** New `StellaOps.BinaryIndex.Semantic` library, updated fingerprint generators, integration tests.

---
## Dependencies & Concurrency

| Dependency | Type | Status |
|------------|------|--------|
| B2R2 v0.9.1+ | Package | Available |
| StellaOps.BinaryIndex.Disassembly | Internal | Stable |
| StellaOps.BinaryIndex.Fingerprints | Internal | Stable |
| StellaOps.BinaryIndex.DeltaSig | Internal | Stable |

**Parallel Execution:** Tasks SEMD-001 through SEMD-004 can proceed in parallel. SEMD-005 and later depend on the foundation work.

---
## Documentation Prerequisites

- `docs/modules/binary-index/architecture.md`
- `docs/modules/binary-index/README.md`
- B2R2 documentation: https://b2r2.org/
- SemDiff paper: https://arxiv.org/abs/2308.01463

---
## Problem Analysis

### Current State

```
Binary Input
      |
      v
B2R2 Disassembly --> Raw Instructions
      |
      v
Normalization Pipeline --> Normalized Bytes (position-independent)
      |
      v
Hash Generation --> BasicBlockHash, CfgHash, StringRefsHash
      |
      v
Fingerprint Matching --> Similarity Score
```

**Limitations:**

1. **Instruction-level comparison** - Sensitive to register allocation changes
2. **No semantic lifting** - Cannot detect equivalent operations with different instructions
3. **Optimization blindness** - Loop unrolling, inlining, constant propagation break matches
4. **Basic CFG hashing** - Edge counts/hashes miss semantic equivalence

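
Limitation 1 can be made concrete with a small sketch. The example below (Python for brevity; the instruction strings and the `v0, v1, ...` renaming scheme are illustrative, not this module's actual normalization) shows how two register-renamed but semantically identical sequences hash differently at the byte level, yet converge once operands are canonicalized in first-use order:

```python
import hashlib

def raw_hash(instrs):
    """Hash instructions verbatim -- sensitive to register names."""
    return hashlib.sha256("\n".join(instrs).encode()).hexdigest()

def canonical_hash(instrs):
    """Rename registers in first-use order before hashing."""
    mapping, out = {}, []
    for instr in instrs:
        op, *operands = instr.replace(",", "").split()
        canon = []
        for operand in operands:
            if operand.startswith("r"):  # treat r* tokens as registers
                mapping.setdefault(operand, f"v{len(mapping)}")
                canon.append(mapping[operand])
            else:
                canon.append(operand)
        out.append(" ".join([op, *canon]))
    return hashlib.sha256("\n".join(out).encode()).hexdigest()

# Same computation, different register allocation (e.g. two compilers).
a = ["mov rax, rdi", "add rax, rsi", "ret"]
b = ["mov rcx, rdi", "add rcx, rsi", "ret"]

assert raw_hash(a) != raw_hash(b)              # byte-level comparison breaks
assert canonical_hash(a) == canonical_hash(b)  # canonical form still matches
```

Canonicalization handles register renaming, but not reordered or substituted instructions; those are exactly what the IR-level pipeline below addresses.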
### Target State

```
Binary Input
      |
      v
B2R2 Disassembly --> Raw Instructions
      |
      v
B2R2 IR Lifting --> LowUIR Statements
      |
      v
SSA Transformation --> SSA Form (optional)
      |
      v
Semantic Graph Extraction --> Key-Semantics Graph (KSG)
      |
      v
Graph Fingerprinting --> Semantic Fingerprint
      |
      v
Graph Isomorphism Check --> Semantic Similarity Score
```

---

## Architecture Design

### New Components

#### 1. IR Lifting Service

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/IrLiftingService.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface IIrLiftingService
{
    /// <summary>
    /// Lift disassembled instructions to B2R2 LowUIR.
    /// </summary>
    Task<LiftedFunction> LiftToIrAsync(
        DisassembledFunction function,
        LiftOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Transform IR to SSA form for dataflow analysis.
    /// </summary>
    Task<SsaFunction> TransformToSsaAsync(
        LiftedFunction lifted,
        CancellationToken ct = default);
}

public sealed record LiftedFunction(
    string Name,
    ulong Address,
    ImmutableArray<IrStatement> Statements,
    ImmutableArray<IrBasicBlock> BasicBlocks,
    ControlFlowGraph Cfg);

public sealed record SsaFunction(
    string Name,
    ulong Address,
    ImmutableArray<SsaStatement> Statements,
    ImmutableArray<SsaBasicBlock> BasicBlocks,
    DefUseChains DefUse);
```

#### 2. Semantic Graph Extractor

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticGraphExtractor.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticGraphExtractor
{
    /// <summary>
    /// Extract key-semantics graph from lifted IR.
    /// Captures: data dependencies, control dependencies, memory operations.
    /// </summary>
    Task<KeySemanticsGraph> ExtractGraphAsync(
        LiftedFunction function,
        GraphExtractionOptions? options = null,
        CancellationToken ct = default);
}

public sealed record KeySemanticsGraph(
    string FunctionName,
    ImmutableArray<SemanticNode> Nodes,
    ImmutableArray<SemanticEdge> Edges,
    GraphProperties Properties);

public sealed record SemanticNode(
    int Id,
    SemanticNodeType Type,            // Compute, Load, Store, Branch, Call, Return, Phi
    string Operation,                 // add, mul, cmp, etc.
    ImmutableArray<string> Operands);

public sealed record SemanticEdge(
    int SourceId,
    int TargetId,
    SemanticEdgeType Type);           // DataDep, ControlDep, MemoryDep

public enum SemanticNodeType { Compute, Load, Store, Branch, Call, Return, Phi }
public enum SemanticEdgeType { DataDependency, ControlDependency, MemoryDependency }
```

#### 3. Semantic Fingerprint Generator

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticFingerprintGenerator
{
    /// <summary>
    /// Generate semantic fingerprint from key-semantics graph.
    /// </summary>
    Task<SemanticFingerprint> GenerateAsync(
        KeySemanticsGraph graph,
        SemanticFingerprintOptions? options = null,
        CancellationToken ct = default);
}

public sealed record SemanticFingerprint(
    string FunctionName,
    byte[] GraphHash,                    // 32-byte SHA-256 of canonical graph
    byte[] OperationHash,                // Hash of operation sequence
    byte[] DataFlowHash,                 // Hash of data dependency patterns
    int NodeCount,
    int EdgeCount,
    int CyclomaticComplexity,
    ImmutableArray<string> ApiCalls,     // External calls (semantic anchors)
    SemanticFingerprintAlgorithm Algorithm);

public enum SemanticFingerprintAlgorithm
{
    KsgV1,             // Key-Semantics Graph v1
    WeisfeilerLehman,  // WL graph hashing
    GraphletCounting   // Graphlet-based similarity
}
```

#### 4. Semantic Matcher

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticMatcher.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticMatcher
{
    /// <summary>
    /// Compute semantic similarity between two functions.
    /// </summary>
    Task<SemanticMatchResult> MatchAsync(
        SemanticFingerprint a,
        SemanticFingerprint b,
        MatchOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Find best matches for a function in a corpus.
    /// </summary>
    Task<ImmutableArray<SemanticMatchResult>> FindMatchesAsync(
        SemanticFingerprint query,
        IAsyncEnumerable<SemanticFingerprint> corpus,
        decimal minSimilarity = 0.7m,
        int maxResults = 10,
        CancellationToken ct = default);
}

public sealed record SemanticMatchResult(
    string FunctionA,
    string FunctionB,
    decimal OverallSimilarity,
    decimal GraphSimilarity,
    decimal DataFlowSimilarity,
    decimal ApiCallSimilarity,
    MatchConfidence Confidence,
    ImmutableArray<MatchDelta> Deltas);  // What changed

public enum MatchConfidence { VeryHigh, High, Medium, Low, VeryLow }

public sealed record MatchDelta(
    DeltaType Type,
    string Description,
    decimal Impact);

public enum DeltaType { NodeAdded, NodeRemoved, EdgeAdded, EdgeRemoved, OperationChanged }
```

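
`OverallSimilarity` is intended as a weighted blend of the component scores, bucketed into a `MatchConfidence` label. The weights and thresholds below are illustrative placeholders (not shipped values); a minimal Python sketch of the combination:

```python
def overall_similarity(graph, dataflow, api_call, weights=(0.5, 0.3, 0.2)):
    """Blend component scores (each in [0, 1]) into one similarity score."""
    w_graph, w_flow, w_api = weights
    return w_graph * graph + w_flow * dataflow + w_api * api_call

def confidence(score):
    """Bucket a similarity score into a MatchConfidence-style label."""
    for threshold, label in [(0.95, "VeryHigh"), (0.85, "High"),
                             (0.70, "Medium"), (0.50, "Low")]:
        if score >= threshold:
            return label
    return "VeryLow"

score = overall_similarity(graph=0.92, dataflow=0.88, api_call=1.0)
assert 0.0 <= score <= 1.0
assert confidence(score) == "High"
```

Keeping the weights explicit makes it cheap to tune them against the golden corpus benchmark (SEMD-019).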
---

## Delivery Tracker

| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | SEMD-001 | TODO | - | Guild | Create `StellaOps.BinaryIndex.Semantic` project structure |
| 2 | SEMD-002 | TODO | - | Guild | Define IR model types (IrStatement, IrBasicBlock, IrOperand) |
| 3 | SEMD-003 | TODO | - | Guild | Define semantic graph model types (KeySemanticsGraph, SemanticNode, SemanticEdge) |
| 4 | SEMD-004 | TODO | - | Guild | Define SemanticFingerprint and matching result types |
| 5 | SEMD-005 | TODO | SEMD-001,002 | Guild | Implement B2R2 IR lifting adapter (LowUIR extraction) |
| 6 | SEMD-006 | TODO | SEMD-005 | Guild | Implement SSA transformation (optional dataflow analysis) |
| 7 | SEMD-007 | TODO | SEMD-003,005 | Guild | Implement KeySemanticsGraph extractor from IR |
| 8 | SEMD-008 | TODO | SEMD-004,007 | Guild | Implement graph canonicalization for deterministic hashing |
| 9 | SEMD-009 | TODO | SEMD-008 | Guild | Implement Weisfeiler-Lehman graph hashing |
| 10 | SEMD-010 | TODO | SEMD-009 | Guild | Implement SemanticFingerprintGenerator |
| 11 | SEMD-011 | TODO | SEMD-010 | Guild | Implement SemanticMatcher with weighted similarity |
| 12 | SEMD-012 | TODO | SEMD-011 | Guild | Integrate semantic fingerprints into PatchDiffEngine |
| 13 | SEMD-013 | TODO | SEMD-012 | Guild | Integrate semantic fingerprints into DeltaSignatureGenerator |
| 14 | SEMD-014 | TODO | SEMD-010 | Guild | Unit tests: IR lifting correctness |
| 15 | SEMD-015 | TODO | SEMD-010 | Guild | Unit tests: Graph extraction determinism |
| 16 | SEMD-016 | TODO | SEMD-011 | Guild | Unit tests: Semantic matching accuracy |
| 17 | SEMD-017 | TODO | SEMD-013 | Guild | Integration tests: End-to-end semantic diffing |
| 18 | SEMD-018 | TODO | SEMD-017 | Guild | Golden corpus: Create test binaries with known semantic equivalences |
| 19 | SEMD-019 | TODO | SEMD-018 | Guild | Benchmark: Compare accuracy vs. instruction-level matching |
| 20 | SEMD-020 | TODO | SEMD-019 | Guild | Documentation: Update architecture.md with semantic diffing |

---

## Task Details

### SEMD-001: Create Project Structure

Create a new library project for semantic analysis:

```
src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/
    StellaOps.BinaryIndex.Semantic.csproj
    IrLiftingService.cs
    SemanticGraphExtractor.cs
    SemanticFingerprintGenerator.cs
    SemanticMatcher.cs
    Models/
        IrModels.cs
        GraphModels.cs
        FingerprintModels.cs
        MatchModels.cs
    Internal/
        B2R2IrAdapter.cs
        GraphCanonicalizer.cs
        WeisfeilerLehmanHasher.cs
```

**Acceptance Criteria:**

- [ ] Project builds successfully
- [ ] References StellaOps.BinaryIndex.Disassembly
- [ ] References B2R2.FrontEnd.BinLifter

---
### SEMD-005: Implement B2R2 IR Lifting Adapter

Leverage B2R2's BinLifter to lift raw instructions to LowUIR:

```csharp
internal sealed class B2R2IrAdapter : IIrLiftingService
{
    public Task<LiftedFunction> LiftToIrAsync(
        DisassembledFunction function,
        LiftOptions? options = null,
        CancellationToken ct = default)
    {
        var handle = BinHandle.FromBytes(
            function.Architecture.ToB2R2Isa(),
            function.RawBytes);

        var lifter = LowUIRHelper.init(handle);
        var statements = new List<IrStatement>();

        foreach (var instr in function.Instructions)
        {
            ct.ThrowIfCancellationRequested();

            var stmts = LowUIRHelper.translateInstr(lifter, instr.Address);
            statements.AddRange(ConvertStatements(stmts));
        }

        var cfg = BuildControlFlowGraph(statements, function.StartAddress);

        // Lifting is CPU-bound; no await needed, so return a completed task.
        return Task.FromResult(new LiftedFunction(
            function.Name,
            function.StartAddress,
            [.. statements],
            ExtractBasicBlocks(cfg),
            cfg));
    }
}
```

**Acceptance Criteria:**

- [ ] Successfully lifts x64 instructions to IR
- [ ] Successfully lifts ARM64 instructions to IR
- [ ] CFG is correctly constructed
- [ ] Memory operations are properly modeled

---
### SEMD-007: Implement Key-Semantics Graph Extractor

Extract a semantic graph capturing:

- **Computation nodes**: Arithmetic, logic, comparison operations
- **Memory nodes**: Load/store operations with abstract addresses
- **Control nodes**: Branches, calls, returns
- **Data dependency edges**: Def-use chains
- **Control dependency edges**: Branch->target relationships

```csharp
internal sealed class KeySemanticsGraphExtractor : ISemanticGraphExtractor
{
    public Task<KeySemanticsGraph> ExtractGraphAsync(
        LiftedFunction function,
        GraphExtractionOptions? options = null,
        CancellationToken ct = default)
    {
        var nodes = new List<SemanticNode>();
        var edges = new List<SemanticEdge>();
        var defMap = new Dictionary<string, int>(); // Variable -> defining node
        var nodeId = 0;

        foreach (var stmt in function.Statements)
        {
            ct.ThrowIfCancellationRequested();

            var node = CreateNode(ref nodeId, stmt);
            nodes.Add(node);

            // Add data dependency edges
            foreach (var use in GetUses(stmt))
            {
                if (defMap.TryGetValue(use, out var defNode))
                {
                    edges.Add(new SemanticEdge(defNode, node.Id, SemanticEdgeType.DataDependency));
                }
            }

            // Track definitions
            foreach (var def in GetDefs(stmt))
            {
                defMap[def] = node.Id;
            }
        }

        // Add control dependency edges from CFG
        AddControlDependencies(function.Cfg, nodes, edges);

        // Extraction is synchronous; return a completed task.
        return Task.FromResult(new KeySemanticsGraph(
            function.Name,
            [.. nodes],
            [.. edges],
            ComputeProperties(nodes, edges)));
    }
}
```

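
The def-map walk above can be exercised end to end with a toy model. This Python sketch (hypothetical `(defs, uses)` tuples standing in for the real IR statement types) builds the same data-dependency edges:

```python
def extract_data_deps(statements):
    """statements: list of (defs, uses) tuples, one per IR statement.
    Returns data-dependency edges as (defining_node_id, using_node_id)."""
    def_map = {}   # variable -> node id that last defined it
    edges = []
    for node_id, (defs, uses) in enumerate(statements):
        for var in uses:
            if var in def_map:             # a prior def reaches this use
                edges.append((def_map[var], node_id))
        for var in defs:                   # later uses will see this def
            def_map[var] = node_id
    return edges

# t0 = a + b; t1 = t0 * 2; store t1
stmts = [({"t0"}, {"a", "b"}),
         ({"t1"}, {"t0"}),
         (set(), {"t1"})]
assert extract_data_deps(stmts) == [(0, 1), (1, 2)]
```

Note that, like the C# above, this tracks only the most recent definition of each variable; SSA form (SEMD-006) is what makes reaching definitions unambiguous across branches.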
---

### SEMD-009: Implement Weisfeiler-Lehman Graph Hashing

WL hashing provides stable graph fingerprints:

```csharp
internal sealed class WeisfeilerLehmanHasher
{
    private readonly int _iterations;

    public WeisfeilerLehmanHasher(int iterations = 3)
    {
        _iterations = iterations;
    }

    public byte[] ComputeHash(KeySemanticsGraph graph)
    {
        // Initialize labels from node types
        var labels = graph.Nodes.ToDictionary(
            n => n.Id,
            n => ComputeNodeLabel(n));

        // WL iteration: each node's label absorbs the sorted multiset
        // of its neighbors' labels
        for (var i = 0; i < _iterations; i++)
        {
            var newLabels = new Dictionary<int, string>();

            foreach (var node in graph.Nodes)
            {
                var neighbors = graph.Edges
                    .Where(e => e.SourceId == node.Id || e.TargetId == node.Id)
                    .Select(e => e.SourceId == node.Id ? e.TargetId : e.SourceId)
                    .OrderBy(id => labels[id])
                    .ToList();

                var multiset = string.Join(",", neighbors.Select(id => labels[id]));
                var newLabel = ComputeLabel(labels[node.Id], multiset);
                newLabels[node.Id] = newLabel;
            }

            labels = newLabels;
        }

        // Compute final hash from sorted labels
        var sortedLabels = labels.Values.OrderBy(l => l).ToList();
        var combined = string.Join("|", sortedLabels);
        return SHA256.HashData(Encoding.UTF8.GetBytes(combined));
    }
}
```

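
The property the hasher relies on is that WL labels depend only on graph structure and node labels, never on node numbering. A compact Python sketch (simplified like the C# above: undirected neighborhoods, edge types ignored) demonstrates the invariance:

```python
import hashlib

def wl_hash(node_labels, edges, iterations=3):
    """node_labels: {id: label}; edges: set of (a, b) pairs (undirected)."""
    labels = dict(node_labels)
    neigh = {n: [] for n in labels}
    for a, b in edges:
        neigh[a].append(b)
        neigh[b].append(a)
    for _ in range(iterations):
        # Each label absorbs the sorted multiset of its neighbors' labels.
        labels = {
            n: hashlib.sha256(
                (labels[n] + "|" + ",".join(sorted(labels[m] for m in neigh[n])))
                .encode()).hexdigest()
            for n in labels
        }
    combined = "|".join(sorted(labels.values()))  # order-independent summary
    return hashlib.sha256(combined.encode()).hexdigest()

# The same graph under two different node numberings.
g1 = wl_hash({0: "Compute", 1: "Load", 2: "Store"}, {(0, 1), (1, 2)})
g2 = wl_hash({5: "Load", 9: "Store", 7: "Compute"}, {(7, 5), (5, 9)})
assert g1 == g2   # renumbering does not change the hash

g3 = wl_hash({0: "Compute", 1: "Load", 2: "Branch"}, {(0, 1), (1, 2)})
assert g1 != g3   # a changed node operation does
```

This is also why WL alone is weak on very small graphs (few labels to refine), motivating the risk mitigation of combining it with the operation and API-call hashes.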
---

## Testing Strategy

### Unit Tests

| Test Class | Coverage |
|------------|----------|
| `IrLiftingServiceTests` | IR lifting correctness per architecture |
| `SemanticGraphExtractorTests` | Graph construction, edge types, node types |
| `GraphCanonicalizerTests` | Deterministic ordering |
| `WeisfeilerLehmanHasherTests` | Hash stability, collision resistance |
| `SemanticMatcherTests` | Similarity scoring accuracy |

### Integration Tests

| Test Class | Coverage |
|------------|----------|
| `EndToEndSemanticDiffTests` | Full pipeline from binary to match result |
| `OptimizationResilienceTests` | Same source, different optimization levels |
| `CompilerVariantTests` | Same source, GCC vs Clang |

### Golden Corpus

Create test binaries from known C source with variations:

- `test_func_O0.o` - No optimization
- `test_func_O2.o` - Standard optimization
- `test_func_O3.o` - Aggressive optimization
- `test_func_clang.o` - Different compiler

All variants should match semantically despite instruction differences.

---

## Success Metrics

| Metric | Current | Target |
|--------|---------|--------|
| Semantic match accuracy (optimized binaries) | ~65% | 85%+ |
| False positive rate | ~5% | <2% |
| Match latency (per function) | N/A | <50ms |
| Memory per function | N/A | <10MB |

---

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |

---

## Decisions & Risks

| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| B2R2 IR coverage may be incomplete for some instructions | Risk | Fall back to instruction-level matching for unsupported operations |
| WL hashing may produce collisions for small functions | Risk | Combine with operation hash and API call hash |
| SSA transformation adds latency | Trade-off | Make SSA optional; use it for high-confidence matching only |
| Graph size explosion for large functions | Risk | Limit node count; use sampling for very large functions |

---

## Next Checkpoints

- 2026-01-10: SEMD-001 through SEMD-004 (project structure, models) complete
- 2026-01-17: SEMD-005 through SEMD-010 (core implementation) complete
- 2026-01-24: SEMD-011 through SEMD-020 (integration, testing, benchmarks) complete
# Sprint 20260105_001_002_BINDEX - Semantic Diffing Phase 2: Function Behavior Corpus

## Topic & Scope

Build a comprehensive function behavior corpus (similar to Ghidra's BSim/FunctionID) containing fingerprints of known library functions across multiple versions and architectures. This enables identification of functions in stripped binaries by matching against a large corpus of pre-indexed function behaviors.

**Advisory Reference:** Product advisory on semantic diffing - BSim behavioral similarity against large signature sets.

**Key Insight:** Current delta signatures are CVE-specific. A large pre-built corpus of "known good" function behaviors enables identifying functions like "this is `memcpy` from glibc 2.31" even in stripped binaries, which is critical for accurate vulnerability attribution.

**Working directory:** `src/BinaryIndex/`

**Evidence:** New `StellaOps.BinaryIndex.Corpus` library, corpus ingestion pipeline, PostgreSQL corpus schema.

---
## Dependencies & Concurrency

| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Required for semantic fingerprints |
| StellaOps.BinaryIndex.Semantic | Internal | From Phase 1 |
| PostgreSQL | Infrastructure | Available |
| Package mirrors (Debian, Alpine, RHEL) | External | Available |

**Parallel Execution:** Corpus connector development (CORP-005 through CORP-007) can proceed in parallel after CORP-004.

---
## Documentation Prerequisites

- `docs/modules/binary-index/architecture.md`
- Phase 1 sprint: `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- Ghidra BSim documentation: https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/BSimServerAPI.html

---
## Problem Analysis

### Current State

- Delta signatures are generated on-demand for specific CVEs
- No pre-built corpus of common library functions
- Cannot identify functions by behavior alone (requires symbols or a prior CVE signature)
- Stripped binaries fall back to weaker Build-ID/hash matching

### Target State

```
┌───────────────────────────────────────────────────────────────────────────┐
│                         Function Behavior Corpus                          │
│                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                       Corpus Ingestion Layer                        │  │
│  │   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐          │  │
│  │   │ GlibcCorpus  │    │ OpenSSLCorpus│    │ zlibCorpus   │   ...    │  │
│  │   │ Connector    │    │ Connector    │    │ Connector    │          │  │
│  │   └──────────────┘    └──────────────┘    └──────────────┘          │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                     |                                     │
│                                     v                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                       Fingerprint Generation                        │  │
│  │   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐          │  │
│  │   │ Instruction  │    │ Semantic     │    │ API Call     │          │  │
│  │   │ Fingerprint  │    │ Fingerprint  │    │ Fingerprint  │          │  │
│  │   └──────────────┘    └──────────────┘    └──────────────┘          │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                     |                                     │
│                                     v                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                     Corpus Storage (PostgreSQL)                     │  │
│  │                                                                     │  │
│  │   corpus.libraries         - Known libraries (glibc, openssl, etc.) │  │
│  │   corpus.library_versions  - Version snapshots                      │  │
│  │   corpus.functions         - Function metadata                      │  │
│  │   corpus.fingerprints      - Fingerprint index (semantic + instr.)  │  │
│  │   corpus.function_clusters - Similar function groups                │  │
│  │                                                                     │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                     |                                     │
│                                     v                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                             Query Layer                             │  │
│  │                                                                     │  │
│  │   ICorpusQueryService.IdentifyFunctionAsync(fingerprint)            │  │
│  │     -> [{library: "glibc", version: "2.31", name: "memcpy"}]        │  │
│  │                                                                     │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────────────────┘
```

---

## Architecture Design

### Database Schema

```sql
-- Corpus schema for function behavior database
CREATE SCHEMA IF NOT EXISTS corpus;

-- Known libraries tracked in corpus
CREATE TABLE corpus.libraries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL UNIQUE,            -- glibc, openssl, zlib, curl
    description TEXT,
    homepage_url TEXT,
    source_repo TEXT,                     -- git URL
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Library versions indexed
CREATE TABLE corpus.library_versions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    version TEXT NOT NULL,                -- 2.31, 1.1.1n, 1.2.13
    release_date DATE,
    is_security_release BOOLEAN DEFAULT false,
    source_archive_sha256 TEXT,           -- Hash of source tarball
    indexed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_id, version)
);

-- Architecture variants
CREATE TABLE corpus.build_variants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_version_id UUID NOT NULL REFERENCES corpus.library_versions(id),
    architecture TEXT NOT NULL,           -- x86_64, aarch64, armv7
    abi TEXT,                             -- gnu, musl, msvc
    compiler TEXT,                        -- gcc, clang
    compiler_version TEXT,
    optimization_level TEXT,              -- O0, O2, O3, Os
    build_id TEXT,                        -- ELF Build-ID if available
    binary_sha256 TEXT NOT NULL,
    indexed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_version_id, architecture, abi, compiler, optimization_level)
);

-- Functions in corpus
CREATE TABLE corpus.functions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    build_variant_id UUID NOT NULL REFERENCES corpus.build_variants(id),
    name TEXT NOT NULL,                   -- Function name (may be mangled)
    demangled_name TEXT,                  -- Demangled C++ name
    address BIGINT NOT NULL,
    size_bytes INTEGER NOT NULL,
    is_exported BOOLEAN DEFAULT false,
    is_inline BOOLEAN DEFAULT false,
    source_file TEXT,                     -- Source file if debug info is present
    source_line INTEGER,
    UNIQUE (build_variant_id, name, address)
);

-- Function fingerprints (multiple algorithms per function)
CREATE TABLE corpus.fingerprints (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    algorithm TEXT NOT NULL,              -- semantic_ksg, instruction_bb, cfg_wl
    fingerprint BYTEA NOT NULL,           -- Variable length depending on algorithm
    fingerprint_hex TEXT GENERATED ALWAYS AS (encode(fingerprint, 'hex')) STORED,
    metadata JSONB,                       -- Algorithm-specific metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (function_id, algorithm)
);

-- Indexes for fast fingerprint lookup
CREATE INDEX idx_fingerprints_algorithm_hex ON corpus.fingerprints(algorithm, fingerprint_hex);
CREATE INDEX idx_fingerprints_bytea ON corpus.fingerprints USING hash (fingerprint);

-- Function clusters (similar functions across versions)
CREATE TABLE corpus.function_clusters (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    canonical_name TEXT NOT NULL,         -- e.g., "memcpy" across all versions
    description TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_id, canonical_name)
);

-- Cluster membership
CREATE TABLE corpus.cluster_members (
    cluster_id UUID NOT NULL REFERENCES corpus.function_clusters(id),
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    similarity_to_centroid DECIMAL(5,4),
    PRIMARY KEY (cluster_id, function_id)
);

-- CVE associations (which functions are affected by which CVEs)
CREATE TABLE corpus.function_cves (
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    cve_id TEXT NOT NULL,
    affected_state TEXT NOT NULL,         -- vulnerable, fixed, not_affected
    patch_commit TEXT,                    -- Git commit that fixed the CVE
    confidence DECIMAL(3,2) NOT NULL,
    evidence_type TEXT,                   -- changelog, commit, advisory
    PRIMARY KEY (function_id, cve_id)
);

-- Ingestion job tracking
CREATE TABLE corpus.ingestion_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    job_type TEXT NOT NULL,               -- full_ingest, incremental, cve_update
    status TEXT NOT NULL DEFAULT 'pending',
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    functions_indexed INTEGER,
    errors JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

### Core Interfaces

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/ICorpusIngestionService.cs
namespace StellaOps.BinaryIndex.Corpus;

public interface ICorpusIngestionService
{
    /// <summary>
    /// Ingest all functions from a library binary.
    /// </summary>
    Task<IngestionResult> IngestLibraryAsync(
        LibraryMetadata metadata,
        Stream binaryStream,
        IngestionOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Ingest a specific version range.
    /// </summary>
    Task<ImmutableArray<IngestionResult>> IngestVersionRangeAsync(
        string libraryName,
        VersionRange range,
        IAsyncEnumerable<LibraryBinary> binaries,
        CancellationToken ct = default);
}

public sealed record LibraryMetadata(
    string Name,
    string Version,
    string Architecture,
    string? Abi,
    string? Compiler,
    string? OptimizationLevel);

public sealed record IngestionResult(
    Guid JobId,
    string LibraryName,
    string Version,
    int FunctionsIndexed,
    int FingerprintsGenerated,
    ImmutableArray<string> Errors);
```

```csharp
|
||||
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/ICorpusQueryService.cs
|
||||
namespace StellaOps.BinaryIndex.Corpus;
|
||||
|
||||
public interface ICorpusQueryService
|
||||
{
|
||||
/// <summary>
|
||||
/// Identify a function by its fingerprint.
|
||||
/// </summary>
|
||||
Task<ImmutableArray<FunctionMatch>> IdentifyFunctionAsync(
|
||||
FunctionFingerprints fingerprints,
|
||||
IdentifyOptions? options = null,
|
||||
CancellationToken ct = default);
|
||||
|
||||
/// <summary>
|
||||
/// Get all functions associated with a CVE.
|
||||
/// </summary>
|
||||
Task<ImmutableArray<CorpusFunction>> GetFunctionsForCveAsync(
|
||||
string cveId,
|
||||
CancellationToken ct = default);
|
||||
|
||||
/// <summary>
|
||||
/// Get function evolution across versions.
|
||||
/// </summary>
|
||||
Task<FunctionEvolution> GetFunctionEvolutionAsync(
|
||||
string libraryName,
|
||||
string functionName,
|
||||
CancellationToken ct = default);
|
||||
}
|
||||
|
||||
public sealed record FunctionFingerprints(
|
||||
byte[]? SemanticHash,
|
||||
byte[]? InstructionHash,
|
||||
byte[]? CfgHash,
|
||||
ImmutableArray<string>? ApiCalls);
|
||||
|
||||
public sealed record FunctionMatch(
|
||||
string LibraryName,
|
||||
string Version,
|
||||
string FunctionName,
|
||||
decimal Similarity,
|
||||
MatchConfidence Confidence,
|
||||
string? CveStatus, // null if not CVE-affected
|
||||
ImmutableArray<string> AffectedCves);
|
||||
|
||||
public sealed record FunctionEvolution(
|
||||
string LibraryName,
|
||||
string FunctionName,
|
||||
ImmutableArray<VersionSnapshot> Versions);
|
||||
|
||||
public sealed record VersionSnapshot(
|
||||
string Version,
|
||||
int SizeBytes,
|
||||
string FingerprintHex,
|
||||
ImmutableArray<string> CveChanges); // CVEs fixed/introduced in this version
|
||||
```

### Library Connectors

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Connectors/ILibraryCorpusConnector.cs
namespace StellaOps.BinaryIndex.Corpus.Connectors;

public interface ILibraryCorpusConnector
{
    string LibraryName { get; }
    string[] SupportedArchitectures { get; }

    /// <summary>
    /// Get available versions from source.
    /// </summary>
    Task<ImmutableArray<string>> GetAvailableVersionsAsync(CancellationToken ct);

    /// <summary>
    /// Download and extract the library binary for a version.
    /// </summary>
    Task<LibraryBinary> FetchBinaryAsync(
        string version,
        string architecture,
        string? abi = null,
        CancellationToken ct = default);
}

// Implementations:
// - GlibcCorpusConnector (GNU C Library)
// - OpenSslCorpusConnector (OpenSSL/LibreSSL/BoringSSL)
// - ZlibCorpusConnector (zlib/zlib-ng)
// - CurlCorpusConnector (libcurl)
// - SqliteCorpusConnector (SQLite)
// - LibpngCorpusConnector (libpng)
// - LibjpegCorpusConnector (libjpeg-turbo)
// - LibxmlCorpusConnector (libxml2)
// - OpenJpegCorpusConnector (OpenJPEG)
// - ExpatCorpusConnector (Expat XML parser)
```

---

## Delivery Tracker

| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | CORP-001 | TODO | Phase 1 | Guild | Create `StellaOps.BinaryIndex.Corpus` project structure |
| 2 | CORP-002 | TODO | CORP-001 | Guild | Define corpus model types (LibraryMetadata, FunctionMatch, etc.) |
| 3 | CORP-003 | TODO | CORP-001 | Guild | Create PostgreSQL corpus schema (corpus.* tables) |
| 4 | CORP-004 | TODO | CORP-003 | Guild | Implement PostgreSQL corpus repository |
| 5 | CORP-005 | TODO | CORP-004 | Guild | Implement GlibcCorpusConnector |
| 6 | CORP-006 | TODO | CORP-004 | Guild | Implement OpenSslCorpusConnector |
| 7 | CORP-007 | TODO | CORP-004 | Guild | Implement ZlibCorpusConnector |
| 8 | CORP-008 | TODO | CORP-004 | Guild | Implement CurlCorpusConnector |
| 9 | CORP-009 | TODO | CORP-005-008 | Guild | Implement CorpusIngestionService |
| 10 | CORP-010 | TODO | CORP-009 | Guild | Implement batch fingerprint generation pipeline |
| 11 | CORP-011 | TODO | CORP-010 | Guild | Implement function clustering (group similar functions) |
| 12 | CORP-012 | TODO | CORP-011 | Guild | Implement CorpusQueryService |
| 13 | CORP-013 | TODO | CORP-012 | Guild | Implement CVE-to-function mapping updater |
| 14 | CORP-014 | TODO | CORP-012 | Guild | Integrate corpus queries into BinaryVulnerabilityService |
| 15 | CORP-015 | TODO | CORP-009 | Guild | Initial corpus ingestion: glibc (5 major versions x 3 archs) |
| 16 | CORP-016 | TODO | CORP-015 | Guild | Initial corpus ingestion: OpenSSL (10 versions x 3 archs) |
| 17 | CORP-017 | TODO | CORP-016 | Guild | Initial corpus ingestion: zlib, curl, sqlite |
| 18 | CORP-018 | TODO | CORP-012 | Guild | Unit tests: Corpus ingestion correctness |
| 19 | CORP-019 | TODO | CORP-012 | Guild | Unit tests: Query service accuracy |
| 20 | CORP-020 | TODO | CORP-017 | Guild | Integration tests: End-to-end function identification |
| 21 | CORP-021 | TODO | CORP-020 | Guild | Benchmark: Query latency at scale (100K+ functions) |
| 22 | CORP-022 | TODO | CORP-021 | Guild | Documentation: Corpus management guide |

---

## Task Details

### CORP-005: Implement GlibcCorpusConnector

Fetch glibc binaries from GNU mirrors and Debian/Ubuntu packages:

```csharp
internal sealed class GlibcCorpusConnector : ILibraryCorpusConnector
{
    private readonly IHttpClientFactory _httpClientFactory;
    private readonly ILogger<GlibcCorpusConnector> _logger;

    public string LibraryName => "glibc";
    public string[] SupportedArchitectures => ["x86_64", "aarch64", "armv7", "i686"];

    public async Task<ImmutableArray<string>> GetAvailableVersionsAsync(CancellationToken ct)
    {
        // Query the GNU FTP mirror for available versions:
        // https://ftp.gnu.org/gnu/glibc/
        var client = _httpClientFactory.CreateClient("GnuMirror");
        var html = await client.GetStringAsync("https://ftp.gnu.org/gnu/glibc/", ct);

        // Parse the directory listing for glibc-X.Y.tar.gz files
        var versions = ParseVersionsFromListing(html);

        return [.. versions.OrderByDescending(v => Version.Parse(v))];
    }

    public async Task<LibraryBinary> FetchBinaryAsync(
        string version,
        string architecture,
        string? abi = null,
        CancellationToken ct = default)
    {
        // Strategy 1: Try a pre-built Debian/Ubuntu package
        var debBinary = await TryFetchDebianPackageAsync(version, architecture, ct);
        if (debBinary is not null)
            return debBinary;

        // Strategy 2: Download the source and compile with specific flags
        var sourceTarball = await DownloadSourceAsync(version, ct);
        return await CompileForArchitecture(sourceTarball, architecture, abi, ct);
    }

    private async Task<LibraryBinary?> TryFetchDebianPackageAsync(
        string version,
        string architecture,
        CancellationToken ct)
    {
        // Map the glibc version to Debian package versions,
        // e.g. glibc 2.31 -> libc6_2.31-13+deb11u5_amd64.deb
        var packages = await QueryDebianPackagesAsync(version, architecture, ct);

        foreach (var pkg in packages)
        {
            var binary = await DownloadAndExtractDebAsync(pkg, ct);
            if (binary is not null)
                return binary;
        }

        return null;
    }
}
```
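The connector relies on a `ParseVersionsFromListing` helper that is not shown above. A minimal sketch of one possible implementation (an assumption, not the project's actual parser) extracts version numbers from the mirror's HTML directory listing with a regular expression:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

internal static class GlibcListingParser
{
    // Hypothetical helper: matches links such as "glibc-2.31.tar.gz" while the
    // trailing negative lookahead skips signature files like "glibc-2.31.tar.gz.sig".
    private static readonly Regex VersionPattern =
        new(@"glibc-(\d+\.\d+(?:\.\d+)?)\.tar\.gz(?!\.)", RegexOptions.Compiled);

    public static IReadOnlyList<string> ParseVersionsFromListing(string html) =>
        VersionPattern.Matches(html)
            .Select(m => m.Groups[1].Value)
            .Distinct()
            .ToList();
}
```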

### CORP-011: Implement Function Clustering

Group semantically similar functions across versions:

```csharp
internal sealed class FunctionClusteringService
{
    private readonly ICorpusRepository _repository;
    private readonly ISemanticMatcher _matcher;

    public async Task ClusterFunctionsAsync(
        Guid libraryId,
        ClusteringOptions options,
        CancellationToken ct)
    {
        // Get all functions with semantic fingerprints
        var functions = await _repository.GetFunctionsWithFingerprintsAsync(libraryId, ct);

        // Group by canonical name (demangled, normalized)
        var groups = functions
            .GroupBy(f => NormalizeCanonicalName(f.DemangledName ?? f.Name))
            .ToList();

        foreach (var group in groups)
        {
            ct.ThrowIfCancellationRequested();

            // Create or update the cluster
            var clusterId = await _repository.EnsureClusterAsync(
                libraryId,
                group.Key,
                ct);

            // Compute the centroid (most common fingerprint)
            var centroid = ComputeCentroid(group);

            // Add members with similarity scores
            foreach (var function in group)
            {
                var similarity = await _matcher.MatchAsync(
                    function.SemanticFingerprint,
                    centroid,
                    ct: ct);

                await _repository.AddClusterMemberAsync(
                    clusterId,
                    function.Id,
                    similarity.OverallSimilarity,
                    ct);
            }
        }
    }

    private static string NormalizeCanonicalName(string name)
    {
        // Demangle C++ names, then strip symbol-versioning suffixes such as
        // "@GLIBC_2.2.5" or "@@GLIBC_2.17" down to the base function name.
        var demangled = CppDemangler.Demangle(name);
        var at = demangled.IndexOf('@');
        return at >= 0 ? demangled[..at] : demangled;
    }
}
```
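`ComputeCentroid` is referenced above but never defined. One simple interpretation of "most common fingerprint" is the modal fingerprint within the group; the sketch below assumes a `CorpusFunction` type exposing a `byte[] SemanticFingerprint` property and is illustrative only:

```csharp
// Hypothetical sketch of ComputeCentroid: pick the most frequent fingerprint
// in the group as the cluster centroid. byte[] lacks value equality, so we
// group on a hex rendering of the bytes.
private static byte[] ComputeCentroid(IEnumerable<CorpusFunction> group) =>
    group
        .Select(f => f.SemanticFingerprint)
        .GroupBy(fp => Convert.ToHexString(fp))
        .OrderByDescending(g => g.Count())
        .First()
        .First();
```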

---

## Initial Corpus Coverage

### Priority Libraries (Phase 2a)

| Library | Versions | Architectures | Est. Functions | CVE Coverage |
|---------|----------|---------------|----------------|--------------|
| glibc | 2.17, 2.28, 2.31, 2.35, 2.38 | x64, arm64, armv7 | ~15,000 | 50+ CVEs |
| OpenSSL | 1.0.2, 1.1.0, 1.1.1, 3.0, 3.1 | x64, arm64 | ~8,000 | 100+ CVEs |
| zlib | 1.2.8, 1.2.11, 1.2.13, 1.3 | x64, arm64 | ~200 | 5+ CVEs |
| libcurl | 7.50-7.88 (select) | x64, arm64 | ~2,000 | 80+ CVEs |
| SQLite | 3.30-3.44 (select) | x64, arm64 | ~1,500 | 30+ CVEs |

### Extended Coverage (Phase 2b)

| Library | Est. Functions | Priority |
|---------|----------------|----------|
| libpng | ~300 | Medium |
| libjpeg-turbo | ~400 | Medium |
| libxml2 | ~1,200 | High |
| expat | ~150 | High |
| OpenJPEG | ~600 | Medium |
| freetype | ~800 | Medium |
| harfbuzz | ~500 | Low |

**Total estimated corpus size:** ~30,000 unique functions, ~100,000 fingerprints (including variants)

---

## Storage Estimates

| Component | Size Estimate |
|-----------|---------------|
| PostgreSQL tables | ~2 GB |
| Fingerprint index | ~500 MB |
| Full corpus with metadata | ~5 GB |
| Query cache (Valkey) | ~100 MB |

---

## Success Metrics

| Metric | Target |
|--------|--------|
| Function identification accuracy | 90%+ on stripped binaries |
| Query latency (p99) | <100ms |
| Corpus coverage (top 20 libs) | 80%+ of security-critical functions |
| CVE attribution accuracy | 95%+ |
| False positive rate | <3% |

---

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |

---

## Decisions & Risks

| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| Corpus size may grow large | Risk | Implement tiered storage, archive old versions |
| Package version mapping is complex | Risk | Maintain distro-version mapping tables |
| Compilation variants create explosion | Risk | Prioritize common optimization levels (O2, O3) |
| CVE mapping requires manual curation | Risk | Start with high-impact CVEs, automate with NVD data |

---

## Next Checkpoints

- 2026-01-20: CORP-001 through CORP-008 (infrastructure, connectors) complete
- 2026-01-31: CORP-009 through CORP-014 (services, integration) complete
- 2026-02-15: CORP-015 through CORP-022 (corpus ingestion, testing) complete

@@ -1,772 +0,0 @@

# Sprint 20260105_001_003_BINDEX - Semantic Diffing Phase 3: Ghidra Integration

## Topic & Scope

Integrate Ghidra as a secondary analysis backend for cases where B2R2 provides insufficient coverage or accuracy. Leverage Ghidra's mature Version Tracking, BSim, and FunctionID capabilities via headless analysis and the ghidriff Python bridge.

**Advisory Reference:** Product advisory on semantic diffing - Ghidra Version Tracking correlators, BSim behavioral similarity, ghidriff for automated patch diff workflows.

**Key Insight:** Ghidra has 15+ years of refinement in binary diffing. Rather than reimplementing, we should integrate Ghidra as a fallback/enhancement layer for:

1. Architectures B2R2 handles poorly
2. Complex obfuscation scenarios
3. Version Tracking with multiple correlators
4. BSim database queries

**Working directory:** `src/BinaryIndex/`

**Evidence:** New `StellaOps.BinaryIndex.Ghidra` library, Ghidra Headless integration, ghidriff bridge.

---

## Dependencies & Concurrency

| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Should be complete |
| SPRINT_20260105_001_002 (Corpus) | Sprint | Can run in parallel |
| Ghidra 11.x | External | Available |
| Java 17+ | Runtime | Required for Ghidra |
| Python 3.10+ | Runtime | Required for ghidriff |
| ghidriff | External | Available (pip) |

**Parallel Execution:** Ghidra Headless setup (GHID-001-004) and ghidriff integration (GHID-005-008) can proceed in parallel.

---

## Documentation Prerequisites

- `docs/modules/binary-index/architecture.md`
- Ghidra documentation: https://ghidra.re/ghidra_docs/
- Ghidra Version Tracking: https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing
- ghidriff repository: https://github.com/clearbluejar/ghidriff
- BSim documentation: https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/

---

## Problem Analysis

### Current State

- B2R2 is the sole disassembly/analysis backend
- B2R2 coverage varies by architecture (excellent x64/ARM64, limited others)
- No access to Ghidra's mature correlators and similarity engines
- Cannot leverage BSim's pre-built signature databases

### B2R2 vs Ghidra Trade-offs

| Capability | B2R2 | Ghidra |
|------------|------|--------|
| Speed | Fast (native .NET) | Slower (Java, headless startup) |
| Architecture coverage | 12+ (some limited) | 20+ (mature) |
| IR quality | Good (LowUIR) | Excellent (P-Code) |
| Decompiler | None | Excellent |
| Version Tracking | None | Mature (multiple correlators) |
| BSim | None | Full support |
| Integration | Native .NET | Process/API bridge |

### Target Architecture

```
                  Unified Disassembly/Analysis Layer

              IDisassemblyPlugin Selection Logic
          Primary:  B2R2   (fast, deterministic)
          Fallback: Ghidra (complex cases, low B2R2 confidence)
                |                              |
                v                              v
        B2R2 Backend                   Ghidra Backend
        - Native .NET                  Ghidra Headless Server
        - LowUIR lifting               - P-Code decompilation
        - CFG recovery                 - Version Tracking
        - Fast fingerprinting          - BSim queries
                                       - FunctionID matching
                                               |
                                               v
                                       ghidriff Bridge
                                       - Automated patch diffing
                                       - JSON/Markdown output
                                       - CI/CD integration
```

---

## Architecture Design

### Ghidra Headless Service

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IGhidraService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IGhidraService
{
    /// <summary>
    /// Analyze a binary using Ghidra headless.
    /// </summary>
    Task<GhidraAnalysisResult> AnalyzeAsync(
        Stream binaryStream,
        GhidraAnalysisOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Run Version Tracking between two binaries.
    /// </summary>
    Task<VersionTrackingResult> CompareVersionsAsync(
        Stream oldBinary,
        Stream newBinary,
        VersionTrackingOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Query BSim for function matches.
    /// </summary>
    Task<ImmutableArray<BSimMatch>> QueryBSimAsync(
        GhidraFunction function,
        BSimQueryOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Check if the Ghidra backend is available and healthy.
    /// </summary>
    Task<bool> IsAvailableAsync(CancellationToken ct = default);
}

public sealed record GhidraAnalysisResult(
    string BinaryHash,
    ImmutableArray<GhidraFunction> Functions,
    ImmutableArray<GhidraImport> Imports,
    ImmutableArray<GhidraExport> Exports,
    ImmutableArray<GhidraString> Strings,
    GhidraMetadata Metadata);

public sealed record GhidraFunction(
    string Name,
    ulong Address,
    int Size,
    string? Signature,       // Decompiled signature
    string? DecompiledCode,  // Decompiled C code
    byte[] PCodeHash,        // P-Code semantic hash
    ImmutableArray<string> CalledFunctions,
    ImmutableArray<string> CallingFunctions);
```

### Version Tracking Integration

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IVersionTrackingService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IVersionTrackingService
{
    /// <summary>
    /// Run Ghidra Version Tracking with multiple correlators.
    /// </summary>
    Task<VersionTrackingResult> TrackVersionsAsync(
        Stream oldBinary,
        Stream newBinary,
        VersionTrackingOptions options,
        CancellationToken ct = default);
}

public sealed record VersionTrackingOptions
{
    public ImmutableArray<CorrelatorType> Correlators { get; init; } =
        [CorrelatorType.ExactBytes, CorrelatorType.ExactMnemonics,
         CorrelatorType.SymbolName, CorrelatorType.DataReference,
         CorrelatorType.CombinedReference];

    public decimal MinSimilarity { get; init; } = 0.5m;
    public bool IncludeDecompilation { get; init; } = false;
}

public enum CorrelatorType
{
    ExactBytes,        // Identical byte sequences
    ExactMnemonics,    // Identical instruction mnemonics
    SymbolName,        // Matching symbol names
    DataReference,     // Similar data references
    CombinedReference, // Combined reference scoring
    BSim               // Behavioral similarity
}

public sealed record VersionTrackingResult(
    ImmutableArray<FunctionMatch> Matches,
    ImmutableArray<FunctionAdded> AddedFunctions,
    ImmutableArray<FunctionRemoved> RemovedFunctions,
    ImmutableArray<FunctionModified> ModifiedFunctions,
    VersionTrackingStats Statistics);

public sealed record FunctionMatch(
    string OldName,
    ulong OldAddress,
    string NewName,
    ulong NewAddress,
    decimal Similarity,
    CorrelatorType MatchedBy,
    ImmutableArray<MatchDifference> Differences);

public sealed record MatchDifference(
    DifferenceType Type,
    string Description,
    string? OldValue,
    string? NewValue);

public enum DifferenceType
{
    InstructionAdded,
    InstructionRemoved,
    InstructionChanged,
    BranchTargetChanged,
    CallTargetChanged,
    ConstantChanged,
    SizeChanged
}
```

### ghidriff Bridge

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IGhidriffBridge.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IGhidriffBridge
{
    /// <summary>
    /// Run ghidriff to compare two binaries.
    /// </summary>
    Task<GhidriffResult> DiffAsync(
        string oldBinaryPath,
        string newBinaryPath,
        GhidriffOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Generate a patch diff report.
    /// </summary>
    Task<string> GenerateReportAsync(
        GhidriffResult result,
        ReportFormat format,
        CancellationToken ct = default);
}

public sealed record GhidriffOptions
{
    public string? GhidraPath { get; init; }
    public string? ProjectPath { get; init; }
    public bool IncludeDecompilation { get; init; } = true;
    public bool IncludeDisassembly { get; init; } = true;
    public ImmutableArray<string> ExcludeFunctions { get; init; } = [];
}

public sealed record GhidriffResult(
    string OldBinaryHash,
    string NewBinaryHash,
    ImmutableArray<GhidriffFunction> AddedFunctions,
    ImmutableArray<GhidriffFunction> RemovedFunctions,
    ImmutableArray<GhidriffDiff> ModifiedFunctions,
    GhidriffStats Statistics,
    string RawJsonOutput);

public sealed record GhidriffDiff(
    string FunctionName,
    string OldSignature,
    string NewSignature,
    decimal Similarity,
    string? OldDecompiled,
    string? NewDecompiled,
    ImmutableArray<string> InstructionChanges);

public enum ReportFormat { Json, Markdown, Html }
```
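A call site for the bridge might look like the following sketch; the method name and wiring are assumptions for illustration, not part of the interface above:

```csharp
// Hypothetical usage of IGhidriffBridge for a single patch-diff run.
public static async Task<string> SummarizePatchAsync(
    IGhidriffBridge bridge,
    string oldPath,
    string newPath,
    CancellationToken ct)
{
    var result = await bridge.DiffAsync(
        oldPath,
        newPath,
        new GhidriffOptions { IncludeDecompilation = true },
        ct);

    // Render a Markdown report suitable for CI artifacts.
    return await bridge.GenerateReportAsync(result, ReportFormat.Markdown, ct);
}
```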

### BSim Integration

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IBSimService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IBSimService
{
    /// <summary>
    /// Generate BSim signatures for functions.
    /// </summary>
    Task<ImmutableArray<BSimSignature>> GenerateSignaturesAsync(
        GhidraAnalysisResult analysis,
        BSimGenerationOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Query the BSim database for similar functions.
    /// </summary>
    Task<ImmutableArray<BSimMatch>> QueryAsync(
        BSimSignature signature,
        BSimQueryOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Ingest functions into the BSim database.
    /// </summary>
    Task IngestAsync(
        string libraryName,
        string version,
        ImmutableArray<BSimSignature> signatures,
        CancellationToken ct = default);
}

public sealed record BSimSignature(
    string FunctionName,
    ulong Address,
    byte[] FeatureVector,     // BSim feature extraction
    int VectorLength,
    double SelfSignificance); // How distinctive this function is

public sealed record BSimMatch(
    string MatchedLibrary,
    string MatchedVersion,
    string MatchedFunction,
    double Similarity,
    double Significance,
    double Confidence);

public sealed record BSimQueryOptions
{
    public double MinSimilarity { get; init; } = 0.7;
    public double MinSignificance { get; init; } = 0.0;
    public int MaxResults { get; init; } = 10;
    public ImmutableArray<string> TargetLibraries { get; init; } = [];
}
```

---

## Delivery Tracker

| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | GHID-001 | TODO | - | Guild | Create `StellaOps.BinaryIndex.Ghidra` project structure |
| 2 | GHID-002 | TODO | GHID-001 | Guild | Define Ghidra model types (GhidraFunction, VersionTrackingResult, etc.) |
| 3 | GHID-003 | TODO | GHID-001 | Guild | Implement Ghidra Headless launcher/manager |
| 4 | GHID-004 | TODO | GHID-003 | Guild | Implement GhidraService (headless analysis wrapper) |
| 5 | GHID-005 | TODO | GHID-001 | Guild | Set up ghidriff Python environment |
| 6 | GHID-006 | TODO | GHID-005 | Guild | Implement GhidriffBridge (Python interop) |
| 7 | GHID-007 | TODO | GHID-006 | Guild | Implement GhidriffReportGenerator |
| 8 | GHID-008 | TODO | GHID-004,006 | Guild | Implement VersionTrackingService |
| 9 | GHID-009 | TODO | GHID-004 | Guild | Implement BSim signature generation |
| 10 | GHID-010 | TODO | GHID-009 | Guild | Implement BSim query service |
| 11 | GHID-011 | TODO | GHID-010 | Guild | Set up BSim PostgreSQL database |
| 12 | GHID-012 | TODO | GHID-008,010 | Guild | Implement GhidraDisassemblyPlugin (IDisassemblyPlugin) |
| 13 | GHID-013 | TODO | GHID-012 | Guild | Integrate Ghidra into DisassemblyService as fallback |
| 14 | GHID-014 | TODO | GHID-013 | Guild | Implement fallback selection logic (B2R2 -> Ghidra) |
| 15 | GHID-015 | TODO | GHID-008 | Guild | Unit tests: Version Tracking correlators |
| 16 | GHID-016 | TODO | GHID-010 | Guild | Unit tests: BSim signature generation |
| 17 | GHID-017 | TODO | GHID-014 | Guild | Integration tests: Fallback scenarios |
| 18 | GHID-018 | TODO | GHID-017 | Guild | Benchmark: Ghidra vs B2R2 accuracy comparison |
| 19 | GHID-019 | TODO | GHID-018 | Guild | Documentation: Ghidra deployment guide |
| 20 | GHID-020 | TODO | GHID-019 | Guild | Docker image: Ghidra Headless service |

---

## Task Details

### GHID-003: Implement Ghidra Headless Launcher

Manage the Ghidra Headless process lifecycle:

```csharp
internal sealed class GhidraHeadlessManager : IAsyncDisposable
{
    private readonly GhidraOptions _options;
    private readonly ILogger<GhidraHeadlessManager> _logger;
    private Process? _ghidraProcess;
    private readonly SemaphoreSlim _lock = new(1, 1);

    public GhidraHeadlessManager(
        IOptions<GhidraOptions> options,
        ILogger<GhidraHeadlessManager> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> AnalyzeAsync(
        string binaryPath,
        string scriptName,
        string[] scriptArgs,
        CancellationToken ct)
    {
        await _lock.WaitAsync(ct);
        try
        {
            var projectDir = Path.Combine(_options.WorkDir, Guid.NewGuid().ToString("N"));
            Directory.CreateDirectory(projectDir);

            var args = BuildAnalyzeArgs(projectDir, binaryPath, scriptName, scriptArgs);

            return await RunGhidraAsync(args, ct);
        }
        finally
        {
            _lock.Release();
        }
    }

    private string[] BuildAnalyzeArgs(
        string projectDir,
        string binaryPath,
        string scriptName,
        string[] scriptArgs)
    {
        var args = new List<string>
        {
            projectDir,     // Project location
            "TempProject",  // Project name
            "-import", binaryPath,
            "-postScript", scriptName
        };

        if (scriptArgs.Length > 0)
        {
            args.AddRange(scriptArgs);
        }

        // Add standard options
        args.AddRange([
            "-noanalysis",  // We run analysis explicitly from the script
            "-scriptPath", _options.ScriptsDir,
            "-max-cpu", _options.MaxCpu.ToString(CultureInfo.InvariantCulture)
        ]);

        return [.. args];
    }

    private async Task<string> RunGhidraAsync(string[] args, CancellationToken ct)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = Path.Combine(_options.GhidraHome, "support", "analyzeHeadless"),
            Arguments = string.Join(" ", args.Select(QuoteArg)),
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        // Set Java options
        startInfo.EnvironmentVariables["JAVA_HOME"] = _options.JavaHome;
        startInfo.EnvironmentVariables["MAXMEM"] = _options.MaxMemory;

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Failed to start Ghidra");

        // Drain stdout and stderr concurrently; reading them sequentially can
        // deadlock when one redirected pipe fills while the other is blocked.
        var outputTask = process.StandardOutput.ReadToEndAsync(ct);
        var errorTask = process.StandardError.ReadToEndAsync(ct);

        await process.WaitForExitAsync(ct);

        var output = await outputTask;
        var error = await errorTask;

        if (process.ExitCode != 0)
        {
            throw new GhidraException($"Ghidra failed: {error}");
        }

        return output;
    }
}
```
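Both process wrappers in this sprint call a `QuoteArg` helper that is not defined anywhere in the listings. A minimal sketch (an assumption about its intent) wraps arguments containing whitespace in double quotes and escapes embedded quotes:

```csharp
// Hypothetical QuoteArg helper assumed by the process wrappers above.
// Note: on modern .NET, ProcessStartInfo.ArgumentList avoids hand-built
// argument strings entirely and is generally the safer choice.
private static string QuoteArg(string arg) =>
    arg.Length > 0 && !arg.Any(char.IsWhiteSpace)
        ? arg
        : "\"" + arg.Replace("\"", "\\\"") + "\"";
```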

### GHID-006: Implement ghidriff Bridge

Python interop for ghidriff:

```csharp
internal sealed class GhidriffBridge : IGhidriffBridge
{
    private readonly GhidriffOptions _options;
    private readonly ILogger<GhidriffBridge> _logger;

    public async Task<GhidriffResult> DiffAsync(
        string oldBinaryPath,
        string newBinaryPath,
        GhidriffOptions? options = null,
        CancellationToken ct = default)
    {
        options ??= _options;

        var outputDir = Path.Combine(Path.GetTempPath(), $"ghidriff_{Guid.NewGuid():N}");
        Directory.CreateDirectory(outputDir);

        try
        {
            var args = BuildGhidriffArgs(oldBinaryPath, newBinaryPath, outputDir, options);

            var result = await RunPythonAsync("ghidriff", args, ct);

            // Parse the JSON output
            var jsonPath = Path.Combine(outputDir, "diff.json");
            if (!File.Exists(jsonPath))
            {
                throw new GhidriffException($"ghidriff did not produce output: {result}");
            }

            var json = await File.ReadAllTextAsync(jsonPath, ct);
            return ParseGhidriffOutput(json);
        }
        finally
        {
            if (Directory.Exists(outputDir))
            {
                Directory.Delete(outputDir, recursive: true);
            }
        }
    }

    private static string[] BuildGhidriffArgs(
        string oldPath,
        string newPath,
        string outputDir,
        GhidriffOptions options)
    {
        var args = new List<string>
        {
            oldPath,
            newPath,
            "--output-dir", outputDir,
            "--output-format", "json"
        };

        if (!string.IsNullOrEmpty(options.GhidraPath))
        {
            args.AddRange(["--ghidra-path", options.GhidraPath]);
        }

        if (options.IncludeDecompilation)
        {
            args.Add("--include-decompilation");
        }

        if (options.ExcludeFunctions.Length > 0)
        {
            args.AddRange(["--exclude", string.Join(",", options.ExcludeFunctions)]);
        }

        return [.. args];
    }

    private async Task<string> RunPythonAsync(
        string module,
        string[] args,
        CancellationToken ct)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = _options.PythonPath ?? "python3",
            Arguments = $"-m {module} {string.Join(" ", args.Select(QuoteArg))}",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Failed to start Python");

        // Drain both redirected streams concurrently and surface failures
        // instead of silently returning output from a failed run.
        var outputTask = process.StandardOutput.ReadToEndAsync(ct);
        var errorTask = process.StandardError.ReadToEndAsync(ct);

        await process.WaitForExitAsync(ct);

        if (process.ExitCode != 0)
        {
            throw new GhidriffException($"{module} failed: {await errorTask}");
        }

        return await outputTask;
    }
}
```

### GHID-014: Implement Fallback Selection Logic

Smart routing between B2R2 and Ghidra:

```csharp
internal sealed class HybridDisassemblyService : IDisassemblyService
{
    private readonly B2R2DisassemblyPlugin _b2r2;
    private readonly GhidraDisassemblyPlugin _ghidra;
    private readonly ILogger<HybridDisassemblyService> _logger;

    public async Task<DisassemblyResult> DisassembleAsync(
        Stream binaryStream,
        DisassemblyOptions? options = null,
        CancellationToken ct = default)
    {
        options ??= new DisassemblyOptions();

        // Try B2R2 first (faster, native)
        var b2r2Result = await TryB2R2Async(binaryStream, options, ct);

        if (b2r2Result is not null && MeetsQualityThreshold(b2r2Result, options))
        {
            _logger.LogDebug("Using B2R2 result (confidence: {Confidence})",
                b2r2Result.Confidence);
            return b2r2Result;
        }

        // Fallback to Ghidra for:
        // 1. Low B2R2 confidence
        // 2. Unsupported architecture
        // 3. Explicit Ghidra preference
        if (!await _ghidra.IsAvailableAsync(ct))
        {
            _logger.LogWarning("Ghidra unavailable, returning B2R2 result");
            return b2r2Result ?? throw new DisassemblyException("No backend available");
        }

        _logger.LogInformation("Falling back to Ghidra (B2R2 confidence: {Confidence})",
            b2r2Result?.Confidence ?? 0);

        binaryStream.Position = 0;
        return await _ghidra.DisassembleAsync(binaryStream, options, ct);
    }

    private static bool MeetsQualityThreshold(
        DisassemblyResult result,
        DisassemblyOptions options)
    {
        // Confidence threshold
        if (result.Confidence < options.MinConfidence)
            return false;

        // Function discovery threshold
        if (result.Functions.Length < options.MinFunctions)
            return false;

        // Instruction decoding success rate
        var decodeRate = (double)result.DecodedInstructions / result.TotalInstructions;
        if (decodeRate < options.MinDecodeRate)
            return false;

        return true;
    }
}
```
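To make the gate logic concrete, here is a minimal Python sketch of the same three-gate check (the dictionary fields and default thresholds are illustrative stand-ins, not the actual `DisassemblyOptions` defaults):

```python
def meets_quality_threshold(result, min_confidence=0.8, min_functions=1,
                            min_decode_rate=0.95):
    """All three gates must pass, mirroring MeetsQualityThreshold above."""
    if result["confidence"] < min_confidence:
        return False  # backend was unsure about its own output
    if result["function_count"] < min_functions:
        return False  # too few functions discovered
    decode_rate = result["decoded_instructions"] / result["total_instructions"]
    return decode_rate >= min_decode_rate
```

Because the gates are conjunctive, a single weak signal (for example a 90% decode rate against a 95% threshold) forces the Ghidra fallback even when confidence is otherwise high.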

---

## Deployment Architecture

### Container Setup

```yaml
# docker-compose.ghidra.yml
services:
  ghidra-headless:
    image: stellaops/ghidra-headless:11.2
    build:
      context: ./devops/docker/ghidra
      dockerfile: Dockerfile.headless
    volumes:
      - ghidra-projects:/projects
      - ghidra-scripts:/scripts
    environment:
      JAVA_HOME: /opt/java/openjdk
      MAXMEM: 4G
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G

  bsim-postgres:
    image: postgres:16
    volumes:
      - bsim-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: bsim
      POSTGRES_USER: bsim
      POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD}

volumes:
  ghidra-projects:
  ghidra-scripts:
  bsim-data:
```

### Dockerfile

```dockerfile
# devops/docker/ghidra/Dockerfile.headless
FROM eclipse-temurin:17-jdk-jammy

ARG GHIDRA_VERSION=11.2
# The release zip name carries a build-date suffix; curl cannot expand a
# shell glob in a URL, so the date must be supplied at build time.
ARG GHIDRA_BUILD_DATE
ARG GHIDRA_SHA256=abc123...

# Tools needed for the download step
RUN apt-get update && apt-get install -y --no-install-recommends curl unzip ca-certificates \
    && apt-get clean

# Download and extract Ghidra
RUN curl -fsSL "https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_${GHIDRA_BUILD_DATE}.zip" \
    -o /tmp/ghidra.zip \
    && echo "${GHIDRA_SHA256} /tmp/ghidra.zip" | sha256sum -c - \
    && unzip /tmp/ghidra.zip -d /opt \
    && rm /tmp/ghidra.zip \
    && ln -s /opt/ghidra_* /opt/ghidra

# Install Python for ghidriff
RUN apt-get update && apt-get install -y python3 python3-pip \
    && pip3 install ghidriff \
    && apt-get clean

ENV GHIDRA_HOME=/opt/ghidra
ENV PATH="${GHIDRA_HOME}/support:${PATH}"

WORKDIR /projects
ENTRYPOINT ["analyzeHeadless"]
```

---

## Success Metrics

| Metric | Current | Target |
|--------|---------|--------|
| Architecture coverage | 12 (B2R2) | 20+ (with Ghidra) |
| Complex binary accuracy | ~70% | 90%+ |
| Version tracking precision | N/A | 85%+ |
| BSim identification rate | N/A | 80%+ on known libs |
| Fallback latency overhead | N/A | <30s per binary |

---

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |

---

## Decisions & Risks

| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| Ghidra adds Java dependency | Trade-off | Containerize Ghidra, keep optional |
| ghidriff Python interop adds complexity | Trade-off | Use subprocess, avoid embedding |
| Ghidra startup time is slow (~10-30s) | Risk | Keep B2R2 primary, Ghidra fallback only |
| BSim database grows large | Risk | Prune old versions, tier storage |
| License considerations (Apache 2.0) | Compliance | Ghidra is Apache 2.0, compatible with AGPL |

---

## Next Checkpoints

- 2026-02-01: GHID-001 through GHID-007 (project setup, bridges) complete
- 2026-02-15: GHID-008 through GHID-014 (services, integration) complete
- 2026-02-28: GHID-015 through GHID-020 (testing, deployment) complete
# Sprint 20260105_001_004_BINDEX - Semantic Diffing Phase 4: Decompiler Integration & ML Similarity

## Topic & Scope

Implement advanced semantic analysis capabilities including decompiled pseudo-code comparison and machine learning-based function embeddings. This phase addresses the highest-impact but most complex enhancements for detecting semantic equivalence in heavily optimized and obfuscated binaries.

**Advisory Reference:** Product advisory on semantic diffing - SEI Carnegie Mellon semantic equivalence checking of decompiled binaries, ML-based similarity models.

**Key Insight:** Comparing decompiled C-like code provides the highest semantic fidelity, as it abstracts away instruction-level details. ML embeddings capture functional behavior patterns that resist obfuscation.

**Working directory:** `src/BinaryIndex/`

**Evidence:** New `StellaOps.BinaryIndex.Decompiler` and `StellaOps.BinaryIndex.ML` libraries, model training pipeline.

---

## Dependencies & Concurrency

| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Required |
| SPRINT_20260105_001_002 (Corpus) | Sprint | Required for training data |
| SPRINT_20260105_001_003 (Ghidra) | Sprint | Required for decompiler |
| Ghidra Decompiler | External | Via Phase 3 |
| ONNX Runtime | Package | Available |
| ML.NET | Package | Available |

**Parallel Execution:** Decompiler integration (DCML-001 through DCML-010) and the ML pipeline (DCML-011 through DCML-020) can proceed in parallel.

---

## Documentation Prerequisites

- Phase 1-3 sprint documents
- `docs/modules/binary-index/architecture.md`
- SEI paper: https://www.sei.cmu.edu/annual-reviews/2022-research-review/semantic-equivalence-checking-of-decompiled-binaries/
- Code similarity research: https://arxiv.org/abs/2308.01463

---

## Problem Analysis

### Current State

After Phases 1-3:

- B2R2 IR-level semantic fingerprints (Phase 1)
- Function behavior corpus (Phase 2)
- Ghidra fallback with Version Tracking (Phase 3)

**Remaining Gaps:**

1. No decompiled code comparison (highest semantic fidelity)
2. No ML-based similarity (robustness to obfuscation)
3. Cannot detect functionally equivalent code with radically different structure

### Target Capabilities

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                       Advanced Semantic Analysis Stack                      │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                          Decompilation Layer                          │  │
│  │                                                                       │  │
│  │  Binary -> Ghidra P-Code -> Decompiled C -> AST -> Semantic Hash      │  │
│  │                                                                       │  │
│  │  Comparison methods:                                                  │  │
│  │    - AST structural similarity                                        │  │
│  │    - Control flow equivalence                                         │  │
│  │    - Data flow equivalence                                            │  │
│  │    - Normalized code text similarity                                  │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                          ML Embedding Layer                           │  │
│  │                                                                       │  │
│  │  Function Code -> Tokenization -> Transformer -> Embedding Vector     │  │
│  │                                                                       │  │
│  │  Models:                                                              │  │
│  │    - CodeBERT variant for binary code                                 │  │
│  │    - Graph Neural Network for CFG                                     │  │
│  │    - Contrastive learning for similarity                              │  │
│  │                                                                       │  │
│  │  Vector similarity: cosine, euclidean, learned metric                 │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                        Ensemble Decision Layer                        │  │
│  │                                                                       │  │
│  │  Combine signals:                                                     │  │
│  │    - Instruction fingerprint (Phase 1) : 15% weight                   │  │
│  │    - Semantic graph (Phase 1)          : 25% weight                   │  │
│  │    - Decompiled AST similarity         : 35% weight                   │  │
│  │    - ML embedding similarity           : 25% weight                   │  │
│  │                                                                       │  │
│  │  Output: Confidence-weighted similarity score                         │  │
│  │                                                                       │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## Architecture Design

### Decompiler Integration

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/IDecompilerService.cs
namespace StellaOps.BinaryIndex.Decompiler;

public interface IDecompilerService
{
    /// <summary>
    /// Decompile a function to C-like pseudo-code.
    /// </summary>
    Task<DecompiledFunction> DecompileAsync(
        GhidraFunction function,
        DecompileOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Parse decompiled code into AST.
    /// </summary>
    Task<DecompiledAst> ParseToAstAsync(
        string decompiledCode,
        CancellationToken ct = default);

    /// <summary>
    /// Compare two decompiled functions for semantic equivalence.
    /// </summary>
    Task<DecompiledComparisonResult> CompareAsync(
        DecompiledFunction a,
        DecompiledFunction b,
        ComparisonOptions? options = null,
        CancellationToken ct = default);
}

public sealed record DecompiledFunction(
    string FunctionName,
    string Signature,
    string Code, // Decompiled C code
    DecompiledAst? Ast,
    ImmutableArray<LocalVariable> Locals,
    ImmutableArray<string> CalledFunctions);

public sealed record DecompiledAst(
    AstNode Root,
    int NodeCount,
    int Depth,
    ImmutableArray<AstPattern> Patterns); // Recognized code patterns

public abstract record AstNode(AstNodeType Type, ImmutableArray<AstNode> Children);

public enum AstNodeType
{
    Function, Block, If, While, For, DoWhile, Switch,
    Return, Break, Continue, Goto,
    Assignment, BinaryOp, UnaryOp, Call, Cast,
    Variable, Constant, ArrayAccess, FieldAccess, Deref
}
```

### AST Comparison Engine

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/AstComparisonEngine.cs
namespace StellaOps.BinaryIndex.Decompiler;

public interface IAstComparisonEngine
{
    /// <summary>
    /// Compute structural similarity between ASTs.
    /// </summary>
    decimal ComputeStructuralSimilarity(DecompiledAst a, DecompiledAst b);

    /// <summary>
    /// Compute edit distance between ASTs.
    /// </summary>
    AstEditDistance ComputeEditDistance(DecompiledAst a, DecompiledAst b);

    /// <summary>
    /// Find semantically equivalent patterns.
    /// </summary>
    ImmutableArray<SemanticEquivalence> FindEquivalences(
        DecompiledAst a,
        DecompiledAst b);
}

public sealed record AstEditDistance(
    int Insertions,
    int Deletions,
    int Modifications,
    int TotalOperations,
    decimal NormalizedDistance); // 0.0 = identical, 1.0 = completely different

public sealed record SemanticEquivalence(
    AstNode NodeA,
    AstNode NodeB,
    EquivalenceType Type,
    decimal Confidence);

public enum EquivalenceType
{
    Identical,              // Exact match
    Renamed,                // Same structure, different names
    Reordered,              // Same operations, different order
    Optimized,              // Compiler optimization variant
    SemanticallyEquivalent  // Different structure, same behavior
}
```
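One cheap way to approximate `ComputeStructuralSimilarity` is an edit distance over the preorder sequence of node types; a production implementation would use a true tree edit distance such as Zhang-Shasha. The sketch below (Python, with `(type, children)` tuples standing in for the `AstNode` record) is illustrative only:

```python
def preorder(node):
    """Flatten a (type, [children]) tree into its preorder type sequence."""
    node_type, children = node
    out = [node_type]
    for child in children:
        out.extend(preorder(child))
    return out

def levenshtein(a, b):
    """Classic DP edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def structural_similarity(ast_a, ast_b):
    """1.0 = identical node-type structure, 0.0 = nothing in common."""
    seq_a, seq_b = preorder(ast_a), preorder(ast_b)
    return 1.0 - levenshtein(seq_a, seq_b) / max(len(seq_a), len(seq_b))
```

Swapping a single `If` for a `While` in a three-node function changes one of three preorder entries, giving a similarity of about 0.67.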

### Decompiled Code Normalizer

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/CodeNormalizer.cs
namespace StellaOps.BinaryIndex.Decompiler;

public interface ICodeNormalizer
{
    /// <summary>
    /// Normalize decompiled code for comparison.
    /// </summary>
    string Normalize(string code, NormalizationOptions? options = null);

    /// <summary>
    /// Generate canonical form hash.
    /// </summary>
    byte[] ComputeCanonicalHash(string code);
}

internal sealed class CodeNormalizer : ICodeNormalizer
{
    public string Normalize(string code, NormalizationOptions? options = null)
    {
        options ??= NormalizationOptions.Default;

        var normalized = code;

        // 1. Normalize variable names (var1, var2, ...)
        if (options.NormalizeVariables)
        {
            normalized = NormalizeVariableNames(normalized);
        }

        // 2. Normalize function calls (func1, func2, ... or keep known names)
        if (options.NormalizeFunctionCalls)
        {
            normalized = NormalizeFunctionCalls(normalized, options.KnownFunctions);
        }

        // 3. Normalize constants (replace magic numbers with placeholders)
        if (options.NormalizeConstants)
        {
            normalized = NormalizeConstants(normalized);
        }

        // 4. Normalize whitespace
        if (options.NormalizeWhitespace)
        {
            normalized = NormalizeWhitespace(normalized);
        }

        // 5. Sort independent statements (where order doesn't matter)
        if (options.SortIndependentStatements)
        {
            normalized = SortIndependentStatements(normalized);
        }

        return normalized;
    }

    private static string NormalizeVariableNames(string code)
    {
        // Replace all local variable names with canonical names
        // var_0, var_1, ... in order of first appearance
        var varIndex = 0;
        var varMap = new Dictionary<string, string>();

        // Regex to find variable declarations and uses
        return Regex.Replace(code, @"\b([a-zA-Z_][a-zA-Z0-9_]*)\b", match =>
        {
            var name = match.Value;

            // Skip keywords and known types
            if (IsKeywordOrType(name))
                return name;

            if (!varMap.TryGetValue(name, out var canonical))
            {
                canonical = $"var_{varIndex++}";
                varMap[name] = canonical;
            }

            return canonical;
        });
    }
}
```
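The variable-renaming step is the easiest one to demonstrate end to end. A Python sketch of the same first-appearance canonicalization (the keyword list is an illustrative subset, not the normalizer's actual table):

```python
import re

# Illustrative subset of C keywords/types the normalizer must not rename
C_KEYWORDS = {"if", "else", "while", "for", "return", "int", "long", "char",
              "void", "unsigned", "short", "float", "double", "struct", "sizeof"}

def normalize_variable_names(code):
    """Rename each identifier to var_N in order of first appearance."""
    mapping = {}

    def repl(match):
        name = match.group(0)
        if name in C_KEYWORDS:
            return name  # keep keywords and known types untouched
        if name not in mapping:
            mapping[name] = f"var_{len(mapping)}"
        return mapping[name]

    return re.sub(r"\b[A-Za-z_][A-Za-z0-9_]*\b", repl, code)
```

Two decompilations that differ only in register-derived names (`uVar1` vs `local_8`) normalize to the same text, so their canonical hashes match.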

### ML Embedding Service

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/IEmbeddingService.cs
namespace StellaOps.BinaryIndex.ML;

public interface IEmbeddingService
{
    /// <summary>
    /// Generate embedding vector for a function.
    /// </summary>
    Task<FunctionEmbedding> GenerateEmbeddingAsync(
        EmbeddingInput input,
        EmbeddingOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Compute similarity between embeddings.
    /// </summary>
    decimal ComputeSimilarity(
        FunctionEmbedding a,
        FunctionEmbedding b,
        SimilarityMetric metric = SimilarityMetric.Cosine);

    /// <summary>
    /// Find similar functions in embedding index.
    /// </summary>
    Task<ImmutableArray<EmbeddingMatch>> FindSimilarAsync(
        FunctionEmbedding query,
        int topK = 10,
        decimal minSimilarity = 0.7m,
        CancellationToken ct = default);
}

public sealed record EmbeddingInput(
    string? DecompiledCode,           // Preferred
    KeySemanticsGraph? SemanticGraph, // Fallback
    byte[]? InstructionBytes,         // Last resort
    EmbeddingInputType PreferredInput);

public enum EmbeddingInputType { DecompiledCode, SemanticGraph, Instructions }

public sealed record FunctionEmbedding(
    string FunctionName,
    float[] Vector,                   // 768-dimensional
    EmbeddingModel Model,
    EmbeddingInputType InputType);

public enum EmbeddingModel
{
    CodeBertBinary,      // Fine-tuned CodeBERT for binary code
    GraphSageFunction,   // GNN for CFG/call graph
    ContrastiveFunction  // Contrastive learning model
}

public enum SimilarityMetric { Cosine, Euclidean, Manhattan, LearnedMetric }
```
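The `FindSimilarAsync` contract (top-K neighbors above a similarity floor) reduces to a score-filter-sort over cosine similarities. A minimal Python sketch with toy 3-dimensional vectors in place of the 768-dimensional embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity; 0.0 for a zero-norm vector, as in the C# version."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)

def find_similar(query, index, top_k=10, min_similarity=0.7):
    """index: {function_name: vector}. Returns (name, score) sorted best-first."""
    scored = [(name, cosine(query, vec)) for name, vec in index.items()]
    scored = [(n, s) for n, s in scored if s >= min_similarity]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

At corpus scale a linear scan like this is too slow; an approximate-nearest-neighbor index would back the real service, but the ranking semantics stay the same.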

### Model Training Pipeline

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/IModelTrainingService.cs
namespace StellaOps.BinaryIndex.ML;

public interface IModelTrainingService
{
    /// <summary>
    /// Train embedding model on function pairs.
    /// </summary>
    Task<TrainingResult> TrainAsync(
        IAsyncEnumerable<TrainingPair> trainingData,
        TrainingOptions options,
        IProgress<TrainingProgress>? progress = null,
        CancellationToken ct = default);

    /// <summary>
    /// Evaluate model on test set.
    /// </summary>
    Task<EvaluationResult> EvaluateAsync(
        IAsyncEnumerable<TrainingPair> testData,
        CancellationToken ct = default);

    /// <summary>
    /// Export trained model for inference.
    /// </summary>
    Task ExportModelAsync(
        string outputPath,
        ModelExportFormat format = ModelExportFormat.Onnx,
        CancellationToken ct = default);
}

public sealed record TrainingPair(
    EmbeddingInput FunctionA,
    EmbeddingInput FunctionB,
    bool IsSimilar,             // Ground truth: same function?
    decimal? SimilarityScore);  // Optional: how similar (0-1)

public sealed record TrainingOptions
{
    public EmbeddingModel Model { get; init; } = EmbeddingModel.CodeBertBinary;
    public int EmbeddingDimension { get; init; } = 768;
    public int BatchSize { get; init; } = 32;
    public int Epochs { get; init; } = 10;
    public double LearningRate { get; init; } = 1e-5;
    public double MarginLoss { get; init; } = 0.5; // Contrastive margin
    public string? PretrainedModelPath { get; init; }
}

public sealed record TrainingResult(
    string ModelPath,
    int TotalPairs,
    int Epochs,
    double FinalLoss,
    double ValidationAccuracy,
    TimeSpan TrainingTime);

public sealed record EvaluationResult(
    double Accuracy,
    double Precision,
    double Recall,
    double F1Score,
    double AucRoc,
    ImmutableArray<ConfusionEntry> ConfusionMatrix);
```
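`TrainingPair` construction from the Phase 2 corpus (task DCML-016) pairs variants of the same function as positives and cross-function samples as negatives. A hypothetical sketch, assuming the corpus maps each function name to its compiled variants and contains at least two functions:

```python
import itertools
import random

def make_training_pairs(corpus, negatives_per_positive=1, seed=0):
    """corpus: {function_name: [variant_code, ...]} where variants are the
    same source function built with different compilers/optimization levels.
    Returns (code_a, code_b, label) triples; label 1 = same function."""
    rng = random.Random(seed)  # seeded for reproducible pair sampling
    names = list(corpus)
    pairs = []
    for name in names:
        for a, b in itertools.combinations(corpus[name], 2):
            pairs.append((a, b, 1))  # positive: two variants of one function
            for _ in range(negatives_per_positive):
                other = rng.choice([n for n in names if n != name])
                pairs.append((a, rng.choice(corpus[other]), 0))
    return pairs
```

Keeping the positive:negative ratio explicit matters for the contrastive objective: too many easy negatives and the margin term dominates, too few and the model collapses toward "everything is similar".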

### ONNX Inference Engine

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/OnnxInferenceEngine.cs
namespace StellaOps.BinaryIndex.ML;

internal sealed class OnnxInferenceEngine : IEmbeddingService, IAsyncDisposable
{
    private readonly InferenceSession _session;
    private readonly ITokenizer _tokenizer;
    private readonly ILogger<OnnxInferenceEngine> _logger;

    public OnnxInferenceEngine(
        string modelPath,
        ITokenizer tokenizer,
        ILogger<OnnxInferenceEngine> logger)
    {
        var options = new SessionOptions
        {
            GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL,
            ExecutionMode = ExecutionMode.ORT_PARALLEL
        };

        _session = new InferenceSession(modelPath, options);
        _tokenizer = tokenizer;
        _logger = logger;
    }

    public async Task<FunctionEmbedding> GenerateEmbeddingAsync(
        EmbeddingInput input,
        EmbeddingOptions? options = null,
        CancellationToken ct = default)
    {
        var text = input.PreferredInput switch
        {
            EmbeddingInputType.DecompiledCode => input.DecompiledCode
                ?? throw new ArgumentException("DecompiledCode required"),
            EmbeddingInputType.SemanticGraph => SerializeGraph(input.SemanticGraph
                ?? throw new ArgumentException("SemanticGraph required")),
            EmbeddingInputType.Instructions => SerializeInstructions(input.InstructionBytes
                ?? throw new ArgumentException("InstructionBytes required")),
            _ => throw new ArgumentOutOfRangeException()
        };

        // Tokenize
        var tokens = _tokenizer.Tokenize(text, maxLength: 512);

        // Run inference
        var inputTensor = new DenseTensor<long>(tokens, [1, tokens.Length]);
        var inputs = new List<NamedOnnxValue>
        {
            NamedOnnxValue.CreateFromTensor("input_ids", inputTensor)
        };

        using var results = await Task.Run(() => _session.Run(inputs), ct);

        var outputTensor = results.First().AsTensor<float>();
        var embedding = outputTensor.ToArray();

        return new FunctionEmbedding(
            input.DecompiledCode?.GetHashCode().ToString() ?? "unknown",
            embedding,
            EmbeddingModel.CodeBertBinary,
            input.PreferredInput);
    }

    public decimal ComputeSimilarity(
        FunctionEmbedding a,
        FunctionEmbedding b,
        SimilarityMetric metric = SimilarityMetric.Cosine)
    {
        return metric switch
        {
            SimilarityMetric.Cosine => CosineSimilarity(a.Vector, b.Vector),
            SimilarityMetric.Euclidean => EuclideanSimilarity(a.Vector, b.Vector),
            SimilarityMetric.Manhattan => ManhattanSimilarity(a.Vector, b.Vector),
            _ => throw new ArgumentOutOfRangeException(nameof(metric))
        };
    }

    private static decimal CosineSimilarity(float[] a, float[] b)
    {
        var dotProduct = 0.0;
        var normA = 0.0;
        var normB = 0.0;

        for (var i = 0; i < a.Length; i++)
        {
            dotProduct += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
            return 0;

        return (decimal)(dotProduct / (Math.Sqrt(normA) * Math.Sqrt(normB)));
    }

    public ValueTask DisposeAsync()
    {
        _session.Dispose();
        return ValueTask.CompletedTask;
    }
}
```

### Ensemble Decision Engine

```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs
namespace StellaOps.BinaryIndex.Ensemble;

public interface IEnsembleDecisionEngine
{
    /// <summary>
    /// Compute final similarity using all available signals.
    /// </summary>
    Task<EnsembleResult> ComputeSimilarityAsync(
        FunctionAnalysis a,
        FunctionAnalysis b,
        EnsembleOptions? options = null,
        CancellationToken ct = default);
}

public sealed record FunctionAnalysis(
    string FunctionName,
    byte[]? InstructionFingerprint,     // Phase 1
    SemanticFingerprint? SemanticGraph, // Phase 1
    DecompiledFunction? Decompiled,     // Phase 4
    FunctionEmbedding? Embedding);      // Phase 4

public sealed record EnsembleOptions
{
    // Weight configuration (must sum to 1.0)
    public decimal InstructionWeight { get; init; } = 0.15m;
    public decimal SemanticGraphWeight { get; init; } = 0.25m;
    public decimal DecompiledWeight { get; init; } = 0.35m;
    public decimal EmbeddingWeight { get; init; } = 0.25m;

    // Confidence thresholds
    public decimal MinConfidence { get; init; } = 0.6m;
    public bool RequireAllSignals { get; init; } = false;
}

public sealed record EnsembleResult(
    decimal OverallSimilarity,
    MatchConfidence Confidence,
    ImmutableArray<SignalContribution> Contributions,
    string? Explanation);

public sealed record SignalContribution(
    string SignalName,
    decimal RawSimilarity,
    decimal Weight,
    decimal WeightedContribution,
    bool WasAvailable);
```
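The core arithmetic behind the ensemble is a weighted mean over whichever signals are present, with the weights of missing signals renormalized away (the `RequireAllSignals = false` path). A Python sketch with the default weights from `EnsembleOptions`:

```python
DEFAULT_WEIGHTS = {"instruction": 0.15, "semantic_graph": 0.25,
                   "decompiled": 0.35, "embedding": 0.25}

def ensemble_similarity(signals, weights=None):
    """signals: {name: similarity in [0,1] or None if unavailable}.
    Missing signals are dropped and the remaining weights renormalized."""
    weights = weights or DEFAULT_WEIGHTS
    available = {k: v for k, v in signals.items() if v is not None}
    if not available:
        return 0.0
    total_weight = sum(weights[k] for k in available)
    return sum(weights[k] * available[k] for k in available) / total_weight
```

For example, with all four signals at (0.8, 0.9, 0.7, 0.6) the score is 0.74; drop the decompiled signal (a failed decompilation) and the remaining three renormalize to 0.65 total weight, giving about 0.76.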

---

## Delivery Tracker

| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| **Decompiler Integration** | | | | | |
| 1 | DCML-001 | TODO | Phase 3 | Guild | Create `StellaOps.BinaryIndex.Decompiler` project |
| 2 | DCML-002 | TODO | DCML-001 | Guild | Define decompiled code model types |
| 3 | DCML-003 | TODO | DCML-002 | Guild | Implement Ghidra decompiler adapter |
| 4 | DCML-004 | TODO | DCML-003 | Guild | Implement C code parser (AST generation) |
| 5 | DCML-005 | TODO | DCML-004 | Guild | Implement AST comparison engine |
| 6 | DCML-006 | TODO | DCML-005 | Guild | Implement code normalizer |
| 7 | DCML-007 | TODO | DCML-006 | Guild | Implement semantic equivalence detector |
| 8 | DCML-008 | TODO | DCML-007 | Guild | Unit tests: Decompiler adapter |
| 9 | DCML-009 | TODO | DCML-007 | Guild | Unit tests: AST comparison |
| 10 | DCML-010 | TODO | DCML-009 | Guild | Integration tests: End-to-end decompiled comparison |
| **ML Embedding Pipeline** | | | | | |
| 11 | DCML-011 | TODO | Phase 2 | Guild | Create `StellaOps.BinaryIndex.ML` project |
| 12 | DCML-012 | TODO | DCML-011 | Guild | Define embedding model types |
| 13 | DCML-013 | TODO | DCML-012 | Guild | Implement code tokenizer (binary-aware BPE) |
| 14 | DCML-014 | TODO | DCML-013 | Guild | Set up ONNX Runtime inference engine |
| 15 | DCML-015 | TODO | DCML-014 | Guild | Implement embedding service |
| 16 | DCML-016 | TODO | DCML-015 | Guild | Create training data from corpus (positive/negative pairs) |
| 17 | DCML-017 | TODO | DCML-016 | Guild | Train CodeBERT-Binary model |
| 18 | DCML-018 | TODO | DCML-017 | Guild | Export model to ONNX format |
| 19 | DCML-019 | TODO | DCML-015 | Guild | Unit tests: Embedding generation |
| 20 | DCML-020 | TODO | DCML-018 | Guild | Evaluation: Model accuracy metrics |
| **Ensemble Integration** | | | | | |
| 21 | DCML-021 | TODO | DCML-010,020 | Guild | Create `StellaOps.BinaryIndex.Ensemble` project |
| 22 | DCML-022 | TODO | DCML-021 | Guild | Implement ensemble decision engine |
| 23 | DCML-023 | TODO | DCML-022 | Guild | Implement weight tuning (grid search) |
| 24 | DCML-024 | TODO | DCML-023 | Guild | Integrate ensemble into PatchDiffEngine |
| 25 | DCML-025 | TODO | DCML-024 | Guild | Integrate ensemble into DeltaSignatureMatcher |
| 26 | DCML-026 | TODO | DCML-025 | Guild | Unit tests: Ensemble decision logic |
| 27 | DCML-027 | TODO | DCML-026 | Guild | Integration tests: Full semantic diffing pipeline |
| 28 | DCML-028 | TODO | DCML-027 | Guild | Benchmark: Accuracy vs. baseline (Phase 1 only) |
| 29 | DCML-029 | TODO | DCML-028 | Guild | Benchmark: Latency impact |
| 30 | DCML-030 | TODO | DCML-029 | Guild | Documentation: ML model training guide |

---

## Task Details

### DCML-004: Implement C Code Parser

Parse Ghidra's decompiled C output into AST:

```csharp
internal sealed class DecompiledCodeParser
{
    public DecompiledAst Parse(string code)
    {
        // Use Tree-sitter or Roslyn-based C parser
        // Ghidra output is C-like but not standard C

        var tokens = Tokenize(code);
        var ast = BuildAst(tokens);

        return new DecompiledAst(
            ast,
            CountNodes(ast),
            ComputeDepth(ast),
            ExtractPatterns(ast));
    }

    private AstNode BuildAst(IList<Token> tokens)
    {
        var parser = new RecursiveDescentParser(tokens);
        return parser.ParseFunction();
    }

    private ImmutableArray<AstPattern> ExtractPatterns(AstNode root)
    {
        var patterns = new List<AstPattern>();

        // Detect common patterns
        patterns.AddRange(DetectLoopPatterns(root));
        patterns.AddRange(DetectBranchPatterns(root));
        patterns.AddRange(DetectAllocationPatterns(root));
        patterns.AddRange(DetectErrorHandlingPatterns(root));

        return [.. patterns];
    }

    private static IEnumerable<AstPattern> DetectLoopPatterns(AstNode root)
    {
        // Find: for loops, while loops, do-while
        // Classify: counted loop, sentinel loop, infinite loop
        foreach (var node in TraverseNodes(root))
        {
            if (node.Type == AstNodeType.For)
            {
                yield return new AstPattern(
                    PatternType.CountedLoop,
                    node,
                    AnalyzeForLoop(node));
            }
            else if (node.Type == AstNodeType.While)
            {
                yield return new AstPattern(
                    PatternType.ConditionalLoop,
                    node,
                    AnalyzeWhileLoop(node));
            }
        }
    }
}
```
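The pattern-detection pass is a plain tree walk with a type dispatch per node. A Python sketch of the loop-pattern case, using dictionary nodes in place of the `AstNode` record and plain strings in place of `PatternType`:

```python
def traverse(node):
    """Preorder walk over {"type": ..., "children": [...]} nodes."""
    yield node
    for child in node.get("children", []):
        yield from traverse(child)

def detect_loop_patterns(root):
    """Classify loops the way DetectLoopPatterns does: for loops as counted,
    while/do-while loops as conditional."""
    patterns = []
    for node in traverse(root):
        if node["type"] == "For":
            patterns.append("CountedLoop")
        elif node["type"] in ("While", "DoWhile"):
            patterns.append("ConditionalLoop")
    return patterns
```

The real classifier would additionally inspect the loop header to distinguish counted from sentinel and infinite loops; the walk-and-dispatch skeleton is the same.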

### DCML-017: Train CodeBERT-Binary Model

Training pipeline for function similarity:

```python
# tools/ml/train_codebert_binary.py
import torch
from transformers import RobertaTokenizer, RobertaModel
from torch.utils.data import DataLoader
import onnx

class CodeBertBinaryModel(torch.nn.Module):
    def __init__(self, pretrained_model="microsoft/codebert-base"):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(pretrained_model)
        self.projection = torch.nn.Linear(768, 768)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids, attention_mask=attention_mask)
        pooled = outputs.last_hidden_state[:, 0, :]  # [CLS] token
        projected = self.projection(pooled)
        return torch.nn.functional.normalize(projected, p=2, dim=1)


class ContrastiveLoss(torch.nn.Module):
    def __init__(self, margin=0.5):
        super().__init__()
        self.margin = margin

    def forward(self, embedding_a, embedding_b, label):
        distance = torch.nn.functional.pairwise_distance(embedding_a, embedding_b)

        # label=1: similar, label=0: dissimilar
        loss = label * distance.pow(2) + \
               (1 - label) * torch.clamp(self.margin - distance, min=0).pow(2)

        return loss.mean()


def train_model(train_dataloader, val_dataloader, epochs=10):
    model = CodeBertBinaryModel()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    criterion = ContrastiveLoss(margin=0.5)

    for epoch in range(epochs):
        model.train()
        total_loss = 0

        for batch in train_dataloader:
            optimizer.zero_grad()

            emb_a = model(batch['input_ids_a'], batch['attention_mask_a'])
            emb_b = model(batch['input_ids_b'], batch['attention_mask_b'])

            loss = criterion(emb_a, emb_b, batch['label'])
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        # Validation
        model.eval()
        val_accuracy = evaluate(model, val_dataloader)
        print(f"Epoch {epoch+1}: Loss={total_loss:.4f}, Val Acc={val_accuracy:.4f}")

    return model


def export_to_onnx(model, output_path):
    model.eval()
    dummy_input = torch.randint(0, 50000, (1, 512))
    dummy_mask = torch.ones(1, 512)

    torch.onnx.export(
        model,
        (dummy_input, dummy_mask),
        output_path,
        input_names=['input_ids', 'attention_mask'],
        output_names=['embedding'],
        dynamic_axes={
            'input_ids': {0: 'batch', 1: 'seq'},
            'attention_mask': {0: 'batch', 1: 'seq'},
            'embedding': {0: 'batch'}
        }
    )
```
|
||||
|
||||
### DCML-023: Implement Weight Tuning

Grid search for optimal ensemble weights:

```csharp
internal sealed class EnsembleWeightTuner
{
    public async Task<EnsembleOptions> TuneWeightsAsync(
        IAsyncEnumerable<LabeledPair> validationData,
        CancellationToken ct)
    {
        var bestOptions = EnsembleOptions.Default;
        var bestF1 = 0.0;

        // Grid search over weight combinations
        var weightCombinations = GenerateWeightCombinations(step: 0.05m);

        foreach (var weights in weightCombinations)
        {
            ct.ThrowIfCancellationRequested();

            var options = new EnsembleOptions
            {
                InstructionWeight = weights[0],
                SemanticGraphWeight = weights[1],
                DecompiledWeight = weights[2],
                EmbeddingWeight = weights[3]
            };

            var metrics = await EvaluateAsync(validationData, options, ct);

            if (metrics.F1Score > bestF1)
            {
                bestF1 = metrics.F1Score;
                bestOptions = options;
            }
        }

        return bestOptions;
    }

    private static IEnumerable<decimal[]> GenerateWeightCombinations(decimal step)
    {
        for (var w1 = 0m; w1 <= 1m; w1 += step)
        for (var w2 = 0m; w2 <= 1m - w1; w2 += step)
        for (var w3 = 0m; w3 <= 1m - w1 - w2; w3 += step)
        {
            var w4 = 1m - w1 - w2 - w3;
            if (w4 >= 0)
            {
                yield return [w1, w2, w3, w4];
            }
        }
    }
}
```

---

## Training Data Requirements

### Positive Pairs (Similar Functions)

| Source | Count | Description |
|--------|-------|-------------|
| Same function, different optimization | ~50,000 | O0 vs O2 vs O3 |
| Same function, different compiler | ~30,000 | GCC vs Clang |
| Same function, different version | ~100,000 | From corpus (Phase 2) |
| Same function, with patches | ~20,000 | Vulnerable vs fixed |

### Negative Pairs (Dissimilar Functions)

| Source | Count | Description |
|--------|-------|-------------|
| Random function pairs | ~100,000 | Random sampling |
| Similar-named different functions | ~50,000 | Hard negatives |
| Same library, different functions | ~50,000 | Medium negatives |

**Total training data:** ~400,000 labeled pairs

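The pair taxonomy above feeds directly into the contrastive training loop in DCML-017. A minimal sketch of assembling labeled pairs follows; the dict layout and field names here are illustrative assumptions, not the pipeline's actual schema:

```python
import itertools
import random


def build_pairs(variants_by_function, num_negatives=1, seed=42):
    """variants_by_function: {function_name: [disassembly_text, ...]} across builds."""
    rng = random.Random(seed)
    pairs = []
    names = list(variants_by_function)
    for name, variants in variants_by_function.items():
        # Positive pairs: every pair of variants of the same function (label=1).
        for a, b in itertools.combinations(variants, 2):
            pairs.append((a, b, 1))
        # Negative pairs: sample variants of other functions (label=0).
        for _ in range(num_negatives):
            other = rng.choice([n for n in names if n != name])
            pairs.append((variants[0], variants_by_function[other][0], 0))
    return pairs
```

Hard negatives (similar-named but different functions) would replace the random `rng.choice` with a name-similarity search.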
---

## Success Metrics

| Metric | Phase 1 Only | With Phase 4 | Target |
|--------|--------------|--------------|--------|
| Accuracy (optimized binaries) | 70% | 92% | 90%+ |
| Accuracy (obfuscated binaries) | 40% | 75% | 70%+ |
| False positive rate | 5% | 1.5% | <2% |
| False negative rate | 25% | 8% | <10% |
| Latency (per comparison) | 10ms | 150ms | <200ms |

---

## Resource Requirements

| Resource | Training | Inference |
|----------|----------|-----------|
| GPU | 1x V100 (32GB) or 4x T4 | Optional (CPU viable) |
| Memory | 64GB | 16GB |
| Storage | 100GB (training data) | 5GB (model) |
| Time | ~24 hours | <200ms per function |

---

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |

---

## Decisions & Risks

| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| ML model requires significant training data | Risk | Leverage corpus from Phase 2 |
| ONNX inference adds latency | Trade-off | Make ML optional, use for high-value comparisons |
| Decompiler output varies by Ghidra version | Risk | Pin Ghidra version, normalize output |
| Model may overfit to training library set | Risk | Diverse training data, regularization |
| GPU dependency for training | Constraint | Use cloud GPU, document CPU-only option |

---

## Next Checkpoints

- 2026-03-01: DCML-001 through DCML-010 (decompiler integration) complete
- 2026-03-15: DCML-011 through DCML-020 (ML pipeline) complete
- 2026-03-31: DCML-021 through DCML-030 (ensemble, benchmarks) complete

@@ -142,17 +142,17 @@ CREATE INDEX idx_hlc_state_updated ON scheduler.hlc_state(updated_at DESC);

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | HLC-001 | TODO | - | Guild | Create `StellaOps.HybridLogicalClock` project with Directory.Build.props integration |
| 2 | HLC-002 | TODO | HLC-001 | Guild | Implement `HlcTimestamp` record with comparison, parsing, serialization |
| 3 | HLC-003 | TODO | HLC-002 | Guild | Implement `HybridLogicalClock` class with Tick/Receive/Current |
| 4 | HLC-004 | TODO | HLC-003 | Guild | Implement `IHlcStateStore` interface and `InMemoryHlcStateStore` |
| 5 | HLC-005 | TODO | HLC-004 | Guild | Implement `PostgresHlcStateStore` with atomic update semantics |
| 6 | HLC-006 | TODO | HLC-003 | Guild | Add `HlcTimestampJsonConverter` for System.Text.Json serialization |
| 7 | HLC-007 | TODO | HLC-003 | Guild | Add `HlcTimestampTypeHandler` for Npgsql/Dapper |
| 8 | HLC-008 | TODO | HLC-005 | Guild | Write unit tests: tick monotonicity, receive merge, clock skew handling |
| 9 | HLC-009 | TODO | HLC-008 | Guild | Write integration tests: concurrent ticks, node restart recovery |
| 1 | HLC-001 | DONE | - | Guild | Create `StellaOps.HybridLogicalClock` project with Directory.Build.props integration |
| 2 | HLC-002 | DONE | HLC-001 | Guild | Implement `HlcTimestamp` record with comparison, parsing, serialization |
| 3 | HLC-003 | DONE | HLC-002 | Guild | Implement `HybridLogicalClock` class with Tick/Receive/Current |
| 4 | HLC-004 | DONE | HLC-003 | Guild | Implement `IHlcStateStore` interface and `InMemoryHlcStateStore` |
| 5 | HLC-005 | DONE | HLC-004 | Guild | Implement `PostgresHlcStateStore` with atomic update semantics |
| 6 | HLC-006 | DONE | HLC-003 | Guild | Add `HlcTimestampJsonConverter` for System.Text.Json serialization |
| 7 | HLC-007 | DONE | HLC-003 | Guild | Add `HlcTimestampTypeHandler` for Npgsql/Dapper |
| 8 | HLC-008 | DONE | HLC-005 | Guild | Write unit tests: tick monotonicity, receive merge, clock skew handling |
| 9 | HLC-009 | DONE | HLC-008 | Guild | Write integration tests: concurrent ticks, node restart recovery |
| 10 | HLC-010 | TODO | HLC-009 | Guild | Write benchmarks: tick throughput, memory allocation |
| 11 | HLC-011 | TODO | HLC-010 | Guild | Create `HlcServiceCollectionExtensions` for DI registration |
| 11 | HLC-011 | DONE | HLC-010 | Guild | Create `HlcServiceCollectionExtensions` for DI registration |
| 12 | HLC-012 | TODO | HLC-011 | Guild | Documentation: README.md, API docs, usage examples |

## Implementation Details

@@ -335,6 +335,7 @@ hlc_physical_time_offset_seconds{node_id} // Drift from wall clock

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
| 2026-01-05 | HLC-001 to HLC-011 implemented: core library, state stores, JSON/Dapper serializers, DI extensions, 56 unit tests all passing | Agent |
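For orientation, the Tick/Receive semantics behind HLC-003 can be sketched in a few lines. This is an illustrative Python model of the standard hybrid-logical-clock algorithm, not the C# `HybridLogicalClock` API:

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class HlcTimestamp:
    physical: int  # wall-clock milliseconds
    logical: int   # tie-breaking counter


class HybridLogicalClock:
    def __init__(self, now_ms=lambda: int(time.time() * 1000)):
        self._now_ms = now_ms
        self._last = HlcTimestamp(0, 0)

    def tick(self):
        # Local event: advance past both the wall clock and the last issued stamp.
        pt = self._now_ms()
        if pt > self._last.physical:
            self._last = HlcTimestamp(pt, 0)
        else:
            self._last = HlcTimestamp(self._last.physical, self._last.logical + 1)
        return self._last

    def receive(self, remote):
        # Merge a remote stamp: the result dominates local, remote, and wall clock.
        pt = self._now_ms()
        physical = max(pt, self._last.physical, remote.physical)
        if physical == self._last.physical == remote.physical:
            logical = max(self._last.logical, remote.logical) + 1
        elif physical == self._last.physical:
            logical = self._last.logical + 1
        elif physical == remote.physical:
            logical = remote.logical + 1
        else:
            logical = 0
        self._last = HlcTimestamp(physical, logical)
        return self._last
```

With a frozen wall clock, successive ticks still strictly increase via the logical counter, which is exactly what the HLC-008 monotonicity tests assert.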
## Next Checkpoints

@@ -466,16 +466,16 @@ internal static class ProveCommandGroup

| 4 | RPL-004 | TODO | RPL-003 | Replay Guild | Update `CommandHandlers.VerifyBundle.ReplayVerdictAsync()` to use service |
| 5 | RPL-005 | TODO | RPL-004 | Replay Guild | Unit tests: VerdictBuilder replay with fixtures |
| **DSSE Verification** |
| 6 | RPL-006 | TODO | - | Attestor Guild | Define `IDsseVerifier` interface in `StellaOps.Attestation` |
| 7 | RPL-007 | TODO | RPL-006 | Attestor Guild | Implement `DsseVerifier` using existing `DsseHelper` |
| 8 | RPL-008 | TODO | RPL-007 | CLI Guild | Wire `DsseVerifier` into CLI DI container |
| 9 | RPL-009 | TODO | RPL-008 | CLI Guild | Update `CommandHandlers.VerifyBundle.VerifyDsseSignatureAsync()` |
| 10 | RPL-010 | TODO | RPL-009 | Attestor Guild | Unit tests: DSSE verification with valid/invalid signatures |
| 6 | RPL-006 | DONE | - | Attestor Guild | Define `IDsseVerifier` interface in `StellaOps.Attestation` |
| 7 | RPL-007 | DONE | RPL-006 | Attestor Guild | Implement `DsseVerifier` using existing `DsseHelper` |
| 8 | RPL-008 | DONE | RPL-007 | CLI Guild | Wire `DsseVerifier` into CLI DI container |
| 9 | RPL-009 | DONE | RPL-008 | CLI Guild | Update `CommandHandlers.VerifyBundle.VerifyDsseSignatureAsync()` |
| 10 | RPL-010 | DONE | RPL-009 | Attestor Guild | Unit tests: DSSE verification with valid/invalid signatures |
| **ReplayProof Schema** |
| 11 | RPL-011 | TODO | - | Replay Guild | Create `ReplayProof` model in `StellaOps.Replay.Core` |
| 12 | RPL-012 | TODO | RPL-011 | Replay Guild | Implement `ToCompactString()` with canonical JSON + SHA-256 |
| 13 | RPL-013 | TODO | RPL-012 | Replay Guild | Update `stella verify --bundle` to output replay proof |
| 14 | RPL-014 | TODO | RPL-013 | Replay Guild | Unit tests: Replay proof generation and parsing |
| 11 | RPL-011 | DONE | - | Replay Guild | Create `ReplayProof` model in `StellaOps.Replay.Core` |
| 12 | RPL-012 | DONE | RPL-011 | Replay Guild | Implement `ToCompactString()` with canonical JSON + SHA-256 |
| 13 | RPL-013 | DONE | RPL-012 | Replay Guild | Update `stella verify --bundle` to output replay proof |
| 14 | RPL-014 | DONE | RPL-013 | Replay Guild | Unit tests: Replay proof generation and parsing |
| **stella prove Command** |
| 15 | RPL-015 | TODO | RPL-011 | CLI Guild | Create `ProveCommandGroup.cs` with command structure |
| 16 | RPL-016 | TODO | RPL-015 | CLI Guild | Implement `ITimelineQueryService` adapter for snapshot lookup |

@@ -506,6 +506,8 @@ internal static class ProveCommandGroup

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
| 2026-01-xx | Completed RPL-006 through RPL-010: IDsseVerifier interface, DsseVerifier implementation with ECDSA/RSA support, CLI integration, 12 unit tests all passing | Implementer |
| 2026-01-xx | Completed RPL-011 through RPL-014: ReplayProof model, ToCompactString with SHA-256, ToCanonicalJson, FromExecutionResult factory, 14 unit tests all passing | Implementer |
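The canonical-JSON-plus-SHA-256 scheme behind `ToCompactString()` (RPL-012) can be sketched as follows. The `replay:sha256:` prefix is a hypothetical rendering for illustration, not the actual output format:

```python
import hashlib
import json


def to_canonical_json(obj) -> str:
    # Canonical form: sorted keys, no insignificant whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def to_compact_string(proof: dict) -> str:
    # Digest the canonical bytes so the compact string is key-order independent.
    canonical = to_canonical_json(proof)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"replay:sha256:{digest}"
```

The point of canonicalization is determinism: two semantically equal proofs serialize to the same bytes and therefore the same digest.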
---

@@ -289,28 +289,28 @@ public sealed class BatchSnapshotService

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | SQC-001 | TODO | HLC lib | Guild | Add StellaOps.HybridLogicalClock reference to Scheduler projects |
| 2 | SQC-002 | TODO | SQC-001 | Guild | Create migration: `scheduler.scheduler_log` table |
| 3 | SQC-003 | TODO | SQC-002 | Guild | Create migration: `scheduler.batch_snapshot` table |
| 4 | SQC-004 | TODO | SQC-002 | Guild | Create migration: `scheduler.chain_heads` table |
| 5 | SQC-005 | TODO | SQC-004 | Guild | Implement `ISchedulerLogRepository` interface |
| 6 | SQC-006 | TODO | SQC-005 | Guild | Implement `PostgresSchedulerLogRepository` |
| 7 | SQC-007 | TODO | SQC-004 | Guild | Implement `IChainHeadRepository` and Postgres implementation |
| 8 | SQC-008 | TODO | SQC-006 | Guild | Implement `SchedulerChainLinking` static class |
| 9 | SQC-009 | TODO | SQC-008 | Guild | Implement `HlcSchedulerEnqueueService` |
| 10 | SQC-010 | TODO | SQC-009 | Guild | Implement `HlcSchedulerDequeueService` |
| 11 | SQC-011 | TODO | SQC-010 | Guild | Update Redis queue adapter to include HLC in message |
| 12 | SQC-012 | TODO | SQC-010 | Guild | Update NATS queue adapter to include HLC in message |
| 13 | SQC-013 | TODO | SQC-006 | Guild | Implement `BatchSnapshotService` |
| 14 | SQC-014 | TODO | SQC-013 | Guild | Add DSSE signing integration for batch snapshots |
| 15 | SQC-015 | TODO | SQC-008 | Guild | Implement chain verification: `VerifyChainIntegrity()` |
| 16 | SQC-016 | TODO | SQC-015 | Guild | Write unit tests: chain linking, HLC ordering |
| 17 | SQC-017 | TODO | SQC-016 | Guild | Write integration tests: enqueue/dequeue with chain |
| 18 | SQC-018 | TODO | SQC-017 | Guild | Write determinism tests: same input -> same chain |
| 19 | SQC-019 | TODO | SQC-018 | Guild | Update existing JobRepository to use HLC ordering optionally |
| 20 | SQC-020 | TODO | SQC-019 | Guild | Feature flag: `SchedulerOptions.EnableHlcOrdering` |
| 21 | SQC-021 | TODO | SQC-020 | Guild | Migration guide: enabling HLC on existing deployments |
| 22 | SQC-022 | TODO | SQC-021 | Guild | Metrics: `scheduler_hlc_enqueues_total`, `scheduler_chain_verifications_total` |
| 1 | SQC-001 | DONE | HLC lib | Guild | Add StellaOps.HybridLogicalClock reference to Scheduler projects |
| 2 | SQC-002 | DONE | SQC-001 | Guild | Create migration: `scheduler.scheduler_log` table |
| 3 | SQC-003 | DONE | SQC-002 | Guild | Create migration: `scheduler.batch_snapshot` table |
| 4 | SQC-004 | DONE | SQC-002 | Guild | Create migration: `scheduler.chain_heads` table |
| 5 | SQC-005 | DONE | SQC-004 | Guild | Implement `ISchedulerLogRepository` interface |
| 6 | SQC-006 | DONE | SQC-005 | Guild | Implement `PostgresSchedulerLogRepository` |
| 7 | SQC-007 | DONE | SQC-004 | Guild | Implement `IChainHeadRepository` and Postgres implementation |
| 8 | SQC-008 | DONE | SQC-006 | Guild | Implement `SchedulerChainLinking` static class |
| 9 | SQC-009 | DONE | SQC-008 | Guild | Implement `HlcSchedulerEnqueueService` |
| 10 | SQC-010 | DONE | SQC-009 | Guild | Implement `HlcSchedulerDequeueService` |
| 11 | SQC-011 | DONE | SQC-010 | Guild | Update Redis queue adapter to include HLC in message |
| 12 | SQC-012 | DONE | SQC-010 | Guild | Update NATS queue adapter to include HLC in message |
| 13 | SQC-013 | DONE | SQC-006 | Guild | Implement `BatchSnapshotService` |
| 14 | SQC-014 | DONE | SQC-013 | Guild | Add DSSE signing integration for batch snapshots |
| 15 | SQC-015 | DONE | SQC-008 | Guild | Implement chain verification: `VerifyChainIntegrity()` |
| 16 | SQC-016 | DONE | SQC-015 | Guild | Write unit tests: chain linking, HLC ordering |
| 17 | SQC-017 | DONE | SQC-016 | Guild | Write integration tests: enqueue/dequeue with chain |
| 18 | SQC-018 | DONE | SQC-017 | Guild | Write determinism tests: same input -> same chain |
| 19 | SQC-019 | DONE | SQC-018 | Guild | Update existing JobRepository to use HLC ordering optionally |
| 20 | SQC-020 | DONE | SQC-019 | Guild | Feature flag: `SchedulerOptions.EnableHlcOrdering` |
| 21 | SQC-021 | DONE | SQC-020 | Guild | Migration guide: enabling HLC on existing deployments |
| 22 | SQC-022 | DONE | SQC-021 | Guild | Metrics: `scheduler_hlc_enqueues_total`, `scheduler_chain_verifications_total` |

## Chain Verification

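The hash-chain discipline that `VerifyChainIntegrity()` (SQC-015) enforces can be sketched generically: each log row stores the hash of its predecessor, so tampering with any earlier payload invalidates every later link. Field names below are illustrative, not the `scheduler_log` schema:

```python
import hashlib
import json


def link_hash(prev_hash: str, payload: dict) -> str:
    # Hash over the previous link plus the canonical payload bytes.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def build_chain(payloads, genesis="0" * 64):
    prev, out = genesis, []
    for p in payloads:
        out.append({"prev_hash": prev, "payload": p})
        prev = link_hash(prev, p)
    return out


def verify_chain_integrity(entries, genesis="0" * 64) -> bool:
    # Recompute each link and check it matches the stored prev_hash.
    prev = genesis
    for entry in entries:
        if entry["prev_hash"] != prev:
            return False
        prev = link_hash(prev, entry["payload"])
    return True
```

Determinism tests (SQC-018) fall out of this shape for free: the same payload sequence always yields byte-identical links.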
@@ -419,6 +419,20 @@ public sealed class SchedulerOptions

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
| 2026-01-06 | SQC-001: Added HLC and CanonicalJson references to Scheduler.Persistence and Scheduler.Queue projects | Agent |
| 2026-01-06 | SQC-002-004: Created migration 002_hlc_queue_chain.sql with scheduler_log, batch_snapshot, chain_heads tables | Agent |
| 2026-01-06 | SQC-005-008: Implemented SchedulerChainLinking, ISchedulerLogRepository, PostgresSchedulerLogRepository, IChainHeadRepository, PostgresChainHeadRepository | Agent |
| 2026-01-06 | SQC-009: Implemented HlcSchedulerEnqueueService with chain linking and idempotency | Agent |
| 2026-01-06 | SQC-010: Implemented HlcSchedulerDequeueService with HLC-ordered retrieval and cursor pagination | Agent |
| 2026-01-06 | SQC-013: Implemented BatchSnapshotService with audit anchoring and optional DSSE signing | Agent |
| 2026-01-06 | SQC-015: Implemented SchedulerChainVerifier for chain integrity verification | Agent |
| 2026-01-06 | SQC-020: Added SchedulerHlcOptions with EnableHlcOrdering, DualWriteMode, VerifyOnDequeue flags | Agent |
| 2026-01-06 | SQC-022: Implemented HlcSchedulerMetrics with enqueue, dequeue, verification, and snapshot metrics | Agent |
| 2026-01-06 | Added HlcSchedulerServiceCollectionExtensions for DI registration | Agent |
| 2026-01-06 | SQC-011-012: Verified Redis and NATS adapters already have HLC support (IHybridLogicalClock injection, Tick(), header storage) | Agent |
| 2026-01-06 | SQC-021: Created HLC migration guide at docs/modules/scheduler/hlc-migration-guide.md | Agent |
| 2026-01-06 | SQC-014: Implemented BatchSnapshotDsseSigner with HMAC-SHA256 signing, PAE encoding, and verification | Agent |
| 2026-01-06 | SQC-019: Updated JobRepository with optional HLC ordering via JobRepositoryOptions; GetScheduledJobsAsync and GetByStatusAsync now join with scheduler_log when enabled | Agent |
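The DSSE signing noted for SQC-014 rests on Pre-Authentication Encoding (PAE) from the DSSE specification, which binds the payload type into the signed bytes. A compact sketch; the HMAC keying and function names are illustrative, not the `BatchSnapshotDsseSigner` API:

```python
import hashlib
import hmac


def pae(payload_type: str, payload: bytes) -> bytes:
    # DSSE Pre-Authentication Encoding:
    # "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
    t = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(t)).encode(), t,
        str(len(payload)).encode(), payload,
    ])


def sign_snapshot(key: bytes, payload_type: str, body: bytes) -> str:
    # HMAC-SHA256 over the PAE bytes, hex-encoded.
    return hmac.new(key, pae(payload_type, body), hashlib.sha256).hexdigest()
```

Signing the PAE rather than the raw body prevents an attacker from replaying a signature under a different payload type.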

## Next Checkpoints

@@ -632,17 +632,17 @@ public sealed class FacetDriftVexEmitter

| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| **Drift Engine** |
| 1 | QTA-001 | TODO | FCT models | Facet Guild | Define `IFacetDriftEngine` interface |
| 2 | QTA-002 | TODO | QTA-001 | Facet Guild | Define `FacetDriftReport` model |
| 3 | QTA-003 | TODO | QTA-002 | Facet Guild | Implement file diff computation (added/removed/modified) |
| 4 | QTA-004 | TODO | QTA-003 | Facet Guild | Implement allowlist glob filtering |
| 5 | QTA-005 | TODO | QTA-004 | Facet Guild | Implement drift score calculation |
| 6 | QTA-006 | TODO | QTA-005 | Facet Guild | Implement quota evaluation logic |
| 7 | QTA-007 | TODO | QTA-006 | Facet Guild | Unit tests: Drift computation with fixtures |
| 8 | QTA-008 | TODO | QTA-007 | Facet Guild | Unit tests: Quota evaluation edge cases |
| 1 | QTA-001 | DONE | FCT models | Facet Guild | Define `IFacetDriftEngine` interface |
| 2 | QTA-002 | DONE | QTA-001 | Facet Guild | Define `FacetDriftReport` model |
| 3 | QTA-003 | DONE | QTA-002 | Facet Guild | Implement file diff computation (added/removed/modified) |
| 4 | QTA-004 | DONE | QTA-003 | Facet Guild | Implement allowlist glob filtering |
| 5 | QTA-005 | DONE | QTA-004 | Facet Guild | Implement drift score calculation |
| 6 | QTA-006 | DONE | QTA-005 | Facet Guild | Implement quota evaluation logic |
| 7 | QTA-007 | DONE | QTA-006 | Facet Guild | Unit tests: Drift computation with fixtures |
| 8 | QTA-008 | DONE | QTA-007 | Facet Guild | Unit tests: Quota evaluation edge cases |
| **Quota Enforcement** |
| 9 | QTA-009 | TODO | QTA-006 | Policy Guild | Create `FacetQuotaGate` class |
| 10 | QTA-010 | TODO | QTA-009 | Policy Guild | Integrate with `IGateEvaluator` pipeline |
| 9 | QTA-009 | DONE | QTA-006 | Policy Guild | Create `FacetQuotaGate` class |
| 10 | QTA-010 | DONE | QTA-009 | Policy Guild | Integrate with `IGateEvaluator` pipeline |
| 11 | QTA-011 | TODO | QTA-010 | Policy Guild | Add `FacetQuotaEnabled` to policy options |
| 12 | QTA-012 | TODO | QTA-011 | Policy Guild | Create `IFacetSealStore` for baseline lookups |
| 13 | QTA-013 | TODO | QTA-012 | Policy Guild | Implement Postgres storage for facet seals |
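One plausible shape for the drift score in QTA-005 is to normalize added, removed, and modified counts against the union of the baseline and current file sets. This is an assumption for illustration; the actual `FacetDriftDetector` formula may differ:

```python
def drift_score(baseline: set[str], current: set[str], modified: set[str]) -> float:
    """Fraction of the facet affected by drift: 0.0 = sealed, 1.0 = fully drifted."""
    added = current - baseline
    removed = baseline - current
    changed = len(added) + len(removed) + len(modified)
    denom = max(len(baseline | current), 1)  # guard against empty facets
    return min(changed / denom, 1.0)
```

A quota gate would then compare this score (after allowlist filtering, QTA-004) against a configured threshold.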
@@ -678,6 +678,10 @@ public sealed class FacetDriftVexEmitter

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | QTA-001 to QTA-006 already implemented in FacetDriftDetector.cs | Agent |
| 2026-01-06 | QTA-007/008: Created StellaOps.Facet.Tests with 18 passing tests | Agent |
| 2026-01-06 | QTA-009: Created FacetQuotaGate in StellaOps.Policy.Gates | Agent |
| 2026-01-06 | QTA-010: Created FacetQuotaGateServiceCollectionExtensions for DI/registry integration | Agent |
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |

---

@@ -337,27 +337,27 @@ public sealed class ConflictResolver

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | OMP-001 | TODO | SQC lib | Guild | Create `StellaOps.AirGap.Sync` library project |
| 2 | OMP-002 | TODO | OMP-001 | Guild | Implement `OfflineHlcManager` for local offline enqueue |
| 3 | OMP-003 | TODO | OMP-002 | Guild | Implement `IOfflineJobLogStore` and file-based store |
| 4 | OMP-004 | TODO | OMP-003 | Guild | Implement `HlcMergeService` with total order merge |
| 5 | OMP-005 | TODO | OMP-004 | Guild | Implement `ConflictResolver` for edge cases |
| 6 | OMP-006 | TODO | OMP-005 | Guild | Implement `AirGapSyncService` for bundle import |
| 7 | OMP-007 | TODO | OMP-006 | Guild | Define `AirGapBundle` format (JSON schema) |
| 8 | OMP-008 | TODO | OMP-007 | Guild | Implement bundle export: `AirGapBundleExporter` |
| 9 | OMP-009 | TODO | OMP-008 | Guild | Implement bundle import: `AirGapBundleImporter` |
| 10 | OMP-010 | TODO | OMP-009 | Guild | Add DSSE signing for bundle integrity |
| 11 | OMP-011 | TODO | OMP-006 | Guild | Integrate with Router transport layer |
| 12 | OMP-012 | TODO | OMP-011 | Guild | Update `stella airgap export` CLI command |
| 13 | OMP-013 | TODO | OMP-012 | Guild | Update `stella airgap import` CLI command |
| 1 | OMP-001 | DONE | SQC lib | Guild | Create `StellaOps.AirGap.Sync` library project |
| 2 | OMP-002 | DONE | OMP-001 | Guild | Implement `OfflineHlcManager` for local offline enqueue |
| 3 | OMP-003 | DONE | OMP-002 | Guild | Implement `IOfflineJobLogStore` and file-based store |
| 4 | OMP-004 | DONE | OMP-003 | Guild | Implement `HlcMergeService` with total order merge |
| 5 | OMP-005 | DONE | OMP-004 | Guild | Implement `ConflictResolver` for edge cases |
| 6 | OMP-006 | DONE | OMP-005 | Guild | Implement `AirGapSyncService` for bundle import |
| 7 | OMP-007 | DONE | OMP-006 | Guild | Define `AirGapBundle` format (JSON schema) |
| 8 | OMP-008 | DONE | OMP-007 | Guild | Implement bundle export: `AirGapBundleExporter` |
| 9 | OMP-009 | DONE | OMP-008 | Guild | Implement bundle import: `AirGapBundleImporter` |
| 10 | OMP-010 | DONE | OMP-009 | Guild | Add DSSE signing for bundle integrity |
| 11 | OMP-011 | DONE | OMP-006 | Guild | Integrate with Router transport layer |
| 12 | OMP-012 | DONE | OMP-011 | Guild | Update `stella airgap export` CLI command |
| 13 | OMP-013 | DONE | OMP-012 | Guild | Update `stella airgap import` CLI command |
| 14 | OMP-014 | TODO | OMP-004 | Guild | Write unit tests: merge algorithm correctness |
| 15 | OMP-015 | TODO | OMP-014 | Guild | Write unit tests: duplicate detection |
| 16 | OMP-016 | TODO | OMP-015 | Guild | Write unit tests: conflict resolution |
| 17 | OMP-017 | TODO | OMP-016 | Guild | Write integration tests: offline -> online sync |
| 18 | OMP-018 | TODO | OMP-017 | Guild | Write integration tests: multi-node merge |
| 19 | OMP-019 | TODO | OMP-018 | Guild | Write determinism tests: same bundles -> same result |
| 20 | OMP-020 | TODO | OMP-019 | Guild | Metrics: `airgap_sync_total`, `airgap_merge_conflicts_total` |
| 21 | OMP-021 | TODO | OMP-020 | Guild | Documentation: offline operations guide |
| 20 | OMP-020 | DONE | OMP-019 | Guild | Metrics: `airgap_sync_total`, `airgap_merge_conflicts_total` |
| 21 | OMP-021 | DONE | OMP-020 | Guild | Documentation: offline operations guide |
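The total-order merge in `HlcMergeService` (OMP-004) can be sketched as a k-way merge keyed on (HLC timestamp, node id), with duplicate suppression for jobs delivered via multiple bundles. Entry shapes here are illustrative, not the actual `AirGapBundle` schema:

```python
import heapq


def merge_job_logs(*logs):
    """Deterministically merge per-node job logs (each already HLC-ordered).

    Ties on the HLC timestamp break on node_id, giving a total order;
    a job seen in more than one bundle is kept only once.
    """
    merged, seen = [], set()
    for entry in heapq.merge(*logs, key=lambda e: (e["hlc"], e["node_id"])):
        if entry["job_id"] in seen:
            continue  # duplicate delivered via a second bundle
        seen.add(entry["job_id"])
        merged.append(entry)
    return merged
```

Because the sort key is a pure function of the entries, merging the same bundles in any grouping yields the same result, which is what the OMP-019 determinism tests would check.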

## Test Scenarios

@@ -436,6 +436,16 @@ airgap_last_sync_timestamp{node_id}

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
| 2026-01-06 | OMP-001: Created StellaOps.AirGap.Sync library project with HLC, Canonical.Json, Scheduler.Models dependencies | Agent |
| 2026-01-06 | OMP-002-003: Implemented OfflineHlcManager and FileBasedOfflineJobLogStore for offline enqueue | Agent |
| 2026-01-06 | OMP-004-005: Implemented HlcMergeService with total order merge and ConflictResolver | Agent |
| 2026-01-06 | OMP-006: Implemented AirGapSyncService for bundle import with idempotency and chain recomputation | Agent |
| 2026-01-06 | OMP-007-009: Defined AirGapBundle models and implemented AirGapBundleExporter/Importer with validation | Agent |
| 2026-01-06 | OMP-010: Added manifest digest computation for bundle integrity (DSSE signing prepared via delegate) | Agent |
| 2026-01-06 | OMP-020: Implemented AirGapSyncMetrics with counters for exports, imports, syncs, duplicates, conflicts | Agent |
| 2026-01-06 | OMP-011: Created IJobSyncTransport, FileBasedJobSyncTransport, RouterJobSyncTransport for transport abstraction | Agent |
| 2026-01-06 | OMP-012-013: Added `stella airgap jobs export/import/list` CLI commands with handlers | Agent |
| 2026-01-06 | OMP-021: Created docs/airgap/job-sync-offline.md with CLI usage, bundle format, and runbook | Agent |

## Next Checkpoints

@@ -0,0 +1,775 @@

# Sprint 20260106_001_001_LB - Determinization: Core Models and Types

## Topic & Scope

Create the foundational models and types for the Determinization subsystem. This implements the core data structures from the advisory: `pending_determinization` state, `SignalState<T>` wrapper, `UncertaintyScore`, and `ObservationDecay`.

- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Determinization/`
- **Evidence:** New library project, model classes, unit tests

## Problem Statement

Current state tracking for CVEs:

- VEX has 4 states (`Affected`, `NotAffected`, `Fixed`, `UnderInvestigation`)
- Unknowns tracked separately via `Unknown` entity in Policy.Unknowns
- No unified "observation state" for CVE lifecycle
- Signal absence (EPSS null) indistinguishable from "not queried"

Advisory requires:

- `pending_determinization` as first-class observation state
- `SignalState<T>` distinguishing `NotQueried` vs `Queried(null)` vs `Queried(value)`
- `UncertaintyScore` measuring knowledge completeness (not code entropy)
- `ObservationDecay` tracking evidence staleness with configurable half-life

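Of these, `ObservationDecay` is the most self-contained: evidence freshness follows an exponential half-life curve, dropping to 0.5 after one half-life. A sketch of the decay function (the threshold wiring and refresh triggering live elsewhere):

```python
from datetime import datetime, timedelta, timezone


def observation_decay(observed_at, now, half_life_days=30.0):
    """Freshness in (0, 1]: 1.0 at observation time, 0.5 after one half-life."""
    age_days = max((now - observed_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)
```

When this value falls below a configured threshold, the observation would transition to `StaleRequiresRefresh`.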
## Dependencies & Concurrency

- **Depends on:** None (foundational library)
- **Blocks:** SPRINT_20260106_001_002_LB (scoring), SPRINT_20260106_001_003_POLICY (gates)
- **Parallel safe:** New library; no cross-module conflicts

## Documentation Prerequisites

- docs/modules/policy/determinization-architecture.md
- src/Policy/AGENTS.md
- Product Advisory: "Unknown CVEs: graceful placeholders, not blockers"

## Technical Design

### Project Structure

```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── StellaOps.Policy.Determinization.csproj
├── Models/
│   ├── ObservationState.cs
│   ├── SignalState.cs
│   ├── SignalQueryStatus.cs
│   ├── SignalSnapshot.cs
│   ├── UncertaintyScore.cs
│   ├── UncertaintyTier.cs
│   ├── SignalGap.cs
│   ├── ObservationDecay.cs
│   ├── GuardRails.cs
│   ├── DeterminizationContext.cs
│   └── DeterminizationResult.cs
├── Evidence/
│   ├── EpssEvidence.cs          # Re-export or reference Scanner.Core
│   ├── VexClaimSummary.cs
│   ├── ReachabilityEvidence.cs
│   ├── RuntimeEvidence.cs
│   ├── BackportEvidence.cs
│   ├── SbomLineageEvidence.cs
│   └── CvssEvidence.cs
└── GlobalUsings.cs
```

### ObservationState Enum

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Observation state for CVE tracking, independent of VEX status.
/// Allows a CVE to be "Affected" (VEX) but "PendingDeterminization" (observation).
/// </summary>
public enum ObservationState
{
    /// <summary>
    /// Initial state: CVE discovered but evidence incomplete.
    /// Triggers guardrail-based policy evaluation.
    /// </summary>
    PendingDeterminization = 0,

    /// <summary>
    /// Evidence sufficient for confident determination.
    /// Normal policy evaluation applies.
    /// </summary>
    Determined = 1,

    /// <summary>
    /// Multiple signals conflict (K4 Conflict state).
    /// Requires human review regardless of confidence.
    /// </summary>
    Disputed = 2,

    /// <summary>
    /// Evidence decayed below threshold; needs refresh.
    /// Auto-triggered when decay > threshold.
    /// </summary>
    StaleRequiresRefresh = 3,

    /// <summary>
    /// Manually flagged for review.
    /// Bypasses automatic determinization.
    /// </summary>
    ManualReviewRequired = 4,

    /// <summary>
    /// CVE suppressed/ignored by policy exception.
    /// Evidence tracking continues but decisions skip.
    /// </summary>
    Suppressed = 5
}
```

### SignalState&lt;T&gt; Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Wraps a signal value with query status metadata.
/// Distinguishes between: not queried, queried with value, queried but absent, query failed.
/// </summary>
/// <typeparam name="T">The signal evidence type.</typeparam>
public sealed record SignalState<T>
{
    /// <summary>Status of the signal query.</summary>
    public required SignalQueryStatus Status { get; init; }

    /// <summary>Signal value if Status is Queried and value exists.</summary>
    public T? Value { get; init; }

    /// <summary>When the signal was last queried (UTC).</summary>
    public DateTimeOffset? QueriedAt { get; init; }

    /// <summary>Reason for failure if Status is Failed.</summary>
    public string? FailureReason { get; init; }

    /// <summary>Source that provided the value (feed ID, issuer, etc.).</summary>
    public string? Source { get; init; }

    /// <summary>Whether this signal contributes to uncertainty (true if not queried or failed).</summary>
    public bool ContributesToUncertainty =>
        Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;

    /// <summary>Whether this signal has a usable value.</summary>
    public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;

    /// <summary>Creates a NotQueried signal state.</summary>
    public static SignalState<T> NotQueried() => new()
    {
        Status = SignalQueryStatus.NotQueried
    };

    /// <summary>Creates a Queried signal state with a value.</summary>
    public static SignalState<T> WithValue(T value, DateTimeOffset queriedAt, string? source = null) => new()
    {
        Status = SignalQueryStatus.Queried,
        Value = value,
        QueriedAt = queriedAt,
        Source = source
    };

    /// <summary>Creates a Queried signal state with null (queried but absent).</summary>
    public static SignalState<T> Absent(DateTimeOffset queriedAt, string? source = null) => new()
    {
        Status = SignalQueryStatus.Queried,
        Value = default,
        QueriedAt = queriedAt,
        Source = source
    };

    /// <summary>Creates a Failed signal state.</summary>
    public static SignalState<T> Failed(string reason) => new()
    {
        Status = SignalQueryStatus.Failed,
        FailureReason = reason
    };
}

/// <summary>
/// Query status for a signal source.
/// </summary>
public enum SignalQueryStatus
{
    /// <summary>Signal source not yet queried.</summary>
    NotQueried = 0,

    /// <summary>Signal source queried; value may be present or absent.</summary>
    Queried = 1,

    /// <summary>Signal query failed (timeout, network, parse error).</summary>
    Failed = 2
}
```

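The key semantic point of the wrapper is that "queried but absent" is knowledge, while "not queried" and "failed" are uncertainty. A condensed, self-contained sketch of that distinction (timestamps, sources, and the `EpssEvidence` payload type are omitted; the factory signatures here are simplified relative to the record above):

```csharp
using System;

public enum SignalQueryStatus { NotQueried = 0, Queried = 1, Failed = 2 }

// Condensed sketch of SignalState<T>; QueriedAt/Source omitted for brevity.
public sealed record SignalState<T>
{
    public required SignalQueryStatus Status { get; init; }
    public T? Value { get; init; }
    public string? FailureReason { get; init; }

    public bool ContributesToUncertainty =>
        Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;

    public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;

    public static SignalState<T> NotQueried() => new() { Status = SignalQueryStatus.NotQueried };
    public static SignalState<T> WithValue(T value) => new() { Status = SignalQueryStatus.Queried, Value = value };
    public static SignalState<T> Absent() => new() { Status = SignalQueryStatus.Queried, Value = default };
    public static SignalState<T> Failed(string reason) => new() { Status = SignalQueryStatus.Failed, FailureReason = reason };
}

public static class Demo
{
    public static void Main()
    {
        // Queried-but-absent is NOT uncertainty: we asked, and the answer was "no data".
        Console.WriteLine(SignalState<double?>.Absent().ContributesToUncertainty);

        // A failed query DOES contribute to uncertainty.
        Console.WriteLine(SignalState<double?>.Failed("feed timeout").ContributesToUncertainty);
    }
}
```

This is why `UncertaintyScore` below counts only `NotQueried`/`Failed` signals as gaps, not absent values.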
### SignalSnapshot Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Immutable snapshot of all signals for a CVE observation at a point in time.
/// </summary>
public sealed record SignalSnapshot
{
    /// <summary>CVE identifier (e.g., CVE-2026-12345).</summary>
    public required string CveId { get; init; }

    /// <summary>Subject component (PURL).</summary>
    public required string SubjectPurl { get; init; }

    /// <summary>Snapshot capture time (UTC).</summary>
    public required DateTimeOffset CapturedAt { get; init; }

    /// <summary>EPSS score signal.</summary>
    public required SignalState<EpssEvidence> Epss { get; init; }

    /// <summary>VEX claim signal.</summary>
    public required SignalState<VexClaimSummary> Vex { get; init; }

    /// <summary>Reachability determination signal.</summary>
    public required SignalState<ReachabilityEvidence> Reachability { get; init; }

    /// <summary>Runtime observation signal (eBPF, dyld, ETW).</summary>
    public required SignalState<RuntimeEvidence> Runtime { get; init; }

    /// <summary>Fix backport detection signal.</summary>
    public required SignalState<BackportEvidence> Backport { get; init; }

    /// <summary>SBOM lineage signal.</summary>
    public required SignalState<SbomLineageEvidence> SbomLineage { get; init; }

    /// <summary>Known Exploited Vulnerability flag.</summary>
    public required SignalState<bool> Kev { get; init; }

    /// <summary>CVSS score signal.</summary>
    public required SignalState<CvssEvidence> Cvss { get; init; }

    /// <summary>
    /// Creates an empty snapshot with all signals in NotQueried state.
    /// </summary>
    public static SignalSnapshot Empty(string cveId, string subjectPurl, DateTimeOffset capturedAt) => new()
    {
        CveId = cveId,
        SubjectPurl = subjectPurl,
        CapturedAt = capturedAt,
        Epss = SignalState<EpssEvidence>.NotQueried(),
        Vex = SignalState<VexClaimSummary>.NotQueried(),
        Reachability = SignalState<ReachabilityEvidence>.NotQueried(),
        Runtime = SignalState<RuntimeEvidence>.NotQueried(),
        Backport = SignalState<BackportEvidence>.NotQueried(),
        SbomLineage = SignalState<SbomLineageEvidence>.NotQueried(),
        Kev = SignalState<bool>.NotQueried(),
        Cvss = SignalState<CvssEvidence>.NotQueried()
    };
}
```

### UncertaintyScore Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Measures knowledge completeness for a CVE observation.
/// High entropy (close to 1.0) means many signals are missing.
/// Low entropy (close to 0.0) means comprehensive evidence.
/// </summary>
public sealed record UncertaintyScore
{
    /// <summary>Entropy value [0.0-1.0]. Higher = more uncertain.</summary>
    public required double Entropy { get; init; }

    /// <summary>Completeness value [0.0-1.0]. Higher = more complete. (1 - Entropy)</summary>
    public double Completeness => 1.0 - Entropy;

    /// <summary>Signals that are missing or failed.</summary>
    public required ImmutableArray<SignalGap> MissingSignals { get; init; }

    /// <summary>Weighted sum of present signals.</summary>
    public required double WeightedEvidenceSum { get; init; }

    /// <summary>Maximum possible weighted sum (all signals present).</summary>
    public required double MaxPossibleWeight { get; init; }

    /// <summary>Tier classification based on entropy.</summary>
    public UncertaintyTier Tier => Entropy switch
    {
        <= 0.2 => UncertaintyTier.VeryLow,
        <= 0.4 => UncertaintyTier.Low,
        <= 0.6 => UncertaintyTier.Medium,
        <= 0.8 => UncertaintyTier.High,
        _ => UncertaintyTier.VeryHigh
    };

    /// <summary>
    /// Creates a fully certain score (all evidence present).
    /// </summary>
    public static UncertaintyScore FullyCertain(double maxWeight) => new()
    {
        Entropy = 0.0,
        MissingSignals = ImmutableArray<SignalGap>.Empty,
        WeightedEvidenceSum = maxWeight,
        MaxPossibleWeight = maxWeight
    };

    /// <summary>
    /// Creates a fully uncertain score (no evidence).
    /// </summary>
    public static UncertaintyScore FullyUncertain(double maxWeight, ImmutableArray<SignalGap> gaps) => new()
    {
        Entropy = 1.0,
        MissingSignals = gaps,
        WeightedEvidenceSum = 0.0,
        MaxPossibleWeight = maxWeight
    };
}

/// <summary>
/// Tier classification for uncertainty levels.
/// </summary>
public enum UncertaintyTier
{
    /// <summary>Entropy &lt;= 0.2: Comprehensive evidence.</summary>
    VeryLow = 0,

    /// <summary>Entropy &lt;= 0.4: Good evidence coverage.</summary>
    Low = 1,

    /// <summary>Entropy &lt;= 0.6: Moderate gaps.</summary>
    Medium = 2,

    /// <summary>Entropy &lt;= 0.8: Significant gaps.</summary>
    High = 3,

    /// <summary>Entropy &gt; 0.8: Minimal evidence.</summary>
    VeryHigh = 4
}

/// <summary>
/// Represents a missing or failed signal in uncertainty calculation.
/// </summary>
public sealed record SignalGap(
    string SignalName,
    double Weight,
    SignalQueryStatus Status,
    string? Reason);
```

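The record carries `Entropy`, `WeightedEvidenceSum`, and `MaxPossibleWeight`, but the relationship between them is implied rather than stated. A minimal sketch, assuming entropy is simply the weighted fraction of missing evidence (the shipped calculator may be more sophisticated):

```csharp
using System;

public static class EntropyDemo
{
    // Assumed relationship: entropy = 1 - (weighted evidence present / max possible weight).
    public static double Entropy(double weightedEvidenceSum, double maxPossibleWeight) =>
        maxPossibleWeight <= 0 ? 1.0 : Math.Clamp(1.0 - weightedEvidenceSum / maxPossibleWeight, 0.0, 1.0);

    // Mirrors the UncertaintyScore.Tier thresholds above.
    public static string Tier(double entropy) => entropy switch
    {
        <= 0.2 => "VeryLow",
        <= 0.4 => "Low",
        <= 0.6 => "Medium",
        <= 0.8 => "High",
        _ => "VeryHigh"
    };

    public static void Main()
    {
        // 7 of 10 weight units present -> entropy 0.3 -> Low tier.
        Console.WriteLine(Tier(Entropy(7.0, 10.0)));

        // All evidence present -> entropy 0 -> VeryLow tier.
        Console.WriteLine(Tier(Entropy(10.0, 10.0)));
    }
}
```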
### ObservationDecay Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Tracks evidence freshness decay for a CVE observation.
/// </summary>
public sealed record ObservationDecay
{
    /// <summary>Half-life for confidence decay. Default: 14 days per advisory.</summary>
    public required TimeSpan HalfLife { get; init; }

    /// <summary>Minimum confidence floor (never decays below). Default: 0.35.</summary>
    public required double Floor { get; init; }

    /// <summary>Last time any signal was updated (UTC).</summary>
    public required DateTimeOffset LastSignalUpdate { get; init; }

    /// <summary>Current decayed confidence multiplier [Floor-1.0].</summary>
    public required double DecayedMultiplier { get; init; }

    /// <summary>When next auto-review is scheduled (UTC).</summary>
    public DateTimeOffset? NextReviewAt { get; init; }

    /// <summary>Whether decay has triggered stale state.</summary>
    public bool IsStale { get; init; }

    /// <summary>Age of the evidence in days.</summary>
    public double AgeDays { get; init; }

    /// <summary>
    /// Creates a fresh observation (no decay applied).
    /// </summary>
    public static ObservationDecay Fresh(DateTimeOffset lastUpdate, TimeSpan halfLife, double floor = 0.35) => new()
    {
        HalfLife = halfLife,
        Floor = floor,
        LastSignalUpdate = lastUpdate,
        DecayedMultiplier = 1.0,
        NextReviewAt = lastUpdate.Add(halfLife),
        IsStale = false,
        AgeDays = 0
    };

    /// <summary>Default half-life: 14 days per advisory recommendation.</summary>
    public static readonly TimeSpan DefaultHalfLife = TimeSpan.FromDays(14);

    /// <summary>Default floor: 0.35 per existing FreshnessCalculator.</summary>
    public const double DefaultFloor = 0.35;
}
```

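The record stores a `DecayedMultiplier` but never states the decay formula. A plausible minimal sketch, assuming standard exponential half-life decay clamped at the floor (an assumption; the actual calculator may differ):

```csharp
using System;

public static class DecayDemo
{
    // Assumed formula: multiplier = max(floor, 0.5 ^ (age / halfLife)).
    public static double DecayedMultiplier(TimeSpan age, TimeSpan halfLife, double floor = 0.35) =>
        Math.Max(floor, Math.Pow(0.5, age.TotalDays / halfLife.TotalDays));

    public static void Main()
    {
        var halfLife = TimeSpan.FromDays(14); // advisory default

        // Fresh evidence keeps full confidence.
        Console.WriteLine(DecayedMultiplier(TimeSpan.Zero, halfLife) == 1.0);

        // After exactly one half-life, confidence halves.
        Console.WriteLine(DecayedMultiplier(TimeSpan.FromDays(14), halfLife) == 0.5);

        // Very old evidence is clamped at the 0.35 floor.
        Console.WriteLine(DecayedMultiplier(TimeSpan.FromDays(90), halfLife) == 0.35);
    }
}
```

With these parameters, staleness (`IsStale`) would presumably trigger once the multiplier hits the floor, well before the 90-day mark of the existing FreshnessCalculator.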
### GuardRails Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Guardrails applied when allowing uncertain observations.
/// </summary>
public sealed record GuardRails
{
    /// <summary>Enable runtime monitoring for this observation.</summary>
    public required bool EnableRuntimeMonitoring { get; init; }

    /// <summary>Interval for automatic re-review.</summary>
    public required TimeSpan ReviewInterval { get; init; }

    /// <summary>EPSS threshold that triggers automatic escalation.</summary>
    public required double EpssEscalationThreshold { get; init; }

    /// <summary>Reachability states that trigger escalation.</summary>
    public required ImmutableArray<string> EscalatingReachabilityStates { get; init; }

    /// <summary>Maximum time in guarded state before forced review.</summary>
    public required TimeSpan MaxGuardedDuration { get; init; }

    /// <summary>Alert channels for this observation.</summary>
    public ImmutableArray<string> AlertChannels { get; init; } = ImmutableArray<string>.Empty;

    /// <summary>Additional context for audit trail.</summary>
    public string? PolicyRationale { get; init; }

    /// <summary>
    /// Creates default guardrails per advisory recommendation.
    /// </summary>
    public static GuardRails Default() => new()
    {
        EnableRuntimeMonitoring = true,
        ReviewInterval = TimeSpan.FromDays(7),
        EpssEscalationThreshold = 0.4,
        EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
        MaxGuardedDuration = TimeSpan.FromDays(30)
    };
}
```

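Because `GuardRails` is a record, policy packs can start from `Default()` and override individual knobs with a `with`-expression instead of restating every value. A condensed, self-contained sketch (alert channels and rationale omitted; the "production" override is illustrative, not a shipped policy):

```csharp
using System;

// Condensed sketch of the GuardRails record above.
public sealed record GuardRails(
    bool EnableRuntimeMonitoring,
    TimeSpan ReviewInterval,
    double EpssEscalationThreshold,
    TimeSpan MaxGuardedDuration)
{
    public static GuardRails Default() => new(
        EnableRuntimeMonitoring: true,
        ReviewInterval: TimeSpan.FromDays(7),
        EpssEscalationThreshold: 0.4,
        MaxGuardedDuration: TimeSpan.FromDays(30));
}

public static class GuardRailsDemo
{
    public static void Main()
    {
        // A production profile tightens review cadence without restating other defaults.
        var production = GuardRails.Default() with { ReviewInterval = TimeSpan.FromDays(3) };

        Console.WriteLine(production.ReviewInterval.TotalDays);
        Console.WriteLine(production.EpssEscalationThreshold == 0.4); // inherited from Default()
    }
}
```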
### DeterminizationContext Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Context for determinization policy evaluation.
/// </summary>
public sealed record DeterminizationContext
{
    /// <summary>Point-in-time signal snapshot.</summary>
    public required SignalSnapshot SignalSnapshot { get; init; }

    /// <summary>Calculated uncertainty score.</summary>
    public required UncertaintyScore UncertaintyScore { get; init; }

    /// <summary>Evidence decay information.</summary>
    public required ObservationDecay Decay { get; init; }

    /// <summary>Aggregated trust score [0.0-1.0].</summary>
    public required double TrustScore { get; init; }

    /// <summary>Deployment environment (Production, Staging, Development).</summary>
    public required DeploymentEnvironment Environment { get; init; }

    /// <summary>Asset criticality tier (optional).</summary>
    public AssetCriticality? AssetCriticality { get; init; }

    /// <summary>Existing observation state (for transition decisions).</summary>
    public ObservationState? CurrentState { get; init; }

    /// <summary>Policy evaluation options.</summary>
    public DeterminizationOptions? Options { get; init; }
}

/// <summary>
/// Deployment environment classification.
/// </summary>
public enum DeploymentEnvironment
{
    Development = 0,
    Staging = 1,
    Production = 2
}

/// <summary>
/// Asset criticality classification.
/// </summary>
public enum AssetCriticality
{
    Low = 0,
    Medium = 1,
    High = 2,
    Critical = 3
}
```

### DeterminizationResult Record

```csharp
namespace StellaOps.Policy.Determinization.Models;

/// <summary>
/// Result of determinization policy evaluation.
/// </summary>
public sealed record DeterminizationResult
{
    /// <summary>Policy verdict status.</summary>
    public required PolicyVerdictStatus Status { get; init; }

    /// <summary>Human-readable reason for the decision.</summary>
    public required string Reason { get; init; }

    /// <summary>Guardrails to apply if Status is GuardedPass.</summary>
    public GuardRails? GuardRails { get; init; }

    /// <summary>Suggested new observation state.</summary>
    public ObservationState? SuggestedState { get; init; }

    /// <summary>Rule that matched (for audit).</summary>
    public string? MatchedRule { get; init; }

    /// <summary>Additional metadata for audit trail.</summary>
    public ImmutableDictionary<string, object>? Metadata { get; init; }

    /// <summary>Creates an allowed result; the observation becomes Determined.</summary>
    public static DeterminizationResult Allowed(string reason, PolicyVerdictStatus status = PolicyVerdictStatus.Pass) =>
        new() { Status = status, Reason = reason, SuggestedState = ObservationState.Determined };

    /// <summary>Creates a guarded allow; the observation stays PendingDeterminization under guardrails.</summary>
    public static DeterminizationResult GuardedAllow(string reason, PolicyVerdictStatus status, GuardRails guardrails) =>
        new() { Status = status, Reason = reason, GuardRails = guardrails, SuggestedState = ObservationState.PendingDeterminization };

    /// <summary>Creates a quarantined result requiring manual review.</summary>
    public static DeterminizationResult Quarantined(string reason, PolicyVerdictStatus status) =>
        new() { Status = status, Reason = reason, SuggestedState = ObservationState.ManualReviewRequired };

    /// <summary>Creates an escalated result requiring manual review.</summary>
    public static DeterminizationResult Escalated(string reason, PolicyVerdictStatus status) =>
        new() { Status = status, Reason = reason, SuggestedState = ObservationState.ManualReviewRequired };

    /// <summary>Creates a deferred result; the observation becomes StaleRequiresRefresh.</summary>
    public static DeterminizationResult Deferred(string reason, PolicyVerdictStatus status) =>
        new() { Status = status, Reason = reason, SuggestedState = ObservationState.StaleRequiresRefresh };
}
```

### Evidence Models

```csharp
namespace StellaOps.Policy.Determinization.Evidence;

/// <summary>
/// EPSS evidence for a CVE.
/// </summary>
public sealed record EpssEvidence
{
    /// <summary>EPSS score [0.0-1.0].</summary>
    public required double Score { get; init; }

    /// <summary>EPSS percentile [0.0-1.0].</summary>
    public required double Percentile { get; init; }

    /// <summary>EPSS model date.</summary>
    public required DateOnly ModelDate { get; init; }
}

/// <summary>
/// VEX claim summary for a CVE.
/// </summary>
public sealed record VexClaimSummary
{
    /// <summary>VEX status.</summary>
    public required string Status { get; init; }

    /// <summary>Justification if not_affected.</summary>
    public string? Justification { get; init; }

    /// <summary>Issuer of the VEX statement.</summary>
    public required string Issuer { get; init; }

    /// <summary>Issuer trust level.</summary>
    public required double IssuerTrust { get; init; }
}

/// <summary>
/// Reachability evidence for a CVE.
/// </summary>
public sealed record ReachabilityEvidence
{
    /// <summary>Reachability status.</summary>
    public required ReachabilityStatus Status { get; init; }

    /// <summary>Confidence in the determination [0.0-1.0].</summary>
    public required double Confidence { get; init; }

    /// <summary>Call path depth if reachable.</summary>
    public int? PathDepth { get; init; }
}

/// <summary>
/// Reachability determination status.
/// </summary>
public enum ReachabilityStatus
{
    Unknown = 0,
    Reachable = 1,
    Unreachable = 2,
    Gated = 3,
    ObservedReachable = 4
}

/// <summary>
/// Runtime observation evidence.
/// </summary>
public sealed record RuntimeEvidence
{
    /// <summary>Whether vulnerable code was observed loaded.</summary>
    public required bool ObservedLoaded { get; init; }

    /// <summary>Observation source (eBPF, dyld, ETW).</summary>
    public required string Source { get; init; }

    /// <summary>Observation window.</summary>
    public required TimeSpan ObservationWindow { get; init; }

    /// <summary>Sample count.</summary>
    public required int SampleCount { get; init; }
}

/// <summary>
/// Fix backport detection evidence.
/// </summary>
public sealed record BackportEvidence
{
    /// <summary>Whether a backport was detected.</summary>
    public required bool BackportDetected { get; init; }

    /// <summary>Confidence in detection [0.0-1.0].</summary>
    public required double Confidence { get; init; }

    /// <summary>Detection method.</summary>
    public string? Method { get; init; }
}

/// <summary>
/// SBOM lineage evidence.
/// </summary>
public sealed record SbomLineageEvidence
{
    /// <summary>Whether lineage is verified.</summary>
    public required bool LineageVerified { get; init; }

    /// <summary>SBOM quality score [0.0-1.0].</summary>
    public required double QualityScore { get; init; }

    /// <summary>Provenance attestation present.</summary>
    public required bool HasProvenanceAttestation { get; init; }
}

/// <summary>
/// CVSS evidence for a CVE.
/// </summary>
public sealed record CvssEvidence
{
    /// <summary>CVSS base score [0.0-10.0].</summary>
    public required double BaseScore { get; init; }

    /// <summary>CVSS version (2.0, 3.0, 3.1, 4.0).</summary>
    public required string Version { get; init; }

    /// <summary>CVSS vector string.</summary>
    public string? Vector { get; init; }

    /// <summary>Severity label.</summary>
    public required string Severity { get; init; }
}
```

### Project File

```xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <RootNamespace>StellaOps.Policy.Determinization</RootNamespace>
    <AssemblyName>StellaOps.Policy.Determinization</AssemblyName>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="System.Collections.Immutable" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\StellaOps.Policy\StellaOps.Policy.csproj" />
  </ItemGroup>

</Project>
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DCM-001 | TODO | - | Guild | Create `StellaOps.Policy.Determinization.csproj` project |
| 2 | DCM-002 | TODO | DCM-001 | Guild | Implement `ObservationState` enum |
| 3 | DCM-003 | TODO | DCM-001 | Guild | Implement `SignalQueryStatus` enum |
| 4 | DCM-004 | TODO | DCM-003 | Guild | Implement `SignalState<T>` record with factory methods |
| 5 | DCM-005 | TODO | DCM-004 | Guild | Implement `SignalGap` record |
| 6 | DCM-006 | TODO | DCM-005 | Guild | Implement `UncertaintyTier` enum |
| 7 | DCM-007 | TODO | DCM-006 | Guild | Implement `UncertaintyScore` record with factory methods |
| 8 | DCM-008 | TODO | DCM-001 | Guild | Implement `ObservationDecay` record with factory methods |
| 9 | DCM-009 | TODO | DCM-001 | Guild | Implement `GuardRails` record with defaults |
| 10 | DCM-010 | TODO | DCM-001 | Guild | Implement `DeploymentEnvironment` enum |
| 11 | DCM-011 | TODO | DCM-001 | Guild | Implement `AssetCriticality` enum |
| 12 | DCM-012 | TODO | DCM-011 | Guild | Implement `DeterminizationContext` record |
| 13 | DCM-013 | TODO | DCM-012 | Guild | Implement `DeterminizationResult` record with factory methods |
| 14 | DCM-014 | TODO | DCM-001 | Guild | Implement `EpssEvidence` record |
| 15 | DCM-015 | TODO | DCM-001 | Guild | Implement `VexClaimSummary` record |
| 16 | DCM-016 | TODO | DCM-001 | Guild | Implement `ReachabilityEvidence` record with status enum |
| 17 | DCM-017 | TODO | DCM-001 | Guild | Implement `RuntimeEvidence` record |
| 18 | DCM-018 | TODO | DCM-001 | Guild | Implement `BackportEvidence` record |
| 19 | DCM-019 | TODO | DCM-001 | Guild | Implement `SbomLineageEvidence` record |
| 20 | DCM-020 | TODO | DCM-001 | Guild | Implement `CvssEvidence` record |
| 21 | DCM-021 | TODO | DCM-020 | Guild | Implement `SignalSnapshot` record with Empty factory |
| 22 | DCM-022 | TODO | DCM-021 | Guild | Add `GlobalUsings.cs` with common imports |
| 23 | DCM-023 | TODO | DCM-022 | Guild | Create test project `StellaOps.Policy.Determinization.Tests` |
| 24 | DCM-024 | TODO | DCM-023 | Guild | Write unit tests: `SignalState<T>` factory methods |
| 25 | DCM-025 | TODO | DCM-024 | Guild | Write unit tests: `UncertaintyScore` tier calculation |
| 26 | DCM-026 | TODO | DCM-025 | Guild | Write unit tests: `ObservationDecay` fresh/stale detection |
| 27 | DCM-027 | TODO | DCM-026 | Guild | Write unit tests: `SignalSnapshot.Empty()` initialization |
| 28 | DCM-028 | TODO | DCM-027 | Guild | Write unit tests: `DeterminizationResult` factory methods |
| 29 | DCM-029 | TODO | DCM-028 | Guild | Add project to `StellaOps.Policy.sln` |
| 30 | DCM-030 | TODO | DCM-029 | Guild | Verify build with `dotnet build` |

## Acceptance Criteria

1. All model types compile without warnings
2. Unit tests pass for all factory methods
3. `SignalState<T>` correctly distinguishes NotQueried/Queried/Failed
4. `UncertaintyScore.Tier` correctly maps entropy ranges
5. `ObservationDecay` correctly calculates staleness
6. All records are immutable and use `required` where appropriate
7. XML documentation complete for all public types

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Separate `ObservationState` from VEX status | Orthogonal concerns: VEX = vulnerability impact, Observation = evidence lifecycle |
| `SignalState<T>` as generic wrapper | Type safety for different evidence types; unified null-awareness |
| Entropy tiers at 0.2 increments | Aligns with existing confidence tiers; provides 5 distinct levels |
| 14-day default half-life | Per advisory recommendation; shorter than existing 90-day FreshnessCalculator |

| Risk | Mitigation |
|------|------------|
| Evidence type proliferation | Keep evidence records minimal; reference existing types where possible |
| Name collision with EntropySignal | Use "Uncertainty" terminology consistently; document the difference |
| Breaking changes to PolicyVerdictStatus | GuardedPass addition is additive; existing code unaffected |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |

## Next Checkpoints

- 2026-01-07: DCM-001 to DCM-013 complete (core models)
- 2026-01-08: DCM-014 to DCM-022 complete (evidence models)
- 2026-01-09: DCM-023 to DCM-030 complete (tests, integration)

@@ -0,0 +1,737 @@

# Sprint 20260106_001_001_LB - Unified Verdict Rationale Renderer

## Topic & Scope

Implement a unified verdict rationale renderer that composes existing evidence (PathWitness, RiskVerdictAttestation, ScoreExplanation, VEX consensus) into a standardized 4-line template for consistent explainability across UI, CLI, and API.

- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Explainability/`
- **Evidence:** New library with renderer, tests, schema validation

## Problem Statement

The product advisory requires **uniform, explainable verdicts** with a 4-line template:

1. **Evidence:** "CVE-2024-XXXX in `libxyz` 1.2.3; symbol `foo_read` reachable from `/usr/bin/tool`."
2. **Policy clause:** "Policy S2.1: reachable+EPSS>=0.2 => triage=P1."
3. **Attestations/Proofs:** "Build-ID match to vendor advisory; call-path: `main->parse->foo_read`."
4. **Decision:** "Affected (score 0.72). Mitigation recommended: upgrade or backport KB-123."

Current state:

- `RiskVerdictAttestation` has an `Explanation` field but no structured format
- `PathWitness` documents call paths but is not rendered into the rationale
- `ScoreExplanation` has factor breakdowns but is not composed with verdicts
- `VerdictReasonCode` has descriptions but they are not formatted for users
- `AdvisoryAI.ExplanationResult` provides LLM explanations but no template enforcement

**Gap:** No unified renderer composes these pieces into the 4-line format for every output channel.

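The plain-text shape of the renderer can be sketched as a simple string composition (field names and signature here are illustrative, not the final `VerdictRationale` contract; the structured record-based design follows below):

```csharp
using System;

public static class RationaleRenderDemo
{
    // Hypothetical plain-text rendering of the 4-line template.
    public static string Render(
        string cve, string component, string version, string symbol, string entrypoint,
        string section, string rule, string priority,
        string proofs,
        string decision)
    {
        return string.Join(Environment.NewLine,
            $"Evidence: {cve} in `{component}` {version}; symbol `{symbol}` reachable from `{entrypoint}`.",
            $"Policy clause: Policy {section}: {rule} => triage={priority}.",
            $"Attestations/Proofs: {proofs}",
            $"Decision: {decision}");
    }

    public static void Main()
    {
        var text = Render(
            "CVE-2024-XXXX", "libxyz", "1.2.3", "foo_read", "/usr/bin/tool",
            "S2.1", "reachable+EPSS>=0.2", "P1",
            "Build-ID match to vendor advisory; call-path: `main->parse->foo_read`.",
            "Affected (score 0.72). Mitigation recommended: upgrade or backport KB-123.");

        // Always exactly four lines, regardless of channel.
        Console.WriteLine(text.Split(Environment.NewLine).Length);
    }
}
```

The point of the structured contract is that UI, CLI, and API all derive their output from the same four fields rather than each re-assembling prose from raw evidence.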
## Dependencies & Concurrency

- **Depends on:** None (uses existing models)
- **Blocks:** None
- **Parallel safe:** New library; no cross-module conflicts

## Documentation Prerequisites

- docs/modules/policy/architecture.md
- src/Policy/AGENTS.md (if exists)
- Product Advisory: "Smart-Diff & Unknowns" explainability section

## Technical Design

### Data Contracts

```csharp
namespace StellaOps.Policy.Explainability;

/// <summary>
/// Structured verdict rationale following the 4-line template.
/// </summary>
public sealed record VerdictRationale
{
    /// <summary>Schema version for forward compatibility.</summary>
    [JsonPropertyName("schema_version")]
    public string SchemaVersion { get; init; } = "1.0";

    /// <summary>Unique rationale ID (content-addressed).</summary>
    [JsonPropertyName("rationale_id")]
    public required string RationaleId { get; init; }

    /// <summary>Reference to the verdict being explained.</summary>
    [JsonPropertyName("verdict_ref")]
    public required VerdictReference VerdictRef { get; init; }

    /// <summary>Line 1: Evidence summary.</summary>
    [JsonPropertyName("evidence")]
    public required RationaleEvidence Evidence { get; init; }

    /// <summary>Line 2: Policy clause that triggered the decision.</summary>
    [JsonPropertyName("policy_clause")]
    public required RationalePolicyClause PolicyClause { get; init; }

    /// <summary>Line 3: Attestations and proofs supporting the verdict.</summary>
    [JsonPropertyName("attestations")]
    public required RationaleAttestations Attestations { get; init; }

    /// <summary>Line 4: Final decision with score and recommendation.</summary>
    [JsonPropertyName("decision")]
    public required RationaleDecision Decision { get; init; }

    /// <summary>Generation timestamp (UTC).</summary>
    [JsonPropertyName("generated_at")]
    public required DateTimeOffset GeneratedAt { get; init; }

    /// <summary>Input digests for reproducibility.</summary>
    [JsonPropertyName("input_digests")]
    public required RationaleInputDigests InputDigests { get; init; }
}

/// <summary>Reference to the verdict being explained.</summary>
public sealed record VerdictReference
{
    [JsonPropertyName("attestation_id")]
    public required string AttestationId { get; init; }

    [JsonPropertyName("artifact_digest")]
    public required string ArtifactDigest { get; init; }

    [JsonPropertyName("policy_id")]
    public required string PolicyId { get; init; }

    [JsonPropertyName("policy_version")]
    public required string PolicyVersion { get; init; }
}

/// <summary>Line 1: Evidence summary.</summary>
public sealed record RationaleEvidence
{
    /// <summary>Primary vulnerability ID (CVE, GHSA, etc.).</summary>
    [JsonPropertyName("vulnerability_id")]
    public required string VulnerabilityId { get; init; }

    /// <summary>Affected component PURL.</summary>
    [JsonPropertyName("component_purl")]
    public required string ComponentPurl { get; init; }

    /// <summary>Affected version.</summary>
    [JsonPropertyName("component_version")]
    public required string ComponentVersion { get; init; }

    /// <summary>Vulnerable symbol (if reachability analyzed).</summary>
    [JsonPropertyName("vulnerable_symbol")]
    public string? VulnerableSymbol { get; init; }

    /// <summary>Entry point from which vulnerable code is reachable.</summary>
    [JsonPropertyName("entrypoint")]
    public string? Entrypoint { get; init; }

    /// <summary>Rendered text for display.</summary>
    [JsonPropertyName("text")]
    public required string Text { get; init; }
}

/// <summary>Line 2: Policy clause.</summary>
public sealed record RationalePolicyClause
{
    /// <summary>Policy section reference (e.g., "S2.1").</summary>
    [JsonPropertyName("section")]
    public required string Section { get; init; }

    /// <summary>Rule expression that matched.</summary>
    [JsonPropertyName("rule_expression")]
    public required string RuleExpression { get; init; }

    /// <summary>Resulting triage priority.</summary>
    [JsonPropertyName("triage_priority")]
    public required string TriagePriority { get; init; }

    /// <summary>Rendered text for display.</summary>
    [JsonPropertyName("text")]
    public required string Text { get; init; }
}

/// <summary>Line 3: Attestations and proofs.</summary>
public sealed record RationaleAttestations
{
    /// <summary>Build-ID match status.</summary>
    [JsonPropertyName("build_id_match")]
    public BuildIdMatchInfo? BuildIdMatch { get; init; }

    /// <summary>Call path summary (if available).</summary>
    [JsonPropertyName("call_path")]
    public CallPathSummary? CallPath { get; init; }

    /// <summary>VEX statement source.</summary>
    [JsonPropertyName("vex_source")]
    public string? VexSource { get; init; }

    /// <summary>Suppression proof (if not affected).</summary>
    [JsonPropertyName("suppression_proof")]
    public SuppressionProofSummary? SuppressionProof { get; init; }

    /// <summary>Rendered text for display.
|
||||
[JsonPropertyName("text")]
|
||||
public required string Text { get; init; }
|
||||
}
|
||||
|
||||
public sealed record BuildIdMatchInfo
|
||||
{
|
||||
[JsonPropertyName("build_id")]
|
||||
public required string BuildId { get; init; }
|
||||
|
||||
[JsonPropertyName("match_source")]
|
||||
public required string MatchSource { get; init; }
|
||||
|
||||
[JsonPropertyName("confidence")]
|
||||
public required double Confidence { get; init; }
|
||||
}
|
||||
|
||||
public sealed record CallPathSummary
|
||||
{
|
||||
[JsonPropertyName("hop_count")]
|
||||
public required int HopCount { get; init; }
|
||||
|
||||
[JsonPropertyName("path_abbreviated")]
|
||||
public required string PathAbbreviated { get; init; }
|
||||
|
||||
[JsonPropertyName("witness_id")]
|
||||
public string? WitnessId { get; init; }
|
||||
}
|
||||
|
||||
public sealed record SuppressionProofSummary
|
||||
{
|
||||
[JsonPropertyName("type")]
|
||||
public required string Type { get; init; }
|
||||
|
||||
[JsonPropertyName("reason")]
|
||||
public required string Reason { get; init; }
|
||||
|
||||
[JsonPropertyName("proof_id")]
|
||||
public string? ProofId { get; init; }
|
||||
}
|
||||
|
||||
/// <summary>Line 4: Decision with recommendation.</summary>
|
||||
public sealed record RationaleDecision
|
||||
{
|
||||
/// <summary>Final decision status.</summary>
|
||||
[JsonPropertyName("status")]
|
||||
public required string Status { get; init; }
|
||||
|
||||
/// <summary>Numeric risk score (0.0-1.0).</summary>
|
||||
[JsonPropertyName("score")]
|
||||
public required double Score { get; init; }
|
||||
|
||||
/// <summary>Score band (P1, P2, P3, P4).</summary>
|
||||
[JsonPropertyName("band")]
|
||||
public required string Band { get; init; }
|
||||
|
||||
/// <summary>Recommended mitigation action.</summary>
|
||||
[JsonPropertyName("recommendation")]
|
||||
public required string Recommendation { get; init; }
|
||||
|
||||
/// <summary>Knowledge base reference (if applicable).</summary>
|
||||
[JsonPropertyName("kb_ref")]
|
||||
public string? KbRef { get; init; }
|
||||
|
||||
/// <summary>Rendered text for display.</summary>
|
||||
[JsonPropertyName("text")]
|
||||
public required string Text { get; init; }
|
||||
}
|
||||
|
||||
/// <summary>Input digests for reproducibility verification.</summary>
|
||||
public sealed record RationaleInputDigests
|
||||
{
|
||||
[JsonPropertyName("verdict_digest")]
|
||||
public required string VerdictDigest { get; init; }
|
||||
|
||||
[JsonPropertyName("witness_digest")]
|
||||
public string? WitnessDigest { get; init; }
|
||||
|
||||
[JsonPropertyName("score_explanation_digest")]
|
||||
public string? ScoreExplanationDigest { get; init; }
|
||||
|
||||
[JsonPropertyName("vex_consensus_digest")]
|
||||
public string? VexConsensusDigest { get; init; }
|
||||
}
|
||||
```

### Renderer Interface

```csharp
namespace StellaOps.Policy.Explainability;

/// <summary>
/// Renders structured rationales from verdict components.
/// </summary>
public interface IVerdictRationaleRenderer
{
    /// <summary>
    /// Render a complete rationale from verdict components.
    /// </summary>
    VerdictRationale Render(VerdictRationaleInput input);

    /// <summary>
    /// Render rationale as plain text (4 lines).
    /// </summary>
    string RenderPlainText(VerdictRationale rationale);

    /// <summary>
    /// Render rationale as Markdown.
    /// </summary>
    string RenderMarkdown(VerdictRationale rationale);

    /// <summary>
    /// Render rationale as structured JSON (RFC 8785 canonical).
    /// </summary>
    string RenderJson(VerdictRationale rationale);
}

/// <summary>
/// Input components for rationale rendering.
/// </summary>
public sealed record VerdictRationaleInput
{
    /// <summary>The verdict attestation being explained.</summary>
    public required RiskVerdictAttestation Verdict { get; init; }

    /// <summary>Path witness (if reachability analyzed).</summary>
    public PathWitness? PathWitness { get; init; }

    /// <summary>Score explanation with factor breakdown.</summary>
    public ScoreExplanation? ScoreExplanation { get; init; }

    /// <summary>VEX consensus result.</summary>
    public ConsensusResult? VexConsensus { get; init; }

    /// <summary>Policy rule that triggered the decision.</summary>
    public PolicyRuleMatch? TriggeringRule { get; init; }

    /// <summary>Suppression proof (if not affected).</summary>
    public SuppressionWitness? SuppressionWitness { get; init; }

    /// <summary>Recommended mitigation (from advisory or policy).</summary>
    public MitigationRecommendation? Recommendation { get; init; }
}

/// <summary>
/// Policy rule that matched during evaluation.
/// </summary>
public sealed record PolicyRuleMatch
{
    public required string Section { get; init; }
    public required string RuleName { get; init; }
    public required string Expression { get; init; }
    public required string TriagePriority { get; init; }
}

/// <summary>
/// Mitigation recommendation.
/// </summary>
public sealed record MitigationRecommendation
{
    public required string Action { get; init; }
    public string? KbRef { get; init; }
    public string? TargetVersion { get; init; }
}
```

### Renderer Implementation

```csharp
namespace StellaOps.Policy.Explainability;

public sealed class VerdictRationaleRenderer : IVerdictRationaleRenderer
{
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<VerdictRationaleRenderer> _logger;

    public VerdictRationaleRenderer(
        TimeProvider timeProvider,
        ILogger<VerdictRationaleRenderer> logger)
    {
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public VerdictRationale Render(VerdictRationaleInput input)
    {
        ArgumentNullException.ThrowIfNull(input);
        ArgumentNullException.ThrowIfNull(input.Verdict);

        var evidence = RenderEvidence(input);
        var policyClause = RenderPolicyClause(input);
        var attestations = RenderAttestations(input);
        var decision = RenderDecision(input);

        var rationale = new VerdictRationale
        {
            RationaleId = ComputeRationaleId(input),
            VerdictRef = new VerdictReference
            {
                AttestationId = input.Verdict.AttestationId,
                ArtifactDigest = input.Verdict.Subject.Digest,
                PolicyId = input.Verdict.Policy.PolicyId,
                PolicyVersion = input.Verdict.Policy.Version
            },
            Evidence = evidence,
            PolicyClause = policyClause,
            Attestations = attestations,
            Decision = decision,
            GeneratedAt = _timeProvider.GetUtcNow(),
            InputDigests = ComputeInputDigests(input)
        };

        _logger.LogDebug("Rendered rationale {RationaleId} for verdict {VerdictId}",
            rationale.RationaleId, input.Verdict.AttestationId);

        return rationale;
    }

    private RationaleEvidence RenderEvidence(VerdictRationaleInput input)
    {
        var verdict = input.Verdict;
        var witness = input.PathWitness;

        // Extract primary CVE from reason codes or evidence
        var vulnId = ExtractPrimaryVulnerabilityId(verdict);
        var componentPurl = verdict.Subject.Name ?? verdict.Subject.Digest;
        var componentVersion = ExtractVersion(componentPurl);

        var text = witness is not null
            ? $"{vulnId} in `{componentPurl}` {componentVersion}; " +
              $"symbol `{witness.Sink.Symbol}` reachable from `{witness.Entrypoint.Name}`."
            : $"{vulnId} in `{componentPurl}` {componentVersion}.";

        return new RationaleEvidence
        {
            VulnerabilityId = vulnId,
            ComponentPurl = componentPurl,
            ComponentVersion = componentVersion,
            VulnerableSymbol = witness?.Sink.Symbol,
            Entrypoint = witness?.Entrypoint.Name,
            Text = text
        };
    }

    private RationalePolicyClause RenderPolicyClause(VerdictRationaleInput input)
    {
        var rule = input.TriggeringRule;

        if (rule is null)
        {
            // No explicit rule match: infer from reason codes
            var primaryReason = input.Verdict.ReasonCodes.FirstOrDefault();
            return new RationalePolicyClause
            {
                Section = "default",
                RuleExpression = primaryReason?.GetDescription() ?? "policy evaluation",
                TriagePriority = MapVerdictToPriority(input.Verdict.Verdict),
                Text = $"Policy: {primaryReason?.GetDescription() ?? "default evaluation"} => " +
                       $"triage={MapVerdictToPriority(input.Verdict.Verdict)}."
            };
        }

        return new RationalePolicyClause
        {
            Section = rule.Section,
            RuleExpression = rule.Expression,
            TriagePriority = rule.TriagePriority,
            Text = $"Policy {rule.Section}: {rule.Expression} => triage={rule.TriagePriority}."
        };
    }

    private RationaleAttestations RenderAttestations(VerdictRationaleInput input)
    {
        var parts = new List<string>();

        BuildIdMatchInfo? buildIdMatch = null;
        CallPathSummary? callPath = null;
        SuppressionProofSummary? suppressionProof = null;

        // Build-ID match
        if (input.PathWitness?.Evidence.BuildId is not null)
        {
            buildIdMatch = new BuildIdMatchInfo
            {
                BuildId = input.PathWitness.Evidence.BuildId,
                MatchSource = "vendor advisory",
                Confidence = 1.0
            };
            parts.Add("Build-ID match to vendor advisory");
        }

        // Call path
        if (input.PathWitness?.Path.Count > 0)
        {
            var abbreviated = AbbreviatePath(input.PathWitness.Path);
            callPath = new CallPathSummary
            {
                HopCount = input.PathWitness.Path.Count,
                PathAbbreviated = abbreviated,
                WitnessId = input.PathWitness.WitnessId
            };
            parts.Add($"call-path: `{abbreviated}`");
        }

        // VEX source
        string? vexSource = null;
        if (input.VexConsensus is not null)
        {
            vexSource = $"VEX consensus ({input.VexConsensus.ContributingStatements} statements)";
            parts.Add(vexSource);
        }

        // Suppression proof
        if (input.SuppressionWitness is not null)
        {
            suppressionProof = new SuppressionProofSummary
            {
                Type = input.SuppressionWitness.Type.ToString(),
                Reason = input.SuppressionWitness.Reason,
                ProofId = input.SuppressionWitness.WitnessId
            };
            parts.Add($"suppression: {input.SuppressionWitness.Reason}");
        }

        var text = parts.Count > 0
            ? string.Join("; ", parts) + "."
            : "No attestations available.";

        return new RationaleAttestations
        {
            BuildIdMatch = buildIdMatch,
            CallPath = callPath,
            VexSource = vexSource,
            SuppressionProof = suppressionProof,
            Text = text
        };
    }

    private RationaleDecision RenderDecision(VerdictRationaleInput input)
    {
        var verdict = input.Verdict;
        var score = input.ScoreExplanation?.Factors
            .Sum(f => f.Value * GetFactorWeight(f.Factor)) ?? 0.0;

        var status = verdict.Verdict switch
        {
            RiskVerdictStatus.Pass => "Not Affected",
            RiskVerdictStatus.Fail => "Affected",
            RiskVerdictStatus.PassWithExceptions => "Affected (excepted)",
            RiskVerdictStatus.Indeterminate => "Under Investigation",
            _ => "Unknown"
        };

        var band = score switch
        {
            >= 0.75 => "P1",
            >= 0.50 => "P2",
            >= 0.25 => "P3",
            _ => "P4"
        };

        var recommendation = input.Recommendation?.Action ?? "Review finding and take appropriate action.";
        var kbRef = input.Recommendation?.KbRef;

        var text = kbRef is not null
            ? $"{status} (score {score:F2}). Mitigation recommended: {recommendation} {kbRef}."
            : $"{status} (score {score:F2}). Mitigation recommended: {recommendation}";

        return new RationaleDecision
        {
            Status = status,
            Score = Math.Round(score, 2),
            Band = band,
            Recommendation = recommendation,
            KbRef = kbRef,
            Text = text
        };
    }

    public string RenderPlainText(VerdictRationale rationale)
    {
        return $"""
            {rationale.Evidence.Text}
            {rationale.PolicyClause.Text}
            {rationale.Attestations.Text}
            {rationale.Decision.Text}
            """;
    }

    public string RenderMarkdown(VerdictRationale rationale)
    {
        return $"""
            **Evidence:** {rationale.Evidence.Text}

            **Policy:** {rationale.PolicyClause.Text}

            **Attestations:** {rationale.Attestations.Text}

            **Decision:** {rationale.Decision.Text}
            """;
    }

    public string RenderJson(VerdictRationale rationale)
    {
        return CanonicalJsonSerializer.Serialize(rationale);
    }

    private static string AbbreviatePath(IReadOnlyList<PathStep> path)
    {
        if (path.Count <= 3)
        {
            return string.Join("->", path.Select(p => p.Symbol));
        }

        return $"{path[0].Symbol}->...({path.Count - 2} hops)->{path[^1].Symbol}";
    }

    private static string ComputeRationaleId(VerdictRationaleInput input)
    {
        var canonical = CanonicalJsonSerializer.Serialize(new
        {
            verdict_id = input.Verdict.AttestationId,
            witness_id = input.PathWitness?.WitnessId,
            score_factors = input.ScoreExplanation?.Factors.Count ?? 0
        });

        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return $"rationale:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
    }

    private static RationaleInputDigests ComputeInputDigests(VerdictRationaleInput input)
    {
        return new RationaleInputDigests
        {
            VerdictDigest = input.Verdict.AttestationId,
            WitnessDigest = input.PathWitness?.Evidence.CallgraphDigest,
            ScoreExplanationDigest = input.ScoreExplanation is not null
                ? ComputeDigest(input.ScoreExplanation)
                : null,
            VexConsensusDigest = input.VexConsensus is not null
                ? ComputeDigest(input.VexConsensus)
                : null
        };
    }

    private static string ComputeDigest(object obj)
    {
        var json = CanonicalJsonSerializer.Serialize(obj);
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
        return $"sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..16]}";
    }

    private static string ExtractPrimaryVulnerabilityId(RiskVerdictAttestation verdict)
    {
        // Try to extract from evidence refs
        var cveRef = verdict.Evidence.FirstOrDefault(e =>
            e.Type == "cve" || e.Description?.StartsWith("CVE-") == true);

        return cveRef?.Description ?? "CVE-UNKNOWN";
    }

    private static string ExtractVersion(string purl)
    {
        var atIndex = purl.LastIndexOf('@');
        return atIndex > 0 ? purl[(atIndex + 1)..] : "unknown";
    }

    private static string MapVerdictToPriority(RiskVerdictStatus status)
    {
        return status switch
        {
            RiskVerdictStatus.Fail => "P1",
            RiskVerdictStatus.PassWithExceptions => "P2",
            RiskVerdictStatus.Indeterminate => "P3",
            RiskVerdictStatus.Pass => "P4",
            _ => "P4"
        };
    }

    private static double GetFactorWeight(string factor)
    {
        return factor.ToLowerInvariant() switch
        {
            "reachability" => 0.30,
            "evidence" => 0.25,
            "provenance" => 0.20,
            "severity" => 0.25,
            _ => 0.10
        };
    }
}
```
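
The determinism acceptance criterion rests on `ComputeRationaleId`: the ID is a SHA-256 over a canonical serialization of the inputs, so equal inputs always produce equal IDs. A minimal, language-neutral Python sketch of the idea, where `json.dumps` with sorted keys stands in for RFC 8785 canonicalization (the function name and inputs here are illustrative, not the real API):

```python
import hashlib
import json

def compute_rationale_id(verdict_id, witness_id=None, score_factor_count=0):
    """Hash a canonically serialized input summary into a stable rationale ID."""
    # Sorted keys + fixed separators stand in for RFC 8785 canonical JSON.
    canonical = json.dumps(
        {
            "score_factors": score_factor_count,
            "verdict_id": verdict_id,
            "witness_id": witness_id,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"rationale:sha256:{digest}"

# Identical inputs always yield the identical ID (content addressing).
a = compute_rationale_id("att-123", "wit-9", 4)
b = compute_rationale_id("att-123", "wit-9", 4)
assert a == b
assert a.startswith("rationale:sha256:")
```

Because the ID is content-addressed, it doubles as a cache key, which is exactly the caching mitigation listed under Decisions & Risks below.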

### Service Registration

```csharp
namespace StellaOps.Policy.Explainability;

public static class ExplainabilityServiceCollectionExtensions
{
    public static IServiceCollection AddVerdictExplainability(this IServiceCollection services)
    {
        services.AddSingleton<IVerdictRationaleRenderer, VerdictRationaleRenderer>();
        return services;
    }
}
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | VRR-001 | TODO | - | - | Create `StellaOps.Policy.Explainability` project |
| 2 | VRR-002 | TODO | VRR-001 | - | Define `VerdictRationale` and component records |
| 3 | VRR-003 | TODO | VRR-002 | - | Define `IVerdictRationaleRenderer` interface |
| 4 | VRR-004 | TODO | VRR-003 | - | Implement `VerdictRationaleRenderer.RenderEvidence()` |
| 5 | VRR-005 | TODO | VRR-004 | - | Implement `VerdictRationaleRenderer.RenderPolicyClause()` |
| 6 | VRR-006 | TODO | VRR-005 | - | Implement `VerdictRationaleRenderer.RenderAttestations()` |
| 7 | VRR-007 | TODO | VRR-006 | - | Implement `VerdictRationaleRenderer.RenderDecision()` |
| 8 | VRR-008 | TODO | VRR-007 | - | Implement `Render()` composition method |
| 9 | VRR-009 | TODO | VRR-008 | - | Implement `RenderPlainText()` output |
| 10 | VRR-010 | TODO | VRR-008 | - | Implement `RenderMarkdown()` output |
| 11 | VRR-011 | TODO | VRR-008 | - | Implement `RenderJson()` with RFC 8785 canonicalization |
| 12 | VRR-012 | TODO | VRR-011 | - | Add input digest computation for reproducibility |
| 13 | VRR-013 | TODO | VRR-012 | - | Create service registration extension |
| 14 | VRR-014 | TODO | VRR-013 | - | Write unit tests: evidence rendering |
| 15 | VRR-015 | TODO | VRR-014 | - | Write unit tests: policy clause rendering |
| 16 | VRR-016 | TODO | VRR-015 | - | Write unit tests: attestations rendering |
| 17 | VRR-017 | TODO | VRR-016 | - | Write unit tests: decision rendering |
| 18 | VRR-018 | TODO | VRR-017 | - | Write golden fixture tests for output formats |
| 19 | VRR-019 | TODO | VRR-018 | - | Write determinism tests: same input -> same rationale ID |
| 20 | VRR-020 | TODO | VRR-019 | - | Integrate into Scanner.WebService verdict endpoints |
| 21 | VRR-021 | TODO | VRR-020 | - | Integrate into CLI triage commands |
| 22 | VRR-022 | TODO | VRR-021 | - | Add OpenAPI schema for `VerdictRationale` |
| 23 | VRR-023 | TODO | VRR-022 | - | Document rationale template in docs/modules/policy/ |

## Acceptance Criteria

1. **4-Line Template:** All rationales follow the Evidence -> Policy -> Attestations -> Decision format
2. **Determinism:** Same inputs produce identical rationale IDs (content-addressed)
3. **Output Formats:** Plain text, Markdown, and JSON outputs available
4. **Reproducibility:** Input digests enable verification of rationale computation
5. **Integration:** Renderer integrated into Scanner.WebService and CLI
6. **Test Coverage:** Unit tests for each line, golden fixtures for formats

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| New library vs extension | Clean separation; renderer has no side effects |
| Content-addressed IDs | Enables caching and deduplication |
| RFC 8785 JSON | Consistent with existing canonical JSON usage |
| Optional components | Graceful degradation when PathWitness/VEX unavailable |

| Risk | Mitigation |
|------|------------|
| Template too rigid | Make format configurable via options |
| Missing context | Fallback text when components unavailable |
| Performance | Cache rendered rationales by input digest |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |

@@ -0,0 +1,833 @@

# Sprint 20260106_001_002_LB - Determinization: Scoring and Decay Calculations

## Topic & Scope

Implement the scoring and decay calculation services for the Determinization subsystem. This includes `UncertaintyScoreCalculator` (entropy from signal completeness), `DecayedConfidenceCalculator` (half-life decay), configurable signal weights, and prior distributions for missing signals.

- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Determinization/`
- **Evidence:** Calculator implementations, configuration options, unit tests

## Problem Statement

Current confidence calculation:

- Uses `ConfidenceScore` with weighted factors
- No explicit "knowledge completeness" entropy calculation
- `FreshnessCalculator` exists but uses a fixed 90-day half-life, not configurable per observation
- No prior distributions for missing signals

Advisory requires:

- Entropy formula: `entropy = 1 - (weighted_present_signals / max_possible_weight)`
- Decay formula: `decayed = max(floor, exp(-ln(2) * age_days / half_life_days))`
- Configurable signal weights (default: VEX=0.25, EPSS=0.15, Reach=0.25, Runtime=0.15, Backport=0.10, SBOM=0.10)
- 14-day half-life default (configurable)

## Dependencies & Concurrency

- **Depends on:** SPRINT_20260106_001_001_LB (core models)
- **Blocks:** SPRINT_20260106_001_003_POLICY (gates)
- **Parallel safe:** Library additions; no cross-module conflicts

## Documentation Prerequisites

- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_001_LB (core models)
- Existing: `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/FreshnessCalculator.cs`

## Technical Design

### Directory Structure Addition

```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── Scoring/
│   ├── IUncertaintyScoreCalculator.cs
│   ├── UncertaintyScoreCalculator.cs
│   ├── IDecayedConfidenceCalculator.cs
│   ├── DecayedConfidenceCalculator.cs
│   ├── SignalWeights.cs
│   ├── PriorDistribution.cs
│   └── TrustScoreAggregator.cs
├── DeterminizationOptions.cs
└── ServiceCollectionExtensions.cs
```

### IUncertaintyScoreCalculator Interface

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Calculates knowledge completeness entropy from signal snapshots.
/// </summary>
public interface IUncertaintyScoreCalculator
{
    /// <summary>
    /// Calculate uncertainty score from a signal snapshot.
    /// </summary>
    /// <param name="snapshot">Point-in-time signal collection.</param>
    /// <returns>Uncertainty score with entropy and missing signal details.</returns>
    UncertaintyScore Calculate(SignalSnapshot snapshot);

    /// <summary>
    /// Calculate uncertainty score with custom weights.
    /// </summary>
    /// <param name="snapshot">Point-in-time signal collection.</param>
    /// <param name="weights">Custom signal weights.</param>
    /// <returns>Uncertainty score with entropy and missing signal details.</returns>
    UncertaintyScore Calculate(SignalSnapshot snapshot, SignalWeights weights);
}
```

### UncertaintyScoreCalculator Implementation

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Calculates knowledge completeness entropy from a signal snapshot.
/// Formula: entropy = 1 - (sum of weighted present signals / max possible weight)
/// </summary>
public sealed class UncertaintyScoreCalculator : IUncertaintyScoreCalculator
{
    private readonly SignalWeights _defaultWeights;
    private readonly ILogger<UncertaintyScoreCalculator> _logger;

    public UncertaintyScoreCalculator(
        IOptions<DeterminizationOptions> options,
        ILogger<UncertaintyScoreCalculator> logger)
    {
        _defaultWeights = options.Value.SignalWeights.Normalize();
        _logger = logger;
    }

    public UncertaintyScore Calculate(SignalSnapshot snapshot) =>
        Calculate(snapshot, _defaultWeights);

    public UncertaintyScore Calculate(SignalSnapshot snapshot, SignalWeights weights)
    {
        ArgumentNullException.ThrowIfNull(snapshot);
        ArgumentNullException.ThrowIfNull(weights);

        var normalizedWeights = weights.Normalize();
        var gaps = new List<SignalGap>();
        var weightedSum = 0.0;

        weightedSum += EvaluateSignal(snapshot.Epss, "EPSS", normalizedWeights.Epss, gaps);
        weightedSum += EvaluateSignal(snapshot.Vex, "VEX", normalizedWeights.Vex, gaps);
        weightedSum += EvaluateSignal(snapshot.Reachability, "Reachability", normalizedWeights.Reachability, gaps);
        weightedSum += EvaluateSignal(snapshot.Runtime, "Runtime", normalizedWeights.Runtime, gaps);
        weightedSum += EvaluateSignal(snapshot.Backport, "Backport", normalizedWeights.Backport, gaps);
        weightedSum += EvaluateSignal(snapshot.SbomLineage, "SBOMLineage", normalizedWeights.SbomLineage, gaps);

        var maxWeight = normalizedWeights.TotalWeight;
        var entropy = 1.0 - (weightedSum / maxWeight);

        var result = new UncertaintyScore
        {
            Entropy = Math.Clamp(entropy, 0.0, 1.0),
            MissingSignals = gaps.ToImmutableArray(),
            WeightedEvidenceSum = weightedSum,
            MaxPossibleWeight = maxWeight
        };

        _logger.LogDebug(
            "Calculated uncertainty for CVE {CveId}: entropy={Entropy:F3}, tier={Tier}, missing={MissingCount}",
            snapshot.CveId,
            result.Entropy,
            result.Tier,
            gaps.Count);

        return result;
    }

    private static double EvaluateSignal<T>(
        SignalState<T> signal,
        string signalName,
        double weight,
        List<SignalGap> gaps)
    {
        if (signal.HasValue)
        {
            return weight;
        }

        gaps.Add(new SignalGap(signalName, weight, signal.Status, signal.FailureReason));
        return 0.0;
    }
}
```
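
The accumulation above (present signals contribute their weight, absent ones become gaps) can be mirrored in a few lines to sanity-check the arithmetic. `evaluate` below is a hypothetical Python stand-in for the `EvaluateSignal` loop, not the real API:

```python
def evaluate(signals, weights):
    """Sum the weights of present signals; record absent ones as gaps."""
    gaps, weighted_sum = [], 0.0
    for name, weight in weights.items():
        if signals.get(name) is not None:
            weighted_sum += weight
        else:
            gaps.append((name, weight))
    max_weight = sum(weights.values())
    # Clamp to [0, 1], matching Math.Clamp in the C# implementation.
    entropy = max(0.0, min(1.0, 1.0 - weighted_sum / max_weight))
    return entropy, gaps

weights = {"VEX": 0.25, "EPSS": 0.15, "Reachability": 0.25,
           "Runtime": 0.15, "Backport": 0.10, "SBOMLineage": 0.10}

# Only VEX and EPSS available: 0.40 of the weight is present.
e, gaps = evaluate({"VEX": "not_affected", "EPSS": 0.02}, weights)
assert abs(e - 0.60) < 1e-9
assert [g[0] for g in gaps] == ["Reachability", "Runtime", "Backport", "SBOMLineage"]
```

Each gap carries the weight it would have contributed, which is what lets callers rank which missing signal would reduce uncertainty the most.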

### IDecayedConfidenceCalculator Interface

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Calculates time-based confidence decay for evidence staleness.
/// </summary>
public interface IDecayedConfidenceCalculator
{
    /// <summary>
    /// Calculate decay for evidence age.
    /// </summary>
    /// <param name="lastSignalUpdate">When the last signal was updated.</param>
    /// <returns>Observation decay with multiplier and staleness flag.</returns>
    ObservationDecay Calculate(DateTimeOffset lastSignalUpdate);

    /// <summary>
    /// Calculate decay with a custom half-life and floor.
    /// </summary>
    /// <param name="lastSignalUpdate">When the last signal was updated.</param>
    /// <param name="halfLife">Custom half-life duration.</param>
    /// <param name="floor">Minimum confidence floor.</param>
    /// <returns>Observation decay with multiplier and staleness flag.</returns>
    ObservationDecay Calculate(DateTimeOffset lastSignalUpdate, TimeSpan halfLife, double floor);

    /// <summary>
    /// Apply decay multiplier to a confidence score.
    /// </summary>
    /// <param name="baseConfidence">Base confidence score [0.0-1.0].</param>
    /// <param name="decay">Decay calculation result.</param>
    /// <returns>Decayed confidence score.</returns>
    double ApplyDecay(double baseConfidence, ObservationDecay decay);
}
```

### DecayedConfidenceCalculator Implementation

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Applies exponential decay to confidence based on evidence staleness.
/// Formula: decayed = max(floor, exp(-ln(2) * age_days / half_life_days))
/// </summary>
public sealed class DecayedConfidenceCalculator : IDecayedConfidenceCalculator
{
    private readonly TimeProvider _timeProvider;
    private readonly DeterminizationOptions _options;
    private readonly ILogger<DecayedConfidenceCalculator> _logger;

    public DecayedConfidenceCalculator(
        TimeProvider timeProvider,
        IOptions<DeterminizationOptions> options,
        ILogger<DecayedConfidenceCalculator> logger)
    {
        _timeProvider = timeProvider;
        _options = options.Value;
        _logger = logger;
    }

    public ObservationDecay Calculate(DateTimeOffset lastSignalUpdate) =>
        Calculate(
            lastSignalUpdate,
            TimeSpan.FromDays(_options.DecayHalfLifeDays),
            _options.DecayFloor);

    public ObservationDecay Calculate(
        DateTimeOffset lastSignalUpdate,
        TimeSpan halfLife,
        double floor)
    {
        if (halfLife <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(halfLife), "Half-life must be positive");

        if (floor is < 0.0 or > 1.0)
            throw new ArgumentOutOfRangeException(nameof(floor), "Floor must be between 0.0 and 1.0");

        var now = _timeProvider.GetUtcNow();
        var ageDays = (now - lastSignalUpdate).TotalDays;

        double decayedMultiplier;
        if (ageDays <= 0)
        {
            // Evidence is fresh or from the future (clock skew)
            decayedMultiplier = 1.0;
        }
        else
        {
            // Exponential decay: e^(-ln(2) * t / t_half)
            var rawDecay = Math.Exp(-Math.Log(2) * ageDays / halfLife.TotalDays);
            decayedMultiplier = Math.Max(rawDecay, floor);
        }

        // Calculate next review time (when decay crosses the 50% threshold)
        var daysTo50Percent = halfLife.TotalDays;
        var nextReviewAt = lastSignalUpdate.AddDays(daysTo50Percent);

        // Stale threshold: at or below 50% of original
        var isStale = decayedMultiplier <= 0.5;

        var result = new ObservationDecay
        {
            HalfLife = halfLife,
            Floor = floor,
            LastSignalUpdate = lastSignalUpdate,
            DecayedMultiplier = decayedMultiplier,
            NextReviewAt = nextReviewAt,
            IsStale = isStale,
            AgeDays = Math.Max(0, ageDays)
        };

        _logger.LogDebug(
            "Calculated decay: age={AgeDays:F1}d, halfLife={HalfLife}d, multiplier={Multiplier:F3}, stale={IsStale}",
            ageDays,
            halfLife.TotalDays,
            decayedMultiplier,
            isStale);

        return result;
    }

    public double ApplyDecay(double baseConfidence, ObservationDecay decay)
    {
        if (baseConfidence is < 0.0 or > 1.0)
            throw new ArgumentOutOfRangeException(nameof(baseConfidence), "Confidence must be between 0.0 and 1.0");

        return baseConfidence * decay.DecayedMultiplier;
    }
}
```
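The decay formula can be sanity-checked by hand against the defaults proposed in this sprint (14-day half-life, 0.35 floor). The sketch below is illustrative only; `Decay` is a local helper, not part of the calculator API:

```csharp
// Worked example (illustrative): the decay formula at the default
// 14-day half-life and 0.35 floor, evaluated at a few ages.
double Decay(double ageDays, double halfLifeDays = 14, double floor = 0.35) =>
    ageDays <= 0 ? 1.0 : Math.Max(Math.Exp(-Math.Log(2) * ageDays / halfLifeDays), floor);

// Decay(0)  == 1.0   (fresh evidence)
// Decay(14) == 0.5   (exactly one half-life; flagged stale, since stale means <= 0.5)
// Decay(28) == 0.35  (raw value 0.25 is clamped up to the floor)
```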

### SignalWeights Configuration

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Configurable weights for signal contribution to completeness.
/// Weights should sum to 1.0 for normalized entropy.
/// </summary>
public sealed record SignalWeights
{
    /// <summary>VEX statement weight. Default: 0.25</summary>
    public double Vex { get; init; } = 0.25;

    /// <summary>EPSS score weight. Default: 0.15</summary>
    public double Epss { get; init; } = 0.15;

    /// <summary>Reachability analysis weight. Default: 0.25</summary>
    public double Reachability { get; init; } = 0.25;

    /// <summary>Runtime observation weight. Default: 0.15</summary>
    public double Runtime { get; init; } = 0.15;

    /// <summary>Fix backport detection weight. Default: 0.10</summary>
    public double Backport { get; init; } = 0.10;

    /// <summary>SBOM lineage weight. Default: 0.10</summary>
    public double SbomLineage { get; init; } = 0.10;

    /// <summary>Total weight (sum of all signals).</summary>
    public double TotalWeight =>
        Vex + Epss + Reachability + Runtime + Backport + SbomLineage;

    /// <summary>
    /// Returns normalized weights that sum to 1.0.
    /// </summary>
    public SignalWeights Normalize()
    {
        var total = TotalWeight;
        if (total <= 0)
            throw new InvalidOperationException("Total weight must be positive");

        if (Math.Abs(total - 1.0) < 0.0001)
            return this; // Already normalized

        return new SignalWeights
        {
            Vex = Vex / total,
            Epss = Epss / total,
            Reachability = Reachability / total,
            Runtime = Runtime / total,
            Backport = Backport / total,
            SbomLineage = SbomLineage / total
        };
    }

    /// <summary>
    /// Validates that all weights are non-negative and total is positive.
    /// </summary>
    public bool IsValid =>
        Vex >= 0 && Epss >= 0 && Reachability >= 0 &&
        Runtime >= 0 && Backport >= 0 && SbomLineage >= 0 &&
        TotalWeight > 0;

    /// <summary>
    /// Default weights per advisory recommendation.
    /// </summary>
    public static SignalWeights Default => new();

    /// <summary>
    /// Weights emphasizing VEX and reachability (for production).
    /// </summary>
    public static SignalWeights ProductionEmphasis => new()
    {
        Vex = 0.30,
        Epss = 0.15,
        Reachability = 0.30,
        Runtime = 0.10,
        Backport = 0.08,
        SbomLineage = 0.07
    };

    /// <summary>
    /// Weights emphasizing runtime signals (for observed environments).
    /// </summary>
    public static SignalWeights RuntimeEmphasis => new()
    {
        Vex = 0.20,
        Epss = 0.10,
        Reachability = 0.20,
        Runtime = 0.30,
        Backport = 0.10,
        SbomLineage = 0.10
    };
}
```
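`Normalize()` divides each weight by the total, so a misconfigured set still yields a proper distribution rather than skewing the entropy scale. An illustrative sketch using the record above:

```csharp
// Illustrative: weights that sum to 2.0 are scaled back to a distribution.
var weights = new SignalWeights
{
    Vex = 1.0, Epss = 0.5, Reachability = 0.5,
    Runtime = 0, Backport = 0, SbomLineage = 0
};
var normalized = weights.Normalize();
// normalized.Vex == 0.5, normalized.Epss == 0.25, normalized.Reachability == 0.25
// normalized.TotalWeight == 1.0; zero-valued weights stay zero
```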

### PriorDistribution for Missing Signals

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Prior distributions for missing signals.
/// Used when a signal is not available but we need a default assumption.
/// </summary>
public sealed record PriorDistribution
{
    /// <summary>
    /// Default prior for EPSS when not available.
    /// Median EPSS is ~0.04, so we use a conservative prior.
    /// </summary>
    public double EpssPrior { get; init; } = 0.10;

    /// <summary>
    /// Default prior for reachability when not analyzed.
    /// Conservative default: Unknown, treated as potentially reachable until analysis proves otherwise.
    /// </summary>
    public ReachabilityStatus ReachabilityPrior { get; init; } = ReachabilityStatus.Unknown;

    /// <summary>
    /// Default prior for KEV when not checked.
    /// Assume not in KEV (most CVEs are not).
    /// </summary>
    public bool KevPrior { get; init; } = false;

    /// <summary>
    /// Confidence in the prior values [0.0-1.0].
    /// Lower values indicate priors should be weighted less.
    /// </summary>
    public double PriorConfidence { get; init; } = 0.3;

    /// <summary>
    /// Default conservative priors.
    /// </summary>
    public static PriorDistribution Default => new();

    /// <summary>
    /// Pessimistic priors (assume worst case).
    /// </summary>
    public static PriorDistribution Pessimistic => new()
    {
        EpssPrior = 0.30,
        ReachabilityPrior = ReachabilityStatus.Reachable,
        KevPrior = false,
        PriorConfidence = 0.2
    };

    /// <summary>
    /// Optimistic priors (assume best case).
    /// </summary>
    public static PriorDistribution Optimistic => new()
    {
        EpssPrior = 0.02,
        ReachabilityPrior = ReachabilityStatus.Unreachable,
        KevPrior = false,
        PriorConfidence = 0.2
    };
}
```

### TrustScoreAggregator

```csharp
namespace StellaOps.Policy.Determinization.Scoring;

/// <summary>
/// Aggregates trust score from signal snapshot.
/// Combines signal values with weights to produce overall trust score.
/// </summary>
public interface ITrustScoreAggregator
{
    /// <summary>
    /// Calculate aggregate trust score from signals.
    /// </summary>
    /// <param name="snapshot">Signal snapshot.</param>
    /// <param name="priors">Priors for missing signals.</param>
    /// <returns>Trust score [0.0-1.0].</returns>
    double Calculate(SignalSnapshot snapshot, PriorDistribution? priors = null);
}

public sealed class TrustScoreAggregator : ITrustScoreAggregator
{
    private readonly SignalWeights _weights;
    private readonly PriorDistribution _defaultPriors;
    private readonly ILogger<TrustScoreAggregator> _logger;

    public TrustScoreAggregator(
        IOptions<DeterminizationOptions> options,
        ILogger<TrustScoreAggregator> logger)
    {
        _weights = options.Value.SignalWeights.Normalize();
        _defaultPriors = options.Value.Priors ?? PriorDistribution.Default;
        _logger = logger;
    }

    public double Calculate(SignalSnapshot snapshot, PriorDistribution? priors = null)
    {
        priors ??= _defaultPriors;
        var normalized = _weights.Normalize();

        var score = 0.0;

        // VEX contribution: high trust if not_affected with good issuer trust
        score += CalculateVexContribution(snapshot.Vex, priors) * normalized.Vex;

        // EPSS contribution: inverse (lower EPSS = higher trust)
        score += CalculateEpssContribution(snapshot.Epss, priors) * normalized.Epss;

        // Reachability contribution: high trust if unreachable
        score += CalculateReachabilityContribution(snapshot.Reachability, priors) * normalized.Reachability;

        // Runtime contribution: high trust if not observed loaded
        score += CalculateRuntimeContribution(snapshot.Runtime, priors) * normalized.Runtime;

        // Backport contribution: high trust if backport detected
        score += CalculateBackportContribution(snapshot.Backport, priors) * normalized.Backport;

        // SBOM lineage contribution: high trust if verified
        score += CalculateSbomContribution(snapshot.SbomLineage, priors) * normalized.SbomLineage;

        var result = Math.Clamp(score, 0.0, 1.0);

        _logger.LogDebug(
            "Calculated trust score for CVE {CveId}: {Score:F3}",
            snapshot.CveId,
            result);

        return result;
    }

    private static double CalculateVexContribution(SignalState<VexClaimSummary> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
            return priors.PriorConfidence * 0.5; // Uncertain

        var vex = signal.Value!;
        return vex.Status switch
        {
            "not_affected" => vex.IssuerTrust,
            "fixed" => vex.IssuerTrust * 0.9,
            "under_investigation" => 0.4,
            "affected" => 0.1,
            _ => 0.3
        };
    }

    private static double CalculateEpssContribution(SignalState<EpssEvidence> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
            return 1.0 - priors.EpssPrior; // Use prior

        // Inverse: low EPSS = high trust
        return 1.0 - signal.Value!.Score;
    }

    private static double CalculateReachabilityContribution(SignalState<ReachabilityEvidence> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
        {
            return priors.ReachabilityPrior switch
            {
                ReachabilityStatus.Unreachable => 0.9 * priors.PriorConfidence,
                ReachabilityStatus.Reachable => 0.1 * priors.PriorConfidence,
                _ => 0.5 * priors.PriorConfidence
            };
        }

        var reach = signal.Value!;
        return reach.Status switch
        {
            ReachabilityStatus.Unreachable => reach.Confidence,
            ReachabilityStatus.Gated => reach.Confidence * 0.6,
            ReachabilityStatus.Unknown => 0.4,
            ReachabilityStatus.Reachable => 0.1,
            ReachabilityStatus.ObservedReachable => 0.0,
            _ => 0.3
        };
    }

    private static double CalculateRuntimeContribution(SignalState<RuntimeEvidence> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
            return 0.5 * priors.PriorConfidence; // No runtime data

        return signal.Value!.ObservedLoaded ? 0.0 : 0.9;
    }

    private static double CalculateBackportContribution(SignalState<BackportEvidence> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
            return 0.5 * priors.PriorConfidence;

        return signal.Value!.BackportDetected ? signal.Value.Confidence : 0.3;
    }

    private static double CalculateSbomContribution(SignalState<SbomLineageEvidence> signal, PriorDistribution priors)
    {
        if (!signal.HasValue)
            return 0.5 * priors.PriorConfidence;

        var sbom = signal.Value!;
        var score = sbom.QualityScore;
        if (sbom.LineageVerified) score *= 1.1;
        if (sbom.HasProvenanceAttestation) score *= 1.1;
        return Math.Min(score, 1.0);
    }
}
```
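To make the weighted sum concrete, here is an illustrative hand calculation using the default weights and priors defined in this sprint (the signal values are invented for the example):

```csharp
// Illustrative arithmetic with default weights (Vex 0.25, Epss 0.15,
// Reachability 0.25, Runtime 0.15, Backport 0.10, SbomLineage 0.10)
// and default priors (PriorConfidence 0.3):
//
//   VEX not_affected, issuer trust 0.9  -> 0.90 * 0.25        = 0.2250
//   EPSS 0.02                           -> 0.98 * 0.15        = 0.1470
//   Reachability Unreachable, conf 0.8  -> 0.80 * 0.25        = 0.2000
//   Runtime missing                     -> 0.5 * 0.3 * 0.15   = 0.0225
//   Backport missing                    -> 0.5 * 0.3 * 0.10   = 0.0150
//   SBOM lineage missing                -> 0.5 * 0.3 * 0.10   = 0.0150
//                                          aggregate trust    ~ 0.6245
```

Note how the three missing signals drag the score down despite strong VEX and reachability evidence; that is the intended behavior, since absent signals should never be scored as favorably as verified ones.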

### DeterminizationOptions

```csharp
namespace StellaOps.Policy.Determinization;

/// <summary>
/// Configuration options for the Determinization subsystem.
/// </summary>
public sealed class DeterminizationOptions
{
    /// <summary>Configuration section name.</summary>
    public const string SectionName = "Determinization";

    /// <summary>EPSS score that triggers quarantine (block). Default: 0.4</summary>
    public double EpssQuarantineThreshold { get; set; } = 0.4;

    /// <summary>Trust score threshold for guarded allow. Default: 0.5</summary>
    public double GuardedAllowScoreThreshold { get; set; } = 0.5;

    /// <summary>Entropy threshold for guarded allow. Default: 0.4</summary>
    public double GuardedAllowEntropyThreshold { get; set; } = 0.4;

    /// <summary>Entropy threshold for production block. Default: 0.3</summary>
    public double ProductionBlockEntropyThreshold { get; set; } = 0.3;

    /// <summary>Half-life for evidence decay in days. Default: 14</summary>
    public int DecayHalfLifeDays { get; set; } = 14;

    /// <summary>Minimum confidence floor after decay. Default: 0.35</summary>
    public double DecayFloor { get; set; } = 0.35;

    /// <summary>Review interval for guarded observations in days. Default: 7</summary>
    public int GuardedReviewIntervalDays { get; set; } = 7;

    /// <summary>Maximum time in guarded state in days. Default: 30</summary>
    public int MaxGuardedDurationDays { get; set; } = 30;

    /// <summary>Signal weights for uncertainty calculation.</summary>
    public SignalWeights SignalWeights { get; set; } = new();

    /// <summary>Prior distributions for missing signals.</summary>
    public PriorDistribution? Priors { get; set; }

    /// <summary>Per-environment threshold overrides.</summary>
    public Dictionary<string, EnvironmentThresholds> EnvironmentThresholds { get; set; } = new();

    /// <summary>Enable detailed logging for debugging.</summary>
    public bool EnableDetailedLogging { get; set; } = false;
}

/// <summary>
/// Per-environment threshold configuration.
/// </summary>
public sealed record EnvironmentThresholds
{
    public DeploymentEnvironment Environment { get; init; }
    public double MinConfidenceForNotAffected { get; init; }
    public double MaxEntropyForAllow { get; init; }
    public double EpssBlockThreshold { get; init; }
    public bool RequireReachabilityForAllow { get; init; }
}
```
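These options bind from the `Determinization` configuration section. An illustrative `appsettings.json` fragment (values invented; unspecified properties fall back to the defaults above):

```json
{
  "Determinization": {
    "EpssQuarantineThreshold": 0.4,
    "DecayHalfLifeDays": 14,
    "DecayFloor": 0.35,
    "SignalWeights": {
      "Vex": 0.30,
      "Epss": 0.15,
      "Reachability": 0.30,
      "Runtime": 0.10,
      "Backport": 0.08,
      "SbomLineage": 0.07
    }
  }
}
```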

### ServiceCollectionExtensions

```csharp
namespace StellaOps.Policy.Determinization;

/// <summary>
/// DI registration for Determinization services.
/// </summary>
public static class ServiceCollectionExtensions
{
    /// <summary>
    /// Adds Determinization services to the DI container.
    /// </summary>
    public static IServiceCollection AddDeterminization(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        // Bind options
        services.AddOptions<DeterminizationOptions>()
            .Bind(configuration.GetSection(DeterminizationOptions.SectionName))
            .ValidateDataAnnotations()
            .ValidateOnStart();

        // Register services
        services.AddSingleton<IUncertaintyScoreCalculator, UncertaintyScoreCalculator>();
        services.AddSingleton<IDecayedConfidenceCalculator, DecayedConfidenceCalculator>();
        services.AddSingleton<ITrustScoreAggregator, TrustScoreAggregator>();

        return services;
    }

    /// <summary>
    /// Adds Determinization services with custom options.
    /// </summary>
    public static IServiceCollection AddDeterminization(
        this IServiceCollection services,
        Action<DeterminizationOptions> configure)
    {
        services.Configure(configure);
        services.PostConfigure<DeterminizationOptions>(options =>
        {
            // Validate and normalize weights
            if (!options.SignalWeights.IsValid)
                throw new OptionsValidationException(
                    nameof(DeterminizationOptions.SignalWeights),
                    typeof(SignalWeights),
                    new[] { "Signal weights must be non-negative and have positive total" });
        });

        services.AddSingleton<IUncertaintyScoreCalculator, UncertaintyScoreCalculator>();
        services.AddSingleton<IDecayedConfidenceCalculator, DecayedConfidenceCalculator>();
        services.AddSingleton<ITrustScoreAggregator, TrustScoreAggregator>();

        return services;
    }
}
```
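Registration is a single call at host startup. A minimal sketch of both overloads (pick one per host; the generic-host boilerplate is an assumption, not part of this sprint):

```csharp
var builder = WebApplication.CreateBuilder(args);

// Option A: bind from the "Determinization" configuration section,
// with data-annotation validation run at startup.
builder.Services.AddDeterminization(builder.Configuration);

// Option B: configure programmatically instead.
// builder.Services.AddDeterminization(options =>
// {
//     options.DecayHalfLifeDays = 7;
//     options.SignalWeights = SignalWeights.ProductionEmphasis;
// });
```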

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DCS-001 | TODO | DCM-030 | Guild | Create `Scoring/` directory structure |
| 2 | DCS-002 | TODO | DCS-001 | Guild | Implement `SignalWeights` record with presets |
| 3 | DCS-003 | TODO | DCS-002 | Guild | Implement `PriorDistribution` record with presets |
| 4 | DCS-004 | TODO | DCS-003 | Guild | Implement `IUncertaintyScoreCalculator` interface |
| 5 | DCS-005 | TODO | DCS-004 | Guild | Implement `UncertaintyScoreCalculator` with logging |
| 6 | DCS-006 | TODO | DCS-005 | Guild | Implement `IDecayedConfidenceCalculator` interface |
| 7 | DCS-007 | TODO | DCS-006 | Guild | Implement `DecayedConfidenceCalculator` with TimeProvider |
| 8 | DCS-008 | TODO | DCS-007 | Guild | Implement `ITrustScoreAggregator` interface |
| 9 | DCS-009 | TODO | DCS-008 | Guild | Implement `TrustScoreAggregator` with all signal types |
| 10 | DCS-010 | TODO | DCS-009 | Guild | Implement `EnvironmentThresholds` record |
| 11 | DCS-011 | TODO | DCS-010 | Guild | Implement `DeterminizationOptions` with validation |
| 12 | DCS-012 | TODO | DCS-011 | Guild | Implement `ServiceCollectionExtensions` for DI |
| 13 | DCS-013 | TODO | DCS-012 | Guild | Write unit tests: `SignalWeights.Normalize()` |
| 14 | DCS-014 | TODO | DCS-013 | Guild | Write unit tests: `UncertaintyScoreCalculator` entropy bounds |
| 15 | DCS-015 | TODO | DCS-014 | Guild | Write unit tests: `UncertaintyScoreCalculator` missing signals |
| 16 | DCS-016 | TODO | DCS-015 | Guild | Write unit tests: `DecayedConfidenceCalculator` half-life |
| 17 | DCS-017 | TODO | DCS-016 | Guild | Write unit tests: `DecayedConfidenceCalculator` floor |
| 18 | DCS-018 | TODO | DCS-017 | Guild | Write unit tests: `DecayedConfidenceCalculator` staleness |
| 19 | DCS-019 | TODO | DCS-018 | Guild | Write unit tests: `TrustScoreAggregator` signal combinations |
| 20 | DCS-020 | TODO | DCS-019 | Guild | Write unit tests: `TrustScoreAggregator` with priors |
| 21 | DCS-021 | TODO | DCS-020 | Guild | Write property tests: entropy always [0.0, 1.0] |
| 22 | DCS-022 | TODO | DCS-021 | Guild | Write property tests: decay monotonically decreasing |
| 23 | DCS-023 | TODO | DCS-022 | Guild | Write determinism tests: same snapshot same entropy |
| 24 | DCS-024 | TODO | DCS-023 | Guild | Integration test: DI registration with configuration |
| 25 | DCS-025 | TODO | DCS-024 | Guild | Add metrics: `stellaops_determinization_uncertainty_entropy` |
| 26 | DCS-026 | TODO | DCS-025 | Guild | Add metrics: `stellaops_determinization_decay_multiplier` |
| 27 | DCS-027 | TODO | DCS-026 | Guild | Document configuration options in architecture.md |
| 28 | DCS-028 | TODO | DCS-027 | Guild | Verify build with `dotnet build` |

## Acceptance Criteria

1. `UncertaintyScoreCalculator` produces entropy [0.0, 1.0] for any input
2. `DecayedConfidenceCalculator` correctly applies half-life formula
3. Decay never drops below configured floor
4. Missing signals correctly contribute to higher entropy
5. Signal weights are normalized before calculation
6. Priors are applied when signals are missing
7. All services registered in DI correctly
8. Configuration options validated at startup
9. Metrics emitted for observability

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| 14-day default half-life | Per advisory; shorter than existing 90-day gives more urgency |
| 0.35 floor | Consistent with existing FreshnessCalculator; prevents zero confidence |
| Normalized weights | Ensures entropy calculation is consistent regardless of weight scale |
| Conservative priors | Missing data assumes moderate risk, not best/worst case |

| Risk | Mitigation |
|------|------------|
| Calculation overhead | Cache results per snapshot; calculators are stateless |
| Weight misconfiguration | Validation at startup; presets for common scenarios |
| Clock skew affecting decay | Use TimeProvider abstraction; handle future timestamps gracefully |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |

## Next Checkpoints

- 2026-01-08: DCS-001 to DCS-012 complete (implementations)
- 2026-01-09: DCS-013 to DCS-023 complete (tests)
- 2026-01-10: DCS-024 to DCS-028 complete (metrics, docs)
@@ -0,0 +1,842 @@

# Sprint 20260106_001_002_SCANNER - Suppression Proof Model

## Topic & Scope

Implement `SuppressionWitness` - a DSSE-signable proof documenting why a vulnerability is **not affected**, complementing the existing `PathWitness`, which documents reachable paths.

- **Working directory:** `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/`
- **Evidence:** SuppressionWitness model, builder, signer, tests

## Problem Statement

The product advisory requires **proof objects for both outcomes**:

- If "affected": attach a *minimal counterexample path* (entrypoint -> vulnerable symbol) - **EXISTS: PathWitness**
- If "not affected": attach a *suppression proof* (e.g., dead code after linker GC; feature flag off; patched symbol diff) - **GAP**

Current state:

- `PathWitness` documents reachability (why code IS reachable)
- VEX status can be "not_affected" but lacks structured proof
- Gate detection (`DetectedGate`) shows mitigating controls but does not form a complete suppression proof
- No model exists for "why this vulnerability doesn't apply"

**Gap:** No `SuppressionWitness` model to document and attest why a vulnerability is not exploitable.

## Dependencies & Concurrency

- **Depends on:** None (extends existing Witnesses module)
- **Blocks:** SPRINT_20260106_001_001_LB (rationale renderer uses SuppressionWitness)
- **Parallel safe:** Extends existing module; no conflicts

## Documentation Prerequisites

- docs/modules/scanner/architecture.md
- src/Scanner/AGENTS.md
- Existing PathWitness implementation at `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/`

## Technical Design

### Suppression Types

```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;

/// <summary>
/// Classification of suppression reasons.
/// </summary>
public enum SuppressionType
{
    /// <summary>Vulnerable code is unreachable from any entry point.</summary>
    Unreachable,

    /// <summary>Vulnerable symbol was removed by linker garbage collection.</summary>
    LinkerGarbageCollected,

    /// <summary>Feature flag disables the vulnerable code path.</summary>
    FeatureFlagDisabled,

    /// <summary>Vulnerable symbol was patched (backport).</summary>
    PatchedSymbol,

    /// <summary>Runtime gate (authentication, validation) blocks exploitation.</summary>
    GateBlocked,

    /// <summary>Compile-time configuration excludes vulnerable code.</summary>
    CompileTimeExcluded,

    /// <summary>VEX statement from authoritative source declares not_affected.</summary>
    VexNotAffected,

    /// <summary>Binary does not contain the vulnerable function.</summary>
    FunctionAbsent,

    /// <summary>Version is outside the affected range.</summary>
    VersionNotAffected,

    /// <summary>Platform/architecture not vulnerable.</summary>
    PlatformNotAffected
}
```

### SuppressionWitness Model

```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;

/// <summary>
/// A DSSE-signable suppression witness documenting why a vulnerability is not exploitable.
/// Conforms to stellaops.suppression.v1 schema.
/// </summary>
public sealed record SuppressionWitness
{
    /// <summary>Schema version identifier.</summary>
    [JsonPropertyName("witness_schema")]
    public string WitnessSchema { get; init; } = SuppressionWitnessSchema.Version;

    /// <summary>Content-addressed witness ID (e.g., "sup:sha256:...").</summary>
    [JsonPropertyName("witness_id")]
    public required string WitnessId { get; init; }

    /// <summary>The artifact (SBOM, component) this witness relates to.</summary>
    [JsonPropertyName("artifact")]
    public required WitnessArtifact Artifact { get; init; }

    /// <summary>The vulnerability this witness concerns.</summary>
    [JsonPropertyName("vuln")]
    public required WitnessVuln Vuln { get; init; }

    /// <summary>Type of suppression.</summary>
    [JsonPropertyName("type")]
    public required SuppressionType Type { get; init; }

    /// <summary>Human-readable reason for suppression.</summary>
    [JsonPropertyName("reason")]
    public required string Reason { get; init; }

    /// <summary>Detailed evidence supporting the suppression.</summary>
    [JsonPropertyName("evidence")]
    public required SuppressionEvidence Evidence { get; init; }

    /// <summary>Confidence level (0.0 - 1.0).</summary>
    [JsonPropertyName("confidence")]
    public required double Confidence { get; init; }

    /// <summary>When this witness was generated (UTC ISO-8601).</summary>
    [JsonPropertyName("observed_at")]
    public required DateTimeOffset ObservedAt { get; init; }

    /// <summary>Optional expiration for time-bounded suppressions.</summary>
    [JsonPropertyName("expires_at")]
    public DateTimeOffset? ExpiresAt { get; init; }

    /// <summary>Additional metadata.</summary>
    [JsonPropertyName("metadata")]
    public IReadOnlyDictionary<string, string>? Metadata { get; init; }
}

/// <summary>
/// Evidence supporting a suppression claim.
/// </summary>
public sealed record SuppressionEvidence
{
    /// <summary>BLAKE3 digest of the call graph analyzed.</summary>
    [JsonPropertyName("callgraph_digest")]
    public string? CallgraphDigest { get; init; }

    /// <summary>Build identifier for the analyzed artifact.</summary>
    [JsonPropertyName("build_id")]
    public string? BuildId { get; init; }

    /// <summary>Linker map digest (for GC-based suppression).</summary>
    [JsonPropertyName("linker_map_digest")]
    public string? LinkerMapDigest { get; init; }

    /// <summary>Symbol that was expected but absent.</summary>
    [JsonPropertyName("absent_symbol")]
    public AbsentSymbolInfo? AbsentSymbol { get; init; }

    /// <summary>Patched symbol comparison.</summary>
    [JsonPropertyName("patched_symbol")]
    public PatchedSymbolInfo? PatchedSymbol { get; init; }

    /// <summary>Feature flag that disables the code path.</summary>
    [JsonPropertyName("feature_flag")]
    public FeatureFlagInfo? FeatureFlag { get; init; }

    /// <summary>Gates that block exploitation.</summary>
    [JsonPropertyName("blocking_gates")]
    public IReadOnlyList<DetectedGate>? BlockingGates { get; init; }

    /// <summary>VEX statement reference.</summary>
    [JsonPropertyName("vex_statement")]
    public VexStatementRef? VexStatement { get; init; }

    /// <summary>Version comparison evidence.</summary>
    [JsonPropertyName("version_comparison")]
    public VersionComparisonInfo? VersionComparison { get; init; }

    /// <summary>SHA-256 digest of the analysis configuration.</summary>
    [JsonPropertyName("analysis_config_digest")]
    public string? AnalysisConfigDigest { get; init; }
}

/// <summary>Information about an absent symbol.</summary>
public sealed record AbsentSymbolInfo
{
    [JsonPropertyName("symbol_id")]
    public required string SymbolId { get; init; }

    [JsonPropertyName("expected_in_version")]
    public required string ExpectedInVersion { get; init; }

    [JsonPropertyName("search_scope")]
    public required string SearchScope { get; init; }

    [JsonPropertyName("searched_binaries")]
    public IReadOnlyList<string>? SearchedBinaries { get; init; }
}

/// <summary>Information about a patched symbol.</summary>
public sealed record PatchedSymbolInfo
{
    [JsonPropertyName("symbol_id")]
    public required string SymbolId { get; init; }

    [JsonPropertyName("vulnerable_fingerprint")]
    public required string VulnerableFingerprint { get; init; }

    [JsonPropertyName("actual_fingerprint")]
    public required string ActualFingerprint { get; init; }

    [JsonPropertyName("similarity_score")]
    public required double SimilarityScore { get; init; }

    [JsonPropertyName("patch_source")]
    public string? PatchSource { get; init; }

    [JsonPropertyName("diff_summary")]
    public string? DiffSummary { get; init; }
}

/// <summary>Information about a disabling feature flag.</summary>
public sealed record FeatureFlagInfo
{
    [JsonPropertyName("flag_name")]
    public required string FlagName { get; init; }

    [JsonPropertyName("flag_value")]
    public required string FlagValue { get; init; }

    [JsonPropertyName("source")]
    public required string Source { get; init; }

    [JsonPropertyName("controls_symbol")]
    public string? ControlsSymbol { get; init; }
}

/// <summary>Reference to a VEX statement.</summary>
public sealed record VexStatementRef
{
    [JsonPropertyName("document_id")]
    public required string DocumentId { get; init; }

    [JsonPropertyName("statement_id")]
    public required string StatementId { get; init; }

    [JsonPropertyName("issuer")]
    public required string Issuer { get; init; }

    [JsonPropertyName("status")]
    public required string Status { get; init; }

    [JsonPropertyName("justification")]
    public string? Justification { get; init; }
}

/// <summary>Version comparison evidence.</summary>
public sealed record VersionComparisonInfo
{
    [JsonPropertyName("actual_version")]
    public required string ActualVersion { get; init; }

    [JsonPropertyName("affected_range")]
    public required string AffectedRange { get; init; }

    [JsonPropertyName("comparison_result")]
    public required string ComparisonResult { get; init; }
}
```

### SuppressionWitness Builder

```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;

/// <summary>
/// Builds suppression witnesses from analysis results.
/// </summary>
public interface ISuppressionWitnessBuilder
{
    /// <summary>
    /// Build a suppression witness for unreachable code.
    /// </summary>
    SuppressionWitness BuildUnreachable(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        string callgraphDigest,
        string reason);

    /// <summary>
    /// Build a suppression witness for a patched symbol.
    /// </summary>
    SuppressionWitness BuildPatchedSymbol(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        PatchedSymbolInfo patchInfo);

    /// <summary>
    /// Build a suppression witness for an absent function.
    /// </summary>
    SuppressionWitness BuildFunctionAbsent(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        AbsentSymbolInfo absentInfo);

    /// <summary>
    /// Build a suppression witness for a gate-blocked path.
    /// </summary>
    SuppressionWitness BuildGateBlocked(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        IReadOnlyList<DetectedGate> blockingGates);

    /// <summary>
    /// Build a suppression witness for a disabled feature flag.
    /// </summary>
    SuppressionWitness BuildFeatureFlagDisabled(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        FeatureFlagInfo flagInfo);

    /// <summary>
    /// Build a suppression witness from a VEX not_affected statement.
    /// </summary>
    SuppressionWitness BuildFromVexStatement(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        VexStatementRef vexStatement);

    /// <summary>
    /// Build a suppression witness for a version outside the affected range.
    /// </summary>
    SuppressionWitness BuildVersionNotAffected(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        VersionComparisonInfo versionInfo);
}

public sealed class SuppressionWitnessBuilder : ISuppressionWitnessBuilder
{
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<SuppressionWitnessBuilder> _logger;

    public SuppressionWitnessBuilder(
        TimeProvider timeProvider,
        ILogger<SuppressionWitnessBuilder> logger)
    {
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public SuppressionWitness BuildUnreachable(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        string callgraphDigest,
        string reason)
    {
        var evidence = new SuppressionEvidence
        {
            CallgraphDigest = callgraphDigest
        };

        return Build(
            artifact,
            vuln,
            SuppressionType.Unreachable,
            reason,
            evidence,
            confidence: 0.95);
    }

    public SuppressionWitness BuildPatchedSymbol(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        PatchedSymbolInfo patchInfo)
    {
        var evidence = new SuppressionEvidence
        {
            PatchedSymbol = patchInfo
        };

        var reason = $"Symbol `{patchInfo.SymbolId}` differs from vulnerable version " +
                     $"(similarity: {patchInfo.SimilarityScore:P1})";

        // Confidence based on similarity: lower similarity = higher confidence it's patched.
        var confidence = 1.0 - patchInfo.SimilarityScore;

        return Build(
            artifact,
            vuln,
            SuppressionType.PatchedSymbol,
            reason,
            evidence,
            confidence);
    }

    public SuppressionWitness BuildFunctionAbsent(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        AbsentSymbolInfo absentInfo)
    {
        var evidence = new SuppressionEvidence
        {
            AbsentSymbol = absentInfo
        };

        var reason = $"Vulnerable symbol `{absentInfo.SymbolId}` not found in binary";

        return Build(
            artifact,
            vuln,
            SuppressionType.FunctionAbsent,
            reason,
            evidence,
            confidence: 0.90);
    }

    public SuppressionWitness BuildGateBlocked(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        IReadOnlyList<DetectedGate> blockingGates)
    {
        var evidence = new SuppressionEvidence
        {
            BlockingGates = blockingGates
        };

        var gateTypes = string.Join(", ", blockingGates.Select(g => g.Type).Distinct());
        var reason = $"Exploitation blocked by gates: {gateTypes}";

        // Confidence is the minimum confidence across the blocking gates.
        var confidence = blockingGates.Min(g => g.Confidence);

        return Build(
            artifact,
            vuln,
            SuppressionType.GateBlocked,
            reason,
            evidence,
            confidence);
    }

    public SuppressionWitness BuildFeatureFlagDisabled(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        FeatureFlagInfo flagInfo)
    {
        var evidence = new SuppressionEvidence
        {
            FeatureFlag = flagInfo
        };

        var reason = $"Feature flag `{flagInfo.FlagName}` = `{flagInfo.FlagValue}` disables vulnerable code path";

        return Build(
            artifact,
            vuln,
            SuppressionType.FeatureFlagDisabled,
            reason,
            evidence,
            confidence: 0.85);
    }

    public SuppressionWitness BuildFromVexStatement(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        VexStatementRef vexStatement)
    {
        var evidence = new SuppressionEvidence
        {
            VexStatement = vexStatement
        };

        var reason = vexStatement.Justification
            ?? $"VEX statement from {vexStatement.Issuer} declares not_affected";

        return Build(
            artifact,
            vuln,
            SuppressionType.VexNotAffected,
            reason,
            evidence,
            confidence: 0.95);
    }

    public SuppressionWitness BuildVersionNotAffected(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        VersionComparisonInfo versionInfo)
    {
        var evidence = new SuppressionEvidence
        {
            VersionComparison = versionInfo
        };

        var reason = $"Version {versionInfo.ActualVersion} is outside affected range {versionInfo.AffectedRange}";

        return Build(
            artifact,
            vuln,
            SuppressionType.VersionNotAffected,
            reason,
            evidence,
            confidence: 0.99);
    }

    private SuppressionWitness Build(
        WitnessArtifact artifact,
        WitnessVuln vuln,
        SuppressionType type,
        string reason,
        SuppressionEvidence evidence,
        double confidence)
    {
        var observedAt = _timeProvider.GetUtcNow();

        var witness = new SuppressionWitness
        {
            WitnessId = "", // Computed below
            Artifact = artifact,
            Vuln = vuln,
            Type = type,
            Reason = reason,
            Evidence = evidence,
            Confidence = Math.Round(confidence, 4),
            ObservedAt = observedAt
        };

        // Compute content-addressed ID
        var witnessId = ComputeWitnessId(witness);
        witness = witness with { WitnessId = witnessId };

        _logger.LogDebug(
            "Built suppression witness {WitnessId} for {VulnId} on {Component}: {Type}",
            witnessId, vuln.Id, artifact.ComponentPurl, type);

        return witness;
    }

    private static string ComputeWitnessId(SuppressionWitness witness)
    {
        var canonical = CanonicalJsonSerializer.Serialize(new
        {
            artifact = witness.Artifact,
            vuln = witness.Vuln,
            type = witness.Type.ToString(),
            reason = witness.Reason,
            evidence_callgraph = witness.Evidence.CallgraphDigest,
            evidence_build_id = witness.Evidence.BuildId,
            evidence_patched = witness.Evidence.PatchedSymbol?.ActualFingerprint,
            evidence_vex = witness.Evidence.VexStatement?.StatementId
        });

        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return $"sup:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
    }
}
```
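The content-addressed ID scheme can be sketched in Python as well. This is an illustration only: it assumes canonical JSON means sorted keys with compact separators, which may differ from the project's `CanonicalJsonSerializer` rules, and the field projection is simplified.

```python
import hashlib
import json

def compute_witness_id(witness: dict) -> str:
    """Derive a content-addressed witness ID from identity-bearing fields:
    canonicalize a projection of the witness, SHA-256 it, prefix the scheme."""
    projection = {
        "artifact": witness["artifact"],
        "vuln": witness["vuln"],
        "type": witness["type"],
        "reason": witness["reason"],
    }
    # Sorted keys + compact separators approximate canonical JSON.
    canonical = json.dumps(projection, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sup:sha256:{digest}"

w = {"artifact": "pkg:deb/openssl@3.0.2", "vuln": "CVE-2026-0001",
     "type": "Unreachable", "reason": "no path"}
# Same inputs (in any key order) yield the same ID.
assert compute_witness_id(w) == compute_witness_id(dict(reversed(list(w.items()))))
```

Because the ID depends only on the serialized projection, two builders running on different hosts produce identical IDs for identical evidence, which is what enables caching and deduplication.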

### DSSE Signing

```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;

/// <summary>
/// Signs suppression witnesses with DSSE.
/// </summary>
public interface ISuppressionDsseSigner
{
    /// <summary>
    /// Sign a suppression witness.
    /// </summary>
    Task<DsseEnvelope> SignAsync(
        SuppressionWitness witness,
        string keyId,
        CancellationToken ct = default);

    /// <summary>
    /// Verify a signed suppression witness.
    /// </summary>
    Task<bool> VerifyAsync(
        DsseEnvelope envelope,
        CancellationToken ct = default);
}

public sealed class SuppressionDsseSigner : ISuppressionDsseSigner
{
    public const string PredicateType = "stellaops.dev/predicates/suppression-witness@v1";

    private readonly ISigningService _signingService;
    private readonly ILogger<SuppressionDsseSigner> _logger;

    public SuppressionDsseSigner(
        ISigningService signingService,
        ILogger<SuppressionDsseSigner> logger)
    {
        _signingService = signingService;
        _logger = logger;
    }

    public async Task<DsseEnvelope> SignAsync(
        SuppressionWitness witness,
        string keyId,
        CancellationToken ct = default)
    {
        var payload = CanonicalJsonSerializer.Serialize(witness);
        var payloadBytes = Encoding.UTF8.GetBytes(payload);

        var pae = DsseHelper.ComputePreAuthenticationEncoding(
            PredicateType,
            payloadBytes);

        var signature = await _signingService.SignAsync(
            pae,
            keyId,
            ct);

        var envelope = new DsseEnvelope
        {
            PayloadType = PredicateType,
            Payload = Convert.ToBase64String(payloadBytes),
            Signatures =
            [
                new DsseSignature
                {
                    KeyId = keyId,
                    Sig = Convert.ToBase64String(signature)
                }
            ]
        };

        _logger.LogInformation(
            "Signed suppression witness {WitnessId} with key {KeyId}",
            witness.WitnessId, keyId);

        return envelope;
    }

    public async Task<bool> VerifyAsync(
        DsseEnvelope envelope,
        CancellationToken ct = default)
    {
        if (envelope.PayloadType != PredicateType)
        {
            _logger.LogWarning(
                "Invalid payload type: expected {Expected}, got {Actual}",
                PredicateType, envelope.PayloadType);
            return false;
        }

        var payloadBytes = Convert.FromBase64String(envelope.Payload);
        var pae = DsseHelper.ComputePreAuthenticationEncoding(
            PredicateType,
            payloadBytes);

        foreach (var sig in envelope.Signatures)
        {
            var signatureBytes = Convert.FromBase64String(sig.Sig);
            var valid = await _signingService.VerifyAsync(
                pae,
                signatureBytes,
                sig.KeyId,
                ct);

            if (!valid)
            {
                _logger.LogWarning(
                    "Signature verification failed for key {KeyId}",
                    sig.KeyId);
                return false;
            }
        }

        return true;
    }
}
```
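The `DsseHelper.ComputePreAuthenticationEncoding` call above is assumed to implement the standard DSSE v1 pre-authentication encoding (PAE), which the signature is computed over rather than the raw payload. A minimal Python sketch of the spec's definition (not the project's helper):

```python
def dsse_pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE v1 PAE: "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload,
    where LEN is the ASCII decimal byte length."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload)).encode("ascii"),
        payload,
    ])

pae = dsse_pae("stellaops.dev/predicates/suppression-witness@v1", b"{}")
# Length-prefixing makes the encoding unambiguous, so a signature over
# one (type, payload) pair cannot be replayed against another.
```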

### Integration with Reachability Evaluator

```csharp
namespace StellaOps.Scanner.Reachability.Stack;

public sealed class ReachabilityStackEvaluator
{
    private readonly ISuppressionWitnessBuilder _suppressionBuilder;
    // ... existing dependencies

    /// <summary>
    /// Evaluate reachability and produce either PathWitness (affected) or SuppressionWitness (not affected).
    /// </summary>
    public async Task<ReachabilityResult> EvaluateAsync(
        RichGraph graph,
        WitnessArtifact artifact,
        WitnessVuln vuln,
        string targetSymbol,
        CancellationToken ct = default)
    {
        // L1: Static analysis
        var staticResult = await EvaluateStaticReachabilityAsync(graph, targetSymbol, ct);

        if (staticResult.Verdict == ReachabilityVerdict.Unreachable)
        {
            var suppression = _suppressionBuilder.BuildUnreachable(
                artifact,
                vuln,
                staticResult.CallgraphDigest,
                "No path from any entry point to vulnerable symbol");

            return ReachabilityResult.NotAffected(suppression);
        }

        // L2: Binary resolution
        var binaryResult = await EvaluateBinaryResolutionAsync(artifact, targetSymbol, ct);

        if (binaryResult.FunctionAbsent)
        {
            var suppression = _suppressionBuilder.BuildFunctionAbsent(
                artifact,
                vuln,
                binaryResult.AbsentSymbolInfo!);

            return ReachabilityResult.NotAffected(suppression);
        }

        if (binaryResult.IsPatched)
        {
            var suppression = _suppressionBuilder.BuildPatchedSymbol(
                artifact,
                vuln,
                binaryResult.PatchedSymbolInfo!);

            return ReachabilityResult.NotAffected(suppression);
        }

        // L3: Runtime gating
        var gateResult = await EvaluateGatesAsync(graph, staticResult.Path!, ct);

        if (gateResult.AllPathsBlocked)
        {
            var suppression = _suppressionBuilder.BuildGateBlocked(
                artifact,
                vuln,
                gateResult.BlockingGates);

            return ReachabilityResult.NotAffected(suppression);
        }

        // Reachable - build PathWitness
        var pathWitness = await _pathWitnessBuilder.BuildAsync(
            artifact,
            vuln,
            staticResult.Path!,
            gateResult.DetectedGates,
            ct);

        return ReachabilityResult.Affected(pathWitness);
    }
}

public sealed record ReachabilityResult
{
    public required ReachabilityVerdict Verdict { get; init; }
    public PathWitness? PathWitness { get; init; }
    public SuppressionWitness? SuppressionWitness { get; init; }

    public static ReachabilityResult Affected(PathWitness witness) =>
        new() { Verdict = ReachabilityVerdict.Affected, PathWitness = witness };

    public static ReachabilityResult NotAffected(SuppressionWitness witness) =>
        new() { Verdict = ReachabilityVerdict.NotAffected, SuppressionWitness = witness };
}

public enum ReachabilityVerdict
{
    Affected,
    NotAffected,
    Unknown
}
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | SUP-001 | TODO | - | - | Define `SuppressionType` enum |
| 2 | SUP-002 | TODO | SUP-001 | - | Define `SuppressionWitness` record |
| 3 | SUP-003 | TODO | SUP-002 | - | Define `SuppressionEvidence` and sub-records |
| 4 | SUP-004 | TODO | SUP-003 | - | Define `SuppressionWitnessSchema` version |
| 5 | SUP-005 | TODO | SUP-004 | - | Define `ISuppressionWitnessBuilder` interface |
| 6 | SUP-006 | TODO | SUP-005 | - | Implement `SuppressionWitnessBuilder.BuildUnreachable()` |
| 7 | SUP-007 | TODO | SUP-006 | - | Implement `SuppressionWitnessBuilder.BuildPatchedSymbol()` |
| 8 | SUP-008 | TODO | SUP-007 | - | Implement `SuppressionWitnessBuilder.BuildFunctionAbsent()` |
| 9 | SUP-009 | TODO | SUP-008 | - | Implement `SuppressionWitnessBuilder.BuildGateBlocked()` |
| 10 | SUP-010 | TODO | SUP-009 | - | Implement `SuppressionWitnessBuilder.BuildFeatureFlagDisabled()` |
| 11 | SUP-011 | TODO | SUP-010 | - | Implement `SuppressionWitnessBuilder.BuildFromVexStatement()` |
| 12 | SUP-012 | TODO | SUP-011 | - | Implement `SuppressionWitnessBuilder.BuildVersionNotAffected()` |
| 13 | SUP-013 | TODO | SUP-012 | - | Implement content-addressed witness ID computation |
| 14 | SUP-014 | TODO | SUP-013 | - | Define `ISuppressionDsseSigner` interface |
| 15 | SUP-015 | TODO | SUP-014 | - | Implement `SuppressionDsseSigner.SignAsync()` |
| 16 | SUP-016 | TODO | SUP-015 | - | Implement `SuppressionDsseSigner.VerifyAsync()` |
| 17 | SUP-017 | TODO | SUP-016 | - | Create `ReachabilityResult` unified result type |
| 18 | SUP-018 | TODO | SUP-017 | - | Integrate `SuppressionWitnessBuilder` into `ReachabilityStackEvaluator` |
| 19 | SUP-019 | TODO | SUP-018 | - | Add service registration extensions |
| 20 | SUP-020 | TODO | SUP-019 | - | Write unit tests: `SuppressionWitnessBuilder` (all types) |
| 21 | SUP-021 | TODO | SUP-020 | - | Write unit tests: `SuppressionDsseSigner` |
| 22 | SUP-022 | TODO | SUP-021 | - | Write unit tests: `ReachabilityStackEvaluator` with suppression |
| 23 | SUP-023 | TODO | SUP-022 | - | Write golden fixture tests for witness serialization |
| 24 | SUP-024 | TODO | SUP-023 | - | Write property tests: witness ID determinism |
| 25 | SUP-025 | TODO | SUP-024 | - | Add JSON schema for `SuppressionWitness` (stellaops.suppression.v1) |
| 26 | SUP-026 | TODO | SUP-025 | - | Document suppression types in docs/modules/scanner/ |
| 27 | SUP-027 | TODO | SUP-026 | - | Expose suppression witnesses via Scanner.WebService API |

## Acceptance Criteria

1. **Completeness:** Every suppression type has a dedicated builder method
2. **DSSE Signing:** All suppression witnesses are signable with DSSE
3. **Determinism:** Same inputs produce identical witness IDs (content-addressed)
4. **Schema:** JSON schema registered at `stellaops.suppression.v1`
5. **Integration:** `ReachabilityStackEvaluator` returns a `SuppressionWitness` for not-affected findings
6. **Test Coverage:** Unit tests for all builder methods, property tests for determinism

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| 10 suppression types | Covers all common not-affected scenarios per advisory |
| Content-addressed IDs | Enables caching and deduplication |
| Confidence scores | Different evidence has different reliability |
| Optional expiration | Some suppressions are time-bounded (e.g., pending patches) |

| Risk | Mitigation |
|------|------------|
| False suppression | Confidence thresholds; manual review for low confidence |
| Missing suppression type | Extensible enum; new types can be added |
| Complex evidence | Structured sub-records for each type |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |

@@ -0,0 +1,962 @@

# Sprint 20260106_001_003_BINDEX - Symbol Table Diff

## Topic & Scope

Extend `PatchDiffEngine` with symbol table comparison capabilities to track exported/imported symbol changes, version maps, and GOT/PLT table modifications between binary versions.

- **Working directory:** `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`
- **Evidence:** SymbolTableDiff model, analyzer, tests, integration with MaterialChange

## Problem Statement

The product advisory requires **per-layer diffs** including:

> **Symbols:** exported symbols and version maps; highlight ABI-relevant changes.

Current state:

- `PatchDiffEngine` compares **function bodies** (fingerprints, CFG, basic blocks)
- `DeltaSignatureGenerator` creates CVE signatures at function level
- No comparison of:
  - Exported symbol table (.dynsym, .symtab)
  - Imported symbols and version requirements (.gnu.version_r)
  - Symbol versioning maps (.gnu.version, .gnu.version_d)
  - GOT/PLT entries (dynamic linking)
  - Relocation entries

**Gap:** Symbol-level changes between binaries are not detected or reported.
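Classifying exported symbols into added/removed/unchanged buckets — the core of the planned `SymbolChangeSummary` — reduces to set operations over the two symbol tables. A minimal sketch with hypothetical symbol names (rename and modification detection need per-symbol metadata and are omitted):

```python
def diff_symbols(base: set[str], target: set[str]) -> dict:
    """Bucket symbol names the way SymbolChangeCounts does."""
    return {
        "added": sorted(target - base),        # present only in the target binary
        "removed": sorted(base - target),      # present only in the base binary
        "unchanged": len(base & target),
        "total_base": len(base),
        "total_target": len(target),
    }

base = {"EVP_EncryptInit", "SSL_read", "ssl_vulnerable_helper"}
target = {"EVP_EncryptInit", "SSL_read", "ssl_patched_helper"}
result = diff_symbols(base, target)
# result["removed"] == ["ssl_vulnerable_helper"], result["added"] == ["ssl_patched_helper"]
```

A removed export is a candidate ABI break; an added export never is, which is why the breaking-change assessment below weighs the buckets asymmetrically.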

## Dependencies & Concurrency

- **Depends on:** StellaOps.BinaryIndex.Disassembly (for ELF/PE parsing)
- **Blocks:** SPRINT_20260106_001_004_LB (orchestrator uses symbol diffs)
- **Parallel safe:** Extends existing module; no conflicts

## Documentation Prerequisites

- docs/modules/binary-index/architecture.md
- src/BinaryIndex/AGENTS.md
- Existing PatchDiffEngine at `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`

## Technical Design

### Data Contracts

```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;

/// <summary>
/// Complete symbol table diff between two binaries.
/// </summary>
public sealed record SymbolTableDiff
{
    /// <summary>Content-addressed diff ID.</summary>
    [JsonPropertyName("diff_id")]
    public required string DiffId { get; init; }

    /// <summary>Base binary identity.</summary>
    [JsonPropertyName("base")]
    public required BinaryRef Base { get; init; }

    /// <summary>Target binary identity.</summary>
    [JsonPropertyName("target")]
    public required BinaryRef Target { get; init; }

    /// <summary>Exported symbol changes.</summary>
    [JsonPropertyName("exports")]
    public required SymbolChangeSummary Exports { get; init; }

    /// <summary>Imported symbol changes.</summary>
    [JsonPropertyName("imports")]
    public required SymbolChangeSummary Imports { get; init; }

    /// <summary>Version map changes.</summary>
    [JsonPropertyName("versions")]
    public required VersionMapDiff Versions { get; init; }

    /// <summary>GOT/PLT changes (dynamic linking).</summary>
    [JsonPropertyName("dynamic")]
    public DynamicLinkingDiff? Dynamic { get; init; }

    /// <summary>Overall ABI compatibility assessment.</summary>
    [JsonPropertyName("abi_compatibility")]
    public required AbiCompatibility AbiCompatibility { get; init; }

    /// <summary>When this diff was computed (UTC).</summary>
    [JsonPropertyName("computed_at")]
    public required DateTimeOffset ComputedAt { get; init; }
}

/// <summary>Reference to a binary.</summary>
public sealed record BinaryRef
{
    [JsonPropertyName("path")]
    public required string Path { get; init; }

    [JsonPropertyName("sha256")]
    public required string Sha256 { get; init; }

    [JsonPropertyName("build_id")]
    public string? BuildId { get; init; }

    [JsonPropertyName("architecture")]
    public required string Architecture { get; init; }
}

/// <summary>Summary of symbol changes.</summary>
public sealed record SymbolChangeSummary
{
    [JsonPropertyName("added")]
    public required IReadOnlyList<SymbolChange> Added { get; init; }

    [JsonPropertyName("removed")]
    public required IReadOnlyList<SymbolChange> Removed { get; init; }

    [JsonPropertyName("modified")]
    public required IReadOnlyList<SymbolModification> Modified { get; init; }

    [JsonPropertyName("renamed")]
    public required IReadOnlyList<SymbolRename> Renamed { get; init; }

    /// <summary>Count summaries.</summary>
    [JsonPropertyName("counts")]
    public required SymbolChangeCounts Counts { get; init; }
}

public sealed record SymbolChangeCounts
{
    [JsonPropertyName("added")]
    public int Added { get; init; }

    [JsonPropertyName("removed")]
    public int Removed { get; init; }

    [JsonPropertyName("modified")]
    public int Modified { get; init; }

    [JsonPropertyName("renamed")]
    public int Renamed { get; init; }

    [JsonPropertyName("unchanged")]
    public int Unchanged { get; init; }

    [JsonPropertyName("total_base")]
    public int TotalBase { get; init; }

    [JsonPropertyName("total_target")]
    public int TotalTarget { get; init; }
}

/// <summary>A single symbol change.</summary>
public sealed record SymbolChange
{
    [JsonPropertyName("name")]
    public required string Name { get; init; }

    [JsonPropertyName("demangled")]
    public string? Demangled { get; init; }

    [JsonPropertyName("type")]
    public required SymbolType Type { get; init; }

    [JsonPropertyName("binding")]
    public required SymbolBinding Binding { get; init; }

    [JsonPropertyName("visibility")]
    public required SymbolVisibility Visibility { get; init; }

    [JsonPropertyName("version")]
    public string? Version { get; init; }

    [JsonPropertyName("address")]
    public ulong? Address { get; init; }

    [JsonPropertyName("size")]
    public ulong? Size { get; init; }

    [JsonPropertyName("section")]
    public string? Section { get; init; }
}

/// <summary>A symbol that was modified.</summary>
public sealed record SymbolModification
{
    [JsonPropertyName("name")]
    public required string Name { get; init; }

    [JsonPropertyName("demangled")]
    public string? Demangled { get; init; }

    [JsonPropertyName("changes")]
    public required IReadOnlyList<SymbolFieldChange> Changes { get; init; }

    [JsonPropertyName("abi_breaking")]
    public bool AbiBreaking { get; init; }
}

public sealed record SymbolFieldChange
{
    [JsonPropertyName("field")]
    public required string Field { get; init; }

    [JsonPropertyName("old_value")]
    public required string OldValue { get; init; }

    [JsonPropertyName("new_value")]
    public required string NewValue { get; init; }
}

/// <summary>A symbol that was renamed.</summary>
public sealed record SymbolRename
{
    [JsonPropertyName("old_name")]
    public required string OldName { get; init; }

    [JsonPropertyName("new_name")]
    public required string NewName { get; init; }

    [JsonPropertyName("confidence")]
    public required double Confidence { get; init; }

    [JsonPropertyName("reason")]
    public required string Reason { get; init; }
}

public enum SymbolType
{
    Function,
    Object,
    TlsObject,
    Section,
    File,
    Common,
    Indirect,
    Unknown
}

public enum SymbolBinding
{
    Local,
    Global,
    Weak,
    Unknown
}

public enum SymbolVisibility
{
    Default,
    Internal,
    Hidden,
    Protected
}

/// <summary>Version map changes.</summary>
public sealed record VersionMapDiff
{
    /// <summary>Version definitions added.</summary>
    [JsonPropertyName("definitions_added")]
    public required IReadOnlyList<VersionDefinition> DefinitionsAdded { get; init; }

    /// <summary>Version definitions removed.</summary>
    [JsonPropertyName("definitions_removed")]
    public required IReadOnlyList<VersionDefinition> DefinitionsRemoved { get; init; }

    /// <summary>Version requirements added.</summary>
    [JsonPropertyName("requirements_added")]
    public required IReadOnlyList<VersionRequirement> RequirementsAdded { get; init; }

    /// <summary>Version requirements removed.</summary>
    [JsonPropertyName("requirements_removed")]
    public required IReadOnlyList<VersionRequirement> RequirementsRemoved { get; init; }

    /// <summary>Symbols with version changes.</summary>
    [JsonPropertyName("symbol_version_changes")]
    public required IReadOnlyList<SymbolVersionChange> SymbolVersionChanges { get; init; }
}

public sealed record VersionDefinition
{
    [JsonPropertyName("name")]
    public required string Name { get; init; }

    [JsonPropertyName("index")]
    public int Index { get; init; }

    [JsonPropertyName("predecessors")]
    public IReadOnlyList<string>? Predecessors { get; init; }
}

public sealed record VersionRequirement
{
    [JsonPropertyName("library")]
    public required string Library { get; init; }

    [JsonPropertyName("version")]
    public required string Version { get; init; }

    [JsonPropertyName("symbols")]
    public IReadOnlyList<string>? Symbols { get; init; }
}

public sealed record SymbolVersionChange
{
    [JsonPropertyName("symbol")]
    public required string Symbol { get; init; }

    [JsonPropertyName("old_version")]
    public required string OldVersion { get; init; }

    [JsonPropertyName("new_version")]
    public required string NewVersion { get; init; }
}

/// <summary>Dynamic linking changes (GOT/PLT).</summary>
public sealed record DynamicLinkingDiff
{
    /// <summary>GOT entries added.</summary>
    [JsonPropertyName("got_added")]
    public required IReadOnlyList<GotEntry> GotAdded { get; init; }

    /// <summary>GOT entries removed.</summary>
    [JsonPropertyName("got_removed")]
    public required IReadOnlyList<GotEntry> GotRemoved { get; init; }

    /// <summary>PLT entries added.</summary>
    [JsonPropertyName("plt_added")]
    public required IReadOnlyList<PltEntry> PltAdded { get; init; }

    /// <summary>PLT entries removed.</summary>
    [JsonPropertyName("plt_removed")]
    public required IReadOnlyList<PltEntry> PltRemoved { get; init; }

    /// <summary>Relocation changes.</summary>
    [JsonPropertyName("relocation_changes")]
    public IReadOnlyList<RelocationChange>? RelocationChanges { get; init; }
}

public sealed record GotEntry
{
    [JsonPropertyName("symbol")]
    public required string Symbol { get; init; }

    [JsonPropertyName("offset")]
    public ulong Offset { get; init; }
}

public sealed record PltEntry
{
    [JsonPropertyName("symbol")]
    public required string Symbol { get; init; }

    [JsonPropertyName("address")]
    public ulong Address { get; init; }
}

public sealed record RelocationChange
{
    [JsonPropertyName("type")]
    public required string Type { get; init; }

    [JsonPropertyName("symbol")]
    public required string Symbol { get; init; }

    [JsonPropertyName("change_kind")]
    public required string ChangeKind { get; init; }
}

/// <summary>ABI compatibility assessment.</summary>
public sealed record AbiCompatibility
{
    [JsonPropertyName("level")]
    public required AbiCompatibilityLevel Level { get; init; }

    [JsonPropertyName("breaking_changes")]
    public required IReadOnlyList<AbiBreakingChange> BreakingChanges { get; init; }

    [JsonPropertyName("score")]
    public required double Score { get; init; }
}

public enum AbiCompatibilityLevel
{
    /// <summary>Fully backward compatible.</summary>
    Compatible,

    /// <summary>Minor changes, likely compatible.</summary>
    MinorChanges,

    /// <summary>Breaking changes detected.</summary>
    Breaking,

    /// <summary>Cannot determine compatibility.</summary>
    Unknown
}

public sealed record AbiBreakingChange
{
    [JsonPropertyName("category")]
    public required string Category { get; init; }

    [JsonPropertyName("symbol")]
    public required string Symbol { get; init; }

    [JsonPropertyName("description")]
    public required string Description { get; init; }

    [JsonPropertyName("severity")]
    public required string Severity { get; init; }
}
```
|
||||
|
||||
### Symbol Table Analyzer Interface
|
||||
|
||||
```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;

/// <summary>
/// Analyzes symbol table differences between binaries.
/// </summary>
public interface ISymbolTableDiffAnalyzer
{
    /// <summary>
    /// Compute symbol table diff between two binaries.
    /// </summary>
    Task<SymbolTableDiff> ComputeDiffAsync(
        string basePath,
        string targetPath,
        SymbolDiffOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Extract symbol table from a binary.
    /// </summary>
    Task<SymbolTable> ExtractSymbolTableAsync(
        string binaryPath,
        CancellationToken ct = default);
}

/// <summary>
/// Options for symbol diff analysis.
/// </summary>
public sealed record SymbolDiffOptions
{
    /// <summary>Include local symbols (default: false).</summary>
    public bool IncludeLocalSymbols { get; init; } = false;

    /// <summary>Include debug symbols (default: false).</summary>
    public bool IncludeDebugSymbols { get; init; } = false;

    /// <summary>Demangle C++ symbols (default: true).</summary>
    public bool Demangle { get; init; } = true;

    /// <summary>Detect renames via fingerprint matching (default: true).</summary>
    public bool DetectRenames { get; init; } = true;

    /// <summary>Minimum confidence for rename detection (default: 0.7).</summary>
    public double RenameConfidenceThreshold { get; init; } = 0.7;

    /// <summary>Include GOT/PLT analysis (default: true).</summary>
    public bool IncludeDynamicLinking { get; init; } = true;

    /// <summary>Include version map analysis (default: true).</summary>
    public bool IncludeVersionMaps { get; init; } = true;
}

/// <summary>
/// Extracted symbol table from a binary.
/// </summary>
public sealed record SymbolTable
{
    public required string BinaryPath { get; init; }
    public required string Sha256 { get; init; }
    public string? BuildId { get; init; }
    public required string Architecture { get; init; }
    public required IReadOnlyList<Symbol> Exports { get; init; }
    public required IReadOnlyList<Symbol> Imports { get; init; }
    public required IReadOnlyList<VersionDefinition> VersionDefinitions { get; init; }
    public required IReadOnlyList<VersionRequirement> VersionRequirements { get; init; }
    public IReadOnlyList<GotEntry>? GotEntries { get; init; }
    public IReadOnlyList<PltEntry>? PltEntries { get; init; }
}

public sealed record Symbol
{
    public required string Name { get; init; }
    public string? Demangled { get; init; }
    public required SymbolType Type { get; init; }
    public required SymbolBinding Binding { get; init; }
    public required SymbolVisibility Visibility { get; init; }
    public string? Version { get; init; }
    public ulong Address { get; init; }
    public ulong Size { get; init; }
    public string? Section { get; init; }
    public string? Fingerprint { get; init; }
}
```

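For orientation, a hypothetical call into the interface above might look like the following sketch; the `analyzer` variable, the fixture paths, and the `ct` token are illustrative assumptions (DI wiring is a separate task, SYM-019), not part of the sprint contract:

```csharp
// Sketch only: assumes an ISymbolTableDiffAnalyzer resolved from DI.
var options = new SymbolDiffOptions
{
    DetectRenames = true,
    RenameConfidenceThreshold = 0.8  // stricter than the 0.7 default
};

SymbolTableDiff diff = await analyzer.ComputeDiffAsync(
    "fixtures/libexample.so.1",      // hypothetical fixture paths
    "fixtures/libexample.so.2",
    options,
    ct);

if (diff.AbiCompatibility.Level == AbiCompatibilityLevel.Breaking)
{
    // Breaking changes surface with category/symbol/severity for triage.
    foreach (var change in diff.AbiCompatibility.BreakingChanges)
    {
        Console.WriteLine($"{change.Severity}: {change.Description}");
    }
}
```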
### Symbol Table Diff Analyzer Implementation

```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;

public sealed class SymbolTableDiffAnalyzer : ISymbolTableDiffAnalyzer
{
    private readonly IDisassemblyService _disassembly;
    private readonly IFunctionFingerprintExtractor _fingerprinter;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<SymbolTableDiffAnalyzer> _logger;

    public SymbolTableDiffAnalyzer(
        IDisassemblyService disassembly,
        IFunctionFingerprintExtractor fingerprinter,
        TimeProvider timeProvider,
        ILogger<SymbolTableDiffAnalyzer> logger)
    {
        _disassembly = disassembly;
        _fingerprinter = fingerprinter;
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public async Task<SymbolTableDiff> ComputeDiffAsync(
        string basePath,
        string targetPath,
        SymbolDiffOptions? options = null,
        CancellationToken ct = default)
    {
        options ??= new SymbolDiffOptions();

        var baseTable = await ExtractSymbolTableAsync(basePath, ct);
        var targetTable = await ExtractSymbolTableAsync(targetPath, ct);

        var exports = ComputeSymbolChanges(
            baseTable.Exports, targetTable.Exports, options);

        var imports = ComputeSymbolChanges(
            baseTable.Imports, targetTable.Imports, options);

        var versions = ComputeVersionDiff(baseTable, targetTable);

        DynamicLinkingDiff? dynamic = null;
        if (options.IncludeDynamicLinking)
        {
            dynamic = ComputeDynamicLinkingDiff(baseTable, targetTable);
        }

        var abiCompatibility = AssessAbiCompatibility(exports, imports, versions);

        var diff = new SymbolTableDiff
        {
            DiffId = ComputeDiffId(baseTable, targetTable),
            Base = new BinaryRef
            {
                Path = basePath,
                Sha256 = baseTable.Sha256,
                BuildId = baseTable.BuildId,
                Architecture = baseTable.Architecture
            },
            Target = new BinaryRef
            {
                Path = targetPath,
                Sha256 = targetTable.Sha256,
                BuildId = targetTable.BuildId,
                Architecture = targetTable.Architecture
            },
            Exports = exports,
            Imports = imports,
            Versions = versions,
            Dynamic = dynamic,
            AbiCompatibility = abiCompatibility,
            ComputedAt = _timeProvider.GetUtcNow()
        };

        _logger.LogInformation(
            "Computed symbol diff {DiffId}: exports (+{Added}/-{Removed}), " +
            "imports (+{ImpAdded}/-{ImpRemoved}), ABI={AbiLevel}",
            diff.DiffId,
            exports.Counts.Added, exports.Counts.Removed,
            imports.Counts.Added, imports.Counts.Removed,
            abiCompatibility.Level);

        return diff;
    }

    public async Task<SymbolTable> ExtractSymbolTableAsync(
        string binaryPath,
        CancellationToken ct = default)
    {
        var binary = await _disassembly.LoadBinaryAsync(binaryPath, ct);

        var exports = new List<Symbol>();
        var imports = new List<Symbol>();

        foreach (var sym in binary.Symbols)
        {
            var symbol = new Symbol
            {
                Name = sym.Name,
                Demangled = Demangle(sym.Name),
                Type = MapSymbolType(sym.Type),
                Binding = MapSymbolBinding(sym.Binding),
                Visibility = MapSymbolVisibility(sym.Visibility),
                Version = sym.Version,
                Address = sym.Address,
                Size = sym.Size,
                Section = sym.Section,
                Fingerprint = sym.Type == ElfSymbolType.Function
                    ? await ComputeFingerprintAsync(binary, sym, ct)
                    : null
            };

            if (sym.IsExport)
            {
                exports.Add(symbol);
            }
            else if (sym.IsImport)
            {
                imports.Add(symbol);
            }
        }

        return new SymbolTable
        {
            BinaryPath = binaryPath,
            Sha256 = binary.Sha256,
            BuildId = binary.BuildId,
            Architecture = binary.Architecture,
            Exports = exports,
            Imports = imports,
            VersionDefinitions = ExtractVersionDefinitions(binary),
            VersionRequirements = ExtractVersionRequirements(binary),
            GotEntries = ExtractGotEntries(binary),
            PltEntries = ExtractPltEntries(binary)
        };
    }

    private SymbolChangeSummary ComputeSymbolChanges(
        IReadOnlyList<Symbol> baseSymbols,
        IReadOnlyList<Symbol> targetSymbols,
        SymbolDiffOptions options)
    {
        var baseByName = baseSymbols.ToDictionary(s => s.Name);
        var targetByName = targetSymbols.ToDictionary(s => s.Name);

        var added = new List<SymbolChange>();
        var removed = new List<SymbolChange>();
        var modified = new List<SymbolModification>();
        var renamed = new List<SymbolRename>();
        var unchanged = 0;

        // Find added symbols
        foreach (var (name, sym) in targetByName)
        {
            if (!baseByName.ContainsKey(name))
            {
                added.Add(MapToChange(sym));
            }
        }

        // Find removed and modified symbols
        foreach (var (name, baseSym) in baseByName)
        {
            if (!targetByName.TryGetValue(name, out var targetSym))
            {
                removed.Add(MapToChange(baseSym));
            }
            else
            {
                var changes = CompareSymbols(baseSym, targetSym);
                if (changes.Count > 0)
                {
                    modified.Add(new SymbolModification
                    {
                        Name = name,
                        Demangled = baseSym.Demangled,
                        Changes = changes,
                        AbiBreaking = IsAbiBreaking(changes)
                    });
                }
                else
                {
                    unchanged++;
                }
            }
        }

        // Detect renames (removed symbol with matching fingerprint in added)
        if (options.DetectRenames)
        {
            renamed = DetectRenames(
                removed, added,
                options.RenameConfidenceThreshold);

            // Remove detected renames from added/removed lists
            var renamedOld = renamed.Select(r => r.OldName).ToHashSet();
            var renamedNew = renamed.Select(r => r.NewName).ToHashSet();

            removed = removed.Where(s => !renamedOld.Contains(s.Name)).ToList();
            added = added.Where(s => !renamedNew.Contains(s.Name)).ToList();
        }

        return new SymbolChangeSummary
        {
            Added = added,
            Removed = removed,
            Modified = modified,
            Renamed = renamed,
            Counts = new SymbolChangeCounts
            {
                Added = added.Count,
                Removed = removed.Count,
                Modified = modified.Count,
                Renamed = renamed.Count,
                Unchanged = unchanged,
                TotalBase = baseSymbols.Count,
                TotalTarget = targetSymbols.Count
            }
        };
    }

    private List<SymbolRename> DetectRenames(
        List<SymbolChange> removed,
        List<SymbolChange> added,
        double threshold)
    {
        var renames = new List<SymbolRename>();

        // Match by fingerprint (for functions with computed fingerprints)
        var removedFunctions = removed
            .Where(s => s.Type == SymbolType.Function)
            .ToList();

        var addedFunctions = added
            .Where(s => s.Type == SymbolType.Function)
            .ToList();

        // Use fingerprint matching from PatchDiffEngine
        foreach (var oldSym in removedFunctions)
        {
            foreach (var newSym in addedFunctions)
            {
                // Size similarity as quick filter; cast to double to avoid
                // integer division (ulong / ulong would truncate the ratio to 0).
                if (oldSym.Size.HasValue && newSym.Size.HasValue)
                {
                    var sizeRatio = (double)Math.Min(oldSym.Size.Value, newSym.Size.Value) /
                                    Math.Max(oldSym.Size.Value, newSym.Size.Value);

                    if (sizeRatio < 0.5) continue;
                }

                // TODO: Use fingerprint comparison when available
                // For now, use name similarity heuristic
                var nameSimilarity = ComputeNameSimilarity(oldSym.Name, newSym.Name);

                if (nameSimilarity >= threshold)
                {
                    renames.Add(new SymbolRename
                    {
                        OldName = oldSym.Name,
                        NewName = newSym.Name,
                        Confidence = nameSimilarity,
                        Reason = "Name similarity match"
                    });
                    break;
                }
            }
        }

        return renames;
    }

    private AbiCompatibility AssessAbiCompatibility(
        SymbolChangeSummary exports,
        SymbolChangeSummary imports,
        VersionMapDiff versions)
    {
        var breakingChanges = new List<AbiBreakingChange>();

        // Removed exports are ABI breaking
        foreach (var sym in exports.Removed)
        {
            if (sym.Binding == SymbolBinding.Global)
            {
                breakingChanges.Add(new AbiBreakingChange
                {
                    Category = "RemovedExport",
                    Symbol = sym.Name,
                    Description = $"Global symbol `{sym.Name}` was removed",
                    Severity = "High"
                });
            }
        }

        // Modified exports with type/size changes
        foreach (var mod in exports.Modified.Where(m => m.AbiBreaking))
        {
            breakingChanges.Add(new AbiBreakingChange
            {
                Category = "ModifiedExport",
                Symbol = mod.Name,
                Description = $"Symbol `{mod.Name}` has ABI-breaking changes: " +
                              string.Join(", ", mod.Changes.Select(c => c.Field)),
                Severity = "Medium"
            });
        }

        // New required versions are potentially breaking
        foreach (var req in versions.RequirementsAdded)
        {
            breakingChanges.Add(new AbiBreakingChange
            {
                Category = "NewVersionRequirement",
                Symbol = req.Library,
                Description = $"New version requirement: {req.Library}@{req.Version}",
                Severity = "Low"
            });
        }

        var level = breakingChanges.Count switch
        {
            0 => AbiCompatibilityLevel.Compatible,
            _ when breakingChanges.All(b => b.Severity == "Low") => AbiCompatibilityLevel.MinorChanges,
            _ => AbiCompatibilityLevel.Breaking
        };

        var score = 1.0 - (breakingChanges.Count * 0.1);
        score = Math.Max(0.0, Math.Min(1.0, score));

        return new AbiCompatibility
        {
            Level = level,
            BreakingChanges = breakingChanges,
            Score = Math.Round(score, 4)
        };
    }

    private static string ComputeDiffId(SymbolTable baseTable, SymbolTable targetTable)
    {
        var input = $"{baseTable.Sha256}:{targetTable.Sha256}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return $"symdiff:sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..32]}";
    }

    // Helper methods omitted for brevity...
}
```
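
`ComputeNameSimilarity` is among the helpers omitted above. A minimal sketch, assuming a normalized Levenshtein similarity (the sprint may substitute another metric; this one returns 1.0 for identical names and degrades toward 0.0 as edit distance grows relative to the longer name):

```csharp
// Hypothetical helper sketch: normalized Levenshtein similarity in [0, 1].
private static double ComputeNameSimilarity(string a, string b)
{
    if (a.Length == 0 && b.Length == 0) return 1.0;

    // Standard dynamic-programming edit distance.
    var d = new int[a.Length + 1, b.Length + 1];
    for (var i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (var j = 0; j <= b.Length; j++) d[0, j] = j;

    for (var i = 1; i <= a.Length; i++)
    {
        for (var j = 1; j <= b.Length; j++)
        {
            var cost = a[i - 1] == b[j - 1] ? 0 : 1;
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),  // insert/delete
                d[i - 1, j - 1] + cost);                      // substitute
        }
    }

    // Normalize by the longer name so the result lands in [0, 1].
    return 1.0 - (double)d[a.Length, b.Length] / Math.Max(a.Length, b.Length);
}
```

With the default threshold of 0.7, this would pair e.g. `parse_header` with `parse_header2` but reject unrelated names; fingerprint comparison (the TODO above) is the intended long-term signal.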

### Integration with MaterialChange

```csharp
namespace StellaOps.Scanner.SmartDiff;

/// <summary>
/// Extended MaterialChange with symbol-level scope.
/// </summary>
public sealed record MaterialChange
{
    // Existing fields...

    /// <summary>Scope of the change: file, symbol, or package.</summary>
    [JsonPropertyName("scope")]
    public MaterialChangeScope Scope { get; init; } = MaterialChangeScope.Package;

    /// <summary>Symbol-level details (when scope = Symbol).</summary>
    [JsonPropertyName("symbolDetails")]
    public SymbolChangeDetails? SymbolDetails { get; init; }
}

public enum MaterialChangeScope
{
    Package,
    File,
    Symbol
}

public sealed record SymbolChangeDetails
{
    [JsonPropertyName("symbol_name")]
    public required string SymbolName { get; init; }

    [JsonPropertyName("demangled")]
    public string? Demangled { get; init; }

    [JsonPropertyName("change_type")]
    public required SymbolMaterialChangeType ChangeType { get; init; }

    [JsonPropertyName("abi_impact")]
    public required string AbiImpact { get; init; }

    [JsonPropertyName("diff_ref")]
    public string? DiffRef { get; init; }
}

public enum SymbolMaterialChangeType
{
    Added,
    Removed,
    Modified,
    Renamed,
    VersionChanged
}
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | SYM-001 | TODO | - | - | Define `SymbolTableDiff` and related records |
| 2 | SYM-002 | TODO | SYM-001 | - | Define `SymbolChangeSummary` and change records |
| 3 | SYM-003 | TODO | SYM-002 | - | Define `VersionMapDiff` records |
| 4 | SYM-004 | TODO | SYM-003 | - | Define `DynamicLinkingDiff` records (GOT/PLT) |
| 5 | SYM-005 | TODO | SYM-004 | - | Define `AbiCompatibility` assessment model |
| 6 | SYM-006 | TODO | SYM-005 | - | Define `ISymbolTableDiffAnalyzer` interface |
| 7 | SYM-007 | TODO | SYM-006 | - | Implement `ExtractSymbolTableAsync()` for ELF |
| 8 | SYM-008 | TODO | SYM-007 | - | Implement `ExtractSymbolTableAsync()` for PE |
| 9 | SYM-009 | TODO | SYM-008 | - | Implement `ComputeSymbolChanges()` for exports |
| 10 | SYM-010 | TODO | SYM-009 | - | Implement `ComputeSymbolChanges()` for imports |
| 11 | SYM-011 | TODO | SYM-010 | - | Implement `ComputeVersionDiff()` |
| 12 | SYM-012 | TODO | SYM-011 | - | Implement `ComputeDynamicLinkingDiff()` |
| 13 | SYM-013 | TODO | SYM-012 | - | Implement `DetectRenames()` via fingerprint matching |
| 14 | SYM-014 | TODO | SYM-013 | - | Implement `AssessAbiCompatibility()` |
| 15 | SYM-015 | TODO | SYM-014 | - | Implement content-addressed diff ID computation |
| 16 | SYM-016 | TODO | SYM-015 | - | Add C++ name demangling support |
| 17 | SYM-017 | TODO | SYM-016 | - | Add Rust name demangling support |
| 18 | SYM-018 | TODO | SYM-017 | - | Extend `MaterialChange` with symbol scope |
| 19 | SYM-019 | TODO | SYM-018 | - | Add service registration extensions |
| 20 | SYM-020 | TODO | SYM-019 | - | Write unit tests: ELF symbol extraction |
| 21 | SYM-021 | TODO | SYM-020 | - | Write unit tests: PE symbol extraction |
| 22 | SYM-022 | TODO | SYM-021 | - | Write unit tests: symbol change detection |
| 23 | SYM-023 | TODO | SYM-022 | - | Write unit tests: rename detection |
| 24 | SYM-024 | TODO | SYM-023 | - | Write unit tests: ABI compatibility assessment |
| 25 | SYM-025 | TODO | SYM-024 | - | Write golden fixture tests with known binaries |
| 26 | SYM-026 | TODO | SYM-025 | - | Add JSON schema for SymbolTableDiff |
| 27 | SYM-027 | TODO | SYM-026 | - | Document in docs/modules/binary-index/ |

## Acceptance Criteria

1. **Completeness:** Extract exports, imports, versions, GOT/PLT from ELF and PE
2. **Change Detection:** Identify added, removed, modified, renamed symbols
3. **ABI Assessment:** Classify compatibility level with breaking change details
4. **Rename Detection:** Match renames via fingerprint similarity (threshold 0.7)
5. **MaterialChange Integration:** Symbol changes appear as `scope: symbol` in diffs
6. **Test Coverage:** Unit tests for all extractors, golden fixtures for known binaries

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Content-addressed diff IDs | Enables caching and deduplication |
| ABI compatibility scoring | Provides quick triage of binary changes |
| Fingerprint-based rename detection | Handles version-to-version symbol renames |
| Separate ELF/PE extractors | Different binary formats require different parsing |

| Risk | Mitigation |
|------|------------|
| Large symbol tables | Paginate results; index by name |
| False rename detection | Confidence threshold; manual review for low confidence |
| Stripped binaries | Graceful degradation; note limited analysis |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |

# Sprint 20260106_001_003_POLICY - Determinization: Policy Engine Integration

## Topic & Scope

Integrate the Determinization subsystem into the Policy Engine. This includes the `DeterminizationGate`, policy rules for allow/quarantine/escalate, the `GuardedPass` verdict status extension, and event-driven re-evaluation subscriptions.

- **Working directory:** `src/Policy/StellaOps.Policy.Engine/` and `src/Policy/__Libraries/StellaOps.Policy/`
- **Evidence:** Gate implementation, verdict extension, policy rules, integration tests

## Problem Statement

The current Policy Engine:

- Uses `PolicyVerdictStatus` with Pass, Blocked, Ignored, Warned, Deferred, Escalated, RequiresVex
- Has no "allow with guardrails" outcome for uncertain observations
- Has no gate specifically for determinization/uncertainty thresholds
- Has no automatic re-evaluation when new signals arrive

The advisory requires:

- A `GuardedPass` status for allowing uncertain observations with monitoring
- A `DeterminizationGate` that checks entropy/score thresholds
- Policy rules: allow (score < 0.5, entropy > 0.4, non-prod), quarantine (EPSS >= 0.4 or reachable), escalate (runtime proof)
- Signal update subscriptions for automatic re-evaluation

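The three advisory rules can be sketched as an ordered rule set. This is a non-authoritative sketch: the signal property names (`Runtime.HasProof`, `Epss.Score`, `Reachability.IsReachable`) are assumptions, "quarantine" is mapped to `Blocked` for lack of a dedicated status, and the short-circuit ordering (escalate before quarantine before guarded allow) is a design guess rather than an advisory requirement:

```csharp
// Sketch of the advisory's three rules as an ordered evaluation.
// Property names on DeterminizationContext/SignalSnapshot are assumed.
private static PolicyVerdictStatus EvaluateRules(DeterminizationContext ctx)
{
    // Escalate: runtime proof that the vulnerable path is exercised.
    if (ctx.SignalSnapshot.Runtime.HasProof)
        return PolicyVerdictStatus.Escalated;

    // Quarantine (mapped to Blocked here): high exploit likelihood or reachable.
    if (ctx.SignalSnapshot.Epss.Score >= 0.4 || ctx.SignalSnapshot.Reachability.IsReachable)
        return PolicyVerdictStatus.Blocked;

    // Guarded allow: low risk score, high uncertainty, non-production target.
    if (ctx.UncertaintyScore.Score < 0.5
        && ctx.UncertaintyScore.Entropy > 0.4
        && ctx.Environment != "production")
        return PolicyVerdictStatus.GuardedPass;

    // Anything else needs more evidence before a decision.
    return PolicyVerdictStatus.Deferred;
}
```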
## Dependencies & Concurrency

- **Depends on:** SPRINT_20260106_001_001_LB, SPRINT_20260106_001_002_LB (determinization library)
- **Blocks:** SPRINT_20260106_001_004_BE (backend integration)
- **Parallel safe:** Policy module changes; coordinate with existing gate implementations

## Documentation Prerequisites

- docs/modules/policy/determinization-architecture.md
- docs/modules/policy/architecture.md
- src/Policy/AGENTS.md
- Existing: `src/Policy/__Libraries/StellaOps.Policy/PolicyVerdict.cs`
- Existing: `src/Policy/StellaOps.Policy.Engine/Gates/`

## Technical Design

### Directory Structure Changes

```
src/Policy/__Libraries/StellaOps.Policy/
├── PolicyVerdict.cs                    # MODIFY: Add GuardedPass status
├── PolicyVerdictStatus.cs              # MODIFY: Add GuardedPass enum value
└── Determinization/                    # NEW: Reference to library

src/Policy/StellaOps.Policy.Engine/
├── Gates/
│   ├── IDeterminizationGate.cs         # NEW
│   ├── DeterminizationGate.cs          # NEW
│   └── DeterminizationGateOptions.cs   # NEW
├── Policies/
│   ├── IDeterminizationPolicy.cs       # NEW
│   ├── DeterminizationPolicy.cs        # NEW
│   └── DeterminizationRuleSet.cs       # NEW
└── Subscriptions/
    ├── ISignalUpdateSubscription.cs    # NEW
    ├── SignalUpdateHandler.cs          # NEW
    └── DeterminizationEventTypes.cs    # NEW
```

### PolicyVerdictStatus Extension

```csharp
// In src/Policy/__Libraries/StellaOps.Policy/PolicyVerdictStatus.cs

namespace StellaOps.Policy;

/// <summary>
/// Status outcomes for policy verdicts.
/// </summary>
public enum PolicyVerdictStatus
{
    /// <summary>Finding meets policy requirements.</summary>
    Pass = 0,

    /// <summary>
    /// NEW: Finding allowed with runtime monitoring enabled.
    /// Used for uncertain observations that don't exceed risk thresholds.
    /// </summary>
    GuardedPass = 1,

    /// <summary>Finding fails policy checks; must be remediated.</summary>
    Blocked = 2,

    /// <summary>Finding deliberately ignored via exception.</summary>
    Ignored = 3,

    /// <summary>Finding passes but with warnings.</summary>
    Warned = 4,

    /// <summary>Decision deferred; needs additional evidence.</summary>
    Deferred = 5,

    /// <summary>Decision escalated for human review.</summary>
    Escalated = 6,

    /// <summary>VEX statement required to make decision.</summary>
    RequiresVex = 7
}
```

### PolicyVerdict Extension

```csharp
// Additions to src/Policy/__Libraries/StellaOps.Policy/PolicyVerdict.cs

namespace StellaOps.Policy;

public sealed record PolicyVerdict
{
    // ... existing properties ...

    /// <summary>
    /// Guardrails applied when Status is GuardedPass.
    /// Null for other statuses.
    /// </summary>
    public GuardRails? GuardRails { get; init; }

    /// <summary>
    /// Observation state suggested by the verdict.
    /// Used for determinization tracking.
    /// </summary>
    public ObservationState? SuggestedObservationState { get; init; }

    /// <summary>
    /// Uncertainty score at time of verdict.
    /// </summary>
    public UncertaintyScore? UncertaintyScore { get; init; }

    /// <summary>
    /// Whether this verdict allows the finding to proceed (Pass or GuardedPass).
    /// </summary>
    public bool IsAllowing => Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass;

    /// <summary>
    /// Whether this verdict requires monitoring (GuardedPass only).
    /// </summary>
    public bool RequiresMonitoring => Status == PolicyVerdictStatus.GuardedPass;
}
```

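The two convenience properties make call-site checks one-liners. A small sketch of how a consumer might branch on the extended verdict; the `monitoring` and `review` services and their method names are hypothetical, introduced only to illustrate the control flow:

```csharp
// Hypothetical consumer of the extended verdict.
if (verdict.IsAllowing)
{
    if (verdict.RequiresMonitoring && verdict.GuardRails is not null)
    {
        // GuardedPass: the finding proceeds, but guardrails are attached
        // so runtime signals can trigger re-evaluation later.
        await monitoring.EnableRuntimeMonitoringAsync(finding, verdict.GuardRails, ct);
    }
    // Pass or GuardedPass: the finding does not block the pipeline.
}
else if (verdict.Status == PolicyVerdictStatus.Escalated)
{
    // Escalated: route to a human instead of failing outright.
    await review.QueueForHumanReviewAsync(finding, ct);
}
```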
### IDeterminizationGate Interface

```csharp
namespace StellaOps.Policy.Engine.Gates;

/// <summary>
/// Gate that evaluates determinization state and uncertainty for findings.
/// </summary>
public interface IDeterminizationGate : IPolicyGate
{
    /// <summary>
    /// Evaluate a finding against determinization thresholds.
    /// </summary>
    /// <param name="context">Policy evaluation context.</param>
    /// <param name="ct">Cancellation token.</param>
    /// <returns>Gate evaluation result.</returns>
    Task<DeterminizationGateResult> EvaluateDeterminizationAsync(
        PolicyEvaluationContext context,
        CancellationToken ct = default);
}

/// <summary>
/// Result of determinization gate evaluation.
/// </summary>
public sealed record DeterminizationGateResult
{
    /// <summary>Whether the gate passed.</summary>
    public required bool Passed { get; init; }

    /// <summary>Policy verdict status.</summary>
    public required PolicyVerdictStatus Status { get; init; }

    /// <summary>Reason for the decision.</summary>
    public required string Reason { get; init; }

    /// <summary>Guardrails if GuardedPass.</summary>
    public GuardRails? GuardRails { get; init; }

    /// <summary>Uncertainty score.</summary>
    public required UncertaintyScore UncertaintyScore { get; init; }

    /// <summary>Decay information.</summary>
    public required ObservationDecay Decay { get; init; }

    /// <summary>Trust score.</summary>
    public required double TrustScore { get; init; }

    /// <summary>Rule that matched.</summary>
    public string? MatchedRule { get; init; }

    /// <summary>Additional metadata for audit.</summary>
    public ImmutableDictionary<string, object>? Metadata { get; init; }
}
```

### DeterminizationGate Implementation

```csharp
namespace StellaOps.Policy.Engine.Gates;

/// <summary>
/// Gate that evaluates CVE observations against determinization thresholds.
/// </summary>
public sealed class DeterminizationGate : IDeterminizationGate
{
    private readonly IDeterminizationPolicy _policy;
    private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
    private readonly IDecayedConfidenceCalculator _decayCalculator;
    private readonly ITrustScoreAggregator _trustAggregator;
    private readonly ISignalSnapshotBuilder _snapshotBuilder;
    private readonly ILogger<DeterminizationGate> _logger;

    public DeterminizationGate(
        IDeterminizationPolicy policy,
        IUncertaintyScoreCalculator uncertaintyCalculator,
        IDecayedConfidenceCalculator decayCalculator,
        ITrustScoreAggregator trustAggregator,
        ISignalSnapshotBuilder snapshotBuilder,
        ILogger<DeterminizationGate> logger)
    {
        _policy = policy;
        _uncertaintyCalculator = uncertaintyCalculator;
        _decayCalculator = decayCalculator;
        _trustAggregator = trustAggregator;
        _snapshotBuilder = snapshotBuilder;
        _logger = logger;
    }

    public string GateName => "DeterminizationGate";
    public int Priority => 50; // After VEX gates, before compliance gates

    public async Task<GateResult> EvaluateAsync(
        PolicyEvaluationContext context,
        CancellationToken ct = default)
    {
        var result = await EvaluateDeterminizationAsync(context, ct);

        return new GateResult
        {
            GateName = GateName,
            Passed = result.Passed,
            Status = result.Status,
            Reason = result.Reason,
            Metadata = BuildMetadata(result)
        };
    }

    public async Task<DeterminizationGateResult> EvaluateDeterminizationAsync(
        PolicyEvaluationContext context,
        CancellationToken ct = default)
    {
        // 1. Build signal snapshot for the CVE/component
        var snapshot = await _snapshotBuilder.BuildAsync(
            context.CveId,
            context.ComponentPurl,
            ct);

        // 2. Calculate uncertainty
        var uncertainty = _uncertaintyCalculator.Calculate(snapshot);

        // 3. Calculate decay
        var lastUpdate = DetermineLastSignalUpdate(snapshot);
        var decay = _decayCalculator.Calculate(lastUpdate);

        // 4. Calculate trust score
        var trustScore = _trustAggregator.Calculate(snapshot);

        // 5. Build determinization context
        var determCtx = new DeterminizationContext
        {
            SignalSnapshot = snapshot,
            UncertaintyScore = uncertainty,
            Decay = decay,
            TrustScore = trustScore,
            Environment = context.Environment,
            AssetCriticality = context.AssetCriticality,
            CurrentState = context.CurrentObservationState,
            Options = context.DeterminizationOptions
        };

        // 6. Evaluate policy
        var policyResult = _policy.Evaluate(determCtx);

        _logger.LogInformation(
            "DeterminizationGate evaluated CVE {CveId} on {Purl}: status={Status}, entropy={Entropy:F3}, trust={Trust:F3}, rule={Rule}",
            context.CveId,
            context.ComponentPurl,
            policyResult.Status,
            uncertainty.Entropy,
            trustScore,
            policyResult.MatchedRule);

        return new DeterminizationGateResult
        {
            Passed = policyResult.Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass,
            Status = policyResult.Status,
            Reason = policyResult.Reason,
            GuardRails = policyResult.GuardRails,
            UncertaintyScore = uncertainty,
            Decay = decay,
            TrustScore = trustScore,
            MatchedRule = policyResult.MatchedRule,
            Metadata = policyResult.Metadata
        };
    }

    private static DateTimeOffset DetermineLastSignalUpdate(SignalSnapshot snapshot)
    {
        var timestamps = new List<DateTimeOffset?>();

        if (snapshot.Epss.QueriedAt.HasValue) timestamps.Add(snapshot.Epss.QueriedAt);
        if (snapshot.Vex.QueriedAt.HasValue) timestamps.Add(snapshot.Vex.QueriedAt);
        if (snapshot.Reachability.QueriedAt.HasValue) timestamps.Add(snapshot.Reachability.QueriedAt);
        if (snapshot.Runtime.QueriedAt.HasValue) timestamps.Add(snapshot.Runtime.QueriedAt);
        if (snapshot.Backport.QueriedAt.HasValue) timestamps.Add(snapshot.Backport.QueriedAt);
        if (snapshot.SbomLineage.QueriedAt.HasValue) timestamps.Add(snapshot.SbomLineage.QueriedAt);

        return timestamps.Where(t => t.HasValue).Max() ?? snapshot.CapturedAt;
    }

    private static ImmutableDictionary<string, object> BuildMetadata(DeterminizationGateResult result)
    {
        var builder = ImmutableDictionary.CreateBuilder<string, object>();

        builder["uncertainty_entropy"] = result.UncertaintyScore.Entropy;
|
||||
builder["uncertainty_tier"] = result.UncertaintyScore.Tier.ToString();
|
||||
builder["uncertainty_completeness"] = result.UncertaintyScore.Completeness;
|
||||
builder["decay_multiplier"] = result.Decay.DecayedMultiplier;
|
||||
builder["decay_is_stale"] = result.Decay.IsStale;
|
||||
builder["decay_age_days"] = result.Decay.AgeDays;
|
||||
builder["trust_score"] = result.TrustScore;
|
||||
builder["missing_signals"] = result.UncertaintyScore.MissingSignals.Select(g => g.SignalName).ToArray();
|
||||
|
||||
if (result.MatchedRule is not null)
|
||||
builder["matched_rule"] = result.MatchedRule;
|
||||
|
||||
if (result.GuardRails is not null)
|
||||
{
|
||||
builder["guardrails_monitoring"] = result.GuardRails.EnableRuntimeMonitoring;
|
||||
builder["guardrails_review_interval"] = result.GuardRails.ReviewInterval.ToString();
|
||||
}
|
||||
|
||||
return builder.ToImmutable();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### IDeterminizationPolicy Interface

```csharp
namespace StellaOps.Policy.Engine.Policies;

/// <summary>
/// Policy for evaluating determinization decisions (allow/quarantine/escalate).
/// </summary>
public interface IDeterminizationPolicy
{
    /// <summary>
    /// Evaluate a CVE observation against determinization rules.
    /// </summary>
    /// <param name="context">Determinization context.</param>
    /// <returns>Policy decision result.</returns>
    DeterminizationResult Evaluate(DeterminizationContext context);
}
```

### DeterminizationPolicy Implementation

```csharp
namespace StellaOps.Policy.Engine.Policies;

/// <summary>
/// Implements allow/quarantine/escalate logic per advisory specification.
/// </summary>
public sealed class DeterminizationPolicy : IDeterminizationPolicy
{
    private readonly DeterminizationOptions _options;
    private readonly DeterminizationRuleSet _ruleSet;
    private readonly ILogger<DeterminizationPolicy> _logger;

    public DeterminizationPolicy(
        IOptions<DeterminizationOptions> options,
        ILogger<DeterminizationPolicy> logger)
    {
        _options = options.Value;
        _ruleSet = DeterminizationRuleSet.Default(_options);
        _logger = logger;
    }

    public DeterminizationResult Evaluate(DeterminizationContext ctx)
    {
        ArgumentNullException.ThrowIfNull(ctx);

        // Get environment-specific thresholds
        var thresholds = GetEnvironmentThresholds(ctx.Environment);

        // Evaluate rules in priority order
        foreach (var rule in _ruleSet.Rules.OrderBy(r => r.Priority))
        {
            if (rule.Condition(ctx, thresholds))
            {
                var result = rule.Action(ctx, thresholds);
                result = result with { MatchedRule = rule.Name };

                _logger.LogDebug(
                    "Rule {RuleName} matched for CVE {CveId}: {Status}",
                    rule.Name,
                    ctx.SignalSnapshot.CveId,
                    result.Status);

                return result;
            }
        }

        // Default: Deferred (no rule matched, needs more evidence)
        return DeterminizationResult.Deferred(
            "No determinization rule matched; additional evidence required",
            PolicyVerdictStatus.Deferred);
    }

    private EnvironmentThresholds GetEnvironmentThresholds(DeploymentEnvironment env)
    {
        var key = env.ToString();
        if (_options.EnvironmentThresholds.TryGetValue(key, out var custom))
            return custom;

        return env switch
        {
            DeploymentEnvironment.Production => DefaultEnvironmentThresholds.Production,
            DeploymentEnvironment.Staging => DefaultEnvironmentThresholds.Staging,
            _ => DefaultEnvironmentThresholds.Development
        };
    }
}

/// <summary>
/// Default environment thresholds per advisory.
/// </summary>
public static class DefaultEnvironmentThresholds
{
    public static EnvironmentThresholds Production => new()
    {
        Environment = DeploymentEnvironment.Production,
        MinConfidenceForNotAffected = 0.75,
        MaxEntropyForAllow = 0.3,
        EpssBlockThreshold = 0.3,
        RequireReachabilityForAllow = true
    };

    public static EnvironmentThresholds Staging => new()
    {
        Environment = DeploymentEnvironment.Staging,
        MinConfidenceForNotAffected = 0.60,
        MaxEntropyForAllow = 0.5,
        EpssBlockThreshold = 0.4,
        RequireReachabilityForAllow = true
    };

    public static EnvironmentThresholds Development => new()
    {
        Environment = DeploymentEnvironment.Development,
        MinConfidenceForNotAffected = 0.40,
        MaxEntropyForAllow = 0.7,
        EpssBlockThreshold = 0.6,
        RequireReachabilityForAllow = false
    };
}
```

### DeterminizationRuleSet

```csharp
namespace StellaOps.Policy.Engine.Policies;

/// <summary>
/// Rule set for determinization policy evaluation.
/// Rules are evaluated in priority order (lower = higher priority).
/// </summary>
public sealed class DeterminizationRuleSet
{
    public IReadOnlyList<DeterminizationRule> Rules { get; }

    private DeterminizationRuleSet(IReadOnlyList<DeterminizationRule> rules)
    {
        Rules = rules;
    }

    /// <summary>
    /// Creates the default rule set per advisory specification.
    /// </summary>
    public static DeterminizationRuleSet Default(DeterminizationOptions options) =>
        new(new List<DeterminizationRule>
        {
            // Rule 1: Escalate if runtime evidence shows vulnerable code loaded
            new DeterminizationRule
            {
                Name = "RuntimeEscalation",
                Priority = 10,
                Condition = (ctx, _) =>
                    ctx.SignalSnapshot.Runtime.HasValue &&
                    ctx.SignalSnapshot.Runtime.Value!.ObservedLoaded,
                Action = (ctx, _) =>
                    DeterminizationResult.Escalated(
                        "Runtime evidence shows vulnerable code loaded in memory",
                        PolicyVerdictStatus.Escalated)
            },

            // Rule 2: Quarantine if EPSS exceeds threshold
            new DeterminizationRule
            {
                Name = "EpssQuarantine",
                Priority = 20,
                Condition = (ctx, thresholds) =>
                    ctx.SignalSnapshot.Epss.HasValue &&
                    ctx.SignalSnapshot.Epss.Value!.Score >= thresholds.EpssBlockThreshold,
                Action = (ctx, thresholds) =>
                    DeterminizationResult.Quarantined(
                        $"EPSS score {ctx.SignalSnapshot.Epss.Value!.Score:P1} exceeds threshold {thresholds.EpssBlockThreshold:P1}",
                        PolicyVerdictStatus.Blocked)
            },

            // Rule 3: Quarantine if proven reachable
            new DeterminizationRule
            {
                Name = "ReachabilityQuarantine",
                Priority = 25,
                Condition = (ctx, _) =>
                    ctx.SignalSnapshot.Reachability.HasValue &&
                    ctx.SignalSnapshot.Reachability.Value!.Status is
                        ReachabilityStatus.Reachable or
                        ReachabilityStatus.ObservedReachable,
                Action = (ctx, _) =>
                    DeterminizationResult.Quarantined(
                        $"Vulnerable code is {ctx.SignalSnapshot.Reachability.Value!.Status} via call graph analysis",
                        PolicyVerdictStatus.Blocked)
            },

            // Rule 4: Block high entropy in production
            new DeterminizationRule
            {
                Name = "ProductionEntropyBlock",
                Priority = 30,
                Condition = (ctx, thresholds) =>
                    ctx.Environment == DeploymentEnvironment.Production &&
                    ctx.UncertaintyScore.Entropy > thresholds.MaxEntropyForAllow,
                Action = (ctx, thresholds) =>
                    DeterminizationResult.Quarantined(
                        $"High uncertainty (entropy={ctx.UncertaintyScore.Entropy:F2}) exceeds production threshold ({thresholds.MaxEntropyForAllow:F2})",
                        PolicyVerdictStatus.Blocked)
            },

            // Rule 5: Defer if evidence is stale
            new DeterminizationRule
            {
                Name = "StaleEvidenceDefer",
                Priority = 40,
                Condition = (ctx, _) => ctx.Decay.IsStale,
                Action = (ctx, _) =>
                    DeterminizationResult.Deferred(
                        $"Evidence is stale (last update: {ctx.Decay.LastSignalUpdate:u}, age: {ctx.Decay.AgeDays:F1} days)",
                        PolicyVerdictStatus.Deferred)
            },

            // Rule 6: Guarded allow for uncertain observations in non-prod
            new DeterminizationRule
            {
                Name = "GuardedAllowNonProd",
                Priority = 50,
                Condition = (ctx, _) =>
                    ctx.TrustScore < options.GuardedAllowScoreThreshold &&
                    ctx.UncertaintyScore.Entropy > options.GuardedAllowEntropyThreshold &&
                    ctx.Environment != DeploymentEnvironment.Production,
                Action = (ctx, _) =>
                    DeterminizationResult.GuardedAllow(
                        $"Uncertain observation (entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}) allowed with guardrails in {ctx.Environment}",
                        PolicyVerdictStatus.GuardedPass,
                        BuildGuardrails(ctx, options))
            },

            // Rule 7: Allow if unreachable with high confidence
            new DeterminizationRule
            {
                Name = "UnreachableAllow",
                Priority = 60,
                Condition = (ctx, thresholds) =>
                    ctx.SignalSnapshot.Reachability.HasValue &&
                    ctx.SignalSnapshot.Reachability.Value!.Status == ReachabilityStatus.Unreachable &&
                    ctx.SignalSnapshot.Reachability.Value.Confidence >= thresholds.MinConfidenceForNotAffected,
                Action = (ctx, _) =>
                    DeterminizationResult.Allowed(
                        $"Vulnerable code is unreachable (confidence={ctx.SignalSnapshot.Reachability.Value!.Confidence:P0})",
                        PolicyVerdictStatus.Pass)
            },

            // Rule 8: Allow if VEX not_affected with trusted issuer
            new DeterminizationRule
            {
                Name = "VexNotAffectedAllow",
                Priority = 65,
                Condition = (ctx, thresholds) =>
                    ctx.SignalSnapshot.Vex.HasValue &&
                    ctx.SignalSnapshot.Vex.Value!.Status == "not_affected" &&
                    ctx.SignalSnapshot.Vex.Value.IssuerTrust >= thresholds.MinConfidenceForNotAffected,
                Action = (ctx, _) =>
                    DeterminizationResult.Allowed(
                        $"VEX statement from {ctx.SignalSnapshot.Vex.Value!.Issuer} indicates not_affected (trust={ctx.SignalSnapshot.Vex.Value.IssuerTrust:P0})",
                        PolicyVerdictStatus.Pass)
            },

            // Rule 9: Allow if sufficient evidence and low entropy
            new DeterminizationRule
            {
                Name = "SufficientEvidenceAllow",
                Priority = 70,
                Condition = (ctx, thresholds) =>
                    ctx.UncertaintyScore.Entropy <= thresholds.MaxEntropyForAllow &&
                    ctx.TrustScore >= thresholds.MinConfidenceForNotAffected,
                Action = (ctx, _) =>
                    DeterminizationResult.Allowed(
                        $"Sufficient evidence (entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}) for confident determination",
                        PolicyVerdictStatus.Pass)
            },

            // Rule 10: Guarded allow for moderate uncertainty
            new DeterminizationRule
            {
                Name = "GuardedAllowModerateUncertainty",
                Priority = 80,
                Condition = (ctx, _) =>
                    ctx.UncertaintyScore.Tier <= UncertaintyTier.Medium &&
                    ctx.TrustScore >= 0.4,
                Action = (ctx, _) =>
                    DeterminizationResult.GuardedAllow(
                        $"Moderate uncertainty (tier={ctx.UncertaintyScore.Tier}, trust={ctx.TrustScore:F2}) allowed with monitoring",
                        PolicyVerdictStatus.GuardedPass,
                        BuildGuardrails(ctx, options))
            },

            // Rule 11: Default - require more evidence
            new DeterminizationRule
            {
                Name = "DefaultDefer",
                Priority = 100,
                Condition = (_, _) => true,
                Action = (ctx, _) =>
                    DeterminizationResult.Deferred(
                        $"Insufficient evidence for determination (entropy={ctx.UncertaintyScore.Entropy:F2}, tier={ctx.UncertaintyScore.Tier})",
                        PolicyVerdictStatus.Deferred)
            }
        });

    private static GuardRails BuildGuardrails(DeterminizationContext ctx, DeterminizationOptions options) =>
        new GuardRails
        {
            EnableRuntimeMonitoring = true,
            ReviewInterval = TimeSpan.FromDays(options.GuardedReviewIntervalDays),
            EpssEscalationThreshold = options.EpssQuarantineThreshold,
            EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
            MaxGuardedDuration = TimeSpan.FromDays(options.MaxGuardedDurationDays),
            PolicyRationale = $"Auto-allowed: entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}, env={ctx.Environment}"
        };
}

/// <summary>
/// A single determinization rule.
/// </summary>
public sealed record DeterminizationRule
{
    /// <summary>Rule name for audit/logging.</summary>
    public required string Name { get; init; }

    /// <summary>Priority (lower = evaluated first).</summary>
    public required int Priority { get; init; }

    /// <summary>Condition function.</summary>
    public required Func<DeterminizationContext, EnvironmentThresholds, bool> Condition { get; init; }

    /// <summary>Action function.</summary>
    public required Func<DeterminizationContext, EnvironmentThresholds, DeterminizationResult> Action { get; init; }
}
```

### Signal Update Subscription

```csharp
namespace StellaOps.Policy.Engine.Subscriptions;

/// <summary>
/// Events for signal updates that trigger re-evaluation.
/// </summary>
public static class DeterminizationEventTypes
{
    public const string EpssUpdated = "epss.updated";
    public const string VexUpdated = "vex.updated";
    public const string ReachabilityUpdated = "reachability.updated";
    public const string RuntimeUpdated = "runtime.updated";
    public const string BackportUpdated = "backport.updated";
    public const string ObservationStateChanged = "observation.state_changed";
}

/// <summary>
/// Event published when a signal is updated.
/// </summary>
public sealed record SignalUpdatedEvent
{
    public required string EventType { get; init; }
    public required string CveId { get; init; }
    public required string Purl { get; init; }
    public required DateTimeOffset UpdatedAt { get; init; }
    public required string Source { get; init; }
    public object? NewValue { get; init; }
    public object? PreviousValue { get; init; }
}

/// <summary>
/// Event published when observation state changes.
/// </summary>
public sealed record ObservationStateChangedEvent
{
    public required Guid ObservationId { get; init; }
    public required string CveId { get; init; }
    public required string Purl { get; init; }
    public required ObservationState PreviousState { get; init; }
    public required ObservationState NewState { get; init; }
    public required string Reason { get; init; }
    public required DateTimeOffset ChangedAt { get; init; }
}

/// <summary>
/// Handler for signal update events.
/// </summary>
public interface ISignalUpdateSubscription
{
    /// <summary>
    /// Handle a signal update and re-evaluate affected observations.
    /// </summary>
    Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct = default);
}

/// <summary>
/// Implementation of signal update handling.
/// </summary>
public sealed class SignalUpdateHandler : ISignalUpdateSubscription
{
    private readonly IObservationRepository _observations;
    private readonly IDeterminizationGate _gate;
    private readonly IEventPublisher _eventPublisher;
    private readonly ILogger<SignalUpdateHandler> _logger;

    public SignalUpdateHandler(
        IObservationRepository observations,
        IDeterminizationGate gate,
        IEventPublisher eventPublisher,
        ILogger<SignalUpdateHandler> logger)
    {
        _observations = observations;
        _gate = gate;
        _eventPublisher = eventPublisher;
        _logger = logger;
    }

    public async Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct = default)
    {
        _logger.LogInformation(
            "Processing signal update: {EventType} for CVE {CveId} on {Purl}",
            evt.EventType,
            evt.CveId,
            evt.Purl);

        // Find observations affected by this signal
        var affected = await _observations.FindByCveAndPurlAsync(evt.CveId, evt.Purl, ct);

        foreach (var obs in affected)
        {
            try
            {
                await ReEvaluateObservationAsync(obs, evt, ct);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex,
                    "Failed to re-evaluate observation {ObservationId} after signal update",
                    obs.Id);
            }
        }
    }

    private async Task ReEvaluateObservationAsync(
        CveObservation obs,
        SignalUpdatedEvent trigger,
        CancellationToken ct)
    {
        var context = new PolicyEvaluationContext
        {
            CveId = obs.CveId,
            ComponentPurl = obs.SubjectPurl,
            Environment = obs.Environment,
            CurrentObservationState = obs.ObservationState
        };

        var result = await _gate.EvaluateDeterminizationAsync(context, ct);

        // Determine if state should change
        var newState = DetermineNewState(obs.ObservationState, result);

        if (newState != obs.ObservationState)
        {
            _logger.LogInformation(
                "Observation {ObservationId} state transition: {OldState} -> {NewState} (trigger: {Trigger})",
                obs.Id,
                obs.ObservationState,
                newState,
                trigger.EventType);

            await _observations.UpdateStateAsync(obs.Id, newState, result, ct);

            await _eventPublisher.PublishAsync(new ObservationStateChangedEvent
            {
                ObservationId = obs.Id,
                CveId = obs.CveId,
                Purl = obs.SubjectPurl,
                PreviousState = obs.ObservationState,
                NewState = newState,
                Reason = result.Reason,
                ChangedAt = DateTimeOffset.UtcNow
            }, ct);
        }
    }

    private static ObservationState DetermineNewState(
        ObservationState current,
        DeterminizationGateResult result)
    {
        // Escalation always triggers ManualReviewRequired
        if (result.Status == PolicyVerdictStatus.Escalated)
            return ObservationState.ManualReviewRequired;

        // Very low uncertainty means we have enough evidence
        if (result.UncertaintyScore.Tier == UncertaintyTier.VeryLow)
            return ObservationState.Determined;

        // Transition from Pending to Determined when evidence sufficient
        if (current == ObservationState.PendingDeterminization &&
            result.UncertaintyScore.Tier <= UncertaintyTier.Low &&
            result.Status == PolicyVerdictStatus.Pass)
            return ObservationState.Determined;

        // Stale evidence
        if (result.Decay.IsStale && current != ObservationState.StaleRequiresRefresh)
            return ObservationState.StaleRequiresRefresh;

        // Otherwise maintain current state
        return current;
    }
}
```

### DI Registration Updates

```csharp
// Additions to Policy.Engine DI registration

public static class DeterminizationEngineExtensions
{
    public static IServiceCollection AddDeterminizationEngine(
        this IServiceCollection services,
        IConfiguration configuration)
    {
        // Register determinization library services
        services.AddDeterminization(configuration);

        // Register policy engine services
        services.AddScoped<IDeterminizationPolicy, DeterminizationPolicy>();
        services.AddScoped<IDeterminizationGate, DeterminizationGate>();
        services.AddScoped<ISignalSnapshotBuilder, SignalSnapshotBuilder>();
        services.AddScoped<ISignalUpdateSubscription, SignalUpdateHandler>();

        return services;
    }
}
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DPE-001 | TODO | DCS-028 | Guild | Add `GuardedPass` to `PolicyVerdictStatus` enum |
| 2 | DPE-002 | TODO | DPE-001 | Guild | Extend `PolicyVerdict` with GuardRails and UncertaintyScore |
| 3 | DPE-003 | TODO | DPE-002 | Guild | Create `IDeterminizationGate` interface |
| 4 | DPE-004 | TODO | DPE-003 | Guild | Implement `DeterminizationGate` with priority 50 |
| 5 | DPE-005 | TODO | DPE-004 | Guild | Create `DeterminizationGateResult` record |
| 6 | DPE-006 | TODO | DPE-005 | Guild | Create `ISignalSnapshotBuilder` interface |
| 7 | DPE-007 | TODO | DPE-006 | Guild | Implement `SignalSnapshotBuilder` |
| 8 | DPE-008 | TODO | DPE-007 | Guild | Create `IDeterminizationPolicy` interface |
| 9 | DPE-009 | TODO | DPE-008 | Guild | Implement `DeterminizationPolicy` |
| 10 | DPE-010 | TODO | DPE-009 | Guild | Implement `DeterminizationRuleSet` with 11 rules |
| 11 | DPE-011 | TODO | DPE-010 | Guild | Implement `DefaultEnvironmentThresholds` |
| 12 | DPE-012 | TODO | DPE-011 | Guild | Create `DeterminizationEventTypes` constants |
| 13 | DPE-013 | TODO | DPE-012 | Guild | Create `SignalUpdatedEvent` record |
| 14 | DPE-014 | TODO | DPE-013 | Guild | Create `ObservationStateChangedEvent` record |
| 15 | DPE-015 | TODO | DPE-014 | Guild | Create `ISignalUpdateSubscription` interface |
| 16 | DPE-016 | TODO | DPE-015 | Guild | Implement `SignalUpdateHandler` |
| 17 | DPE-017 | TODO | DPE-016 | Guild | Create `IObservationRepository` interface |
| 18 | DPE-018 | TODO | DPE-017 | Guild | Implement `DeterminizationEngineExtensions` for DI |
| 19 | DPE-019 | TODO | DPE-018 | Guild | Write unit tests: `DeterminizationPolicy` rule evaluation |
| 20 | DPE-020 | TODO | DPE-019 | Guild | Write unit tests: `DeterminizationGate` metadata building |
| 21 | DPE-021 | TODO | DPE-020 | Guild | Write unit tests: `SignalUpdateHandler` state transitions |
| 22 | DPE-022 | TODO | DPE-021 | Guild | Write unit tests: Rule priority ordering |
| 23 | DPE-023 | TODO | DPE-022 | Guild | Write integration tests: Gate in policy pipeline |
| 24 | DPE-024 | TODO | DPE-023 | Guild | Write integration tests: Signal update re-evaluation |
| 25 | DPE-025 | TODO | DPE-024 | Guild | Add metrics: `stellaops_policy_determinization_evaluations_total` |
| 26 | DPE-026 | TODO | DPE-025 | Guild | Add metrics: `stellaops_policy_determinization_rule_matches_total` |
| 27 | DPE-027 | TODO | DPE-026 | Guild | Add metrics: `stellaops_policy_observation_state_transitions_total` |
| 28 | DPE-028 | TODO | DPE-027 | Guild | Update existing PolicyEngine to register DeterminizationGate |
| 29 | DPE-029 | TODO | DPE-028 | Guild | Document new PolicyVerdictStatus.GuardedPass in API docs |
| 30 | DPE-030 | TODO | DPE-029 | Guild | Verify build with `dotnet build` |

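Tasks DPE-025 through DPE-027 name three counters. A minimal sketch of how they could be wired with `System.Diagnostics.Metrics` (the meter name, class name, and tag keys here are assumptions, not part of the sprint spec):

```csharp
using System.Diagnostics.Metrics;

// Hypothetical wrapper for the three counters named in DPE-025..DPE-027.
public sealed class DeterminizationMetrics
{
    private readonly Counter<long> _evaluations;
    private readonly Counter<long> _ruleMatches;
    private readonly Counter<long> _stateTransitions;

    public DeterminizationMetrics(IMeterFactory meterFactory)
    {
        var meter = meterFactory.Create("StellaOps.Policy.Determinization"); // assumed meter name
        _evaluations = meter.CreateCounter<long>("stellaops_policy_determinization_evaluations_total");
        _ruleMatches = meter.CreateCounter<long>("stellaops_policy_determinization_rule_matches_total");
        _stateTransitions = meter.CreateCounter<long>("stellaops_policy_observation_state_transitions_total");
    }

    public void RecordEvaluation(string status) =>
        _evaluations.Add(1, new KeyValuePair<string, object?>("status", status));

    public void RecordRuleMatch(string ruleName) =>
        _ruleMatches.Add(1, new KeyValuePair<string, object?>("rule", ruleName));

    public void RecordStateTransition(string from, string to) =>
        _stateTransitions.Add(1,
            new KeyValuePair<string, object?>("from", from),
            new KeyValuePair<string, object?>("to", to));
}
```

Registering the wrapper as a singleton and injecting it into `DeterminizationGate` and `SignalUpdateHandler` would keep the instrumentation out of the rule logic itself.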
## Acceptance Criteria

1. `PolicyVerdictStatus.GuardedPass` compiles and serializes correctly
2. `DeterminizationGate` integrates with existing gate pipeline
3. All 11 rules evaluate in correct priority order
4. `SignalUpdateHandler` correctly triggers re-evaluation
5. State transitions follow expected logic
6. Metrics emitted for all evaluations and transitions
7. Integration tests pass with mock signal sources

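Criterion 1 can be pinned down with a couple of unit tests. A sketch (xUnit and `JsonStringEnumConverter` are assumed here; the explicit value 7 matches the non-breaking enum layout proposed in the migration notes below):

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;
using Xunit;

public class PolicyVerdictStatusSerializationTests
{
    private static readonly JsonSerializerOptions Options = new()
    {
        Converters = { new JsonStringEnumConverter() } // assumes string-based enum serialization
    };

    [Fact]
    public void GuardedPass_RoundTrips_AsString()
    {
        var json = JsonSerializer.Serialize(PolicyVerdictStatus.GuardedPass, Options);
        Assert.Equal("\"GuardedPass\"", json);
        Assert.Equal(PolicyVerdictStatus.GuardedPass,
            JsonSerializer.Deserialize<PolicyVerdictStatus>(json, Options));
    }

    [Fact]
    public void GuardedPass_KeepsExplicitValue()
    {
        // Guards against the enum-reordering risk noted in the migration section.
        Assert.Equal(7, (int)PolicyVerdictStatus.GuardedPass);
    }
}
```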
## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Gate priority 50 | After VEX gates (30-40), before compliance gates (60+) |
| 11 rules in default set | Covers all advisory scenarios; extensible |
| Event-driven re-evaluation | Reactive system; no polling required |
| Separate IObservationRepository | Decouples from specific persistence; testable |

| Risk | Mitigation |
|------|------------|
| Rule evaluation performance | Rules short-circuit on first match; cached signal snapshots |
| Event storm on bulk updates | Batch processing; debounce repeated events |
| Breaking existing PolicyVerdictStatus consumers | GuardedPass=1 shifts existing values; requires migration |

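The event-storm mitigation can be sketched as a last-write-wins debouncer placed in front of `SignalUpdateHandler` (a hypothetical helper, not part of the task list; the flush timer that drives it is assumed to live in a `BackgroundService`):

```csharp
using System.Collections.Concurrent;

// Collapses bursts of SignalUpdatedEvents for the same CVE/PURL pair
// into a single re-evaluation per flush cycle.
public sealed class SignalUpdateDebouncer
{
    private readonly ConcurrentDictionary<(string CveId, string Purl), SignalUpdatedEvent> _pending = new();
    private readonly ISignalUpdateSubscription _inner;

    public SignalUpdateDebouncer(ISignalUpdateSubscription inner) => _inner = inner;

    // Repeated events for the same CVE/PURL overwrite each other; only the
    // latest survives until the next flush, which bounds re-evaluation work.
    public void Enqueue(SignalUpdatedEvent evt) =>
        _pending[(evt.CveId, evt.Purl)] = evt;

    // Called on a timer; the flush interval is the effective debounce window.
    public async Task FlushAsync(CancellationToken ct)
    {
        foreach (var key in _pending.Keys)
        {
            if (_pending.TryRemove(key, out var evt))
                await _inner.HandleAsync(evt, ct);
        }
    }
}
```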
## Migration Notes

### PolicyVerdictStatus Value Change

Inserting `GuardedPass = 1` into the middle of the enum shifts the existing values:
- `Blocked` was 1, now 2
- `Ignored` was 2, now 3
- etc.

**Migration strategy:**
1. Add `GuardedPass` at the end first (`= 7`) for backward compatibility
2. Update all consumers
3. Reorder enum values in the next major version

Alternatively, keep all existing values and append `GuardedPass` with an explicit value, avoiding the breaking change entirely:

```csharp
public enum PolicyVerdictStatus
{
    Pass = 0,
    Blocked = 1,     // Keep existing
    Ignored = 2,     // Keep existing
    Warned = 3,      // Keep existing
    Deferred = 4,    // Keep existing
    Escalated = 5,   // Keep existing
    RequiresVex = 6, // Keep existing
    GuardedPass = 7  // NEW - appended at the end
}
```

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |

## Next Checkpoints

- 2026-01-10: DPE-001 to DPE-011 complete (core implementation)
- 2026-01-11: DPE-012 to DPE-018 complete (events, subscriptions)
- 2026-01-12: DPE-019 to DPE-030 complete (tests, metrics, docs)

---

# Sprint 20260106_001_004_BE - Determinization: Backend Integration

## Topic & Scope

Integrate the Determinization subsystem with backend modules: Feedser (signal attachment), VexLens (VEX signal emission), Graph (CVE node enhancement), and Findings (observation persistence). This connects the policy infrastructure to data sources.

- **Working directories:**
  - `src/Feedser/`
  - `src/VexLens/`
  - `src/Graph/`
  - `src/Findings/`
- **Evidence:** Signal attachers, repository implementations, graph node enhancements, integration tests

## Problem Statement

Current backend state:
- Feedser collects EPSS/VEX/advisories but doesn't emit `SignalState<T>`
- VexLens normalizes VEX but doesn't notify on updates
- Graph has CVE nodes but no `ObservationState` or `UncertaintyScore`
- Findings tracks verdicts but not determinization state

Advisory requires:
- Feedser attaches `SignalState<EpssEvidence>` with query status
- VexLens emits `SignalUpdatedEvent` on VEX changes
- Graph nodes carry `ObservationState`, `UncertaintyScore`, `GuardRails`
- Findings persists observation lifecycle with state transitions

## Dependencies & Concurrency

- **Depends on:** SPRINT_20260106_001_003_POLICY (gates and policies)
- **Blocks:** SPRINT_20260106_001_005_FE (frontend)
- **Parallel safe with:** Graph module internal changes; coordinate with Feedser/VexLens teams

## Documentation Prerequisites

- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_003_POLICY (events and subscriptions)
- src/Feedser/AGENTS.md
- src/VexLens/AGENTS.md (if exists)
- src/Graph/AGENTS.md
- src/Findings/AGENTS.md

## Technical Design

### Feedser: Signal Attachment

#### Directory Structure Changes

```
src/Feedser/StellaOps.Feedser/
├── Signals/
│   ├── ISignalAttacher.cs                  # NEW
│   ├── EpssSignalAttacher.cs               # NEW
│   ├── KevSignalAttacher.cs                # NEW
│   └── SignalAttachmentResult.cs           # NEW
├── Events/
│   └── SignalAttachmentEventEmitter.cs     # NEW
└── Extensions/
    └── SignalAttacherServiceExtensions.cs  # NEW
```

#### ISignalAttacher Interface

```csharp
namespace StellaOps.Feedser.Signals;

/// <summary>
/// Attaches signal evidence to CVE observations.
/// </summary>
/// <typeparam name="T">The evidence type.</typeparam>
public interface ISignalAttacher<T>
{
    /// <summary>
    /// Attach signal evidence for a CVE.
    /// </summary>
    /// <param name="cveId">CVE identifier.</param>
    /// <param name="purl">Component PURL.</param>
    /// <param name="ct">Cancellation token.</param>
    /// <returns>Signal state with query status.</returns>
    Task<SignalState<T>> AttachAsync(string cveId, string purl, CancellationToken ct = default);

    /// <summary>
    /// Batch attach signal evidence for multiple CVEs.
    /// </summary>
    /// <param name="requests">CVE/PURL pairs.</param>
    /// <param name="ct">Cancellation token.</param>
    /// <returns>Signal states keyed by CVE ID.</returns>
    Task<IReadOnlyDictionary<string, SignalState<T>>> AttachBatchAsync(
        IEnumerable<(string CveId, string Purl)> requests,
        CancellationToken ct = default);
}
```

#### EpssSignalAttacher Implementation

```csharp
namespace StellaOps.Feedser.Signals;

/// <summary>
/// Attaches EPSS evidence to CVE observations.
/// </summary>
public sealed class EpssSignalAttacher : ISignalAttacher<EpssEvidence>
{
    private readonly IEpssClient _epssClient;
    private readonly IEventPublisher _eventPublisher;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<EpssSignalAttacher> _logger;

    public EpssSignalAttacher(
        IEpssClient epssClient,
        IEventPublisher eventPublisher,
        TimeProvider timeProvider,
        ILogger<EpssSignalAttacher> logger)
    {
        _epssClient = epssClient;
        _eventPublisher = eventPublisher;
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public async Task<SignalState<EpssEvidence>> AttachAsync(
        string cveId,
        string purl,
        CancellationToken ct = default)
    {
        var now = _timeProvider.GetUtcNow();

        try
        {
            var epssData = await _epssClient.GetScoreAsync(cveId, ct);

            if (epssData is null)
            {
                _logger.LogDebug("EPSS data not found for CVE {CveId}", cveId);

                return SignalState<EpssEvidence>.Absent(now, "first.org");
            }

            var evidence = new EpssEvidence
            {
                Score = epssData.Score,
                Percentile = epssData.Percentile,
                ModelDate = epssData.ModelDate
            };

            // Emit event for signal update
            await _eventPublisher.PublishAsync(new SignalUpdatedEvent
            {
                EventType = DeterminizationEventTypes.EpssUpdated,
                CveId = cveId,
                Purl = purl,
                UpdatedAt = now,
                Source = "first.org",
                NewValue = evidence
            }, ct);

            _logger.LogDebug(
                "Attached EPSS for CVE {CveId}: score={Score:P1}, percentile={Percentile:P1}",
                cveId,
                evidence.Score,
                evidence.Percentile);

            return SignalState<EpssEvidence>.WithValue(evidence, now, "first.org");
|
||||
}
|
||||
catch (EpssNotFoundException)
|
||||
{
|
||||
return SignalState<EpssEvidence>.Absent(now, "first.org");
|
||||
}
|
||||
catch (Exception ex)
|
||||
{
|
||||
_logger.LogWarning(ex, "Failed to fetch EPSS for CVE {CveId}", cveId);
|
||||
|
||||
return SignalState<EpssEvidence>.Failed(ex.Message);
|
||||
}
|
||||
}
|
||||
|
||||
public async Task<IReadOnlyDictionary<string, SignalState<EpssEvidence>>> AttachBatchAsync(
|
||||
IEnumerable<(string CveId, string Purl)> requests,
|
||||
CancellationToken ct = default)
|
||||
{
|
||||
var results = new Dictionary<string, SignalState<EpssEvidence>>();
|
||||
var requestList = requests.ToList();
|
||||
|
||||
// Batch query EPSS
|
||||
var cveIds = requestList.Select(r => r.CveId).Distinct().ToList();
|
||||
var batchResult = await _epssClient.GetScoresBatchAsync(cveIds, ct);
|
||||
|
||||
var now = _timeProvider.GetUtcNow();
|
||||
|
||||
foreach (var (cveId, purl) in requestList)
|
||||
{
|
||||
if (batchResult.Found.TryGetValue(cveId, out var epssData))
|
||||
{
|
||||
var evidence = new EpssEvidence
|
||||
{
|
||||
Score = epssData.Score,
|
||||
Percentile = epssData.Percentile,
|
||||
ModelDate = epssData.ModelDate
|
||||
};
|
||||
|
||||
results[cveId] = SignalState<EpssEvidence>.WithValue(evidence, now, "first.org");
|
||||
|
||||
await _eventPublisher.PublishAsync(new SignalUpdatedEvent
|
||||
{
|
||||
EventType = DeterminizationEventTypes.EpssUpdated,
|
||||
CveId = cveId,
|
||||
Purl = purl,
|
||||
UpdatedAt = now,
|
||||
Source = "first.org",
|
||||
NewValue = evidence
|
||||
}, ct);
|
||||
}
|
||||
else if (batchResult.NotFound.Contains(cveId))
|
||||
{
|
||||
results[cveId] = SignalState<EpssEvidence>.Absent(now, "first.org");
|
||||
}
|
||||
else
|
||||
{
|
||||
results[cveId] = SignalState<EpssEvidence>.Failed("Batch query did not return result");
|
||||
}
|
||||
}
|
||||
|
||||
return results;
|
||||
}
|
||||
}
|
||||
```

#### KevSignalAttacher Implementation

```csharp
namespace StellaOps.Feedser.Signals;

/// <summary>
/// Attaches the KEV (Known Exploited Vulnerabilities) flag to CVE observations.
/// </summary>
public sealed class KevSignalAttacher : ISignalAttacher<bool>
{
    private readonly IKevCatalog _kevCatalog;
    private readonly IEventPublisher _eventPublisher;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<KevSignalAttacher> _logger;

    public KevSignalAttacher(
        IKevCatalog kevCatalog,
        IEventPublisher eventPublisher,
        TimeProvider timeProvider,
        ILogger<KevSignalAttacher> logger)
    {
        _kevCatalog = kevCatalog;
        _eventPublisher = eventPublisher;
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public async Task<SignalState<bool>> AttachAsync(
        string cveId,
        string purl,
        CancellationToken ct = default)
    {
        var now = _timeProvider.GetUtcNow();

        try
        {
            var isInKev = await _kevCatalog.ContainsAsync(cveId, ct);

            await _eventPublisher.PublishAsync(new SignalUpdatedEvent
            {
                EventType = "kev.updated",
                CveId = cveId,
                Purl = purl,
                UpdatedAt = now,
                Source = "cisa-kev",
                NewValue = isInKev
            }, ct);

            return SignalState<bool>.WithValue(isInKev, now, "cisa-kev");
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Failed to check KEV for CVE {CveId}", cveId);
            return SignalState<bool>.Failed(ex.Message);
        }
    }

    public async Task<IReadOnlyDictionary<string, SignalState<bool>>> AttachBatchAsync(
        IEnumerable<(string CveId, string Purl)> requests,
        CancellationToken ct = default)
    {
        var results = new Dictionary<string, SignalState<bool>>();

        foreach (var (cveId, purl) in requests)
        {
            results[cveId] = await AttachAsync(cveId, purl, ct);
        }

        return results;
    }
}
```

### VexLens: Signal Emission

#### VexSignalEmitter

```csharp
namespace StellaOps.VexLens.Signals;

/// <summary>
/// Emits VEX signal updates when VEX documents are processed.
/// </summary>
public sealed class VexSignalEmitter
{
    private readonly IEventPublisher _eventPublisher;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<VexSignalEmitter> _logger;

    public VexSignalEmitter(
        IEventPublisher eventPublisher,
        TimeProvider timeProvider,
        ILogger<VexSignalEmitter> logger)
    {
        _eventPublisher = eventPublisher;
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public async Task EmitVexUpdateAsync(
        string cveId,
        string purl,
        VexClaimSummary newClaim,
        VexClaimSummary? previousClaim,
        CancellationToken ct = default)
    {
        var now = _timeProvider.GetUtcNow();

        await _eventPublisher.PublishAsync(new SignalUpdatedEvent
        {
            EventType = DeterminizationEventTypes.VexUpdated,
            CveId = cveId,
            Purl = purl,
            UpdatedAt = now,
            Source = newClaim.Issuer,
            NewValue = newClaim,
            PreviousValue = previousClaim
        }, ct);

        _logger.LogInformation(
            "Emitted VEX update for CVE {CveId}: {Status} from {Issuer} (previous: {PreviousStatus})",
            cveId,
            newClaim.Status,
            newClaim.Issuer,
            previousClaim?.Status ?? "none");
    }
}

/// <summary>
/// Converts normalized VEX documents to signal-compatible summaries.
/// </summary>
public sealed class VexClaimSummaryMapper
{
    public VexClaimSummary Map(NormalizedVexStatement statement, double issuerTrust)
    {
        return new VexClaimSummary
        {
            Status = statement.Status.ToString().ToLowerInvariant(),
            Justification = statement.Justification?.ToString(),
            Issuer = statement.IssuerId,
            IssuerTrust = issuerTrust
        };
    }
}
```

### Graph: CVE Node Enhancement

#### Enhanced CveObservationNode

```csharp
namespace StellaOps.Graph.Indexer.Nodes;

/// <summary>
/// Enhanced CVE observation node with determinization state.
/// </summary>
public sealed record CveObservationNode
{
    /// <summary>Node identifier (CVE ID + PURL hash).</summary>
    public required string NodeId { get; init; }

    /// <summary>CVE identifier.</summary>
    public required string CveId { get; init; }

    /// <summary>Subject component PURL.</summary>
    public required string SubjectPurl { get; init; }

    /// <summary>VEX status (orthogonal to observation state).</summary>
    public VexClaimStatus? VexStatus { get; init; }

    /// <summary>Observation lifecycle state.</summary>
    public required ObservationState ObservationState { get; init; }

    /// <summary>Knowledge completeness score.</summary>
    public required UncertaintyScore Uncertainty { get; init; }

    /// <summary>Evidence freshness decay.</summary>
    public required ObservationDecay Decay { get; init; }

    /// <summary>Aggregated trust score [0.0-1.0].</summary>
    public required double TrustScore { get; init; }

    /// <summary>Policy verdict status.</summary>
    public required PolicyVerdictStatus PolicyHint { get; init; }

    /// <summary>Guardrails if PolicyHint is GuardedPass.</summary>
    public GuardRails? GuardRails { get; init; }

    /// <summary>Signal snapshot timestamp.</summary>
    public required DateTimeOffset LastEvaluatedAt { get; init; }

    /// <summary>Next scheduled review (if guarded or stale).</summary>
    public DateTimeOffset? NextReviewAt { get; init; }

    /// <summary>Environment where observation applies.</summary>
    public DeploymentEnvironment? Environment { get; init; }

    /// <summary>Generates a deterministic node ID from CVE and PURL.</summary>
    public static string GenerateNodeId(string cveId, string purl)
    {
        var input = $"{cveId}|{purl}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return $"obs:{Convert.ToHexString(hash)[..16].ToLowerInvariant()}";
    }
}
```

#### CveObservationNodeRepository

```csharp
namespace StellaOps.Graph.Indexer.Repositories;

/// <summary>
/// Repository for CVE observation nodes in the graph.
/// </summary>
public interface ICveObservationNodeRepository
{
    /// <summary>Get observation node by CVE and PURL.</summary>
    Task<CveObservationNode?> GetAsync(string cveId, string purl, CancellationToken ct = default);

    /// <summary>Get all observations for a CVE.</summary>
    Task<IReadOnlyList<CveObservationNode>> GetByCveAsync(string cveId, CancellationToken ct = default);

    /// <summary>Get all observations for a component.</summary>
    Task<IReadOnlyList<CveObservationNode>> GetByPurlAsync(string purl, CancellationToken ct = default);

    /// <summary>Get observations in a specific state.</summary>
    Task<IReadOnlyList<CveObservationNode>> GetByStateAsync(
        ObservationState state,
        int limit = 100,
        CancellationToken ct = default);

    /// <summary>Get observations needing review (past NextReviewAt).</summary>
    Task<IReadOnlyList<CveObservationNode>> GetPendingReviewAsync(
        DateTimeOffset asOf,
        int limit = 100,
        CancellationToken ct = default);

    /// <summary>Upsert observation node.</summary>
    Task UpsertAsync(CveObservationNode node, CancellationToken ct = default);

    /// <summary>Update observation state.</summary>
    Task UpdateStateAsync(
        string nodeId,
        ObservationState newState,
        DeterminizationGateResult? result,
        CancellationToken ct = default);
}

/// <summary>
/// PostgreSQL implementation of the observation node repository.
/// GetByCveAsync, GetByPurlAsync, GetByStateAsync, and UpdateStateAsync follow
/// the same pattern as the methods shown and are omitted here for brevity.
/// </summary>
public sealed class PostgresCveObservationNodeRepository : ICveObservationNodeRepository
{
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly ILogger<PostgresCveObservationNodeRepository> _logger;

    private const string TableName = "graph.cve_observation_nodes";

    public PostgresCveObservationNodeRepository(
        IDbConnectionFactory connectionFactory,
        ILogger<PostgresCveObservationNodeRepository> logger)
    {
        _connectionFactory = connectionFactory;
        _logger = logger;
    }

    public async Task<CveObservationNode?> GetAsync(
        string cveId,
        string purl,
        CancellationToken ct = default)
    {
        var nodeId = CveObservationNode.GenerateNodeId(cveId, purl);

        await using var connection = await _connectionFactory.CreateAsync(ct);

        var sql = $"""
            SELECT
                node_id,
                cve_id,
                subject_purl,
                vex_status,
                observation_state,
                uncertainty_entropy,
                uncertainty_completeness,
                uncertainty_tier,
                uncertainty_missing_signals,
                decay_half_life_days,
                decay_floor,
                decay_last_update,
                decay_multiplier,
                decay_is_stale,
                trust_score,
                policy_hint,
                guard_rails,
                last_evaluated_at,
                next_review_at,
                environment
            FROM {TableName}
            WHERE node_id = @NodeId
            """;

        return await connection.QuerySingleOrDefaultAsync<CveObservationNode>(
            sql,
            new { NodeId = nodeId },
            ct);
    }

    public async Task UpsertAsync(CveObservationNode node, CancellationToken ct = default)
    {
        await using var connection = await _connectionFactory.CreateAsync(ct);

        var sql = $"""
            INSERT INTO {TableName} (
                node_id,
                cve_id,
                subject_purl,
                vex_status,
                observation_state,
                uncertainty_entropy,
                uncertainty_completeness,
                uncertainty_tier,
                uncertainty_missing_signals,
                decay_half_life_days,
                decay_floor,
                decay_last_update,
                decay_multiplier,
                decay_is_stale,
                trust_score,
                policy_hint,
                guard_rails,
                last_evaluated_at,
                next_review_at,
                environment,
                created_at,
                updated_at
            ) VALUES (
                @NodeId,
                @CveId,
                @SubjectPurl,
                @VexStatus,
                @ObservationState,
                @UncertaintyEntropy,
                @UncertaintyCompleteness,
                @UncertaintyTier,
                @UncertaintyMissingSignals,
                @DecayHalfLifeDays,
                @DecayFloor,
                @DecayLastUpdate,
                @DecayMultiplier,
                @DecayIsStale,
                @TrustScore,
                @PolicyHint,
                @GuardRails,
                @LastEvaluatedAt,
                @NextReviewAt,
                @Environment,
                NOW(),
                NOW()
            )
            ON CONFLICT (node_id) DO UPDATE SET
                vex_status = EXCLUDED.vex_status,
                observation_state = EXCLUDED.observation_state,
                uncertainty_entropy = EXCLUDED.uncertainty_entropy,
                uncertainty_completeness = EXCLUDED.uncertainty_completeness,
                uncertainty_tier = EXCLUDED.uncertainty_tier,
                uncertainty_missing_signals = EXCLUDED.uncertainty_missing_signals,
                decay_half_life_days = EXCLUDED.decay_half_life_days,
                decay_floor = EXCLUDED.decay_floor,
                decay_last_update = EXCLUDED.decay_last_update,
                decay_multiplier = EXCLUDED.decay_multiplier,
                decay_is_stale = EXCLUDED.decay_is_stale,
                trust_score = EXCLUDED.trust_score,
                policy_hint = EXCLUDED.policy_hint,
                guard_rails = EXCLUDED.guard_rails,
                last_evaluated_at = EXCLUDED.last_evaluated_at,
                next_review_at = EXCLUDED.next_review_at,
                environment = EXCLUDED.environment,
                updated_at = NOW()
            """;

        var parameters = new
        {
            node.NodeId,
            node.CveId,
            node.SubjectPurl,
            VexStatus = node.VexStatus?.ToString(),
            ObservationState = node.ObservationState.ToString(),
            UncertaintyEntropy = node.Uncertainty.Entropy,
            UncertaintyCompleteness = node.Uncertainty.Completeness,
            UncertaintyTier = node.Uncertainty.Tier.ToString(),
            UncertaintyMissingSignals = JsonSerializer.Serialize(node.Uncertainty.MissingSignals),
            DecayHalfLifeDays = node.Decay.HalfLife.TotalDays,
            DecayFloor = node.Decay.Floor,
            DecayLastUpdate = node.Decay.LastSignalUpdate,
            DecayMultiplier = node.Decay.DecayedMultiplier,
            DecayIsStale = node.Decay.IsStale,
            node.TrustScore,
            PolicyHint = node.PolicyHint.ToString(),
            GuardRails = node.GuardRails is not null ? JsonSerializer.Serialize(node.GuardRails) : null,
            node.LastEvaluatedAt,
            node.NextReviewAt,
            Environment = node.Environment?.ToString()
        };

        await connection.ExecuteAsync(sql, parameters, ct);
    }

    public async Task<IReadOnlyList<CveObservationNode>> GetPendingReviewAsync(
        DateTimeOffset asOf,
        int limit = 100,
        CancellationToken ct = default)
    {
        await using var connection = await _connectionFactory.CreateAsync(ct);

        var sql = $"""
            SELECT *
            FROM {TableName}
            WHERE next_review_at <= @AsOf
              AND observation_state IN ('PendingDeterminization', 'StaleRequiresRefresh')
            ORDER BY next_review_at ASC
            LIMIT @Limit
            """;

        var results = await connection.QueryAsync<CveObservationNode>(
            sql,
            new { AsOf = asOf, Limit = limit },
            ct);

        return results.ToList();
    }
}
```

#### Database Migration

```sql
-- Migration: Add CVE observation nodes table
-- File: src/Graph/StellaOps.Graph.Indexer/Migrations/003_cve_observation_nodes.sql

CREATE TABLE IF NOT EXISTS graph.cve_observation_nodes (
    node_id TEXT PRIMARY KEY,
    cve_id TEXT NOT NULL,
    subject_purl TEXT NOT NULL,
    vex_status TEXT,
    observation_state TEXT NOT NULL DEFAULT 'PendingDeterminization',

    -- Uncertainty score
    uncertainty_entropy DOUBLE PRECISION NOT NULL,
    uncertainty_completeness DOUBLE PRECISION NOT NULL,
    uncertainty_tier TEXT NOT NULL,
    uncertainty_missing_signals JSONB NOT NULL DEFAULT '[]',

    -- Decay tracking
    decay_half_life_days DOUBLE PRECISION NOT NULL DEFAULT 14,
    decay_floor DOUBLE PRECISION NOT NULL DEFAULT 0.35,
    decay_last_update TIMESTAMPTZ NOT NULL,
    decay_multiplier DOUBLE PRECISION NOT NULL,
    decay_is_stale BOOLEAN NOT NULL DEFAULT FALSE,

    -- Trust and policy
    trust_score DOUBLE PRECISION NOT NULL,
    policy_hint TEXT NOT NULL,
    guard_rails JSONB,

    -- Timestamps
    last_evaluated_at TIMESTAMPTZ NOT NULL,
    next_review_at TIMESTAMPTZ,
    environment TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    CONSTRAINT uq_cve_observation_cve_purl UNIQUE (cve_id, subject_purl)
);

-- Indexes for common queries
CREATE INDEX idx_cve_obs_cve_id ON graph.cve_observation_nodes(cve_id);
CREATE INDEX idx_cve_obs_purl ON graph.cve_observation_nodes(subject_purl);
CREATE INDEX idx_cve_obs_state ON graph.cve_observation_nodes(observation_state);
CREATE INDEX idx_cve_obs_review ON graph.cve_observation_nodes(next_review_at)
    WHERE observation_state IN ('PendingDeterminization', 'StaleRequiresRefresh');
CREATE INDEX idx_cve_obs_policy ON graph.cve_observation_nodes(policy_hint);

-- Trigger for updated_at
CREATE OR REPLACE FUNCTION graph.update_cve_obs_timestamp()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_cve_obs_updated
    BEFORE UPDATE ON graph.cve_observation_nodes
    FOR EACH ROW EXECUTE FUNCTION graph.update_cve_obs_timestamp();
```

### Findings: Observation Persistence

#### IObservationRepository (Full Implementation)

```csharp
namespace StellaOps.Findings.Ledger.Repositories;

/// <summary>
/// Repository for CVE observations in the findings ledger.
/// </summary>
public interface IObservationRepository
{
    /// <summary>Find observations by CVE and PURL.</summary>
    Task<IReadOnlyList<CveObservation>> FindByCveAndPurlAsync(
        string cveId,
        string purl,
        CancellationToken ct = default);

    /// <summary>Get observation by ID.</summary>
    Task<CveObservation?> GetByIdAsync(Guid id, CancellationToken ct = default);

    /// <summary>Create new observation.</summary>
    Task<CveObservation> CreateAsync(CveObservation observation, CancellationToken ct = default);

    /// <summary>Update observation state with audit trail.</summary>
    Task UpdateStateAsync(
        Guid id,
        ObservationState newState,
        DeterminizationGateResult? result,
        CancellationToken ct = default);

    /// <summary>Get observations needing review.</summary>
    Task<IReadOnlyList<CveObservation>> GetPendingReviewAsync(
        DateTimeOffset asOf,
        int limit = 100,
        CancellationToken ct = default);

    /// <summary>Record state transition in audit log.</summary>
    Task RecordTransitionAsync(
        Guid observationId,
        ObservationState fromState,
        ObservationState toState,
        string reason,
        CancellationToken ct = default);
}

/// <summary>
/// CVE observation entity for the findings ledger.
/// </summary>
public sealed record CveObservation
{
    public required Guid Id { get; init; }
    public required string CveId { get; init; }
    public required string SubjectPurl { get; init; }
    public required ObservationState ObservationState { get; init; }
    public required DeploymentEnvironment Environment { get; init; }
    public UncertaintyScore? LastUncertaintyScore { get; init; }
    public double? LastTrustScore { get; init; }
    public PolicyVerdictStatus? LastPolicyHint { get; init; }
    public GuardRails? GuardRails { get; init; }
    public required DateTimeOffset CreatedAt { get; init; }
    public required DateTimeOffset UpdatedAt { get; init; }
    public DateTimeOffset? NextReviewAt { get; init; }
}
```
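
The Postgres implementation behind this interface is left to task DBI-013. As one illustration of the audit-trail requirement, `RecordTransitionAsync` can reduce to a single append-only insert; the `findings.observation_transitions` table name and its columns below are assumptions for the sketch, not part of the sprint:

```csharp
public async Task RecordTransitionAsync(
    Guid observationId,
    ObservationState fromState,
    ObservationState toState,
    string reason,
    CancellationToken ct = default)
{
    await using var connection = await _connectionFactory.CreateAsync(ct);

    // Append-only audit row; table and column names are illustrative.
    const string sql = """
        INSERT INTO findings.observation_transitions
            (observation_id, from_state, to_state, reason, transitioned_at)
        VALUES (@ObservationId, @FromState, @ToState, @Reason, NOW())
        """;

    await connection.ExecuteAsync(sql, new
    {
        ObservationId = observationId,
        FromState = fromState.ToString(),
        ToState = toState.ToString(),
        Reason = reason
    }, ct);
}
```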

### SignalSnapshotBuilder (Full Implementation)

```csharp
namespace StellaOps.Policy.Engine.Signals;

/// <summary>
/// Builds signal snapshots by aggregating from multiple sources.
/// </summary>
public interface ISignalSnapshotBuilder
{
    /// <summary>Build snapshot for a CVE/PURL pair.</summary>
    Task<SignalSnapshot> BuildAsync(string cveId, string purl, CancellationToken ct = default);
}

public sealed class SignalSnapshotBuilder : ISignalSnapshotBuilder
{
    private readonly ISignalAttacher<EpssEvidence> _epssAttacher;
    private readonly ISignalAttacher<bool> _kevAttacher;
    private readonly IVexSignalProvider _vexProvider;
    private readonly IReachabilitySignalProvider _reachabilityProvider;
    private readonly IRuntimeSignalProvider _runtimeProvider;
    private readonly IBackportSignalProvider _backportProvider;
    private readonly ISbomLineageSignalProvider _sbomProvider;
    private readonly ICvssSignalProvider _cvssProvider;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<SignalSnapshotBuilder> _logger;

    // Constructor assigning all injected dependencies omitted for brevity.

    public async Task<SignalSnapshot> BuildAsync(
        string cveId,
        string purl,
        CancellationToken ct = default)
    {
        var now = _timeProvider.GetUtcNow();

        _logger.LogDebug("Building signal snapshot for CVE {CveId} on {Purl}", cveId, purl);

        // Fetch all signals in parallel
        var epssTask = _epssAttacher.AttachAsync(cveId, purl, ct);
        var kevTask = _kevAttacher.AttachAsync(cveId, purl, ct);
        var vexTask = _vexProvider.GetSignalAsync(cveId, purl, ct);
        var reachTask = _reachabilityProvider.GetSignalAsync(cveId, purl, ct);
        var runtimeTask = _runtimeProvider.GetSignalAsync(cveId, purl, ct);
        var backportTask = _backportProvider.GetSignalAsync(cveId, purl, ct);
        var sbomTask = _sbomProvider.GetSignalAsync(purl, ct);
        var cvssTask = _cvssProvider.GetSignalAsync(cveId, ct);

        await Task.WhenAll(
            epssTask, kevTask, vexTask, reachTask,
            runtimeTask, backportTask, sbomTask, cvssTask);

        var snapshot = new SignalSnapshot
        {
            CveId = cveId,
            SubjectPurl = purl,
            CapturedAt = now,
            Epss = await epssTask,
            Kev = await kevTask,
            Vex = await vexTask,
            Reachability = await reachTask,
            Runtime = await runtimeTask,
            Backport = await backportTask,
            SbomLineage = await sbomTask,
            Cvss = await cvssTask
        };

        _logger.LogDebug(
            "Built signal snapshot for CVE {CveId}: EPSS={EpssStatus}, VEX={VexStatus}, Reach={ReachStatus}",
            cveId,
            snapshot.Epss.Status,
            snapshot.Vex.Status,
            snapshot.Reachability.Status);

        return snapshot;
    }
}
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DBI-001 | TODO | DPE-030 | Guild | Create `ISignalAttacher<T>` interface in Feedser |
| 2 | DBI-002 | TODO | DBI-001 | Guild | Implement `EpssSignalAttacher` with event emission |
| 3 | DBI-003 | TODO | DBI-002 | Guild | Implement `KevSignalAttacher` |
| 4 | DBI-004 | TODO | DBI-003 | Guild | Create `SignalAttacherServiceExtensions` for DI |
| 5 | DBI-005 | TODO | DBI-004 | Guild | Create `VexSignalEmitter` in VexLens |
| 6 | DBI-006 | TODO | DBI-005 | Guild | Create `VexClaimSummaryMapper` |
| 7 | DBI-007 | TODO | DBI-006 | Guild | Integrate VexSignalEmitter into VEX processing pipeline |
| 8 | DBI-008 | TODO | DBI-007 | Guild | Create `CveObservationNode` record in Graph |
| 9 | DBI-009 | TODO | DBI-008 | Guild | Create `ICveObservationNodeRepository` interface |
| 10 | DBI-010 | TODO | DBI-009 | Guild | Implement `PostgresCveObservationNodeRepository` |
| 11 | DBI-011 | TODO | DBI-010 | Guild | Create migration `003_cve_observation_nodes.sql` |
| 12 | DBI-012 | TODO | DBI-011 | Guild | Create `IObservationRepository` in Findings |
| 13 | DBI-013 | TODO | DBI-012 | Guild | Implement `PostgresObservationRepository` |
| 14 | DBI-014 | TODO | DBI-013 | Guild | Create `ISignalSnapshotBuilder` interface |
| 15 | DBI-015 | TODO | DBI-014 | Guild | Implement `SignalSnapshotBuilder` with parallel fetch |
| 16 | DBI-016 | TODO | DBI-015 | Guild | Create signal provider interfaces (VEX, Reachability, etc.) |
| 17 | DBI-017 | TODO | DBI-016 | Guild | Implement signal provider adapters |
| 18 | DBI-018 | TODO | DBI-017 | Guild | Write unit tests: `EpssSignalAttacher` scenarios |
| 19 | DBI-019 | TODO | DBI-018 | Guild | Write unit tests: `SignalSnapshotBuilder` parallel fetch |
| 20 | DBI-020 | TODO | DBI-019 | Guild | Write integration tests: Graph node persistence |
| 21 | DBI-021 | TODO | DBI-020 | Guild | Write integration tests: Findings observation lifecycle |
| 22 | DBI-022 | TODO | DBI-021 | Guild | Write integration tests: end-to-end signal flow |
| 23 | DBI-023 | TODO | DBI-022 | Guild | Add metrics: `stellaops_feedser_signal_attachments_total` |
| 24 | DBI-024 | TODO | DBI-023 | Guild | Add metrics: `stellaops_graph_observation_nodes_total` |
| 25 | DBI-025 | TODO | DBI-024 | Guild | Update module AGENTS.md files |
| 26 | DBI-026 | TODO | DBI-025 | Guild | Verify build across all affected modules |

## Acceptance Criteria

1. `EpssSignalAttacher` correctly wraps EPSS results in `SignalState<T>`
2. VEX updates emit `SignalUpdatedEvent` for downstream processing
3. Graph nodes persist `ObservationState` and `UncertaintyScore`
4. Findings ledger tracks state transitions with an audit trail
5. `SignalSnapshotBuilder` fetches all signals in parallel
6. Migration creates proper indexes for common queries
7. All integration tests pass with Testcontainers

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Parallel signal fetch | Reduces latency; signals are independent |
| Graph node hash ID | Deterministic; avoids UUID collision across systems |
| JSONB for missing_signals | Flexible schema; supports varying signal sets |
| Separate Graph and Findings storage | Graph for query patterns; Findings for audit trail |

| Risk | Mitigation |
|------|------------|
| Signal provider availability | Graceful degradation to `SignalState.Failed` |
| Event storm on bulk VEX import | Batch event emission; debounce handler |
| Schema drift across modules | Shared Evidence models in Determinization library |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |

## Next Checkpoints

- 2026-01-12: DBI-001 to DBI-011 complete (Feedser, VexLens, Graph)
- 2026-01-13: DBI-012 to DBI-017 complete (Findings, SignalSnapshotBuilder)
- 2026-01-14: DBI-018 to DBI-026 complete (tests, metrics)

---

> New file: `docs/implplan/SPRINT_20260106_001_005_FE_determinization_ui.md` (914 lines)

# Sprint 20260106_001_005_FE - Determinization: Frontend UI Components

## Topic & Scope

Create Angular UI components for displaying and managing CVE observation state, uncertainty scores, guardrails status, and review workflows. This includes the "Unknown (auto-tracking)" chip with a next-review ETA and a determinization dashboard.

- **Working directory:** `src/Web/StellaOps.Web/`
- **Evidence:** Angular components, services, tests, Storybook stories

## Problem Statement

Current UI state:

- Vulnerability findings show VEX status but not observation state
- No visibility into uncertainty/entropy levels
- No guardrails status indicator
- No review workflow for uncertain observations

Advisory requires:

- UI chip: "Unknown (auto-tracking)" with next review ETA
- Uncertainty tier visualization
- Guardrails status and monitoring indicators
- Review queue for pending observations
- State transition history
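
The chip copy itself can be pinned down now. A minimal sketch of a pure label-formatting helper the `observation-state-chip` component could delegate to; the function name `formatObservationChipLabel` and the exact wording are assumptions, not part of the sprint:

```typescript
// Hypothetical helper for observation-state-chip: maps observation state plus
// the next-review ETA to the chip label required by the advisory.
export function formatObservationChipLabel(
  state: string,
  nextReviewAt: Date | null,
  now: Date = new Date(),
): string {
  if (state !== 'PendingDeterminization') {
    // Other states render their own labels elsewhere.
    return state;
  }
  if (nextReviewAt === null) {
    return 'Unknown (auto-tracking)';
  }
  // Round up to whole days so "review in 0d" only appears when overdue today.
  const days = Math.max(
    0,
    Math.ceil((nextReviewAt.getTime() - now.getTime()) / 86_400_000),
  );
  return `Unknown (auto-tracking) - review in ${days}d`;
}
```

Keeping this as a pure function makes the chip trivially unit-testable without TestBed.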

## Dependencies & Concurrency

- **Depends on:** SPRINT_20260106_001_004_BE (API endpoints)
- **Blocks:** None (end of chain)
- **Parallel safe:** Frontend-only changes

## Documentation Prerequisites

- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_004_BE (API contracts)
- src/Web/StellaOps.Web/AGENTS.md (if exists)
- Existing: Vulnerability findings components

## Technical Design

### Directory Structure
```
src/Web/StellaOps.Web/src/app/
├── shared/
│   └── components/
│       └── determinization/
│           ├── observation-state-chip/
│           │   ├── observation-state-chip.component.ts
│           │   ├── observation-state-chip.component.html
│           │   ├── observation-state-chip.component.scss
│           │   └── observation-state-chip.component.spec.ts
│           ├── uncertainty-indicator/
│           │   ├── uncertainty-indicator.component.ts
│           │   ├── uncertainty-indicator.component.html
│           │   ├── uncertainty-indicator.component.scss
│           │   └── uncertainty-indicator.component.spec.ts
│           ├── guardrails-badge/
│           │   ├── guardrails-badge.component.ts
│           │   ├── guardrails-badge.component.html
│           │   ├── guardrails-badge.component.scss
│           │   └── guardrails-badge.component.spec.ts
│           ├── decay-progress/
│           │   ├── decay-progress.component.ts
│           │   ├── decay-progress.component.html
│           │   ├── decay-progress.component.scss
│           │   └── decay-progress.component.spec.ts
│           └── determinization.module.ts
├── features/
│   └── vulnerabilities/
│       └── components/
│           ├── observation-details-panel/
│           │   ├── observation-details-panel.component.ts
│           │   ├── observation-details-panel.component.html
│           │   └── observation-details-panel.component.scss
│           └── observation-review-queue/
│               ├── observation-review-queue.component.ts
│               ├── observation-review-queue.component.html
│               └── observation-review-queue.component.scss
└── core/
    ├── services/
    │   └── determinization/
    │       ├── determinization.service.ts
    │       ├── determinization.models.ts
    │       └── determinization.service.spec.ts
    └── models/
        └── determinization.models.ts
```
|
||||
|
||||
### TypeScript Models

```typescript
// src/app/core/models/determinization.models.ts

export enum ObservationState {
  PendingDeterminization = 'PendingDeterminization',
  Determined = 'Determined',
  Disputed = 'Disputed',
  StaleRequiresRefresh = 'StaleRequiresRefresh',
  ManualReviewRequired = 'ManualReviewRequired',
  Suppressed = 'Suppressed'
}

export enum UncertaintyTier {
  VeryLow = 'VeryLow',
  Low = 'Low',
  Medium = 'Medium',
  High = 'High',
  VeryHigh = 'VeryHigh'
}

export enum PolicyVerdictStatus {
  Pass = 'Pass',
  GuardedPass = 'GuardedPass',
  Blocked = 'Blocked',
  Ignored = 'Ignored',
  Warned = 'Warned',
  Deferred = 'Deferred',
  Escalated = 'Escalated',
  RequiresVex = 'RequiresVex'
}

export interface UncertaintyScore {
  entropy: number;
  completeness: number;
  tier: UncertaintyTier;
  missingSignals: SignalGap[];
  weightedEvidenceSum: number;
  maxPossibleWeight: number;
}

export interface SignalGap {
  signalName: string;
  weight: number;
  status: 'NotQueried' | 'Queried' | 'Failed';
  reason?: string;
}

export interface ObservationDecay {
  halfLifeDays: number;
  floor: number;
  lastSignalUpdate: string;
  decayedMultiplier: number;
  nextReviewAt?: string;
  isStale: boolean;
  ageDays: number;
}

export interface GuardRails {
  enableRuntimeMonitoring: boolean;
  reviewIntervalDays: number;
  epssEscalationThreshold: number;
  escalatingReachabilityStates: string[];
  maxGuardedDurationDays: number;
  alertChannels: string[];
  policyRationale?: string;
}

export interface CveObservation {
  id: string;
  cveId: string;
  subjectPurl: string;
  observationState: ObservationState;
  uncertaintyScore: UncertaintyScore;
  decay: ObservationDecay;
  trustScore: number;
  policyHint: PolicyVerdictStatus;
  guardRails?: GuardRails;
  lastEvaluatedAt: string;
  nextReviewAt?: string;
  environment?: string;
  vexStatus?: string;
}

export interface ObservationStateTransition {
  id: string;
  observationId: string;
  fromState: ObservationState;
  toState: ObservationState;
  reason: string;
  triggeredBy: string;
  timestamp: string;
}
```
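
The backend computes `tier`, so the UI only renders it; for intuition, a completeness-to-tier mapping could look like the sketch below. The 0.9/0.75/0.5/0.25 cut-offs are illustrative assumptions, not the product's actual bands.

```typescript
// Mirrors the UncertaintyTier enum from determinization.models.ts.
enum UncertaintyTier {
  VeryLow = 'VeryLow',
  Low = 'Low',
  Medium = 'Medium',
  High = 'High',
  VeryHigh = 'VeryHigh'
}

// Higher evidence completeness means lower uncertainty.
// Threshold values here are hypothetical, chosen only to illustrate the shape.
function tierFromCompleteness(completeness: number): UncertaintyTier {
  if (completeness >= 0.9) return UncertaintyTier.VeryLow;
  if (completeness >= 0.75) return UncertaintyTier.Low;
  if (completeness >= 0.5) return UncertaintyTier.Medium;
  if (completeness >= 0.25) return UncertaintyTier.High;
  return UncertaintyTier.VeryHigh;
}
```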

### ObservationStateChip Component

```typescript
// observation-state-chip.component.ts

import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatChipsModule } from '@angular/material/chips';
import { MatIconModule } from '@angular/material/icon';
import { MatTooltipModule } from '@angular/material/tooltip';
import { ObservationState, CveObservation } from '@core/models/determinization.models';
import { formatDistanceToNow, parseISO } from 'date-fns';

@Component({
  selector: 'stellaops-observation-state-chip',
  standalone: true,
  imports: [CommonModule, MatChipsModule, MatIconModule, MatTooltipModule],
  templateUrl: './observation-state-chip.component.html',
  styleUrls: ['./observation-state-chip.component.scss'],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class ObservationStateChipComponent {
  @Input({ required: true }) observation!: CveObservation;
  @Input() showReviewEta = true;

  get stateConfig(): StateConfig {
    return STATE_CONFIGS[this.observation.observationState];
  }

  get reviewEtaText(): string | null {
    if (!this.observation.nextReviewAt) return null;
    const nextReview = parseISO(this.observation.nextReviewAt);
    return formatDistanceToNow(nextReview, { addSuffix: true });
  }

  get tooltipText(): string {
    const config = this.stateConfig;
    let tooltip = config.description;

    if (this.observation.observationState === ObservationState.PendingDeterminization) {
      const missing = this.observation.uncertaintyScore.missingSignals
        .map(g => g.signalName)
        .join(', ');
      if (missing) {
        tooltip += ` Missing: ${missing}`;
      }
    }

    if (this.reviewEtaText) {
      tooltip += ` Next review: ${this.reviewEtaText}`;
    }

    return tooltip;
  }
}

interface StateConfig {
  label: string;
  icon: string;
  color: 'primary' | 'accent' | 'warn' | 'default';
  description: string;
}

const STATE_CONFIGS: Record<ObservationState, StateConfig> = {
  [ObservationState.PendingDeterminization]: {
    label: 'Unknown (auto-tracking)',
    icon: 'hourglass_empty',
    color: 'accent',
    description: 'Evidence incomplete; tracking for updates.'
  },
  [ObservationState.Determined]: {
    label: 'Determined',
    icon: 'check_circle',
    color: 'primary',
    description: 'Sufficient evidence for confident determination.'
  },
  [ObservationState.Disputed]: {
    label: 'Disputed',
    icon: 'warning',
    color: 'warn',
    description: 'Conflicting evidence detected; requires review.'
  },
  [ObservationState.StaleRequiresRefresh]: {
    label: 'Stale',
    icon: 'update',
    color: 'warn',
    description: 'Evidence has decayed; needs refresh.'
  },
  [ObservationState.ManualReviewRequired]: {
    label: 'Review Required',
    icon: 'rate_review',
    color: 'warn',
    description: 'Manual review required before proceeding.'
  },
  [ObservationState.Suppressed]: {
    label: 'Suppressed',
    icon: 'visibility_off',
    color: 'default',
    description: 'Observation suppressed by policy exception.'
  }
};
```

```html
<!-- observation-state-chip.component.html -->

<mat-chip
  [class]="'observation-chip observation-chip--' + observation.observationState.toLowerCase()"
  [matTooltip]="tooltipText"
  matTooltipPosition="above">
  <mat-icon class="chip-icon">{{ stateConfig.icon }}</mat-icon>
  <span class="chip-label">{{ stateConfig.label }}</span>
  <span *ngIf="showReviewEta && reviewEtaText" class="chip-eta">
    ({{ reviewEtaText }})
  </span>
</mat-chip>
```

```scss
// observation-state-chip.component.scss

.observation-chip {
  display: inline-flex;
  align-items: center;
  gap: 4px;
  font-size: 12px;
  height: 24px;

  .chip-icon {
    font-size: 16px;
    width: 16px;
    height: 16px;
  }

  .chip-eta {
    font-size: 10px;
    opacity: 0.8;
  }

  &--pendingdeterminization {
    background-color: #fff3e0;
    color: #e65100;
  }

  &--determined {
    background-color: #e8f5e9;
    color: #2e7d32;
  }

  &--disputed {
    background-color: #fff8e1;
    color: #f57f17;
  }

  &--stalerequiresrefresh {
    background-color: #fce4ec;
    color: #c2185b;
  }

  &--manualreviewrequired {
    background-color: #ffebee;
    color: #c62828;
  }

  &--suppressed {
    background-color: #f5f5f5;
    color: #757575;
  }
}
```

### UncertaintyIndicator Component

```typescript
// uncertainty-indicator.component.ts

import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatProgressBarModule } from '@angular/material/progress-bar';
import { MatTooltipModule } from '@angular/material/tooltip';
import { UncertaintyScore, UncertaintyTier } from '@core/models/determinization.models';

@Component({
  selector: 'stellaops-uncertainty-indicator',
  standalone: true,
  imports: [CommonModule, MatProgressBarModule, MatTooltipModule],
  templateUrl: './uncertainty-indicator.component.html',
  styleUrls: ['./uncertainty-indicator.component.scss'],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class UncertaintyIndicatorComponent {
  @Input({ required: true }) score!: UncertaintyScore;
  @Input() showLabel = true;
  @Input() compact = false;

  get completenessPercent(): number {
    return Math.round(this.score.completeness * 100);
  }

  get tierConfig(): TierConfig {
    return TIER_CONFIGS[this.score.tier];
  }

  get tooltipText(): string {
    const missing = this.score.missingSignals.map(g => g.signalName).join(', ');
    return `Evidence completeness: ${this.completenessPercent}%` +
      (missing ? ` | Missing: ${missing}` : '');
  }
}

interface TierConfig {
  label: string;
  color: string;
  barColor: 'primary' | 'accent' | 'warn';
}

const TIER_CONFIGS: Record<UncertaintyTier, TierConfig> = {
  [UncertaintyTier.VeryLow]: {
    label: 'Very Low Uncertainty',
    color: '#4caf50',
    barColor: 'primary'
  },
  [UncertaintyTier.Low]: {
    label: 'Low Uncertainty',
    color: '#8bc34a',
    barColor: 'primary'
  },
  [UncertaintyTier.Medium]: {
    label: 'Moderate Uncertainty',
    color: '#ffc107',
    barColor: 'accent'
  },
  [UncertaintyTier.High]: {
    label: 'High Uncertainty',
    color: '#ff9800',
    barColor: 'warn'
  },
  [UncertaintyTier.VeryHigh]: {
    label: 'Very High Uncertainty',
    color: '#f44336',
    barColor: 'warn'
  }
};
```

```html
<!-- uncertainty-indicator.component.html -->

<div class="uncertainty-indicator"
     [class.compact]="compact"
     [matTooltip]="tooltipText">
  <div class="indicator-header" *ngIf="showLabel">
    <span class="tier-label" [style.color]="tierConfig.color">
      {{ tierConfig.label }}
    </span>
    <span class="completeness-value">{{ completenessPercent }}%</span>
  </div>
  <mat-progress-bar
    [value]="completenessPercent"
    [color]="tierConfig.barColor"
    mode="determinate">
  </mat-progress-bar>
  <div class="missing-signals" *ngIf="!compact && score.missingSignals.length > 0">
    <span class="missing-label">Missing:</span>
    <span class="missing-list">
      <!-- Angular has no built-in map/join pipes; render names with *ngFor + SlicePipe. -->
      <ng-container *ngFor="let gap of score.missingSignals | slice:0:3; let last = last">
        {{ gap.signalName }}<ng-container *ngIf="!last">, </ng-container>
      </ng-container>
      <span *ngIf="score.missingSignals.length > 3">
        +{{ score.missingSignals.length - 3 }} more
      </span>
    </span>
  </div>
</div>
```

### GuardrailsBadge Component

```typescript
// guardrails-badge.component.ts

import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatBadgeModule } from '@angular/material/badge';
import { MatIconModule } from '@angular/material/icon';
import { MatTooltipModule } from '@angular/material/tooltip';
import { GuardRails } from '@core/models/determinization.models';

@Component({
  selector: 'stellaops-guardrails-badge',
  standalone: true,
  imports: [CommonModule, MatBadgeModule, MatIconModule, MatTooltipModule],
  templateUrl: './guardrails-badge.component.html',
  styleUrls: ['./guardrails-badge.component.scss'],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class GuardrailsBadgeComponent {
  @Input({ required: true }) guardRails!: GuardRails;

  get activeGuardrailsCount(): number {
    let count = 0;
    if (this.guardRails.enableRuntimeMonitoring) count++;
    if (this.guardRails.alertChannels.length > 0) count++;
    if (this.guardRails.epssEscalationThreshold < 1.0) count++;
    return count;
  }

  get tooltipText(): string {
    const parts: string[] = [];

    if (this.guardRails.enableRuntimeMonitoring) {
      parts.push('Runtime monitoring enabled');
    }

    parts.push(`Review every ${this.guardRails.reviewIntervalDays} days`);
    parts.push(`EPSS escalation at ${(this.guardRails.epssEscalationThreshold * 100).toFixed(0)}%`);

    if (this.guardRails.alertChannels.length > 0) {
      parts.push(`Alerts: ${this.guardRails.alertChannels.join(', ')}`);
    }

    if (this.guardRails.policyRationale) {
      parts.push(`Rationale: ${this.guardRails.policyRationale}`);
    }

    return parts.join(' | ');
  }
}
```

```html
<!-- guardrails-badge.component.html -->

<div class="guardrails-badge" [matTooltip]="tooltipText">
  <mat-icon
    [matBadge]="activeGuardrailsCount"
    matBadgeColor="accent"
    matBadgeSize="small">
    security
  </mat-icon>
  <span class="badge-label">Guarded</span>
  <div class="guardrails-icons">
    <mat-icon *ngIf="guardRails.enableRuntimeMonitoring"
              class="guardrail-icon"
              matTooltip="Runtime monitoring active">
      monitor_heart
    </mat-icon>
    <mat-icon *ngIf="guardRails.alertChannels.length > 0"
              class="guardrail-icon"
              matTooltip="Alerts configured">
      notifications_active
    </mat-icon>
  </div>
</div>
```

### DecayProgress Component

```typescript
// decay-progress.component.ts

import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatProgressBarModule } from '@angular/material/progress-bar';
import { MatTooltipModule } from '@angular/material/tooltip';
import { ObservationDecay } from '@core/models/determinization.models';
import { formatDistanceToNow, parseISO } from 'date-fns';

@Component({
  selector: 'stellaops-decay-progress',
  standalone: true,
  imports: [CommonModule, MatProgressBarModule, MatTooltipModule],
  templateUrl: './decay-progress.component.html',
  styleUrls: ['./decay-progress.component.scss'],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class DecayProgressComponent {
  @Input({ required: true }) decay!: ObservationDecay;

  get freshness(): number {
    return Math.round(this.decay.decayedMultiplier * 100);
  }

  get ageText(): string {
    return `${this.decay.ageDays.toFixed(1)} days old`;
  }

  get nextReviewText(): string | null {
    if (!this.decay.nextReviewAt) return null;
    return formatDistanceToNow(parseISO(this.decay.nextReviewAt), { addSuffix: true });
  }

  get barColor(): 'primary' | 'accent' | 'warn' {
    if (this.decay.isStale) return 'warn';
    if (this.decay.decayedMultiplier < 0.7) return 'accent';
    return 'primary';
  }

  get tooltipText(): string {
    return `Freshness: ${this.freshness}% | Age: ${this.ageText} | ` +
      `Half-life: ${this.decay.halfLifeDays} days` +
      (this.decay.isStale ? ' | STALE - needs refresh' : '');
  }
}
```
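
`decayedMultiplier` arrives precomputed from the backend; the component only renders it. For intuition, a half-life decay with a floor, consistent with the model's `halfLifeDays` and `floor` fields, could be sketched as follows. The exact formula is an assumption, not a confirmed spec.

```typescript
// Hypothetical half-life decay: freshness halves every halfLifeDays,
// but never drops below the configured floor.
function decayedMultiplier(ageDays: number, halfLifeDays: number, floor: number): number {
  const raw = Math.pow(0.5, ageDays / halfLifeDays);
  return Math.max(floor, raw);
}
```

With a 30-day half-life and a 0.2 floor: freshness is 1.0 at age 0, 0.5 at 30 days, and clamps to 0.2 after roughly 70 days.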

### Determinization Service

```typescript
// determinization.service.ts

import { Injectable, inject } from '@angular/core';
import { HttpClient, HttpParams } from '@angular/common/http';
import { Observable } from 'rxjs';
import {
  CveObservation,
  ObservationState,
  ObservationStateTransition
} from '@core/models/determinization.models';
import { ApiConfig } from '@core/config/api.config';

@Injectable({ providedIn: 'root' })
export class DeterminizationService {
  private readonly http = inject(HttpClient);
  private readonly apiConfig = inject(ApiConfig);

  private get baseUrl(): string {
    return `${this.apiConfig.baseUrl}/api/v1/observations`;
  }

  getObservation(cveId: string, purl: string): Observable<CveObservation> {
    const params = new HttpParams()
      .set('cveId', cveId)
      .set('purl', purl);
    return this.http.get<CveObservation>(this.baseUrl, { params });
  }

  getObservationById(id: string): Observable<CveObservation> {
    return this.http.get<CveObservation>(`${this.baseUrl}/${id}`);
  }

  getPendingReview(limit = 50): Observable<CveObservation[]> {
    const params = new HttpParams()
      .set('state', ObservationState.PendingDeterminization)
      .set('limit', limit.toString());
    return this.http.get<CveObservation[]>(`${this.baseUrl}/pending-review`, { params });
  }

  getByState(state: ObservationState, limit = 100): Observable<CveObservation[]> {
    const params = new HttpParams()
      .set('state', state)
      .set('limit', limit.toString());
    return this.http.get<CveObservation[]>(this.baseUrl, { params });
  }

  getTransitionHistory(observationId: string): Observable<ObservationStateTransition[]> {
    return this.http.get<ObservationStateTransition[]>(
      `${this.baseUrl}/${observationId}/transitions`
    );
  }

  requestReview(observationId: string, reason: string): Observable<void> {
    return this.http.post<void>(
      `${this.baseUrl}/${observationId}/request-review`,
      { reason }
    );
  }

  suppress(observationId: string, reason: string): Observable<void> {
    return this.http.post<void>(
      `${this.baseUrl}/${observationId}/suppress`,
      { reason }
    );
  }

  refreshSignals(observationId: string): Observable<CveObservation> {
    return this.http.post<CveObservation>(
      `${this.baseUrl}/${observationId}/refresh`,
      {}
    );
  }
}
```
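
A framework-free sketch of the request URL that `getObservation` effectively issues; `URLSearchParams` percent-encodes purl characters such as `:` and `/` the same way `HttpParams` does by default:

```typescript
// Builds the query string the service sends for a (cveId, purl) lookup.
function observationUrl(baseUrl: string, cveId: string, purl: string): string {
  const params = new URLSearchParams({ cveId, purl });
  return `${baseUrl}/api/v1/observations?${params.toString()}`;
}

const url = observationUrl(
  'https://stellaops.example', // hypothetical host for illustration
  'CVE-2026-0001',
  'pkg:npm/lodash@4.17.21'
);
```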

### Observation Review Queue Component

```typescript
// observation-review-queue.component.ts

import { Component, OnInit, inject, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { RouterLink } from '@angular/router';
import { MatTableModule } from '@angular/material/table';
import { MatPaginatorModule, PageEvent } from '@angular/material/paginator';
import { MatButtonModule } from '@angular/material/button';
import { MatIconModule } from '@angular/material/icon';
import { MatMenuModule } from '@angular/material/menu';
import { BehaviorSubject } from 'rxjs';
import { DeterminizationService } from '@core/services/determinization/determinization.service';
import { CveObservation } from '@core/models/determinization.models';
import { ObservationStateChipComponent } from '@shared/components/determinization/observation-state-chip/observation-state-chip.component';
import { UncertaintyIndicatorComponent } from '@shared/components/determinization/uncertainty-indicator/uncertainty-indicator.component';
import { GuardrailsBadgeComponent } from '@shared/components/determinization/guardrails-badge/guardrails-badge.component';
import { DecayProgressComponent } from '@shared/components/determinization/decay-progress/decay-progress.component';

@Component({
  selector: 'stellaops-observation-review-queue',
  standalone: true,
  imports: [
    CommonModule,
    RouterLink, // the template's routerLink binding requires this in a standalone component
    MatTableModule,
    MatPaginatorModule,
    MatButtonModule,
    MatIconModule,
    MatMenuModule,
    ObservationStateChipComponent,
    UncertaintyIndicatorComponent,
    GuardrailsBadgeComponent,
    DecayProgressComponent
  ],
  templateUrl: './observation-review-queue.component.html',
  styleUrls: ['./observation-review-queue.component.scss'],
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class ObservationReviewQueueComponent implements OnInit {
  private readonly determinizationService = inject(DeterminizationService);

  displayedColumns = ['cveId', 'purl', 'state', 'uncertainty', 'freshness', 'actions'];
  observations$ = new BehaviorSubject<CveObservation[]>([]);
  loading$ = new BehaviorSubject<boolean>(false);

  pageSize = 25;
  pageIndex = 0;

  ngOnInit(): void {
    this.loadObservations();
  }

  loadObservations(): void {
    this.loading$.next(true);
    this.determinizationService.getPendingReview(this.pageSize)
      .subscribe({
        next: (observations) => {
          this.observations$.next(observations);
          this.loading$.next(false);
        },
        error: () => this.loading$.next(false)
      });
  }

  onPageChange(event: PageEvent): void {
    this.pageSize = event.pageSize;
    this.pageIndex = event.pageIndex;
    this.loadObservations();
  }

  onRefresh(observation: CveObservation): void {
    this.determinizationService.refreshSignals(observation.id)
      .subscribe(() => this.loadObservations());
  }

  onRequestReview(observation: CveObservation): void {
    // Open dialog for review request
  }

  onSuppress(observation: CveObservation): void {
    // Open dialog for suppression
  }
}
```

```html
<!-- observation-review-queue.component.html -->

<div class="review-queue">
  <div class="queue-header">
    <h2>Pending Determinization Review</h2>
    <button mat-icon-button (click)="loadObservations()" matTooltip="Refresh">
      <mat-icon>refresh</mat-icon>
    </button>
  </div>

  <table mat-table [dataSource]="observations$ | async" class="queue-table">
    <!-- CVE ID Column -->
    <ng-container matColumnDef="cveId">
      <th mat-header-cell *matHeaderCellDef>CVE</th>
      <td mat-cell *matCellDef="let obs">
        <a [routerLink]="['/vulnerabilities', obs.cveId]">{{ obs.cveId }}</a>
      </td>
    </ng-container>

    <!-- PURL Column -->
    <ng-container matColumnDef="purl">
      <th mat-header-cell *matHeaderCellDef>Component</th>
      <td mat-cell *matCellDef="let obs" class="purl-cell">
        <!-- slice is a built-in pipe; a custom truncate pipe would need to be written and imported -->
        {{ obs.subjectPurl | slice:0:50 }}
      </td>
    </ng-container>

    <!-- State Column -->
    <ng-container matColumnDef="state">
      <th mat-header-cell *matHeaderCellDef>State</th>
      <td mat-cell *matCellDef="let obs">
        <stellaops-observation-state-chip [observation]="obs">
        </stellaops-observation-state-chip>
      </td>
    </ng-container>

    <!-- Uncertainty Column -->
    <ng-container matColumnDef="uncertainty">
      <th mat-header-cell *matHeaderCellDef>Evidence</th>
      <td mat-cell *matCellDef="let obs">
        <stellaops-uncertainty-indicator
          [score]="obs.uncertaintyScore"
          [compact]="true">
        </stellaops-uncertainty-indicator>
      </td>
    </ng-container>

    <!-- Freshness Column -->
    <ng-container matColumnDef="freshness">
      <th mat-header-cell *matHeaderCellDef>Freshness</th>
      <td mat-cell *matCellDef="let obs">
        <stellaops-decay-progress [decay]="obs.decay">
        </stellaops-decay-progress>
      </td>
    </ng-container>

    <!-- Actions Column -->
    <ng-container matColumnDef="actions">
      <th mat-header-cell *matHeaderCellDef></th>
      <td mat-cell *matCellDef="let obs">
        <button mat-icon-button [matMenuTriggerFor]="menu">
          <mat-icon>more_vert</mat-icon>
        </button>
        <mat-menu #menu="matMenu">
          <button mat-menu-item (click)="onRefresh(obs)">
            <mat-icon>refresh</mat-icon>
            <span>Refresh Signals</span>
          </button>
          <button mat-menu-item (click)="onRequestReview(obs)">
            <mat-icon>rate_review</mat-icon>
            <span>Request Review</span>
          </button>
          <button mat-menu-item (click)="onSuppress(obs)">
            <mat-icon>visibility_off</mat-icon>
            <span>Suppress</span>
          </button>
        </mat-menu>
      </td>
    </ng-container>

    <tr mat-header-row *matHeaderRowDef="displayedColumns"></tr>
    <tr mat-row *matRowDef="let row; columns: displayedColumns;"></tr>
  </table>

  <mat-paginator
    [pageSize]="pageSize"
    [pageIndex]="pageIndex"
    [pageSizeOptions]="[10, 25, 50, 100]"
    (page)="onPageChange($event)">
  </mat-paginator>
</div>
```

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DFE-001 | TODO | DBI-026 | Guild | Create `determinization.models.ts` TypeScript interfaces |
| 2 | DFE-002 | TODO | DFE-001 | Guild | Create `DeterminizationService` with API methods |
| 3 | DFE-003 | TODO | DFE-002 | Guild | Create `ObservationStateChipComponent` |
| 4 | DFE-004 | TODO | DFE-003 | Guild | Create `UncertaintyIndicatorComponent` |
| 5 | DFE-005 | TODO | DFE-004 | Guild | Create `GuardrailsBadgeComponent` |
| 6 | DFE-006 | TODO | DFE-005 | Guild | Create `DecayProgressComponent` |
| 7 | DFE-007 | TODO | DFE-006 | Guild | Create `DeterminizationModule` to export components |
| 8 | DFE-008 | TODO | DFE-007 | Guild | Create `ObservationDetailsPanelComponent` |
| 9 | DFE-009 | TODO | DFE-008 | Guild | Create `ObservationReviewQueueComponent` |
| 10 | DFE-010 | TODO | DFE-009 | Guild | Integrate state chip into existing vulnerability list |
| 11 | DFE-011 | TODO | DFE-010 | Guild | Add uncertainty indicator to vulnerability details |
| 12 | DFE-012 | TODO | DFE-011 | Guild | Add guardrails badge to guarded findings |
| 13 | DFE-013 | TODO | DFE-012 | Guild | Create state transition history timeline component |
| 14 | DFE-014 | TODO | DFE-013 | Guild | Add review queue to navigation |
| 15 | DFE-015 | TODO | DFE-014 | Guild | Write unit tests: ObservationStateChipComponent |
| 16 | DFE-016 | TODO | DFE-015 | Guild | Write unit tests: UncertaintyIndicatorComponent |
| 17 | DFE-017 | TODO | DFE-016 | Guild | Write unit tests: DeterminizationService |
| 18 | DFE-018 | TODO | DFE-017 | Guild | Write Storybook stories for all components |
| 19 | DFE-019 | TODO | DFE-018 | Guild | Add i18n translations for state labels |
| 20 | DFE-020 | TODO | DFE-019 | Guild | Implement dark mode styles |
| 21 | DFE-021 | TODO | DFE-020 | Guild | Add accessibility (ARIA) attributes |
| 22 | DFE-022 | TODO | DFE-021 | Guild | E2E tests: review queue workflow |
| 23 | DFE-023 | TODO | DFE-022 | Guild | Performance optimization: virtual scroll for large lists |
| 24 | DFE-024 | TODO | DFE-023 | Guild | Verify build with `ng build --configuration production` |

## Acceptance Criteria

1. "Unknown (auto-tracking)" chip displays correctly with review ETA
2. Uncertainty indicator shows tier and completeness percentage
3. Guardrails badge shows active guardrail count and details
4. Decay progress shows freshness and staleness warnings
5. Review queue lists pending observations with sorting
6. All components work in dark mode
7. ARIA attributes present for accessibility
8. Storybook stories document all component states
9. Unit tests achieve 80%+ coverage

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Standalone components | Tree-shakeable; modern Angular pattern |
| Material Design | Consistent with existing StellaOps UI |
| date-fns for formatting | Lighter than moment; tree-shakeable |
| Virtual scroll for queue | Performance with large observation counts |

| Risk | Mitigation |
|------|------------|
| API contract drift | TypeScript interfaces from OpenAPI spec |
| Performance with many observations | Pagination; virtual scroll; lazy loading |
| Localization complexity | i18n from day one; extract all strings |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |

## Next Checkpoints

- 2026-01-15: DFE-001 to DFE-009 complete (core components)
- 2026-01-16: DFE-010 to DFE-014 complete (integration)
- 2026-01-17: DFE-015 to DFE-024 complete (tests, polish)

@@ -0,0 +1,990 @@
|
||||
# Sprint 20260106_001_005_UNKNOWNS - Provenance Hint Enhancement

## Topic & Scope

Extend the Unknowns module with structured provenance hints that help explain **why** something is unknown and provide hypotheses for resolution, following the advisory's requirement for "provenance hints like: Build-ID match, import table fingerprint, section layout deltas."

- **Working directory:** `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/`
- **Evidence:** ProvenanceHint model, builders, integration with Unknown, tests

## Problem Statement

The product advisory requires:

> **Unknown tagging with provenance hints:**
> - ELF Build-ID / debuglink match; import table fingerprint; section layout deltas.
> - Attach hypotheses like: "Binary matches distro build-ID, likely backport."

Current state:

- `Unknown` model has `Context` as a flexible `JsonDocument`
- No structured provenance hint types
- No confidence scoring for hints
- No hypothesis generation for resolution

**Gap:** `Unknown.Context` lacks structured provenance-specific fields. There is no way to express "we don't know what this is, but here's evidence that might help identify it."

## Dependencies & Concurrency

- **Depends on:** None (extends existing Unknowns module)
- **Blocks:** SPRINT_20260106_001_004_LB (orchestrator uses provenance hints)
- **Parallel safe:** Extends existing module; no conflicts

## Documentation Prerequisites

- docs/modules/unknowns/architecture.md
- src/Unknowns/AGENTS.md
- Existing Unknown model at `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/`

## Technical Design

### Provenance Hint Types

```csharp
namespace StellaOps.Unknowns.Core.Models;

/// <summary>
/// Classification of provenance hint types.
/// </summary>
public enum ProvenanceHintType
{
    /// <summary>ELF/PE Build-ID match against known catalog.</summary>
    BuildIdMatch,

    /// <summary>Debug link (.gnu_debuglink) reference.</summary>
    DebugLink,

    /// <summary>Import table fingerprint comparison.</summary>
    ImportTableFingerprint,

    /// <summary>Export table fingerprint comparison.</summary>
    ExportTableFingerprint,

    /// <summary>Section layout similarity.</summary>
    SectionLayout,

    /// <summary>String table signature match.</summary>
    StringTableSignature,

    /// <summary>Compiler/linker identification.</summary>
    CompilerSignature,

    /// <summary>Package manager metadata (RPATH, NEEDED, etc.).</summary>
    PackageMetadata,

    /// <summary>Distro/vendor pattern match.</summary>
    DistroPattern,

    /// <summary>Version string extraction.</summary>
    VersionString,

    /// <summary>Symbol name pattern match.</summary>
    SymbolPattern,

    /// <summary>File path pattern match.</summary>
    PathPattern,

    /// <summary>Hash match against known corpus.</summary>
    CorpusMatch,

    /// <summary>SBOM cross-reference.</summary>
    SbomCrossReference,

    /// <summary>Advisory cross-reference.</summary>
    AdvisoryCrossReference
}

/// <summary>
/// Confidence level for a provenance hint.
/// </summary>
public enum HintConfidence
{
    /// <summary>Very high confidence (>= 0.9).</summary>
    VeryHigh,

    /// <summary>High confidence (0.7 - 0.9).</summary>
    High,

    /// <summary>Medium confidence (0.5 - 0.7).</summary>
    Medium,

    /// <summary>Low confidence (0.3 - 0.5).</summary>
    Low,

    /// <summary>Very low confidence (< 0.3).</summary>
    VeryLow
}
```
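
The band boundaries are easy to get wrong at the edges, so here is the score-to-band mapping as a minimal standalone sketch (Python for brevity; the lowercase band names are illustrative, the thresholds come from the enum documentation above):

```python
def map_confidence_level(confidence: float) -> str:
    """Map a numeric score (0.0-1.0) to its categorical band.

    Thresholds mirror the HintConfidence enum: each band's lower
    bound is inclusive, so 0.9 is VeryHigh and 0.7 is High.
    """
    if confidence >= 0.9:
        return "very_high"
    if confidence >= 0.7:
        return "high"
    if confidence >= 0.5:
        return "medium"
    if confidence >= 0.3:
        return "low"
    return "very_low"
```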

### Provenance Hint Model

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

namespace StellaOps.Unknowns.Core.Models;

/// <summary>
/// A provenance hint providing evidence about an unknown's identity.
/// </summary>
public sealed record ProvenanceHint
{
    /// <summary>Unique hint ID (content-addressed).</summary>
    [JsonPropertyName("hint_id")]
    public required string HintId { get; init; }

    /// <summary>Type of provenance hint.</summary>
    [JsonPropertyName("type")]
    public required ProvenanceHintType Type { get; init; }

    /// <summary>Confidence score (0.0 - 1.0).</summary>
    [JsonPropertyName("confidence")]
    public required double Confidence { get; init; }

    /// <summary>Confidence level classification.</summary>
    [JsonPropertyName("confidence_level")]
    public required HintConfidence ConfidenceLevel { get; init; }

    /// <summary>Human-readable summary of the hint.</summary>
    [JsonPropertyName("summary")]
    public required string Summary { get; init; }

    /// <summary>Hypothesis about the unknown's identity.</summary>
    [JsonPropertyName("hypothesis")]
    public required string Hypothesis { get; init; }

    /// <summary>Type-specific evidence details.</summary>
    [JsonPropertyName("evidence")]
    public required ProvenanceEvidence Evidence { get; init; }

    /// <summary>Suggested resolution actions.</summary>
    [JsonPropertyName("suggested_actions")]
    public required IReadOnlyList<SuggestedAction> SuggestedActions { get; init; }

    /// <summary>When this hint was generated (UTC).</summary>
    [JsonPropertyName("generated_at")]
    public required DateTimeOffset GeneratedAt { get; init; }

    /// <summary>Source of the hint (analyzer, corpus, etc.).</summary>
    [JsonPropertyName("source")]
    public required string Source { get; init; }
}

/// <summary>
/// Type-specific evidence for a provenance hint.
/// </summary>
public sealed record ProvenanceEvidence
{
    /// <summary>Build-ID match details.</summary>
    [JsonPropertyName("build_id")]
    public BuildIdEvidence? BuildId { get; init; }

    /// <summary>Debug link details.</summary>
    [JsonPropertyName("debug_link")]
    public DebugLinkEvidence? DebugLink { get; init; }

    /// <summary>Import table fingerprint details.</summary>
    [JsonPropertyName("import_fingerprint")]
    public ImportFingerprintEvidence? ImportFingerprint { get; init; }

    /// <summary>Export table fingerprint details.</summary>
    [JsonPropertyName("export_fingerprint")]
    public ExportFingerprintEvidence? ExportFingerprint { get; init; }

    /// <summary>Section layout details.</summary>
    [JsonPropertyName("section_layout")]
    public SectionLayoutEvidence? SectionLayout { get; init; }

    /// <summary>Compiler signature details.</summary>
    [JsonPropertyName("compiler")]
    public CompilerEvidence? Compiler { get; init; }

    /// <summary>Distro pattern match details.</summary>
    [JsonPropertyName("distro_pattern")]
    public DistroPatternEvidence? DistroPattern { get; init; }

    /// <summary>Version string extraction details.</summary>
    [JsonPropertyName("version_string")]
    public VersionStringEvidence? VersionString { get; init; }

    /// <summary>Corpus match details.</summary>
    [JsonPropertyName("corpus_match")]
    public CorpusMatchEvidence? CorpusMatch { get; init; }

    /// <summary>Raw evidence as JSON (for extensibility).</summary>
    [JsonPropertyName("raw")]
    public JsonDocument? Raw { get; init; }
}

/// <summary>Build-ID match evidence.</summary>
public sealed record BuildIdEvidence
{
    [JsonPropertyName("build_id")]
    public required string BuildId { get; init; }

    [JsonPropertyName("build_id_type")]
    public required string BuildIdType { get; init; }

    [JsonPropertyName("matched_package")]
    public string? MatchedPackage { get; init; }

    [JsonPropertyName("matched_version")]
    public string? MatchedVersion { get; init; }

    [JsonPropertyName("matched_distro")]
    public string? MatchedDistro { get; init; }

    [JsonPropertyName("catalog_source")]
    public string? CatalogSource { get; init; }
}

/// <summary>Debug link evidence.</summary>
public sealed record DebugLinkEvidence
{
    [JsonPropertyName("debug_link")]
    public required string DebugLink { get; init; }

    [JsonPropertyName("crc32")]
    public uint? Crc32 { get; init; }

    [JsonPropertyName("debug_info_found")]
    public bool DebugInfoFound { get; init; }

    [JsonPropertyName("debug_info_path")]
    public string? DebugInfoPath { get; init; }
}

/// <summary>Import table fingerprint evidence.</summary>
public sealed record ImportFingerprintEvidence
{
    [JsonPropertyName("fingerprint")]
    public required string Fingerprint { get; init; }

    [JsonPropertyName("imported_libraries")]
    public required IReadOnlyList<string> ImportedLibraries { get; init; }

    [JsonPropertyName("import_count")]
    public int ImportCount { get; init; }

    [JsonPropertyName("matched_fingerprints")]
    public IReadOnlyList<FingerprintMatch>? MatchedFingerprints { get; init; }
}

/// <summary>Export table fingerprint evidence.</summary>
public sealed record ExportFingerprintEvidence
{
    [JsonPropertyName("fingerprint")]
    public required string Fingerprint { get; init; }

    [JsonPropertyName("export_count")]
    public int ExportCount { get; init; }

    [JsonPropertyName("notable_exports")]
    public IReadOnlyList<string>? NotableExports { get; init; }

    [JsonPropertyName("matched_fingerprints")]
    public IReadOnlyList<FingerprintMatch>? MatchedFingerprints { get; init; }
}

/// <summary>Fingerprint match from corpus.</summary>
public sealed record FingerprintMatch
{
    [JsonPropertyName("package")]
    public required string Package { get; init; }

    [JsonPropertyName("version")]
    public required string Version { get; init; }

    [JsonPropertyName("similarity")]
    public required double Similarity { get; init; }

    [JsonPropertyName("source")]
    public required string Source { get; init; }
}

/// <summary>Section layout evidence.</summary>
public sealed record SectionLayoutEvidence
{
    [JsonPropertyName("sections")]
    public required IReadOnlyList<SectionInfo> Sections { get; init; }

    [JsonPropertyName("layout_hash")]
    public required string LayoutHash { get; init; }

    [JsonPropertyName("matched_layouts")]
    public IReadOnlyList<LayoutMatch>? MatchedLayouts { get; init; }
}

public sealed record SectionInfo
{
    [JsonPropertyName("name")]
    public required string Name { get; init; }

    [JsonPropertyName("type")]
    public required string Type { get; init; }

    [JsonPropertyName("size")]
    public ulong Size { get; init; }

    [JsonPropertyName("flags")]
    public string? Flags { get; init; }
}

public sealed record LayoutMatch
{
    [JsonPropertyName("package")]
    public required string Package { get; init; }

    [JsonPropertyName("similarity")]
    public required double Similarity { get; init; }
}

/// <summary>Compiler signature evidence.</summary>
public sealed record CompilerEvidence
{
    [JsonPropertyName("compiler")]
    public required string Compiler { get; init; }

    [JsonPropertyName("version")]
    public string? Version { get; init; }

    [JsonPropertyName("flags")]
    public IReadOnlyList<string>? Flags { get; init; }

    [JsonPropertyName("detection_method")]
    public required string DetectionMethod { get; init; }
}

/// <summary>Distro pattern match evidence.</summary>
public sealed record DistroPatternEvidence
{
    [JsonPropertyName("distro")]
    public required string Distro { get; init; }

    [JsonPropertyName("release")]
    public string? Release { get; init; }

    [JsonPropertyName("pattern_type")]
    public required string PatternType { get; init; }

    [JsonPropertyName("matched_pattern")]
    public required string MatchedPattern { get; init; }

    [JsonPropertyName("examples")]
    public IReadOnlyList<string>? Examples { get; init; }
}

/// <summary>Version string extraction evidence.</summary>
public sealed record VersionStringEvidence
{
    [JsonPropertyName("version_strings")]
    public required IReadOnlyList<ExtractedVersionString> VersionStrings { get; init; }

    [JsonPropertyName("best_guess")]
    public string? BestGuess { get; init; }
}

public sealed record ExtractedVersionString
{
    [JsonPropertyName("value")]
    public required string Value { get; init; }

    [JsonPropertyName("location")]
    public required string Location { get; init; }

    [JsonPropertyName("confidence")]
    public double Confidence { get; init; }
}

/// <summary>Corpus match evidence.</summary>
public sealed record CorpusMatchEvidence
{
    [JsonPropertyName("corpus_name")]
    public required string CorpusName { get; init; }

    [JsonPropertyName("matched_entry")]
    public required string MatchedEntry { get; init; }

    [JsonPropertyName("match_type")]
    public required string MatchType { get; init; }

    [JsonPropertyName("similarity")]
    public required double Similarity { get; init; }

    [JsonPropertyName("metadata")]
    public IReadOnlyDictionary<string, string>? Metadata { get; init; }
}

/// <summary>Suggested action for resolving the unknown.</summary>
public sealed record SuggestedAction
{
    [JsonPropertyName("action")]
    public required string Action { get; init; }

    [JsonPropertyName("priority")]
    public required int Priority { get; init; }

    [JsonPropertyName("effort")]
    public required string Effort { get; init; }

    [JsonPropertyName("description")]
    public required string Description { get; init; }

    [JsonPropertyName("link")]
    public string? Link { get; init; }
}
```
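
For reference, a hypothetical serialized hint showing the snake_case wire shape the `JsonPropertyName` attributes produce. Every value below is made up (package, version, Build-ID, and catalog source are illustrative, not real data); only the field names follow the model:

```python
import json
from datetime import datetime, timezone

# Illustrative wire shape for a BuildIdMatch hint; all values are invented.
hint = {
    "hint_id": "hint:sha256:0c1d2e3f40516273849aa0b1",  # hypothetical ID
    "type": "BuildIdMatch",
    "confidence": 0.95,
    "confidence_level": "VeryHigh",
    "summary": "Build-ID: 4f1c9a2b8d3e5f60...",
    "hypothesis": "Binary matches openssl@3.0.2 from ubuntu",
    "evidence": {
        "build_id": {
            "build_id": "4f1c9a2b8d3e5f60",
            "build_id_type": "gnu",
            "matched_package": "openssl",
            "matched_version": "3.0.2",
            "matched_distro": "ubuntu",
            "catalog_source": "debuginfod",
        }
    },
    "suggested_actions": [
        {
            "action": "verify_package",
            "priority": 1,
            "effort": "low",
            "description": "Verify component is openssl@3.0.2",
            "link": None,
        }
    ],
    "generated_at": datetime(2026, 1, 6, tzinfo=timezone.utc).isoformat(),
    "source": "BuildIdAnalyzer",
}
payload = json.dumps(hint, indent=2)
```

A golden fixture like this is what PH-025 would snapshot to catch accidental wire-format changes.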

### Extended Unknown Model

```csharp
namespace StellaOps.Unknowns.Core.Models;

/// <summary>
/// Extended Unknown model with structured provenance hints.
/// </summary>
public sealed record Unknown
{
    // ... existing fields ...

    /// <summary>Structured provenance hints about this unknown.</summary>
    public IReadOnlyList<ProvenanceHint> ProvenanceHints { get; init; } = [];

    /// <summary>Best hypothesis based on hints (highest confidence).</summary>
    public string? BestHypothesis { get; init; }

    /// <summary>Combined confidence from all hints.</summary>
    public double? CombinedConfidence { get; init; }

    /// <summary>Primary suggested action (highest priority).</summary>
    public string? PrimarySuggestedAction { get; init; }
}
```

### Provenance Hint Builder

```csharp
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.Logging;
using StellaOps.Unknowns.Core.Models;

namespace StellaOps.Unknowns.Core.Hints;

/// <summary>
/// Builds provenance hints from various evidence sources.
/// </summary>
public interface IProvenanceHintBuilder
{
    /// <summary>Build hint from Build-ID match.</summary>
    ProvenanceHint BuildFromBuildId(
        string buildId,
        string buildIdType,
        BuildIdMatchResult? match);

    /// <summary>Build hint from import table fingerprint.</summary>
    ProvenanceHint BuildFromImportFingerprint(
        string fingerprint,
        IReadOnlyList<string> importedLibraries,
        IReadOnlyList<FingerprintMatch>? matches);

    /// <summary>Build hint from section layout.</summary>
    ProvenanceHint BuildFromSectionLayout(
        IReadOnlyList<SectionInfo> sections,
        IReadOnlyList<LayoutMatch>? matches);

    /// <summary>Build hint from distro pattern.</summary>
    ProvenanceHint BuildFromDistroPattern(
        string distro,
        string? release,
        string patternType,
        string matchedPattern);

    /// <summary>Build hint from version strings.</summary>
    ProvenanceHint BuildFromVersionStrings(
        IReadOnlyList<ExtractedVersionString> versionStrings);

    /// <summary>Build hint from corpus match.</summary>
    ProvenanceHint BuildFromCorpusMatch(
        string corpusName,
        string matchedEntry,
        string matchType,
        double similarity,
        IReadOnlyDictionary<string, string>? metadata);

    /// <summary>Combine multiple hints into a best hypothesis.</summary>
    (string Hypothesis, double Confidence) CombineHints(
        IReadOnlyList<ProvenanceHint> hints);
}

public sealed class ProvenanceHintBuilder : IProvenanceHintBuilder
{
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<ProvenanceHintBuilder> _logger;

    public ProvenanceHintBuilder(
        TimeProvider timeProvider,
        ILogger<ProvenanceHintBuilder> logger)
    {
        _timeProvider = timeProvider;
        _logger = logger;
    }

    public ProvenanceHint BuildFromBuildId(
        string buildId,
        string buildIdType,
        BuildIdMatchResult? match)
    {
        var confidence = match is not null ? 0.95 : 0.3;
        var hypothesis = match is not null
            ? $"Binary matches {match.Package}@{match.Version} from {match.Distro}"
            : $"Build-ID {buildId[..Math.Min(16, buildId.Length)]}... not found in catalog";

        var suggestedActions = new List<SuggestedAction>();

        if (match is not null)
        {
            suggestedActions.Add(new SuggestedAction
            {
                Action = "verify_package",
                Priority = 1,
                Effort = "low",
                Description = $"Verify component is {match.Package}@{match.Version}",
                Link = match.AdvisoryLink
            });
        }
        else
        {
            suggestedActions.Add(new SuggestedAction
            {
                Action = "catalog_lookup",
                Priority = 1,
                Effort = "medium",
                Description = "Search additional Build-ID catalogs",
                Link = null
            });
            suggestedActions.Add(new SuggestedAction
            {
                Action = "manual_identification",
                Priority = 2,
                Effort = "high",
                Description = "Manually identify binary using other methods",
                Link = null
            });
        }

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.BuildIdMatch, buildId),
            Type = ProvenanceHintType.BuildIdMatch,
            Confidence = confidence,
            ConfidenceLevel = MapConfidenceLevel(confidence),
            Summary = $"Build-ID: {buildId[..Math.Min(16, buildId.Length)]}...",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                BuildId = new BuildIdEvidence
                {
                    BuildId = buildId,
                    BuildIdType = buildIdType,
                    MatchedPackage = match?.Package,
                    MatchedVersion = match?.Version,
                    MatchedDistro = match?.Distro,
                    CatalogSource = match?.CatalogSource
                }
            },
            SuggestedActions = suggestedActions,
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = "BuildIdAnalyzer"
        };
    }

    public ProvenanceHint BuildFromImportFingerprint(
        string fingerprint,
        IReadOnlyList<string> importedLibraries,
        IReadOnlyList<FingerprintMatch>? matches)
    {
        var bestMatch = matches?.OrderByDescending(m => m.Similarity).FirstOrDefault();
        var confidence = bestMatch?.Similarity ?? 0.2;

        var hypothesis = bestMatch is not null
            ? $"Import pattern matches {bestMatch.Package}@{bestMatch.Version} ({bestMatch.Similarity:P0} similar)"
            : $"Import pattern not found in corpus (imports: {string.Join(", ", importedLibraries.Take(3))})";

        var suggestedActions = new List<SuggestedAction>();

        if (bestMatch is not null && bestMatch.Similarity >= 0.8)
        {
            suggestedActions.Add(new SuggestedAction
            {
                Action = "verify_import_match",
                Priority = 1,
                Effort = "low",
                Description = $"Verify component is {bestMatch.Package}",
                Link = null
            });
        }
        else
        {
            suggestedActions.Add(new SuggestedAction
            {
                Action = "analyze_imports",
                Priority = 1,
                Effort = "medium",
                Description = "Analyze imported libraries for identification",
                Link = null
            });
        }

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.ImportTableFingerprint, fingerprint),
            Type = ProvenanceHintType.ImportTableFingerprint,
            Confidence = confidence,
            ConfidenceLevel = MapConfidenceLevel(confidence),
            Summary = $"Import fingerprint: {fingerprint[..Math.Min(16, fingerprint.Length)]}...",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                ImportFingerprint = new ImportFingerprintEvidence
                {
                    Fingerprint = fingerprint,
                    ImportedLibraries = importedLibraries,
                    ImportCount = importedLibraries.Count,
                    MatchedFingerprints = matches
                }
            },
            SuggestedActions = suggestedActions,
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = "ImportTableAnalyzer"
        };
    }

    public ProvenanceHint BuildFromSectionLayout(
        IReadOnlyList<SectionInfo> sections,
        IReadOnlyList<LayoutMatch>? matches)
    {
        var layoutHash = ComputeLayoutHash(sections);
        var bestMatch = matches?.OrderByDescending(m => m.Similarity).FirstOrDefault();
        var confidence = bestMatch?.Similarity ?? 0.15;

        var hypothesis = bestMatch is not null
            ? $"Section layout matches {bestMatch.Package} ({bestMatch.Similarity:P0} similar)"
            : "Section layout not found in corpus";

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.SectionLayout, layoutHash),
            Type = ProvenanceHintType.SectionLayout,
            Confidence = confidence,
            ConfidenceLevel = MapConfidenceLevel(confidence),
            Summary = $"Section layout: {sections.Count} sections",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                SectionLayout = new SectionLayoutEvidence
                {
                    Sections = sections,
                    LayoutHash = layoutHash,
                    MatchedLayouts = matches
                }
            },
            SuggestedActions =
            [
                new SuggestedAction
                {
                    Action = "section_analysis",
                    Priority = 2,
                    Effort = "high",
                    Description = "Detailed section analysis required",
                    Link = null
                }
            ],
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = "SectionLayoutAnalyzer"
        };
    }

    public ProvenanceHint BuildFromDistroPattern(
        string distro,
        string? release,
        string patternType,
        string matchedPattern)
    {
        var confidence = 0.7;
        var hypothesis = release is not null
            ? $"Binary appears to be from {distro} {release}"
            : $"Binary appears to be from {distro}";

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.DistroPattern, $"{distro}:{matchedPattern}"),
            Type = ProvenanceHintType.DistroPattern,
            Confidence = confidence,
            ConfidenceLevel = MapConfidenceLevel(confidence),
            Summary = $"Distro pattern: {distro}",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                DistroPattern = new DistroPatternEvidence
                {
                    Distro = distro,
                    Release = release,
                    PatternType = patternType,
                    MatchedPattern = matchedPattern
                }
            },
            SuggestedActions =
            [
                new SuggestedAction
                {
                    Action = "distro_package_lookup",
                    Priority = 1,
                    Effort = "low",
                    Description = $"Search {distro} package repositories",
                    Link = GetDistroPackageSearchUrl(distro)
                }
            ],
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = "DistroPatternAnalyzer"
        };
    }

    public ProvenanceHint BuildFromVersionStrings(
        IReadOnlyList<ExtractedVersionString> versionStrings)
    {
        var bestGuess = versionStrings
            .OrderByDescending(v => v.Confidence)
            .FirstOrDefault();

        var confidence = bestGuess?.Confidence ?? 0.3;
        var hypothesis = bestGuess is not null
            ? $"Version appears to be {bestGuess.Value}"
            : "No clear version string found";

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.VersionString,
                string.Join(",", versionStrings.Select(v => v.Value))),
            Type = ProvenanceHintType.VersionString,
            Confidence = confidence,
            ConfidenceLevel = MapConfidenceLevel(confidence),
            Summary = $"Found {versionStrings.Count} version string(s)",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                VersionString = new VersionStringEvidence
                {
                    VersionStrings = versionStrings,
                    BestGuess = bestGuess?.Value
                }
            },
            SuggestedActions =
            [
                new SuggestedAction
                {
                    Action = "version_verification",
                    Priority = 1,
                    Effort = "low",
                    Description = "Verify extracted version against known releases",
                    Link = null
                }
            ],
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = "VersionStringExtractor"
        };
    }

    public ProvenanceHint BuildFromCorpusMatch(
        string corpusName,
        string matchedEntry,
        string matchType,
        double similarity,
        IReadOnlyDictionary<string, string>? metadata)
    {
        var hypothesis = similarity >= 0.9
            ? $"High confidence match: {matchedEntry}"
            : $"Possible match: {matchedEntry} ({similarity:P0} similar)";

        return new ProvenanceHint
        {
            HintId = ComputeHintId(ProvenanceHintType.CorpusMatch, $"{corpusName}:{matchedEntry}"),
            Type = ProvenanceHintType.CorpusMatch,
            Confidence = similarity,
            ConfidenceLevel = MapConfidenceLevel(similarity),
            Summary = $"Corpus match: {matchedEntry}",
            Hypothesis = hypothesis,
            Evidence = new ProvenanceEvidence
            {
                CorpusMatch = new CorpusMatchEvidence
                {
                    CorpusName = corpusName,
                    MatchedEntry = matchedEntry,
                    MatchType = matchType,
                    Similarity = similarity,
                    Metadata = metadata
                }
            },
            SuggestedActions =
            [
                new SuggestedAction
                {
                    Action = "verify_corpus_match",
                    Priority = 1,
                    Effort = "low",
                    Description = $"Verify match against {corpusName}",
                    Link = null
                }
            ],
            GeneratedAt = _timeProvider.GetUtcNow(),
            Source = $"{corpusName}Matcher"
        };
    }

    public (string Hypothesis, double Confidence) CombineHints(
        IReadOnlyList<ProvenanceHint> hints)
    {
        if (hints.Count == 0)
        {
            return ("No provenance hints available", 0.0);
        }

        // Sort by confidence descending
        var sorted = hints.OrderByDescending(h => h.Confidence).ToList();

        // Best single hypothesis
        var bestHint = sorted[0];

        // If we have multiple high-confidence hints that agree, boost confidence
        var agreeing = sorted
            .Where(h => h.Confidence >= 0.5)
            .GroupBy(h => ExtractPackageFromHypothesis(h.Hypothesis))
            .OrderByDescending(g => g.Count())
            .FirstOrDefault();

        if (agreeing is not null && agreeing.Count() >= 2)
        {
            // Multiple hints agree - combine confidence
            var combinedConfidence = Math.Min(0.99,
                agreeing.Max(h => h.Confidence) + (agreeing.Count() - 1) * 0.1);

            return (
                $"{agreeing.Key} (confirmed by {agreeing.Count()} evidence sources)",
                Math.Round(combinedConfidence, 4)
            );
        }

        return (bestHint.Hypothesis, Math.Round(bestHint.Confidence, 4));
    }

    private static string ComputeHintId(ProvenanceHintType type, string evidence)
    {
        var input = $"{type}:{evidence}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return $"hint:sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..24]}";
    }

    private static HintConfidence MapConfidenceLevel(double confidence)
    {
        return confidence switch
        {
            >= 0.9 => HintConfidence.VeryHigh,
            >= 0.7 => HintConfidence.High,
            >= 0.5 => HintConfidence.Medium,
            >= 0.3 => HintConfidence.Low,
            _ => HintConfidence.VeryLow
        };
    }

    private static string ComputeLayoutHash(IReadOnlyList<SectionInfo> sections)
    {
        var normalized = string.Join("|",
            sections.OrderBy(s => s.Name).Select(s => $"{s.Name}:{s.Type}:{s.Size}"));
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(hash).ToLowerInvariant()[..16];
    }

    private static string? GetDistroPackageSearchUrl(string distro)
    {
        return distro.ToLowerInvariant() switch
        {
            "debian" => "https://packages.debian.org/search",
            "ubuntu" => "https://packages.ubuntu.com/",
            "rhel" or "centos" => "https://access.redhat.com/downloads",
            "alpine" => "https://pkgs.alpinelinux.org/packages",
            _ => null
        };
    }

    private static string ExtractPackageFromHypothesis(string hypothesis)
    {
        // Simple extraction - could be more sophisticated
        var match = Regex.Match(hypothesis, @"matches?\s+(\S+)");
        return match.Success ? match.Groups[1].Value : hypothesis;
    }
}

public sealed record BuildIdMatchResult
{
    public required string Package { get; init; }
    public required string Version { get; init; }
    public required string Distro { get; init; }
    public string? CatalogSource { get; init; }
    public string? AdvisoryLink { get; init; }
}
```
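
The `CombineHints` boost rule is worth checking with concrete numbers. When two or more agreeing hints (each with confidence >= 0.5) name the same package, the combined confidence is the best agreeing score plus 0.1 per additional agreeing hint, capped at 0.99. A minimal sketch of just that arithmetic (Python; the function name is ours, not part of the design):

```python
def combine_confidence(agreeing_scores: list[float]) -> float:
    """Boost rule from CombineHints: best score + 0.1 per extra
    agreeing hint, capped at 0.99, rounded to 4 decimal places."""
    best = max(agreeing_scores)
    combined = min(0.99, best + (len(agreeing_scores) - 1) * 0.1)
    return round(combined, 4)
```

So a Build-ID match at 0.95 plus an import-fingerprint match at 0.8 on the same package yields min(0.99, 0.95 + 0.1) = 0.99, while two weaker agreeing hints at 0.6 and 0.55 yield 0.7.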

## Delivery Tracker

| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | PH-001 | TODO | - | - | Define `ProvenanceHintType` enum (15+ types) |
| 2 | PH-002 | TODO | PH-001 | - | Define `HintConfidence` enum |
| 3 | PH-003 | TODO | PH-002 | - | Define `ProvenanceHint` record |
| 4 | PH-004 | TODO | PH-003 | - | Define `ProvenanceEvidence` and sub-records |
| 5 | PH-005 | TODO | PH-004 | - | Define evidence records: BuildId, DebugLink |
| 6 | PH-006 | TODO | PH-005 | - | Define evidence records: ImportFingerprint, ExportFingerprint |
| 7 | PH-007 | TODO | PH-006 | - | Define evidence records: SectionLayout, Compiler |
| 8 | PH-008 | TODO | PH-007 | - | Define evidence records: DistroPattern, VersionString |
| 9 | PH-009 | TODO | PH-008 | - | Define evidence records: CorpusMatch |
| 10 | PH-010 | TODO | PH-009 | - | Define `SuggestedAction` record |
| 11 | PH-011 | TODO | PH-010 | - | Extend `Unknown` model with `ProvenanceHints` |
| 12 | PH-012 | TODO | PH-011 | - | Define `IProvenanceHintBuilder` interface |
| 13 | PH-013 | TODO | PH-012 | - | Implement `BuildFromBuildId()` |
| 14 | PH-014 | TODO | PH-013 | - | Implement `BuildFromImportFingerprint()` |
| 15 | PH-015 | TODO | PH-014 | - | Implement `BuildFromSectionLayout()` |
| 16 | PH-016 | TODO | PH-015 | - | Implement `BuildFromDistroPattern()` |
| 17 | PH-017 | TODO | PH-016 | - | Implement `BuildFromVersionStrings()` |
| 18 | PH-018 | TODO | PH-017 | - | Implement `BuildFromCorpusMatch()` |
| 19 | PH-019 | TODO | PH-018 | - | Implement `CombineHints()` for best hypothesis |
| 20 | PH-020 | TODO | PH-019 | - | Add service registration extensions |
| 21 | PH-021 | TODO | PH-020 | - | Update Unknown repository to persist hints |
| 22 | PH-022 | TODO | PH-021 | - | Add database migration for provenance_hints table |
| 23 | PH-023 | TODO | PH-022 | - | Write unit tests: hint builders (all types) |
| 24 | PH-024 | TODO | PH-023 | - | Write unit tests: hint combination |
| 25 | PH-025 | TODO | PH-024 | - | Write golden fixture tests for hint serialization |
| 26 | PH-026 | TODO | PH-025 | - | Add JSON schema for ProvenanceHint |
| 27 | PH-027 | TODO | PH-026 | - | Document in docs/modules/unknowns/ |
| 28 | PH-028 | TODO | PH-027 | - | Expose hints via Unknowns.WebService API |
## Acceptance Criteria

1. **Completeness:** All 15+ hint types have dedicated evidence records
2. **Confidence Scoring:** All hints have confidence scores (0-1) and levels
3. **Hypothesis Generation:** Each hint produces a human-readable hypothesis
4. **Suggested Actions:** Each hint includes prioritized resolution actions
5. **Combination:** Multiple hints can be combined into a best hypothesis
6. **Persistence:** Hints are stored with unknowns in the database
7. **Test Coverage:** Unit tests for all builders, golden fixtures for serialization
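PH-019's `CombineHints()` is not specified here; as an illustrative sketch (names and the independent-evidence weighting are assumptions, shown in Python for brevity rather than the module's C#), combination could group hints by hypothesis and pick the hypothesis with the highest combined confidence:

```python
from collections import defaultdict

def combine_hints(hints):
    """Pick the best hypothesis from (hypothesis, confidence) pairs.

    Combined confidence for a hypothesis is 1 - prod(1 - c) over its
    supporting hints, i.e. treating each hint as independent evidence.
    """
    by_hypothesis = defaultdict(list)
    for hypothesis, confidence in hints:
        by_hypothesis[hypothesis].append(confidence)

    def combined(confidences):
        remaining_doubt = 1.0
        for c in confidences:
            remaining_doubt *= (1.0 - c)
        return 1.0 - remaining_doubt

    best = max(by_hypothesis, key=lambda h: combined(by_hypothesis[h]))
    return best, combined(by_hypothesis[best])
```

Two moderate-confidence hints agreeing on one hypothesis can thus outrank a single stronger hint for another, which matches the intent of "combined for best hypothesis".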
## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| 15+ hint types | Covers common provenance evidence per advisory |
| Content-addressed IDs | Enables deduplication of identical hints |
| Confidence levels | Both numeric and categorical for different use cases |
| Suggested actions | Actionable output for resolution workflow |

| Risk | Mitigation |
|------|------------|
| Low-quality hints | Confidence thresholds; manual review for low confidence |
| Hint explosion | Aggregate/dedupe hints by type |
| Corpus dependency | Graceful degradation without corpus matches |

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |
# Sprint Series 20260106_003 - Verifiable Software Supply Chain Pipeline

## Executive Summary

This sprint series completes the "quiet, verifiable software supply chain pipeline" as outlined in the product advisory. While StellaOps already implements ~85% of the advisory requirements, this series addresses the remaining gaps to deliver a fully integrated, production-ready pipeline from SBOMs to signed evidence bundles.

## Problem Statement

The product advisory outlines a complete software supply chain pipeline with:

- Deterministic per-layer SBOMs with normalization
- VEX-first gating to reduce noise before triage
- DSSE/in-toto attestations for everything
- Traceable event flow with breadcrumbs
- Portable evidence bundles for audits

**Current State Analysis:**

| Capability | Status | Gap |
|------------|--------|-----|
| Deterministic SBOMs | 95% | Per-layer files not exposed, Composition Recipe API missing |
| VEX-first gating | 75% | No explicit "gate" service that blocks/warns before triage |
| DSSE attestations | 90% | Per-layer attestations missing, cross-attestation linking missing |
| Evidence bundles | 85% | No standardized export format with verify commands |
| Event flow | 90% | Router idempotency enforcement not formalized |

## Solution Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                  Verifiable Supply Chain Pipeline                   │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌──────────┐ │
│  │  Scanner    │──▶│  VEX Gate   │──▶│  Attestor   │──▶│ Evidence │ │
│  │ (Per-layer  │   │ (Verdict +  │   │ (Chain      │   │ Locker   │ │
│  │  SBOMs)     │   │  Rationale) │   │  Linking)   │   │ (Bundle) │ │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘   └────┬─────┘ │
│         │                 │                 │               │       │
│         ▼                 ▼                 ▼               ▼       │
│  ┌───────────────────────────────────────────────────────────────┐ │
│  │                     Router (Event Flow)                       │ │
│  │  - Idempotent keys (artifact digest + stage)                  │ │
│  │  - Trace records at each hop                                  │ │
│  │  - Timeline queryable by artifact digest                      │ │
│  └───────────────────────────────┬───────────────────────────────┘ │
│                                  │                                 │
│                                  ▼                                 │
│                       ┌─────────────────┐                          │
│                       │ Evidence Bundle │                          │
│                       │     Export      │                          │
│                       │ (zip + verify)  │                          │
│                       └─────────────────┘                          │
└─────────────────────────────────────────────────────────────────────┘
```
## Sprint Breakdown

| Sprint | Module | Scope | Dependencies |
|--------|--------|-------|--------------|
| [003_001](SPRINT_20260106_003_001_SCANNER_perlayer_sbom_api.md) | Scanner | Per-layer SBOM export + Composition Recipe API | None |
| [003_002](SPRINT_20260106_003_002_SCANNER_vex_gate_service.md) | Scanner/Excititor | VEX-first gating service integration | 003_001 |
| [003_003](SPRINT_20260106_003_003_EVIDENCE_export_bundle.md) | EvidenceLocker | Standardized export with verify commands | 003_001 |
| [003_004](SPRINT_20260106_003_004_ATTESTOR_chain_linking.md) | Attestor | Cross-attestation linking + per-layer attestations | 003_001, 003_002 |
## Dependency Graph

```
        ┌──────────────────────────────┐
        │  SPRINT_20260106_003_001     │
        │  Per-layer SBOM + Recipe API │
        └──────────────┬───────────────┘
                       │
        ┌──────────────┼──────────────────────────┐
        │              │                          │
        ▼              ▼                          │
┌───────────────────┐ ┌───────────────────┐       │
│  SPRINT_003_002   │ │  SPRINT_003_003   │       │
│  VEX Gate Service │ │  Evidence Export  │       │
└────────┬──────────┘ └───────────────────┘       │
         │                                        │
         ▼                                        │
┌───────────────────┐                             │
│  SPRINT_003_004   │◀────────────────────────────┘
│ Cross-Attestation │
│      Linking      │
└─────────┬─────────┘
          │
          ▼
  Production Rollout
```
## Key Deliverables

### Sprint 003_001: Per-layer SBOM & Composition Recipe API

- Per-layer CycloneDX/SPDX files stored separately in CAS
- `GET /scans/{id}/layers/{digest}/sbom` API endpoint
- `GET /scans/{id}/composition-recipe` API endpoint
- Deterministic layer ordering with Merkle root in recipe
- CLI: `stella scan sbom --layer <digest> --format cdx|spdx`

### Sprint 003_002: VEX Gate Service

- `IVexGateService` interface with gate decisions: `PASS`, `WARN`, `BLOCK`
- Pre-triage filtering that reduces noise
- Evidence tracking for each gate decision
- Integration with Excititor VEX observations
- Configurable gate policies (exploitable+reachable+no-control = BLOCK)

### Sprint 003_003: Evidence Bundle Export

- Standardized export format: `evidence-bundle-<id>.tar.gz`
- Contents: SBOMs, VEX statements, attestations, public keys, README
- `verify.sh` script embedded in bundle
- `stella evidence export --bundle <id> --output ./audit-bundle.tar.gz`
- Offline verification support

### Sprint 003_004: Cross-Attestation Linking

- SBOM attestation links to VEX attestation via subject reference
- Policy verdict attestation links to both
- Per-layer attestations with layer-specific subjects
- `GET /attestations?artifact=<digest>&chain=true` for full chain retrieval

## Acceptance Criteria (Series)

1. **Determinism**: Same inputs produce identical SBOMs, recipes, and attestation hashes
2. **Traceability**: Any artifact can be traced through the full pipeline via digest
3. **Verifiability**: Evidence bundles can be verified offline without network access
4. **Completeness**: All artifacts (SBOMs, VEX, verdicts, attestations) are included in bundles
5. **Integration**: VEX gate reduces triage noise by at least 50% (measured via test corpus)
## Risk Assessment

| Risk | Impact | Mitigation |
|------|--------|------------|
| Per-layer SBOMs increase storage | Medium | Content-addressable deduplication, TTL for stale layers |
| VEX gate false positives | High | Conservative defaults, policy override mechanism |
| Cross-attestation circular deps | Low | DAG validation at creation time |
| Export bundle size | Medium | Compression, selective export by date range |

## Testing Strategy

- **Unit tests**: Each service with determinism verification
- **Integration tests**: Full pipeline from scan to export
- **Replay tests**: Identical inputs produce identical outputs
- **Corpus tests**: Advisory test corpus for VEX gate accuracy
- **E2E tests**: Air-gapped verification of exported bundles

## Documentation Updates Required

- `docs/modules/scanner/architecture.md` - Per-layer SBOM section
- `docs/modules/evidence-locker/architecture.md` - Export bundle format
- `docs/modules/attestor/architecture.md` - Cross-attestation linking
- `docs/API_CLI_REFERENCE.md` - New endpoints and commands
- `docs/OFFLINE_KIT.md` - Evidence bundle verification

## Related Work

- SPRINT_20260105_002_* (HLC) - Required for timestamp ordering in attestation chains
- SPRINT_20251229_001_002_BE_vex_delta - VEX delta foundation
- Epic 10 (Export Center) - Bundle export workflows
- Epic 19 (Attestor Console) - Attestation verification UI

## Execution Notes

- All changes must maintain backward compatibility
- Feature flags for gradual rollout recommended
- Cross-module changes require coordinated deployment
- CLI commands should support both new and legacy formats during transition
# SPRINT_20260106_003_001_SCANNER_perlayer_sbom_api

## Sprint Metadata

| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_001 |
| Module | SCANNER |
| Title | Per-layer SBOM Export & Composition Recipe API |
| Working Directory | `src/Scanner/` |
| Dependencies | None |
| Blocking | 003_002, 003_003, 003_004 |

## Objective

Expose per-layer SBOMs as first-class artifacts and add a Composition Recipe API that enables downstream verification of SBOM determinism. This completes Step 1 of the product advisory: "Deterministic SBOMs (per layer, per build)".

## Context

**Current State:**

- `LayerComponentFragment` model tracks components per layer internally
- SBOM composition aggregates fragments into a single image-level SBOM
- Composition recipe stored in CAS but not exposed via API
- No mechanism to retrieve the SBOM for a specific layer

**Target State:**

- Per-layer SBOMs stored as individual CAS artifacts
- API endpoints to retrieve layer-specific SBOMs
- Composition Recipe API for determinism verification
- CLI support for per-layer SBOM export
## Tasks

### Phase 1: Per-layer SBOM Generation (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Create `ILayerSbomWriter` interface | TODO | `src/Scanner/__Libraries/StellaOps.Scanner.Emit/` |
| T002 | Implement `CycloneDxLayerWriter` for per-layer CDX | TODO | Extends existing writer |
| T003 | Implement `SpdxLayerWriter` for per-layer SPDX | TODO | Extends existing writer |
| T004 | Update `SbomCompositionEngine` to emit layer SBOMs | TODO | Store in CAS with layer digest key |
| T005 | Add layer SBOM paths to `SbomCompositionResult` | TODO | `LayerSboms: ImmutableDictionary<string, SbomRef>` |
| T006 | Unit tests for per-layer SBOM generation | TODO | Determinism tests required |

### Phase 2: Composition Recipe API (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T007 | Define `CompositionRecipeResponse` contract | TODO | Include Merkle root, fragment order, digests |
| T008 | Add `GET /scans/{id}/composition-recipe` endpoint | TODO | Scanner.WebService |
| T009 | Implement `ICompositionRecipeService` | TODO | Retrieves and validates recipe from CAS |
| T010 | Add recipe verification logic | TODO | Verify Merkle root matches layer digests |
| T011 | Integration tests for composition recipe API | TODO | Round-trip determinism verification |

### Phase 3: Per-layer SBOM API (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T012 | Add `GET /scans/{id}/layers` endpoint | TODO | List layers with SBOM availability |
| T013 | Add `GET /scans/{id}/layers/{digest}/sbom` endpoint | TODO | Format param: `cdx`, `spdx` |
| T014 | Add content negotiation for SBOM format | TODO | Accept header support |
| T015 | Implement caching headers for layer SBOMs | TODO | ETag based on content hash |
| T016 | Integration tests for layer SBOM API | TODO | |

### Phase 4: CLI Commands (4 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T017 | Add `stella scan sbom --layer <digest>` command | TODO | `src/Cli/StellaOps.Cli/` |
| T018 | Add `stella scan recipe` command | TODO | Output composition recipe |
| T019 | Add `--verify` flag to recipe command | TODO | Verify recipe against stored SBOMs |
| T020 | CLI integration tests | TODO | |
## Contracts

### CompositionRecipeResponse

```json
{
  "scanId": "scan-abc123",
  "imageDigest": "sha256:abcdef...",
  "createdAt": "2026-01-06T10:30:00.000000Z",
  "recipe": {
    "version": "1.0.0",
    "generatorName": "StellaOps.Scanner",
    "generatorVersion": "2026.04",
    "layers": [
      {
        "digest": "sha256:layer1...",
        "order": 0,
        "fragmentDigest": "sha256:frag1...",
        "sbomDigests": {
          "cyclonedx": "sha256:cdx1...",
          "spdx": "sha256:spdx1..."
        },
        "componentCount": 42
      }
    ],
    "merkleRoot": "sha256:merkle...",
    "aggregatedSbomDigests": {
      "cyclonedx": "sha256:finalcdx...",
      "spdx": "sha256:finalspdx..."
    }
  }
}
```
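The `merkleRoot` field binds the ordered layer entries together. As an illustrative sketch (not the Scanner's actual implementation, and shown in Python rather than C#), an RFC 6962-style Merkle Tree Hash over the ordered layer SBOM digests looks like this:

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """RFC 6962 Merkle Tree Hash: 0x00-prefixed leaves, 0x01-prefixed nodes.

    Leaves must already be in deterministic order (here: layer order).
    """
    if not leaves:
        return _h(b"")                     # MTH of the empty list
    if len(leaves) == 1:
        return _h(b"\x00" + leaves[0])     # leaf hash
    # Split at the largest power of two strictly less than len(leaves).
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    return _h(b"\x01" + left + right)      # interior node hash
```

With this construction, changing any layer digest or reordering the layers changes the root, which is what lets a verifier detect tampering from the recipe alone.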
### LayerSbomRef

```csharp
public sealed record LayerSbomRef
{
    public required string LayerDigest { get; init; }
    public required int Order { get; init; }
    public required string FragmentDigest { get; init; }
    public required string CycloneDxDigest { get; init; }
    public required string CycloneDxCasUri { get; init; }
    public required string SpdxDigest { get; init; }
    public required string SpdxCasUri { get; init; }
    public required int ComponentCount { get; init; }
}
```
## API Endpoints

### GET /api/v1/scans/{scanId}/layers

```
Response 200:
{
  "scanId": "...",
  "imageDigest": "sha256:...",
  "layers": [
    {
      "digest": "sha256:layer1...",
      "order": 0,
      "hasSbom": true,
      "componentCount": 42
    }
  ]
}
```

### GET /api/v1/scans/{scanId}/layers/{layerDigest}/sbom

```
Query params:
- format: "cdx" | "spdx" (default: "cdx")

Response 200: SBOM content (application/json)
Headers:
- ETag: "<content-digest>"
- X-StellaOps-Layer-Digest: "sha256:..."
- X-StellaOps-Format: "cyclonedx-1.7"
```

### GET /api/v1/scans/{scanId}/composition-recipe

```
Response 200: CompositionRecipeResponse (application/json)
```
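The ETag behavior for layer SBOMs (T015) follows standard HTTP revalidation. A framework-neutral sketch of the handler logic (names and the framework-free shape are assumptions for illustration):

```python
def layer_sbom_response(request_headers: dict, sbom_bytes: bytes, content_digest: str):
    """Serve a layer SBOM with ETag revalidation.

    Layer SBOMs are immutable once created, so their content digest is a
    stable ETag: a matching If-None-Match can always be answered with 304.
    Returns (status, headers, body).
    """
    etag = f'"{content_digest}"'
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""   # client's cached copy is current
    headers = {
        "ETag": etag,
        "Content-Type": "application/json",
        "Cache-Control": "immutable, max-age=31536000",
    }
    return 200, headers, sbom_bytes
```

Because the artifacts are content-addressed, this avoids any cache-invalidation logic: a new layer digest is simply a new resource.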
## CLI Commands

```bash
# List layers with SBOM info
stella scan layers <scan-id>

# Get per-layer SBOM
stella scan sbom <scan-id> --layer sha256:abc123 --format cdx --output layer.cdx.json

# Get composition recipe
stella scan recipe <scan-id> --output recipe.json

# Verify composition recipe against stored SBOMs
stella scan recipe <scan-id> --verify
```

## Storage Schema

Per-layer SBOMs stored in CAS with paths:

```
/evidence/sboms/<image-digest>/layers/<layer-digest>.cdx.json
/evidence/sboms/<image-digest>/layers/<layer-digest>.spdx.json
/evidence/sboms/<image-digest>/recipe.json
```
## Acceptance Criteria

1. **Determinism**: Same image scan produces identical per-layer SBOMs
2. **Completeness**: Every layer in the image has a corresponding SBOM
3. **Verifiability**: Composition recipe Merkle root matches layer SBOM digests
4. **Performance**: Per-layer SBOM retrieval < 100ms (cached)
5. **Backward Compatibility**: Existing SBOM APIs continue to work unchanged

## Test Cases

### Unit Tests

- `LayerSbomWriter` produces deterministic output for identical fragments
- Composition recipe Merkle root computation is RFC 6962 compliant
- Layer ordering is stable (sorted by layer order, not discovery order)

### Integration Tests

- Full scan produces per-layer SBOMs stored in CAS
- API returns correct layer SBOM by digest
- Recipe verification passes for valid scans
- Recipe verification fails for tampered SBOMs

### Determinism Tests

- Two scans of identical images produce identical per-layer SBOM digests
- Composition recipe is identical across runs

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Store per-layer SBOMs in CAS | Content-addressable deduplication handles shared layers |
| Use layer digest as key | Deterministic, unique per layer content |
| Include both CDX and SPDX per layer | Supports customer format preferences |

| Risk | Mitigation |
|------|------------|
| Storage growth with many layers | TTL-based cleanup for orphaned layer SBOMs |
| Cache invalidation complexity | Layer SBOMs are immutable once created |

## Execution Log

| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |
# SPRINT_20260106_003_002_SCANNER_vex_gate_service

## Sprint Metadata

| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_002 |
| Module | SCANNER/EXCITITOR |
| Title | VEX-first Gating Service |
| Working Directory | `src/Scanner/`, `src/Excititor/` |
| Dependencies | SPRINT_20260106_003_001 |
| Blocking | SPRINT_20260106_003_004 |

## Objective

Implement a VEX-first gating service that filters vulnerability findings before triage, reducing noise by applying VEX statements and configurable policies. This completes Step 2 of the product advisory: "VEX-first gating (reduce noise before triage)".

## Context

**Current State:**

- Excititor ingests VEX statements and stores them as immutable observations
- VexLens computes consensus across weighted statements
- Scanner produces findings without pre-filtering
- No explicit "gate" decision before findings reach the triage queue

**Target State:**

- `IVexGateService` applies VEX evidence before triage
- Gate decisions: `PASS` (proceed), `WARN` (proceed with flag), `BLOCK` (requires attention)
- Evidence tracking for each gate decision
- Configurable gate policies per tenant
## Tasks

### Phase 1: VEX Gate Core Service (8 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define `VexGateDecision` enum: `Pass`, `Warn`, `Block` | TODO | `src/Scanner/__Libraries/StellaOps.Scanner.Gate/` |
| T002 | Define `VexGateResult` model with evidence | TODO | Include rationale, contributing statements |
| T003 | Define `IVexGateService` interface | TODO | `EvaluateAsync(Finding, CancellationToken)` |
| T004 | Implement `VexGateService` core logic | TODO | Integrates with VexLens consensus |
| T005 | Create `VexGatePolicy` configuration model | TODO | Rules for PASS/WARN/BLOCK decisions |
| T006 | Implement default policy rules | TODO | Per advisory: exploitable+reachable+no-control=BLOCK |
| T007 | Add `IVexGatePolicy` interface | TODO | Pluggable policy evaluation |
| T008 | Unit tests for VexGateService | TODO | |

### Phase 2: Excititor Integration (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T009 | Add `IVexObservationQuery` for gate lookups | TODO | `src/Excititor/__Libraries/` |
| T010 | Implement efficient CVE+PURL batch lookup | TODO | Optimize for gate throughput |
| T011 | Add VEX statement caching for gate operations | TODO | Short TTL, bounded cache |
| T012 | Create `VexGateExcititorAdapter` | TODO | Bridges Scanner → Excititor |
| T013 | Integration tests for Excititor lookups | TODO | |
| T014 | Performance benchmarks for batch evaluation | TODO | Target: 1000 findings/sec |
### Phase 3: Scanner Worker Integration (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T015 | Add VEX gate stage to scan pipeline | TODO | After findings, before triage emit |
| T016 | Update `ScanResult` with gate decisions | TODO | `GatedFindings: ImmutableArray<GatedFinding>` |
| T017 | Add gate metrics to `ScanMetricsCollector` | TODO | pass/warn/block counts |
| T018 | Implement gate bypass for emergency scans | TODO | Feature flag or scan option |
| T019 | Integration tests for gated scan pipeline | TODO | |

### Phase 4: Gate Evidence & API (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Define `GateEvidence` model | TODO | Statement refs, policy rule matched |
| T021 | Add `GET /scans/{id}/gate-results` endpoint | TODO | Scanner.WebService |
| T022 | Add gate evidence to SBOM findings metadata | TODO | Link to VEX statements |
| T023 | Implement gate decision audit logging | TODO | For compliance |
| T024 | Add gate summary to scan completion event | TODO | Router notification |
| T025 | API integration tests | TODO | |

### Phase 5: CLI & Configuration (4 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Add `stella scan gate-policy show` command | TODO | Display current policy |
| T027 | Add `stella scan gate-results <scan-id>` command | TODO | Show gate decisions |
| T028 | Add gate policy to tenant configuration | TODO | `etc/scanner.yaml` |
| T029 | CLI integration tests | TODO | |
## Contracts

### VexGateDecision

```csharp
public enum VexGateDecision
{
    Pass,  // Finding cleared by VEX evidence - no action needed
    Warn,  // Finding has partial evidence - proceed with caution
    Block  // Finding requires attention - exploitable and reachable
}
```

### VexGateResult

```csharp
public sealed record VexGateResult
{
    public required VexGateDecision Decision { get; init; }
    public required string Rationale { get; init; }
    public required string PolicyRuleMatched { get; init; }
    public required ImmutableArray<VexStatementRef> ContributingStatements { get; init; }
    public required VexGateEvidence Evidence { get; init; }
    public required DateTimeOffset EvaluatedAt { get; init; }
}

public sealed record VexGateEvidence
{
    public required VexStatus? VendorStatus { get; init; }
    public required VexJustificationType? Justification { get; init; }
    public required bool IsReachable { get; init; }
    public required bool HasCompensatingControl { get; init; }
    public required double ConfidenceScore { get; init; }
    public required ImmutableArray<string> BackportHints { get; init; }
}

public sealed record VexStatementRef
{
    public required string StatementId { get; init; }
    public required string IssuerId { get; init; }
    public required VexStatus Status { get; init; }
    public required DateTimeOffset Timestamp { get; init; }
}
```
### VexGatePolicy

```csharp
public sealed record VexGatePolicy
{
    public required ImmutableArray<VexGatePolicyRule> Rules { get; init; }
    public required VexGateDecision DefaultDecision { get; init; }
}

public sealed record VexGatePolicyRule
{
    public required string RuleId { get; init; }
    public required VexGatePolicyCondition Condition { get; init; }
    public required VexGateDecision Decision { get; init; }
    public required int Priority { get; init; }
}

public sealed record VexGatePolicyCondition
{
    public VexStatus? VendorStatus { get; init; }
    public bool? IsExploitable { get; init; }
    public bool? IsReachable { get; init; }
    public bool? HasCompensatingControl { get; init; }
    public string[]? SeverityLevels { get; init; }
}
```
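The contract implies first-match-wins evaluation in descending `Priority` order, with null condition fields acting as wildcards. A hedged sketch of that matching loop (Python for brevity; field names mirror the contract, but the exact matching semantics are an assumption, not the shipped logic):

```python
def evaluate_gate(finding: dict, rules: list[dict], default_decision: str):
    """Return (decision, rule_id): highest-priority matching rule wins.

    A condition field set to None is a wildcard; a set field must equal the
    finding's value, except severityLevels, which matches by membership.
    """
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        matches = all(
            expected is None
            or (key == "severityLevels" and finding.get("severity") in expected)
            or (key != "severityLevels" and finding.get(key) == expected)
            for key, expected in rule["condition"].items()
        )
        if matches:
            return rule["decision"], rule["ruleId"]
    return default_decision, "default"
```

Returning the matched rule id alongside the decision is what makes `PolicyRuleMatched` in `VexGateResult` auditable.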
### GatedFinding

```csharp
public sealed record GatedFinding
{
    public required FindingRef Finding { get; init; }
    public required VexGateResult GateResult { get; init; }
}
```
## Default Gate Policy Rules

Per product advisory:

```yaml
# etc/scanner.yaml
vexGate:
  enabled: true
  rules:
    - ruleId: "block-exploitable-reachable"
      priority: 100
      condition:
        isExploitable: true
        isReachable: true
        hasCompensatingControl: false
      decision: Block

    - ruleId: "warn-high-not-reachable"
      priority: 90
      condition:
        severityLevels: ["critical", "high"]
        isReachable: false
      decision: Warn

    - ruleId: "pass-vendor-not-affected"
      priority: 80
      condition:
        vendorStatus: NotAffected
      decision: Pass

    - ruleId: "pass-backport-confirmed"
      priority: 70
      condition:
        vendorStatus: Fixed
        # justification implies backport evidence
      decision: Pass

  defaultDecision: Warn
```
## API Endpoints

### GET /api/v1/scans/{scanId}/gate-results

```json
{
  "scanId": "...",
  "gateSummary": {
    "totalFindings": 150,
    "passed": 100,
    "warned": 35,
    "blocked": 15,
    "evaluatedAt": "2026-01-06T10:30:00Z"
  },
  "gatedFindings": [
    {
      "findingId": "...",
      "cve": "CVE-2025-12345",
      "decision": "Block",
      "rationale": "Exploitable + reachable, no compensating control",
      "policyRuleMatched": "block-exploitable-reachable",
      "evidence": {
        "vendorStatus": null,
        "isReachable": true,
        "hasCompensatingControl": false,
        "confidenceScore": 0.95
      }
    }
  ]
}
```
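The `gateSummary` block is a straight fold over the gated findings; a minimal sketch of that aggregation (Python, shape mirrored from the response contract):

```python
from collections import Counter

def gate_summary(gated_findings: list[dict]) -> dict:
    """Aggregate per-decision counts for the gate-results response."""
    counts = Counter(finding["decision"] for finding in gated_findings)
    return {
        "totalFindings": len(gated_findings),
        "passed": counts.get("Pass", 0),
        "warned": counts.get("Warn", 0),
        "blocked": counts.get("Block", 0),
    }
```

The passed/total ratio from this summary is also the natural input to the ">= 50% noise reduction" acceptance measurement.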
## CLI Commands

```bash
# Show current gate policy
stella scan gate-policy show

# Get gate results for a scan
stella scan gate-results <scan-id>

# Get gate results, blocked findings only
stella scan gate-results <scan-id> --decision Block

# Run scan with gate bypass (emergency)
stella scan start <image> --bypass-gate
```

## Performance Targets

| Metric | Target |
|--------|--------|
| Gate evaluation throughput | >= 1000 findings/sec |
| VEX lookup latency (cached) | < 5ms |
| VEX lookup latency (uncached) | < 50ms |
| Memory overhead per scan | < 10MB for gate state |
## Acceptance Criteria

1. **Noise Reduction**: Gate reduces triage queue by >= 50% on test corpus
2. **Accuracy**: False positive rate < 1% (findings incorrectly passed)
3. **Performance**: Gate evaluation < 1s for typical scan (100 findings)
4. **Traceability**: Every gate decision has auditable evidence
5. **Configurability**: Policy rules can be customized per tenant

## Test Cases

### Unit Tests

- Policy rule matching logic for all conditions
- Default policy produces expected decisions
- Evidence is correctly captured from VEX statements

### Integration Tests

- Gate service queries Excititor correctly
- Scan pipeline applies gate decisions
- Gate results appear in API response

### Corpus Tests (test data from `src/__Tests/__Datasets/`)

- Known "not affected" CVEs are passed
- Known exploitable+reachable CVEs are blocked
- Ambiguous cases are warned

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Gate after findings, before triage | Allows full finding context for decision |
| Default to Warn not Block | Conservative to avoid blocking legitimate alerts |
| Cache VEX lookups with short TTL | Balance freshness vs performance |

| Risk | Mitigation |
|------|------------|
| VEX data stale at gate time | TTL-based cache invalidation, async refresh |
| Policy misconfiguration | Policy validation at startup, audit logging |
| Gate becomes bottleneck | Parallel evaluation, batch VEX lookups |

## Execution Log

| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |
# SPRINT_20260106_003_003_EVIDENCE_export_bundle

## Sprint Metadata

| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_003 |
| Module | EVIDENCELOCKER |
| Title | Evidence Bundle Export with Verify Commands |
| Working Directory | `src/EvidenceLocker/` |
| Dependencies | SPRINT_20260106_003_001 |
| Blocking | None (can proceed in parallel with 003_004) |

## Objective

Implement a standardized evidence bundle export format that includes SBOMs, VEX statements, attestations, public keys, and embedded verification scripts. This enables offline audits and air-gapped verification as specified in the product advisory MVP: "Evidence Bundle export (zip/tar) for audits".

## Context

**Current State:**

- EvidenceLocker stores sealed bundles with Merkle integrity
- Bundles contain SBOM, scan results, policy verdicts, attestations
- No standardized export format for external auditors
- No embedded verification commands

**Target State:**

- Standardized `evidence-bundle-<id>.tar.gz` export format
- Embedded `verify.sh` and `verify.ps1` scripts
- README with verification instructions
- Public keys bundled for offline verification
- CLI command for export
## Tasks

### Phase 1: Export Format Definition (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define bundle directory structure | TODO | See "Bundle Structure" below |
| T002 | Create `BundleManifest` model | TODO | Index of all artifacts in bundle |
| T003 | Define `BundleMetadata` model | TODO | Provenance, timestamps, subject |
| T004 | Create bundle format specification doc | TODO | `docs/modules/evidence-locker/export-format.md` |
| T005 | Unit tests for manifest serialization | TODO | Deterministic JSON output |

### Phase 2: Export Service Implementation (8 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T006 | Define `IEvidenceBundleExporter` interface | TODO | `src/EvidenceLocker/__Libraries/StellaOps.EvidenceLocker.Export/` |
| T007 | Implement `TarGzBundleExporter` | TODO | Creates tar.gz with correct structure |
| T008 | Implement artifact collector (SBOMs) | TODO | Fetches from CAS |
| T009 | Implement artifact collector (VEX) | TODO | Fetches VEX statements |
| T010 | Implement artifact collector (Attestations) | TODO | Fetches DSSE envelopes |
| T011 | Implement public key bundler | TODO | Includes signing keys for verification |
| T012 | Add compression options (gzip, brotli) | TODO | Configurable compression level |
| T013 | Unit tests for export service | TODO | |

### Phase 3: Verify Script Generation (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T014 | Create `verify.sh` template (bash) | TODO | POSIX-compliant |
| T015 | Create `verify.ps1` template (PowerShell) | TODO | Windows support |
| T016 | Implement DSSE verification in scripts | TODO | Uses bundled public keys |
| T017 | Implement Merkle root verification in scripts | TODO | Checks manifest integrity |
| T018 | Implement checksum verification in scripts | TODO | SHA256 of each artifact |
| T019 | Script generation tests | TODO | Generated scripts run correctly |

### Phase 4: API & Worker (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Add `POST /bundles/{id}/export` endpoint | TODO | Triggers async export |
| T021 | Add `GET /bundles/{id}/export/{exportId}` endpoint | TODO | Download exported bundle |
| T022 | Implement export worker for large bundles | TODO | Background processing |
| T023 | Add export status tracking | TODO | pending/processing/ready/failed |
| T024 | API integration tests | TODO | |

### Phase 5: CLI Commands (4 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T025 | Add `stella evidence export` command | TODO | `--bundle <id> --output <path>` |
| T026 | Add `stella evidence verify` command | TODO | Verifies exported bundle |
| T027 | Add progress indicator for large exports | TODO | |
| T028 | CLI integration tests | TODO | |
## Bundle Structure

```
evidence-bundle-<id>/
+-- manifest.json            # Bundle manifest with all artifact refs
+-- metadata.json            # Bundle metadata (provenance, timestamps)
+-- README.md                # Human-readable verification instructions
+-- verify.sh                # Bash verification script
+-- verify.ps1               # PowerShell verification script
+-- checksums.sha256         # SHA256 checksums for all artifacts
+-- keys/
|   +-- signing-key-001.pem  # Public key for DSSE verification
|   +-- signing-key-002.pem  # Additional keys if multi-sig
|   +-- trust-bundle.pem     # CA chain if applicable
+-- sboms/
|   +-- image.cdx.json       # Aggregated CycloneDX SBOM
|   +-- image.spdx.json      # Aggregated SPDX SBOM
|   +-- layers/
|       +-- <layer-digest>.cdx.json   # Per-layer CycloneDX
|       +-- <layer-digest>.spdx.json  # Per-layer SPDX
+-- vex/
|   +-- statements/
|   |   +-- <statement-id>.openvex.json
|   +-- consensus/
|       +-- image-consensus.json      # VEX consensus result
+-- attestations/
|   +-- sbom.dsse.json       # SBOM attestation envelope
|   +-- vex.dsse.json        # VEX attestation envelope
|   +-- policy.dsse.json     # Policy verdict attestation
|   +-- rekor-proofs/
|       +-- <uuid>.proof.json         # Rekor inclusion proofs
+-- findings/
|   +-- scan-results.json    # Vulnerability findings
|   +-- gate-results.json    # VEX gate decisions
+-- audit/
    +-- timeline.ndjson      # Audit event timeline
```
## Contracts

### BundleManifest

```json
{
  "manifestVersion": "1.0.0",
  "bundleId": "eb-2026-01-06-abc123",
  "createdAt": "2026-01-06T10:30:00.000000Z",
  "subject": {
    "type": "container-image",
    "digest": "sha256:abcdef...",
    "name": "registry.example.com/app:v1.2.3"
  },
  "artifacts": [
    {
      "path": "sboms/image.cdx.json",
      "type": "sbom",
      "format": "cyclonedx-1.7",
      "digest": "sha256:...",
      "size": 45678
    },
    {
      "path": "attestations/sbom.dsse.json",
      "type": "attestation",
      "format": "dsse-v1",
      "predicateType": "StellaOps.SBOMAttestation@1",
      "digest": "sha256:...",
      "size": 12345,
      "signedBy": ["sha256:keyabc..."]
    }
  ],
  "verification": {
    "merkleRoot": "sha256:...",
    "algorithm": "sha256",
    "checksumFile": "checksums.sha256"
  }
}
```
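Task T005 requires that manifest serialization be deterministic so the same manifest always hashes to the same digest. A minimal sketch of the idea (the real implementation lives in the C# export library; this helper is illustrative only) is recursive key-sorting before serialization:

```typescript
// Sketch: deterministic JSON serialization for BundleManifest (task T005).
// Recursively sorts object keys so identical manifests always produce
// identical bytes, and therefore identical sha256 digests.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalJson).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return (
      "{" +
      keys.map(k => JSON.stringify(k) + ":" + canonicalJson(obj[k])).join(",") +
      "}"
    );
  }
  // Primitives (string, number, boolean, null) already serialize stably.
  return JSON.stringify(value);
}
```

Key order in the input no longer matters, so re-exporting an unchanged bundle yields the same `manifest.json` bytes.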

### BundleMetadata

```json
{
  "bundleId": "eb-2026-01-06-abc123",
  "exportedAt": "2026-01-06T10:35:00.000000Z",
  "exportedBy": "stella evidence export",
  "exportVersion": "2026.04",
  "provenance": {
    "tenantId": "tenant-xyz",
    "scanId": "scan-abc123",
    "pipelineId": "pipeline-def456",
    "sourceRepository": "https://github.com/example/app",
    "sourceCommit": "abc123def456..."
  },
  "chainInfo": {
    "previousBundleId": "eb-2026-01-05-xyz789",
    "sequenceNumber": 42
  },
  "transparency": {
    "rekorLogUrl": "https://rekor.sigstore.dev",
    "rekorEntryUuids": ["uuid1", "uuid2"]
  }
}
```
## Verify Script Logic

### verify.sh (Bash)

```bash
#!/bin/bash
set -euo pipefail
# Glob patterns that match nothing expand to nothing instead of themselves.
shopt -s nullglob

# compute-merkle-root, verify-dsse, and verify-rekor-proof are helper
# tools expected to be shipped alongside the bundle or available on PATH.

BUNDLE_DIR="$(cd "$(dirname "$0")" && pwd)"
MANIFEST="$BUNDLE_DIR/manifest.json"
CHECKSUMS="$BUNDLE_DIR/checksums.sha256"

echo "=== StellaOps Evidence Bundle Verification ==="
echo "Bundle: $(basename "$BUNDLE_DIR")"
echo ""

# Step 1: Verify checksums
echo "[1/4] Verifying artifact checksums..."
cd "$BUNDLE_DIR"
sha256sum -c --quiet "$CHECKSUMS"
echo "  OK: All checksums match"

# Step 2: Verify Merkle root
echo "[2/4] Verifying Merkle root..."
COMPUTED_ROOT=$(compute-merkle-root "$CHECKSUMS")
EXPECTED_ROOT=$(jq -r '.verification.merkleRoot' "$MANIFEST")
if [ "$COMPUTED_ROOT" = "$EXPECTED_ROOT" ]; then
  echo "  OK: Merkle root verified"
else
  echo "  FAIL: Merkle root mismatch"
  exit 1
fi

# Step 3: Verify DSSE signatures
echo "[3/4] Verifying attestation signatures..."
for dsse in "$BUNDLE_DIR"/attestations/*.dsse.json; do
  verify-dsse "$dsse" --keys "$BUNDLE_DIR/keys/"
  echo "  OK: $(basename "$dsse")"
done

# Step 4: Verify Rekor proofs (if online)
echo "[4/4] Verifying Rekor proofs..."
if [ "${OFFLINE:-false}" = "true" ]; then
  echo "  SKIP: Offline mode, Rekor verification skipped"
else
  for proof in "$BUNDLE_DIR"/attestations/rekor-proofs/*.proof.json; do
    verify-rekor-proof "$proof"
    echo "  OK: $(basename "$proof")"
  done
fi

echo ""
echo "=== Verification Complete: PASSED ==="
```
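The script's `compute-merkle-root` helper is a placeholder; the exact tree shape will be defined in the format specification (task T004). One plausible construction, sketched here under that assumption, hashes the leaf digests from `checksums.sha256` pairwise up to a single root:

```typescript
import { createHash } from "crypto";

function sha256Hex(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Sketch of a Merkle root over leaf hashes (e.g. the hex digests from
// checksums.sha256, in file order). At each level, adjacent pairs are
// concatenated and re-hashed; an odd-length level duplicates its last node.
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) return sha256Hex("");
  let level = leaves;
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const right = i + 1 < level.length ? level[i + 1] : level[i];
      next.push(sha256Hex(level[i] + right));
    }
    level = next;
  }
  return level[0];
}
```

Whatever shape T004 settles on, the verify scripts and the exporter must agree on it byte-for-byte, or step 2 will always fail.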

## API Endpoints

### POST /api/v1/bundles/{bundleId}/export

Request:

```json
{
  "format": "tar.gz",
  "compression": "gzip",
  "includeRekorProofs": true,
  "includeLayerSboms": true
}
```

Response 202:

```json
{
  "exportId": "exp-123",
  "status": "processing",
  "estimatedSize": 1234567,
  "statusUrl": "/api/v1/bundles/{bundleId}/export/exp-123"
}
```

### GET /api/v1/bundles/{bundleId}/export/{exportId}

Response 200 (when ready):

```
Headers:
  Content-Type: application/gzip
  Content-Disposition: attachment; filename="evidence-bundle-eb-123.tar.gz"

Body: <binary tar.gz content>
```

Response 202 (still processing):

```json
{
  "exportId": "exp-123",
  "status": "processing",
  "progress": 65,
  "estimatedTimeRemaining": "30s"
}
```

## CLI Commands

```bash
# Export bundle to file
stella evidence export --bundle eb-2026-01-06-abc123 --output ./audit-bundle.tar.gz

# Export with options
stella evidence export --bundle eb-123 \
  --output ./bundle.tar.gz \
  --include-layers \
  --include-rekor-proofs

# Verify an exported bundle
stella evidence verify ./audit-bundle.tar.gz

# Verify offline (skip Rekor)
stella evidence verify ./audit-bundle.tar.gz --offline
```

## Acceptance Criteria

1. **Completeness**: Bundle includes all specified artifacts (SBOMs, VEX, attestations, keys)
2. **Verifiability**: `verify.sh` and `verify.ps1` run successfully on valid bundles
3. **Offline Support**: Verification works without network access (except Rekor)
4. **Determinism**: Same bundle exported twice produces identical tar.gz
5. **Documentation**: README explains verification steps for non-technical auditors
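Criterion 4 typically requires canonicalizing archive entries before writing: sorted paths, fixed timestamps, fixed ownership, and a gzip stream with `mtime=0`. A minimal sketch of the entry normalization (the `TarEntry` shape is illustrative, not the project's actual type):

```typescript
// Sketch: canonicalize entries before writing the tar stream so two exports
// of the same bundle are byte-identical (acceptance criterion 4).
interface TarEntry {
  path: string;
  mtime: number;       // seconds since epoch; zeroed for determinism
  data: Uint8Array;
}

function canonicalize(entries: TarEntry[]): TarEntry[] {
  // Zero out timestamps and sort by path so archive bytes depend neither on
  // build time nor on filesystem iteration order. uid/gid/mode should be
  // pinned the same way, and the gzip header written with mtime=0.
  return entries
    .map(e => ({ ...e, mtime: 0 }))
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0));
}
```

With this in place, "export twice, diff the bytes" becomes a cheap regression test for T007.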

## Test Cases

### Unit Tests

- Manifest serialization is deterministic
- Merkle root computation matches expected
- Checksum file format is correct

### Integration Tests

- Export service collects all artifacts from CAS
- Generated verify.sh runs correctly on Linux
- Generated verify.ps1 runs correctly on Windows
- Large bundles (>100MB) export without OOM

### E2E Tests

- Full flow: scan -> seal -> export -> verify
- Exported bundle verifies in air-gapped environment

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| tar.gz format | Universal, works on all platforms |
| Embedded verify scripts | No external dependencies for basic verification |
| Include public keys in bundle | Enables offline verification |
| NDJSON for audit timeline | Streaming-friendly, easy to parse |

| Risk | Mitigation |
|------|------------|
| Bundle size too large | Compression, optional layer SBOMs |
| Script compatibility issues | Test on multiple OS versions |
| Key rotation during export | Include all valid keys, document rotation |

## Execution Log

| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |
---

`docs/implplan/SPRINT_20260106_003_004_ATTESTOR_chain_linking.md` (new file, 351 lines)
# SPRINT_20260106_003_004_ATTESTOR_chain_linking

## Sprint Metadata

| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_004 |
| Module | ATTESTOR |
| Title | Cross-Attestation Linking & Per-Layer Attestations |
| Working Directory | `src/Attestor/` |
| Dependencies | SPRINT_20260106_003_001, SPRINT_20260106_003_002 |
| Blocking | None |

## Objective

Implement cross-attestation linking (SBOM -> VEX -> Policy chain) and per-layer attestations to complete the attestation chain model specified in Step 3 of the product advisory: "Sign everything (portable, verifiable evidence)".

## Context

**Current State:**
- Attestor creates DSSE envelopes for SBOMs, VEX, scan results, policy verdicts
- Each attestation is independent with subject pointing to artifact digest
- No explicit chain linking between attestations
- Single attestation per image (no per-layer)

**Target State:**
- Cross-attestation linking via in-toto layout references
- Per-layer attestations with layer-specific subjects
- Query API for attestation chains
- Full provenance chain from source to final verdict
## Tasks

### Phase 1: Cross-Attestation Model (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define `AttestationLink` model | TODO | References between attestations |
| T002 | Define `AttestationChain` model | TODO | Ordered chain with validation |
| T003 | Update `InTotoStatement` to include `materials` refs | TODO | Link to upstream attestations |
| T004 | Create `IAttestationLinkResolver` interface | TODO | Resolve chain from any point |
| T005 | Implement `AttestationChainValidator` | TODO | Validates DAG structure |
| T006 | Unit tests for chain models | TODO | |

### Phase 2: Chain Linking Implementation (7 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T007 | Update SBOM attestation to include source materials | TODO | Commit SHA, layer digests |
| T008 | Update VEX attestation to reference SBOM attestation | TODO | `materials: [{sbom-attestation-digest}]` |
| T009 | Update Policy attestation to reference VEX + SBOM | TODO | Complete chain |
| T010 | Implement `IAttestationChainBuilder` | TODO | Builds chain from components |
| T011 | Add chain validation at submission time | TODO | Reject circular refs |
| T012 | Store chain links in `attestor.entry_links` table | TODO | PostgreSQL |
| T013 | Integration tests for chain building | TODO | |

### Phase 3: Per-Layer Attestations (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T014 | Define `LayerAttestationRequest` model | TODO | Layer digest as subject |
| T015 | Update `IAttestationSigningService` for layers | TODO | Batch layer attestations |
| T016 | Implement `LayerAttestationService` | TODO | Creates per-layer DSSE |
| T017 | Add layer attestations to `SbomCompositionResult` | TODO | From Scanner |
| T018 | Batch signing for efficiency | TODO | Sign all layers in one operation |
| T019 | Unit tests for layer attestations | TODO | |

### Phase 4: Chain Query API (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Add `GET /attestations?artifact={digest}&chain=true` | TODO | Returns full chain |
| T021 | Add `GET /attestations/{id}/upstream` | TODO | Parent attestations |
| T022 | Add `GET /attestations/{id}/downstream` | TODO | Child attestations |
| T023 | Implement chain traversal with depth limit | TODO | Prevent infinite loops |
| T024 | Add chain visualization endpoint | TODO | Mermaid/DOT graph output |
| T025 | API integration tests | TODO | |

### Phase 5: CLI & Documentation (4 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Add `stella attest chain <artifact-digest>` command | TODO | Display attestation chain |
| T027 | Add `stella attest layers <scan-id>` command | TODO | List layer attestations |
| T028 | Update attestor architecture docs | TODO | Cross-attestation linking |
| T029 | CLI integration tests | TODO | |
## Contracts

### AttestationLink

```csharp
public sealed record AttestationLink
{
    public required string SourceAttestationId { get; init; }  // sha256:<hash>
    public required string TargetAttestationId { get; init; }  // sha256:<hash>
    public required AttestationLinkType LinkType { get; init; }
    public required DateTimeOffset CreatedAt { get; init; }
}

public enum AttestationLinkType
{
    DependsOn,   // Target is a material for source
    Supersedes,  // Source supersedes target (version update)
    Aggregates   // Source aggregates multiple targets (batch)
}
```

### AttestationChain

```csharp
public sealed record AttestationChain
{
    public required string RootAttestationId { get; init; }
    public required ImmutableArray<AttestationChainNode> Nodes { get; init; }
    public required ImmutableArray<AttestationLink> Links { get; init; }
    public required bool IsComplete { get; init; }
    public required DateTimeOffset ResolvedAt { get; init; }
}

public sealed record AttestationChainNode
{
    public required string AttestationId { get; init; }
    public required string PredicateType { get; init; }
    public required string SubjectDigest { get; init; }
    public required int Depth { get; init; }
    public required DateTimeOffset CreatedAt { get; init; }
}
```
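Validation (T005, T011) must reject any link that would make the graph cyclic before the link is persisted. The check itself is a reachability search: if the proposed target can already reach the proposed source, the new edge closes a cycle. A sketch (types are illustrative, mirroring `AttestationLink` above):

```typescript
interface Link {
  source: string;  // sha256:<hash> of the depending attestation
  target: string;  // sha256:<hash> of the material attestation
}

// Returns true if adding `candidate` to `existing` would create a cycle.
function wouldCreateCycle(existing: Link[], candidate: Link): boolean {
  // Build adjacency: source -> [targets]
  const adjacency = new Map<string, string[]>();
  for (const l of existing) {
    const out = adjacency.get(l.source);
    if (out) out.push(l.target); else adjacency.set(l.source, [l.target]);
  }
  // DFS from candidate.target: if it reaches candidate.source, the new
  // edge source -> target would close a loop.
  const stack = [candidate.target];
  const seen = new Set<string>();
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node === candidate.source) return true;
    if (seen.has(node)) continue;
    seen.add(node);
    for (const next of adjacency.get(node) ?? []) stack.push(next);
  }
  // Self-links are trivially cyclic (also enforced by the DB CHECK below).
  return candidate.source === candidate.target;
}
```

Running this at submission time keeps `attestor.entry_links` a DAG, which is what makes the depth-limited traversal in the query API safe.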

### Enhanced InTotoStatement (with materials)

```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "registry.example.com/app@sha256:imageabc...",
      "digest": { "sha256": "imageabc..." }
    }
  ],
  "predicateType": "StellaOps.PolicyEvaluation@1",
  "predicate": {
    "verdict": "pass",
    "evaluatedAt": "2026-01-06T10:30:00Z",
    "policyVersion": "1.2.3"
  },
  "materials": [
    {
      "uri": "attestation:sha256:sbom-attest-digest",
      "digest": { "sha256": "sbom-attest-digest" },
      "annotations": { "predicateType": "StellaOps.SBOMAttestation@1" }
    },
    {
      "uri": "attestation:sha256:vex-attest-digest",
      "digest": { "sha256": "vex-attest-digest" },
      "annotations": { "predicateType": "StellaOps.VEXAttestation@1" }
    }
  ]
}
```
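Assembling the `materials` array from upstream attestation digests (tasks T008/T009) is mechanical once the digests are known. A sketch following the field names in the statement above (the helper itself is hypothetical, not part of the Attestor API):

```typescript
interface UpstreamAttestation {
  digest: string;         // hex sha256 of the upstream DSSE envelope
  predicateType: string;  // e.g. "StellaOps.SBOMAttestation@1"
}

// Maps upstream attestations into in-toto-style material entries, matching
// the "attestation:sha256:<digest>" URI convention used in the example.
function buildMaterials(upstream: UpstreamAttestation[]) {
  return upstream.map(u => ({
    uri: `attestation:sha256:${u.digest}`,
    digest: { sha256: u.digest },
    annotations: { predicateType: u.predicateType },
  }));
}
```

The `annotations.predicateType` hint lets resolvers pick the right parser for each upstream envelope without fetching it first.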

### LayerAttestationRequest

```csharp
public sealed record LayerAttestationRequest
{
    public required string ImageDigest { get; init; }
    public required string LayerDigest { get; init; }
    public required int LayerOrder { get; init; }
    public required string SbomDigest { get; init; }
    public required string SbomFormat { get; init; }  // "cyclonedx" | "spdx"
}
```

## Database Schema

### attestor.entry_links

```sql
CREATE TABLE attestor.entry_links (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    source_attestation_id TEXT NOT NULL,  -- sha256:<hash>
    target_attestation_id TEXT NOT NULL,  -- sha256:<hash>
    link_type TEXT NOT NULL,              -- 'depends_on', 'supersedes', 'aggregates'
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    CONSTRAINT fk_source FOREIGN KEY (source_attestation_id)
        REFERENCES attestor.entries(bundle_sha256) ON DELETE CASCADE,
    CONSTRAINT fk_target FOREIGN KEY (target_attestation_id)
        REFERENCES attestor.entries(bundle_sha256) ON DELETE CASCADE,
    CONSTRAINT no_self_link CHECK (source_attestation_id != target_attestation_id)
);

CREATE INDEX idx_entry_links_source ON attestor.entry_links(source_attestation_id);
CREATE INDEX idx_entry_links_target ON attestor.entry_links(target_attestation_id);
CREATE INDEX idx_entry_links_type ON attestor.entry_links(link_type);
```
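The chain query API resolves these rows with a depth-limited traversal (task T023) so a malformed graph can never loop forever. A sketch of the traversal logic over rows loaded into memory, assuming the acceptance criterion's default depth of 5 (types are illustrative):

```typescript
interface LinkRow {
  source: string;
  target: string;
}

// Breadth-first walk from `root`, recording each reached attestation with
// its depth and stopping at maxDepth. Visited nodes are never re-enqueued,
// so even a cyclic graph terminates.
function resolveChain(
  links: LinkRow[],
  root: string,
  maxDepth = 5,
): Map<string, number> {
  const adjacency = new Map<string, string[]>();
  for (const l of links) {
    const out = adjacency.get(l.source);
    if (out) out.push(l.target); else adjacency.set(l.source, [l.target]);
  }
  const depths = new Map<string, number>([[root, 0]]);
  const queue: string[] = [root];
  while (queue.length > 0) {
    const node = queue.shift()!;
    const depth = depths.get(node)!;
    if (depth >= maxDepth) continue;  // depth limit (T023)
    for (const next of adjacency.get(node) ?? []) {
      if (!depths.has(next)) {
        depths.set(next, depth + 1);
        queue.push(next);
      }
    }
  }
  return depths;
}
```

BFS assigns each node its minimum depth, which is what the `depth` field in the chain response below reports. In production the same walk can be pushed into PostgreSQL as a recursive CTE over `entry_links`.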

## API Endpoints

### GET /api/v1/attestations?artifact={digest}&chain=true

Response 200:

```json
{
  "artifactDigest": "sha256:imageabc...",
  "chain": {
    "rootAttestationId": "sha256:policy-attest...",
    "isComplete": true,
    "resolvedAt": "2026-01-06T10:35:00Z",
    "nodes": [
      {
        "attestationId": "sha256:policy-attest...",
        "predicateType": "StellaOps.PolicyEvaluation@1",
        "depth": 0
      },
      {
        "attestationId": "sha256:vex-attest...",
        "predicateType": "StellaOps.VEXAttestation@1",
        "depth": 1
      },
      {
        "attestationId": "sha256:sbom-attest...",
        "predicateType": "StellaOps.SBOMAttestation@1",
        "depth": 2
      }
    ],
    "links": [
      {
        "source": "sha256:policy-attest...",
        "target": "sha256:vex-attest...",
        "type": "DependsOn"
      },
      {
        "source": "sha256:policy-attest...",
        "target": "sha256:sbom-attest...",
        "type": "DependsOn"
      }
    ]
  }
}
```

### GET /api/v1/attestations/{id}/chain/graph

Query params:

- `format`: `"mermaid" | "dot" | "json"`

Response 200 (format=mermaid):

```mermaid
graph TD
    A[Policy Verdict] -->|depends_on| B[VEX Attestation]
    A -->|depends_on| C[SBOM Attestation]
    B -->|depends_on| C
    C -->|depends_on| D[Layer 0 Attest]
    C -->|depends_on| E[Layer 1 Attest]
```
## Chain Structure Example

```
                      ┌─────────────────────────┐
                      │     Policy Verdict      │
                      │       Attestation       │
                      │     (root of chain)     │
                      └───────────┬─────────────┘
                                  │
                ┌─────────────────┼─────────────────┐
                │                 │                 │
                ▼                 ▼                 │
      ┌─────────────────┐ ┌─────────────────┐      │
      │ VEX Attestation │ │  Gate Results   │      │
      │                 │ │  Attestation    │      │
      └────────┬────────┘ └─────────────────┘      │
               │                                   │
               ▼                                   ▼
      ┌─────────────────────────────────────────────┐
      │              SBOM Attestation               │
      │               (image level)                 │
      └───────────┬─────────────┬───────────────────┘
                  │             │
          ┌───────┴───────┐     └───────┐
          ▼               ▼             ▼
  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐
  │ Layer 0 SBOM  │ │ Layer 1 SBOM  │ │ Layer N SBOM  │
  │  Attestation  │ │  Attestation  │ │  Attestation  │
  └───────────────┘ └───────────────┘ └───────────────┘
```
## CLI Commands
|
||||
|
||||
```bash
|
||||
# Get attestation chain for an artifact
|
||||
stella attest chain sha256:imageabc...
|
||||
|
||||
# Get chain as graph
|
||||
stella attest chain sha256:imageabc... --format mermaid
|
||||
|
||||
# List layer attestations for a scan
|
||||
stella attest layers <scan-id>
|
||||
|
||||
# Verify complete chain
|
||||
stella attest verify-chain sha256:imageabc...
|
||||
```

## Acceptance Criteria

1. **Chain Completeness**: Policy attestation links to all upstream attestations
2. **Per-Layer Coverage**: Every layer has its own attestation
3. **Queryability**: Full chain retrievable from any node
4. **Validation**: Circular references rejected at creation
5. **Performance**: Chain resolution < 100ms for typical depth (5 levels)

## Test Cases

### Unit Tests

- Chain builder creates correct DAG structure
- Link validator detects circular references
- Chain traversal respects depth limits

### Integration Tests

- Full scan produces complete attestation chain
- Chain query returns all linked attestations
- Per-layer attestations stored correctly

### E2E Tests

- End-to-end: scan -> gate -> attestation chain -> export
- Chain verification in exported bundle

## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Store links in separate table | Efficient traversal, no attestation mutation |
| Use DAG not tree | Allows multiple parents (SBOM used by VEX and Policy) |
| Batch layer attestations | Performance: one signing operation for all layers |
| Materials field for links | in-toto standard compliance |

| Risk | Mitigation |
|------|------------|
| Chain resolution performance | Depth limit, caching, indexed traversal |
| Circular reference bugs | Validation at insertion, periodic audit |
| Orphaned attestations | Cleanup job for unlinked entries |

## Execution Log

| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |
---

(new file, 283 lines)
# SPRINT_20260106_004_001_FE_quiet_triage_ux_integration

## Sprint Metadata

| Field | Value |
|-------|-------|
| Sprint ID | 20260106_004_001 |
| Module | FE (Frontend) |
| Title | Quiet-by-Default Triage UX Integration |
| Working Directory | `src/Web/StellaOps.Web/` |
| Dependencies | None (backend APIs complete) |
| Blocking | None |
| Advisory | `docs-archived/product-advisories/06-Jan-2026 - Quiet-by-Default Triage with Attested Exceptions.md` |

## Objective

Integrate the existing quiet-by-default triage backend APIs into the Angular 17 frontend. The backend infrastructure is complete; this sprint delivers the UX layer that enables users to experience "inbox shows only actionables" with one-click access to the Review lane and evidence export.

## Context

**Current State:**
- Backend APIs fully implemented:
  - `GatingReasonService` computes gating status
  - `GatingContracts.cs` defines DTOs (`FindingGatingStatusDto`, `GatedBucketsSummaryDto`)
  - `ApprovalEndpoints` provides CRUD for approvals
  - `TriageStatusEndpoints` serves lane/verdict data
  - `EvidenceLocker` provides bundle export
- Frontend has existing findings table but lacks:
  - Quiet/Review lane toggle
  - Gated bucket summary chips
  - Breadcrumb navigation
  - Approval workflow modal

**Target State:**
- Default view shows only actionable findings (Quiet lane)
- Banner displays gated bucket counts with one-click filters
- Breadcrumb bar enables image->layer->package->symbol->call-path navigation
- Decision drawer supports mute/ack/exception with signing
- One-click evidence bundle export

## Backend APIs (Already Implemented)

| Endpoint | Purpose |
|----------|---------|
| `GET /api/v1/triage/findings` | Findings with gating status |
| `GET /api/v1/triage/findings/{id}/gating` | Individual gating status |
| `GET /api/v1/triage/scans/{id}/gated-buckets` | Gated bucket summary |
| `POST /api/v1/scans/{id}/approvals` | Create approval |
| `GET /api/v1/scans/{id}/approvals` | List approvals |
| `DELETE /api/v1/scans/{id}/approvals/{findingId}` | Revoke approval |
| `GET /api/v1/evidence/bundles/{id}/export` | Export evidence bundle |
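The `GatingService` (task T001) wraps these endpoints for the components below. One framework-free piece of it can be sketched here: building the findings query URL from the lane toggle, hidden-findings flag, and bucket filter. Parameter names (`lane`, `includeHidden`, `gatingReason`) are assumptions consistent with the task list, not confirmed API contract:

```typescript
interface FindingsQuery {
  lane: 'quiet' | 'review';
  includeHidden?: boolean;
  gatingReason?: string;  // e.g. "Unreachable", "VexNotAffected"
}

// Builds the query string for GET /api/v1/triage/findings from the current
// UI state. URLSearchParams handles encoding and preserves insertion order.
function findingsUrl(scanId: string, q: FindingsQuery): string {
  const params = new URLSearchParams({ scanId, lane: q.lane });
  if (q.includeHidden) params.set('includeHidden', 'true');
  if (q.gatingReason) params.set('gatingReason', q.gatingReason);
  return `/api/v1/triage/findings?${params.toString()}`;
}
```

In the Angular service this URL would be passed to `HttpClient.get`, with the lane toggle and bucket chips driving the `FindingsQuery` state.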
## Tasks

### Phase 1: Lane Toggle & Gated Buckets (8 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Create `GatingService` Angular service | TODO | Wraps gating API calls |
| T002 | Create `TriageLaneToggle` component | TODO | Quiet/Review toggle button |
| T003 | Create `GatedBucketChips` component | TODO | Displays counts per gating reason |
| T004 | Update `FindingsTableComponent` to filter by lane | TODO | Default to Quiet (non-gated) |
| T005 | Add `IncludeHidden` query param support | TODO | Toggle shows hidden findings |
| T006 | Add `GatingReasonFilter` dropdown | TODO | Filter to specific bucket |
| T007 | Style gated badge indicators | TODO | Visual distinction for gated rows |
| T008 | Unit tests for lane toggle and chips | TODO | |

### Phase 2: Breadcrumb Navigation (6 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T009 | Create `ProvenanceBreadcrumb` component | TODO | Image->Layer->Package->Symbol->CallPath |
| T010 | Create `BreadcrumbNodePopover` component | TODO | Inline attestation chips per hop |
| T011 | Integrate with `ReachGraphSliceService` API | TODO | Fetch call-path data |
| T012 | Add layer SBOM link in breadcrumb | TODO | Click to view layer SBOM |
| T013 | Add symbol-to-function link | TODO | Deep link to ReachGraph mini-map |
| T014 | Unit tests for breadcrumb navigation | TODO | |

### Phase 3: Decision Drawer (7 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T015 | Create `DecisionDrawer` component | TODO | Slide-out panel for decisions |
| T016 | Add decision kind selector | TODO | Mute Reach/Mute VEX/Ack/Exception |
| T017 | Add reason code dropdown | TODO | Controlled vocabulary |
| T018 | Add TTL picker for exceptions | TODO | Date picker with validation |
| T019 | Add policy reference display | TODO | Auto-filled, admin-editable |
| T020 | Implement sign-and-apply flow | TODO | Calls `ApprovalEndpoints` |
| T021 | Add undo toast with revoke link | TODO | 10-second undo window |

### Phase 4: Evidence Export (4 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T022 | Create `ExportEvidenceButton` component | TODO | One-click download |
| T023 | Add export progress indicator | TODO | Async job tracking |
| T024 | Implement bundle download handler | TODO | DSSE-signed bundle |
| T025 | Add "include in bundle" markers | TODO | Per-evidence toggle |

### Phase 5: Integration & Polish (5 tasks)

| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Wire components into findings detail page | TODO | |
| T027 | Add keyboard navigation | TODO | Per TRIAGE_UX_GUIDE.md |
| T028 | Implement high-contrast mode support | TODO | Accessibility requirement |
| T029 | Add TTFS telemetry instrumentation | TODO | Time-to-first-signal metric |
| T030 | E2E tests for complete workflow | TODO | Cypress/Playwright |
## Components

### TriageLaneToggle

```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';

@Component({
  selector: 'stella-triage-lane-toggle',
  template: `
    <div class="lane-toggle">
      <button [class.active]="lane === 'quiet'" (click)="setLane('quiet')">
        Actionable ({{ visibleCount }})
      </button>
      <button [class.active]="lane === 'review'" (click)="setLane('review')">
        Review ({{ hiddenCount }})
      </button>
    </div>
  `
})
export class TriageLaneToggleComponent {
  @Input() visibleCount = 0;
  @Input() hiddenCount = 0;
  @Output() laneChange = new EventEmitter<'quiet' | 'review'>();

  lane: 'quiet' | 'review' = 'quiet';

  setLane(lane: 'quiet' | 'review'): void {
    this.lane = lane;
    this.laneChange.emit(lane);
  }
}
```
### GatedBucketChips

```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';

@Component({
  selector: 'stella-gated-bucket-chips',
  template: `
    <div class="bucket-chips">
      <span class="chip" *ngIf="buckets.unreachableCount" (click)="filterBy('Unreachable')">
        Not Reachable: {{ buckets.unreachableCount }}
      </span>
      <span class="chip" *ngIf="buckets.vexNotAffectedCount" (click)="filterBy('VexNotAffected')">
        VEX Not Affected: {{ buckets.vexNotAffectedCount }}
      </span>
      <span class="chip" *ngIf="buckets.backportedCount" (click)="filterBy('Backported')">
        Backported: {{ buckets.backportedCount }}
      </span>
      <!-- ... other buckets -->
    </div>
  `
})
export class GatedBucketChipsComponent {
  @Input() buckets!: GatedBucketsSummaryDto;
  @Output() filterChange = new EventEmitter<GatingReason>();

  filterBy(reason: GatingReason): void {
    this.filterChange.emit(reason);
  }
}
```
### ProvenanceBreadcrumb

```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';

@Component({
  selector: 'stella-provenance-breadcrumb',
  template: `
    <nav class="breadcrumb-bar">
      <a (click)="navigateTo('image')">{{ imageRef }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('layer')">{{ layerDigest | truncate:12 }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('package')">{{ packagePurl }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('symbol')">{{ symbolName }}</a>
      <span class="separator">></span>
      <span class="current">{{ callPath }}</span>
    </nav>
  `
})
export class ProvenanceBreadcrumbComponent {
  @Input() finding!: FindingWithProvenance;
  @Output() navigation = new EventEmitter<BreadcrumbNavigation>();

  // Template bindings delegate to the finding; assumes FindingWithProvenance
  // exposes these provenance fields.
  get imageRef() { return this.finding.imageRef; }
  get layerDigest() { return this.finding.layerDigest; }
  get packagePurl() { return this.finding.packagePurl; }
  get symbolName() { return this.finding.symbolName; }
  get callPath() { return this.finding.callPath; }

  navigateTo(hop: 'image' | 'layer' | 'package' | 'symbol'): void {
    this.navigation.emit({ hop } as BreadcrumbNavigation); // payload shape assumed
  }
}
```
## Data Flow

```
FindingsPage
├── TriageLaneToggle (quiet/review selection)
│   └── emits laneChange → updates query params
├── GatedBucketChips (bucket counts)
│   └── emits filterChange → adds gating reason filter
├── FindingsTable (filtered list)
│   └── rows show gating badge when applicable
└── FindingDetailPanel (selected finding)
    ├── VerdictBanner (SHIP/BLOCK/NEEDS_EXCEPTION)
    ├── StatusChips (reachability, VEX, exploit, gate)
    │   └── click → opens evidence panel
    ├── ProvenanceBreadcrumb (image→call-path)
    │   └── click → navigates to hop detail
    ├── EvidenceRail (artifacts list)
    │   └── ExportEvidenceButton
    └── ActionsFooter
        └── DecisionDrawer (mute/ack/exception)
```
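The `laneChange → updates query params` step in the flow above can be sketched as a pure function, which keeps the filter logic unit-testable; the real page would feed the result into Angular's Router. The param names below are assumptions, not a settled contract.

```typescript
// Hypothetical mapping from the page's filter state to URL query params.
type Lane = 'quiet' | 'review';

interface FindingsFilter {
  lane: Lane;
  gatingReason?: string; // set when a GatedBucketChips chip is clicked
}

function toQueryParams(filter: FindingsFilter): Record<string, string> {
  const params: Record<string, string> = { lane: filter.lane };
  if (filter.gatingReason !== undefined) {
    params['gatingReason'] = filter.gatingReason;
  }
  return params;
}

const params = toQueryParams({ lane: 'review', gatingReason: 'Unreachable' });
// params → { lane: 'review', gatingReason: 'Unreachable' }
```

Keeping both the lane and the bucket filter in the URL also gives the deep-linking support called for under Risks.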
## Styling Requirements

Per `docs/ux/TRIAGE_UX_GUIDE.md`:

- Status conveyed by text + shape (not color only)
- High-contrast mode supported
- Keyboard navigation for table rows, chips, and the evidence list
- Copy-to-clipboard for digests, PURLs, and CVE IDs
- Virtual scroll for the findings table
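The copy-to-clipboard requirement pairs with display truncation: the breadcrumb shows a shortened digest (`truncate:12`) while the full value remains copyable. A minimal sketch of such a truncation helper, assuming the pipe keeps the algorithm prefix (the function name and behavior are illustrative, not the actual pipe implementation):

```typescript
// Hypothetical helper behind a digest-truncating display pipe: shortens the
// hex portion for display while keeping the algorithm prefix intact.
function truncateDigest(digest: string, length = 12): string {
  const colon = digest.indexOf(':');
  const algo = colon >= 0 ? digest.slice(0, colon) : '';
  const hex = colon >= 0 ? digest.slice(colon + 1) : digest;
  const short = hex.length > length ? `${hex.slice(0, length)}…` : hex;
  return algo ? `${algo}:${short}` : short;
}

truncateDigest('sha256:9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08');
// → 'sha256:9f86d081884c…'
```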
## Telemetry (Required Instrumentation)

| Metric | Description |
|--------|-------------|
| `triage.ttfs` | Time from notification click to verdict banner rendered |
| `triage.time_to_proof` | Time from chip click to proof preview shown |
| `triage.mute_reversal_rate` | % of auto-muted findings that later become actionable |
| `triage.bundle_export_latency` | Evidence bundle export time |
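For `triage.ttfs` (T029), the span runs from the notification click to the verdict banner render. A minimal one-shot timer sketch follows; the reporter callback stands in for the real telemetry client, which this doc does not name.

```typescript
// Minimal TTFS timer sketch: measure click-to-banner and report once.
class TtfsTimer {
  private startedAt: number | null = null;

  begin(): void {
    // Call when the user clicks through from the notification.
    this.startedAt = performance.now();
  }

  end(report: (elapsedMs: number) => void): number | null {
    // Call once the verdict banner has rendered; one-shot to avoid
    // double-counting if the banner re-renders.
    if (this.startedAt === null) return null;
    const elapsed = performance.now() - this.startedAt;
    this.startedAt = null;
    report(elapsed);
    return elapsed;
  }
}
```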
## Acceptance Criteria

1. **Default Quiet**: The findings list shows only non-gated (actionable) findings by default
2. **One-Click Review**: A single click toggles to the Review lane showing all gated findings
3. **Bucket Visibility**: Gated bucket counts are always visible and clickable to filter
4. **Breadcrumb Navigation**: Click-through from image to call-path works end-to-end
5. **Decision Persistence**: Mute/ack/exception decisions persist and show an undo toast
6. **Evidence Export**: The bundle downloads within 5 seconds for typical findings
7. **Accessibility**: Keyboard navigation and high-contrast mode are functional
8. **Performance**: The findings list renders in <2 s for 1,000 findings (virtual scroll)
## Test Cases

### Unit Tests

- Lane toggle emits correct events
- Bucket chips render correct counts
- Breadcrumb renders all path segments
- Decision drawer validates required fields
- Export button shows progress state

### Integration Tests

- Lane toggle filters API calls correctly
- Bucket click applies the gating reason filter
- Decision submission calls the approval API
- Export triggers the bundle download

### E2E Tests

- Full workflow: view findings → toggle lane → select finding → view breadcrumb → export evidence
- Approval workflow: select finding → open drawer → submit decision → verify toast → verify persistence
## Decisions & Risks

| Decision | Rationale |
|----------|-----------|
| Default to Quiet lane | Reduces noise per the advisory; Review is always one click away |
| Breadcrumb as a separate component | Reusable across finding detail and evidence views |
| Virtual scroll for the table | Performance requirement for large finding sets |

| Risk | Mitigation |
|------|------------|
| API latency for gated buckets | Cache the bucket summary; refresh on lane toggle |
| Complex breadcrumb state | Use route params for deep-linking support |
| Bundle export timeout | Async job with polling; show progress |
## References

- **UX Guide**: `docs/ux/TRIAGE_UX_GUIDE.md`
- **Backend Contracts**: `src/Scanner/StellaOps.Scanner.WebService/Contracts/GatingContracts.cs`
- **Approval API**: `src/Scanner/StellaOps.Scanner.WebService/Endpoints/ApprovalEndpoints.cs`
- **Archived Advisory**: `docs-archived/product-advisories/06-Jan-2026 - Quiet-by-Default Triage with Attested Exceptions.md`
## Execution Log

| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from validated product advisory |