save progress

StellaOps Bot
2026-01-06 09:42:02 +02:00
parent 94d68bee8b
commit 37e11918e0
443 changed files with 85863 additions and 897 deletions


@@ -1,541 +0,0 @@
# Sprint 20260105_001_001_BINDEX - Semantic Diffing Phase 1: IR-Level Semantic Analysis
## Topic & Scope
Enhance the BinaryIndex module to leverage B2R2's Intermediate Representation (IR) for semantic-level function comparison, moving beyond instruction-byte normalization to true semantic matching that is resilient to compiler optimizations, instruction reordering, and register allocation differences.
**Advisory Reference:** Product advisory on semantic diffing breakthrough capabilities (Jan 2026)
**Key Insight:** Current implementation normalizes instruction bytes and computes CFG hashes, but does not lift to B2R2's LowUIR/SSA form for semantic analysis. This limits accuracy on optimized/obfuscated binaries by ~15-20%.
**Working directory:** `src/BinaryIndex/`
**Evidence:** New `StellaOps.BinaryIndex.Semantic` library, updated fingerprint generators, integration tests.
---
## Dependencies & Concurrency
| Dependency | Type | Status |
|------------|------|--------|
| B2R2 v0.9.1+ | Package | Available |
| StellaOps.BinaryIndex.Disassembly | Internal | Stable |
| StellaOps.BinaryIndex.Fingerprints | Internal | Stable |
| StellaOps.BinaryIndex.DeltaSig | Internal | Stable |
**Parallel Execution:** Tasks SEMD-001 through SEMD-004 can proceed in parallel. SEMD-005+ depend on foundation work.
---
## Documentation Prerequisites
- `docs/modules/binary-index/architecture.md`
- `docs/modules/binary-index/README.md`
- B2R2 documentation: https://b2r2.org/
- SemDiff paper: https://arxiv.org/abs/2308.01463
---
## Problem Analysis
### Current State
```
Binary Input
|
v
B2R2 Disassembly --> Raw Instructions
|
v
Normalization Pipeline --> Normalized Bytes (position-independent)
|
v
Hash Generation --> BasicBlockHash, CfgHash, StringRefsHash
|
v
Fingerprint Matching --> Similarity Score
```
**Limitations:**
1. **Instruction-level comparison** - Sensitive to register allocation changes
2. **No semantic lifting** - Cannot detect equivalent operations with different instructions
3. **Optimization blindness** - Loop unrolling, inlining, constant propagation break matches
4. **Basic CFG hashing** - Edge counts/hashes miss semantic equivalence
### Target State
```
Binary Input
|
v
B2R2 Disassembly --> Raw Instructions
|
v
B2R2 IR Lifting --> LowUIR Statements
|
v
SSA Transformation --> SSA Form (optional)
|
v
Semantic Graph Extraction --> Key-Semantics Graph (KSG)
|
v
Graph Fingerprinting --> Semantic Fingerprint
|
v
Graph Isomorphism Check --> Semantic Similarity Score
```
---
## Architecture Design
### New Components
#### 1. IR Lifting Service
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/IrLiftingService.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface IIrLiftingService
{
    /// <summary>
    /// Lift disassembled instructions to B2R2 LowUIR.
    /// </summary>
    Task<LiftedFunction> LiftToIrAsync(
        DisassembledFunction function,
        LiftOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Transform IR to SSA form for dataflow analysis.
    /// </summary>
    Task<SsaFunction> TransformToSsaAsync(
        LiftedFunction lifted,
        CancellationToken ct = default);
}

public sealed record LiftedFunction(
    string Name,
    ulong Address,
    ImmutableArray<IrStatement> Statements,
    ImmutableArray<IrBasicBlock> BasicBlocks,
    ControlFlowGraph Cfg);

public sealed record SsaFunction(
    string Name,
    ulong Address,
    ImmutableArray<SsaStatement> Statements,
    ImmutableArray<SsaBasicBlock> BasicBlocks,
    DefUseChains DefUse);
```
#### 2. Semantic Graph Extractor
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticGraphExtractor.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticGraphExtractor
{
    /// <summary>
    /// Extract key-semantics graph from lifted IR.
    /// Captures: data dependencies, control dependencies, memory operations.
    /// </summary>
    Task<KeySemanticsGraph> ExtractGraphAsync(
        LiftedFunction function,
        GraphExtractionOptions? options = null,
        CancellationToken ct = default);
}

public sealed record KeySemanticsGraph(
    string FunctionName,
    ImmutableArray<SemanticNode> Nodes,
    ImmutableArray<SemanticEdge> Edges,
    GraphProperties Properties);

public sealed record SemanticNode(
    int Id,
    SemanticNodeType Type,            // Compute, Load, Store, Branch, Call, Return
    string Operation,                 // add, mul, cmp, etc.
    ImmutableArray<string> Operands);

public sealed record SemanticEdge(
    int SourceId,
    int TargetId,
    SemanticEdgeType Type);           // DataDep, ControlDep, MemoryDep

public enum SemanticNodeType { Compute, Load, Store, Branch, Call, Return, Phi }

public enum SemanticEdgeType { DataDependency, ControlDependency, MemoryDependency }
```
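
For intuition, here is the graph such an extractor would produce for a trivial `t = a + b; *p = t` sequence (hypothetical values, built directly against the records above). Nothing in the graph records which registers the compiler chose, which is the point of the representation:

```csharp
using System.Collections.Immutable;
using StellaOps.BinaryIndex.Semantic;

// Two loads feed one compute node, which feeds one store. Renaming the
// registers or reordering independent instructions leaves this graph intact.
var nodes = ImmutableArray.Create(
    new SemanticNode(0, SemanticNodeType.Load, "load", ["a"]),
    new SemanticNode(1, SemanticNodeType.Load, "load", ["b"]),
    new SemanticNode(2, SemanticNodeType.Compute, "add", ["t0", "t1"]),
    new SemanticNode(3, SemanticNodeType.Store, "store", ["t2"]));

var edges = ImmutableArray.Create(
    new SemanticEdge(0, 2, SemanticEdgeType.DataDependency),
    new SemanticEdge(1, 2, SemanticEdgeType.DataDependency),
    new SemanticEdge(2, 3, SemanticEdgeType.DataDependency));
```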
#### 3. Semantic Fingerprint Generator
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticFingerprintGenerator.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticFingerprintGenerator
{
    /// <summary>
    /// Generate semantic fingerprint from key-semantics graph.
    /// </summary>
    Task<SemanticFingerprint> GenerateAsync(
        KeySemanticsGraph graph,
        SemanticFingerprintOptions? options = null,
        CancellationToken ct = default);
}

public sealed record SemanticFingerprint(
    string FunctionName,
    byte[] GraphHash,                 // 32-byte SHA-256 of canonical graph
    byte[] OperationHash,             // Hash of operation sequence
    byte[] DataFlowHash,              // Hash of data dependency patterns
    int NodeCount,
    int EdgeCount,
    int CyclomaticComplexity,
    ImmutableArray<string> ApiCalls,  // External calls (semantic anchors)
    SemanticFingerprintAlgorithm Algorithm);

public enum SemanticFingerprintAlgorithm
{
    KsgV1,              // Key-Semantics Graph v1
    WeisfeilerLehman,   // WL graph hashing
    GraphletCounting    // Graphlet-based similarity
}
```
#### 4. Semantic Matcher
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/SemanticMatcher.cs
namespace StellaOps.BinaryIndex.Semantic;

public interface ISemanticMatcher
{
    /// <summary>
    /// Compute semantic similarity between two functions.
    /// </summary>
    Task<SemanticMatchResult> MatchAsync(
        SemanticFingerprint a,
        SemanticFingerprint b,
        MatchOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Find best matches for a function in a corpus.
    /// </summary>
    Task<ImmutableArray<SemanticMatchResult>> FindMatchesAsync(
        SemanticFingerprint query,
        IAsyncEnumerable<SemanticFingerprint> corpus,
        decimal minSimilarity = 0.7m,
        int maxResults = 10,
        CancellationToken ct = default);
}

public sealed record SemanticMatchResult(
    string FunctionA,
    string FunctionB,
    decimal OverallSimilarity,
    decimal GraphSimilarity,
    decimal DataFlowSimilarity,
    decimal ApiCallSimilarity,
    MatchConfidence Confidence,
    ImmutableArray<MatchDelta> Deltas);   // What changed

public enum MatchConfidence { VeryHigh, High, Medium, Low, VeryLow }

public sealed record MatchDelta(
    DeltaType Type,
    string Description,
    decimal Impact);

public enum DeltaType { NodeAdded, NodeRemoved, EdgeAdded, EdgeRemoved, OperationChanged }
```
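
Taken together, the four services form a linear pipeline. A hedged sketch of the expected composition (the wrapper class and its wiring are illustrative, not part of the contracts above):

```csharp
// Sketch only: lift -> extract -> fingerprint, then match two fingerprints.
public sealed class SemanticDiffPipeline(
    IIrLiftingService lifter,
    ISemanticGraphExtractor extractor,
    ISemanticFingerprintGenerator generator,
    ISemanticMatcher matcher)
{
    public async Task<SemanticMatchResult> CompareAsync(
        DisassembledFunction a,
        DisassembledFunction b,
        CancellationToken ct = default)
    {
        var fpA = await FingerprintAsync(a, ct);
        var fpB = await FingerprintAsync(b, ct);
        return await matcher.MatchAsync(fpA, fpB, ct: ct);
    }

    private async Task<SemanticFingerprint> FingerprintAsync(
        DisassembledFunction function, CancellationToken ct)
    {
        var lifted = await lifter.LiftToIrAsync(function, ct: ct);
        var graph = await extractor.ExtractGraphAsync(lifted, ct: ct);
        return await generator.GenerateAsync(graph, ct: ct);
    }
}
```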
---
## Delivery Tracker
| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | SEMD-001 | TODO | - | Guild | Create `StellaOps.BinaryIndex.Semantic` project structure |
| 2 | SEMD-002 | TODO | - | Guild | Define IR model types (IrStatement, IrBasicBlock, IrOperand) |
| 3 | SEMD-003 | TODO | - | Guild | Define semantic graph model types (KeySemanticsGraph, SemanticNode, SemanticEdge) |
| 4 | SEMD-004 | TODO | - | Guild | Define SemanticFingerprint and matching result types |
| 5 | SEMD-005 | TODO | SEMD-001,002 | Guild | Implement B2R2 IR lifting adapter (LowUIR extraction) |
| 6 | SEMD-006 | TODO | SEMD-005 | Guild | Implement SSA transformation (optional dataflow analysis) |
| 7 | SEMD-007 | TODO | SEMD-003,005 | Guild | Implement KeySemanticsGraph extractor from IR |
| 8 | SEMD-008 | TODO | SEMD-004,007 | Guild | Implement graph canonicalization for deterministic hashing |
| 9 | SEMD-009 | TODO | SEMD-008 | Guild | Implement Weisfeiler-Lehman graph hashing |
| 10 | SEMD-010 | TODO | SEMD-009 | Guild | Implement SemanticFingerprintGenerator |
| 11 | SEMD-011 | TODO | SEMD-010 | Guild | Implement SemanticMatcher with weighted similarity |
| 12 | SEMD-012 | TODO | SEMD-011 | Guild | Integrate semantic fingerprints into PatchDiffEngine |
| 13 | SEMD-013 | TODO | SEMD-012 | Guild | Integrate semantic fingerprints into DeltaSignatureGenerator |
| 14 | SEMD-014 | TODO | SEMD-010 | Guild | Unit tests: IR lifting correctness |
| 15 | SEMD-015 | TODO | SEMD-010 | Guild | Unit tests: Graph extraction determinism |
| 16 | SEMD-016 | TODO | SEMD-011 | Guild | Unit tests: Semantic matching accuracy |
| 17 | SEMD-017 | TODO | SEMD-013 | Guild | Integration tests: End-to-end semantic diffing |
| 18 | SEMD-018 | TODO | SEMD-017 | Guild | Golden corpus: Create test binaries with known semantic equivalences |
| 19 | SEMD-019 | TODO | SEMD-018 | Guild | Benchmark: Compare accuracy vs. instruction-level matching |
| 20 | SEMD-020 | TODO | SEMD-019 | Guild | Documentation: Update architecture.md with semantic diffing |
---
## Task Details
### SEMD-001: Create Project Structure
Create new library project for semantic analysis:
```
src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/
  StellaOps.BinaryIndex.Semantic.csproj
  IrLiftingService.cs
  SemanticGraphExtractor.cs
  SemanticFingerprintGenerator.cs
  SemanticMatcher.cs
  Models/
    IrModels.cs
    GraphModels.cs
    FingerprintModels.cs
    MatchModels.cs
  Internal/
    B2R2IrAdapter.cs
    GraphCanonicalizer.cs
    WeisfeilerLehmanHasher.cs
```
**Acceptance Criteria:**
- [ ] Project builds successfully
- [ ] References StellaOps.BinaryIndex.Disassembly
- [ ] References B2R2.FrontEnd.BinLifter
---
### SEMD-005: Implement B2R2 IR Lifting Adapter
Leverage B2R2's BinLifter to lift raw instructions to LowUIR:
```csharp
internal sealed class B2R2IrAdapter : IIrLiftingService
{
    public Task<LiftedFunction> LiftToIrAsync(
        DisassembledFunction function,
        LiftOptions? options = null,
        CancellationToken ct = default)
    {
        // Lifting is CPU-bound and synchronous; wrap the result rather than
        // declaring the method async with no awaits.
        var handle = BinHandle.FromBytes(
            function.Architecture.ToB2R2Isa(),
            function.RawBytes);
        var lifter = LowUIRHelper.init(handle);

        var statements = new List<IrStatement>();
        foreach (var instr in function.Instructions)
        {
            ct.ThrowIfCancellationRequested();
            var stmts = LowUIRHelper.translateInstr(lifter, instr.Address);
            statements.AddRange(ConvertStatements(stmts));
        }

        var cfg = BuildControlFlowGraph(statements, function.StartAddress);
        return Task.FromResult(new LiftedFunction(
            function.Name,
            function.StartAddress,
            [.. statements],
            ExtractBasicBlocks(cfg),
            cfg));
    }
}
```
**Acceptance Criteria:**
- [ ] Successfully lifts x64 instructions to IR
- [ ] Successfully lifts ARM64 instructions to IR
- [ ] CFG is correctly constructed
- [ ] Memory operations are properly modeled
---
### SEMD-007: Implement Key-Semantics Graph Extractor
Extract semantic graph capturing:
- **Computation nodes**: Arithmetic, logic, comparison operations
- **Memory nodes**: Load/store operations with abstract addresses
- **Control nodes**: Branches, calls, returns
- **Data dependency edges**: Def-use chains
- **Control dependency edges**: Branch->target relationships
```csharp
internal sealed class KeySemanticsGraphExtractor : ISemanticGraphExtractor
{
    public Task<KeySemanticsGraph> ExtractGraphAsync(
        LiftedFunction function,
        GraphExtractionOptions? options = null,
        CancellationToken ct = default)
    {
        var nodes = new List<SemanticNode>();
        var edges = new List<SemanticEdge>();
        var defMap = new Dictionary<string, int>(); // Variable -> defining node
        var nodeId = 0;

        foreach (var stmt in function.Statements)
        {
            ct.ThrowIfCancellationRequested();
            var node = CreateNode(ref nodeId, stmt);
            nodes.Add(node);

            // Add data dependency edges
            foreach (var use in GetUses(stmt))
            {
                if (defMap.TryGetValue(use, out var defNode))
                {
                    edges.Add(new SemanticEdge(defNode, node.Id, SemanticEdgeType.DataDependency));
                }
            }

            // Track definitions
            foreach (var def in GetDefs(stmt))
            {
                defMap[def] = node.Id;
            }
        }

        // Add control dependency edges from CFG
        AddControlDependencies(function.Cfg, nodes, edges);

        return Task.FromResult(new KeySemanticsGraph(
            function.Name,
            [.. nodes],
            [.. edges],
            ComputeProperties(nodes, edges)));
    }
}
```
---
### SEMD-009: Implement Weisfeiler-Lehman Graph Hashing
WL hashing provides stable graph fingerprints:
```csharp
internal sealed class WeisfeilerLehmanHasher
{
    private readonly int _iterations;

    public WeisfeilerLehmanHasher(int iterations = 3)
    {
        _iterations = iterations;
    }

    public byte[] ComputeHash(KeySemanticsGraph graph)
    {
        // Initialize labels from node types
        var labels = graph.Nodes.ToDictionary(
            n => n.Id,
            n => ComputeNodeLabel(n));

        // WL iteration
        for (var i = 0; i < _iterations; i++)
        {
            var newLabels = new Dictionary<int, string>();
            foreach (var node in graph.Nodes)
            {
                var neighbors = graph.Edges
                    .Where(e => e.SourceId == node.Id || e.TargetId == node.Id)
                    .Select(e => e.SourceId == node.Id ? e.TargetId : e.SourceId)
                    .OrderBy(id => labels[id])
                    .ToList();
                var multiset = string.Join(",", neighbors.Select(id => labels[id]));
                var newLabel = ComputeLabel(labels[node.Id], multiset);
                newLabels[node.Id] = newLabel;
            }
            labels = newLabels;
        }

        // Compute final hash from sorted labels
        var sortedLabels = labels.Values.OrderBy(l => l).ToList();
        var combined = string.Join("|", sortedLabels);
        return SHA256.HashData(Encoding.UTF8.GetBytes(combined));
    }
}
```
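
`ComputeNodeLabel` and `ComputeLabel` are left undefined above; a minimal deterministic choice (an assumption, not a settled design) is to compress label strings into short SHA-256 digests, deliberately excluding operand names so register renaming cannot perturb the labels:

```csharp
// Hypothetical label helpers for the hasher above.
private static string ComputeNodeLabel(SemanticNode node) =>
    Compress($"{node.Type}:{node.Operation}");   // operands intentionally omitted

private static string ComputeLabel(string ownLabel, string neighborMultiset) =>
    Compress($"{ownLabel}({neighborMultiset})");

private static string Compress(string value) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(value)))[..16];
```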
---
## Testing Strategy
### Unit Tests
| Test Class | Coverage |
|------------|----------|
| `IrLiftingServiceTests` | IR lifting correctness per architecture |
| `SemanticGraphExtractorTests` | Graph construction, edge types, node types |
| `GraphCanonicalizerTests` | Deterministic ordering |
| `WeisfeilerLehmanHasherTests` | Hash stability, collision resistance |
| `SemanticMatcherTests` | Similarity scoring accuracy |
### Integration Tests
| Test Class | Coverage |
|------------|----------|
| `EndToEndSemanticDiffTests` | Full pipeline from binary to match result |
| `OptimizationResilienceTests` | Same source, different optimization levels |
| `CompilerVariantTests` | Same source, GCC vs Clang |
### Golden Corpus
Create test binaries from known C source with variations:
- `test_func_O0.o` - No optimization
- `test_func_O2.o` - Standard optimization
- `test_func_O3.o` - Aggressive optimization
- `test_func_clang.o` - Different compiler
All should match semantically despite instruction differences.
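A hedged xUnit sketch of that expectation (`LoadAndFingerprintAsync`, `_matcher`, and the 0.85m threshold are test-harness placeholders):

```csharp
[Theory]
[InlineData("test_func_O0.o", "test_func_O2.o")]
[InlineData("test_func_O0.o", "test_func_O3.o")]
[InlineData("test_func_O0.o", "test_func_clang.o")]
public async Task CorpusVariants_MatchSemantically(string left, string right)
{
    var fpLeft = await LoadAndFingerprintAsync(left);
    var fpRight = await LoadAndFingerprintAsync(right);

    var result = await _matcher.MatchAsync(fpLeft, fpRight);

    Assert.True(result.OverallSimilarity >= 0.85m,
        $"{left} vs {right}: got {result.OverallSimilarity}");
}
```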
---
## Success Metrics
| Metric | Current | Target |
|--------|---------|--------|
| Semantic match accuracy (optimized binaries) | ~65% | 85%+ |
| False positive rate | ~5% | <2% |
| Match latency (per function) | N/A | <50ms |
| Memory per function | N/A | <10MB |
---
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |
---
## Decisions & Risks
| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| B2R2 IR coverage may be incomplete for some instructions | Risk | Fallback to instruction-level matching for unsupported operations |
| WL hashing may produce collisions for small functions | Risk | Combine with operation hash and API call hash |
| SSA transformation adds latency | Trade-off | Make SSA optional, use for high-confidence matching only |
| Graph size explosion for large functions | Risk | Limit node count, use sampling for very large functions |
---
## Next Checkpoints
- 2026-01-10: SEMD-001 through SEMD-004 (project structure, models) complete
- 2026-01-17: SEMD-005 through SEMD-010 (core implementation) complete
- 2026-01-24: SEMD-011 through SEMD-020 (integration, testing, benchmarks) complete


@@ -1,592 +0,0 @@
# Sprint 20260105_001_002_BINDEX - Semantic Diffing Phase 2: Function Behavior Corpus
## Topic & Scope
Build a comprehensive function behavior corpus (similar to Ghidra's BSim/FunctionID) containing fingerprints of known library functions across multiple versions and architectures. This enables identification of functions in stripped binaries by matching against a large corpus of pre-indexed function behaviors.
**Advisory Reference:** Product advisory on semantic diffing - BSim behavioral similarity against large signature sets.
**Key Insight:** Current delta signatures are CVE-specific. A large pre-built corpus of "known good" function behaviors enables identifying functions like "this is `memcpy` from glibc 2.31" even in stripped binaries, which is critical for accurate vulnerability attribution.
**Working directory:** `src/BinaryIndex/`
**Evidence:** New `StellaOps.BinaryIndex.Corpus` library, corpus ingestion pipeline, PostgreSQL corpus schema.
---
## Dependencies & Concurrency
| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Required for semantic fingerprints |
| StellaOps.BinaryIndex.Semantic | Internal | From Phase 1 |
| PostgreSQL | Infrastructure | Available |
| Package mirrors (Debian, Alpine, RHEL) | External | Available |
**Parallel Execution:** Corpus connector development (CORP-005-008) can proceed in parallel after CORP-004.
---
## Documentation Prerequisites
- `docs/modules/binary-index/architecture.md`
- Phase 1 sprint: `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- Ghidra BSim documentation: https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/BSimServerAPI.html
---
## Problem Analysis
### Current State
- Delta signatures are generated on-demand for specific CVEs
- No pre-built corpus of common library functions
- Cannot identify functions by behavior alone (requires symbols or prior CVE signature)
- Stripped binaries fall back to weaker Build-ID/hash matching
### Target State
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Function Behavior Corpus │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Corpus Ingestion Layer │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ GlibcCorpus │ │ OpenSSLCorpus│ │ zlibCorpus │ ... │ │
│ │ │ Connector │ │ Connector │ │ Connector │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ v │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Fingerprint Generation │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ Instruction │ │ Semantic │ │ API Call │ │ │
│ │ │ Fingerprint │ │ Fingerprint │ │ Fingerprint │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ v │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Corpus Storage (PostgreSQL) │ │
│ │ │ │
│ │ corpus.libraries - Known libraries (glibc, openssl, etc.) │ │
│ │ corpus.library_versions - Version snapshots │ │
│ │ corpus.functions - Function metadata │ │
│ │ corpus.fingerprints - Fingerprint index (semantic + instruction) │ │
│ │ corpus.function_clusters - Similar function groups │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ v │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Query Layer │ │
│ │ │ │
│ │ ICorpusQueryService.IdentifyFunctionAsync(fingerprint) │ │
│ │ -> Returns: [{library: "glibc", version: "2.31", name: "memcpy"}] │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Architecture Design
### Database Schema
```sql
-- Corpus schema for function behavior database
CREATE SCHEMA IF NOT EXISTS corpus;

-- Known libraries tracked in corpus
CREATE TABLE corpus.libraries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL UNIQUE,             -- glibc, openssl, zlib, curl
    description TEXT,
    homepage_url TEXT,
    source_repo TEXT,                      -- git URL
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Library versions indexed
CREATE TABLE corpus.library_versions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    version TEXT NOT NULL,                 -- 2.31, 1.1.1n, 1.2.13
    release_date DATE,
    is_security_release BOOLEAN DEFAULT false,
    source_archive_sha256 TEXT,            -- Hash of source tarball
    indexed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_id, version)
);

-- Architecture variants
CREATE TABLE corpus.build_variants (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_version_id UUID NOT NULL REFERENCES corpus.library_versions(id),
    architecture TEXT NOT NULL,            -- x86_64, aarch64, armv7
    abi TEXT,                              -- gnu, musl, msvc
    compiler TEXT,                         -- gcc, clang
    compiler_version TEXT,
    optimization_level TEXT,               -- O0, O2, O3, Os
    build_id TEXT,                         -- ELF Build-ID if available
    binary_sha256 TEXT NOT NULL,
    indexed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_version_id, architecture, abi, compiler, optimization_level)
);

-- Functions in corpus
CREATE TABLE corpus.functions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    build_variant_id UUID NOT NULL REFERENCES corpus.build_variants(id),
    name TEXT NOT NULL,                    -- Function name (may be mangled)
    demangled_name TEXT,                   -- Demangled C++ name
    address BIGINT NOT NULL,
    size_bytes INTEGER NOT NULL,
    is_exported BOOLEAN DEFAULT false,
    is_inline BOOLEAN DEFAULT false,
    source_file TEXT,                      -- Source file if debug info
    source_line INTEGER,
    UNIQUE (build_variant_id, name, address)
);

-- Function fingerprints (multiple algorithms per function)
CREATE TABLE corpus.fingerprints (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    algorithm TEXT NOT NULL,               -- semantic_ksg, instruction_bb, cfg_wl
    fingerprint BYTEA NOT NULL,            -- Variable length depending on algorithm
    fingerprint_hex TEXT GENERATED ALWAYS AS (encode(fingerprint, 'hex')) STORED,
    metadata JSONB,                        -- Algorithm-specific metadata
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (function_id, algorithm)
);

-- Indexes for fast fingerprint lookup
CREATE INDEX idx_fingerprints_algorithm_hex ON corpus.fingerprints(algorithm, fingerprint_hex);
CREATE INDEX idx_fingerprints_bytea ON corpus.fingerprints USING hash (fingerprint);

-- Function clusters (similar functions across versions)
CREATE TABLE corpus.function_clusters (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    canonical_name TEXT NOT NULL,          -- e.g., "memcpy" across all versions
    description TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (library_id, canonical_name)
);

-- Cluster membership
CREATE TABLE corpus.cluster_members (
    cluster_id UUID NOT NULL REFERENCES corpus.function_clusters(id),
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    similarity_to_centroid DECIMAL(5,4),
    PRIMARY KEY (cluster_id, function_id)
);

-- CVE associations (which functions are affected by which CVEs)
CREATE TABLE corpus.function_cves (
    function_id UUID NOT NULL REFERENCES corpus.functions(id),
    cve_id TEXT NOT NULL,
    affected_state TEXT NOT NULL,          -- vulnerable, fixed, not_affected
    patch_commit TEXT,                     -- Git commit that fixed
    confidence DECIMAL(3,2) NOT NULL,
    evidence_type TEXT,                    -- changelog, commit, advisory
    PRIMARY KEY (function_id, cve_id)
);

-- Ingestion job tracking
CREATE TABLE corpus.ingestion_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    library_id UUID NOT NULL REFERENCES corpus.libraries(id),
    job_type TEXT NOT NULL,                -- full_ingest, incremental, cve_update
    status TEXT NOT NULL DEFAULT 'pending',
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    functions_indexed INTEGER,
    errors JSONB,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
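
For orientation, an exact-match lookup against this schema could look like the following (Npgsql sketch; `connectionString`, `semanticHash`, and `ct` are assumed to come from the caller, and the lowercasing matches Postgres `encode(..., 'hex')` output):

```csharp
using Npgsql;

await using var conn = new NpgsqlConnection(connectionString);
await conn.OpenAsync(ct);

// Served by idx_fingerprints_algorithm_hex defined above.
await using var cmd = new NpgsqlCommand(
    """
    SELECT l.name, lv.version, f.name
    FROM corpus.fingerprints fp
    JOIN corpus.functions f         ON f.id = fp.function_id
    JOIN corpus.build_variants bv   ON bv.id = f.build_variant_id
    JOIN corpus.library_versions lv ON lv.id = bv.library_version_id
    JOIN corpus.libraries l         ON l.id = lv.library_id
    WHERE fp.algorithm = @alg AND fp.fingerprint_hex = @hex
    """, conn);
cmd.Parameters.AddWithValue("alg", "semantic_ksg");
cmd.Parameters.AddWithValue("hex", Convert.ToHexString(semanticHash).ToLowerInvariant());

await using var reader = await cmd.ExecuteReaderAsync(ct);
while (await reader.ReadAsync(ct))
{
    // e.g. "glibc 2.31: memcpy"
    Console.WriteLine($"{reader.GetString(0)} {reader.GetString(1)}: {reader.GetString(2)}");
}
```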
### Core Interfaces
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/ICorpusIngestionService.cs
namespace StellaOps.BinaryIndex.Corpus;

public interface ICorpusIngestionService
{
    /// <summary>
    /// Ingest all functions from a library binary.
    /// </summary>
    Task<IngestionResult> IngestLibraryAsync(
        LibraryMetadata metadata,
        Stream binaryStream,
        IngestionOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Ingest a specific version range.
    /// </summary>
    Task<ImmutableArray<IngestionResult>> IngestVersionRangeAsync(
        string libraryName,
        VersionRange range,
        IAsyncEnumerable<LibraryBinary> binaries,
        CancellationToken ct = default);
}

public sealed record LibraryMetadata(
    string Name,
    string Version,
    string Architecture,
    string? Abi,
    string? Compiler,
    string? OptimizationLevel);

public sealed record IngestionResult(
    Guid JobId,
    string LibraryName,
    string Version,
    int FunctionsIndexed,
    int FingerprintsGenerated,
    ImmutableArray<string> Errors);
```
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/ICorpusQueryService.cs
namespace StellaOps.BinaryIndex.Corpus;

public interface ICorpusQueryService
{
    /// <summary>
    /// Identify a function by its fingerprint.
    /// </summary>
    Task<ImmutableArray<FunctionMatch>> IdentifyFunctionAsync(
        FunctionFingerprints fingerprints,
        IdentifyOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Get all functions associated with a CVE.
    /// </summary>
    Task<ImmutableArray<CorpusFunction>> GetFunctionsForCveAsync(
        string cveId,
        CancellationToken ct = default);

    /// <summary>
    /// Get function evolution across versions.
    /// </summary>
    Task<FunctionEvolution> GetFunctionEvolutionAsync(
        string libraryName,
        string functionName,
        CancellationToken ct = default);
}

public sealed record FunctionFingerprints(
    byte[]? SemanticHash,
    byte[]? InstructionHash,
    byte[]? CfgHash,
    ImmutableArray<string>? ApiCalls);

public sealed record FunctionMatch(
    string LibraryName,
    string Version,
    string FunctionName,
    decimal Similarity,
    MatchConfidence Confidence,
    string? CveStatus,                     // null if not CVE-affected
    ImmutableArray<string> AffectedCves);

public sealed record FunctionEvolution(
    string LibraryName,
    string FunctionName,
    ImmutableArray<VersionSnapshot> Versions);

public sealed record VersionSnapshot(
    string Version,
    int SizeBytes,
    string FingerprintHex,
    ImmutableArray<string> CveChanges);    // CVEs fixed/introduced in this version
```
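
A usage sketch for the identification path (the fingerprint values come from the Phase 1 pipeline; only the hashes that are actually available need to be supplied):

```csharp
var fingerprints = new FunctionFingerprints(
    SemanticHash: semanticFp.GraphHash,
    InstructionHash: null,
    CfgHash: null,
    ApiCalls: semanticFp.ApiCalls);

var matches = await corpusQuery.IdentifyFunctionAsync(fingerprints, ct: ct);
foreach (var m in matches.Where(m =>
    m.Confidence is MatchConfidence.High or MatchConfidence.VeryHigh))
{
    // e.g. "memcpy (glibc 2.31), similarity 0.97"
    Console.WriteLine($"{m.FunctionName} ({m.LibraryName} {m.Version}), similarity {m.Similarity}");
}
```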
### Library Connectors
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Corpus/Connectors/ILibraryCorpusConnector.cs
namespace StellaOps.BinaryIndex.Corpus.Connectors;

public interface ILibraryCorpusConnector
{
    string LibraryName { get; }
    string[] SupportedArchitectures { get; }

    /// <summary>
    /// Get available versions from source.
    /// </summary>
    Task<ImmutableArray<string>> GetAvailableVersionsAsync(CancellationToken ct);

    /// <summary>
    /// Download and extract library binary for a version.
    /// </summary>
    Task<LibraryBinary> FetchBinaryAsync(
        string version,
        string architecture,
        string? abi = null,
        CancellationToken ct = default);
}

// Implementations:
// - GlibcCorpusConnector (GNU C Library)
// - OpenSslCorpusConnector (OpenSSL/LibreSSL/BoringSSL)
// - ZlibCorpusConnector (zlib/zlib-ng)
// - CurlCorpusConnector (libcurl)
// - SqliteCorpusConnector (SQLite)
// - LibpngCorpusConnector (libpng)
// - LibjpegCorpusConnector (libjpeg-turbo)
// - LibxmlCorpusConnector (libxml2)
// - OpenJpegCorpusConnector (OpenJPEG)
// - ExpatCorpusConnector (Expat XML parser)
```
---
## Delivery Tracker
| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | CORP-001 | TODO | Phase 1 | Guild | Create `StellaOps.BinaryIndex.Corpus` project structure |
| 2 | CORP-002 | TODO | CORP-001 | Guild | Define corpus model types (LibraryMetadata, FunctionMatch, etc.) |
| 3 | CORP-003 | TODO | CORP-001 | Guild | Create PostgreSQL corpus schema (corpus.* tables) |
| 4 | CORP-004 | TODO | CORP-003 | Guild | Implement PostgreSQL corpus repository |
| 5 | CORP-005 | TODO | CORP-004 | Guild | Implement GlibcCorpusConnector |
| 6 | CORP-006 | TODO | CORP-004 | Guild | Implement OpenSslCorpusConnector |
| 7 | CORP-007 | TODO | CORP-004 | Guild | Implement ZlibCorpusConnector |
| 8 | CORP-008 | TODO | CORP-004 | Guild | Implement CurlCorpusConnector |
| 9 | CORP-009 | TODO | CORP-005-008 | Guild | Implement CorpusIngestionService |
| 10 | CORP-010 | TODO | CORP-009 | Guild | Implement batch fingerprint generation pipeline |
| 11 | CORP-011 | TODO | CORP-010 | Guild | Implement function clustering (group similar functions) |
| 12 | CORP-012 | TODO | CORP-011 | Guild | Implement CorpusQueryService |
| 13 | CORP-013 | TODO | CORP-012 | Guild | Implement CVE-to-function mapping updater |
| 14 | CORP-014 | TODO | CORP-012 | Guild | Integrate corpus queries into BinaryVulnerabilityService |
| 15 | CORP-015 | TODO | CORP-009 | Guild | Initial corpus ingestion: glibc (5 major versions x 3 archs) |
| 16 | CORP-016 | TODO | CORP-015 | Guild | Initial corpus ingestion: OpenSSL (10 versions x 3 archs) |
| 17 | CORP-017 | TODO | CORP-016 | Guild | Initial corpus ingestion: zlib, curl, sqlite |
| 18 | CORP-018 | TODO | CORP-012 | Guild | Unit tests: Corpus ingestion correctness |
| 19 | CORP-019 | TODO | CORP-012 | Guild | Unit tests: Query service accuracy |
| 20 | CORP-020 | TODO | CORP-017 | Guild | Integration tests: End-to-end function identification |
| 21 | CORP-021 | TODO | CORP-020 | Guild | Benchmark: Query latency at scale (100K+ functions) |
| 22 | CORP-022 | TODO | CORP-021 | Guild | Documentation: Corpus management guide |
---
## Task Details
### CORP-005: Implement GlibcCorpusConnector
Fetch glibc binaries from GNU mirrors and Debian/Ubuntu packages:
```csharp
internal sealed class GlibcCorpusConnector : ILibraryCorpusConnector
{
    private readonly IHttpClientFactory _httpClientFactory;
    private readonly ILogger<GlibcCorpusConnector> _logger;

    public string LibraryName => "glibc";
    public string[] SupportedArchitectures => ["x86_64", "aarch64", "armv7", "i686"];

    public async Task<ImmutableArray<string>> GetAvailableVersionsAsync(CancellationToken ct)
    {
        // Query GNU FTP mirror for available versions
        // https://ftp.gnu.org/gnu/glibc/
        var client = _httpClientFactory.CreateClient("GnuMirror");
        var html = await client.GetStringAsync("https://ftp.gnu.org/gnu/glibc/", ct);

        // Parse directory listing for glibc-X.Y.tar.gz files
        var versions = ParseVersionsFromListing(html);
        return [.. versions.OrderByDescending(v => Version.Parse(v))];
    }

    public async Task<LibraryBinary> FetchBinaryAsync(
        string version,
        string architecture,
        string? abi = null,
        CancellationToken ct = default)
    {
        // Strategy 1: Try Debian/Ubuntu package (pre-built)
        var debBinary = await TryFetchDebianPackageAsync(version, architecture, ct);
        if (debBinary is not null)
            return debBinary;

        // Strategy 2: Download source and compile with specific flags
        var sourceTarball = await DownloadSourceAsync(version, ct);
        return await CompileForArchitecture(sourceTarball, architecture, abi, ct);
    }

    private async Task<LibraryBinary?> TryFetchDebianPackageAsync(
        string version,
        string architecture,
        CancellationToken ct)
    {
        // Map glibc version to Debian package version
        // e.g., glibc 2.31 -> libc6_2.31-13+deb11u5_amd64.deb
        var packages = await QueryDebianPackagesAsync(version, architecture, ct);
        foreach (var pkg in packages)
        {
            var binary = await DownloadAndExtractDebAsync(pkg, ct);
            if (binary is not null)
                return binary;
        }
        return null;
    }
}
```
### CORP-011: Implement Function Clustering
Group semantically similar functions across versions:
```csharp
internal sealed class FunctionClusteringService
{
    private readonly ICorpusRepository _repository;
    private readonly ISemanticMatcher _matcher;

    public async Task ClusterFunctionsAsync(
        Guid libraryId,
        ClusteringOptions options,
        CancellationToken ct)
    {
        // Get all functions with semantic fingerprints
        var functions = await _repository.GetFunctionsWithFingerprintsAsync(libraryId, ct);

        // Group by canonical name (demangled, normalized)
        var groups = functions
            .GroupBy(f => NormalizeCanonicalName(f.DemangledName ?? f.Name))
            .ToList();

        foreach (var group in groups)
        {
            ct.ThrowIfCancellationRequested();

            // Create or update cluster
            var clusterId = await _repository.EnsureClusterAsync(
                libraryId,
                group.Key,
                ct);

            // Compute centroid (most common fingerprint)
            var centroid = ComputeCentroid(group);

            // Add members with similarity scores
            foreach (var function in group)
            {
                var similarity = await _matcher.MatchAsync(
                    function.SemanticFingerprint,
                    centroid,
                    ct: ct);
                await _repository.AddClusterMemberAsync(
                    clusterId,
                    function.Id,
                    similarity.OverallSimilarity,
                    ct);
            }
        }
    }

    private static string NormalizeCanonicalName(string name)
    {
        // Strip version suffixes, GLIBC_2.X annotations
        // Demangle C++ names
        // Normalize to base function name
        return CppDemangler.Demangle(name)
            .Replace("@GLIBC_", "")
            .TrimEnd("@@".ToCharArray());
    }
}
```
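
`ComputeCentroid` is referenced but not defined; one minimal reading (an assumption) is to pick the modal fingerprint of the group, so the centroid is always a real `SemanticFingerprint` that `MatchAsync` accepts (`CorpusFunctionRecord` is a hypothetical element type for the repository result):

```csharp
private static SemanticFingerprint ComputeCentroid(
    IEnumerable<CorpusFunctionRecord> group) =>
    group
        .GroupBy(f => Convert.ToHexString(f.SemanticFingerprint.GraphHash))
        .OrderByDescending(g => g.Count())   // most common graph hash wins
        .First()
        .First()
        .SemanticFingerprint;
```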
---
## Initial Corpus Coverage
### Priority Libraries (Phase 2a)
| Library | Versions | Architectures | Est. Functions | CVE Coverage |
|---------|----------|---------------|----------------|--------------|
| glibc | 2.17, 2.28, 2.31, 2.35, 2.38 | x64, arm64, armv7 | ~15,000 | 50+ CVEs |
| OpenSSL | 1.0.2, 1.1.0, 1.1.1, 3.0, 3.1 | x64, arm64 | ~8,000 | 100+ CVEs |
| zlib | 1.2.8, 1.2.11, 1.2.13, 1.3 | x64, arm64 | ~200 | 5+ CVEs |
| libcurl | 7.50-7.88 (select) | x64, arm64 | ~2,000 | 80+ CVEs |
| SQLite | 3.30-3.44 (select) | x64, arm64 | ~1,500 | 30+ CVEs |
### Extended Coverage (Phase 2b)
| Library | Est. Functions | Priority |
|---------|----------------|----------|
| libpng | ~300 | Medium |
| libjpeg-turbo | ~400 | Medium |
| libxml2 | ~1,200 | High |
| expat | ~150 | High |
| OpenJPEG | ~600 | Medium |
| freetype | ~800 | Medium |
| harfbuzz | ~500 | Low |
**Total estimated corpus size:** ~30,000 unique functions, ~100,000 fingerprints (including variants)
---
## Storage Estimates
| Component | Size Estimate |
|-----------|---------------|
| PostgreSQL tables | ~2 GB |
| Fingerprint index | ~500 MB |
| Full corpus with metadata | ~5 GB |
| Query cache (Valkey) | ~100 MB |
---
## Success Metrics
| Metric | Target |
|--------|--------|
| Function identification accuracy | 90%+ on stripped binaries |
| Query latency (p99) | <100ms |
| Corpus coverage (top 20 libs) | 80%+ of security-critical functions |
| CVE attribution accuracy | 95%+ |
| False positive rate | <3% |
---
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |
---
## Decisions & Risks
| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| Corpus size may grow large | Risk | Implement tiered storage, archive old versions |
| Package version mapping is complex | Risk | Maintain distro-version mapping tables |
| Compilation variants create explosion | Risk | Prioritize common optimization levels (O2, O3) |
| CVE mapping requires manual curation | Risk | Start with high-impact CVEs, automate with NVD data |
---
## Next Checkpoints
- 2026-01-20: CORP-001 through CORP-008 (infrastructure, connectors) complete
- 2026-01-31: CORP-009 through CORP-014 (services, integration) complete
- 2026-02-15: CORP-015 through CORP-022 (corpus ingestion, testing) complete


@@ -1,772 +0,0 @@
# Sprint 20260105_001_003_BINDEX - Semantic Diffing Phase 3: Ghidra Integration
## Topic & Scope
Integrate Ghidra as a secondary analysis backend for cases where B2R2 provides insufficient coverage or accuracy. Leverage Ghidra's mature Version Tracking, BSim, and FunctionID capabilities via headless analysis and the ghidriff Python bridge.
**Advisory Reference:** Product advisory on semantic diffing - Ghidra Version Tracking correlators, BSim behavioral similarity, ghidriff for automated patch diff workflows.
**Key Insight:** Ghidra has 15+ years of refinement in binary diffing. Rather than reimplementing, we should integrate Ghidra as a fallback/enhancement layer for:
1. Architectures B2R2 handles poorly
2. Complex obfuscation scenarios
3. Version Tracking with multiple correlators
4. BSim database queries
**Working directory:** `src/BinaryIndex/`
**Evidence:** New `StellaOps.BinaryIndex.Ghidra` library, Ghidra Headless integration, ghidriff bridge.
---
## Dependencies & Concurrency
| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Should be complete |
| SPRINT_20260105_001_002 (Corpus) | Sprint | Can run in parallel |
| Ghidra 11.x | External | Available |
| Java 17+ | Runtime | Required for Ghidra |
| Python 3.10+ | Runtime | Required for ghidriff |
| ghidriff | External | Available (pip) |
**Parallel Execution:** Ghidra Headless setup (GHID-001-004) and ghidriff integration (GHID-005-007) can proceed in parallel.
---
## Documentation Prerequisites
- `docs/modules/binary-index/architecture.md`
- Ghidra documentation: https://ghidra.re/ghidra_docs/
- Ghidra Version Tracking: https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing
- ghidriff repository: https://github.com/clearbluejar/ghidriff
- BSim documentation: https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/
---
## Problem Analysis
### Current State
- B2R2 is the sole disassembly/analysis backend
- B2R2 coverage varies by architecture (excellent x64/ARM64, limited others)
- No access to Ghidra's mature correlators and similarity engines
- Cannot leverage BSim's pre-built signature databases
### B2R2 vs Ghidra Trade-offs
| Capability | B2R2 | Ghidra |
|------------|------|--------|
| Speed | Fast (native .NET) | Slower (Java, headless startup) |
| Architecture coverage | 12+ (some limited) | 20+ (mature) |
| IR quality | Good (LowUIR) | Excellent (P-Code) |
| Decompiler | None | Excellent |
| Version Tracking | None | Mature (multiple correlators) |
| BSim | None | Full support |
| Integration | Native .NET | Process/API bridge |
### Target Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Unified Disassembly/Analysis Layer │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ IDisassemblyPlugin Selection Logic │ │
│ │ │ │
│ │ Primary: B2R2 (fast, deterministic) │ │
│ │ Fallback: Ghidra (complex cases, low B2R2 confidence) │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │ │
│ v v │
│ ┌──────────────────────────┐ ┌──────────────────────────────────────┐ │
│ │ B2R2 Backend │ │ Ghidra Backend │ │
│ │ │ │ │ │
│ │ - Native .NET │ │ ┌────────────────────────────────┐ │ │
│ │ - LowUIR lifting │ │ │ Ghidra Headless Server │ │ │
│ │ - CFG recovery │ │ │ │ │ │
│ │ - Fast fingerprinting │ │ │ - P-Code decompilation │ │ │
│ │ │ │ │ - Version Tracking │ │ │
│ └──────────────────────────┘ │ │ - BSim queries │ │ │
│ │ │ - FunctionID matching │ │ │
│ │ └────────────────────────────────┘ │ │
│ │ │ │ │
│ │ v │ │
│ │ ┌────────────────────────────────┐ │ │
│ │ │ ghidriff Bridge │ │ │
│ │ │ │ │ │
│ │ │ - Automated patch diffing │ │ │
│ │ │ - JSON/Markdown output │ │ │
│ │ │ - CI/CD integration │ │ │
│ │ └────────────────────────────────┘ │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Architecture Design
### Ghidra Headless Service
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IGhidraService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IGhidraService
{
    /// <summary>
    /// Analyze a binary using Ghidra headless.
    /// </summary>
    Task<GhidraAnalysisResult> AnalyzeAsync(
        Stream binaryStream,
        GhidraAnalysisOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Run Version Tracking between two binaries.
    /// </summary>
    Task<VersionTrackingResult> CompareVersionsAsync(
        Stream oldBinary,
        Stream newBinary,
        VersionTrackingOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Query BSim for function matches.
    /// </summary>
    Task<ImmutableArray<BSimMatch>> QueryBSimAsync(
        GhidraFunction function,
        BSimQueryOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Check if Ghidra backend is available and healthy.
    /// </summary>
    Task<bool> IsAvailableAsync(CancellationToken ct = default);
}

public sealed record GhidraAnalysisResult(
    string BinaryHash,
    ImmutableArray<GhidraFunction> Functions,
    ImmutableArray<GhidraImport> Imports,
    ImmutableArray<GhidraExport> Exports,
    ImmutableArray<GhidraString> Strings,
    GhidraMetadata Metadata);

public sealed record GhidraFunction(
    string Name,
    ulong Address,
    int Size,
    string? Signature,             // Decompiled signature
    string? DecompiledCode,        // Decompiled C code
    byte[] PCodeHash,              // P-Code semantic hash
    ImmutableArray<string> CalledFunctions,
    ImmutableArray<string> CallingFunctions);
```
### Version Tracking Integration
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IVersionTrackingService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IVersionTrackingService
{
    /// <summary>
    /// Run Ghidra Version Tracking with multiple correlators.
    /// </summary>
    Task<VersionTrackingResult> TrackVersionsAsync(
        Stream oldBinary,
        Stream newBinary,
        VersionTrackingOptions options,
        CancellationToken ct = default);
}

public sealed record VersionTrackingOptions
{
    public ImmutableArray<CorrelatorType> Correlators { get; init; } =
        [CorrelatorType.ExactBytes, CorrelatorType.ExactMnemonics,
         CorrelatorType.SymbolName, CorrelatorType.DataReference,
         CorrelatorType.CombinedReference];
    public decimal MinSimilarity { get; init; } = 0.5m;
    public bool IncludeDecompilation { get; init; } = false;
}

public enum CorrelatorType
{
    ExactBytes,         // Identical byte sequences
    ExactMnemonics,     // Identical instruction mnemonics
    SymbolName,         // Matching symbol names
    DataReference,      // Similar data references
    CombinedReference,  // Combined reference scoring
    BSim                // Behavioral similarity
}

public sealed record VersionTrackingResult(
    ImmutableArray<FunctionMatch> Matches,
    ImmutableArray<FunctionAdded> AddedFunctions,
    ImmutableArray<FunctionRemoved> RemovedFunctions,
    ImmutableArray<FunctionModified> ModifiedFunctions,
    VersionTrackingStats Statistics);

public sealed record FunctionMatch(
    string OldName,
    ulong OldAddress,
    string NewName,
    ulong NewAddress,
    decimal Similarity,
    CorrelatorType MatchedBy,
    ImmutableArray<MatchDifference> Differences);

public sealed record MatchDifference(
    DifferenceType Type,
    string Description,
    string? OldValue,
    string? NewValue);

public enum DifferenceType
{
    InstructionAdded,
    InstructionRemoved,
    InstructionChanged,
    BranchTargetChanged,
    CallTargetChanged,
    ConstantChanged,
    SizeChanged
}
```
### ghidriff Bridge
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IGhidriffBridge.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IGhidriffBridge
{
    /// <summary>
    /// Run ghidriff to compare two binaries.
    /// </summary>
    Task<GhidriffResult> DiffAsync(
        string oldBinaryPath,
        string newBinaryPath,
        GhidriffOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Generate patch diff report.
    /// </summary>
    Task<string> GenerateReportAsync(
        GhidriffResult result,
        ReportFormat format,
        CancellationToken ct = default);
}

public sealed record GhidriffOptions
{
    public string? GhidraPath { get; init; }
    public string? ProjectPath { get; init; }
    public bool IncludeDecompilation { get; init; } = true;
    public bool IncludeDisassembly { get; init; } = true;
    public ImmutableArray<string> ExcludeFunctions { get; init; } = [];
}

public sealed record GhidriffResult(
    string OldBinaryHash,
    string NewBinaryHash,
    ImmutableArray<GhidriffFunction> AddedFunctions,
    ImmutableArray<GhidriffFunction> RemovedFunctions,
    ImmutableArray<GhidriffDiff> ModifiedFunctions,
    GhidriffStats Statistics,
    string RawJsonOutput);

public sealed record GhidriffDiff(
    string FunctionName,
    string OldSignature,
    string NewSignature,
    decimal Similarity,
    string? OldDecompiled,
    string? NewDecompiled,
    ImmutableArray<string> InstructionChanges);

public enum ReportFormat { Json, Markdown, Html }
```
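
Expected call shape (file paths are illustrative):

```csharp
// Diff two library builds, then persist a Markdown report for reviewers.
var diff = await ghidriff.DiffAsync("libcrypto-old.so", "libcrypto-new.so", ct: ct);
Console.WriteLine(
    $"added: {diff.AddedFunctions.Length}, removed: {diff.RemovedFunctions.Length}, " +
    $"modified: {diff.ModifiedFunctions.Length}");

var report = await ghidriff.GenerateReportAsync(diff, ReportFormat.Markdown, ct);
await File.WriteAllTextAsync("patch-diff.md", report, ct);
```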
### BSim Integration
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/IBSimService.cs
namespace StellaOps.BinaryIndex.Ghidra;

public interface IBSimService
{
    /// <summary>
    /// Generate BSim signatures for functions.
    /// </summary>
    Task<ImmutableArray<BSimSignature>> GenerateSignaturesAsync(
        GhidraAnalysisResult analysis,
        BSimGenerationOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Query BSim database for similar functions.
    /// </summary>
    Task<ImmutableArray<BSimMatch>> QueryAsync(
        BSimSignature signature,
        BSimQueryOptions? options = null,
        CancellationToken ct = default);

    /// <summary>
    /// Ingest functions into BSim database.
    /// </summary>
    Task IngestAsync(
        string libraryName,
        string version,
        ImmutableArray<BSimSignature> signatures,
        CancellationToken ct = default);
}

public sealed record BSimSignature(
    string FunctionName,
    ulong Address,
    byte[] FeatureVector,          // BSim feature extraction
    int VectorLength,
    double SelfSignificance);      // How distinctive is this function

public sealed record BSimMatch(
    string MatchedLibrary,
    string MatchedVersion,
    string MatchedFunction,
    double Similarity,
    double Significance,
    double Confidence);

public sealed record BSimQueryOptions
{
    public double MinSimilarity { get; init; } = 0.7;
    public double MinSignificance { get; init; } = 0.0;
    public int MaxResults { get; init; } = 10;
    public ImmutableArray<string> TargetLibraries { get; init; } = [];
}
```
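
A query-side sketch (the significance cutoff of 10.0 is an assumed tuning value, not a BSim default):

```csharp
var signatures = await bsim.GenerateSignaturesAsync(analysis, ct: ct);
foreach (var sig in signatures.Where(s => s.SelfSignificance > 10.0))
{
    var hits = await bsim.QueryAsync(sig, new BSimQueryOptions { MinSimilarity = 0.8 }, ct);
    if (hits.Length > 0)
    {
        var best = hits[0];
        Console.WriteLine($"{sig.FunctionName} ~ {best.MatchedFunction} " +
                          $"({best.MatchedLibrary} {best.MatchedVersion}, sim {best.Similarity:F2})");
    }
}
```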
---
## Delivery Tracker
| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| 1 | GHID-001 | TODO | - | Guild | Create `StellaOps.BinaryIndex.Ghidra` project structure |
| 2 | GHID-002 | TODO | GHID-001 | Guild | Define Ghidra model types (GhidraFunction, VersionTrackingResult, etc.) |
| 3 | GHID-003 | TODO | GHID-001 | Guild | Implement Ghidra Headless launcher/manager |
| 4 | GHID-004 | TODO | GHID-003 | Guild | Implement GhidraService (headless analysis wrapper) |
| 5 | GHID-005 | TODO | GHID-001 | Guild | Set up ghidriff Python environment |
| 6 | GHID-006 | TODO | GHID-005 | Guild | Implement GhidriffBridge (Python interop) |
| 7 | GHID-007 | TODO | GHID-006 | Guild | Implement GhidriffReportGenerator |
| 8 | GHID-008 | TODO | GHID-004,006 | Guild | Implement VersionTrackingService |
| 9 | GHID-009 | TODO | GHID-004 | Guild | Implement BSim signature generation |
| 10 | GHID-010 | TODO | GHID-009 | Guild | Implement BSim query service |
| 11 | GHID-011 | TODO | GHID-010 | Guild | Set up BSim PostgreSQL database |
| 12 | GHID-012 | TODO | GHID-008,010 | Guild | Implement GhidraDisassemblyPlugin (IDisassemblyPlugin) |
| 13 | GHID-013 | TODO | GHID-012 | Guild | Integrate Ghidra into DisassemblyService as fallback |
| 14 | GHID-014 | TODO | GHID-013 | Guild | Implement fallback selection logic (B2R2 -> Ghidra) |
| 15 | GHID-015 | TODO | GHID-008 | Guild | Unit tests: Version Tracking correlators |
| 16 | GHID-016 | TODO | GHID-010 | Guild | Unit tests: BSim signature generation |
| 17 | GHID-017 | TODO | GHID-014 | Guild | Integration tests: Fallback scenarios |
| 18 | GHID-018 | TODO | GHID-017 | Guild | Benchmark: Ghidra vs B2R2 accuracy comparison |
| 19 | GHID-019 | TODO | GHID-018 | Guild | Documentation: Ghidra deployment guide |
| 20 | GHID-020 | TODO | GHID-019 | Guild | Docker image: Ghidra Headless service |
---
## Task Details
### GHID-003: Implement Ghidra Headless Launcher
Manage Ghidra Headless process lifecycle:
```csharp
internal sealed class GhidraHeadlessManager : IAsyncDisposable
{
    private readonly GhidraOptions _options;
    private readonly ILogger<GhidraHeadlessManager> _logger;
    private readonly SemaphoreSlim _lock = new(1, 1);
    private Process? _ghidraProcess;

    public GhidraHeadlessManager(
        IOptions<GhidraOptions> options,
        ILogger<GhidraHeadlessManager> logger)
    {
        _options = options.Value;
        _logger = logger;
    }

    public async Task<string> AnalyzeAsync(
        string binaryPath,
        string scriptName,
        string[] scriptArgs,
        CancellationToken ct)
    {
        await _lock.WaitAsync(ct);
        try
        {
            var projectDir = Path.Combine(_options.WorkDir, Guid.NewGuid().ToString("N"));
            Directory.CreateDirectory(projectDir);

            var args = BuildAnalyzeArgs(projectDir, binaryPath, scriptName, scriptArgs);
            return await RunGhidraAsync(args, ct);
        }
        finally
        {
            _lock.Release();
        }
    }

    private string[] BuildAnalyzeArgs(
        string projectDir,
        string binaryPath,
        string scriptName,
        string[] scriptArgs)
    {
        var args = new List<string>
        {
            projectDir,        // Project location
            "TempProject",     // Project name
            "-import", binaryPath,
            "-postScript", scriptName
        };
        if (scriptArgs.Length > 0)
        {
            args.AddRange(scriptArgs);
        }

        // Add standard options
        args.AddRange([
            "-noanalysis",     // We'll run analysis explicitly
            "-scriptPath", _options.ScriptsDir,
            "-max-cpu", _options.MaxCpu.ToString(CultureInfo.InvariantCulture)
        ]);
        return [.. args];
    }

    private async Task<string> RunGhidraAsync(string[] args, CancellationToken ct)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = Path.Combine(_options.GhidraHome, "support", "analyzeHeadless"),
            Arguments = string.Join(" ", args.Select(QuoteArg)),
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        // Set Java options
        startInfo.EnvironmentVariables["JAVA_HOME"] = _options.JavaHome;
        startInfo.EnvironmentVariables["MAXMEM"] = _options.MaxMemory;

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Failed to start Ghidra");

        // Drain stdout and stderr concurrently; sequential ReadToEnd calls can
        // deadlock when the unread pipe's buffer fills.
        var outputTask = process.StandardOutput.ReadToEndAsync(ct);
        var errorTask = process.StandardError.ReadToEndAsync(ct);
        await process.WaitForExitAsync(ct);
        var output = await outputTask;
        var error = await errorTask;

        if (process.ExitCode != 0)
        {
            throw new GhidraException($"Ghidra failed: {error}");
        }
        return output;
    }
}
```
### GHID-006: Implement ghidriff Bridge
Python interop for ghidriff:
```csharp
internal sealed class GhidriffBridge : IGhidriffBridge
{
    private readonly GhidriffOptions _options;
    private readonly ILogger<GhidriffBridge> _logger;

    public async Task<GhidriffResult> DiffAsync(
        string oldBinaryPath,
        string newBinaryPath,
        GhidriffOptions? options = null,
        CancellationToken ct = default)
    {
        options ??= _options;
        var outputDir = Path.Combine(Path.GetTempPath(), $"ghidriff_{Guid.NewGuid():N}");
        Directory.CreateDirectory(outputDir);
        try
        {
            var args = BuildGhidriffArgs(oldBinaryPath, newBinaryPath, outputDir, options);
            var result = await RunPythonAsync("ghidriff", args, ct);

            // Parse JSON output
            var jsonPath = Path.Combine(outputDir, "diff.json");
            if (!File.Exists(jsonPath))
            {
                throw new GhidriffException($"ghidriff did not produce output: {result}");
            }
            var json = await File.ReadAllTextAsync(jsonPath, ct);
            return ParseGhidriffOutput(json);
        }
        finally
        {
            if (Directory.Exists(outputDir))
            {
                Directory.Delete(outputDir, recursive: true);
            }
        }
    }

    private static string[] BuildGhidriffArgs(
        string oldPath,
        string newPath,
        string outputDir,
        GhidriffOptions options)
    {
        var args = new List<string>
        {
            oldPath,
            newPath,
            "--output-dir", outputDir,
            "--output-format", "json"
        };
        if (!string.IsNullOrEmpty(options.GhidraPath))
        {
            args.AddRange(["--ghidra-path", options.GhidraPath]);
        }
        if (options.IncludeDecompilation)
        {
            args.Add("--include-decompilation");
        }
        if (options.ExcludeFunctions.Length > 0)
        {
            args.AddRange(["--exclude", string.Join(",", options.ExcludeFunctions)]);
        }
        return [.. args];
    }

    private async Task<string> RunPythonAsync(
        string module,
        string[] args,
        CancellationToken ct)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = _options.PythonPath ?? "python3",
            Arguments = $"-m {module} {string.Join(" ", args.Select(QuoteArg))}",
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using var process = Process.Start(startInfo)
            ?? throw new InvalidOperationException("Failed to start Python");

        // Read both pipes; leaving redirected stderr undrained can block the child.
        var outputTask = process.StandardOutput.ReadToEndAsync(ct);
        var errorTask = process.StandardError.ReadToEndAsync(ct);
        await process.WaitForExitAsync(ct);
        var output = await outputTask;
        _ = await errorTask;
        return output;
    }
}
```
### GHID-014: Implement Fallback Selection Logic
Smart routing between B2R2 and Ghidra:
```csharp
internal sealed class HybridDisassemblyService : IDisassemblyService
{
    private readonly B2R2DisassemblyPlugin _b2r2;
    private readonly GhidraDisassemblyPlugin _ghidra;
    private readonly ILogger<HybridDisassemblyService> _logger;

    public async Task<DisassemblyResult> DisassembleAsync(
        Stream binaryStream,
        DisassemblyOptions? options = null,
        CancellationToken ct = default)
    {
        options ??= new DisassemblyOptions();

        // Try B2R2 first (faster, native)
        var b2r2Result = await TryB2R2Async(binaryStream, options, ct);
        if (b2r2Result is not null && MeetsQualityThreshold(b2r2Result, options))
        {
            _logger.LogDebug("Using B2R2 result (confidence: {Confidence})",
                b2r2Result.Confidence);
            return b2r2Result;
        }

        // Fall back to Ghidra for:
        // 1. Low B2R2 confidence
        // 2. Unsupported architecture
        // 3. Explicit Ghidra preference
        if (!await _ghidra.IsAvailableAsync(ct))
        {
            _logger.LogWarning("Ghidra unavailable, returning B2R2 result");
            return b2r2Result ?? throw new DisassemblyException("No backend available");
        }

        _logger.LogInformation("Falling back to Ghidra (B2R2 confidence: {Confidence})",
            b2r2Result?.Confidence ?? 0);
        binaryStream.Position = 0;
        return await _ghidra.DisassembleAsync(binaryStream, options, ct);
    }

    private static bool MeetsQualityThreshold(
        DisassemblyResult result,
        DisassemblyOptions options)
    {
        // Confidence threshold
        if (result.Confidence < options.MinConfidence)
            return false;

        // Function discovery threshold
        if (result.Functions.Length < options.MinFunctions)
            return false;

        // Instruction decoding success rate
        var decodeRate = (double)result.DecodedInstructions / result.TotalInstructions;
        return decodeRate >= options.MinDecodeRate;
    }
}
```
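
A hedged wiring sketch (the registration calls are assumptions about the host's DI setup, not an existing StellaOps API):

```csharp
// Ghidra stays optional: the hybrid service only consults it when B2R2
// output misses the configured quality thresholds.
services.AddSingleton<B2R2DisassemblyPlugin>();
services.AddSingleton<GhidraDisassemblyPlugin>();
services.AddSingleton<IDisassemblyService, HybridDisassemblyService>();
services.Configure<GhidraOptions>(configuration.GetSection("BinaryIndex:Ghidra"));
```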
---
## Deployment Architecture
### Container Setup
```yaml
# docker-compose.ghidra.yml
services:
  ghidra-headless:
    image: stellaops/ghidra-headless:11.2
    build:
      context: ./devops/docker/ghidra
      dockerfile: Dockerfile.headless
    volumes:
      - ghidra-projects:/projects
      - ghidra-scripts:/scripts
    environment:
      JAVA_HOME: /opt/java/openjdk
      MAXMEM: 4G
    deploy:
      resources:
        limits:
          cpus: '4'
          memory: 8G

  bsim-postgres:
    image: postgres:16
    volumes:
      - bsim-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: bsim
      POSTGRES_USER: bsim
      POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD}

volumes:
  ghidra-projects:
  ghidra-scripts:
  bsim-data:
```
### Dockerfile
```dockerfile
# devops/docker/ghidra/Dockerfile.headless
FROM eclipse-temurin:17-jdk-jammy

ARG GHIDRA_VERSION=11.2
ARG GHIDRA_SHA256=abc123...

# Download and extract Ghidra.
# NOTE: release archives carry a build-date suffix (ghidra_<ver>_PUBLIC_<date>.zip);
# curl does not expand the '*' below, so the exact filename must be pinned.
RUN curl -fsSL https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_${GHIDRA_VERSION}_build/ghidra_${GHIDRA_VERSION}_PUBLIC_*.zip \
        -o /tmp/ghidra.zip \
    && echo "${GHIDRA_SHA256}  /tmp/ghidra.zip" | sha256sum -c - \
    && unzip /tmp/ghidra.zip -d /opt \
    && rm /tmp/ghidra.zip \
    && ln -s /opt/ghidra_* /opt/ghidra

# Install Python for ghidriff
RUN apt-get update && apt-get install -y python3 python3-pip \
    && pip3 install ghidriff \
    && apt-get clean

ENV GHIDRA_HOME=/opt/ghidra
ENV PATH="${GHIDRA_HOME}/support:${PATH}"

WORKDIR /projects
ENTRYPOINT ["analyzeHeadless"]
```
---
## Success Metrics
| Metric | Current | Target |
|--------|---------|--------|
| Architecture coverage | 12 (B2R2) | 20+ (with Ghidra) |
| Complex binary accuracy | ~70% | 90%+ |
| Version tracking precision | N/A | 85%+ |
| BSim identification rate | N/A | 80%+ on known libs |
| Fallback latency overhead | N/A | <30s per binary |
---
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |
---
## Decisions & Risks
| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| Ghidra adds Java dependency | Trade-off | Containerize Ghidra, keep optional |
| ghidriff Python interop adds complexity | Trade-off | Use subprocess, avoid embedding |
| Ghidra startup time is slow (~10-30s) | Risk | Keep B2R2 primary, Ghidra fallback only |
| BSim database grows large | Risk | Prune old versions, tier storage |
| License considerations (Apache 2.0) | Compliance | Ghidra is Apache 2.0, compatible with AGPL |
---
## Next Checkpoints
- 2026-02-01: GHID-001 through GHID-007 (project setup, bridges) complete
- 2026-02-15: GHID-008 through GHID-014 (services, integration) complete
- 2026-02-28: GHID-015 through GHID-020 (testing, deployment) complete


@@ -1,906 +0,0 @@
# Sprint 20260105_001_004_BINDEX - Semantic Diffing Phase 4: Decompiler Integration & ML Similarity
## Topic & Scope
Implement advanced semantic analysis capabilities including decompiled pseudo-code comparison and machine learning-based function embeddings. This phase addresses the highest-impact but most complex enhancements for detecting semantic equivalence in heavily optimized and obfuscated binaries.
**Advisory Reference:** Product advisory on semantic diffing - SEI Carnegie Mellon semantic equivalence checking of decompiled binaries, ML-based similarity models.
**Key Insight:** Comparing decompiled C-like code gives the highest semantic fidelity, since it abstracts away instruction-level detail. ML embeddings capture functional behavior patterns that survive obfuscation.
**Working directory:** `src/BinaryIndex/`
**Evidence:** New `StellaOps.BinaryIndex.Decompiler` and `StellaOps.BinaryIndex.ML` libraries, model training pipeline.
---
## Dependencies & Concurrency
| Dependency | Type | Status |
|------------|------|--------|
| SPRINT_20260105_001_001 (IR Semantics) | Sprint | Required |
| SPRINT_20260105_001_002 (Corpus) | Sprint | Required for training data |
| SPRINT_20260105_001_003 (Ghidra) | Sprint | Required for decompiler |
| Ghidra Decompiler | External | Via Phase 3 |
| ONNX Runtime | Package | Available |
| ML.NET | Package | Available |
**Parallel Execution:** Decompiler integration (DCML-001-010) and ML pipeline (DCML-011-020) can proceed in parallel.
---
## Documentation Prerequisites
- Phase 1-3 sprint documents
- `docs/modules/binary-index/architecture.md`
- SEI paper: https://www.sei.cmu.edu/annual-reviews/2022-research-review/semantic-equivalence-checking-of-decompiled-binaries/
- Code similarity research: https://arxiv.org/abs/2308.01463
---
## Problem Analysis
### Current State
After Phases 1-3:
- B2R2 IR-level semantic fingerprints (Phase 1)
- Function behavior corpus (Phase 2)
- Ghidra fallback with Version Tracking (Phase 3)
**Remaining Gaps:**
1. No decompiled code comparison (highest semantic fidelity)
2. No ML-based similarity (robustness to obfuscation)
3. Cannot detect functionally equivalent code with radically different structure
### Target Capabilities
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Advanced Semantic Analysis Stack │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Decompilation Layer │ │
│ │ │ │
│ │ Binary -> Ghidra P-Code -> Decompiled C -> AST -> Semantic Hash │ │
│ │ │ │
│ │ Comparison methods: │ │
│ │ - AST structural similarity │ │
│ │ - Control flow equivalence │ │
│ │ - Data flow equivalence │ │
│ │ - Normalized code text similarity │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ ML Embedding Layer │ │
│ │ │ │
│ │ Function Code -> Tokenization -> Transformer -> Embedding Vector │ │
│ │ │ │
│ │ Models: │ │
│ │ - CodeBERT variant for binary code │ │
│ │ - Graph Neural Network for CFG │ │
│ │ - Contrastive learning for similarity │ │
│ │ │ │
│ │ Vector similarity: cosine, euclidean, learned metric │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Ensemble Decision Layer │ │
│ │ │ │
│ │ Combine signals: │ │
│ │ - Instruction fingerprint (Phase 1) : 15% weight │ │
│ │ - Semantic graph (Phase 1) : 25% weight │ │
│ │ - Decompiled AST similarity : 35% weight │ │
│ │ - ML embedding similarity : 25% weight │ │
│ │ │ │
│ │ Output: Confidence-weighted similarity score │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Architecture Design
### Decompiler Integration
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/IDecompilerService.cs
namespace StellaOps.BinaryIndex.Decompiler;
public interface IDecompilerService
{
/// <summary>
/// Decompile a function to C-like pseudo-code.
/// </summary>
Task<DecompiledFunction> DecompileAsync(
GhidraFunction function,
DecompileOptions? options = null,
CancellationToken ct = default);
/// <summary>
/// Parse decompiled code into AST.
/// </summary>
Task<DecompiledAst> ParseToAstAsync(
string decompiledCode,
CancellationToken ct = default);
/// <summary>
/// Compare two decompiled functions for semantic equivalence.
/// </summary>
Task<DecompiledComparisonResult> CompareAsync(
DecompiledFunction a,
DecompiledFunction b,
ComparisonOptions? options = null,
CancellationToken ct = default);
}
public sealed record DecompiledFunction(
string FunctionName,
string Signature,
string Code, // Decompiled C code
DecompiledAst? Ast,
ImmutableArray<LocalVariable> Locals,
ImmutableArray<string> CalledFunctions);
public sealed record DecompiledAst(
AstNode Root,
int NodeCount,
int Depth,
ImmutableArray<AstPattern> Patterns); // Recognized code patterns
public abstract record AstNode(AstNodeType Type, ImmutableArray<AstNode> Children);
public enum AstNodeType
{
Function, Block, If, While, For, DoWhile, Switch,
Return, Break, Continue, Goto,
Assignment, BinaryOp, UnaryOp, Call, Cast,
Variable, Constant, ArrayAccess, FieldAccess, Deref
}
```
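A short usage sketch shows the intended call flow; `decompiler`, `functionA`, `functionB`, and `ct` are hypothetical locals (an injected `IDecompilerService`, two `GhidraFunction` instances from the Phase 3 bridge, and a `CancellationToken`):
```csharp
// Hypothetical flow: decompile both functions, then compare them.
var decompiledA = await decompiler.DecompileAsync(functionA, ct: ct);
var decompiledB = await decompiler.DecompileAsync(functionB, ct: ct);

// The comparison result feeds the decompiled-AST signal of the ensemble.
var comparison = await decompiler.CompareAsync(decompiledA, decompiledB, ct: ct);
```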
### AST Comparison Engine
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/AstComparisonEngine.cs
namespace StellaOps.BinaryIndex.Decompiler;
public interface IAstComparisonEngine
{
/// <summary>
/// Compute structural similarity between ASTs.
/// </summary>
decimal ComputeStructuralSimilarity(DecompiledAst a, DecompiledAst b);
/// <summary>
/// Compute edit distance between ASTs.
/// </summary>
AstEditDistance ComputeEditDistance(DecompiledAst a, DecompiledAst b);
/// <summary>
/// Find semantic equivalent patterns.
/// </summary>
ImmutableArray<SemanticEquivalence> FindEquivalences(
DecompiledAst a,
DecompiledAst b);
}
public sealed record AstEditDistance(
int Insertions,
int Deletions,
int Modifications,
int TotalOperations,
decimal NormalizedDistance); // 0.0 = identical, 1.0 = completely different
public sealed record SemanticEquivalence(
AstNode NodeA,
AstNode NodeB,
EquivalenceType Type,
decimal Confidence);
public enum EquivalenceType
{
Identical, // Exact match
Renamed, // Same structure, different names
Reordered, // Same operations, different order
Optimized, // Compiler optimization variant
SemanticallyEquivalent, // Different structure, same behavior
}
```
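The interface leaves the similarity algorithm open. As one possibility, the minimal sketch below computes Jaccard similarity over a bag of subtree hashes; `AstSimilaritySketch` is hypothetical, ignores identifier names by construction (only `AstNodeType` feeds the hash), and assumes the project's global usings cover `System`, `System.Collections.Generic`, and `System.Linq`:
```csharp
// Hypothetical sketch: bag-of-subtrees Jaccard similarity over AST shapes.
internal static class AstSimilaritySketch
{
    public static decimal ComputeStructuralSimilarity(AstNode rootA, AstNode rootB)
    {
        var bagA = CollectSubtreeHashes(rootA);
        var bagB = CollectSubtreeHashes(rootB);

        // Multiset intersection over subtree hashes.
        var intersection = bagA.Keys.Intersect(bagB.Keys)
            .Sum(key => Math.Min(bagA[key], bagB[key]));
        var union = bagA.Values.Sum() + bagB.Values.Sum() - intersection;
        return union == 0 ? 1m : (decimal)intersection / union;
    }

    private static Dictionary<int, int> CollectSubtreeHashes(AstNode root)
    {
        var bag = new Dictionary<int, int>();
        Hash(root, bag);
        return bag;
    }

    private static int Hash(AstNode node, Dictionary<int, int> bag)
    {
        // Combine the node type with child hashes; names and constants are
        // ignored, so renamed-but-identical structures hash the same.
        // (HashCode is process-seeded, so hashes are valid within one run only.)
        var hash = new HashCode();
        hash.Add(node.Type);
        foreach (var child in node.Children)
        {
            hash.Add(Hash(child, bag));
        }

        var value = hash.ToHashCode();
        bag[value] = bag.TryGetValue(value, out var count) ? count + 1 : 1;
        return value;
    }
}
```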
### Decompiled Code Normalizer
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Decompiler/CodeNormalizer.cs
namespace StellaOps.BinaryIndex.Decompiler;
public interface ICodeNormalizer
{
/// <summary>
/// Normalize decompiled code for comparison.
/// </summary>
string Normalize(string code, NormalizationOptions? options = null);
/// <summary>
/// Generate canonical form hash.
/// </summary>
byte[] ComputeCanonicalHash(string code);
}
internal sealed class CodeNormalizer : ICodeNormalizer
{
public string Normalize(string code, NormalizationOptions? options = null)
{
options ??= NormalizationOptions.Default;
var normalized = code;
// 1. Normalize variable names (var1, var2, ...)
if (options.NormalizeVariables)
{
normalized = NormalizeVariableNames(normalized);
}
// 2. Normalize function calls (func1, func2, ... or keep known names)
if (options.NormalizeFunctionCalls)
{
normalized = NormalizeFunctionCalls(normalized, options.KnownFunctions);
}
// 3. Normalize constants (replace magic numbers with placeholders)
if (options.NormalizeConstants)
{
normalized = NormalizeConstants(normalized);
}
// 4. Normalize whitespace
if (options.NormalizeWhitespace)
{
normalized = NormalizeWhitespace(normalized);
}
// 5. Sort independent statements (where order doesn't matter)
if (options.SortIndependentStatements)
{
normalized = SortIndependentStatements(normalized);
}
return normalized;
}
private static string NormalizeVariableNames(string code)
{
// Replace all local variable names with canonical names
// var_0, var_1, ... in order of first appearance
var varIndex = 0;
var varMap = new Dictionary<string, string>();
// Match every identifier; keywords and known types are filtered out below
return Regex.Replace(code, @"\b([a-zA-Z_][a-zA-Z0-9_]*)\b", match =>
{
var name = match.Value;
// Skip keywords and known types
if (IsKeywordOrType(name))
return name;
if (!varMap.TryGetValue(name, out var canonical))
{
canonical = $"var_{varIndex++}";
varMap[name] = canonical;
}
return canonical;
});
}
}
```
### ML Embedding Service
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/IEmbeddingService.cs
namespace StellaOps.BinaryIndex.ML;
public interface IEmbeddingService
{
/// <summary>
/// Generate embedding vector for a function.
/// </summary>
Task<FunctionEmbedding> GenerateEmbeddingAsync(
EmbeddingInput input,
EmbeddingOptions? options = null,
CancellationToken ct = default);
/// <summary>
/// Compute similarity between embeddings.
/// </summary>
decimal ComputeSimilarity(
FunctionEmbedding a,
FunctionEmbedding b,
SimilarityMetric metric = SimilarityMetric.Cosine);
/// <summary>
/// Find similar functions in embedding index.
/// </summary>
Task<ImmutableArray<EmbeddingMatch>> FindSimilarAsync(
FunctionEmbedding query,
int topK = 10,
decimal minSimilarity = 0.7m,
CancellationToken ct = default);
}
public sealed record EmbeddingInput(
string? DecompiledCode, // Preferred
KeySemanticsGraph? SemanticGraph, // Fallback
byte[]? InstructionBytes, // Last resort
EmbeddingInputType PreferredInput);
public enum EmbeddingInputType { DecompiledCode, SemanticGraph, Instructions }
public sealed record FunctionEmbedding(
string FunctionName,
float[] Vector, // 768-dimensional
EmbeddingModel Model,
EmbeddingInputType InputType);
public enum EmbeddingModel
{
CodeBertBinary, // Fine-tuned CodeBERT for binary code
GraphSageFunction, // GNN for CFG/call graph
ContrastiveFunction // Contrastive learning model
}
public enum SimilarityMetric { Cosine, Euclidean, Manhattan, LearnedMetric }
```
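Usage sketch, with hypothetical locals: `embeddings` is an injected `IEmbeddingService`, and `codeA`/`codeB` are decompiled code strings from the layer above:
```csharp
// Hypothetical: embed two functions from decompiled code and compare them.
var inputA = new EmbeddingInput(codeA, null, null, EmbeddingInputType.DecompiledCode);
var inputB = new EmbeddingInput(codeB, null, null, EmbeddingInputType.DecompiledCode);

var embeddingA = await embeddings.GenerateEmbeddingAsync(inputA, ct: ct);
var embeddingB = await embeddings.GenerateEmbeddingAsync(inputB, ct: ct);

var similarity = embeddings.ComputeSimilarity(embeddingA, embeddingB); // cosine by default
```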
### Model Training Pipeline
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/IModelTrainingService.cs
namespace StellaOps.BinaryIndex.ML;
public interface IModelTrainingService
{
/// <summary>
/// Train embedding model on function pairs.
/// </summary>
Task<TrainingResult> TrainAsync(
IAsyncEnumerable<TrainingPair> trainingData,
TrainingOptions options,
IProgress<TrainingProgress>? progress = null,
CancellationToken ct = default);
/// <summary>
/// Evaluate model on test set.
/// </summary>
Task<EvaluationResult> EvaluateAsync(
IAsyncEnumerable<TrainingPair> testData,
CancellationToken ct = default);
/// <summary>
/// Export trained model for inference.
/// </summary>
Task ExportModelAsync(
string outputPath,
ModelExportFormat format = ModelExportFormat.Onnx,
CancellationToken ct = default);
}
public sealed record TrainingPair(
EmbeddingInput FunctionA,
EmbeddingInput FunctionB,
bool IsSimilar, // Ground truth: same function?
decimal? SimilarityScore); // Optional: how similar (0-1)
public sealed record TrainingOptions
{
public EmbeddingModel Model { get; init; } = EmbeddingModel.CodeBertBinary;
public int EmbeddingDimension { get; init; } = 768;
public int BatchSize { get; init; } = 32;
public int Epochs { get; init; } = 10;
public double LearningRate { get; init; } = 1e-5;
public double MarginLoss { get; init; } = 0.5; // Contrastive margin
public string? PretrainedModelPath { get; init; }
}
public sealed record TrainingResult(
string ModelPath,
int TotalPairs,
int Epochs,
double FinalLoss,
double ValidationAccuracy,
TimeSpan TrainingTime);
public sealed record EvaluationResult(
double Accuracy,
double Precision,
double Recall,
double F1Score,
double AucRoc,
ImmutableArray<ConfusionEntry> ConfusionMatrix);
```
### ONNX Inference Engine
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.ML/OnnxInferenceEngine.cs
namespace StellaOps.BinaryIndex.ML;
internal sealed class OnnxInferenceEngine : IEmbeddingService, IAsyncDisposable
{
private readonly InferenceSession _session;
private readonly ITokenizer _tokenizer;
private readonly ILogger<OnnxInferenceEngine> _logger;
public OnnxInferenceEngine(
string modelPath,
ITokenizer tokenizer,
ILogger<OnnxInferenceEngine> logger)
{
var options = new SessionOptions
{
GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL,
ExecutionMode = ExecutionMode.ORT_PARALLEL
};
_session = new InferenceSession(modelPath, options);
_tokenizer = tokenizer;
_logger = logger;
}
public async Task<FunctionEmbedding> GenerateEmbeddingAsync(
EmbeddingInput input,
EmbeddingOptions? options = null,
CancellationToken ct = default)
{
var text = input.PreferredInput switch
{
EmbeddingInputType.DecompiledCode => input.DecompiledCode
?? throw new ArgumentException("DecompiledCode required"),
EmbeddingInputType.SemanticGraph => SerializeGraph(input.SemanticGraph
?? throw new ArgumentException("SemanticGraph required")),
EmbeddingInputType.Instructions => SerializeInstructions(input.InstructionBytes
?? throw new ArgumentException("InstructionBytes required")),
_ => throw new ArgumentOutOfRangeException()
};
// Tokenize
var tokens = _tokenizer.Tokenize(text, maxLength: 512);
// Run inference
var inputTensor = new DenseTensor<long>(tokens, [1, tokens.Length]);
var inputs = new List<NamedOnnxValue>
{
NamedOnnxValue.CreateFromTensor("input_ids", inputTensor)
};
using var results = await Task.Run(() => _session.Run(inputs), ct);
var outputTensor = results.First().AsTensor<float>();
var embedding = outputTensor.ToArray();
return new FunctionEmbedding(
input.DecompiledCode?.GetHashCode().ToString() ?? "unknown",
embedding,
EmbeddingModel.CodeBertBinary,
input.PreferredInput);
}
public decimal ComputeSimilarity(
FunctionEmbedding a,
FunctionEmbedding b,
SimilarityMetric metric = SimilarityMetric.Cosine)
{
return metric switch
{
SimilarityMetric.Cosine => CosineSimilarity(a.Vector, b.Vector),
SimilarityMetric.Euclidean => EuclideanSimilarity(a.Vector, b.Vector),
SimilarityMetric.Manhattan => ManhattanSimilarity(a.Vector, b.Vector),
_ => throw new ArgumentOutOfRangeException(nameof(metric))
};
}
private static decimal CosineSimilarity(float[] a, float[] b)
{
var dotProduct = 0.0;
var normA = 0.0;
var normB = 0.0;
for (var i = 0; i < a.Length; i++)
{
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
if (normA == 0 || normB == 0)
return 0;
return (decimal)(dotProduct / (Math.Sqrt(normA) * Math.Sqrt(normB)));
}
}
```
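`EuclideanSimilarity` and `ManhattanSimilarity` are referenced but not shown above. A minimal sketch follows, assuming the common `1 / (1 + distance)` mapping into `(0, 1]`; the mapping is an assumption, not a decided design:
```csharp
// Hypothetical companions to CosineSimilarity: map a distance into (0, 1],
// where 1 means identical vectors. The 1/(1+d) squash is an assumption.
private static decimal EuclideanSimilarity(float[] a, float[] b)
{
    var sum = 0.0;
    for (var i = 0; i < a.Length; i++)
    {
        var d = a[i] - b[i];
        sum += d * d;
    }

    return (decimal)(1.0 / (1.0 + Math.Sqrt(sum)));
}

private static decimal ManhattanSimilarity(float[] a, float[] b)
{
    var sum = 0.0;
    for (var i = 0; i < a.Length; i++)
    {
        sum += Math.Abs(a[i] - b[i]);
    }

    return (decimal)(1.0 / (1.0 + sum));
}
```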
### Ensemble Decision Engine
```csharp
// src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ensemble/IEnsembleDecisionEngine.cs
namespace StellaOps.BinaryIndex.Ensemble;
public interface IEnsembleDecisionEngine
{
/// <summary>
/// Compute final similarity using all available signals.
/// </summary>
Task<EnsembleResult> ComputeSimilarityAsync(
FunctionAnalysis a,
FunctionAnalysis b,
EnsembleOptions? options = null,
CancellationToken ct = default);
}
public sealed record FunctionAnalysis(
string FunctionName,
byte[]? InstructionFingerprint, // Phase 1
SemanticFingerprint? SemanticGraph, // Phase 1
DecompiledFunction? Decompiled, // Phase 4
FunctionEmbedding? Embedding); // Phase 4
public sealed record EnsembleOptions
{
// Weight configuration (must sum to 1.0)
public decimal InstructionWeight { get; init; } = 0.15m;
public decimal SemanticGraphWeight { get; init; } = 0.25m;
public decimal DecompiledWeight { get; init; } = 0.35m;
public decimal EmbeddingWeight { get; init; } = 0.25m;
// Confidence thresholds
public decimal MinConfidence { get; init; } = 0.6m;
public bool RequireAllSignals { get; init; } = false;
}
public sealed record EnsembleResult(
decimal OverallSimilarity,
MatchConfidence Confidence,
ImmutableArray<SignalContribution> Contributions,
string? Explanation);
public sealed record SignalContribution(
string SignalName,
decimal RawSimilarity,
decimal Weight,
decimal WeightedContribution,
bool WasAvailable);
```
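A minimal sketch of the core combination step, assuming missing signals are dropped and the remaining weights renormalized (the `EnsembleScoringSketch` name and the renormalization policy are assumptions; `System.Linq` is assumed via global usings):
```csharp
// Hypothetical: weight only available signals and renormalize into [0, 1].
internal static class EnsembleScoringSketch
{
    public static decimal Combine(IReadOnlyList<SignalContribution> signals)
    {
        var available = signals.Where(s => s.WasAvailable).ToList();
        var totalWeight = available.Sum(s => s.Weight);
        if (totalWeight == 0)
        {
            return 0m; // No evidence at all.
        }

        // Renormalized weighted average of per-signal similarities.
        return available.Sum(s => s.RawSimilarity * s.Weight) / totalWeight;
    }
}
```
Renormalizing keeps a two-signal comparison on the same scale as a four-signal one, at the cost of hiding how much evidence backed the score; that is why `SignalContribution.WasAvailable` is surfaced in `EnsembleResult` rather than silently folded in.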
---
## Delivery Tracker
| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| **Decompiler Integration** |
| 1 | DCML-001 | TODO | Phase 3 | Guild | Create `StellaOps.BinaryIndex.Decompiler` project |
| 2 | DCML-002 | TODO | DCML-001 | Guild | Define decompiled code model types |
| 3 | DCML-003 | TODO | DCML-002 | Guild | Implement Ghidra decompiler adapter |
| 4 | DCML-004 | TODO | DCML-003 | Guild | Implement C code parser (AST generation) |
| 5 | DCML-005 | TODO | DCML-004 | Guild | Implement AST comparison engine |
| 6 | DCML-006 | TODO | DCML-005 | Guild | Implement code normalizer |
| 7 | DCML-007 | TODO | DCML-006 | Guild | Implement semantic equivalence detector |
| 8 | DCML-008 | TODO | DCML-007 | Guild | Unit tests: Decompiler adapter |
| 9 | DCML-009 | TODO | DCML-007 | Guild | Unit tests: AST comparison |
| 10 | DCML-010 | TODO | DCML-009 | Guild | Integration tests: End-to-end decompiled comparison |
| **ML Embedding Pipeline** |
| 11 | DCML-011 | TODO | Phase 2 | Guild | Create `StellaOps.BinaryIndex.ML` project |
| 12 | DCML-012 | TODO | DCML-011 | Guild | Define embedding model types |
| 13 | DCML-013 | TODO | DCML-012 | Guild | Implement code tokenizer (binary-aware BPE) |
| 14 | DCML-014 | TODO | DCML-013 | Guild | Set up ONNX Runtime inference engine |
| 15 | DCML-015 | TODO | DCML-014 | Guild | Implement embedding service |
| 16 | DCML-016 | TODO | DCML-015 | Guild | Create training data from corpus (positive/negative pairs) |
| 17 | DCML-017 | TODO | DCML-016 | Guild | Train CodeBERT-Binary model |
| 18 | DCML-018 | TODO | DCML-017 | Guild | Export model to ONNX format |
| 19 | DCML-019 | TODO | DCML-015 | Guild | Unit tests: Embedding generation |
| 20 | DCML-020 | TODO | DCML-018 | Guild | Evaluation: Model accuracy metrics |
| **Ensemble Integration** |
| 21 | DCML-021 | TODO | DCML-010,020 | Guild | Create `StellaOps.BinaryIndex.Ensemble` project |
| 22 | DCML-022 | TODO | DCML-021 | Guild | Implement ensemble decision engine |
| 23 | DCML-023 | TODO | DCML-022 | Guild | Implement weight tuning (grid search) |
| 24 | DCML-024 | TODO | DCML-023 | Guild | Integrate ensemble into PatchDiffEngine |
| 25 | DCML-025 | TODO | DCML-024 | Guild | Integrate ensemble into DeltaSignatureMatcher |
| 26 | DCML-026 | TODO | DCML-025 | Guild | Unit tests: Ensemble decision logic |
| 27 | DCML-027 | TODO | DCML-026 | Guild | Integration tests: Full semantic diffing pipeline |
| 28 | DCML-028 | TODO | DCML-027 | Guild | Benchmark: Accuracy vs. baseline (Phase 1 only) |
| 29 | DCML-029 | TODO | DCML-028 | Guild | Benchmark: Latency impact |
| 30 | DCML-030 | TODO | DCML-029 | Guild | Documentation: ML model training guide |
---
## Task Details
### DCML-004: Implement C Code Parser
Parse Ghidra's decompiled C output into an AST:
```csharp
internal sealed class DecompiledCodeParser
{
public DecompiledAst Parse(string code)
{
// Use Tree-sitter or Roslyn-based C parser
// Ghidra output is C-like but not standard C
var tokens = Tokenize(code);
var ast = BuildAst(tokens);
return new DecompiledAst(
ast,
CountNodes(ast),
ComputeDepth(ast),
ExtractPatterns(ast));
}
private AstNode BuildAst(IList<Token> tokens)
{
var parser = new RecursiveDescentParser(tokens);
return parser.ParseFunction();
}
private ImmutableArray<AstPattern> ExtractPatterns(AstNode root)
{
var patterns = new List<AstPattern>();
// Detect common patterns
patterns.AddRange(DetectLoopPatterns(root));
patterns.AddRange(DetectBranchPatterns(root));
patterns.AddRange(DetectAllocationPatterns(root));
patterns.AddRange(DetectErrorHandlingPatterns(root));
return [.. patterns];
}
private static IEnumerable<AstPattern> DetectLoopPatterns(AstNode root)
{
// Find: for loops, while loops, do-while
// Classify: counted loop, sentinel loop, infinite loop
foreach (var node in TraverseNodes(root))
{
if (node.Type == AstNodeType.For)
{
yield return new AstPattern(
PatternType.CountedLoop,
node,
AnalyzeForLoop(node));
}
else if (node.Type == AstNodeType.While)
{
yield return new AstPattern(
PatternType.ConditionalLoop,
node,
AnalyzeWhileLoop(node));
}
}
}
}
```
### DCML-017: Train CodeBERT-Binary Model
Training pipeline for function similarity:
```python
# tools/ml/train_codebert_binary.py
import torch
from transformers import RobertaTokenizer, RobertaModel
from torch.utils.data import DataLoader
import onnx


class CodeBertBinaryModel(torch.nn.Module):
    def __init__(self, pretrained_model="microsoft/codebert-base"):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained(pretrained_model)
        self.projection = torch.nn.Linear(768, 768)

    def forward(self, input_ids, attention_mask):
        outputs = self.encoder(input_ids, attention_mask=attention_mask)
        pooled = outputs.last_hidden_state[:, 0, :]  # [CLS] token
        projected = self.projection(pooled)
        return torch.nn.functional.normalize(projected, p=2, dim=1)


class ContrastiveLoss(torch.nn.Module):
    def __init__(self, margin=0.5):
        super().__init__()
        self.margin = margin

    def forward(self, embedding_a, embedding_b, label):
        distance = torch.nn.functional.pairwise_distance(embedding_a, embedding_b)
        # label=1: similar, label=0: dissimilar
        loss = label * distance.pow(2) + \
            (1 - label) * torch.clamp(self.margin - distance, min=0).pow(2)
        return loss.mean()


def evaluate(model, dataloader, threshold=0.5):
    # Minimal placeholder: pairs closer than `threshold` are predicted similar.
    correct, total = 0, 0
    with torch.no_grad():
        for batch in dataloader:
            emb_a = model(batch['input_ids_a'], batch['attention_mask_a'])
            emb_b = model(batch['input_ids_b'], batch['attention_mask_b'])
            distance = torch.nn.functional.pairwise_distance(emb_a, emb_b)
            predicted = (distance < threshold).long()
            correct += (predicted == batch['label'].long()).sum().item()
            total += len(predicted)
    return correct / max(total, 1)


def train_model(train_dataloader, val_dataloader, epochs=10):
    model = CodeBertBinaryModel()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
    criterion = ContrastiveLoss(margin=0.5)
    for epoch in range(epochs):
        model.train()
        total_loss = 0
        for batch in train_dataloader:
            optimizer.zero_grad()
            emb_a = model(batch['input_ids_a'], batch['attention_mask_a'])
            emb_b = model(batch['input_ids_b'], batch['attention_mask_b'])
            loss = criterion(emb_a, emb_b, batch['label'])
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # Validation
        model.eval()
        val_accuracy = evaluate(model, val_dataloader)
        print(f"Epoch {epoch+1}: Loss={total_loss:.4f}, Val Acc={val_accuracy:.4f}")
    return model


def export_to_onnx(model, output_path):
    model.eval()
    dummy_input = torch.randint(0, 50000, (1, 512))
    dummy_mask = torch.ones(1, 512)
    torch.onnx.export(
        model,
        (dummy_input, dummy_mask),
        output_path,
        input_names=['input_ids', 'attention_mask'],
        output_names=['embedding'],
        dynamic_axes={
            'input_ids': {0: 'batch', 1: 'seq'},
            'attention_mask': {0: 'batch', 1: 'seq'},
            'embedding': {0: 'batch'}
        }
    )
```
### DCML-023: Implement Weight Tuning
Grid search for optimal ensemble weights:
```csharp
internal sealed class EnsembleWeightTuner
{
public async Task<EnsembleOptions> TuneWeightsAsync(
IAsyncEnumerable<LabeledPair> validationData,
CancellationToken ct)
{
var bestOptions = EnsembleOptions.Default;
var bestF1 = 0.0;
// Grid search over weight combinations
var weightCombinations = GenerateWeightCombinations(step: 0.05m);
foreach (var weights in weightCombinations)
{
ct.ThrowIfCancellationRequested();
var options = new EnsembleOptions
{
InstructionWeight = weights[0],
SemanticGraphWeight = weights[1],
DecompiledWeight = weights[2],
EmbeddingWeight = weights[3]
};
var metrics = await EvaluateAsync(validationData, options, ct);
if (metrics.F1Score > bestF1)
{
bestF1 = metrics.F1Score;
bestOptions = options;
}
}
return bestOptions;
}
private static IEnumerable<decimal[]> GenerateWeightCombinations(decimal step)
{
for (var w1 = 0m; w1 <= 1m; w1 += step)
for (var w2 = 0m; w2 <= 1m - w1; w2 += step)
for (var w3 = 0m; w3 <= 1m - w1 - w2; w3 += step)
{
var w4 = 1m - w1 - w2 - w3;
if (w4 >= 0)
{
yield return [w1, w2, w3, w4];
}
}
}
}
```
---
## Training Data Requirements
### Positive Pairs (Similar Functions)
| Source | Count | Description |
|--------|-------|-------------|
| Same function, different optimization | ~50,000 | O0 vs O2 vs O3 |
| Same function, different compiler | ~30,000 | GCC vs Clang |
| Same function, different version | ~100,000 | From corpus (Phase 2) |
| Same function, with patches | ~20,000 | Vulnerable vs fixed |
### Negative Pairs (Dissimilar Functions)
| Source | Count | Description |
|--------|-------|-------------|
| Random function pairs | ~100,000 | Random sampling |
| Similar-named different functions | ~50,000 | Hard negatives |
| Same library, different functions | ~50,000 | Medium negatives |
**Total training data:** ~400,000 labeled pairs
---
## Success Metrics
| Metric | Phase 1 Only | With Phase 4 | Target |
|--------|--------------|--------------|--------|
| Accuracy (optimized binaries) | 70% | 92% | 90%+ |
| Accuracy (obfuscated binaries) | 40% | 75% | 70%+ |
| False positive rate | 5% | 1.5% | <2% |
| False negative rate | 25% | 8% | <10% |
| Latency (per comparison) | 10ms | 150ms | <200ms |
---
## Resource Requirements
| Resource | Training | Inference |
|----------|----------|-----------|
| GPU | 1x V100 (32GB) or 4x T4 | Optional (CPU viable) |
| Memory | 64GB | 16GB |
| Storage | 100GB (training data) | 5GB (model) |
| Time | ~24 hours | <200ms per function |
---
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory analysis | Planning |
---
## Decisions & Risks
| Decision/Risk | Type | Mitigation |
|---------------|------|------------|
| ML model requires significant training data | Risk | Leverage corpus from Phase 2 |
| ONNX inference adds latency | Trade-off | Make ML optional, use for high-value comparisons |
| Decompiler output varies by Ghidra version | Risk | Pin Ghidra version, normalize output |
| Model may overfit to training library set | Risk | Diverse training data, regularization |
| GPU dependency for training | Constraint | Use cloud GPU, document CPU-only option |
---
## Next Checkpoints
- 2026-03-01: DCML-001 through DCML-010 (decompiler integration) complete
- 2026-03-15: DCML-011 through DCML-020 (ML pipeline) complete
- 2026-03-31: DCML-021 through DCML-030 (ensemble, benchmarks) complete


@@ -142,17 +142,17 @@ CREATE INDEX idx_hlc_state_updated ON scheduler.hlc_state(updated_at DESC);
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
-| 1 | HLC-001 | TODO | - | Guild | Create `StellaOps.HybridLogicalClock` project with Directory.Build.props integration |
-| 2 | HLC-002 | TODO | HLC-001 | Guild | Implement `HlcTimestamp` record with comparison, parsing, serialization |
-| 3 | HLC-003 | TODO | HLC-002 | Guild | Implement `HybridLogicalClock` class with Tick/Receive/Current |
-| 4 | HLC-004 | TODO | HLC-003 | Guild | Implement `IHlcStateStore` interface and `InMemoryHlcStateStore` |
-| 5 | HLC-005 | TODO | HLC-004 | Guild | Implement `PostgresHlcStateStore` with atomic update semantics |
-| 6 | HLC-006 | TODO | HLC-003 | Guild | Add `HlcTimestampJsonConverter` for System.Text.Json serialization |
-| 7 | HLC-007 | TODO | HLC-003 | Guild | Add `HlcTimestampTypeHandler` for Npgsql/Dapper |
-| 8 | HLC-008 | TODO | HLC-005 | Guild | Write unit tests: tick monotonicity, receive merge, clock skew handling |
-| 9 | HLC-009 | TODO | HLC-008 | Guild | Write integration tests: concurrent ticks, node restart recovery |
+| 1 | HLC-001 | DONE | - | Guild | Create `StellaOps.HybridLogicalClock` project with Directory.Build.props integration |
+| 2 | HLC-002 | DONE | HLC-001 | Guild | Implement `HlcTimestamp` record with comparison, parsing, serialization |
+| 3 | HLC-003 | DONE | HLC-002 | Guild | Implement `HybridLogicalClock` class with Tick/Receive/Current |
+| 4 | HLC-004 | DONE | HLC-003 | Guild | Implement `IHlcStateStore` interface and `InMemoryHlcStateStore` |
+| 5 | HLC-005 | DONE | HLC-004 | Guild | Implement `PostgresHlcStateStore` with atomic update semantics |
+| 6 | HLC-006 | DONE | HLC-003 | Guild | Add `HlcTimestampJsonConverter` for System.Text.Json serialization |
+| 7 | HLC-007 | DONE | HLC-003 | Guild | Add `HlcTimestampTypeHandler` for Npgsql/Dapper |
+| 8 | HLC-008 | DONE | HLC-005 | Guild | Write unit tests: tick monotonicity, receive merge, clock skew handling |
+| 9 | HLC-009 | DONE | HLC-008 | Guild | Write integration tests: concurrent ticks, node restart recovery |
| 10 | HLC-010 | TODO | HLC-009 | Guild | Write benchmarks: tick throughput, memory allocation |
-| 11 | HLC-011 | TODO | HLC-010 | Guild | Create `HlcServiceCollectionExtensions` for DI registration |
+| 11 | HLC-011 | DONE | HLC-010 | Guild | Create `HlcServiceCollectionExtensions` for DI registration |
| 12 | HLC-012 | TODO | HLC-011 | Guild | Documentation: README.md, API docs, usage examples |
## Implementation Details
@@ -335,6 +335,7 @@ hlc_physical_time_offset_seconds{node_id} // Drift from wall clock
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
+| 2026-01-05 | HLC-001 to HLC-011 implemented: core library, state stores, JSON/Dapper serializers, DI extensions, 56 unit tests all passing | Agent |
## Next Checkpoints


@@ -466,16 +466,16 @@ internal static class ProveCommandGroup
| 4 | RPL-004 | TODO | RPL-003 | Replay Guild | Update `CommandHandlers.VerifyBundle.ReplayVerdictAsync()` to use service |
| 5 | RPL-005 | TODO | RPL-004 | Replay Guild | Unit tests: VerdictBuilder replay with fixtures |
| **DSSE Verification** |
-| 6 | RPL-006 | TODO | - | Attestor Guild | Define `IDsseVerifier` interface in `StellaOps.Attestation` |
-| 7 | RPL-007 | TODO | RPL-006 | Attestor Guild | Implement `DsseVerifier` using existing `DsseHelper` |
-| 8 | RPL-008 | TODO | RPL-007 | CLI Guild | Wire `DsseVerifier` into CLI DI container |
-| 9 | RPL-009 | TODO | RPL-008 | CLI Guild | Update `CommandHandlers.VerifyBundle.VerifyDsseSignatureAsync()` |
-| 10 | RPL-010 | TODO | RPL-009 | Attestor Guild | Unit tests: DSSE verification with valid/invalid signatures |
+| 6 | RPL-006 | DONE | - | Attestor Guild | Define `IDsseVerifier` interface in `StellaOps.Attestation` |
+| 7 | RPL-007 | DONE | RPL-006 | Attestor Guild | Implement `DsseVerifier` using existing `DsseHelper` |
+| 8 | RPL-008 | DONE | RPL-007 | CLI Guild | Wire `DsseVerifier` into CLI DI container |
+| 9 | RPL-009 | DONE | RPL-008 | CLI Guild | Update `CommandHandlers.VerifyBundle.VerifyDsseSignatureAsync()` |
+| 10 | RPL-010 | DONE | RPL-009 | Attestor Guild | Unit tests: DSSE verification with valid/invalid signatures |
| **ReplayProof Schema** |
-| 11 | RPL-011 | TODO | - | Replay Guild | Create `ReplayProof` model in `StellaOps.Replay.Core` |
-| 12 | RPL-012 | TODO | RPL-011 | Replay Guild | Implement `ToCompactString()` with canonical JSON + SHA-256 |
-| 13 | RPL-013 | TODO | RPL-012 | Replay Guild | Update `stella verify --bundle` to output replay proof |
-| 14 | RPL-014 | TODO | RPL-013 | Replay Guild | Unit tests: Replay proof generation and parsing |
+| 11 | RPL-011 | DONE | - | Replay Guild | Create `ReplayProof` model in `StellaOps.Replay.Core` |
+| 12 | RPL-012 | DONE | RPL-011 | Replay Guild | Implement `ToCompactString()` with canonical JSON + SHA-256 |
+| 13 | RPL-013 | DONE | RPL-012 | Replay Guild | Update `stella verify --bundle` to output replay proof |
+| 14 | RPL-014 | DONE | RPL-013 | Replay Guild | Unit tests: Replay proof generation and parsing |
| **stella prove Command** |
| 15 | RPL-015 | TODO | RPL-011 | CLI Guild | Create `ProveCommandGroup.cs` with command structure |
| 16 | RPL-016 | TODO | RPL-015 | CLI Guild | Implement `ITimelineQueryService` adapter for snapshot lookup |
@@ -506,6 +506,8 @@ internal static class ProveCommandGroup
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
+| 2026-01-xx | Completed RPL-006 through RPL-010: IDsseVerifier interface, DsseVerifier implementation with ECDSA/RSA support, CLI integration, 12 unit tests all passing | Implementer |
+| 2026-01-xx | Completed RPL-011 through RPL-014: ReplayProof model, ToCompactString with SHA-256, ToCanonicalJson, FromExecutionResult factory, 14 unit tests all passing | Implementer |
---


@@ -289,28 +289,28 @@ public sealed class BatchSnapshotService
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
-| 1 | SQC-001 | TODO | HLC lib | Guild | Add StellaOps.HybridLogicalClock reference to Scheduler projects |
-| 2 | SQC-002 | TODO | SQC-001 | Guild | Create migration: `scheduler.scheduler_log` table |
-| 3 | SQC-003 | TODO | SQC-002 | Guild | Create migration: `scheduler.batch_snapshot` table |
-| 4 | SQC-004 | TODO | SQC-002 | Guild | Create migration: `scheduler.chain_heads` table |
-| 5 | SQC-005 | TODO | SQC-004 | Guild | Implement `ISchedulerLogRepository` interface |
-| 6 | SQC-006 | TODO | SQC-005 | Guild | Implement `PostgresSchedulerLogRepository` |
-| 7 | SQC-007 | TODO | SQC-004 | Guild | Implement `IChainHeadRepository` and Postgres implementation |
-| 8 | SQC-008 | TODO | SQC-006 | Guild | Implement `SchedulerChainLinking` static class |
-| 9 | SQC-009 | TODO | SQC-008 | Guild | Implement `HlcSchedulerEnqueueService` |
-| 10 | SQC-010 | TODO | SQC-009 | Guild | Implement `HlcSchedulerDequeueService` |
-| 11 | SQC-011 | TODO | SQC-010 | Guild | Update Redis queue adapter to include HLC in message |
-| 12 | SQC-012 | TODO | SQC-010 | Guild | Update NATS queue adapter to include HLC in message |
-| 13 | SQC-013 | TODO | SQC-006 | Guild | Implement `BatchSnapshotService` |
-| 14 | SQC-014 | TODO | SQC-013 | Guild | Add DSSE signing integration for batch snapshots |
-| 15 | SQC-015 | TODO | SQC-008 | Guild | Implement chain verification: `VerifyChainIntegrity()` |
-| 16 | SQC-016 | TODO | SQC-015 | Guild | Write unit tests: chain linking, HLC ordering |
-| 17 | SQC-017 | TODO | SQC-016 | Guild | Write integration tests: enqueue/dequeue with chain |
-| 18 | SQC-018 | TODO | SQC-017 | Guild | Write determinism tests: same input -> same chain |
-| 19 | SQC-019 | TODO | SQC-018 | Guild | Update existing JobRepository to use HLC ordering optionally |
-| 20 | SQC-020 | TODO | SQC-019 | Guild | Feature flag: `SchedulerOptions.EnableHlcOrdering` |
-| 21 | SQC-021 | TODO | SQC-020 | Guild | Migration guide: enabling HLC on existing deployments |
-| 22 | SQC-022 | TODO | SQC-021 | Guild | Metrics: `scheduler_hlc_enqueues_total`, `scheduler_chain_verifications_total` |
+| 1 | SQC-001 | DONE | HLC lib | Guild | Add StellaOps.HybridLogicalClock reference to Scheduler projects |
+| 2 | SQC-002 | DONE | SQC-001 | Guild | Create migration: `scheduler.scheduler_log` table |
+| 3 | SQC-003 | DONE | SQC-002 | Guild | Create migration: `scheduler.batch_snapshot` table |
+| 4 | SQC-004 | DONE | SQC-002 | Guild | Create migration: `scheduler.chain_heads` table |
+| 5 | SQC-005 | DONE | SQC-004 | Guild | Implement `ISchedulerLogRepository` interface |
+| 6 | SQC-006 | DONE | SQC-005 | Guild | Implement `PostgresSchedulerLogRepository` |
+| 7 | SQC-007 | DONE | SQC-004 | Guild | Implement `IChainHeadRepository` and Postgres implementation |
+| 8 | SQC-008 | DONE | SQC-006 | Guild | Implement `SchedulerChainLinking` static class |
+| 9 | SQC-009 | DONE | SQC-008 | Guild | Implement `HlcSchedulerEnqueueService` |
+| 10 | SQC-010 | DONE | SQC-009 | Guild | Implement `HlcSchedulerDequeueService` |
+| 11 | SQC-011 | DONE | SQC-010 | Guild | Update Redis queue adapter to include HLC in message |
+| 12 | SQC-012 | DONE | SQC-010 | Guild | Update NATS queue adapter to include HLC in message |
+| 13 | SQC-013 | DONE | SQC-006 | Guild | Implement `BatchSnapshotService` |
+| 14 | SQC-014 | DONE | SQC-013 | Guild | Add DSSE signing integration for batch snapshots |
+| 15 | SQC-015 | DONE | SQC-008 | Guild | Implement chain verification: `VerifyChainIntegrity()` |
+| 16 | SQC-016 | DONE | SQC-015 | Guild | Write unit tests: chain linking, HLC ordering |
+| 17 | SQC-017 | DONE | SQC-016 | Guild | Write integration tests: enqueue/dequeue with chain |
+| 18 | SQC-018 | DONE | SQC-017 | Guild | Write determinism tests: same input -> same chain |
+| 19 | SQC-019 | DONE | SQC-018 | Guild | Update existing JobRepository to use HLC ordering optionally |
+| 20 | SQC-020 | DONE | SQC-019 | Guild | Feature flag: `SchedulerOptions.EnableHlcOrdering` |
+| 21 | SQC-021 | DONE | SQC-020 | Guild | Migration guide: enabling HLC on existing deployments |
+| 22 | SQC-022 | DONE | SQC-021 | Guild | Metrics: `scheduler_hlc_enqueues_total`, `scheduler_chain_verifications_total` |
## Chain Verification
@@ -419,6 +419,20 @@ public sealed class SchedulerOptions
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
+| 2026-01-06 | SQC-001: Added HLC and CanonicalJson references to Scheduler.Persistence and Scheduler.Queue projects | Agent |
+| 2026-01-06 | SQC-002-004: Created migration 002_hlc_queue_chain.sql with scheduler_log, batch_snapshot, chain_heads tables | Agent |
+| 2026-01-06 | SQC-005-008: Implemented SchedulerChainLinking, ISchedulerLogRepository, PostgresSchedulerLogRepository, IChainHeadRepository, PostgresChainHeadRepository | Agent |
+| 2026-01-06 | SQC-009: Implemented HlcSchedulerEnqueueService with chain linking and idempotency | Agent |
+| 2026-01-06 | SQC-010: Implemented HlcSchedulerDequeueService with HLC-ordered retrieval and cursor pagination | Agent |
+| 2026-01-06 | SQC-013: Implemented BatchSnapshotService with audit anchoring and optional DSSE signing | Agent |
+| 2026-01-06 | SQC-015: Implemented SchedulerChainVerifier for chain integrity verification | Agent |
+| 2026-01-06 | SQC-020: Added SchedulerHlcOptions with EnableHlcOrdering, DualWriteMode, VerifyOnDequeue flags | Agent |
+| 2026-01-06 | SQC-022: Implemented HlcSchedulerMetrics with enqueue, dequeue, verification, and snapshot metrics | Agent |
+| 2026-01-06 | Added HlcSchedulerServiceCollectionExtensions for DI registration | Agent |
+| 2026-01-06 | SQC-011-012: Verified Redis and NATS adapters already have HLC support (IHybridLogicalClock injection, Tick(), header storage) | Agent |
+| 2026-01-06 | SQC-021: Created HLC migration guide at docs/modules/scheduler/hlc-migration-guide.md | Agent |
+| 2026-01-06 | SQC-014: Implemented BatchSnapshotDsseSigner with HMAC-SHA256 signing, PAE encoding, and verification | Agent |
+| 2026-01-06 | SQC-019: Updated JobRepository with optional HLC ordering via JobRepositoryOptions; GetScheduledJobsAsync and GetByStatusAsync now join with scheduler_log when enabled | Agent |
## Next Checkpoints


@@ -632,17 +632,17 @@ public sealed class FacetDriftVexEmitter
| # | Task ID | Status | Dependency | Owners | Task Definition |
|---|---------|--------|------------|--------|-----------------|
| **Drift Engine** |
-| 1 | QTA-001 | TODO | FCT models | Facet Guild | Define `IFacetDriftEngine` interface |
-| 2 | QTA-002 | TODO | QTA-001 | Facet Guild | Define `FacetDriftReport` model |
-| 3 | QTA-003 | TODO | QTA-002 | Facet Guild | Implement file diff computation (added/removed/modified) |
-| 4 | QTA-004 | TODO | QTA-003 | Facet Guild | Implement allowlist glob filtering |
-| 5 | QTA-005 | TODO | QTA-004 | Facet Guild | Implement drift score calculation |
-| 6 | QTA-006 | TODO | QTA-005 | Facet Guild | Implement quota evaluation logic |
-| 7 | QTA-007 | TODO | QTA-006 | Facet Guild | Unit tests: Drift computation with fixtures |
-| 8 | QTA-008 | TODO | QTA-007 | Facet Guild | Unit tests: Quota evaluation edge cases |
+| 1 | QTA-001 | DONE | FCT models | Facet Guild | Define `IFacetDriftEngine` interface |
+| 2 | QTA-002 | DONE | QTA-001 | Facet Guild | Define `FacetDriftReport` model |
+| 3 | QTA-003 | DONE | QTA-002 | Facet Guild | Implement file diff computation (added/removed/modified) |
+| 4 | QTA-004 | DONE | QTA-003 | Facet Guild | Implement allowlist glob filtering |
+| 5 | QTA-005 | DONE | QTA-004 | Facet Guild | Implement drift score calculation |
+| 6 | QTA-006 | DONE | QTA-005 | Facet Guild | Implement quota evaluation logic |
+| 7 | QTA-007 | DONE | QTA-006 | Facet Guild | Unit tests: Drift computation with fixtures |
+| 8 | QTA-008 | DONE | QTA-007 | Facet Guild | Unit tests: Quota evaluation edge cases |
| **Quota Enforcement** |
-| 9 | QTA-009 | TODO | QTA-006 | Policy Guild | Create `FacetQuotaGate` class |
-| 10 | QTA-010 | TODO | QTA-009 | Policy Guild | Integrate with `IGateEvaluator` pipeline |
+| 9 | QTA-009 | DONE | QTA-006 | Policy Guild | Create `FacetQuotaGate` class |
+| 10 | QTA-010 | DONE | QTA-009 | Policy Guild | Integrate with `IGateEvaluator` pipeline |
| 11 | QTA-011 | TODO | QTA-010 | Policy Guild | Add `FacetQuotaEnabled` to policy options |
| 12 | QTA-012 | TODO | QTA-011 | Policy Guild | Create `IFacetSealStore` for baseline lookups |
| 13 | QTA-013 | TODO | QTA-012 | Policy Guild | Implement Postgres storage for facet seals |
@@ -678,6 +678,10 @@ public sealed class FacetDriftVexEmitter
| Date (UTC) | Update | Owner |
|------------|--------|-------|
+| 2026-01-06 | QTA-001 to QTA-006 already implemented in FacetDriftDetector.cs | Agent |
+| 2026-01-06 | QTA-007/008: Created StellaOps.Facet.Tests with 18 passing tests | Agent |
+| 2026-01-06 | QTA-009: Created FacetQuotaGate in StellaOps.Policy.Gates | Agent |
+| 2026-01-06 | QTA-010: Created FacetQuotaGateServiceCollectionExtensions for DI/registry integration | Agent |
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
---


@@ -337,27 +337,27 @@ public sealed class ConflictResolver
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
-| 1 | OMP-001 | TODO | SQC lib | Guild | Create `StellaOps.AirGap.Sync` library project |
-| 2 | OMP-002 | TODO | OMP-001 | Guild | Implement `OfflineHlcManager` for local offline enqueue |
-| 3 | OMP-003 | TODO | OMP-002 | Guild | Implement `IOfflineJobLogStore` and file-based store |
-| 4 | OMP-004 | TODO | OMP-003 | Guild | Implement `HlcMergeService` with total order merge |
-| 5 | OMP-005 | TODO | OMP-004 | Guild | Implement `ConflictResolver` for edge cases |
-| 6 | OMP-006 | TODO | OMP-005 | Guild | Implement `AirGapSyncService` for bundle import |
-| 7 | OMP-007 | TODO | OMP-006 | Guild | Define `AirGapBundle` format (JSON schema) |
-| 8 | OMP-008 | TODO | OMP-007 | Guild | Implement bundle export: `AirGapBundleExporter` |
-| 9 | OMP-009 | TODO | OMP-008 | Guild | Implement bundle import: `AirGapBundleImporter` |
-| 10 | OMP-010 | TODO | OMP-009 | Guild | Add DSSE signing for bundle integrity |
-| 11 | OMP-011 | TODO | OMP-006 | Guild | Integrate with Router transport layer |
-| 12 | OMP-012 | TODO | OMP-011 | Guild | Update `stella airgap export` CLI command |
-| 13 | OMP-013 | TODO | OMP-012 | Guild | Update `stella airgap import` CLI command |
+| 1 | OMP-001 | DONE | SQC lib | Guild | Create `StellaOps.AirGap.Sync` library project |
+| 2 | OMP-002 | DONE | OMP-001 | Guild | Implement `OfflineHlcManager` for local offline enqueue |
+| 3 | OMP-003 | DONE | OMP-002 | Guild | Implement `IOfflineJobLogStore` and file-based store |
+| 4 | OMP-004 | DONE | OMP-003 | Guild | Implement `HlcMergeService` with total order merge |
+| 5 | OMP-005 | DONE | OMP-004 | Guild | Implement `ConflictResolver` for edge cases |
+| 6 | OMP-006 | DONE | OMP-005 | Guild | Implement `AirGapSyncService` for bundle import |
+| 7 | OMP-007 | DONE | OMP-006 | Guild | Define `AirGapBundle` format (JSON schema) |
+| 8 | OMP-008 | DONE | OMP-007 | Guild | Implement bundle export: `AirGapBundleExporter` |
+| 9 | OMP-009 | DONE | OMP-008 | Guild | Implement bundle import: `AirGapBundleImporter` |
+| 10 | OMP-010 | DONE | OMP-009 | Guild | Add DSSE signing for bundle integrity |
+| 11 | OMP-011 | DONE | OMP-006 | Guild | Integrate with Router transport layer |
+| 12 | OMP-012 | DONE | OMP-011 | Guild | Update `stella airgap export` CLI command |
+| 13 | OMP-013 | DONE | OMP-012 | Guild | Update `stella airgap import` CLI command |
| 14 | OMP-014 | TODO | OMP-004 | Guild | Write unit tests: merge algorithm correctness |
| 15 | OMP-015 | TODO | OMP-014 | Guild | Write unit tests: duplicate detection |
| 16 | OMP-016 | TODO | OMP-015 | Guild | Write unit tests: conflict resolution |
| 17 | OMP-017 | TODO | OMP-016 | Guild | Write integration tests: offline -> online sync |
| 18 | OMP-018 | TODO | OMP-017 | Guild | Write integration tests: multi-node merge |
| 19 | OMP-019 | TODO | OMP-018 | Guild | Write determinism tests: same bundles -> same result |
-| 20 | OMP-020 | TODO | OMP-019 | Guild | Metrics: `airgap_sync_total`, `airgap_merge_conflicts_total` |
-| 21 | OMP-021 | TODO | OMP-020 | Guild | Documentation: offline operations guide |
+| 20 | OMP-020 | DONE | OMP-019 | Guild | Metrics: `airgap_sync_total`, `airgap_merge_conflicts_total` |
+| 21 | OMP-021 | DONE | OMP-020 | Guild | Documentation: offline operations guide |
## Test Scenarios
@@ -436,6 +436,16 @@ airgap_last_sync_timestamp{node_id}
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-05 | Sprint created from product advisory gap analysis | Planning |
+| 2026-01-06 | OMP-001: Created StellaOps.AirGap.Sync library project with HLC, Canonical.Json, Scheduler.Models dependencies | Agent |
+| 2026-01-06 | OMP-002-003: Implemented OfflineHlcManager and FileBasedOfflineJobLogStore for offline enqueue | Agent |
+| 2026-01-06 | OMP-004-005: Implemented HlcMergeService with total order merge and ConflictResolver | Agent |
+| 2026-01-06 | OMP-006: Implemented AirGapSyncService for bundle import with idempotency and chain recomputation | Agent |
+| 2026-01-06 | OMP-007-009: Defined AirGapBundle models and implemented AirGapBundleExporter/Importer with validation | Agent |
+| 2026-01-06 | OMP-010: Added manifest digest computation for bundle integrity (DSSE signing prepared via delegate) | Agent |
+| 2026-01-06 | OMP-020: Implemented AirGapSyncMetrics with counters for exports, imports, syncs, duplicates, conflicts | Agent |
+| 2026-01-06 | OMP-011: Created IJobSyncTransport, FileBasedJobSyncTransport, RouterJobSyncTransport for transport abstraction | Agent |
+| 2026-01-06 | OMP-012-013: Added `stella airgap jobs export/import/list` CLI commands with handlers | Agent |
+| 2026-01-06 | OMP-021: Created docs/airgap/job-sync-offline.md with CLI usage, bundle format, and runbook | Agent |
## Next Checkpoints


@@ -0,0 +1,775 @@
# Sprint 20260106_001_001_LB - Determinization: Core Models and Types
## Topic & Scope
Create the foundational models and types for the Determinization subsystem. This implements the core data structures from the advisory: `pending_determinization` state, `SignalState<T>` wrapper, `UncertaintyScore`, and `ObservationDecay`.
- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Determinization/`
- **Evidence:** New library project, model classes, unit tests
## Problem Statement
Current state tracking for CVEs:
- VEX has 4 states (`Affected`, `NotAffected`, `Fixed`, `UnderInvestigation`)
- Unknowns tracked separately via `Unknown` entity in Policy.Unknowns
- No unified "observation state" for CVE lifecycle
- Signal absence (EPSS null) indistinguishable from "not queried"
Advisory requires:
- `pending_determinization` as first-class observation state
- `SignalState<T>` distinguishing `NotQueried` vs `Queried(null)` vs `Queried(value)`
- `UncertaintyScore` measuring knowledge completeness (not code entropy)
- `ObservationDecay` tracking evidence staleness with configurable half-life
## Dependencies & Concurrency
- **Depends on:** None (foundational library)
- **Blocks:** SPRINT_20260106_001_002_LB (scoring), SPRINT_20260106_001_003_POLICY (gates)
- **Parallel safe:** New library; no cross-module conflicts
## Documentation Prerequisites
- docs/modules/policy/determinization-architecture.md
- src/Policy/AGENTS.md
- Product Advisory: "Unknown CVEs: graceful placeholders, not blockers"
## Technical Design
### Project Structure
```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── StellaOps.Policy.Determinization.csproj
├── Models/
│ ├── ObservationState.cs
│ ├── SignalState.cs
│ ├── SignalQueryStatus.cs
│ ├── SignalSnapshot.cs
│ ├── UncertaintyScore.cs
│ ├── UncertaintyTier.cs
│ ├── SignalGap.cs
│ ├── ObservationDecay.cs
│ ├── GuardRails.cs
│ ├── DeterminizationContext.cs
│ └── DeterminizationResult.cs
├── Evidence/
│ ├── EpssEvidence.cs # Re-export or reference Scanner.Core
│ ├── VexClaimSummary.cs
│ ├── ReachabilityEvidence.cs
│ ├── RuntimeEvidence.cs
│ ├── BackportEvidence.cs
│ ├── SbomLineageEvidence.cs
│ └── CvssEvidence.cs
└── GlobalUsings.cs
```
### ObservationState Enum
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Observation state for CVE tracking, independent of VEX status.
/// Allows a CVE to be "Affected" (VEX) but "PendingDeterminization" (observation).
/// </summary>
public enum ObservationState
{
/// <summary>
/// Initial state: CVE discovered but evidence incomplete.
/// Triggers guardrail-based policy evaluation.
/// </summary>
PendingDeterminization = 0,
/// <summary>
/// Evidence sufficient for confident determination.
/// Normal policy evaluation applies.
/// </summary>
Determined = 1,
/// <summary>
/// Multiple signals conflict (K4 Conflict state).
/// Requires human review regardless of confidence.
/// </summary>
Disputed = 2,
/// <summary>
/// Evidence decayed below threshold; needs refresh.
/// Auto-triggered when decay > threshold.
/// </summary>
StaleRequiresRefresh = 3,
/// <summary>
/// Manually flagged for review.
/// Bypasses automatic determinization.
/// </summary>
ManualReviewRequired = 4,
/// <summary>
/// CVE suppressed/ignored by policy exception.
/// Evidence tracking continues but decisions skip.
/// </summary>
Suppressed = 5
}
```
### SignalState<T> Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Wraps a signal value with query status metadata.
/// Distinguishes between: not queried, queried with value, queried but absent, query failed.
/// </summary>
/// <typeparam name="T">The signal evidence type.</typeparam>
public sealed record SignalState<T>
{
/// <summary>Status of the signal query.</summary>
public required SignalQueryStatus Status { get; init; }
/// <summary>Signal value if Status is Queried and value exists.</summary>
public T? Value { get; init; }
/// <summary>When the signal was last queried (UTC).</summary>
public DateTimeOffset? QueriedAt { get; init; }
/// <summary>Reason for failure if Status is Failed.</summary>
public string? FailureReason { get; init; }
/// <summary>Source that provided the value (feed ID, issuer, etc.).</summary>
public string? Source { get; init; }
/// <summary>Whether this signal contributes to uncertainty (true if not queried or failed).</summary>
public bool ContributesToUncertainty =>
Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;
/// <summary>Whether this signal has a usable value.</summary>
public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;
/// <summary>Creates a NotQueried signal state.</summary>
public static SignalState<T> NotQueried() => new()
{
Status = SignalQueryStatus.NotQueried
};
/// <summary>Creates a Queried signal state with a value.</summary>
public static SignalState<T> WithValue(T value, DateTimeOffset queriedAt, string? source = null) => new()
{
Status = SignalQueryStatus.Queried,
Value = value,
QueriedAt = queriedAt,
Source = source
};
/// <summary>Creates a Queried signal state with null (queried but absent).</summary>
public static SignalState<T> Absent(DateTimeOffset queriedAt, string? source = null) => new()
{
Status = SignalQueryStatus.Queried,
Value = default,
QueriedAt = queriedAt,
Source = source
};
/// <summary>Creates a Failed signal state.</summary>
public static SignalState<T> Failed(string reason) => new()
{
Status = SignalQueryStatus.Failed,
FailureReason = reason
};
}
/// <summary>
/// Query status for a signal source.
/// </summary>
public enum SignalQueryStatus
{
/// <summary>Signal source not yet queried.</summary>
NotQueried = 0,
/// <summary>Signal source queried; value may be present or absent.</summary>
Queried = 1,
/// <summary>Signal query failed (timeout, network, parse error).</summary>
Failed = 2
}
```
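A short usage sketch (variable names and the `first.org` source label are hypothetical) shows the distinction the wrapper exists to preserve:
```csharp
// Hypothetical usage: the three outcomes a policy consumer must tell apart.
var now = DateTimeOffset.UtcNow;

var notQueried = SignalState<EpssEvidence>.NotQueried();          // never asked
var absent = SignalState<EpssEvidence>.Absent(now, "first.org");  // asked; no entry exists
var failed = SignalState<EpssEvidence>.Failed("feed timeout");    // asked; answer unknown

// notQueried.ContributesToUncertainty == true
// failed.ContributesToUncertainty     == true
// absent.ContributesToUncertainty     == false (confirmed absence is evidence)
```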
### SignalSnapshot Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Immutable snapshot of all signals for a CVE observation at a point in time.
/// </summary>
public sealed record SignalSnapshot
{
/// <summary>CVE identifier (e.g., CVE-2026-12345).</summary>
public required string CveId { get; init; }
/// <summary>Subject component (PURL).</summary>
public required string SubjectPurl { get; init; }
/// <summary>Snapshot capture time (UTC).</summary>
public required DateTimeOffset CapturedAt { get; init; }
/// <summary>EPSS score signal.</summary>
public required SignalState<EpssEvidence> Epss { get; init; }
/// <summary>VEX claim signal.</summary>
public required SignalState<VexClaimSummary> Vex { get; init; }
/// <summary>Reachability determination signal.</summary>
public required SignalState<ReachabilityEvidence> Reachability { get; init; }
/// <summary>Runtime observation signal (eBPF, dyld, ETW).</summary>
public required SignalState<RuntimeEvidence> Runtime { get; init; }
/// <summary>Fix backport detection signal.</summary>
public required SignalState<BackportEvidence> Backport { get; init; }
/// <summary>SBOM lineage signal.</summary>
public required SignalState<SbomLineageEvidence> SbomLineage { get; init; }
/// <summary>Known Exploited Vulnerability flag.</summary>
public required SignalState<bool> Kev { get; init; }
/// <summary>CVSS score signal.</summary>
public required SignalState<CvssEvidence> Cvss { get; init; }
/// <summary>
/// Creates an empty snapshot with all signals in NotQueried state.
/// </summary>
public static SignalSnapshot Empty(string cveId, string subjectPurl, DateTimeOffset capturedAt) => new()
{
CveId = cveId,
SubjectPurl = subjectPurl,
CapturedAt = capturedAt,
Epss = SignalState<EpssEvidence>.NotQueried(),
Vex = SignalState<VexClaimSummary>.NotQueried(),
Reachability = SignalState<ReachabilityEvidence>.NotQueried(),
Runtime = SignalState<RuntimeEvidence>.NotQueried(),
Backport = SignalState<BackportEvidence>.NotQueried(),
SbomLineage = SignalState<SbomLineageEvidence>.NotQueried(),
Kev = SignalState<bool>.NotQueried(),
Cvss = SignalState<CvssEvidence>.NotQueried()
};
}
```
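Typical construction is expected to start from `Empty` and fill in whatever the enrichers returned; the PURL, `epss` value, and source labels below are hypothetical:
```csharp
// Hypothetical: record what the EPSS and VEX lookups actually returned.
var snapshot = SignalSnapshot.Empty("CVE-2026-12345", "pkg:npm/example@1.2.3", now) with
{
    Epss = SignalState<EpssEvidence>.WithValue(epss, now, "first.org"),
    Vex = SignalState<VexClaimSummary>.Absent(now, "vendor-vex") // queried; no claim published
};
```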
### UncertaintyScore Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Measures knowledge completeness for a CVE observation.
/// High entropy (close to 1.0) means many signals are missing.
/// Low entropy (close to 0.0) means comprehensive evidence.
/// </summary>
public sealed record UncertaintyScore
{
/// <summary>Entropy value [0.0-1.0]. Higher = more uncertain.</summary>
public required double Entropy { get; init; }
/// <summary>Completeness value [0.0-1.0]. Higher = more complete. (1 - Entropy)</summary>
public double Completeness => 1.0 - Entropy;
/// <summary>Signals that are missing or failed.</summary>
public required ImmutableArray<SignalGap> MissingSignals { get; init; }
/// <summary>Weighted sum of present signals.</summary>
public required double WeightedEvidenceSum { get; init; }
/// <summary>Maximum possible weighted sum (all signals present).</summary>
public required double MaxPossibleWeight { get; init; }
/// <summary>Tier classification based on entropy.</summary>
public UncertaintyTier Tier => Entropy switch
{
<= 0.2 => UncertaintyTier.VeryLow,
<= 0.4 => UncertaintyTier.Low,
<= 0.6 => UncertaintyTier.Medium,
<= 0.8 => UncertaintyTier.High,
_ => UncertaintyTier.VeryHigh
};
/// <summary>
/// Creates a fully certain score (all evidence present).
/// </summary>
public static UncertaintyScore FullyCertain(double maxWeight) => new()
{
Entropy = 0.0,
MissingSignals = ImmutableArray<SignalGap>.Empty,
WeightedEvidenceSum = maxWeight,
MaxPossibleWeight = maxWeight
};
/// <summary>
/// Creates a fully uncertain score (no evidence).
/// </summary>
public static UncertaintyScore FullyUncertain(double maxWeight, ImmutableArray<SignalGap> gaps) => new()
{
Entropy = 1.0,
MissingSignals = gaps,
WeightedEvidenceSum = 0.0,
MaxPossibleWeight = maxWeight
};
}
/// <summary>
/// Tier classification for uncertainty levels.
/// </summary>
public enum UncertaintyTier
{
/// <summary>Entropy &lt;= 0.2: Comprehensive evidence.</summary>
VeryLow = 0,
/// <summary>Entropy &lt;= 0.4: Good evidence coverage.</summary>
Low = 1,
/// <summary>Entropy &lt;= 0.6: Moderate gaps.</summary>
Medium = 2,
/// <summary>Entropy &lt;= 0.8: Significant gaps.</summary>
High = 3,
/// <summary>Entropy &gt; 0.8: Minimal evidence.</summary>
VeryHigh = 4
}
/// <summary>
/// Represents a missing or failed signal in uncertainty calculation.
/// </summary>
public sealed record SignalGap(
string SignalName,
double Weight,
SignalQueryStatus Status,
string? Reason);
```
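The factories cover only the extremes. A minimal sketch of the general computation, assuming entropy is one minus weighted signal coverage (signal enumeration and weights are left to the scoring sprint; `System.Linq` and `System.Collections.Immutable` assumed via global usings):
```csharp
// Hypothetical: entropy = 1 - (weighted evidence present / max possible weight).
internal static class UncertaintySketch
{
    public static UncertaintyScore Compute(
        IReadOnlyList<(string Name, double Weight, SignalQueryStatus Status, string? Reason)> signals)
    {
        var maxWeight = signals.Sum(s => s.Weight);
        var presentWeight = signals
            .Where(s => s.Status == SignalQueryStatus.Queried)
            .Sum(s => s.Weight);
        var gaps = signals
            .Where(s => s.Status != SignalQueryStatus.Queried)
            .Select(s => new SignalGap(s.Name, s.Weight, s.Status, s.Reason))
            .ToImmutableArray();

        return new UncertaintyScore
        {
            Entropy = maxWeight == 0 ? 1.0 : 1.0 - (presentWeight / maxWeight),
            MissingSignals = gaps,
            WeightedEvidenceSum = presentWeight,
            MaxPossibleWeight = maxWeight
        };
    }
}
```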
### ObservationDecay Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Tracks evidence freshness decay for a CVE observation.
/// </summary>
public sealed record ObservationDecay
{
/// <summary>Half-life for confidence decay. Default: 14 days per advisory.</summary>
public required TimeSpan HalfLife { get; init; }
/// <summary>Minimum confidence floor (never decays below). Default: 0.35.</summary>
public required double Floor { get; init; }
/// <summary>Last time any signal was updated (UTC).</summary>
public required DateTimeOffset LastSignalUpdate { get; init; }
/// <summary>Current decayed confidence multiplier [Floor-1.0].</summary>
public required double DecayedMultiplier { get; init; }
/// <summary>When next auto-review is scheduled (UTC).</summary>
public DateTimeOffset? NextReviewAt { get; init; }
/// <summary>Whether decay has triggered stale state.</summary>
public bool IsStale { get; init; }
/// <summary>Age of the evidence in days.</summary>
public double AgeDays { get; init; }
/// <summary>
/// Creates a fresh observation (no decay applied).
/// </summary>
public static ObservationDecay Fresh(DateTimeOffset lastUpdate, TimeSpan halfLife, double floor = 0.35) => new()
{
HalfLife = halfLife,
Floor = floor,
LastSignalUpdate = lastUpdate,
DecayedMultiplier = 1.0,
NextReviewAt = lastUpdate.Add(halfLife),
IsStale = false,
AgeDays = 0
};
/// <summary>Default half-life: 14 days per advisory recommendation.</summary>
public static readonly TimeSpan DefaultHalfLife = TimeSpan.FromDays(14);
/// <summary>Default floor: 0.35 per existing FreshnessCalculator.</summary>
public const double DefaultFloor = 0.35;
}
```
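For orientation, a small sketch of the intended decay semantics, assuming the advisory formula `max(floor, exp(-ln(2) * age / half_life))` that the scoring sprint implements:
```csharp
var updated = DateTimeOffset.Parse("2026-01-01T00:00:00Z");
var fresh = ObservationDecay.Fresh(updated, ObservationDecay.DefaultHalfLife);
Debug.Assert(fresh.NextReviewAt == updated.AddDays(14)); // review lands one half-life out
// Expected multiplier at age t (days): max(Floor, exp(-ln(2) * t / halfLife)).
double At(double ageDays) =>
    Math.Max(fresh.Floor, Math.Exp(-Math.Log(2) * ageDays / fresh.HalfLife.TotalDays));
Debug.Assert(Math.Abs(At(14) - 0.5) < 1e-9); // one half-life halves confidence
Debug.Assert(At(60) == fresh.Floor);         // long tail clamps at the 0.35 floor
```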
### GuardRails Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Guardrails applied when allowing uncertain observations.
/// </summary>
public sealed record GuardRails
{
/// <summary>Enable runtime monitoring for this observation.</summary>
public required bool EnableRuntimeMonitoring { get; init; }
/// <summary>Interval for automatic re-review.</summary>
public required TimeSpan ReviewInterval { get; init; }
/// <summary>EPSS threshold that triggers automatic escalation.</summary>
public required double EpssEscalationThreshold { get; init; }
/// <summary>Reachability status that triggers escalation.</summary>
public required ImmutableArray<string> EscalatingReachabilityStates { get; init; }
/// <summary>Maximum time in guarded state before forced review.</summary>
public required TimeSpan MaxGuardedDuration { get; init; }
/// <summary>Alert channels for this observation.</summary>
public ImmutableArray<string> AlertChannels { get; init; } = ImmutableArray<string>.Empty;
/// <summary>Additional context for audit trail.</summary>
public string? PolicyRationale { get; init; }
/// <summary>
/// Creates default guardrails per advisory recommendation.
/// </summary>
public static GuardRails Default() => new()
{
EnableRuntimeMonitoring = true,
ReviewInterval = TimeSpan.FromDays(7),
EpssEscalationThreshold = 0.4,
EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
MaxGuardedDuration = TimeSpan.FromDays(30)
};
}
```
### DeterminizationContext Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Context for determinization policy evaluation.
/// </summary>
public sealed record DeterminizationContext
{
/// <summary>Point-in-time signal snapshot.</summary>
public required SignalSnapshot SignalSnapshot { get; init; }
/// <summary>Calculated uncertainty score.</summary>
public required UncertaintyScore UncertaintyScore { get; init; }
/// <summary>Evidence decay information.</summary>
public required ObservationDecay Decay { get; init; }
/// <summary>Aggregated trust score [0.0-1.0].</summary>
public required double TrustScore { get; init; }
/// <summary>Deployment environment (Production, Staging, Development).</summary>
public required DeploymentEnvironment Environment { get; init; }
/// <summary>Asset criticality tier (optional).</summary>
public AssetCriticality? AssetCriticality { get; init; }
/// <summary>Existing observation state (for transition decisions).</summary>
public ObservationState? CurrentState { get; init; }
/// <summary>Policy evaluation options.</summary>
public DeterminizationOptions? Options { get; init; }
}
/// <summary>
/// Deployment environment classification.
/// </summary>
public enum DeploymentEnvironment
{
Development = 0,
Staging = 1,
Production = 2
}
/// <summary>
/// Asset criticality classification.
/// </summary>
public enum AssetCriticality
{
Low = 0,
Medium = 1,
High = 2,
Critical = 3
}
```
### DeterminizationResult Record
```csharp
namespace StellaOps.Policy.Determinization.Models;
/// <summary>
/// Result of determinization policy evaluation.
/// </summary>
public sealed record DeterminizationResult
{
/// <summary>Policy verdict status.</summary>
public required PolicyVerdictStatus Status { get; init; }
/// <summary>Human-readable reason for the decision.</summary>
public required string Reason { get; init; }
/// <summary>Guardrails to apply if Status is GuardedPass.</summary>
public GuardRails? GuardRails { get; init; }
/// <summary>Suggested new observation state.</summary>
public ObservationState? SuggestedState { get; init; }
/// <summary>Rule that matched (for audit).</summary>
public string? MatchedRule { get; init; }
/// <summary>Additional metadata for audit trail.</summary>
public ImmutableDictionary<string, object>? Metadata { get; init; }
public static DeterminizationResult Allowed(string reason, PolicyVerdictStatus status = PolicyVerdictStatus.Pass) =>
new() { Status = status, Reason = reason, SuggestedState = ObservationState.Determined };
public static DeterminizationResult GuardedAllow(string reason, PolicyVerdictStatus status, GuardRails guardrails) =>
new() { Status = status, Reason = reason, GuardRails = guardrails, SuggestedState = ObservationState.PendingDeterminization };
public static DeterminizationResult Quarantined(string reason, PolicyVerdictStatus status) =>
new() { Status = status, Reason = reason, SuggestedState = ObservationState.ManualReviewRequired };
public static DeterminizationResult Escalated(string reason, PolicyVerdictStatus status) =>
new() { Status = status, Reason = reason, SuggestedState = ObservationState.ManualReviewRequired };
public static DeterminizationResult Deferred(string reason, PolicyVerdictStatus status) =>
new() { Status = status, Reason = reason, SuggestedState = ObservationState.StaleRequiresRefresh };
}
```
### Evidence Models
```csharp
namespace StellaOps.Policy.Determinization.Evidence;
/// <summary>
/// EPSS evidence for a CVE.
/// </summary>
public sealed record EpssEvidence
{
/// <summary>EPSS score [0.0-1.0].</summary>
public required double Score { get; init; }
/// <summary>EPSS percentile [0.0-1.0].</summary>
public required double Percentile { get; init; }
/// <summary>EPSS model date.</summary>
public required DateOnly ModelDate { get; init; }
}
/// <summary>
/// VEX claim summary for a CVE.
/// </summary>
public sealed record VexClaimSummary
{
/// <summary>VEX status.</summary>
public required string Status { get; init; }
/// <summary>Justification if not_affected.</summary>
public string? Justification { get; init; }
/// <summary>Issuer of the VEX statement.</summary>
public required string Issuer { get; init; }
/// <summary>Issuer trust level.</summary>
public required double IssuerTrust { get; init; }
}
/// <summary>
/// Reachability evidence for a CVE.
/// </summary>
public sealed record ReachabilityEvidence
{
/// <summary>Reachability status.</summary>
public required ReachabilityStatus Status { get; init; }
/// <summary>Confidence in the determination [0.0-1.0].</summary>
public required double Confidence { get; init; }
/// <summary>Call path depth if reachable.</summary>
public int? PathDepth { get; init; }
}
public enum ReachabilityStatus
{
Unknown = 0,
Reachable = 1,
Unreachable = 2,
Gated = 3,
ObservedReachable = 4
}
/// <summary>
/// Runtime observation evidence.
/// </summary>
public sealed record RuntimeEvidence
{
/// <summary>Whether vulnerable code was observed loaded.</summary>
public required bool ObservedLoaded { get; init; }
/// <summary>Observation source (eBPF, dyld, ETW).</summary>
public required string Source { get; init; }
/// <summary>Observation window.</summary>
public required TimeSpan ObservationWindow { get; init; }
/// <summary>Sample count.</summary>
public required int SampleCount { get; init; }
}
/// <summary>
/// Fix backport detection evidence.
/// </summary>
public sealed record BackportEvidence
{
/// <summary>Whether a backport was detected.</summary>
public required bool BackportDetected { get; init; }
/// <summary>Confidence in detection [0.0-1.0].</summary>
public required double Confidence { get; init; }
/// <summary>Detection method.</summary>
public string? Method { get; init; }
}
/// <summary>
/// SBOM lineage evidence.
/// </summary>
public sealed record SbomLineageEvidence
{
/// <summary>Whether lineage is verified.</summary>
public required bool LineageVerified { get; init; }
/// <summary>SBOM quality score [0.0-1.0].</summary>
public required double QualityScore { get; init; }
/// <summary>Provenance attestation present.</summary>
public required bool HasProvenanceAttestation { get; init; }
}
/// <summary>
/// CVSS evidence for a CVE.
/// </summary>
public sealed record CvssEvidence
{
/// <summary>CVSS base score [0.0-10.0].</summary>
public required double BaseScore { get; init; }
/// <summary>CVSS version (2.0, 3.0, 3.1, 4.0).</summary>
public required string Version { get; init; }
/// <summary>CVSS vector string.</summary>
public string? Vector { get; init; }
/// <summary>Severity label.</summary>
public required string Severity { get; init; }
}
```
### Project File
```xml
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
<RootNamespace>StellaOps.Policy.Determinization</RootNamespace>
<AssemblyName>StellaOps.Policy.Determinization</AssemblyName>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="System.Collections.Immutable" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\StellaOps.Policy\StellaOps.Policy.csproj" />
</ItemGroup>
</Project>
```
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DCM-001 | TODO | - | Guild | Create `StellaOps.Policy.Determinization.csproj` project |
| 2 | DCM-002 | TODO | DCM-001 | Guild | Implement `ObservationState` enum |
| 3 | DCM-003 | TODO | DCM-001 | Guild | Implement `SignalQueryStatus` enum |
| 4 | DCM-004 | TODO | DCM-003 | Guild | Implement `SignalState<T>` record with factory methods |
| 5 | DCM-005 | TODO | DCM-004 | Guild | Implement `SignalGap` record |
| 6 | DCM-006 | TODO | DCM-005 | Guild | Implement `UncertaintyTier` enum |
| 7 | DCM-007 | TODO | DCM-006 | Guild | Implement `UncertaintyScore` record with factory methods |
| 8 | DCM-008 | TODO | DCM-001 | Guild | Implement `ObservationDecay` record with factory methods |
| 9 | DCM-009 | TODO | DCM-001 | Guild | Implement `GuardRails` record with defaults |
| 10 | DCM-010 | TODO | DCM-001 | Guild | Implement `DeploymentEnvironment` enum |
| 11 | DCM-011 | TODO | DCM-001 | Guild | Implement `AssetCriticality` enum |
| 12 | DCM-012 | TODO | DCM-011 | Guild | Implement `DeterminizationContext` record |
| 13 | DCM-013 | TODO | DCM-012 | Guild | Implement `DeterminizationResult` record with factory methods |
| 14 | DCM-014 | TODO | DCM-001 | Guild | Implement `EpssEvidence` record |
| 15 | DCM-015 | TODO | DCM-001 | Guild | Implement `VexClaimSummary` record |
| 16 | DCM-016 | TODO | DCM-001 | Guild | Implement `ReachabilityEvidence` record with status enum |
| 17 | DCM-017 | TODO | DCM-001 | Guild | Implement `RuntimeEvidence` record |
| 18 | DCM-018 | TODO | DCM-001 | Guild | Implement `BackportEvidence` record |
| 19 | DCM-019 | TODO | DCM-001 | Guild | Implement `SbomLineageEvidence` record |
| 20 | DCM-020 | TODO | DCM-001 | Guild | Implement `CvssEvidence` record |
| 21 | DCM-021 | TODO | DCM-020 | Guild | Implement `SignalSnapshot` record with Empty factory |
| 22 | DCM-022 | TODO | DCM-021 | Guild | Add `GlobalUsings.cs` with common imports |
| 23 | DCM-023 | TODO | DCM-022 | Guild | Create test project `StellaOps.Policy.Determinization.Tests` |
| 24 | DCM-024 | TODO | DCM-023 | Guild | Write unit tests: `SignalState<T>` factory methods |
| 25 | DCM-025 | TODO | DCM-024 | Guild | Write unit tests: `UncertaintyScore` tier calculation |
| 26 | DCM-026 | TODO | DCM-025 | Guild | Write unit tests: `ObservationDecay` fresh/stale detection |
| 27 | DCM-027 | TODO | DCM-026 | Guild | Write unit tests: `SignalSnapshot.Empty()` initialization |
| 28 | DCM-028 | TODO | DCM-027 | Guild | Write unit tests: `DeterminizationResult` factory methods |
| 29 | DCM-029 | TODO | DCM-028 | Guild | Add project to `StellaOps.Policy.sln` |
| 30 | DCM-030 | TODO | DCM-029 | Guild | Verify build with `dotnet build` |
## Acceptance Criteria
1. All model types compile without warnings
2. Unit tests pass for all factory methods
3. `SignalState<T>` correctly distinguishes NotQueried/Queried/Failed
4. `UncertaintyScore.Tier` correctly maps entropy ranges
5. `ObservationDecay` correctly calculates staleness
6. All records are immutable and use `required` where appropriate
7. XML documentation complete for all public types
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Separate `ObservationState` from VEX status | Orthogonal concerns: VEX = vulnerability impact, Observation = evidence lifecycle |
| `SignalState<T>` as generic wrapper | Type safety for different evidence types; unified null-awareness |
| Entropy tiers at 0.2 increments | Aligns with existing confidence tiers; provides 5 distinct levels |
| 14-day default half-life | Per advisory recommendation; shorter than existing 90-day FreshnessCalculator |

| Risk | Mitigation |
|------|------------|
| Evidence type proliferation | Keep evidence records minimal; reference existing types where possible |
| Name collision with EntropySignal | Use "Uncertainty" terminology consistently; document difference |
| Breaking changes to PolicyVerdictStatus | GuardedPass addition is additive; existing code unaffected |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |
## Next Checkpoints
- 2026-01-07: DCM-001 to DCM-013 complete (core models)
- 2026-01-08: DCM-014 to DCM-022 complete (evidence models)
- 2026-01-09: DCM-023 to DCM-030 complete (tests, integration)

---
# Sprint 20260106_001_001_LB - Unified Verdict Rationale Renderer
## Topic & Scope
Implement a unified verdict rationale renderer that composes existing evidence (PathWitness, RiskVerdictAttestation, ScoreExplanation, VEX consensus) into a standardized 4-line template for consistent explainability across UI, CLI, and API.
- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Explainability/`
- **Evidence:** New library with renderer, tests, schema validation
## Problem Statement
The product advisory requires **uniform, explainable verdicts** with a 4-line template:
1. **Evidence:** "CVE-2024-XXXX in `libxyz` 1.2.3; symbol `foo_read` reachable from `/usr/bin/tool`."
2. **Policy clause:** "Policy S2.1: reachable+EPSS>=0.2 => triage=P1."
3. **Attestations/Proofs:** "Build-ID match to vendor advisory; call-path: `main->parse->foo_read`."
4. **Decision:** "Affected (score 0.72). Mitigation recommended: upgrade or backport KB-123."
Current state:
- `RiskVerdictAttestation` has `Explanation` field but no structured format
- `PathWitness` documents call paths but not rendered into rationale
- `ScoreExplanation` has factor breakdowns but not composed with verdicts
- `VerdictReasonCode` has descriptions but not formatted for users
- `AdvisoryAI.ExplanationResult` provides LLM explanations but no template enforcement
**Gap:** No unified renderer that composes these pieces into the 4-line format for any output channel.
## Dependencies & Concurrency
- **Depends on:** None (uses existing models)
- **Blocks:** None
- **Parallel safe:** New library; no cross-module conflicts
## Documentation Prerequisites
- docs/modules/policy/architecture.md
- src/Policy/AGENTS.md (if exists)
- Product Advisory: "Smart-Diff & Unknowns" explainability section
## Technical Design
### Data Contracts
```csharp
namespace StellaOps.Policy.Explainability;
/// <summary>
/// Structured verdict rationale following the 4-line template.
/// </summary>
public sealed record VerdictRationale
{
/// <summary>Schema version for forward compatibility.</summary>
[JsonPropertyName("schema_version")]
public string SchemaVersion { get; init; } = "1.0";
/// <summary>Unique rationale ID (content-addressed).</summary>
[JsonPropertyName("rationale_id")]
public required string RationaleId { get; init; }
/// <summary>Reference to the verdict being explained.</summary>
[JsonPropertyName("verdict_ref")]
public required VerdictReference VerdictRef { get; init; }
/// <summary>Line 1: Evidence summary.</summary>
[JsonPropertyName("evidence")]
public required RationaleEvidence Evidence { get; init; }
/// <summary>Line 2: Policy clause that triggered the decision.</summary>
[JsonPropertyName("policy_clause")]
public required RationalePolicyClause PolicyClause { get; init; }
/// <summary>Line 3: Attestations and proofs supporting the verdict.</summary>
[JsonPropertyName("attestations")]
public required RationaleAttestations Attestations { get; init; }
/// <summary>Line 4: Final decision with score and recommendation.</summary>
[JsonPropertyName("decision")]
public required RationaleDecision Decision { get; init; }
/// <summary>Generation timestamp (UTC).</summary>
[JsonPropertyName("generated_at")]
public required DateTimeOffset GeneratedAt { get; init; }
/// <summary>Input digests for reproducibility.</summary>
[JsonPropertyName("input_digests")]
public required RationaleInputDigests InputDigests { get; init; }
}
/// <summary>Reference to the verdict being explained.</summary>
public sealed record VerdictReference
{
[JsonPropertyName("attestation_id")]
public required string AttestationId { get; init; }
[JsonPropertyName("artifact_digest")]
public required string ArtifactDigest { get; init; }
[JsonPropertyName("policy_id")]
public required string PolicyId { get; init; }
[JsonPropertyName("policy_version")]
public required string PolicyVersion { get; init; }
}
/// <summary>Line 1: Evidence summary.</summary>
public sealed record RationaleEvidence
{
/// <summary>Primary vulnerability ID (CVE, GHSA, etc.).</summary>
[JsonPropertyName("vulnerability_id")]
public required string VulnerabilityId { get; init; }
/// <summary>Affected component PURL.</summary>
[JsonPropertyName("component_purl")]
public required string ComponentPurl { get; init; }
/// <summary>Affected version.</summary>
[JsonPropertyName("component_version")]
public required string ComponentVersion { get; init; }
/// <summary>Vulnerable symbol (if reachability analyzed).</summary>
[JsonPropertyName("vulnerable_symbol")]
public string? VulnerableSymbol { get; init; }
/// <summary>Entry point from which vulnerable code is reachable.</summary>
[JsonPropertyName("entrypoint")]
public string? Entrypoint { get; init; }
/// <summary>Rendered text for display.</summary>
[JsonPropertyName("text")]
public required string Text { get; init; }
}
/// <summary>Line 2: Policy clause.</summary>
public sealed record RationalePolicyClause
{
/// <summary>Policy section reference (e.g., "S2.1").</summary>
[JsonPropertyName("section")]
public required string Section { get; init; }
/// <summary>Rule expression that matched.</summary>
[JsonPropertyName("rule_expression")]
public required string RuleExpression { get; init; }
/// <summary>Resulting triage priority.</summary>
[JsonPropertyName("triage_priority")]
public required string TriagePriority { get; init; }
/// <summary>Rendered text for display.</summary>
[JsonPropertyName("text")]
public required string Text { get; init; }
}
/// <summary>Line 3: Attestations and proofs.</summary>
public sealed record RationaleAttestations
{
/// <summary>Build-ID match status.</summary>
[JsonPropertyName("build_id_match")]
public BuildIdMatchInfo? BuildIdMatch { get; init; }
/// <summary>Call path summary (if available).</summary>
[JsonPropertyName("call_path")]
public CallPathSummary? CallPath { get; init; }
/// <summary>VEX statement source.</summary>
[JsonPropertyName("vex_source")]
public string? VexSource { get; init; }
/// <summary>Suppression proof (if not affected).</summary>
[JsonPropertyName("suppression_proof")]
public SuppressionProofSummary? SuppressionProof { get; init; }
/// <summary>Rendered text for display.</summary>
[JsonPropertyName("text")]
public required string Text { get; init; }
}
public sealed record BuildIdMatchInfo
{
[JsonPropertyName("build_id")]
public required string BuildId { get; init; }
[JsonPropertyName("match_source")]
public required string MatchSource { get; init; }
[JsonPropertyName("confidence")]
public required double Confidence { get; init; }
}
public sealed record CallPathSummary
{
[JsonPropertyName("hop_count")]
public required int HopCount { get; init; }
[JsonPropertyName("path_abbreviated")]
public required string PathAbbreviated { get; init; }
[JsonPropertyName("witness_id")]
public string? WitnessId { get; init; }
}
public sealed record SuppressionProofSummary
{
[JsonPropertyName("type")]
public required string Type { get; init; }
[JsonPropertyName("reason")]
public required string Reason { get; init; }
[JsonPropertyName("proof_id")]
public string? ProofId { get; init; }
}
/// <summary>Line 4: Decision with recommendation.</summary>
public sealed record RationaleDecision
{
/// <summary>Final decision status.</summary>
[JsonPropertyName("status")]
public required string Status { get; init; }
/// <summary>Numeric risk score (0.0-1.0).</summary>
[JsonPropertyName("score")]
public required double Score { get; init; }
/// <summary>Score band (P1, P2, P3, P4).</summary>
[JsonPropertyName("band")]
public required string Band { get; init; }
/// <summary>Recommended mitigation action.</summary>
[JsonPropertyName("recommendation")]
public required string Recommendation { get; init; }
/// <summary>Knowledge base reference (if applicable).</summary>
[JsonPropertyName("kb_ref")]
public string? KbRef { get; init; }
/// <summary>Rendered text for display.</summary>
[JsonPropertyName("text")]
public required string Text { get; init; }
}
/// <summary>Input digests for reproducibility verification.</summary>
public sealed record RationaleInputDigests
{
[JsonPropertyName("verdict_digest")]
public required string VerdictDigest { get; init; }
[JsonPropertyName("witness_digest")]
public string? WitnessDigest { get; init; }
[JsonPropertyName("score_explanation_digest")]
public string? ScoreExplanationDigest { get; init; }
[JsonPropertyName("vex_consensus_digest")]
public string? VexConsensusDigest { get; init; }
}
```
### Renderer Interface
```csharp
namespace StellaOps.Policy.Explainability;
/// <summary>
/// Renders structured rationales from verdict components.
/// </summary>
public interface IVerdictRationaleRenderer
{
/// <summary>
/// Render a complete rationale from verdict components.
/// </summary>
VerdictRationale Render(VerdictRationaleInput input);
/// <summary>
/// Render rationale as plain text (4 lines).
/// </summary>
string RenderPlainText(VerdictRationale rationale);
/// <summary>
/// Render rationale as Markdown.
/// </summary>
string RenderMarkdown(VerdictRationale rationale);
/// <summary>
/// Render rationale as structured JSON (RFC 8785 canonical).
/// </summary>
string RenderJson(VerdictRationale rationale);
}
/// <summary>
/// Input components for rationale rendering.
/// </summary>
public sealed record VerdictRationaleInput
{
/// <summary>The verdict attestation being explained.</summary>
public required RiskVerdictAttestation Verdict { get; init; }
/// <summary>Path witness (if reachability analyzed).</summary>
public PathWitness? PathWitness { get; init; }
/// <summary>Score explanation with factor breakdown.</summary>
public ScoreExplanation? ScoreExplanation { get; init; }
/// <summary>VEX consensus result.</summary>
public ConsensusResult? VexConsensus { get; init; }
/// <summary>Policy rule that triggered the decision.</summary>
public PolicyRuleMatch? TriggeringRule { get; init; }
/// <summary>Suppression proof (if not affected).</summary>
public SuppressionWitness? SuppressionWitness { get; init; }
/// <summary>Recommended mitigation (from advisory or policy).</summary>
public MitigationRecommendation? Recommendation { get; init; }
}
/// <summary>
/// Policy rule that matched during evaluation.
/// </summary>
public sealed record PolicyRuleMatch
{
public required string Section { get; init; }
public required string RuleName { get; init; }
public required string Expression { get; init; }
public required string TriagePriority { get; init; }
}
/// <summary>
/// Mitigation recommendation.
/// </summary>
public sealed record MitigationRecommendation
{
public required string Action { get; init; }
public string? KbRef { get; init; }
public string? TargetVersion { get; init; }
}
```
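A hedged usage sketch; `renderer`, `verdict`, and `witness` are placeholders for instances produced elsewhere in the pipeline, and only `Verdict` is required:
```csharp
var input = new VerdictRationaleInput
{
    Verdict = verdict,      // required: the RiskVerdictAttestation being explained
    PathWitness = witness,  // optional: omit when reachability was not analyzed
    TriggeringRule = new PolicyRuleMatch
    {
        Section = "S2.1",
        RuleName = "reachable-epss",
        Expression = "reachable+EPSS>=0.2",
        TriagePriority = "P1"
    }
};
var rationale = renderer.Render(input);
Console.WriteLine(renderer.RenderPlainText(rationale)); // the 4 lines: Evidence, Policy, Attestations, Decision
```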
### Renderer Implementation
```csharp
namespace StellaOps.Policy.Explainability;
public sealed class VerdictRationaleRenderer : IVerdictRationaleRenderer
{
private readonly TimeProvider _timeProvider;
private readonly ILogger<VerdictRationaleRenderer> _logger;
public VerdictRationaleRenderer(
TimeProvider timeProvider,
ILogger<VerdictRationaleRenderer> logger)
{
_timeProvider = timeProvider;
_logger = logger;
}
public VerdictRationale Render(VerdictRationaleInput input)
{
ArgumentNullException.ThrowIfNull(input);
ArgumentNullException.ThrowIfNull(input.Verdict);
var evidence = RenderEvidence(input);
var policyClause = RenderPolicyClause(input);
var attestations = RenderAttestations(input);
var decision = RenderDecision(input);
var rationale = new VerdictRationale
{
RationaleId = ComputeRationaleId(input),
VerdictRef = new VerdictReference
{
AttestationId = input.Verdict.AttestationId,
ArtifactDigest = input.Verdict.Subject.Digest,
PolicyId = input.Verdict.Policy.PolicyId,
PolicyVersion = input.Verdict.Policy.Version
},
Evidence = evidence,
PolicyClause = policyClause,
Attestations = attestations,
Decision = decision,
GeneratedAt = _timeProvider.GetUtcNow(),
InputDigests = ComputeInputDigests(input)
};
_logger.LogDebug("Rendered rationale {RationaleId} for verdict {VerdictId}",
rationale.RationaleId, input.Verdict.AttestationId);
return rationale;
}
private RationaleEvidence RenderEvidence(VerdictRationaleInput input)
{
var verdict = input.Verdict;
var witness = input.PathWitness;
// Extract primary CVE from reason codes or evidence
var vulnId = ExtractPrimaryVulnerabilityId(verdict);
var componentPurl = verdict.Subject.Name ?? verdict.Subject.Digest;
var componentVersion = ExtractVersion(componentPurl);
var text = witness is not null
? $"{vulnId} in `{componentPurl}` {componentVersion}; " +
$"symbol `{witness.Sink.Symbol}` reachable from `{witness.Entrypoint.Name}`."
: $"{vulnId} in `{componentPurl}` {componentVersion}.";
return new RationaleEvidence
{
VulnerabilityId = vulnId,
ComponentPurl = componentPurl,
ComponentVersion = componentVersion,
VulnerableSymbol = witness?.Sink.Symbol,
Entrypoint = witness?.Entrypoint.Name,
Text = text
};
}
private RationalePolicyClause RenderPolicyClause(VerdictRationaleInput input)
{
var rule = input.TriggeringRule;
if (rule is null)
{
// Infer from reason codes
var primaryReason = input.Verdict.ReasonCodes.FirstOrDefault();
return new RationalePolicyClause
{
Section = "default",
RuleExpression = primaryReason?.GetDescription() ?? "policy evaluation",
TriagePriority = MapVerdictToPriority(input.Verdict.Verdict),
Text = $"Policy: {primaryReason?.GetDescription() ?? "default evaluation"} => " +
$"triage={MapVerdictToPriority(input.Verdict.Verdict)}."
};
}
return new RationalePolicyClause
{
Section = rule.Section,
RuleExpression = rule.Expression,
TriagePriority = rule.TriagePriority,
Text = $"Policy {rule.Section}: {rule.Expression} => triage={rule.TriagePriority}."
};
}
private RationaleAttestations RenderAttestations(VerdictRationaleInput input)
{
var parts = new List<string>();
BuildIdMatchInfo? buildIdMatch = null;
CallPathSummary? callPath = null;
SuppressionProofSummary? suppressionProof = null;
// Build-ID match
if (input.PathWitness?.Evidence.BuildId is not null)
{
buildIdMatch = new BuildIdMatchInfo
{
BuildId = input.PathWitness.Evidence.BuildId,
MatchSource = "vendor advisory",
Confidence = 1.0
};
parts.Add($"Build-ID match to vendor advisory");
}
// Call path
if (input.PathWitness?.Path.Count > 0)
{
var abbreviated = AbbreviatePath(input.PathWitness.Path);
callPath = new CallPathSummary
{
HopCount = input.PathWitness.Path.Count,
PathAbbreviated = abbreviated,
WitnessId = input.PathWitness.WitnessId
};
parts.Add($"call-path: `{abbreviated}`");
}
// VEX source
string? vexSource = null;
if (input.VexConsensus is not null)
{
vexSource = $"VEX consensus ({input.VexConsensus.ContributingStatements} statements)";
parts.Add(vexSource);
}
// Suppression proof
if (input.SuppressionWitness is not null)
{
suppressionProof = new SuppressionProofSummary
{
Type = input.SuppressionWitness.Type.ToString(),
Reason = input.SuppressionWitness.Reason,
ProofId = input.SuppressionWitness.WitnessId
};
parts.Add($"suppression: {input.SuppressionWitness.Reason}");
}
var text = parts.Count > 0
? string.Join("; ", parts) + "."
: "No attestations available.";
return new RationaleAttestations
{
BuildIdMatch = buildIdMatch,
CallPath = callPath,
VexSource = vexSource,
SuppressionProof = suppressionProof,
Text = text
};
}
private RationaleDecision RenderDecision(VerdictRationaleInput input)
{
var verdict = input.Verdict;
var score = input.ScoreExplanation?.Factors
.Sum(f => f.Value * GetFactorWeight(f.Factor)) ?? 0.0;
var status = verdict.Verdict switch
{
RiskVerdictStatus.Pass => "Not Affected",
RiskVerdictStatus.Fail => "Affected",
RiskVerdictStatus.PassWithExceptions => "Affected (excepted)",
RiskVerdictStatus.Indeterminate => "Under Investigation",
_ => "Unknown"
};
var band = score switch
{
>= 0.75 => "P1",
>= 0.50 => "P2",
>= 0.25 => "P3",
_ => "P4"
};
var recommendation = input.Recommendation?.Action ?? "Review finding and take appropriate action.";
var kbRef = input.Recommendation?.KbRef;
var text = kbRef is not null
? $"{status} (score {score:F2}). Mitigation recommended: {recommendation} {kbRef}."
: $"{status} (score {score:F2}). Mitigation recommended: {recommendation}";
return new RationaleDecision
{
Status = status,
Score = Math.Round(score, 2),
Band = band,
Recommendation = recommendation,
KbRef = kbRef,
Text = text
};
}
public string RenderPlainText(VerdictRationale rationale)
{
return $"""
{rationale.Evidence.Text}
{rationale.PolicyClause.Text}
{rationale.Attestations.Text}
{rationale.Decision.Text}
""";
}
public string RenderMarkdown(VerdictRationale rationale)
{
return $"""
**Evidence:** {rationale.Evidence.Text}
**Policy:** {rationale.PolicyClause.Text}
**Attestations:** {rationale.Attestations.Text}
**Decision:** {rationale.Decision.Text}
""";
}
public string RenderJson(VerdictRationale rationale)
{
return CanonicalJsonSerializer.Serialize(rationale);
}
private static string AbbreviatePath(IReadOnlyList<PathStep> path)
{
if (path.Count <= 3)
{
return string.Join("->", path.Select(p => p.Symbol));
}
return $"{path[0].Symbol}->...({path.Count - 2} hops)->->{path[^1].Symbol}";
}
private static string ComputeRationaleId(VerdictRationaleInput input)
{
var canonical = CanonicalJsonSerializer.Serialize(new
{
verdict_id = input.Verdict.AttestationId,
witness_id = input.PathWitness?.WitnessId,
score_factors = input.ScoreExplanation?.Factors.Count ?? 0
});
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
return $"rationale:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
}
private static RationaleInputDigests ComputeInputDigests(VerdictRationaleInput input)
{
return new RationaleInputDigests
{
VerdictDigest = input.Verdict.AttestationId,
WitnessDigest = input.PathWitness?.Evidence.CallgraphDigest,
ScoreExplanationDigest = input.ScoreExplanation is not null
? ComputeDigest(input.ScoreExplanation)
: null,
VexConsensusDigest = input.VexConsensus is not null
? ComputeDigest(input.VexConsensus)
: null
};
}
private static string ComputeDigest(object obj)
{
var json = CanonicalJsonSerializer.Serialize(obj);
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
return $"sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..16]}";
}
private static string ExtractPrimaryVulnerabilityId(RiskVerdictAttestation verdict)
{
// Try to extract from evidence refs
var cveRef = verdict.Evidence.FirstOrDefault(e =>
e.Type == "cve" || e.Description?.StartsWith("CVE-") == true);
return cveRef?.Description ?? "CVE-UNKNOWN";
}
private static string ExtractVersion(string purl)
{
var atIndex = purl.LastIndexOf('@');
return atIndex > 0 ? purl[(atIndex + 1)..] : "unknown";
}
private static string MapVerdictToPriority(RiskVerdictStatus status)
{
return status switch
{
RiskVerdictStatus.Fail => "P1",
RiskVerdictStatus.PassWithExceptions => "P2",
RiskVerdictStatus.Indeterminate => "P3",
RiskVerdictStatus.Pass => "P4",
_ => "P4"
};
}
private static double GetFactorWeight(string factor)
{
return factor.ToLowerInvariant() switch
{
"reachability" => 0.30,
"evidence" => 0.25,
"provenance" => 0.20,
"severity" => 0.25,
_ => 0.10
};
}
}
```
### Service Registration
```csharp
namespace StellaOps.Policy.Explainability;
public static class ExplainabilityServiceCollectionExtensions
{
public static IServiceCollection AddVerdictExplainability(this IServiceCollection services)
{
services.AddSingleton<IVerdictRationaleRenderer, VerdictRationaleRenderer>();
return services;
}
}
```
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | VRR-001 | TODO | - | - | Create `StellaOps.Policy.Explainability` project |
| 2 | VRR-002 | TODO | VRR-001 | - | Define `VerdictRationale` and component records |
| 3 | VRR-003 | TODO | VRR-002 | - | Define `IVerdictRationaleRenderer` interface |
| 4 | VRR-004 | TODO | VRR-003 | - | Implement `VerdictRationaleRenderer.RenderEvidence()` |
| 5 | VRR-005 | TODO | VRR-004 | - | Implement `VerdictRationaleRenderer.RenderPolicyClause()` |
| 6 | VRR-006 | TODO | VRR-005 | - | Implement `VerdictRationaleRenderer.RenderAttestations()` |
| 7 | VRR-007 | TODO | VRR-006 | - | Implement `VerdictRationaleRenderer.RenderDecision()` |
| 8 | VRR-008 | TODO | VRR-007 | - | Implement `Render()` composition method |
| 9 | VRR-009 | TODO | VRR-008 | - | Implement `RenderPlainText()` output |
| 10 | VRR-010 | TODO | VRR-008 | - | Implement `RenderMarkdown()` output |
| 11 | VRR-011 | TODO | VRR-008 | - | Implement `RenderJson()` with RFC 8785 canonicalization |
| 12 | VRR-012 | TODO | VRR-011 | - | Add input digest computation for reproducibility |
| 13 | VRR-013 | TODO | VRR-012 | - | Create service registration extension |
| 14 | VRR-014 | TODO | VRR-013 | - | Write unit tests: evidence rendering |
| 15 | VRR-015 | TODO | VRR-014 | - | Write unit tests: policy clause rendering |
| 16 | VRR-016 | TODO | VRR-015 | - | Write unit tests: attestations rendering |
| 17 | VRR-017 | TODO | VRR-016 | - | Write unit tests: decision rendering |
| 18 | VRR-018 | TODO | VRR-017 | - | Write golden fixture tests for output formats |
| 19 | VRR-019 | TODO | VRR-018 | - | Write determinism tests: same input -> same rationale ID |
| 20 | VRR-020 | TODO | VRR-019 | - | Integrate into Scanner.WebService verdict endpoints |
| 21 | VRR-021 | TODO | VRR-020 | - | Integrate into CLI triage commands |
| 22 | VRR-022 | TODO | VRR-021 | - | Add OpenAPI schema for `VerdictRationale` |
| 23 | VRR-023 | TODO | VRR-022 | - | Document rationale template in docs/modules/policy/ |
## Acceptance Criteria
1. **4-Line Template:** All rationales follow Evidence -> Policy -> Attestations -> Decision format
2. **Determinism:** Same inputs produce identical rationale IDs (content-addressed)
3. **Output Formats:** Plain text, Markdown, and JSON outputs available
4. **Reproducibility:** Input digests enable verification of rationale computation
5. **Integration:** Renderer integrated into Scanner.WebService and CLI
6. **Test Coverage:** Unit tests for each line, golden fixtures for formats
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| New library vs extension | Clean separation; renderer has no side effects |
| Content-addressed IDs | Enables caching and deduplication |
| RFC 8785 JSON | Consistent with existing canonical JSON usage |
| Optional components | Graceful degradation when PathWitness/VEX unavailable |

| Risk | Mitigation |
|------|------------|
| Template too rigid | Make format configurable via options |
| Missing context | Fallback text when components unavailable |
| Performance | Cache rendered rationales by input digest |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |

---
# Sprint 20260106_001_002_LB - Determinization: Scoring and Decay Calculations
## Topic & Scope
Implement the scoring and decay calculation services for the Determinization subsystem. This includes `UncertaintyScoreCalculator` (entropy from signal completeness), `DecayedConfidenceCalculator` (half-life decay), configurable signal weights, and prior distributions for missing signals.
- **Working directory:** `src/Policy/__Libraries/StellaOps.Policy.Determinization/`
- **Evidence:** Calculator implementations, configuration options, unit tests
## Problem Statement
Current confidence calculation:
- Uses `ConfidenceScore` with weighted factors
- No explicit "knowledge completeness" entropy calculation
- `FreshnessCalculator` exists but uses a 90-day half-life that is not configurable per observation
- No prior distributions for missing signals
Advisory requires:
- Entropy formula: `entropy = 1 - (weighted_present_signals / max_possible_weight)`
- Decay formula: `decayed = max(floor, exp(-ln(2) * age_days / half_life_days))`
- Configurable signal weights (default: VEX=0.25, EPSS=0.15, Reach=0.25, Runtime=0.15, Backport=0.10, SBOM=0.10)
- 14-day half-life default (configurable)
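A quick numeric check of both formulas with the default weights (the missing-signal scenario is hypothetical):
```csharp
// Missing Reachability (0.25) and Runtime (0.15): present weight = 1.00 - 0.40 = 0.60.
double entropy = 1.0 - 0.60 / 1.00; // 0.40 => UncertaintyTier.Low (top of the <= 0.4 band)
// Evidence last updated 28 days ago, 14-day half-life, 0.35 floor:
double decayed = Math.Max(0.35, Math.Exp(-Math.Log(2) * 28.0 / 14.0)); // max(0.35, 0.25) = 0.35
```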
## Dependencies & Concurrency
- **Depends on:** SPRINT_20260106_001_001_LB (core models)
- **Blocks:** SPRINT_20260106_001_003_POLICY (gates)
- **Parallel safe:** Library additions; no cross-module conflicts
## Documentation Prerequisites
- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_001_LB (core models)
- Existing: `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/FreshnessCalculator.cs`
## Technical Design
### Directory Structure Addition
```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── Scoring/
│ ├── IUncertaintyScoreCalculator.cs
│ ├── UncertaintyScoreCalculator.cs
│ ├── IDecayedConfidenceCalculator.cs
│ ├── DecayedConfidenceCalculator.cs
│ ├── SignalWeights.cs
│ ├── PriorDistribution.cs
│ └── TrustScoreAggregator.cs
├── DeterminizationOptions.cs
└── ServiceCollectionExtensions.cs
```
### IUncertaintyScoreCalculator Interface
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Calculates knowledge completeness entropy from signal snapshots.
/// </summary>
public interface IUncertaintyScoreCalculator
{
/// <summary>
/// Calculate uncertainty score from a signal snapshot.
/// </summary>
/// <param name="snapshot">Point-in-time signal collection.</param>
/// <returns>Uncertainty score with entropy and missing signal details.</returns>
UncertaintyScore Calculate(SignalSnapshot snapshot);
/// <summary>
/// Calculate uncertainty score with custom weights.
/// </summary>
/// <param name="snapshot">Point-in-time signal collection.</param>
/// <param name="weights">Custom signal weights.</param>
/// <returns>Uncertainty score with entropy and missing signal details.</returns>
UncertaintyScore Calculate(SignalSnapshot snapshot, SignalWeights weights);
}
```
### UncertaintyScoreCalculator Implementation
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Calculates knowledge completeness entropy from signal snapshot.
/// Formula: entropy = 1 - (sum of weighted present signals / max possible weight)
/// </summary>
public sealed class UncertaintyScoreCalculator : IUncertaintyScoreCalculator
{
private readonly SignalWeights _defaultWeights;
private readonly ILogger<UncertaintyScoreCalculator> _logger;
public UncertaintyScoreCalculator(
IOptions<DeterminizationOptions> options,
ILogger<UncertaintyScoreCalculator> logger)
{
_defaultWeights = options.Value.SignalWeights.Normalize();
_logger = logger;
}
public UncertaintyScore Calculate(SignalSnapshot snapshot) =>
Calculate(snapshot, _defaultWeights);
public UncertaintyScore Calculate(SignalSnapshot snapshot, SignalWeights weights)
{
ArgumentNullException.ThrowIfNull(snapshot);
ArgumentNullException.ThrowIfNull(weights);
var normalizedWeights = weights.Normalize();
var gaps = new List<SignalGap>();
var weightedSum = 0.0;
// EPSS signal
weightedSum += EvaluateSignal(
snapshot.Epss,
"EPSS",
normalizedWeights.Epss,
gaps);
// VEX signal
weightedSum += EvaluateSignal(
snapshot.Vex,
"VEX",
normalizedWeights.Vex,
gaps);
// Reachability signal
weightedSum += EvaluateSignal(
snapshot.Reachability,
"Reachability",
normalizedWeights.Reachability,
gaps);
// Runtime signal
weightedSum += EvaluateSignal(
snapshot.Runtime,
"Runtime",
normalizedWeights.Runtime,
gaps);
// Backport signal
weightedSum += EvaluateSignal(
snapshot.Backport,
"Backport",
normalizedWeights.Backport,
gaps);
// SBOM Lineage signal
weightedSum += EvaluateSignal(
snapshot.SbomLineage,
"SBOMLineage",
normalizedWeights.SbomLineage,
gaps);
var maxWeight = normalizedWeights.TotalWeight;
var entropy = 1.0 - (weightedSum / maxWeight);
var result = new UncertaintyScore
{
Entropy = Math.Clamp(entropy, 0.0, 1.0),
MissingSignals = gaps.ToImmutableArray(),
WeightedEvidenceSum = weightedSum,
MaxPossibleWeight = maxWeight
};
_logger.LogDebug(
"Calculated uncertainty for CVE {CveId}: entropy={Entropy:F3}, tier={Tier}, missing={MissingCount}",
snapshot.CveId,
result.Entropy,
result.Tier,
gaps.Count);
return result;
}
private static double EvaluateSignal<T>(
SignalState<T> signal,
string signalName,
double weight,
List<SignalGap> gaps)
{
if (signal.HasValue)
{
return weight;
}
gaps.Add(new SignalGap(
signalName,
weight,
signal.Status,
signal.FailureReason));
return 0.0;
}
}
```
### IDecayedConfidenceCalculator Interface
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Calculates time-based confidence decay for evidence staleness.
/// </summary>
public interface IDecayedConfidenceCalculator
{
/// <summary>
/// Calculate decay for evidence age.
/// </summary>
/// <param name="lastSignalUpdate">When the last signal was updated.</param>
/// <returns>Observation decay with multiplier and staleness flag.</returns>
ObservationDecay Calculate(DateTimeOffset lastSignalUpdate);
/// <summary>
/// Calculate decay with custom half-life and floor.
/// </summary>
/// <param name="lastSignalUpdate">When the last signal was updated.</param>
/// <param name="halfLife">Custom half-life duration.</param>
/// <param name="floor">Minimum confidence floor.</param>
/// <returns>Observation decay with multiplier and staleness flag.</returns>
ObservationDecay Calculate(DateTimeOffset lastSignalUpdate, TimeSpan halfLife, double floor);
/// <summary>
/// Apply decay multiplier to a confidence score.
/// </summary>
/// <param name="baseConfidence">Base confidence score [0.0-1.0].</param>
/// <param name="decay">Decay calculation result.</param>
/// <returns>Decayed confidence score.</returns>
double ApplyDecay(double baseConfidence, ObservationDecay decay);
}
```
### DecayedConfidenceCalculator Implementation
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Applies exponential decay to confidence based on evidence staleness.
/// Formula: decayed = max(floor, exp(-ln(2) * age_days / half_life_days))
/// </summary>
public sealed class DecayedConfidenceCalculator : IDecayedConfidenceCalculator
{
private readonly TimeProvider _timeProvider;
private readonly DeterminizationOptions _options;
private readonly ILogger<DecayedConfidenceCalculator> _logger;
public DecayedConfidenceCalculator(
TimeProvider timeProvider,
IOptions<DeterminizationOptions> options,
ILogger<DecayedConfidenceCalculator> logger)
{
_timeProvider = timeProvider;
_options = options.Value;
_logger = logger;
}
public ObservationDecay Calculate(DateTimeOffset lastSignalUpdate) =>
Calculate(
lastSignalUpdate,
TimeSpan.FromDays(_options.DecayHalfLifeDays),
_options.DecayFloor);
public ObservationDecay Calculate(
DateTimeOffset lastSignalUpdate,
TimeSpan halfLife,
double floor)
{
if (halfLife <= TimeSpan.Zero)
throw new ArgumentOutOfRangeException(nameof(halfLife), "Half-life must be positive");
if (floor is < 0.0 or > 1.0)
throw new ArgumentOutOfRangeException(nameof(floor), "Floor must be between 0.0 and 1.0");
var now = _timeProvider.GetUtcNow();
var ageDays = (now - lastSignalUpdate).TotalDays;
double decayedMultiplier;
if (ageDays <= 0)
{
// Evidence is fresh or from the future (clock skew)
decayedMultiplier = 1.0;
}
else
{
// Exponential decay: e^(-ln(2) * t / t_half)
var rawDecay = Math.Exp(-Math.Log(2) * ageDays / halfLife.TotalDays);
decayedMultiplier = Math.Max(rawDecay, floor);
}
// Calculate next review time (when decay crosses 50% threshold)
var daysTo50Percent = halfLife.TotalDays;
var nextReviewAt = lastSignalUpdate.AddDays(daysTo50Percent);
// Stale threshold: below 50% of original
var isStale = decayedMultiplier <= 0.5;
var result = new ObservationDecay
{
HalfLife = halfLife,
Floor = floor,
LastSignalUpdate = lastSignalUpdate,
DecayedMultiplier = decayedMultiplier,
NextReviewAt = nextReviewAt,
IsStale = isStale,
AgeDays = Math.Max(0, ageDays)
};
_logger.LogDebug(
"Calculated decay: age={AgeDays:F1}d, halfLife={HalfLife}d, multiplier={Multiplier:F3}, stale={IsStale}",
ageDays,
halfLife.TotalDays,
decayedMultiplier,
isStale);
return result;
}
public double ApplyDecay(double baseConfidence, ObservationDecay decay)
{
if (baseConfidence is < 0.0 or > 1.0)
throw new ArgumentOutOfRangeException(nameof(baseConfidence), "Confidence must be between 0.0 and 1.0");
return baseConfidence * decay.DecayedMultiplier;
}
}
```
### SignalWeights Configuration
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Configurable weights for signal contribution to completeness.
/// Weights should sum to 1.0 for normalized entropy.
/// </summary>
public sealed record SignalWeights
{
/// <summary>VEX statement weight. Default: 0.25</summary>
public double Vex { get; init; } = 0.25;
/// <summary>EPSS score weight. Default: 0.15</summary>
public double Epss { get; init; } = 0.15;
/// <summary>Reachability analysis weight. Default: 0.25</summary>
public double Reachability { get; init; } = 0.25;
/// <summary>Runtime observation weight. Default: 0.15</summary>
public double Runtime { get; init; } = 0.15;
/// <summary>Fix backport detection weight. Default: 0.10</summary>
public double Backport { get; init; } = 0.10;
/// <summary>SBOM lineage weight. Default: 0.10</summary>
public double SbomLineage { get; init; } = 0.10;
/// <summary>Total weight (sum of all signals).</summary>
public double TotalWeight =>
Vex + Epss + Reachability + Runtime + Backport + SbomLineage;
/// <summary>
/// Returns normalized weights that sum to 1.0.
/// </summary>
public SignalWeights Normalize()
{
var total = TotalWeight;
if (total <= 0)
throw new InvalidOperationException("Total weight must be positive");
if (Math.Abs(total - 1.0) < 0.0001)
return this; // Already normalized
return new SignalWeights
{
Vex = Vex / total,
Epss = Epss / total,
Reachability = Reachability / total,
Runtime = Runtime / total,
Backport = Backport / total,
SbomLineage = SbomLineage / total
};
}
/// <summary>
/// Validates that all weights are non-negative and total is positive.
/// </summary>
public bool IsValid =>
Vex >= 0 && Epss >= 0 && Reachability >= 0 &&
Runtime >= 0 && Backport >= 0 && SbomLineage >= 0 &&
TotalWeight > 0;
/// <summary>
/// Default weights per advisory recommendation.
/// </summary>
public static SignalWeights Default => new();
/// <summary>
/// Weights emphasizing VEX and reachability (for production).
/// </summary>
public static SignalWeights ProductionEmphasis => new()
{
Vex = 0.30,
Epss = 0.15,
Reachability = 0.30,
Runtime = 0.10,
Backport = 0.08,
SbomLineage = 0.07
};
/// <summary>
/// Weights emphasizing runtime signals (for observed environments).
/// </summary>
public static SignalWeights RuntimeEmphasis => new()
{
Vex = 0.20,
Epss = 0.10,
Reachability = 0.20,
Runtime = 0.30,
Backport = 0.10,
SbomLineage = 0.10
};
}
```
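Normalization keeps entropy bounded even when operators supply unnormalized weights; a quick sketch:
```csharp
var custom = new SignalWeights { Vex = 2.0, Reachability = 2.0, Epss = 1.0,
                                 Runtime = 0.0, Backport = 0.0, SbomLineage = 0.0 };
var norm = custom.Normalize();   // Vex = Reachability = 0.4, Epss = 0.2
Debug.Assert(Math.Abs(norm.TotalWeight - 1.0) < 1e-9);
```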
### PriorDistribution for Missing Signals
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Prior distributions for missing signals.
/// Used when a signal is not available but we need a default assumption.
/// </summary>
public sealed record PriorDistribution
{
/// <summary>
/// Default prior for EPSS when not available.
/// Median EPSS is ~0.04, so we use a conservative prior.
/// </summary>
public double EpssPrior { get; init; } = 0.10;
/// <summary>
/// Default prior for reachability when not analyzed.
/// Conservative: assume reachable until proven otherwise.
/// </summary>
public ReachabilityStatus ReachabilityPrior { get; init; } = ReachabilityStatus.Unknown;
/// <summary>
/// Default prior for KEV when not checked.
/// Conservative: assume not in KEV (most CVEs are not).
/// </summary>
public bool KevPrior { get; init; } = false;
/// <summary>
/// Confidence in the prior values [0.0-1.0].
/// Lower values indicate priors should be weighted less.
/// </summary>
public double PriorConfidence { get; init; } = 0.3;
/// <summary>
/// Default conservative priors.
/// </summary>
public static PriorDistribution Default => new();
/// <summary>
/// Pessimistic priors (assume worst case).
/// </summary>
public static PriorDistribution Pessimistic => new()
{
EpssPrior = 0.30,
ReachabilityPrior = ReachabilityStatus.Reachable,
KevPrior = false,
PriorConfidence = 0.2
};
/// <summary>
/// Optimistic priors (assume best case).
/// </summary>
public static PriorDistribution Optimistic => new()
{
EpssPrior = 0.02,
ReachabilityPrior = ReachabilityStatus.Unreachable,
KevPrior = false,
PriorConfidence = 0.2
};
}
```
### TrustScoreAggregator
```csharp
namespace StellaOps.Policy.Determinization.Scoring;
/// <summary>
/// Aggregates trust score from signal snapshot.
/// Combines signal values with weights to produce overall trust score.
/// </summary>
public interface ITrustScoreAggregator
{
/// <summary>
/// Calculate aggregate trust score from signals.
/// </summary>
/// <param name="snapshot">Signal snapshot.</param>
/// <param name="priors">Priors for missing signals.</param>
/// <returns>Trust score [0.0-1.0].</returns>
double Calculate(SignalSnapshot snapshot, PriorDistribution? priors = null);
}
public sealed class TrustScoreAggregator : ITrustScoreAggregator
{
private readonly SignalWeights _weights;
private readonly PriorDistribution _defaultPriors;
private readonly ILogger<TrustScoreAggregator> _logger;
public TrustScoreAggregator(
IOptions<DeterminizationOptions> options,
ILogger<TrustScoreAggregator> logger)
{
_weights = options.Value.SignalWeights.Normalize();
_defaultPriors = options.Value.Priors ?? PriorDistribution.Default;
_logger = logger;
}
public double Calculate(SignalSnapshot snapshot, PriorDistribution? priors = null)
{
priors ??= _defaultPriors;
var normalized = _weights.Normalize();
var score = 0.0;
// VEX contribution: high trust if not_affected with good issuer trust
score += CalculateVexContribution(snapshot.Vex, priors) * normalized.Vex;
// EPSS contribution: inverse (lower EPSS = higher trust)
score += CalculateEpssContribution(snapshot.Epss, priors) * normalized.Epss;
// Reachability contribution: high trust if unreachable
score += CalculateReachabilityContribution(snapshot.Reachability, priors) * normalized.Reachability;
// Runtime contribution: high trust if not observed loaded
score += CalculateRuntimeContribution(snapshot.Runtime, priors) * normalized.Runtime;
// Backport contribution: high trust if backport detected
score += CalculateBackportContribution(snapshot.Backport, priors) * normalized.Backport;
// SBOM lineage contribution: high trust if verified
score += CalculateSbomContribution(snapshot.SbomLineage, priors) * normalized.SbomLineage;
var result = Math.Clamp(score, 0.0, 1.0);
_logger.LogDebug(
"Calculated trust score for CVE {CveId}: {Score:F3}",
snapshot.CveId,
result);
return result;
}
private static double CalculateVexContribution(SignalState<VexClaimSummary> signal, PriorDistribution priors)
{
if (!signal.HasValue)
return priors.PriorConfidence * 0.5; // Uncertain
var vex = signal.Value!;
return vex.Status switch
{
"not_affected" => vex.IssuerTrust,
"fixed" => vex.IssuerTrust * 0.9,
"under_investigation" => 0.4,
"affected" => 0.1,
_ => 0.3
};
}
private static double CalculateEpssContribution(SignalState<EpssEvidence> signal, PriorDistribution priors)
{
if (!signal.HasValue)
return 1.0 - priors.EpssPrior; // Use prior
// Inverse: low EPSS = high trust
return 1.0 - signal.Value!.Score;
}
private static double CalculateReachabilityContribution(SignalState<ReachabilityEvidence> signal, PriorDistribution priors)
{
if (!signal.HasValue)
{
return priors.ReachabilityPrior switch
{
ReachabilityStatus.Unreachable => 0.9 * priors.PriorConfidence,
ReachabilityStatus.Reachable => 0.1 * priors.PriorConfidence,
_ => 0.5 * priors.PriorConfidence
};
}
var reach = signal.Value!;
return reach.Status switch
{
ReachabilityStatus.Unreachable => reach.Confidence,
ReachabilityStatus.Gated => reach.Confidence * 0.6,
ReachabilityStatus.Unknown => 0.4,
ReachabilityStatus.Reachable => 0.1,
ReachabilityStatus.ObservedReachable => 0.0,
_ => 0.3
};
}
private static double CalculateRuntimeContribution(SignalState<RuntimeEvidence> signal, PriorDistribution priors)
{
if (!signal.HasValue)
return 0.5 * priors.PriorConfidence; // No runtime data
return signal.Value!.ObservedLoaded ? 0.0 : 0.9;
}
private static double CalculateBackportContribution(SignalState<BackportEvidence> signal, PriorDistribution priors)
{
if (!signal.HasValue)
return 0.5 * priors.PriorConfidence;
return signal.Value!.BackportDetected ? signal.Value.Confidence : 0.3;
}
private static double CalculateSbomContribution(SignalState<SbomLineageEvidence> signal, PriorDistribution priors)
{
if (!signal.HasValue)
return 0.5 * priors.PriorConfidence;
var sbom = signal.Value!;
var score = sbom.QualityScore;
if (sbom.LineageVerified) score *= 1.1;
if (sbom.HasProvenanceAttestation) score *= 1.1;
return Math.Min(score, 1.0);
}
}
```
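As a sanity check of the contribution functions above, the all-signals-missing case with default priors works out as follows (`aggregator` and the empty snapshot are assumptions; the exact factory signatures live in the core-models sprint):
```csharp
// All signals missing with PriorDistribution.Default (PriorConfidence = 0.3, EpssPrior = 0.10):
//   VEX      0.5 * 0.3 * 0.25 = 0.0375    EPSS    (1 - 0.10) * 0.15 = 0.1350
//   Reach    0.5 * 0.3 * 0.25 = 0.0375    Runtime 0.5 * 0.3 * 0.15 = 0.0225
//   Backport 0.5 * 0.3 * 0.10 = 0.0150    SBOM    0.5 * 0.3 * 0.10 = 0.0150
double trust = aggregator.Calculate(allMissingSnapshot); // ~0.2625
```
Note that a missing EPSS contributes `1 - EpssPrior` directly rather than a prior-confidence-scaled value, so it dominates the all-unknown score; the unit tests should pin this behavior down explicitly.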
### DeterminizationOptions
```csharp
namespace StellaOps.Policy.Determinization;
/// <summary>
/// Configuration options for the Determinization subsystem.
/// </summary>
public sealed class DeterminizationOptions
{
/// <summary>Configuration section name.</summary>
public const string SectionName = "Determinization";
/// <summary>EPSS score that triggers quarantine (block). Default: 0.4</summary>
public double EpssQuarantineThreshold { get; set; } = 0.4;
/// <summary>Trust score threshold for guarded allow. Default: 0.5</summary>
public double GuardedAllowScoreThreshold { get; set; } = 0.5;
/// <summary>Entropy threshold for guarded allow. Default: 0.4</summary>
public double GuardedAllowEntropyThreshold { get; set; } = 0.4;
/// <summary>Entropy threshold for production block. Default: 0.3</summary>
public double ProductionBlockEntropyThreshold { get; set; } = 0.3;
/// <summary>Half-life for evidence decay in days. Default: 14</summary>
public int DecayHalfLifeDays { get; set; } = 14;
/// <summary>Minimum confidence floor after decay. Default: 0.35</summary>
public double DecayFloor { get; set; } = 0.35;
/// <summary>Review interval for guarded observations in days. Default: 7</summary>
public int GuardedReviewIntervalDays { get; set; } = 7;
/// <summary>Maximum time in guarded state in days. Default: 30</summary>
public int MaxGuardedDurationDays { get; set; } = 30;
/// <summary>Signal weights for uncertainty calculation.</summary>
public SignalWeights SignalWeights { get; set; } = new();
/// <summary>Prior distributions for missing signals.</summary>
public PriorDistribution? Priors { get; set; }
/// <summary>Per-environment threshold overrides.</summary>
public Dictionary<string, EnvironmentThresholds> EnvironmentThresholds { get; set; } = new();
/// <summary>Enable detailed logging for debugging.</summary>
public bool EnableDetailedLogging { get; set; } = false;
}
/// <summary>
/// Per-environment threshold configuration.
/// </summary>
public sealed record EnvironmentThresholds
{
public DeploymentEnvironment Environment { get; init; }
public double MinConfidenceForNotAffected { get; init; }
public double MaxEntropyForAllow { get; init; }
public double EpssBlockThreshold { get; init; }
public bool RequireReachabilityForAllow { get; init; }
}
```
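For reference, a matching `appsettings.json` section would bind as below (a sketch showing the scalar defaults; the `SignalWeights`, `Priors`, and `EnvironmentThresholds` shapes are defined earlier and elided here):
```json
{
  "Determinization": {
    "EpssQuarantineThreshold": 0.4,
    "GuardedAllowScoreThreshold": 0.5,
    "GuardedAllowEntropyThreshold": 0.4,
    "ProductionBlockEntropyThreshold": 0.3,
    "DecayHalfLifeDays": 14,
    "DecayFloor": 0.35,
    "GuardedReviewIntervalDays": 7,
    "MaxGuardedDurationDays": 30,
    "EnableDetailedLogging": false
  }
}
```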
### ServiceCollectionExtensions
```csharp
namespace StellaOps.Policy.Determinization;
/// <summary>
/// DI registration for Determinization services.
/// </summary>
public static class ServiceCollectionExtensions
{
/// <summary>
/// Adds Determinization services to the DI container.
/// </summary>
public static IServiceCollection AddDeterminization(
this IServiceCollection services,
IConfiguration configuration)
{
// Bind options
services.AddOptions<DeterminizationOptions>()
.Bind(configuration.GetSection(DeterminizationOptions.SectionName))
.ValidateDataAnnotations()
.ValidateOnStart();
// Register services
services.AddSingleton<IUncertaintyScoreCalculator, UncertaintyScoreCalculator>();
services.AddSingleton<IDecayedConfidenceCalculator, DecayedConfidenceCalculator>();
services.AddSingleton<ITrustScoreAggregator, TrustScoreAggregator>();
return services;
}
/// <summary>
/// Adds Determinization services with custom options.
/// </summary>
public static IServiceCollection AddDeterminization(
this IServiceCollection services,
Action<DeterminizationOptions> configure)
{
services.Configure(configure);
services.PostConfigure<DeterminizationOptions>(options =>
{
// Validate and normalize weights
if (!options.SignalWeights.IsValid)
throw new OptionsValidationException(
nameof(DeterminizationOptions.SignalWeights),
typeof(SignalWeights),
new[] { "Signal weights must be non-negative with a positive total" });
});
services.AddSingleton<IUncertaintyScoreCalculator, UncertaintyScoreCalculator>();
services.AddSingleton<IDecayedConfidenceCalculator, DecayedConfidenceCalculator>();
services.AddSingleton<ITrustScoreAggregator, TrustScoreAggregator>();
return services;
}
}
```
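Host wiring is then a one-liner (a sketch assuming the standard minimal-host setup):
```csharp
var builder = WebApplication.CreateBuilder(args);

// Binds the "Determinization" section, validates at startup, and registers
// the uncertainty/decay/trust calculators as singletons.
builder.Services.AddDeterminization(builder.Configuration);

var app = builder.Build();
```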
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DCS-001 | TODO | DCM-030 | Guild | Create `Scoring/` directory structure |
| 2 | DCS-002 | TODO | DCS-001 | Guild | Implement `SignalWeights` record with presets |
| 3 | DCS-003 | TODO | DCS-002 | Guild | Implement `PriorDistribution` record with presets |
| 4 | DCS-004 | TODO | DCS-003 | Guild | Implement `IUncertaintyScoreCalculator` interface |
| 5 | DCS-005 | TODO | DCS-004 | Guild | Implement `UncertaintyScoreCalculator` with logging |
| 6 | DCS-006 | TODO | DCS-005 | Guild | Implement `IDecayedConfidenceCalculator` interface |
| 7 | DCS-007 | TODO | DCS-006 | Guild | Implement `DecayedConfidenceCalculator` with TimeProvider |
| 8 | DCS-008 | TODO | DCS-007 | Guild | Implement `ITrustScoreAggregator` interface |
| 9 | DCS-009 | TODO | DCS-008 | Guild | Implement `TrustScoreAggregator` with all signal types |
| 10 | DCS-010 | TODO | DCS-009 | Guild | Implement `EnvironmentThresholds` record |
| 11 | DCS-011 | TODO | DCS-010 | Guild | Implement `DeterminizationOptions` with validation |
| 12 | DCS-012 | TODO | DCS-011 | Guild | Implement `ServiceCollectionExtensions` for DI |
| 13 | DCS-013 | TODO | DCS-012 | Guild | Write unit tests: `SignalWeights.Normalize()` |
| 14 | DCS-014 | TODO | DCS-013 | Guild | Write unit tests: `UncertaintyScoreCalculator` entropy bounds |
| 15 | DCS-015 | TODO | DCS-014 | Guild | Write unit tests: `UncertaintyScoreCalculator` missing signals |
| 16 | DCS-016 | TODO | DCS-015 | Guild | Write unit tests: `DecayedConfidenceCalculator` half-life |
| 17 | DCS-017 | TODO | DCS-016 | Guild | Write unit tests: `DecayedConfidenceCalculator` floor |
| 18 | DCS-018 | TODO | DCS-017 | Guild | Write unit tests: `DecayedConfidenceCalculator` staleness |
| 19 | DCS-019 | TODO | DCS-018 | Guild | Write unit tests: `TrustScoreAggregator` signal combinations |
| 20 | DCS-020 | TODO | DCS-019 | Guild | Write unit tests: `TrustScoreAggregator` with priors |
| 21 | DCS-021 | TODO | DCS-020 | Guild | Write property tests: entropy always [0.0, 1.0] |
| 22 | DCS-022 | TODO | DCS-021 | Guild | Write property tests: decay monotonically decreasing |
| 23 | DCS-023 | TODO | DCS-022 | Guild | Write determinism tests: same snapshot same entropy |
| 24 | DCS-024 | TODO | DCS-023 | Guild | Integration test: DI registration with configuration |
| 25 | DCS-025 | TODO | DCS-024 | Guild | Add metrics: `stellaops_determinization_uncertainty_entropy` |
| 26 | DCS-026 | TODO | DCS-025 | Guild | Add metrics: `stellaops_determinization_decay_multiplier` |
| 27 | DCS-027 | TODO | DCS-026 | Guild | Document configuration options in architecture.md |
| 28 | DCS-028 | TODO | DCS-027 | Guild | Verify build with `dotnet build` |
## Acceptance Criteria
1. `UncertaintyScoreCalculator` produces entropy in [0.0, 1.0] for any input
2. `DecayedConfidenceCalculator` correctly applies the half-life formula (see the sketch after this list)
3. Decay never drops below the configured floor
4. Missing signals correctly contribute to higher entropy
5. Signal weights are normalized before calculation
6. Priors are applied when signals are missing
7. All services registered in DI correctly
8. Configuration options validated at startup
9. Metrics emitted for observability
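For criteria 2 and 3, the expected behavior is the standard exponential half-life form with a floor clamp; a minimal sketch (names are illustrative, not the final `DecayedConfidenceCalculator` API):
```csharp
// decayed(t) = max(floor, confidence * 0.5^(ageDays / halfLifeDays))
static double Decay(double confidence, double ageDays,
    double halfLifeDays = 14, double floor = 0.35)
{
    var multiplier = Math.Pow(0.5, ageDays / halfLifeDays);
    return Math.Max(floor, confidence * multiplier);
}

// Decay(0.9, 0) == 0.90; Decay(0.9, 14) == 0.45; Decay(0.9, 28) clamps 0.225 up to 0.35.
```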
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| 14-day default half-life | Per advisory; shorter than the existing 90-day default, giving remediation more urgency |
| 0.35 floor | Consistent with existing FreshnessCalculator; prevents zero confidence |
| Normalized weights | Ensures entropy calculation is consistent regardless of weight scale |
| Conservative priors | Missing data assumes moderate risk, not best/worst case |

| Risk | Mitigation |
|------|------------|
| Calculation overhead | Cache results per snapshot; calculators are stateless |
| Weight misconfiguration | Validation at startup; presets for common scenarios |
| Clock skew affecting decay | Use TimeProvider abstraction; handle future timestamps gracefully |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |
## Next Checkpoints
- 2026-01-08: DCS-001 to DCS-012 complete (implementations)
- 2026-01-09: DCS-013 to DCS-023 complete (tests)
- 2026-01-10: DCS-024 to DCS-028 complete (metrics, docs)


---

# Sprint 20260106_001_002_SCANNER - Suppression Proof Model
## Topic & Scope
Implement `SuppressionWitness` - a DSSE-signable proof documenting why a vulnerability is **not affected**, complementing the existing `PathWitness` which documents reachable paths.
- **Working directory:** `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/`
- **Evidence:** SuppressionWitness model, builder, signer, tests
## Problem Statement
The product advisory requires **proof objects for both outcomes**:
- If "affected": attach *minimal counterexample path* (entrypoint -> vulnerable symbol) - **EXISTS: PathWitness**
- If "not affected": attach *suppression proof* (e.g., dead code after linker GC; feature flag off; patched symbol diff) - **GAP**
Current state:
- `PathWitness` documents reachability (why code IS reachable)
- VEX status can be "not_affected" but lacks structured proof
- Gate detection (`DetectedGate`) shows mitigating controls but doesn't form a complete suppression proof
- No model for "why this vulnerability doesn't apply"
**Gap:** No `SuppressionWitness` model to document and attest why a vulnerability is not exploitable.
## Dependencies & Concurrency
- **Depends on:** None (extends existing Witnesses module)
- **Blocks:** SPRINT_20260106_001_001_LB (rationale renderer uses SuppressionWitness)
- **Parallel safe:** Extends existing module; no conflicts
## Documentation Prerequisites
- docs/modules/scanner/architecture.md
- src/Scanner/AGENTS.md
- Existing PathWitness implementation at `src/Scanner/__Libraries/StellaOps.Scanner.Reachability/Witnesses/`
## Technical Design
### Suppression Types
```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;
/// <summary>
/// Classification of suppression reasons.
/// </summary>
public enum SuppressionType
{
/// <summary>Vulnerable code is unreachable from any entry point.</summary>
Unreachable,
/// <summary>Vulnerable symbol was removed by linker garbage collection.</summary>
LinkerGarbageCollected,
/// <summary>Feature flag disables the vulnerable code path.</summary>
FeatureFlagDisabled,
/// <summary>Vulnerable symbol was patched (backport).</summary>
PatchedSymbol,
/// <summary>Runtime gate (authentication, validation) blocks exploitation.</summary>
GateBlocked,
/// <summary>Compile-time configuration excludes vulnerable code.</summary>
CompileTimeExcluded,
/// <summary>VEX statement from authoritative source declares not_affected.</summary>
VexNotAffected,
/// <summary>Binary does not contain the vulnerable function.</summary>
FunctionAbsent,
/// <summary>Version is outside the affected range.</summary>
VersionNotAffected,
/// <summary>Platform/architecture not vulnerable.</summary>
PlatformNotAffected
}
```
### SuppressionWitness Model
```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;
/// <summary>
/// A DSSE-signable suppression witness documenting why a vulnerability is not exploitable.
/// Conforms to stellaops.suppression.v1 schema.
/// </summary>
public sealed record SuppressionWitness
{
/// <summary>Schema version identifier.</summary>
[JsonPropertyName("witness_schema")]
public string WitnessSchema { get; init; } = SuppressionWitnessSchema.Version;
/// <summary>Content-addressed witness ID (e.g., "sup:sha256:...").</summary>
[JsonPropertyName("witness_id")]
public required string WitnessId { get; init; }
/// <summary>The artifact (SBOM, component) this witness relates to.</summary>
[JsonPropertyName("artifact")]
public required WitnessArtifact Artifact { get; init; }
/// <summary>The vulnerability this witness concerns.</summary>
[JsonPropertyName("vuln")]
public required WitnessVuln Vuln { get; init; }
/// <summary>Type of suppression.</summary>
[JsonPropertyName("type")]
public required SuppressionType Type { get; init; }
/// <summary>Human-readable reason for suppression.</summary>
[JsonPropertyName("reason")]
public required string Reason { get; init; }
/// <summary>Detailed evidence supporting the suppression.</summary>
[JsonPropertyName("evidence")]
public required SuppressionEvidence Evidence { get; init; }
/// <summary>Confidence level (0.0 - 1.0).</summary>
[JsonPropertyName("confidence")]
public required double Confidence { get; init; }
/// <summary>When this witness was generated (UTC ISO-8601).</summary>
[JsonPropertyName("observed_at")]
public required DateTimeOffset ObservedAt { get; init; }
/// <summary>Optional expiration for time-bounded suppressions.</summary>
[JsonPropertyName("expires_at")]
public DateTimeOffset? ExpiresAt { get; init; }
/// <summary>Additional metadata.</summary>
[JsonPropertyName("metadata")]
public IReadOnlyDictionary<string, string>? Metadata { get; init; }
}
/// <summary>
/// Evidence supporting a suppression claim.
/// </summary>
public sealed record SuppressionEvidence
{
/// <summary>BLAKE3 digest of the call graph analyzed.</summary>
[JsonPropertyName("callgraph_digest")]
public string? CallgraphDigest { get; init; }
/// <summary>Build identifier for the analyzed artifact.</summary>
[JsonPropertyName("build_id")]
public string? BuildId { get; init; }
/// <summary>Linker map digest (for GC-based suppression).</summary>
[JsonPropertyName("linker_map_digest")]
public string? LinkerMapDigest { get; init; }
/// <summary>Symbol that was expected but absent.</summary>
[JsonPropertyName("absent_symbol")]
public AbsentSymbolInfo? AbsentSymbol { get; init; }
/// <summary>Patched symbol comparison.</summary>
[JsonPropertyName("patched_symbol")]
public PatchedSymbolInfo? PatchedSymbol { get; init; }
/// <summary>Feature flag that disables the code path.</summary>
[JsonPropertyName("feature_flag")]
public FeatureFlagInfo? FeatureFlag { get; init; }
/// <summary>Gates that block exploitation.</summary>
[JsonPropertyName("blocking_gates")]
public IReadOnlyList<DetectedGate>? BlockingGates { get; init; }
/// <summary>VEX statement reference.</summary>
[JsonPropertyName("vex_statement")]
public VexStatementRef? VexStatement { get; init; }
/// <summary>Version comparison evidence.</summary>
[JsonPropertyName("version_comparison")]
public VersionComparisonInfo? VersionComparison { get; init; }
/// <summary>SHA-256 digest of the analysis configuration.</summary>
[JsonPropertyName("analysis_config_digest")]
public string? AnalysisConfigDigest { get; init; }
}
/// <summary>Information about an absent symbol.</summary>
public sealed record AbsentSymbolInfo
{
[JsonPropertyName("symbol_id")]
public required string SymbolId { get; init; }
[JsonPropertyName("expected_in_version")]
public required string ExpectedInVersion { get; init; }
[JsonPropertyName("search_scope")]
public required string SearchScope { get; init; }
[JsonPropertyName("searched_binaries")]
public IReadOnlyList<string>? SearchedBinaries { get; init; }
}
/// <summary>Information about a patched symbol.</summary>
public sealed record PatchedSymbolInfo
{
[JsonPropertyName("symbol_id")]
public required string SymbolId { get; init; }
[JsonPropertyName("vulnerable_fingerprint")]
public required string VulnerableFingerprint { get; init; }
[JsonPropertyName("actual_fingerprint")]
public required string ActualFingerprint { get; init; }
[JsonPropertyName("similarity_score")]
public required double SimilarityScore { get; init; }
[JsonPropertyName("patch_source")]
public string? PatchSource { get; init; }
[JsonPropertyName("diff_summary")]
public string? DiffSummary { get; init; }
}
/// <summary>Information about a disabling feature flag.</summary>
public sealed record FeatureFlagInfo
{
[JsonPropertyName("flag_name")]
public required string FlagName { get; init; }
[JsonPropertyName("flag_value")]
public required string FlagValue { get; init; }
[JsonPropertyName("source")]
public required string Source { get; init; }
[JsonPropertyName("controls_symbol")]
public string? ControlsSymbol { get; init; }
}
/// <summary>Reference to a VEX statement.</summary>
public sealed record VexStatementRef
{
[JsonPropertyName("document_id")]
public required string DocumentId { get; init; }
[JsonPropertyName("statement_id")]
public required string StatementId { get; init; }
[JsonPropertyName("issuer")]
public required string Issuer { get; init; }
[JsonPropertyName("status")]
public required string Status { get; init; }
[JsonPropertyName("justification")]
public string? Justification { get; init; }
}
/// <summary>Version comparison evidence.</summary>
public sealed record VersionComparisonInfo
{
[JsonPropertyName("actual_version")]
public required string ActualVersion { get; init; }
[JsonPropertyName("affected_range")]
public required string AffectedRange { get; init; }
[JsonPropertyName("comparison_result")]
public required string ComparisonResult { get; init; }
}
```
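For orientation, a serialized witness for the version-not-affected case would look roughly as follows (illustrative values; artifact/vuln fields elided, and string enum serialization assumed):
```json
{
  "witness_schema": "stellaops.suppression.v1",
  "witness_id": "sup:sha256:...",
  "artifact": { },
  "vuln": { },
  "type": "VersionNotAffected",
  "reason": "Version 3.0.8 is outside affected range [1.0.0, 3.0.0)",
  "evidence": {
    "version_comparison": {
      "actual_version": "3.0.8",
      "affected_range": "[1.0.0, 3.0.0)",
      "comparison_result": "above_range"
    }
  },
  "confidence": 0.99,
  "observed_at": "2026-01-06T00:00:00Z"
}
```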
### SuppressionWitness Builder
```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;
/// <summary>
/// Builds suppression witnesses from analysis results.
/// </summary>
public interface ISuppressionWitnessBuilder
{
/// <summary>
/// Build a suppression witness for unreachable code.
/// </summary>
SuppressionWitness BuildUnreachable(
WitnessArtifact artifact,
WitnessVuln vuln,
string callgraphDigest,
string reason);
/// <summary>
/// Build a suppression witness for patched symbol.
/// </summary>
SuppressionWitness BuildPatchedSymbol(
WitnessArtifact artifact,
WitnessVuln vuln,
PatchedSymbolInfo patchInfo);
/// <summary>
/// Build a suppression witness for absent function.
/// </summary>
SuppressionWitness BuildFunctionAbsent(
WitnessArtifact artifact,
WitnessVuln vuln,
AbsentSymbolInfo absentInfo);
/// <summary>
/// Build a suppression witness for gate-blocked path.
/// </summary>
SuppressionWitness BuildGateBlocked(
WitnessArtifact artifact,
WitnessVuln vuln,
IReadOnlyList<DetectedGate> blockingGates);
/// <summary>
/// Build a suppression witness for feature flag disabled.
/// </summary>
SuppressionWitness BuildFeatureFlagDisabled(
WitnessArtifact artifact,
WitnessVuln vuln,
FeatureFlagInfo flagInfo);
/// <summary>
/// Build a suppression witness from VEX not_affected statement.
/// </summary>
SuppressionWitness BuildFromVexStatement(
WitnessArtifact artifact,
WitnessVuln vuln,
VexStatementRef vexStatement);
/// <summary>
/// Build a suppression witness for version not in affected range.
/// </summary>
SuppressionWitness BuildVersionNotAffected(
WitnessArtifact artifact,
WitnessVuln vuln,
VersionComparisonInfo versionInfo);
}
public sealed class SuppressionWitnessBuilder : ISuppressionWitnessBuilder
{
private readonly TimeProvider _timeProvider;
private readonly ILogger<SuppressionWitnessBuilder> _logger;
public SuppressionWitnessBuilder(
TimeProvider timeProvider,
ILogger<SuppressionWitnessBuilder> logger)
{
_timeProvider = timeProvider;
_logger = logger;
}
public SuppressionWitness BuildUnreachable(
WitnessArtifact artifact,
WitnessVuln vuln,
string callgraphDigest,
string reason)
{
var evidence = new SuppressionEvidence
{
CallgraphDigest = callgraphDigest
};
return Build(
artifact,
vuln,
SuppressionType.Unreachable,
reason,
evidence,
confidence: 0.95);
}
public SuppressionWitness BuildPatchedSymbol(
WitnessArtifact artifact,
WitnessVuln vuln,
PatchedSymbolInfo patchInfo)
{
var evidence = new SuppressionEvidence
{
PatchedSymbol = patchInfo
};
var reason = $"Symbol `{patchInfo.SymbolId}` differs from vulnerable version " +
$"(similarity: {patchInfo.SimilarityScore:P1})";
// Confidence based on similarity: lower similarity = higher confidence it's patched
var confidence = 1.0 - patchInfo.SimilarityScore;
return Build(
artifact,
vuln,
SuppressionType.PatchedSymbol,
reason,
evidence,
confidence);
}
public SuppressionWitness BuildFunctionAbsent(
WitnessArtifact artifact,
WitnessVuln vuln,
AbsentSymbolInfo absentInfo)
{
var evidence = new SuppressionEvidence
{
AbsentSymbol = absentInfo
};
var reason = $"Vulnerable symbol `{absentInfo.SymbolId}` not found in binary";
return Build(
artifact,
vuln,
SuppressionType.FunctionAbsent,
reason,
evidence,
confidence: 0.90);
}
public SuppressionWitness BuildGateBlocked(
WitnessArtifact artifact,
WitnessVuln vuln,
IReadOnlyList<DetectedGate> blockingGates)
{
var evidence = new SuppressionEvidence
{
BlockingGates = blockingGates
};
var gateTypes = string.Join(", ", blockingGates.Select(g => g.Type).Distinct());
var reason = $"Exploitation blocked by gates: {gateTypes}";
// Confidence based on minimum gate confidence; guard against an empty
// gate list so Min() cannot throw on a caller error.
var confidence = blockingGates.Count > 0
    ? blockingGates.Min(g => g.Confidence)
    : 0.0;
return Build(
artifact,
vuln,
SuppressionType.GateBlocked,
reason,
evidence,
confidence);
}
public SuppressionWitness BuildFeatureFlagDisabled(
WitnessArtifact artifact,
WitnessVuln vuln,
FeatureFlagInfo flagInfo)
{
var evidence = new SuppressionEvidence
{
FeatureFlag = flagInfo
};
var reason = $"Feature flag `{flagInfo.FlagName}` = `{flagInfo.FlagValue}` disables vulnerable code path";
return Build(
artifact,
vuln,
SuppressionType.FeatureFlagDisabled,
reason,
evidence,
confidence: 0.85);
}
public SuppressionWitness BuildFromVexStatement(
WitnessArtifact artifact,
WitnessVuln vuln,
VexStatementRef vexStatement)
{
var evidence = new SuppressionEvidence
{
VexStatement = vexStatement
};
var reason = vexStatement.Justification
?? $"VEX statement from {vexStatement.Issuer} declares not_affected";
return Build(
artifact,
vuln,
SuppressionType.VexNotAffected,
reason,
evidence,
confidence: 0.95);
}
public SuppressionWitness BuildVersionNotAffected(
WitnessArtifact artifact,
WitnessVuln vuln,
VersionComparisonInfo versionInfo)
{
var evidence = new SuppressionEvidence
{
VersionComparison = versionInfo
};
var reason = $"Version {versionInfo.ActualVersion} is outside affected range {versionInfo.AffectedRange}";
return Build(
artifact,
vuln,
SuppressionType.VersionNotAffected,
reason,
evidence,
confidence: 0.99);
}
private SuppressionWitness Build(
WitnessArtifact artifact,
WitnessVuln vuln,
SuppressionType type,
string reason,
SuppressionEvidence evidence,
double confidence)
{
var observedAt = _timeProvider.GetUtcNow();
var witness = new SuppressionWitness
{
WitnessId = "", // Computed below
Artifact = artifact,
Vuln = vuln,
Type = type,
Reason = reason,
Evidence = evidence,
Confidence = Math.Round(confidence, 4),
ObservedAt = observedAt
};
// Compute content-addressed ID
var witnessId = ComputeWitnessId(witness);
witness = witness with { WitnessId = witnessId };
_logger.LogDebug(
"Built suppression witness {WitnessId} for {VulnId} on {Component}: {Type}",
witnessId, vuln.Id, artifact.ComponentPurl, type);
return witness;
}
private static string ComputeWitnessId(SuppressionWitness witness)
{
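        // Determinism note: ObservedAt, Confidence, ExpiresAt, and Metadata are
        // intentionally excluded from the canonical payload so identical evidence
        // yields the same witness_id regardless of when the witness was built.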
var canonical = CanonicalJsonSerializer.Serialize(new
{
artifact = witness.Artifact,
vuln = witness.Vuln,
type = witness.Type.ToString(),
reason = witness.Reason,
evidence_callgraph = witness.Evidence.CallgraphDigest,
evidence_build_id = witness.Evidence.BuildId,
evidence_patched = witness.Evidence.PatchedSymbol?.ActualFingerprint,
evidence_vex = witness.Evidence.VexStatement?.StatementId
});
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
return $"sup:sha256:{Convert.ToHexString(hash).ToLowerInvariant()}";
}
}
```
### DSSE Signing
```csharp
namespace StellaOps.Scanner.Reachability.Witnesses;
/// <summary>
/// Signs suppression witnesses with DSSE.
/// </summary>
public interface ISuppressionDsseSigner
{
/// <summary>
/// Sign a suppression witness.
/// </summary>
Task<DsseEnvelope> SignAsync(
SuppressionWitness witness,
string keyId,
CancellationToken ct = default);
/// <summary>
/// Verify a signed suppression witness.
/// </summary>
Task<bool> VerifyAsync(
DsseEnvelope envelope,
CancellationToken ct = default);
}
public sealed class SuppressionDsseSigner : ISuppressionDsseSigner
{
public const string PredicateType = "stellaops.dev/predicates/suppression-witness@v1";
private readonly ISigningService _signingService;
private readonly ILogger<SuppressionDsseSigner> _logger;
public SuppressionDsseSigner(
ISigningService signingService,
ILogger<SuppressionDsseSigner> logger)
{
_signingService = signingService;
_logger = logger;
}
public async Task<DsseEnvelope> SignAsync(
SuppressionWitness witness,
string keyId,
CancellationToken ct = default)
{
var payload = CanonicalJsonSerializer.Serialize(witness);
var payloadBytes = Encoding.UTF8.GetBytes(payload);
var pae = DsseHelper.ComputePreAuthenticationEncoding(
PredicateType,
payloadBytes);
var signature = await _signingService.SignAsync(
pae,
keyId,
ct);
var envelope = new DsseEnvelope
{
PayloadType = PredicateType,
Payload = Convert.ToBase64String(payloadBytes),
Signatures =
[
new DsseSignature
{
KeyId = keyId,
Sig = Convert.ToBase64String(signature)
}
]
};
_logger.LogInformation(
"Signed suppression witness {WitnessId} with key {KeyId}",
witness.WitnessId, keyId);
return envelope;
}
public async Task<bool> VerifyAsync(
DsseEnvelope envelope,
CancellationToken ct = default)
{
if (envelope.PayloadType != PredicateType)
{
_logger.LogWarning(
"Invalid payload type: expected {Expected}, got {Actual}",
PredicateType, envelope.PayloadType);
return false;
}
var payloadBytes = Convert.FromBase64String(envelope.Payload);
var pae = DsseHelper.ComputePreAuthenticationEncoding(
PredicateType,
payloadBytes);
foreach (var sig in envelope.Signatures)
{
var signatureBytes = Convert.FromBase64String(sig.Sig);
var valid = await _signingService.VerifyAsync(
pae,
signatureBytes,
sig.KeyId,
ct);
if (!valid)
{
_logger.LogWarning(
"Signature verification failed for key {KeyId}",
sig.KeyId);
return false;
}
}
return true;
}
}
```
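`DsseHelper.ComputePreAuthenticationEncoding` is assumed to implement the standard DSSE v1 pre-authentication encoding, `PAE(type, body) = "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body`; a minimal sketch for reference:
```csharp
using System.Text;

internal static class DsseHelper
{
    // Lengths are ASCII base-10 byte counts, per the DSSE v1 specification.
    public static byte[] ComputePreAuthenticationEncoding(string payloadType, byte[] payload)
    {
        var typeBytes = Encoding.UTF8.GetBytes(payloadType);
        var header = Encoding.UTF8.GetBytes(
            $"DSSEv1 {typeBytes.Length} {payloadType} {payload.Length} ");
        var pae = new byte[header.Length + payload.Length];
        header.CopyTo(pae, 0);
        payload.CopyTo(pae, header.Length);
        return pae;
    }
}
```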
### Integration with Reachability Evaluator
```csharp
namespace StellaOps.Scanner.Reachability.Stack;
public sealed class ReachabilityStackEvaluator
{
private readonly ISuppressionWitnessBuilder _suppressionBuilder;
// ... existing dependencies
/// <summary>
/// Evaluate reachability and produce either PathWitness (affected) or SuppressionWitness (not affected).
/// </summary>
public async Task<ReachabilityResult> EvaluateAsync(
RichGraph graph,
WitnessArtifact artifact,
WitnessVuln vuln,
string targetSymbol,
CancellationToken ct = default)
{
// L1: Static analysis
var staticResult = await EvaluateStaticReachabilityAsync(graph, targetSymbol, ct);
if (staticResult.Verdict == ReachabilityVerdict.Unreachable)
{
var suppression = _suppressionBuilder.BuildUnreachable(
artifact,
vuln,
staticResult.CallgraphDigest,
"No path from any entry point to vulnerable symbol");
return ReachabilityResult.NotAffected(suppression);
}
// L2: Binary resolution
var binaryResult = await EvaluateBinaryResolutionAsync(artifact, targetSymbol, ct);
if (binaryResult.FunctionAbsent)
{
var suppression = _suppressionBuilder.BuildFunctionAbsent(
artifact,
vuln,
binaryResult.AbsentSymbolInfo!);
return ReachabilityResult.NotAffected(suppression);
}
if (binaryResult.IsPatched)
{
var suppression = _suppressionBuilder.BuildPatchedSymbol(
artifact,
vuln,
binaryResult.PatchedSymbolInfo!);
return ReachabilityResult.NotAffected(suppression);
}
// L3: Runtime gating
var gateResult = await EvaluateGatesAsync(graph, staticResult.Path!, ct);
if (gateResult.AllPathsBlocked)
{
var suppression = _suppressionBuilder.BuildGateBlocked(
artifact,
vuln,
gateResult.BlockingGates);
return ReachabilityResult.NotAffected(suppression);
}
// Reachable - build PathWitness
var pathWitness = await _pathWitnessBuilder.BuildAsync(
artifact,
vuln,
staticResult.Path!,
gateResult.DetectedGates,
ct);
return ReachabilityResult.Affected(pathWitness);
}
}
public sealed record ReachabilityResult
{
public required ReachabilityVerdict Verdict { get; init; }
public PathWitness? PathWitness { get; init; }
public SuppressionWitness? SuppressionWitness { get; init; }
public static ReachabilityResult Affected(PathWitness witness) =>
new() { Verdict = ReachabilityVerdict.Affected, PathWitness = witness };
public static ReachabilityResult NotAffected(SuppressionWitness witness) =>
new() { Verdict = ReachabilityVerdict.NotAffected, SuppressionWitness = witness };
}
public enum ReachabilityVerdict
{
Affected,
NotAffected,
Unknown
}
```
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | SUP-001 | TODO | - | - | Define `SuppressionType` enum |
| 2 | SUP-002 | TODO | SUP-001 | - | Define `SuppressionWitness` record |
| 3 | SUP-003 | TODO | SUP-002 | - | Define `SuppressionEvidence` and sub-records |
| 4 | SUP-004 | TODO | SUP-003 | - | Define `SuppressionWitnessSchema` version |
| 5 | SUP-005 | TODO | SUP-004 | - | Define `ISuppressionWitnessBuilder` interface |
| 6 | SUP-006 | TODO | SUP-005 | - | Implement `SuppressionWitnessBuilder.BuildUnreachable()` |
| 7 | SUP-007 | TODO | SUP-006 | - | Implement `SuppressionWitnessBuilder.BuildPatchedSymbol()` |
| 8 | SUP-008 | TODO | SUP-007 | - | Implement `SuppressionWitnessBuilder.BuildFunctionAbsent()` |
| 9 | SUP-009 | TODO | SUP-008 | - | Implement `SuppressionWitnessBuilder.BuildGateBlocked()` |
| 10 | SUP-010 | TODO | SUP-009 | - | Implement `SuppressionWitnessBuilder.BuildFeatureFlagDisabled()` |
| 11 | SUP-011 | TODO | SUP-010 | - | Implement `SuppressionWitnessBuilder.BuildFromVexStatement()` |
| 12 | SUP-012 | TODO | SUP-011 | - | Implement `SuppressionWitnessBuilder.BuildVersionNotAffected()` |
| 13 | SUP-013 | TODO | SUP-012 | - | Implement content-addressed witness ID computation |
| 14 | SUP-014 | TODO | SUP-013 | - | Define `ISuppressionDsseSigner` interface |
| 15 | SUP-015 | TODO | SUP-014 | - | Implement `SuppressionDsseSigner.SignAsync()` |
| 16 | SUP-016 | TODO | SUP-015 | - | Implement `SuppressionDsseSigner.VerifyAsync()` |
| 17 | SUP-017 | TODO | SUP-016 | - | Create `ReachabilityResult` unified result type |
| 18 | SUP-018 | TODO | SUP-017 | - | Integrate SuppressionWitnessBuilder into ReachabilityStackEvaluator |
| 19 | SUP-019 | TODO | SUP-018 | - | Add service registration extensions |
| 20 | SUP-020 | TODO | SUP-019 | - | Write unit tests: SuppressionWitnessBuilder (all types) |
| 21 | SUP-021 | TODO | SUP-020 | - | Write unit tests: SuppressionDsseSigner |
| 22 | SUP-022 | TODO | SUP-021 | - | Write unit tests: ReachabilityStackEvaluator with suppression |
| 23 | SUP-023 | TODO | SUP-022 | - | Write golden fixture tests for witness serialization |
| 24 | SUP-024 | TODO | SUP-023 | - | Write property tests: witness ID determinism |
| 25 | SUP-025 | TODO | SUP-024 | - | Add JSON schema for SuppressionWitness (stellaops.suppression.v1) |
| 26 | SUP-026 | TODO | SUP-025 | - | Document suppression types in docs/modules/scanner/ |
| 27 | SUP-027 | TODO | SUP-026 | - | Expose suppression witnesses via Scanner.WebService API |
## Acceptance Criteria
1. **Completeness:** All 10 suppression types have dedicated builders
2. **DSSE Signing:** All suppression witnesses are signable with DSSE
3. **Determinism:** Same inputs produce identical witness IDs (content-addressed)
4. **Schema:** JSON schema registered at `stellaops.suppression.v1`
5. **Integration:** ReachabilityStackEvaluator returns SuppressionWitness for not-affected findings
6. **Test Coverage:** Unit tests for all builder methods, property tests for determinism
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| 10 suppression types | Covers all common not-affected scenarios per advisory |
| Content-addressed IDs | Enables caching and deduplication |
| Confidence scores | Different evidence has different reliability |
| Optional expiration | Some suppressions are time-bounded (e.g., pending patches) |

| Risk | Mitigation |
|------|------------|
| False suppression | Confidence thresholds; manual review for low confidence |
| Missing suppression type | Extensible enum; can add new types |
| Complex evidence | Structured sub-records for each type |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |


---

# Sprint 20260106_001_003_BINDEX - Symbol Table Diff
## Topic & Scope
Extend `PatchDiffEngine` with symbol table comparison capabilities to track exported/imported symbol changes, version maps, and GOT/PLT table modifications between binary versions.
- **Working directory:** `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`
- **Evidence:** SymbolTableDiff model, analyzer, tests, integration with MaterialChange
## Problem Statement
The product advisory requires **per-layer diffs** including:
> **Symbols:** exported symbols and version maps; highlight ABI-relevant changes.
Current state:
- `PatchDiffEngine` compares **function bodies** (fingerprints, CFG, basic blocks)
- `DeltaSignatureGenerator` creates CVE signatures at function level
- No comparison of:
- Exported symbol table (.dynsym, .symtab)
- Imported symbols and version requirements (.gnu.version_r)
- Symbol versioning maps (.gnu.version, .gnu.version_d)
- GOT/PLT entries (dynamic linking)
- Relocation entries
**Gap:** Symbol-level changes between binaries are not detected or reported.
## Dependencies & Concurrency
- **Depends on:** StellaOps.BinaryIndex.Disassembly (for ELF/PE parsing)
- **Blocks:** SPRINT_20260106_001_004_LB (orchestrator uses symbol diffs)
- **Parallel safe:** Extends existing module; no conflicts
## Documentation Prerequisites
- docs/modules/binary-index/architecture.md
- src/BinaryIndex/AGENTS.md
- Existing PatchDiffEngine at `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Builders/`
## Technical Design
### Data Contracts
```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;
/// <summary>
/// Complete symbol table diff between two binaries.
/// </summary>
public sealed record SymbolTableDiff
{
/// <summary>Content-addressed diff ID.</summary>
[JsonPropertyName("diff_id")]
public required string DiffId { get; init; }
/// <summary>Base binary identity.</summary>
[JsonPropertyName("base")]
public required BinaryRef Base { get; init; }
/// <summary>Target binary identity.</summary>
[JsonPropertyName("target")]
public required BinaryRef Target { get; init; }
/// <summary>Exported symbol changes.</summary>
[JsonPropertyName("exports")]
public required SymbolChangeSummary Exports { get; init; }
/// <summary>Imported symbol changes.</summary>
[JsonPropertyName("imports")]
public required SymbolChangeSummary Imports { get; init; }
/// <summary>Version map changes.</summary>
[JsonPropertyName("versions")]
public required VersionMapDiff Versions { get; init; }
/// <summary>GOT/PLT changes (dynamic linking).</summary>
[JsonPropertyName("dynamic")]
public DynamicLinkingDiff? Dynamic { get; init; }
/// <summary>Overall ABI compatibility assessment.</summary>
[JsonPropertyName("abi_compatibility")]
public required AbiCompatibility AbiCompatibility { get; init; }
/// <summary>When this diff was computed (UTC).</summary>
[JsonPropertyName("computed_at")]
public required DateTimeOffset ComputedAt { get; init; }
}
/// <summary>Reference to a binary.</summary>
public sealed record BinaryRef
{
[JsonPropertyName("path")]
public required string Path { get; init; }
[JsonPropertyName("sha256")]
public required string Sha256 { get; init; }
[JsonPropertyName("build_id")]
public string? BuildId { get; init; }
[JsonPropertyName("architecture")]
public required string Architecture { get; init; }
}
/// <summary>Summary of symbol changes.</summary>
public sealed record SymbolChangeSummary
{
[JsonPropertyName("added")]
public required IReadOnlyList<SymbolChange> Added { get; init; }
[JsonPropertyName("removed")]
public required IReadOnlyList<SymbolChange> Removed { get; init; }
[JsonPropertyName("modified")]
public required IReadOnlyList<SymbolModification> Modified { get; init; }
[JsonPropertyName("renamed")]
public required IReadOnlyList<SymbolRename> Renamed { get; init; }
/// <summary>Count summaries.</summary>
[JsonPropertyName("counts")]
public required SymbolChangeCounts Counts { get; init; }
}
public sealed record SymbolChangeCounts
{
[JsonPropertyName("added")]
public int Added { get; init; }
[JsonPropertyName("removed")]
public int Removed { get; init; }
[JsonPropertyName("modified")]
public int Modified { get; init; }
[JsonPropertyName("renamed")]
public int Renamed { get; init; }
[JsonPropertyName("unchanged")]
public int Unchanged { get; init; }
[JsonPropertyName("total_base")]
public int TotalBase { get; init; }
[JsonPropertyName("total_target")]
public int TotalTarget { get; init; }
}
/// <summary>A single symbol change.</summary>
public sealed record SymbolChange
{
[JsonPropertyName("name")]
public required string Name { get; init; }
[JsonPropertyName("demangled")]
public string? Demangled { get; init; }
[JsonPropertyName("type")]
public required SymbolType Type { get; init; }
[JsonPropertyName("binding")]
public required SymbolBinding Binding { get; init; }
[JsonPropertyName("visibility")]
public required SymbolVisibility Visibility { get; init; }
[JsonPropertyName("version")]
public string? Version { get; init; }
[JsonPropertyName("address")]
public ulong? Address { get; init; }
[JsonPropertyName("size")]
public ulong? Size { get; init; }
[JsonPropertyName("section")]
public string? Section { get; init; }
}
/// <summary>A symbol that was modified.</summary>
public sealed record SymbolModification
{
[JsonPropertyName("name")]
public required string Name { get; init; }
[JsonPropertyName("demangled")]
public string? Demangled { get; init; }
[JsonPropertyName("changes")]
public required IReadOnlyList<SymbolFieldChange> Changes { get; init; }
[JsonPropertyName("abi_breaking")]
public bool AbiBreaking { get; init; }
}
public sealed record SymbolFieldChange
{
[JsonPropertyName("field")]
public required string Field { get; init; }
[JsonPropertyName("old_value")]
public required string OldValue { get; init; }
[JsonPropertyName("new_value")]
public required string NewValue { get; init; }
}
/// <summary>A symbol that was renamed.</summary>
public sealed record SymbolRename
{
[JsonPropertyName("old_name")]
public required string OldName { get; init; }
[JsonPropertyName("new_name")]
public required string NewName { get; init; }
[JsonPropertyName("confidence")]
public required double Confidence { get; init; }
[JsonPropertyName("reason")]
public required string Reason { get; init; }
}
public enum SymbolType
{
Function,
Object,
TlsObject,
Section,
File,
Common,
Indirect,
Unknown
}
public enum SymbolBinding
{
Local,
Global,
Weak,
Unknown
}
public enum SymbolVisibility
{
Default,
Internal,
Hidden,
Protected
}
/// <summary>Version map changes.</summary>
public sealed record VersionMapDiff
{
/// <summary>Version definitions added.</summary>
[JsonPropertyName("definitions_added")]
public required IReadOnlyList<VersionDefinition> DefinitionsAdded { get; init; }
/// <summary>Version definitions removed.</summary>
[JsonPropertyName("definitions_removed")]
public required IReadOnlyList<VersionDefinition> DefinitionsRemoved { get; init; }
/// <summary>Version requirements added.</summary>
[JsonPropertyName("requirements_added")]
public required IReadOnlyList<VersionRequirement> RequirementsAdded { get; init; }
/// <summary>Version requirements removed.</summary>
[JsonPropertyName("requirements_removed")]
public required IReadOnlyList<VersionRequirement> RequirementsRemoved { get; init; }
/// <summary>Symbols with version changes.</summary>
[JsonPropertyName("symbol_version_changes")]
public required IReadOnlyList<SymbolVersionChange> SymbolVersionChanges { get; init; }
}
public sealed record VersionDefinition
{
[JsonPropertyName("name")]
public required string Name { get; init; }
[JsonPropertyName("index")]
public int Index { get; init; }
[JsonPropertyName("predecessors")]
public IReadOnlyList<string>? Predecessors { get; init; }
}
public sealed record VersionRequirement
{
[JsonPropertyName("library")]
public required string Library { get; init; }
[JsonPropertyName("version")]
public required string Version { get; init; }
[JsonPropertyName("symbols")]
public IReadOnlyList<string>? Symbols { get; init; }
}
public sealed record SymbolVersionChange
{
[JsonPropertyName("symbol")]
public required string Symbol { get; init; }
[JsonPropertyName("old_version")]
public required string OldVersion { get; init; }
[JsonPropertyName("new_version")]
public required string NewVersion { get; init; }
}
/// <summary>Dynamic linking changes (GOT/PLT).</summary>
public sealed record DynamicLinkingDiff
{
/// <summary>GOT entries added.</summary>
[JsonPropertyName("got_added")]
public required IReadOnlyList<GotEntry> GotAdded { get; init; }
/// <summary>GOT entries removed.</summary>
[JsonPropertyName("got_removed")]
public required IReadOnlyList<GotEntry> GotRemoved { get; init; }
/// <summary>PLT entries added.</summary>
[JsonPropertyName("plt_added")]
public required IReadOnlyList<PltEntry> PltAdded { get; init; }
/// <summary>PLT entries removed.</summary>
[JsonPropertyName("plt_removed")]
public required IReadOnlyList<PltEntry> PltRemoved { get; init; }
/// <summary>Relocation changes.</summary>
[JsonPropertyName("relocation_changes")]
public IReadOnlyList<RelocationChange>? RelocationChanges { get; init; }
}
public sealed record GotEntry
{
[JsonPropertyName("symbol")]
public required string Symbol { get; init; }
[JsonPropertyName("offset")]
public ulong Offset { get; init; }
}
public sealed record PltEntry
{
[JsonPropertyName("symbol")]
public required string Symbol { get; init; }
[JsonPropertyName("address")]
public ulong Address { get; init; }
}
public sealed record RelocationChange
{
[JsonPropertyName("type")]
public required string Type { get; init; }
[JsonPropertyName("symbol")]
public required string Symbol { get; init; }
[JsonPropertyName("change_kind")]
public required string ChangeKind { get; init; }
}
/// <summary>ABI compatibility assessment.</summary>
public sealed record AbiCompatibility
{
[JsonPropertyName("level")]
public required AbiCompatibilityLevel Level { get; init; }
[JsonPropertyName("breaking_changes")]
public required IReadOnlyList<AbiBreakingChange> BreakingChanges { get; init; }
[JsonPropertyName("score")]
public required double Score { get; init; }
}
public enum AbiCompatibilityLevel
{
/// <summary>Fully backward compatible.</summary>
Compatible,
/// <summary>Minor changes, likely compatible.</summary>
MinorChanges,
/// <summary>Breaking changes detected.</summary>
Breaking,
/// <summary>Cannot determine compatibility.</summary>
Unknown
}
public sealed record AbiBreakingChange
{
[JsonPropertyName("category")]
public required string Category { get; init; }
[JsonPropertyName("symbol")]
public required string Symbol { get; init; }
[JsonPropertyName("description")]
public required string Description { get; init; }
[JsonPropertyName("severity")]
public required string Severity { get; init; }
}
```
### Symbol Table Analyzer Interface
```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;
/// <summary>
/// Analyzes symbol table differences between binaries.
/// </summary>
public interface ISymbolTableDiffAnalyzer
{
/// <summary>
/// Compute symbol table diff between two binaries.
/// </summary>
Task<SymbolTableDiff> ComputeDiffAsync(
string basePath,
string targetPath,
SymbolDiffOptions? options = null,
CancellationToken ct = default);
/// <summary>
/// Extract symbol table from a binary.
/// </summary>
Task<SymbolTable> ExtractSymbolTableAsync(
string binaryPath,
CancellationToken ct = default);
}
/// <summary>
/// Options for symbol diff analysis.
/// </summary>
public sealed record SymbolDiffOptions
{
/// <summary>Include local symbols (default: false).</summary>
public bool IncludeLocalSymbols { get; init; } = false;
/// <summary>Include debug symbols (default: false).</summary>
public bool IncludeDebugSymbols { get; init; } = false;
/// <summary>Demangle C++ symbols (default: true).</summary>
public bool Demangle { get; init; } = true;
/// <summary>Detect renames via fingerprint matching (default: true).</summary>
public bool DetectRenames { get; init; } = true;
/// <summary>Minimum confidence for rename detection (default: 0.7).</summary>
public double RenameConfidenceThreshold { get; init; } = 0.7;
/// <summary>Include GOT/PLT analysis (default: true).</summary>
public bool IncludeDynamicLinking { get; init; } = true;
/// <summary>Include version map analysis (default: true).</summary>
public bool IncludeVersionMaps { get; init; } = true;
}
/// <summary>
/// Extracted symbol table from a binary.
/// </summary>
public sealed record SymbolTable
{
public required string BinaryPath { get; init; }
public required string Sha256 { get; init; }
public string? BuildId { get; init; }
public required string Architecture { get; init; }
public required IReadOnlyList<Symbol> Exports { get; init; }
public required IReadOnlyList<Symbol> Imports { get; init; }
public required IReadOnlyList<VersionDefinition> VersionDefinitions { get; init; }
public required IReadOnlyList<VersionRequirement> VersionRequirements { get; init; }
public IReadOnlyList<GotEntry>? GotEntries { get; init; }
public IReadOnlyList<PltEntry>? PltEntries { get; init; }
}
public sealed record Symbol
{
public required string Name { get; init; }
public string? Demangled { get; init; }
public required SymbolType Type { get; init; }
public required SymbolBinding Binding { get; init; }
public required SymbolVisibility Visibility { get; init; }
public string? Version { get; init; }
public ulong Address { get; init; }
public ulong Size { get; init; }
public string? Section { get; init; }
public string? Fingerprint { get; init; }
}
```
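Intended call shape (paths and option values are placeholders):
```csharp
var diff = await analyzer.ComputeDiffAsync(
    "artifacts/libcrypto.so.1.1",   // base binary (placeholder path)
    "artifacts/libcrypto.so.3",     // target binary (placeholder path)
    new SymbolDiffOptions { IncludeLocalSymbols = false },
    ct);

Console.WriteLine(
    $"{diff.AbiCompatibility.Level}: " +
    $"exports +{diff.Exports.Counts.Added}/-{diff.Exports.Counts.Removed}");
```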
### Symbol Table Diff Analyzer Implementation
```csharp
namespace StellaOps.BinaryIndex.Builders.SymbolDiff;
public sealed class SymbolTableDiffAnalyzer : ISymbolTableDiffAnalyzer
{
private readonly IDisassemblyService _disassembly;
private readonly IFunctionFingerprintExtractor _fingerprinter;
private readonly TimeProvider _timeProvider;
private readonly ILogger<SymbolTableDiffAnalyzer> _logger;
public SymbolTableDiffAnalyzer(
IDisassemblyService disassembly,
IFunctionFingerprintExtractor fingerprinter,
TimeProvider timeProvider,
ILogger<SymbolTableDiffAnalyzer> logger)
{
_disassembly = disassembly;
_fingerprinter = fingerprinter;
_timeProvider = timeProvider;
_logger = logger;
}
public async Task<SymbolTableDiff> ComputeDiffAsync(
string basePath,
string targetPath,
SymbolDiffOptions? options = null,
CancellationToken ct = default)
{
options ??= new SymbolDiffOptions();
var baseTable = await ExtractSymbolTableAsync(basePath, ct);
var targetTable = await ExtractSymbolTableAsync(targetPath, ct);
var exports = ComputeSymbolChanges(
baseTable.Exports, targetTable.Exports, options);
var imports = ComputeSymbolChanges(
baseTable.Imports, targetTable.Imports, options);
var versions = ComputeVersionDiff(baseTable, targetTable);
DynamicLinkingDiff? dynamic = null;
if (options.IncludeDynamicLinking)
{
dynamic = ComputeDynamicLinkingDiff(baseTable, targetTable);
}
var abiCompatibility = AssessAbiCompatibility(exports, imports, versions);
var diff = new SymbolTableDiff
{
DiffId = ComputeDiffId(baseTable, targetTable),
Base = new BinaryRef
{
Path = basePath,
Sha256 = baseTable.Sha256,
BuildId = baseTable.BuildId,
Architecture = baseTable.Architecture
},
Target = new BinaryRef
{
Path = targetPath,
Sha256 = targetTable.Sha256,
BuildId = targetTable.BuildId,
Architecture = targetTable.Architecture
},
Exports = exports,
Imports = imports,
Versions = versions,
Dynamic = dynamic,
AbiCompatibility = abiCompatibility,
ComputedAt = _timeProvider.GetUtcNow()
};
_logger.LogInformation(
"Computed symbol diff {DiffId}: exports (+{Added}/-{Removed}), " +
"imports (+{ImpAdded}/-{ImpRemoved}), ABI={AbiLevel}",
diff.DiffId,
exports.Counts.Added, exports.Counts.Removed,
imports.Counts.Added, imports.Counts.Removed,
abiCompatibility.Level);
return diff;
}
public async Task<SymbolTable> ExtractSymbolTableAsync(
string binaryPath,
CancellationToken ct = default)
{
var binary = await _disassembly.LoadBinaryAsync(binaryPath, ct);
var exports = new List<Symbol>();
var imports = new List<Symbol>();
foreach (var sym in binary.Symbols)
{
var symbol = new Symbol
{
Name = sym.Name,
Demangled = Demangle(sym.Name),
Type = MapSymbolType(sym.Type),
Binding = MapSymbolBinding(sym.Binding),
Visibility = MapSymbolVisibility(sym.Visibility),
Version = sym.Version,
Address = sym.Address,
Size = sym.Size,
Section = sym.Section,
Fingerprint = sym.Type == ElfSymbolType.Function
? await ComputeFingerprintAsync(binary, sym, ct)
: null
};
if (sym.IsExport)
{
exports.Add(symbol);
}
else if (sym.IsImport)
{
imports.Add(symbol);
}
}
return new SymbolTable
{
BinaryPath = binaryPath,
Sha256 = binary.Sha256,
BuildId = binary.BuildId,
Architecture = binary.Architecture,
Exports = exports,
Imports = imports,
VersionDefinitions = ExtractVersionDefinitions(binary),
VersionRequirements = ExtractVersionRequirements(binary),
GotEntries = ExtractGotEntries(binary),
PltEntries = ExtractPltEntries(binary)
};
}
private SymbolChangeSummary ComputeSymbolChanges(
IReadOnlyList<Symbol> baseSymbols,
IReadOnlyList<Symbol> targetSymbols,
SymbolDiffOptions options)
{
// ELF symbol tables may legitimately contain the same name at multiple
// versions (e.g., memcpy@GLIBC_2.2.5 vs memcpy@@GLIBC_2.14), so key by
// name@version to keep ToDictionary from throwing on duplicates.
static string Key(Symbol s) => s.Version is null ? s.Name : $"{s.Name}@{s.Version}";
var baseByName = baseSymbols.ToDictionary(Key);
var targetByName = targetSymbols.ToDictionary(Key);
var added = new List<SymbolChange>();
var removed = new List<SymbolChange>();
var modified = new List<SymbolModification>();
var renamed = new List<SymbolRename>();
var unchanged = 0;
// Find added symbols
foreach (var (name, sym) in targetByName)
{
if (!baseByName.ContainsKey(name))
{
added.Add(MapToChange(sym));
}
}
// Find removed and modified symbols
foreach (var (name, baseSym) in baseByName)
{
if (!targetByName.TryGetValue(name, out var targetSym))
{
removed.Add(MapToChange(baseSym));
}
else
{
var changes = CompareSymbols(baseSym, targetSym);
if (changes.Count > 0)
{
modified.Add(new SymbolModification
{
Name = name,
Demangled = baseSym.Demangled,
Changes = changes,
AbiBreaking = IsAbiBreaking(changes)
});
}
else
{
unchanged++;
}
}
}
// Detect renames (removed symbol with matching fingerprint in added)
if (options.DetectRenames)
{
renamed = DetectRenames(
removed, added,
options.RenameConfidenceThreshold);
// Remove detected renames from added/removed lists
var renamedOld = renamed.Select(r => r.OldName).ToHashSet();
var renamedNew = renamed.Select(r => r.NewName).ToHashSet();
removed = removed.Where(s => !renamedOld.Contains(s.Name)).ToList();
added = added.Where(s => !renamedNew.Contains(s.Name)).ToList();
}
return new SymbolChangeSummary
{
Added = added,
Removed = removed,
Modified = modified,
Renamed = renamed,
Counts = new SymbolChangeCounts
{
Added = added.Count,
Removed = removed.Count,
Modified = modified.Count,
Renamed = renamed.Count,
Unchanged = unchanged,
TotalBase = baseSymbols.Count,
TotalTarget = targetSymbols.Count
}
};
}
private List<SymbolRename> DetectRenames(
List<SymbolChange> removed,
List<SymbolChange> added,
double threshold)
{
var renames = new List<SymbolRename>();
// Match by fingerprint (for functions with computed fingerprints)
var removedFunctions = removed
.Where(s => s.Type == SymbolType.Function)
.ToList();
var addedFunctions = added
.Where(s => s.Type == SymbolType.Function)
.ToList();
// Use fingerprint matching from PatchDiffEngine
foreach (var oldSym in removedFunctions)
{
foreach (var newSym in addedFunctions)
{
// Size similarity as quick filter
if (oldSym.Size.HasValue && newSym.Size.HasValue)
{
// Cast to double: ulong division would truncate to 0 for unequal sizes.
var sizeRatio = (double)Math.Min(oldSym.Size.Value, newSym.Size.Value) /
                Math.Max(oldSym.Size.Value, newSym.Size.Value);
if (sizeRatio < 0.5) continue;
}
// TODO: Use fingerprint comparison when available
// For now, use name similarity heuristic
var nameSimilarity = ComputeNameSimilarity(oldSym.Name, newSym.Name);
if (nameSimilarity >= threshold)
{
renames.Add(new SymbolRename
{
OldName = oldSym.Name,
NewName = newSym.Name,
Confidence = nameSimilarity,
Reason = "Name similarity match"
});
break;
}
}
}
return renames;
}
private AbiCompatibility AssessAbiCompatibility(
SymbolChangeSummary exports,
SymbolChangeSummary imports,
VersionMapDiff versions)
{
var breakingChanges = new List<AbiBreakingChange>();
// Removed exports are ABI breaking
foreach (var sym in exports.Removed)
{
if (sym.Binding == SymbolBinding.Global)
{
breakingChanges.Add(new AbiBreakingChange
{
Category = "RemovedExport",
Symbol = sym.Name,
Description = $"Global symbol `{sym.Name}` was removed",
Severity = "High"
});
}
}
// Modified exports with type/size changes
foreach (var mod in exports.Modified.Where(m => m.AbiBreaking))
{
breakingChanges.Add(new AbiBreakingChange
{
Category = "ModifiedExport",
Symbol = mod.Name,
Description = $"Symbol `{mod.Name}` has ABI-breaking changes: " +
string.Join(", ", mod.Changes.Select(c => c.Field)),
Severity = "Medium"
});
}
// New required versions are potentially breaking
foreach (var req in versions.RequirementsAdded)
{
breakingChanges.Add(new AbiBreakingChange
{
Category = "NewVersionRequirement",
Symbol = req.Library,
Description = $"New version requirement: {req.Library}@{req.Version}",
Severity = "Low"
});
}
var level = breakingChanges.Count switch
{
0 => AbiCompatibilityLevel.Compatible,
_ when breakingChanges.All(b => b.Severity == "Low") => AbiCompatibilityLevel.MinorChanges,
_ => AbiCompatibilityLevel.Breaking
};
var score = 1.0 - (breakingChanges.Count * 0.1);
score = Math.Max(0.0, Math.Min(1.0, score));
return new AbiCompatibility
{
Level = level,
BreakingChanges = breakingChanges,
Score = Math.Round(score, 4)
};
}
private static string ComputeDiffId(SymbolTable baseTable, SymbolTable targetTable)
{
var input = $"{baseTable.Sha256}:{targetTable.Sha256}";
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
return $"symdiff:sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..32]}";
}
// Helper methods omitted for brevity...
}
```
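`ComputeNameSimilarity` is among the helpers omitted above; until fingerprint comparison lands (see the TODO in `DetectRenames`), one plausible placeholder is a normalized Levenshtein similarity:
```csharp
// Sketch: 1 - (edit distance / longer length), in [0.0, 1.0].
private static double ComputeNameSimilarity(string a, string b)
{
    if (a == b) return 1.0;
    var d = new int[a.Length + 1, b.Length + 1];
    for (var i = 0; i <= a.Length; i++) d[i, 0] = i;
    for (var j = 0; j <= b.Length; j++) d[0, j] = j;
    for (var i = 1; i <= a.Length; i++)
    {
        for (var j = 1; j <= b.Length; j++)
        {
            var cost = a[i - 1] == b[j - 1] ? 0 : 1;
            d[i, j] = Math.Min(
                Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                d[i - 1, j - 1] + cost);
        }
    }
    return 1.0 - (double)d[a.Length, b.Length] / Math.Max(a.Length, b.Length);
}
```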
### Integration with MaterialChange
```csharp
namespace StellaOps.Scanner.SmartDiff;
/// <summary>
/// Extended MaterialChange with symbol-level scope.
/// </summary>
public sealed record MaterialChange
{
// Existing fields...
/// <summary>Scope of the change: file, symbol, or package.</summary>
[JsonPropertyName("scope")]
public MaterialChangeScope Scope { get; init; } = MaterialChangeScope.Package;
/// <summary>Symbol-level details (when scope = Symbol).</summary>
[JsonPropertyName("symbolDetails")]
public SymbolChangeDetails? SymbolDetails { get; init; }
}
public enum MaterialChangeScope
{
Package,
File,
Symbol
}
public sealed record SymbolChangeDetails
{
[JsonPropertyName("symbol_name")]
public required string SymbolName { get; init; }
[JsonPropertyName("demangled")]
public string? Demangled { get; init; }
[JsonPropertyName("change_type")]
public required SymbolMaterialChangeType ChangeType { get; init; }
[JsonPropertyName("abi_impact")]
public required string AbiImpact { get; init; }
[JsonPropertyName("diff_ref")]
public string? DiffRef { get; init; }
}
public enum SymbolMaterialChangeType
{
Added,
Removed,
Modified,
Renamed,
VersionChanged
}
```
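A symbol-scoped change would then serialize along these lines (existing `MaterialChange` fields elided; symbol name and digest are illustrative, string enum serialization assumed):
```json
{
  "scope": "Symbol",
  "symbolDetails": {
    "symbol_name": "_ZN3foo3barEi",
    "demangled": "foo::bar(int)",
    "change_type": "Modified",
    "abi_impact": "Breaking: symbol size change",
    "diff_ref": "symdiff:sha256:..."
  }
}
```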
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | SYM-001 | TODO | - | - | Define `SymbolTableDiff` and related records |
| 2 | SYM-002 | TODO | SYM-001 | - | Define `SymbolChangeSummary` and change records |
| 3 | SYM-003 | TODO | SYM-002 | - | Define `VersionMapDiff` records |
| 4 | SYM-004 | TODO | SYM-003 | - | Define `DynamicLinkingDiff` records (GOT/PLT) |
| 5 | SYM-005 | TODO | SYM-004 | - | Define `AbiCompatibility` assessment model |
| 6 | SYM-006 | TODO | SYM-005 | - | Define `ISymbolTableDiffAnalyzer` interface |
| 7 | SYM-007 | TODO | SYM-006 | - | Implement `ExtractSymbolTableAsync()` for ELF |
| 8 | SYM-008 | TODO | SYM-007 | - | Implement `ExtractSymbolTableAsync()` for PE |
| 9 | SYM-009 | TODO | SYM-008 | - | Implement `ComputeSymbolChanges()` for exports |
| 10 | SYM-010 | TODO | SYM-009 | - | Implement `ComputeSymbolChanges()` for imports |
| 11 | SYM-011 | TODO | SYM-010 | - | Implement `ComputeVersionDiff()` |
| 12 | SYM-012 | TODO | SYM-011 | - | Implement `ComputeDynamicLinkingDiff()` |
| 13 | SYM-013 | TODO | SYM-012 | - | Implement `DetectRenames()` via fingerprint matching |
| 14 | SYM-014 | TODO | SYM-013 | - | Implement `AssessAbiCompatibility()` |
| 15 | SYM-015 | TODO | SYM-014 | - | Implement content-addressed diff ID computation |
| 16 | SYM-016 | TODO | SYM-015 | - | Add C++ name demangling support |
| 17 | SYM-017 | TODO | SYM-016 | - | Add Rust name demangling support |
| 18 | SYM-018 | TODO | SYM-017 | - | Extend `MaterialChange` with symbol scope |
| 19 | SYM-019 | TODO | SYM-018 | - | Add service registration extensions |
| 20 | SYM-020 | TODO | SYM-019 | - | Write unit tests: ELF symbol extraction |
| 21 | SYM-021 | TODO | SYM-020 | - | Write unit tests: PE symbol extraction |
| 22 | SYM-022 | TODO | SYM-021 | - | Write unit tests: symbol change detection |
| 23 | SYM-023 | TODO | SYM-022 | - | Write unit tests: rename detection |
| 24 | SYM-024 | TODO | SYM-023 | - | Write unit tests: ABI compatibility assessment |
| 25 | SYM-025 | TODO | SYM-024 | - | Write golden fixture tests with known binaries |
| 26 | SYM-026 | TODO | SYM-025 | - | Add JSON schema for SymbolTableDiff |
| 27 | SYM-027 | TODO | SYM-026 | - | Document in docs/modules/binary-index/ |
## Acceptance Criteria
1. **Completeness:** Extract exports, imports, versions, GOT/PLT from ELF and PE
2. **Change Detection:** Identify added, removed, modified, renamed symbols
3. **ABI Assessment:** Classify compatibility level with breaking change details
4. **Rename Detection:** Match renames via fingerprint similarity (threshold 0.7)
5. **MaterialChange Integration:** Symbol changes appear as `scope: symbol` in diffs
6. **Test Coverage:** Unit tests for all extractors, golden fixtures for known binaries
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Content-addressed diff IDs | Enables caching and deduplication |
| ABI compatibility scoring | Provides quick triage of binary changes |
| Fingerprint-based rename detection | Handles version-to-version symbol renames |
| Separate ELF/PE extractors | Different binary formats require different parsing |

| Risk | Mitigation |
|------|------------|
| Large symbol tables | Paginate results; index by name |
| False rename detection | Confidence threshold; manual review for low confidence |
| Stripped binaries | Graceful degradation; note limited analysis |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |


---

# Sprint 20260106_001_003_POLICY - Determinization: Policy Engine Integration
## Topic & Scope
Integrate the Determinization subsystem into the Policy Engine. This includes the `DeterminizationGate`, policy rules for allow/quarantine/escalate, `GuardedPass` verdict status extension, and event-driven re-evaluation subscriptions.
- **Working directory:** `src/Policy/StellaOps.Policy.Engine/` and `src/Policy/__Libraries/StellaOps.Policy/`
- **Evidence:** Gate implementation, verdict extension, policy rules, integration tests
## Problem Statement
Current Policy Engine:
- Uses `PolicyVerdictStatus` with Pass, Blocked, Ignored, Warned, Deferred, Escalated, RequiresVex
- No "allow with guardrails" outcome for uncertain observations
- No gate specifically for determinization/uncertainty thresholds
- No automatic re-evaluation when new signals arrive
Advisory requires:
- `GuardedPass` status for allowing uncertain observations with monitoring
- `DeterminizationGate` that checks entropy/score thresholds
- Policy rules: allow (score < 0.5, entropy > 0.4, non-prod), quarantine (EPSS >= 0.4 or reachable), escalate (runtime proof)
- Signal update subscriptions for automatic re-evaluation
## Dependencies & Concurrency
- **Depends on:** SPRINT_20260106_001_001_LB, SPRINT_20260106_001_002_LB (determinization library)
- **Blocks:** SPRINT_20260106_001_004_BE (backend integration)
- **Parallel safe:** Policy module changes; coordinate with existing gate implementations
## Documentation Prerequisites
- docs/modules/policy/determinization-architecture.md
- docs/modules/policy/architecture.md
- src/Policy/AGENTS.md
- Existing: `src/Policy/__Libraries/StellaOps.Policy/PolicyVerdict.cs`
- Existing: `src/Policy/StellaOps.Policy.Engine/Gates/`
## Technical Design
### Directory Structure Changes
```
src/Policy/__Libraries/StellaOps.Policy/
├── PolicyVerdict.cs # MODIFY: Add GuardedPass status
├── PolicyVerdictStatus.cs # MODIFY: Add GuardedPass enum value
└── Determinization/ # NEW: Reference to library
src/Policy/StellaOps.Policy.Engine/
├── Gates/
│ ├── IDeterminizationGate.cs # NEW
│ ├── DeterminizationGate.cs # NEW
│ └── DeterminizationGateOptions.cs # NEW
├── Policies/
│ ├── IDeterminizationPolicy.cs # NEW
│ ├── DeterminizationPolicy.cs # NEW
│ └── DeterminizationRuleSet.cs # NEW
└── Subscriptions/
├── ISignalUpdateSubscription.cs # NEW
├── SignalUpdateHandler.cs # NEW
└── DeterminizationEventTypes.cs # NEW
```
### PolicyVerdictStatus Extension
```csharp
// In src/Policy/__Libraries/StellaOps.Policy/PolicyVerdictStatus.cs
namespace StellaOps.Policy;
/// <summary>
/// Status outcomes for policy verdicts.
/// </summary>
public enum PolicyVerdictStatus
{
/// <summary>Finding meets policy requirements.</summary>
Pass = 0,
/// <summary>
/// NEW: Finding allowed with runtime monitoring enabled.
/// Used for uncertain observations that don't exceed risk thresholds.
/// </summary>
GuardedPass = 1,
/// <summary>Finding fails policy checks; must be remediated.</summary>
Blocked = 2,
/// <summary>Finding deliberately ignored via exception.</summary>
Ignored = 3,
/// <summary>Finding passes but with warnings.</summary>
Warned = 4,
/// <summary>Decision deferred; needs additional evidence.</summary>
Deferred = 5,
/// <summary>Decision escalated for human review.</summary>
Escalated = 6,
/// <summary>VEX statement required to make decision.</summary>
RequiresVex = 7
}
```
### PolicyVerdict Extension
```csharp
// Additions to src/Policy/__Libraries/StellaOps.Policy/PolicyVerdict.cs
namespace StellaOps.Policy;
public sealed record PolicyVerdict
{
// ... existing properties ...
/// <summary>
/// Guardrails applied when Status is GuardedPass.
/// Null for other statuses.
/// </summary>
public GuardRails? GuardRails { get; init; }
/// <summary>
/// Observation state suggested by the verdict.
/// Used for determinization tracking.
/// </summary>
public ObservationState? SuggestedObservationState { get; init; }
/// <summary>
/// Uncertainty score at time of verdict.
/// </summary>
public UncertaintyScore? UncertaintyScore { get; init; }
/// <summary>
/// Whether this verdict allows the finding to proceed (Pass or GuardedPass).
/// </summary>
public bool IsAllowing => Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass;
/// <summary>
/// Whether this verdict requires monitoring (GuardedPass only).
/// </summary>
public bool RequiresMonitoring => Status == PolicyVerdictStatus.GuardedPass;
}
```
### IDeterminizationGate Interface
```csharp
namespace StellaOps.Policy.Engine.Gates;
/// <summary>
/// Gate that evaluates determinization state and uncertainty for findings.
/// </summary>
public interface IDeterminizationGate : IPolicyGate
{
/// <summary>
/// Evaluate a finding against determinization thresholds.
/// </summary>
/// <param name="context">Policy evaluation context.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>Gate evaluation result.</returns>
Task<DeterminizationGateResult> EvaluateDeterminizationAsync(
PolicyEvaluationContext context,
CancellationToken ct = default);
}
/// <summary>
/// Result of determinization gate evaluation.
/// </summary>
public sealed record DeterminizationGateResult
{
/// <summary>Whether the gate passed.</summary>
public required bool Passed { get; init; }
/// <summary>Policy verdict status.</summary>
public required PolicyVerdictStatus Status { get; init; }
/// <summary>Reason for the decision.</summary>
public required string Reason { get; init; }
/// <summary>Guardrails if GuardedPass.</summary>
public GuardRails? GuardRails { get; init; }
/// <summary>Uncertainty score.</summary>
public required UncertaintyScore UncertaintyScore { get; init; }
/// <summary>Decay information.</summary>
public required ObservationDecay Decay { get; init; }
/// <summary>Trust score.</summary>
public required double TrustScore { get; init; }
/// <summary>Rule that matched.</summary>
public string? MatchedRule { get; init; }
/// <summary>Additional metadata for audit.</summary>
public ImmutableDictionary<string, object>? Metadata { get; init; }
}
```
### DeterminizationGate Implementation
```csharp
namespace StellaOps.Policy.Engine.Gates;
/// <summary>
/// Gate that evaluates CVE observations against determinization thresholds.
/// </summary>
public sealed class DeterminizationGate : IDeterminizationGate
{
private readonly IDeterminizationPolicy _policy;
private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
private readonly IDecayedConfidenceCalculator _decayCalculator;
private readonly ITrustScoreAggregator _trustAggregator;
private readonly ISignalSnapshotBuilder _snapshotBuilder;
private readonly ILogger<DeterminizationGate> _logger;
public DeterminizationGate(
IDeterminizationPolicy policy,
IUncertaintyScoreCalculator uncertaintyCalculator,
IDecayedConfidenceCalculator decayCalculator,
ITrustScoreAggregator trustAggregator,
ISignalSnapshotBuilder snapshotBuilder,
ILogger<DeterminizationGate> logger)
{
_policy = policy;
_uncertaintyCalculator = uncertaintyCalculator;
_decayCalculator = decayCalculator;
_trustAggregator = trustAggregator;
_snapshotBuilder = snapshotBuilder;
_logger = logger;
}
public string GateName => "DeterminizationGate";
public int Priority => 50; // After VEX gates, before compliance gates
public async Task<GateResult> EvaluateAsync(
PolicyEvaluationContext context,
CancellationToken ct = default)
{
var result = await EvaluateDeterminizationAsync(context, ct);
return new GateResult
{
GateName = GateName,
Passed = result.Passed,
Status = result.Status,
Reason = result.Reason,
Metadata = BuildMetadata(result)
};
}
public async Task<DeterminizationGateResult> EvaluateDeterminizationAsync(
PolicyEvaluationContext context,
CancellationToken ct = default)
{
// 1. Build signal snapshot for the CVE/component
var snapshot = await _snapshotBuilder.BuildAsync(
context.CveId,
context.ComponentPurl,
ct);
// 2. Calculate uncertainty
var uncertainty = _uncertaintyCalculator.Calculate(snapshot);
// 3. Calculate decay
var lastUpdate = DetermineLastSignalUpdate(snapshot);
var decay = _decayCalculator.Calculate(lastUpdate);
// 4. Calculate trust score
var trustScore = _trustAggregator.Calculate(snapshot);
// 5. Build determinization context
var determCtx = new DeterminizationContext
{
SignalSnapshot = snapshot,
UncertaintyScore = uncertainty,
Decay = decay,
TrustScore = trustScore,
Environment = context.Environment,
AssetCriticality = context.AssetCriticality,
CurrentState = context.CurrentObservationState,
Options = context.DeterminizationOptions
};
// 6. Evaluate policy
var policyResult = _policy.Evaluate(determCtx);
_logger.LogInformation(
"DeterminizationGate evaluated CVE {CveId} on {Purl}: status={Status}, entropy={Entropy:F3}, trust={Trust:F3}, rule={Rule}",
context.CveId,
context.ComponentPurl,
policyResult.Status,
uncertainty.Entropy,
trustScore,
policyResult.MatchedRule);
return new DeterminizationGateResult
{
Passed = policyResult.Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass,
Status = policyResult.Status,
Reason = policyResult.Reason,
GuardRails = policyResult.GuardRails,
UncertaintyScore = uncertainty,
Decay = decay,
TrustScore = trustScore,
MatchedRule = policyResult.MatchedRule,
Metadata = policyResult.Metadata
};
}
private static DateTimeOffset DetermineLastSignalUpdate(SignalSnapshot snapshot)
{
var timestamps = new List<DateTimeOffset?>();
if (snapshot.Epss.QueriedAt.HasValue) timestamps.Add(snapshot.Epss.QueriedAt);
if (snapshot.Vex.QueriedAt.HasValue) timestamps.Add(snapshot.Vex.QueriedAt);
if (snapshot.Reachability.QueriedAt.HasValue) timestamps.Add(snapshot.Reachability.QueriedAt);
if (snapshot.Runtime.QueriedAt.HasValue) timestamps.Add(snapshot.Runtime.QueriedAt);
if (snapshot.Backport.QueriedAt.HasValue) timestamps.Add(snapshot.Backport.QueriedAt);
if (snapshot.SbomLineage.QueriedAt.HasValue) timestamps.Add(snapshot.SbomLineage.QueriedAt);
return timestamps.Where(t => t.HasValue).Max() ?? snapshot.CapturedAt;
}
private static ImmutableDictionary<string, object> BuildMetadata(DeterminizationGateResult result)
{
var builder = ImmutableDictionary.CreateBuilder<string, object>();
builder["uncertainty_entropy"] = result.UncertaintyScore.Entropy;
builder["uncertainty_tier"] = result.UncertaintyScore.Tier.ToString();
builder["uncertainty_completeness"] = result.UncertaintyScore.Completeness;
builder["decay_multiplier"] = result.Decay.DecayedMultiplier;
builder["decay_is_stale"] = result.Decay.IsStale;
builder["decay_age_days"] = result.Decay.AgeDays;
builder["trust_score"] = result.TrustScore;
builder["missing_signals"] = result.UncertaintyScore.MissingSignals.Select(g => g.SignalName).ToArray();
if (result.MatchedRule is not null)
builder["matched_rule"] = result.MatchedRule;
if (result.GuardRails is not null)
{
builder["guardrails_monitoring"] = result.GuardRails.EnableRuntimeMonitoring;
builder["guardrails_review_interval"] = result.GuardRails.ReviewInterval.ToString();
}
return builder.ToImmutable();
}
}
```
### IDeterminizationPolicy Interface
```csharp
namespace StellaOps.Policy.Engine.Policies;
/// <summary>
/// Policy for evaluating determinization decisions (allow/quarantine/escalate).
/// </summary>
public interface IDeterminizationPolicy
{
/// <summary>
/// Evaluate a CVE observation against determinization rules.
/// </summary>
/// <param name="context">Determinization context.</param>
/// <returns>Policy decision result.</returns>
DeterminizationResult Evaluate(DeterminizationContext context);
}
```
### DeterminizationPolicy Implementation
```csharp
namespace StellaOps.Policy.Engine.Policies;
/// <summary>
/// Implements allow/quarantine/escalate logic per advisory specification.
/// </summary>
public sealed class DeterminizationPolicy : IDeterminizationPolicy
{
private readonly DeterminizationOptions _options;
private readonly DeterminizationRuleSet _ruleSet;
private readonly ILogger<DeterminizationPolicy> _logger;
public DeterminizationPolicy(
IOptions<DeterminizationOptions> options,
ILogger<DeterminizationPolicy> logger)
{
_options = options.Value;
_ruleSet = DeterminizationRuleSet.Default(_options);
_logger = logger;
}
public DeterminizationResult Evaluate(DeterminizationContext ctx)
{
ArgumentNullException.ThrowIfNull(ctx);
// Get environment-specific thresholds
var thresholds = GetEnvironmentThresholds(ctx.Environment);
// Evaluate rules in priority order
foreach (var rule in _ruleSet.Rules.OrderBy(r => r.Priority))
{
if (rule.Condition(ctx, thresholds))
{
var result = rule.Action(ctx, thresholds);
result = result with { MatchedRule = rule.Name };
_logger.LogDebug(
"Rule {RuleName} matched for CVE {CveId}: {Status}",
rule.Name,
ctx.SignalSnapshot.CveId,
result.Status);
return result;
}
}
// Default: Deferred (no rule matched, needs more evidence)
return DeterminizationResult.Deferred(
"No determinization rule matched; additional evidence required",
PolicyVerdictStatus.Deferred);
}
private EnvironmentThresholds GetEnvironmentThresholds(DeploymentEnvironment env)
{
var key = env.ToString();
if (_options.EnvironmentThresholds.TryGetValue(key, out var custom))
return custom;
return env switch
{
DeploymentEnvironment.Production => DefaultEnvironmentThresholds.Production,
DeploymentEnvironment.Staging => DefaultEnvironmentThresholds.Staging,
_ => DefaultEnvironmentThresholds.Development
};
}
}
/// <summary>
/// Default environment thresholds per advisory.
/// </summary>
public static class DefaultEnvironmentThresholds
{
public static EnvironmentThresholds Production => new()
{
Environment = DeploymentEnvironment.Production,
MinConfidenceForNotAffected = 0.75,
MaxEntropyForAllow = 0.3,
EpssBlockThreshold = 0.3,
RequireReachabilityForAllow = true
};
public static EnvironmentThresholds Staging => new()
{
Environment = DeploymentEnvironment.Staging,
MinConfidenceForNotAffected = 0.60,
MaxEntropyForAllow = 0.5,
EpssBlockThreshold = 0.4,
RequireReachabilityForAllow = true
};
public static EnvironmentThresholds Development => new()
{
Environment = DeploymentEnvironment.Development,
MinConfidenceForNotAffected = 0.40,
MaxEntropyForAllow = 0.7,
EpssBlockThreshold = 0.6,
RequireReachabilityForAllow = false
};
}
```
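Operators can override these defaults per environment. A minimal wiring sketch, assuming `DeterminizationOptions.EnvironmentThresholds` is a mutable dictionary keyed by environment name, as `GetEnvironmentThresholds` above implies; the option shape is not a finalized contract:
```csharp
// Hypothetical override: tighten staging thresholds beyond the defaults.
services.Configure<DeterminizationOptions>(opts =>
{
    opts.EnvironmentThresholds["Staging"] = new EnvironmentThresholds
    {
        Environment = DeploymentEnvironment.Staging,
        MinConfidenceForNotAffected = 0.65, // stricter than the 0.60 default
        MaxEntropyForAllow = 0.45,
        EpssBlockThreshold = 0.35,
        RequireReachabilityForAllow = true
    };
});
```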
### DeterminizationRuleSet
```csharp
namespace StellaOps.Policy.Engine.Policies;
/// <summary>
/// Rule set for determinization policy evaluation.
/// Rules are evaluated in priority order (lower = higher priority).
/// </summary>
public sealed class DeterminizationRuleSet
{
public IReadOnlyList<DeterminizationRule> Rules { get; }
private DeterminizationRuleSet(IReadOnlyList<DeterminizationRule> rules)
{
Rules = rules;
}
/// <summary>
/// Creates the default rule set per advisory specification.
/// </summary>
public static DeterminizationRuleSet Default(DeterminizationOptions options) =>
new(new List<DeterminizationRule>
{
// Rule 1: Escalate if runtime evidence shows vulnerable code loaded
new DeterminizationRule
{
Name = "RuntimeEscalation",
Priority = 10,
Condition = (ctx, _) =>
ctx.SignalSnapshot.Runtime.HasValue &&
ctx.SignalSnapshot.Runtime.Value!.ObservedLoaded,
Action = (ctx, _) =>
DeterminizationResult.Escalated(
"Runtime evidence shows vulnerable code loaded in memory",
PolicyVerdictStatus.Escalated)
},
// Rule 2: Quarantine if EPSS exceeds threshold
new DeterminizationRule
{
Name = "EpssQuarantine",
Priority = 20,
Condition = (ctx, thresholds) =>
ctx.SignalSnapshot.Epss.HasValue &&
ctx.SignalSnapshot.Epss.Value!.Score >= thresholds.EpssBlockThreshold,
Action = (ctx, thresholds) =>
DeterminizationResult.Quarantined(
$"EPSS score {ctx.SignalSnapshot.Epss.Value!.Score:P1} exceeds threshold {thresholds.EpssBlockThreshold:P1}",
PolicyVerdictStatus.Blocked)
},
// Rule 3: Quarantine if proven reachable
new DeterminizationRule
{
Name = "ReachabilityQuarantine",
Priority = 25,
Condition = (ctx, _) =>
ctx.SignalSnapshot.Reachability.HasValue &&
ctx.SignalSnapshot.Reachability.Value!.Status is
ReachabilityStatus.Reachable or
ReachabilityStatus.ObservedReachable,
Action = (ctx, _) =>
DeterminizationResult.Quarantined(
$"Vulnerable code is {ctx.SignalSnapshot.Reachability.Value!.Status} via call graph analysis",
PolicyVerdictStatus.Blocked)
},
// Rule 4: Block high entropy in production
new DeterminizationRule
{
Name = "ProductionEntropyBlock",
Priority = 30,
Condition = (ctx, thresholds) =>
ctx.Environment == DeploymentEnvironment.Production &&
ctx.UncertaintyScore.Entropy > thresholds.MaxEntropyForAllow,
Action = (ctx, thresholds) =>
DeterminizationResult.Quarantined(
$"High uncertainty (entropy={ctx.UncertaintyScore.Entropy:F2}) exceeds production threshold ({thresholds.MaxEntropyForAllow:F2})",
PolicyVerdictStatus.Blocked)
},
// Rule 5: Defer if evidence is stale
new DeterminizationRule
{
Name = "StaleEvidenceDefer",
Priority = 40,
Condition = (ctx, _) => ctx.Decay.IsStale,
Action = (ctx, _) =>
DeterminizationResult.Deferred(
$"Evidence is stale (last update: {ctx.Decay.LastSignalUpdate:u}, age: {ctx.Decay.AgeDays:F1} days)",
PolicyVerdictStatus.Deferred)
},
// Rule 6: Guarded allow for uncertain observations in non-prod
new DeterminizationRule
{
Name = "GuardedAllowNonProd",
Priority = 50,
Condition = (ctx, _) =>
ctx.TrustScore < options.GuardedAllowScoreThreshold &&
ctx.UncertaintyScore.Entropy > options.GuardedAllowEntropyThreshold &&
ctx.Environment != DeploymentEnvironment.Production,
Action = (ctx, _) =>
DeterminizationResult.GuardedAllow(
$"Uncertain observation (entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}) allowed with guardrails in {ctx.Environment}",
PolicyVerdictStatus.GuardedPass,
BuildGuardrails(ctx, options))
},
// Rule 7: Allow if unreachable with high confidence
new DeterminizationRule
{
Name = "UnreachableAllow",
Priority = 60,
Condition = (ctx, thresholds) =>
ctx.SignalSnapshot.Reachability.HasValue &&
ctx.SignalSnapshot.Reachability.Value!.Status == ReachabilityStatus.Unreachable &&
ctx.SignalSnapshot.Reachability.Value.Confidence >= thresholds.MinConfidenceForNotAffected,
Action = (ctx, _) =>
DeterminizationResult.Allowed(
$"Vulnerable code is unreachable (confidence={ctx.SignalSnapshot.Reachability.Value!.Confidence:P0})",
PolicyVerdictStatus.Pass)
},
// Rule 8: Allow if VEX not_affected with trusted issuer
new DeterminizationRule
{
Name = "VexNotAffectedAllow",
Priority = 65,
Condition = (ctx, thresholds) =>
ctx.SignalSnapshot.Vex.HasValue &&
ctx.SignalSnapshot.Vex.Value!.Status == "not_affected" &&
ctx.SignalSnapshot.Vex.Value.IssuerTrust >= thresholds.MinConfidenceForNotAffected,
Action = (ctx, _) =>
DeterminizationResult.Allowed(
$"VEX statement from {ctx.SignalSnapshot.Vex.Value!.Issuer} indicates not_affected (trust={ctx.SignalSnapshot.Vex.Value.IssuerTrust:P0})",
PolicyVerdictStatus.Pass)
},
// Rule 9: Allow if sufficient evidence and low entropy
new DeterminizationRule
{
Name = "SufficientEvidenceAllow",
Priority = 70,
Condition = (ctx, thresholds) =>
ctx.UncertaintyScore.Entropy <= thresholds.MaxEntropyForAllow &&
ctx.TrustScore >= thresholds.MinConfidenceForNotAffected,
Action = (ctx, _) =>
DeterminizationResult.Allowed(
$"Sufficient evidence (entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}) for confident determination",
PolicyVerdictStatus.Pass)
},
// Rule 10: Guarded allow for moderate uncertainty
new DeterminizationRule
{
Name = "GuardedAllowModerateUncertainty",
Priority = 80,
Condition = (ctx, _) =>
ctx.UncertaintyScore.Tier <= UncertaintyTier.Medium &&
ctx.TrustScore >= 0.4,
Action = (ctx, _) =>
DeterminizationResult.GuardedAllow(
$"Moderate uncertainty (tier={ctx.UncertaintyScore.Tier}, trust={ctx.TrustScore:F2}) allowed with monitoring",
PolicyVerdictStatus.GuardedPass,
BuildGuardrails(ctx, options))
},
// Rule 11: Default - require more evidence
new DeterminizationRule
{
Name = "DefaultDefer",
Priority = 100,
Condition = (_, _) => true,
Action = (ctx, _) =>
DeterminizationResult.Deferred(
$"Insufficient evidence for determination (entropy={ctx.UncertaintyScore.Entropy:F2}, tier={ctx.UncertaintyScore.Tier})",
PolicyVerdictStatus.Deferred)
}
});
private static GuardRails BuildGuardrails(DeterminizationContext ctx, DeterminizationOptions options) =>
new GuardRails
{
EnableRuntimeMonitoring = true,
ReviewInterval = TimeSpan.FromDays(options.GuardedReviewIntervalDays),
EpssEscalationThreshold = options.EpssQuarantineThreshold,
EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
MaxGuardedDuration = TimeSpan.FromDays(options.MaxGuardedDurationDays),
PolicyRationale = $"Auto-allowed: entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}, env={ctx.Environment}"
};
}
/// <summary>
/// A single determinization rule.
/// </summary>
public sealed record DeterminizationRule
{
/// <summary>Rule name for audit/logging.</summary>
public required string Name { get; init; }
/// <summary>Priority (lower = evaluated first).</summary>
public required int Priority { get; init; }
/// <summary>Condition function.</summary>
public required Func<DeterminizationContext, EnvironmentThresholds, bool> Condition { get; init; }
/// <summary>Action function.</summary>
public required Func<DeterminizationContext, EnvironmentThresholds, DeterminizationResult> Action { get; init; }
}
```
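Because rules short-circuit on the first match, priority ordering is behavior, not cosmetics. A minimal xUnit-style sketch for DPE-022, using only names defined above and assuming `DeterminizationOptions` has usable defaults:
```csharp
// Verifies that escalation outranks quarantine and guarded-allow
// in the default rule set.
[Fact]
public void DefaultRuleSet_EvaluatesRuntimeEscalationFirst()
{
    var ruleSet = DeterminizationRuleSet.Default(new DeterminizationOptions());
    var ordered = ruleSet.Rules.OrderBy(r => r.Priority).Select(r => r.Name).ToArray();

    Assert.Equal("RuntimeEscalation", ordered[0]);
    Assert.True(
        Array.IndexOf(ordered, "EpssQuarantine") < Array.IndexOf(ordered, "GuardedAllowNonProd"),
        "quarantine rules must be evaluated before guarded-allow rules");
}
```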
### Signal Update Subscription
```csharp
namespace StellaOps.Policy.Engine.Subscriptions;
/// <summary>
/// Events for signal updates that trigger re-evaluation.
/// </summary>
public static class DeterminizationEventTypes
{
public const string EpssUpdated = "epss.updated";
public const string VexUpdated = "vex.updated";
public const string ReachabilityUpdated = "reachability.updated";
public const string RuntimeUpdated = "runtime.updated";
public const string BackportUpdated = "backport.updated";
public const string ObservationStateChanged = "observation.state_changed";
}
/// <summary>
/// Event published when a signal is updated.
/// </summary>
public sealed record SignalUpdatedEvent
{
public required string EventType { get; init; }
public required string CveId { get; init; }
public required string Purl { get; init; }
public required DateTimeOffset UpdatedAt { get; init; }
public required string Source { get; init; }
public object? NewValue { get; init; }
public object? PreviousValue { get; init; }
}
/// <summary>
/// Event published when observation state changes.
/// </summary>
public sealed record ObservationStateChangedEvent
{
public required Guid ObservationId { get; init; }
public required string CveId { get; init; }
public required string Purl { get; init; }
public required ObservationState PreviousState { get; init; }
public required ObservationState NewState { get; init; }
public required string Reason { get; init; }
public required DateTimeOffset ChangedAt { get; init; }
}
/// <summary>
/// Handler for signal update events.
/// </summary>
public interface ISignalUpdateSubscription
{
/// <summary>
/// Handle a signal update and re-evaluate affected observations.
/// </summary>
Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct = default);
}
/// <summary>
/// Implementation of signal update handling.
/// </summary>
public sealed class SignalUpdateHandler : ISignalUpdateSubscription
{
private readonly IObservationRepository _observations;
private readonly IDeterminizationGate _gate;
private readonly IEventPublisher _eventPublisher;
private readonly ILogger<SignalUpdateHandler> _logger;
public SignalUpdateHandler(
IObservationRepository observations,
IDeterminizationGate gate,
IEventPublisher eventPublisher,
ILogger<SignalUpdateHandler> logger)
{
_observations = observations;
_gate = gate;
_eventPublisher = eventPublisher;
_logger = logger;
}
public async Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct = default)
{
_logger.LogInformation(
"Processing signal update: {EventType} for CVE {CveId} on {Purl}",
evt.EventType,
evt.CveId,
evt.Purl);
// Find observations affected by this signal
var affected = await _observations.FindByCveAndPurlAsync(evt.CveId, evt.Purl, ct);
foreach (var obs in affected)
{
try
{
await ReEvaluateObservationAsync(obs, evt, ct);
}
catch (Exception ex)
{
_logger.LogError(ex,
"Failed to re-evaluate observation {ObservationId} after signal update",
obs.Id);
}
}
}
private async Task ReEvaluateObservationAsync(
CveObservation obs,
SignalUpdatedEvent trigger,
CancellationToken ct)
{
var context = new PolicyEvaluationContext
{
CveId = obs.CveId,
ComponentPurl = obs.SubjectPurl,
Environment = obs.Environment,
CurrentObservationState = obs.ObservationState
};
var result = await _gate.EvaluateDeterminizationAsync(context, ct);
// Determine if state should change
var newState = DetermineNewState(obs.ObservationState, result);
if (newState != obs.ObservationState)
{
_logger.LogInformation(
"Observation {ObservationId} state transition: {OldState} -> {NewState} (trigger: {Trigger})",
obs.Id,
obs.ObservationState,
newState,
trigger.EventType);
await _observations.UpdateStateAsync(obs.Id, newState, result, ct);
await _eventPublisher.PublishAsync(new ObservationStateChangedEvent
{
ObservationId = obs.Id,
CveId = obs.CveId,
Purl = obs.SubjectPurl,
PreviousState = obs.ObservationState,
NewState = newState,
Reason = result.Reason,
ChangedAt = DateTimeOffset.UtcNow
}, ct);
}
}
private static ObservationState DetermineNewState(
ObservationState current,
DeterminizationGateResult result)
{
// Escalation always triggers ManualReviewRequired
if (result.Status == PolicyVerdictStatus.Escalated)
return ObservationState.ManualReviewRequired;
// Very low uncertainty means we have enough evidence
if (result.UncertaintyScore.Tier == UncertaintyTier.VeryLow)
return ObservationState.Determined;
// Transition from Pending to Determined when evidence sufficient
if (current == ObservationState.PendingDeterminization &&
result.UncertaintyScore.Tier <= UncertaintyTier.Low &&
result.Status == PolicyVerdictStatus.Pass)
return ObservationState.Determined;
// Stale evidence
if (result.Decay.IsStale && current != ObservationState.StaleRequiresRefresh)
return ObservationState.StaleRequiresRefresh;
// Otherwise maintain current state
return current;
}
}
```
### DI Registration Updates
```csharp
// Additions to Policy.Engine DI registration
public static class DeterminizationEngineExtensions
{
public static IServiceCollection AddDeterminizationEngine(
this IServiceCollection services,
IConfiguration configuration)
{
// Register determinization library services
services.AddDeterminization(configuration);
// Register policy engine services
services.AddScoped<IDeterminizationPolicy, DeterminizationPolicy>();
services.AddScoped<IDeterminizationGate, DeterminizationGate>();
services.AddScoped<ISignalSnapshotBuilder, SignalSnapshotBuilder>();
services.AddScoped<ISignalUpdateSubscription, SignalUpdateHandler>();
return services;
}
}
```
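Host wiring then reduces to one call. A sketch assuming a standard minimal-host `Program.cs`:
```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDeterminizationEngine(builder.Configuration);
```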
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DPE-001 | TODO | DCS-028 | Guild | Add `GuardedPass` to `PolicyVerdictStatus` enum |
| 2 | DPE-002 | TODO | DPE-001 | Guild | Extend `PolicyVerdict` with GuardRails and UncertaintyScore |
| 3 | DPE-003 | TODO | DPE-002 | Guild | Create `IDeterminizationGate` interface |
| 4 | DPE-004 | TODO | DPE-003 | Guild | Implement `DeterminizationGate` with priority 50 |
| 5 | DPE-005 | TODO | DPE-004 | Guild | Create `DeterminizationGateResult` record |
| 6 | DPE-006 | TODO | DPE-005 | Guild | Create `ISignalSnapshotBuilder` interface |
| 7 | DPE-007 | TODO | DPE-006 | Guild | Implement `SignalSnapshotBuilder` |
| 8 | DPE-008 | TODO | DPE-007 | Guild | Create `IDeterminizationPolicy` interface |
| 9 | DPE-009 | TODO | DPE-008 | Guild | Implement `DeterminizationPolicy` |
| 10 | DPE-010 | TODO | DPE-009 | Guild | Implement `DeterminizationRuleSet` with 11 rules |
| 11 | DPE-011 | TODO | DPE-010 | Guild | Implement `DefaultEnvironmentThresholds` |
| 12 | DPE-012 | TODO | DPE-011 | Guild | Create `DeterminizationEventTypes` constants |
| 13 | DPE-013 | TODO | DPE-012 | Guild | Create `SignalUpdatedEvent` record |
| 14 | DPE-014 | TODO | DPE-013 | Guild | Create `ObservationStateChangedEvent` record |
| 15 | DPE-015 | TODO | DPE-014 | Guild | Create `ISignalUpdateSubscription` interface |
| 16 | DPE-016 | TODO | DPE-015 | Guild | Implement `SignalUpdateHandler` |
| 17 | DPE-017 | TODO | DPE-016 | Guild | Create `IObservationRepository` interface |
| 18 | DPE-018 | TODO | DPE-017 | Guild | Implement `DeterminizationEngineExtensions` for DI |
| 19 | DPE-019 | TODO | DPE-018 | Guild | Write unit tests: `DeterminizationPolicy` rule evaluation |
| 20 | DPE-020 | TODO | DPE-019 | Guild | Write unit tests: `DeterminizationGate` metadata building |
| 21 | DPE-021 | TODO | DPE-020 | Guild | Write unit tests: `SignalUpdateHandler` state transitions |
| 22 | DPE-022 | TODO | DPE-021 | Guild | Write unit tests: Rule priority ordering |
| 23 | DPE-023 | TODO | DPE-022 | Guild | Write integration tests: Gate in policy pipeline |
| 24 | DPE-024 | TODO | DPE-023 | Guild | Write integration tests: Signal update re-evaluation |
| 25 | DPE-025 | TODO | DPE-024 | Guild | Add metrics: `stellaops_policy_determinization_evaluations_total` |
| 26 | DPE-026 | TODO | DPE-025 | Guild | Add metrics: `stellaops_policy_determinization_rule_matches_total` |
| 27 | DPE-027 | TODO | DPE-026 | Guild | Add metrics: `stellaops_policy_observation_state_transitions_total` |
| 28 | DPE-028 | TODO | DPE-027 | Guild | Update existing PolicyEngine to register DeterminizationGate |
| 29 | DPE-029 | TODO | DPE-028 | Guild | Document new PolicyVerdictStatus.GuardedPass in API docs |
| 30 | DPE-030 | TODO | DPE-029 | Guild | Verify build with `dotnet build` |
## Acceptance Criteria
1. `PolicyVerdictStatus.GuardedPass` compiles and serializes correctly
2. `DeterminizationGate` integrates with existing gate pipeline
3. All 11 rules evaluate in correct priority order
4. `SignalUpdateHandler` correctly triggers re-evaluation
5. State transitions follow expected logic
6. Metrics emitted for all evaluations and transitions
7. Integration tests pass with mock signal sources
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Gate priority 50 | After VEX gates (30-40), before compliance gates (60+) |
| 11 rules in default set | Covers all advisory scenarios; extensible |
| Event-driven re-evaluation | Reactive system; no polling required |
| Separate IObservationRepository | Decouples from specific persistence; testable |

| Risk | Mitigation |
|------|------------|
| Rule evaluation performance | Rules short-circuit on first match; cached signal snapshots |
| Event storm on bulk updates | Batch processing; debounce repeated events (see sketch below) |
| Breaking existing PolicyVerdictStatus consumers | GuardedPass=1 shifts existing values; requires migration |
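The event-storm risk can be mitigated with a time-window debounce in front of `SignalUpdateHandler`. A minimal sketch; the decorator name and 30-second window are illustrative, not part of the sprint contract:
```csharp
// Hypothetical debounce decorator: suppresses repeated events for the
// same (event type, CVE, PURL) key within the window.
public sealed class DebouncingSignalUpdateHandler : ISignalUpdateSubscription
{
    private static readonly TimeSpan Window = TimeSpan.FromSeconds(30);
    private readonly ConcurrentDictionary<string, DateTimeOffset> _lastHandled = new();
    private readonly ISignalUpdateSubscription _inner;
    private readonly TimeProvider _timeProvider;

    public DebouncingSignalUpdateHandler(ISignalUpdateSubscription inner, TimeProvider timeProvider)
    {
        _inner = inner;
        _timeProvider = timeProvider;
    }

    public Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct = default)
    {
        var key = $"{evt.EventType}|{evt.CveId}|{evt.Purl}";
        var now = _timeProvider.GetUtcNow();
        // AddOrUpdate keeps the previous timestamp when the event falls
        // inside the window, signalling the caller to skip it.
        var stamped = _lastHandled.AddOrUpdate(
            key,
            now,
            (_, prev) => now - prev < Window ? prev : now);
        return stamped == now ? _inner.HandleAsync(evt, ct) : Task.CompletedTask;
    }
}
```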
## Migration Notes
### PolicyVerdictStatus Value Change
Adding `GuardedPass = 1` shifts existing enum values:
- `Blocked` was 1, now 2
- `Ignored` was 2, now 3
- etc.
**Migration strategy:**
1. Add `GuardedPass` at the end first (`= 7`, matching the enum below) for backward compatibility
2. Update all consumers
3. Reorder enum values in next major version
Alternatively, keep the existing values permanently and append `GuardedPass` with an explicit value assignment, avoiding any reordering:
```csharp
public enum PolicyVerdictStatus
{
Pass = 0,
Blocked = 1, // Keep existing
Ignored = 2, // Keep existing
Warned = 3, // Keep existing
Deferred = 4, // Keep existing
Escalated = 5, // Keep existing
RequiresVex = 6, // Keep existing
GuardedPass = 7 // NEW - at end
}
```
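Whichever value assignment lands, serializing verdict statuses by name removes the numeric coupling entirely. A minimal sketch, assuming verdicts are serialized with System.Text.Json:
```csharp
// Name-based serialization: "GuardedPass" round-trips regardless of
// its numeric value, so a later enum reordering cannot corrupt
// persisted or transmitted verdicts.
var jsonOptions = new JsonSerializerOptions
{
    Converters = { new JsonStringEnumConverter() }
};

var json = JsonSerializer.Serialize(PolicyVerdictStatus.GuardedPass, jsonOptions);
// json == "\"GuardedPass\""
var roundTripped = JsonSerializer.Deserialize<PolicyVerdictStatus>(json, jsonOptions);
```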
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |
## Next Checkpoints
- 2026-01-10: DPE-001 to DPE-011 complete (core implementation)
- 2026-01-11: DPE-012 to DPE-018 complete (events, subscriptions)
- 2026-01-12: DPE-019 to DPE-030 complete (tests, metrics, docs)

View File

@@ -0,0 +1,906 @@
# Sprint 20260106_001_004_BE - Determinization: Backend Integration
## Topic & Scope
Integrate the Determinization subsystem with backend modules: Feedser (signal attachment), VexLens (VEX signal emission), Graph (CVE node enhancement), and Findings (observation persistence). This connects the policy infrastructure to data sources.
- **Working directories:**
- `src/Feedser/`
- `src/VexLens/`
- `src/Graph/`
- `src/Findings/`
- **Evidence:** Signal attachers, repository implementations, graph node enhancements, integration tests
## Problem Statement
Current backend state:
- Feedser collects EPSS/VEX/advisories but doesn't emit `SignalState<T>`
- VexLens normalizes VEX but doesn't notify on updates
- Graph has CVE nodes but no `ObservationState` or `UncertaintyScore`
- Findings tracks verdicts but not determinization state
Advisory requires:
- Feedser attaches `SignalState<EpssEvidence>` with query status
- VexLens emits `SignalUpdatedEvent` on VEX changes
- Graph nodes carry `ObservationState`, `UncertaintyScore`, `GuardRails`
- Findings persists observation lifecycle with state transitions
## Dependencies & Concurrency
- **Depends on:** SPRINT_20260106_001_003_POLICY (gates and policies)
- **Blocks:** SPRINT_20260106_001_005_FE (frontend)
- **Parallel safe with:** Graph module internal changes; coordinate with Feedser/VexLens teams
## Documentation Prerequisites
- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_003_POLICY (events and subscriptions)
- src/Feedser/AGENTS.md
- src/VexLens/AGENTS.md (if exists)
- src/Graph/AGENTS.md
- src/Findings/AGENTS.md
## Technical Design
### Feedser: Signal Attachment
#### Directory Structure Changes
```
src/Feedser/StellaOps.Feedser/
├── Signals/
│ ├── ISignalAttacher.cs # NEW
│ ├── EpssSignalAttacher.cs # NEW
│ ├── KevSignalAttacher.cs # NEW
│ └── SignalAttachmentResult.cs # NEW
├── Events/
│ └── SignalAttachmentEventEmitter.cs # NEW
└── Extensions/
└── SignalAttacherServiceExtensions.cs # NEW
```
#### ISignalAttacher Interface
```csharp
namespace StellaOps.Feedser.Signals;
/// <summary>
/// Attaches signal evidence to CVE observations.
/// </summary>
/// <typeparam name="T">The evidence type.</typeparam>
public interface ISignalAttacher<T>
{
/// <summary>
/// Attach signal evidence for a CVE.
/// </summary>
/// <param name="cveId">CVE identifier.</param>
/// <param name="purl">Component PURL.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>Signal state with query status.</returns>
Task<SignalState<T>> AttachAsync(string cveId, string purl, CancellationToken ct = default);
/// <summary>
/// Batch attach signal evidence for multiple CVEs.
/// </summary>
/// <param name="requests">CVE/PURL pairs.</param>
/// <param name="ct">Cancellation token.</param>
/// <returns>Signal states keyed by CVE ID.</returns>
Task<IReadOnlyDictionary<string, SignalState<T>>> AttachBatchAsync(
IEnumerable<(string CveId, string Purl)> requests,
CancellationToken ct = default);
}
```
#### EpssSignalAttacher Implementation
```csharp
namespace StellaOps.Feedser.Signals;
/// <summary>
/// Attaches EPSS evidence to CVE observations.
/// </summary>
public sealed class EpssSignalAttacher : ISignalAttacher<EpssEvidence>
{
private readonly IEpssClient _epssClient;
private readonly IEventPublisher _eventPublisher;
private readonly TimeProvider _timeProvider;
private readonly ILogger<EpssSignalAttacher> _logger;
public EpssSignalAttacher(
IEpssClient epssClient,
IEventPublisher eventPublisher,
TimeProvider timeProvider,
ILogger<EpssSignalAttacher> logger)
{
_epssClient = epssClient;
_eventPublisher = eventPublisher;
_timeProvider = timeProvider;
_logger = logger;
}
public async Task<SignalState<EpssEvidence>> AttachAsync(
string cveId,
string purl,
CancellationToken ct = default)
{
var now = _timeProvider.GetUtcNow();
try
{
var epssData = await _epssClient.GetScoreAsync(cveId, ct);
if (epssData is null)
{
_logger.LogDebug("EPSS data not found for CVE {CveId}", cveId);
return SignalState<EpssEvidence>.Absent(now, "first.org");
}
var evidence = new EpssEvidence
{
Score = epssData.Score,
Percentile = epssData.Percentile,
ModelDate = epssData.ModelDate
};
// Emit event for signal update
await _eventPublisher.PublishAsync(new SignalUpdatedEvent
{
EventType = DeterminizationEventTypes.EpssUpdated,
CveId = cveId,
Purl = purl,
UpdatedAt = now,
Source = "first.org",
NewValue = evidence
}, ct);
_logger.LogDebug(
"Attached EPSS for CVE {CveId}: score={Score:P1}, percentile={Percentile:P1}",
cveId,
evidence.Score,
evidence.Percentile);
return SignalState<EpssEvidence>.WithValue(evidence, now, "first.org");
}
catch (EpssNotFoundException)
{
return SignalState<EpssEvidence>.Absent(now, "first.org");
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to fetch EPSS for CVE {CveId}", cveId);
return SignalState<EpssEvidence>.Failed(ex.Message);
}
}
public async Task<IReadOnlyDictionary<string, SignalState<EpssEvidence>>> AttachBatchAsync(
IEnumerable<(string CveId, string Purl)> requests,
CancellationToken ct = default)
{
var results = new Dictionary<string, SignalState<EpssEvidence>>();
var requestList = requests.ToList();
// Batch query EPSS
var cveIds = requestList.Select(r => r.CveId).Distinct().ToList();
var batchResult = await _epssClient.GetScoresBatchAsync(cveIds, ct);
var now = _timeProvider.GetUtcNow();
foreach (var (cveId, purl) in requestList)
{
if (batchResult.Found.TryGetValue(cveId, out var epssData))
{
var evidence = new EpssEvidence
{
Score = epssData.Score,
Percentile = epssData.Percentile,
ModelDate = epssData.ModelDate
};
results[cveId] = SignalState<EpssEvidence>.WithValue(evidence, now, "first.org");
await _eventPublisher.PublishAsync(new SignalUpdatedEvent
{
EventType = DeterminizationEventTypes.EpssUpdated,
CveId = cveId,
Purl = purl,
UpdatedAt = now,
Source = "first.org",
NewValue = evidence
}, ct);
}
else if (batchResult.NotFound.Contains(cveId))
{
results[cveId] = SignalState<EpssEvidence>.Absent(now, "first.org");
}
else
{
results[cveId] = SignalState<EpssEvidence>.Failed("Batch query did not return result");
}
}
return results;
}
}
```
#### KevSignalAttacher Implementation
```csharp
namespace StellaOps.Feedser.Signals;
/// <summary>
/// Attaches KEV (Known Exploited Vulnerabilities) flag to CVE observations.
/// </summary>
public sealed class KevSignalAttacher : ISignalAttacher<bool>
{
    private readonly IKevCatalog _kevCatalog;
    private readonly IEventPublisher _eventPublisher;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<KevSignalAttacher> _logger;
    public KevSignalAttacher(
        IKevCatalog kevCatalog,
        IEventPublisher eventPublisher,
        TimeProvider timeProvider,
        ILogger<KevSignalAttacher> logger)
    {
        _kevCatalog = kevCatalog;
        _eventPublisher = eventPublisher;
        _timeProvider = timeProvider;
        _logger = logger;
    }
public async Task<SignalState<bool>> AttachAsync(
string cveId,
string purl,
CancellationToken ct = default)
{
var now = _timeProvider.GetUtcNow();
try
{
var isInKev = await _kevCatalog.ContainsAsync(cveId, ct);
await _eventPublisher.PublishAsync(new SignalUpdatedEvent
{
EventType = "kev.updated",
CveId = cveId,
Purl = purl,
UpdatedAt = now,
Source = "cisa-kev",
NewValue = isInKev
}, ct);
return SignalState<bool>.WithValue(isInKev, now, "cisa-kev");
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Failed to check KEV for CVE {CveId}", cveId);
return SignalState<bool>.Failed(ex.Message);
}
}
public async Task<IReadOnlyDictionary<string, SignalState<bool>>> AttachBatchAsync(
IEnumerable<(string CveId, string Purl)> requests,
CancellationToken ct = default)
{
var results = new Dictionary<string, SignalState<bool>>();
var now = _timeProvider.GetUtcNow();
foreach (var (cveId, purl) in requests)
{
results[cveId] = await AttachAsync(cveId, purl, ct);
}
return results;
}
}
```
### VexLens: Signal Emission
#### VexSignalEmitter
```csharp
namespace StellaOps.VexLens.Signals;
/// <summary>
/// Emits VEX signal updates when VEX documents are processed.
/// </summary>
public sealed class VexSignalEmitter
{
    private readonly IEventPublisher _eventPublisher;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<VexSignalEmitter> _logger;
    public VexSignalEmitter(
        IEventPublisher eventPublisher,
        TimeProvider timeProvider,
        ILogger<VexSignalEmitter> logger)
    {
        _eventPublisher = eventPublisher;
        _timeProvider = timeProvider;
        _logger = logger;
    }
public async Task EmitVexUpdateAsync(
string cveId,
string purl,
VexClaimSummary newClaim,
VexClaimSummary? previousClaim,
CancellationToken ct = default)
{
var now = _timeProvider.GetUtcNow();
await _eventPublisher.PublishAsync(new SignalUpdatedEvent
{
EventType = DeterminizationEventTypes.VexUpdated,
CveId = cveId,
Purl = purl,
UpdatedAt = now,
Source = newClaim.Issuer,
NewValue = newClaim,
PreviousValue = previousClaim
}, ct);
_logger.LogInformation(
"Emitted VEX update for CVE {CveId}: {Status} from {Issuer} (previous: {PreviousStatus})",
cveId,
newClaim.Status,
newClaim.Issuer,
previousClaim?.Status ?? "none");
}
}
/// <summary>
/// Converts normalized VEX documents to signal-compatible summaries.
/// </summary>
public sealed class VexClaimSummaryMapper
{
public VexClaimSummary Map(NormalizedVexStatement statement, double issuerTrust)
{
return new VexClaimSummary
{
            // NOTE: the status literal must match the snake_case strings the
            // policy rules compare against (e.g. "not_affected"); plain
            // ToLowerInvariant() is only safe if the enum names already
            // contain underscores.
            Status = statement.Status.ToString().ToLowerInvariant(),
Justification = statement.Justification?.ToString(),
Issuer = statement.IssuerId,
IssuerTrust = issuerTrust
};
}
}
```
### Graph: CVE Node Enhancement
#### Enhanced CveObservationNode
```csharp
namespace StellaOps.Graph.Indexer.Nodes;
/// <summary>
/// Enhanced CVE observation node with determinization state.
/// </summary>
public sealed record CveObservationNode
{
/// <summary>Node identifier (CVE ID + PURL hash).</summary>
public required string NodeId { get; init; }
/// <summary>CVE identifier.</summary>
public required string CveId { get; init; }
/// <summary>Subject component PURL.</summary>
public required string SubjectPurl { get; init; }
/// <summary>VEX status (orthogonal to observation state).</summary>
public VexClaimStatus? VexStatus { get; init; }
/// <summary>Observation lifecycle state.</summary>
public required ObservationState ObservationState { get; init; }
/// <summary>Knowledge completeness score.</summary>
public required UncertaintyScore Uncertainty { get; init; }
/// <summary>Evidence freshness decay.</summary>
public required ObservationDecay Decay { get; init; }
/// <summary>Aggregated trust score [0.0-1.0].</summary>
public required double TrustScore { get; init; }
/// <summary>Policy verdict status.</summary>
public required PolicyVerdictStatus PolicyHint { get; init; }
/// <summary>Guardrails if PolicyHint is GuardedPass.</summary>
public GuardRails? GuardRails { get; init; }
/// <summary>Signal snapshot timestamp.</summary>
public required DateTimeOffset LastEvaluatedAt { get; init; }
/// <summary>Next scheduled review (if guarded or stale).</summary>
public DateTimeOffset? NextReviewAt { get; init; }
/// <summary>Environment where observation applies.</summary>
public DeploymentEnvironment? Environment { get; init; }
/// <summary>Generates node ID from CVE and PURL.</summary>
public static string GenerateNodeId(string cveId, string purl)
{
using var sha = SHA256.Create();
var input = $"{cveId}|{purl}";
var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(input));
return $"obs:{Convert.ToHexString(hash)[..16].ToLowerInvariant()}";
}
}
```
#### CveObservationNodeRepository
```csharp
namespace StellaOps.Graph.Indexer.Repositories;
/// <summary>
/// Repository for CVE observation nodes in the graph.
/// </summary>
public interface ICveObservationNodeRepository
{
/// <summary>Get observation node by CVE and PURL.</summary>
Task<CveObservationNode?> GetAsync(string cveId, string purl, CancellationToken ct = default);
/// <summary>Get all observations for a CVE.</summary>
Task<IReadOnlyList<CveObservationNode>> GetByCveAsync(string cveId, CancellationToken ct = default);
/// <summary>Get all observations for a component.</summary>
Task<IReadOnlyList<CveObservationNode>> GetByPurlAsync(string purl, CancellationToken ct = default);
/// <summary>Get observations in a specific state.</summary>
Task<IReadOnlyList<CveObservationNode>> GetByStateAsync(
ObservationState state,
int limit = 100,
CancellationToken ct = default);
/// <summary>Get observations needing review (past NextReviewAt).</summary>
Task<IReadOnlyList<CveObservationNode>> GetPendingReviewAsync(
DateTimeOffset asOf,
int limit = 100,
CancellationToken ct = default);
/// <summary>Upsert observation node.</summary>
Task UpsertAsync(CveObservationNode node, CancellationToken ct = default);
/// <summary>Update observation state.</summary>
Task UpdateStateAsync(
string nodeId,
ObservationState newState,
DeterminizationGateResult? result,
CancellationToken ct = default);
}
/// <summary>
/// PostgreSQL implementation of observation node repository.
/// </summary>
public sealed class PostgresCveObservationNodeRepository : ICveObservationNodeRepository
{
    private readonly IDbConnectionFactory _connectionFactory;
    private readonly ILogger<PostgresCveObservationNodeRepository> _logger;
    private const string TableName = "graph.cve_observation_nodes";
    public PostgresCveObservationNodeRepository(
        IDbConnectionFactory connectionFactory,
        ILogger<PostgresCveObservationNodeRepository> logger)
    {
        _connectionFactory = connectionFactory;
        _logger = logger;
    }
public async Task<CveObservationNode?> GetAsync(
string cveId,
string purl,
CancellationToken ct = default)
{
var nodeId = CveObservationNode.GenerateNodeId(cveId, purl);
await using var connection = await _connectionFactory.CreateAsync(ct);
var sql = $"""
SELECT
node_id,
cve_id,
subject_purl,
vex_status,
observation_state,
uncertainty_entropy,
uncertainty_completeness,
uncertainty_tier,
uncertainty_missing_signals,
decay_half_life_days,
decay_floor,
decay_last_update,
decay_multiplier,
decay_is_stale,
trust_score,
policy_hint,
guard_rails,
last_evaluated_at,
next_review_at,
environment
FROM {TableName}
WHERE node_id = @NodeId
""";
return await connection.QuerySingleOrDefaultAsync<CveObservationNode>(
sql,
new { NodeId = nodeId },
ct);
}
public async Task UpsertAsync(CveObservationNode node, CancellationToken ct = default)
{
await using var connection = await _connectionFactory.CreateAsync(ct);
var sql = $"""
INSERT INTO {TableName} (
node_id,
cve_id,
subject_purl,
vex_status,
observation_state,
uncertainty_entropy,
uncertainty_completeness,
uncertainty_tier,
uncertainty_missing_signals,
decay_half_life_days,
decay_floor,
decay_last_update,
decay_multiplier,
decay_is_stale,
trust_score,
policy_hint,
guard_rails,
last_evaluated_at,
next_review_at,
environment,
created_at,
updated_at
) VALUES (
@NodeId,
@CveId,
@SubjectPurl,
@VexStatus,
@ObservationState,
@UncertaintyEntropy,
@UncertaintyCompleteness,
@UncertaintyTier,
@UncertaintyMissingSignals,
@DecayHalfLifeDays,
@DecayFloor,
@DecayLastUpdate,
@DecayMultiplier,
@DecayIsStale,
@TrustScore,
@PolicyHint,
@GuardRails,
@LastEvaluatedAt,
@NextReviewAt,
@Environment,
NOW(),
NOW()
)
ON CONFLICT (node_id) DO UPDATE SET
vex_status = EXCLUDED.vex_status,
observation_state = EXCLUDED.observation_state,
uncertainty_entropy = EXCLUDED.uncertainty_entropy,
uncertainty_completeness = EXCLUDED.uncertainty_completeness,
uncertainty_tier = EXCLUDED.uncertainty_tier,
uncertainty_missing_signals = EXCLUDED.uncertainty_missing_signals,
decay_half_life_days = EXCLUDED.decay_half_life_days,
decay_floor = EXCLUDED.decay_floor,
decay_last_update = EXCLUDED.decay_last_update,
decay_multiplier = EXCLUDED.decay_multiplier,
decay_is_stale = EXCLUDED.decay_is_stale,
trust_score = EXCLUDED.trust_score,
policy_hint = EXCLUDED.policy_hint,
guard_rails = EXCLUDED.guard_rails,
last_evaluated_at = EXCLUDED.last_evaluated_at,
next_review_at = EXCLUDED.next_review_at,
environment = EXCLUDED.environment,
updated_at = NOW()
""";
var parameters = new
{
node.NodeId,
node.CveId,
node.SubjectPurl,
VexStatus = node.VexStatus?.ToString(),
ObservationState = node.ObservationState.ToString(),
UncertaintyEntropy = node.Uncertainty.Entropy,
UncertaintyCompleteness = node.Uncertainty.Completeness,
UncertaintyTier = node.Uncertainty.Tier.ToString(),
UncertaintyMissingSignals = JsonSerializer.Serialize(node.Uncertainty.MissingSignals),
DecayHalfLifeDays = node.Decay.HalfLife.TotalDays,
DecayFloor = node.Decay.Floor,
DecayLastUpdate = node.Decay.LastSignalUpdate,
DecayMultiplier = node.Decay.DecayedMultiplier,
DecayIsStale = node.Decay.IsStale,
node.TrustScore,
PolicyHint = node.PolicyHint.ToString(),
GuardRails = node.GuardRails is not null ? JsonSerializer.Serialize(node.GuardRails) : null,
node.LastEvaluatedAt,
node.NextReviewAt,
Environment = node.Environment?.ToString()
};
await connection.ExecuteAsync(sql, parameters, ct);
}
public async Task<IReadOnlyList<CveObservationNode>> GetPendingReviewAsync(
DateTimeOffset asOf,
int limit = 100,
CancellationToken ct = default)
{
await using var connection = await _connectionFactory.CreateAsync(ct);
var sql = $"""
SELECT *
FROM {TableName}
WHERE next_review_at <= @AsOf
AND observation_state IN ('PendingDeterminization', 'StaleRequiresRefresh')
ORDER BY next_review_at ASC
LIMIT @Limit
""";
var results = await connection.QueryAsync<CveObservationNode>(
sql,
new { AsOf = asOf, Limit = limit },
ct);
return results.ToList();
}
}
```
#### Database Migration
```sql
-- Migration: Add CVE observation nodes table
-- File: src/Graph/StellaOps.Graph.Indexer/Migrations/003_cve_observation_nodes.sql
CREATE TABLE IF NOT EXISTS graph.cve_observation_nodes (
node_id TEXT PRIMARY KEY,
cve_id TEXT NOT NULL,
subject_purl TEXT NOT NULL,
vex_status TEXT,
observation_state TEXT NOT NULL DEFAULT 'PendingDeterminization',
-- Uncertainty score
uncertainty_entropy DOUBLE PRECISION NOT NULL,
uncertainty_completeness DOUBLE PRECISION NOT NULL,
uncertainty_tier TEXT NOT NULL,
uncertainty_missing_signals JSONB NOT NULL DEFAULT '[]',
-- Decay tracking
decay_half_life_days DOUBLE PRECISION NOT NULL DEFAULT 14,
decay_floor DOUBLE PRECISION NOT NULL DEFAULT 0.35,
decay_last_update TIMESTAMPTZ NOT NULL,
decay_multiplier DOUBLE PRECISION NOT NULL,
decay_is_stale BOOLEAN NOT NULL DEFAULT FALSE,
-- Trust and policy
trust_score DOUBLE PRECISION NOT NULL,
policy_hint TEXT NOT NULL,
guard_rails JSONB,
-- Timestamps
last_evaluated_at TIMESTAMPTZ NOT NULL,
next_review_at TIMESTAMPTZ,
environment TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT uq_cve_observation_cve_purl UNIQUE (cve_id, subject_purl)
);
-- Indexes for common queries
CREATE INDEX idx_cve_obs_cve_id ON graph.cve_observation_nodes(cve_id);
CREATE INDEX idx_cve_obs_purl ON graph.cve_observation_nodes(subject_purl);
CREATE INDEX idx_cve_obs_state ON graph.cve_observation_nodes(observation_state);
CREATE INDEX idx_cve_obs_review ON graph.cve_observation_nodes(next_review_at)
WHERE observation_state IN ('PendingDeterminization', 'StaleRequiresRefresh');
CREATE INDEX idx_cve_obs_policy ON graph.cve_observation_nodes(policy_hint);
-- Trigger for updated_at
CREATE OR REPLACE FUNCTION graph.update_cve_obs_timestamp()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_cve_obs_updated
BEFORE UPDATE ON graph.cve_observation_nodes
FOR EACH ROW EXECUTE FUNCTION graph.update_cve_obs_timestamp();
```
### Findings: Observation Persistence
#### IObservationRepository (Full Implementation)
```csharp
namespace StellaOps.Findings.Ledger.Repositories;
/// <summary>
/// Repository for CVE observations in the findings ledger.
/// </summary>
public interface IObservationRepository
{
/// <summary>Find observations by CVE and PURL.</summary>
Task<IReadOnlyList<CveObservation>> FindByCveAndPurlAsync(
string cveId,
string purl,
CancellationToken ct = default);
/// <summary>Get observation by ID.</summary>
Task<CveObservation?> GetByIdAsync(Guid id, CancellationToken ct = default);
/// <summary>Create new observation.</summary>
Task<CveObservation> CreateAsync(CveObservation observation, CancellationToken ct = default);
/// <summary>Update observation state with audit trail.</summary>
Task UpdateStateAsync(
Guid id,
ObservationState newState,
DeterminizationGateResult? result,
CancellationToken ct = default);
/// <summary>Get observations needing review.</summary>
Task<IReadOnlyList<CveObservation>> GetPendingReviewAsync(
DateTimeOffset asOf,
int limit = 100,
CancellationToken ct = default);
/// <summary>Record state transition in audit log.</summary>
Task RecordTransitionAsync(
Guid observationId,
ObservationState fromState,
ObservationState toState,
string reason,
CancellationToken ct = default);
}
/// <summary>
/// CVE observation entity for findings ledger.
/// </summary>
public sealed record CveObservation
{
public required Guid Id { get; init; }
public required string CveId { get; init; }
public required string SubjectPurl { get; init; }
public required ObservationState ObservationState { get; init; }
public required DeploymentEnvironment Environment { get; init; }
public UncertaintyScore? LastUncertaintyScore { get; init; }
public double? LastTrustScore { get; init; }
public PolicyVerdictStatus? LastPolicyHint { get; init; }
public GuardRails? GuardRails { get; init; }
public required DateTimeOffset CreatedAt { get; init; }
public required DateTimeOffset UpdatedAt { get; init; }
public DateTimeOffset? NextReviewAt { get; init; }
}
```
### SignalSnapshotBuilder (Full Implementation)
```csharp
namespace StellaOps.Policy.Engine.Signals;
/// <summary>
/// Builds signal snapshots by aggregating from multiple sources.
/// </summary>
public interface ISignalSnapshotBuilder
{
/// <summary>Build snapshot for a CVE/PURL pair.</summary>
Task<SignalSnapshot> BuildAsync(string cveId, string purl, CancellationToken ct = default);
}
public sealed class SignalSnapshotBuilder : ISignalSnapshotBuilder
{
    private readonly ISignalAttacher<EpssEvidence> _epssAttacher;
    private readonly ISignalAttacher<bool> _kevAttacher;
    private readonly IVexSignalProvider _vexProvider;
    private readonly IReachabilitySignalProvider _reachabilityProvider;
    private readonly IRuntimeSignalProvider _runtimeProvider;
    private readonly IBackportSignalProvider _backportProvider;
    private readonly ISbomLineageSignalProvider _sbomProvider;
    private readonly ICvssSignalProvider _cvssProvider;
    private readonly TimeProvider _timeProvider;
    private readonly ILogger<SignalSnapshotBuilder> _logger;
    public SignalSnapshotBuilder(
        ISignalAttacher<EpssEvidence> epssAttacher,
        ISignalAttacher<bool> kevAttacher,
        IVexSignalProvider vexProvider,
        IReachabilitySignalProvider reachabilityProvider,
        IRuntimeSignalProvider runtimeProvider,
        IBackportSignalProvider backportProvider,
        ISbomLineageSignalProvider sbomProvider,
        ICvssSignalProvider cvssProvider,
        TimeProvider timeProvider,
        ILogger<SignalSnapshotBuilder> logger)
    {
        _epssAttacher = epssAttacher;
        _kevAttacher = kevAttacher;
        _vexProvider = vexProvider;
        _reachabilityProvider = reachabilityProvider;
        _runtimeProvider = runtimeProvider;
        _backportProvider = backportProvider;
        _sbomProvider = sbomProvider;
        _cvssProvider = cvssProvider;
        _timeProvider = timeProvider;
        _logger = logger;
    }
public async Task<SignalSnapshot> BuildAsync(
string cveId,
string purl,
CancellationToken ct = default)
{
var now = _timeProvider.GetUtcNow();
_logger.LogDebug("Building signal snapshot for CVE {CveId} on {Purl}", cveId, purl);
// Fetch all signals in parallel
var epssTask = _epssAttacher.AttachAsync(cveId, purl, ct);
var kevTask = _kevAttacher.AttachAsync(cveId, purl, ct);
var vexTask = _vexProvider.GetSignalAsync(cveId, purl, ct);
var reachTask = _reachabilityProvider.GetSignalAsync(cveId, purl, ct);
var runtimeTask = _runtimeProvider.GetSignalAsync(cveId, purl, ct);
var backportTask = _backportProvider.GetSignalAsync(cveId, purl, ct);
var sbomTask = _sbomProvider.GetSignalAsync(purl, ct);
var cvssTask = _cvssProvider.GetSignalAsync(cveId, ct);
await Task.WhenAll(
epssTask, kevTask, vexTask, reachTask,
runtimeTask, backportTask, sbomTask, cvssTask);
var snapshot = new SignalSnapshot
{
CveId = cveId,
SubjectPurl = purl,
CapturedAt = now,
Epss = await epssTask,
Kev = await kevTask,
Vex = await vexTask,
Reachability = await reachTask,
Runtime = await runtimeTask,
Backport = await backportTask,
SbomLineage = await sbomTask,
Cvss = await cvssTask
};
_logger.LogDebug(
"Built signal snapshot for CVE {CveId}: EPSS={EpssStatus}, VEX={VexStatus}, Reach={ReachStatus}",
cveId,
snapshot.Epss.Status,
snapshot.Vex.Status,
snapshot.Reachability.Status);
return snapshot;
}
}
```
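One caveat with the parallel fetch: `Task.WhenAll` surfaces the first provider exception and faults the whole snapshot. The attachers above catch internally and return `SignalState<T>.Failed`, but providers that do not can be wrapped. A minimal sketch; `GuardAsync` is a hypothetical helper, using only the `SignalState<T>` factory shown earlier:
```csharp
// Defensive wrapper: converts a throwing provider task into a Failed
// signal state so one bad source cannot fault the whole snapshot.
private static async Task<SignalState<T>> GuardAsync<T>(Task<SignalState<T>> task)
{
    try
    {
        return await task;
    }
    catch (Exception ex)
    {
        return SignalState<T>.Failed(ex.Message);
    }
}

// Usage inside BuildAsync:
// var vexTask = GuardAsync(_vexProvider.GetSignalAsync(cveId, purl, ct));
```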
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DBI-001 | TODO | DPE-030 | Guild | Create `ISignalAttacher<T>` interface in Feedser |
| 2 | DBI-002 | TODO | DBI-001 | Guild | Implement `EpssSignalAttacher` with event emission |
| 3 | DBI-003 | TODO | DBI-002 | Guild | Implement `KevSignalAttacher` |
| 4 | DBI-004 | TODO | DBI-003 | Guild | Create `SignalAttacherServiceExtensions` for DI |
| 5 | DBI-005 | TODO | DBI-004 | Guild | Create `VexSignalEmitter` in VexLens |
| 6 | DBI-006 | TODO | DBI-005 | Guild | Create `VexClaimSummaryMapper` |
| 7 | DBI-007 | TODO | DBI-006 | Guild | Integrate VexSignalEmitter into VEX processing pipeline |
| 8 | DBI-008 | TODO | DBI-007 | Guild | Create `CveObservationNode` record in Graph |
| 9 | DBI-009 | TODO | DBI-008 | Guild | Create `ICveObservationNodeRepository` interface |
| 10 | DBI-010 | TODO | DBI-009 | Guild | Implement `PostgresCveObservationNodeRepository` |
| 11 | DBI-011 | TODO | DBI-010 | Guild | Create migration `003_cve_observation_nodes.sql` |
| 12 | DBI-012 | TODO | DBI-011 | Guild | Create `IObservationRepository` in Findings |
| 13 | DBI-013 | TODO | DBI-012 | Guild | Implement `PostgresObservationRepository` |
| 14 | DBI-014 | TODO | DBI-013 | Guild | Create `ISignalSnapshotBuilder` interface |
| 15 | DBI-015 | TODO | DBI-014 | Guild | Implement `SignalSnapshotBuilder` with parallel fetch |
| 16 | DBI-016 | TODO | DBI-015 | Guild | Create signal provider interfaces (VEX, Reachability, etc.) |
| 17 | DBI-017 | TODO | DBI-016 | Guild | Implement signal provider adapters |
| 18 | DBI-018 | TODO | DBI-017 | Guild | Write unit tests: `EpssSignalAttacher` scenarios |
| 19 | DBI-019 | TODO | DBI-018 | Guild | Write unit tests: `SignalSnapshotBuilder` parallel fetch |
| 20 | DBI-020 | TODO | DBI-019 | Guild | Write integration tests: Graph node persistence |
| 21 | DBI-021 | TODO | DBI-020 | Guild | Write integration tests: Findings observation lifecycle |
| 22 | DBI-022 | TODO | DBI-021 | Guild | Write integration tests: End-to-end signal flow |
| 23 | DBI-023 | TODO | DBI-022 | Guild | Add metrics: `stellaops_feedser_signal_attachments_total` |
| 24 | DBI-024 | TODO | DBI-023 | Guild | Add metrics: `stellaops_graph_observation_nodes_total` |
| 25 | DBI-025 | TODO | DBI-024 | Guild | Update module AGENTS.md files |
| 26 | DBI-026 | TODO | DBI-025 | Guild | Verify build across all affected modules |
## Acceptance Criteria
1. `EpssSignalAttacher` correctly wraps EPSS results in `SignalState<T>`
2. VEX updates emit `SignalUpdatedEvent` for downstream processing
3. Graph nodes persist `ObservationState` and `UncertaintyScore`
4. Findings ledger tracks state transitions with audit trail
5. `SignalSnapshotBuilder` fetches all signals in parallel
6. Migration creates proper indexes for common queries
7. All integration tests pass with Testcontainers
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Parallel signal fetch | Reduces latency; signals are independent |
| Graph node hash ID | Deterministic; avoids UUID collision across systems (see the sketch after this section) |
| JSONB for missing_signals | Flexible schema; supports varying signal sets |
| Separate Graph and Findings storage | Graph for query patterns; Findings for audit trail |

| Risk | Mitigation |
|------|------------|
| Signal provider availability | Graceful degradation to `SignalState.Failed` |
| Event storm on bulk VEX import | Batch event emission; debounce handler |
| Schema drift across modules | Shared Evidence models in Determinization library |
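The graph node hash ID decision can be made concrete with a content-addressed derivation, mirroring the `ComputeHintId` pattern used later in this series. A minimal sketch, assuming SHA-256 over the natural key; `ObservationNodeId.ComputeNodeId` is illustrative, not an existing API:

```csharp
using System.Security.Cryptography;
using System.Text;

public static class ObservationNodeId
{
    /// <summary>
    /// Derives a deterministic, content-addressed node ID so the same
    /// (CVE, package) pair maps to the same node in every system.
    /// </summary>
    public static string ComputeNodeId(string cveId, string purl)
    {
        var input = $"{cveId}|{purl}";
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return $"obs:sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..24]}";
    }
}
```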
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |
## Next Checkpoints
- 2026-01-12: DBI-001 to DBI-011 complete (Feedser, VexLens, Graph)
- 2026-01-13: DBI-012 to DBI-017 complete (Findings, SignalSnapshotBuilder)
- 2026-01-14: DBI-018 to DBI-026 complete (tests, metrics)



@@ -0,0 +1,914 @@
# Sprint 20260106_001_005_FE - Determinization: Frontend UI Components
## Topic & Scope
Create Angular UI components for displaying and managing CVE observation state, uncertainty scores, guardrails status, and review workflows. This includes the "Unknown (auto-tracking)" chip with next review ETA and a determinization dashboard.
- **Working directory:** `src/Web/StellaOps.Web/`
- **Evidence:** Angular components, services, tests, Storybook stories
## Problem Statement
Current UI state:
- Vulnerability findings show VEX status but not observation state
- No visibility into uncertainty/entropy levels
- No guardrails status indicator
- No review workflow for uncertain observations
Advisory requires:
- UI chip: "Unknown (auto-tracking)" with next review ETA
- Uncertainty tier visualization
- Guardrails status and monitoring indicators
- Review queue for pending observations
- State transition history
## Dependencies & Concurrency
- **Depends on:** SPRINT_20260106_001_004_BE (API endpoints)
- **Blocks:** None (end of chain)
- **Parallel safe:** Frontend-only changes
## Documentation Prerequisites
- docs/modules/policy/determinization-architecture.md
- SPRINT_20260106_001_004_BE (API contracts)
- src/Web/StellaOps.Web/AGENTS.md (if exists)
- Existing: Vulnerability findings components
## Technical Design
### Directory Structure
```
src/Web/StellaOps.Web/src/app/
├── shared/
│ └── components/
│ └── determinization/
│ ├── observation-state-chip/
│ │ ├── observation-state-chip.component.ts
│ │ ├── observation-state-chip.component.html
│ │ ├── observation-state-chip.component.scss
│ │ └── observation-state-chip.component.spec.ts
│ ├── uncertainty-indicator/
│ │ ├── uncertainty-indicator.component.ts
│ │ ├── uncertainty-indicator.component.html
│ │ ├── uncertainty-indicator.component.scss
│ │ └── uncertainty-indicator.component.spec.ts
│ ├── guardrails-badge/
│ │ ├── guardrails-badge.component.ts
│ │ ├── guardrails-badge.component.html
│ │ ├── guardrails-badge.component.scss
│ │ └── guardrails-badge.component.spec.ts
│ ├── decay-progress/
│ │ ├── decay-progress.component.ts
│ │ ├── decay-progress.component.html
│ │ ├── decay-progress.component.scss
│ │ └── decay-progress.component.spec.ts
│ └── determinization.module.ts
├── features/
│ └── vulnerabilities/
│ └── components/
│ ├── observation-details-panel/
│ │ ├── observation-details-panel.component.ts
│ │ ├── observation-details-panel.component.html
│ │ └── observation-details-panel.component.scss
│ └── observation-review-queue/
│ ├── observation-review-queue.component.ts
│ ├── observation-review-queue.component.html
│ └── observation-review-queue.component.scss
└── core/
    ├── services/
    │   └── determinization/
    │       ├── determinization.service.ts
    │       └── determinization.service.spec.ts
    └── models/
        └── determinization.models.ts
```
### TypeScript Models
```typescript
// src/app/core/models/determinization.models.ts
export enum ObservationState {
PendingDeterminization = 'PendingDeterminization',
Determined = 'Determined',
Disputed = 'Disputed',
StaleRequiresRefresh = 'StaleRequiresRefresh',
ManualReviewRequired = 'ManualReviewRequired',
Suppressed = 'Suppressed'
}
export enum UncertaintyTier {
VeryLow = 'VeryLow',
Low = 'Low',
Medium = 'Medium',
High = 'High',
VeryHigh = 'VeryHigh'
}
export enum PolicyVerdictStatus {
Pass = 'Pass',
GuardedPass = 'GuardedPass',
Blocked = 'Blocked',
Ignored = 'Ignored',
Warned = 'Warned',
Deferred = 'Deferred',
Escalated = 'Escalated',
RequiresVex = 'RequiresVex'
}
export interface UncertaintyScore {
entropy: number;
completeness: number;
tier: UncertaintyTier;
missingSignals: SignalGap[];
weightedEvidenceSum: number;
maxPossibleWeight: number;
}
export interface SignalGap {
signalName: string;
weight: number;
status: 'NotQueried' | 'Queried' | 'Failed';
reason?: string;
}
export interface ObservationDecay {
halfLifeDays: number;
floor: number;
lastSignalUpdate: string;
decayedMultiplier: number;
nextReviewAt?: string;
isStale: boolean;
ageDays: number;
}
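// Decay semantics (assumed from the field names): freshness follows a
// half-life curve with a lower bound, i.e.
//   decayedMultiplier = max(floor, 0.5 ** (ageDays / halfLifeDays))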
export interface GuardRails {
enableRuntimeMonitoring: boolean;
reviewIntervalDays: number;
epssEscalationThreshold: number;
escalatingReachabilityStates: string[];
maxGuardedDurationDays: number;
alertChannels: string[];
policyRationale?: string;
}
export interface CveObservation {
id: string;
cveId: string;
subjectPurl: string;
observationState: ObservationState;
uncertaintyScore: UncertaintyScore;
decay: ObservationDecay;
trustScore: number;
policyHint: PolicyVerdictStatus;
guardRails?: GuardRails;
lastEvaluatedAt: string;
nextReviewAt?: string;
environment?: string;
vexStatus?: string;
}
export interface ObservationStateTransition {
id: string;
observationId: string;
fromState: ObservationState;
toState: ObservationState;
reason: string;
triggeredBy: string;
timestamp: string;
}
```
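For reference, an illustrative `CveObservation` value that type-checks against these interfaces; all identifiers and numbers are invented for the example:

```typescript
const example: CveObservation = {
  id: 'obs-001',
  cveId: 'CVE-2026-0001',
  subjectPurl: 'pkg:npm/example@1.2.3',
  observationState: ObservationState.PendingDeterminization,
  uncertaintyScore: {
    entropy: 0.82,
    completeness: 0.45,
    tier: UncertaintyTier.High,
    missingSignals: [
      { signalName: 'Reachability', weight: 0.2, status: 'NotQueried' }
    ],
    weightedEvidenceSum: 2.7,
    maxPossibleWeight: 6.0
  },
  decay: {
    halfLifeDays: 30,
    floor: 0.2,
    lastSignalUpdate: '2026-01-01T00:00:00Z',
    decayedMultiplier: 0.89,
    nextReviewAt: '2026-01-20T00:00:00Z',
    isStale: false,
    ageDays: 5.0
  },
  trustScore: 0.4,
  policyHint: PolicyVerdictStatus.GuardedPass,
  lastEvaluatedAt: '2026-01-06T00:00:00Z',
  nextReviewAt: '2026-01-20T00:00:00Z'
};
```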
### ObservationStateChip Component
```typescript
// observation-state-chip.component.ts
import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatChipsModule } from '@angular/material/chips';
import { MatIconModule } from '@angular/material/icon';
import { MatTooltipModule } from '@angular/material/tooltip';
import { ObservationState, CveObservation } from '@core/models/determinization.models';
import { formatDistanceToNow, parseISO } from 'date-fns';
@Component({
selector: 'stellaops-observation-state-chip',
standalone: true,
imports: [CommonModule, MatChipsModule, MatIconModule, MatTooltipModule],
templateUrl: './observation-state-chip.component.html',
styleUrls: ['./observation-state-chip.component.scss'],
changeDetection: ChangeDetectionStrategy.OnPush
})
export class ObservationStateChipComponent {
@Input({ required: true }) observation!: CveObservation;
@Input() showReviewEta = true;
get stateConfig(): StateConfig {
return STATE_CONFIGS[this.observation.observationState];
}
get reviewEtaText(): string | null {
if (!this.observation.nextReviewAt) return null;
const nextReview = parseISO(this.observation.nextReviewAt);
return formatDistanceToNow(nextReview, { addSuffix: true });
}
get tooltipText(): string {
const config = this.stateConfig;
let tooltip = config.description;
if (this.observation.observationState === ObservationState.PendingDeterminization) {
const missing = this.observation.uncertaintyScore.missingSignals
.map(g => g.signalName)
.join(', ');
if (missing) {
tooltip += ` Missing: ${missing}`;
}
}
if (this.reviewEtaText) {
tooltip += ` Next review: ${this.reviewEtaText}`;
}
return tooltip;
}
}
interface StateConfig {
label: string;
icon: string;
color: 'primary' | 'accent' | 'warn' | 'default';
description: string;
}
const STATE_CONFIGS: Record<ObservationState, StateConfig> = {
[ObservationState.PendingDeterminization]: {
label: 'Unknown (auto-tracking)',
icon: 'hourglass_empty',
color: 'accent',
description: 'Evidence incomplete; tracking for updates.'
},
[ObservationState.Determined]: {
label: 'Determined',
icon: 'check_circle',
color: 'primary',
description: 'Sufficient evidence for confident determination.'
},
[ObservationState.Disputed]: {
label: 'Disputed',
icon: 'warning',
color: 'warn',
description: 'Conflicting evidence detected; requires review.'
},
[ObservationState.StaleRequiresRefresh]: {
label: 'Stale',
icon: 'update',
color: 'warn',
description: 'Evidence has decayed; needs refresh.'
},
[ObservationState.ManualReviewRequired]: {
label: 'Review Required',
icon: 'rate_review',
color: 'warn',
description: 'Manual review required before proceeding.'
},
[ObservationState.Suppressed]: {
label: 'Suppressed',
icon: 'visibility_off',
color: 'default',
description: 'Observation suppressed by policy exception.'
}
};
```
```html
<!-- observation-state-chip.component.html -->
<mat-chip
[class]="'observation-chip observation-chip--' + observation.observationState.toLowerCase()"
[matTooltip]="tooltipText"
matTooltipPosition="above">
<mat-icon class="chip-icon">{{ stateConfig.icon }}</mat-icon>
<span class="chip-label">{{ stateConfig.label }}</span>
<span *ngIf="showReviewEta && reviewEtaText" class="chip-eta">
({{ reviewEtaText }})
</span>
</mat-chip>
```
```scss
// observation-state-chip.component.scss
.observation-chip {
display: inline-flex;
align-items: center;
gap: 4px;
font-size: 12px;
height: 24px;
.chip-icon {
font-size: 16px;
width: 16px;
height: 16px;
}
.chip-eta {
font-size: 10px;
opacity: 0.8;
}
&--pendingdeterminization {
background-color: #fff3e0;
color: #e65100;
}
&--determined {
background-color: #e8f5e9;
color: #2e7d32;
}
&--disputed {
background-color: #fff8e1;
color: #f57f17;
}
&--stalerequiresrefresh {
background-color: #fce4ec;
color: #c2185b;
}
&--manualreviewrequired {
background-color: #ffebee;
color: #c62828;
}
&--suppressed {
background-color: #f5f5f5;
color: #757575;
}
}
```
### UncertaintyIndicator Component
```typescript
// uncertainty-indicator.component.ts
import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatProgressBarModule } from '@angular/material/progress-bar';
import { MatTooltipModule } from '@angular/material/tooltip';
import { UncertaintyScore, UncertaintyTier } from '@core/models/determinization.models';
@Component({
selector: 'stellaops-uncertainty-indicator',
standalone: true,
imports: [CommonModule, MatProgressBarModule, MatTooltipModule],
templateUrl: './uncertainty-indicator.component.html',
styleUrls: ['./uncertainty-indicator.component.scss'],
changeDetection: ChangeDetectionStrategy.OnPush
})
export class UncertaintyIndicatorComponent {
@Input({ required: true }) score!: UncertaintyScore;
@Input() showLabel = true;
@Input() compact = false;
get completenessPercent(): number {
return Math.round(this.score.completeness * 100);
}
get tierConfig(): TierConfig {
return TIER_CONFIGS[this.score.tier];
}
get tooltipText(): string {
const missing = this.score.missingSignals.map(g => g.signalName).join(', ');
return `Evidence completeness: ${this.completenessPercent}%` +
(missing ? ` | Missing: ${missing}` : '');
  }

  // Used by the template instead of nonexistent `map`/`join` pipes.
  get missingSignalsPreview(): string {
    return this.score.missingSignals
      .slice(0, 3)
      .map(g => g.signalName)
      .join(', ');
  }
}
interface TierConfig {
label: string;
color: string;
barColor: 'primary' | 'accent' | 'warn';
}
const TIER_CONFIGS: Record<UncertaintyTier, TierConfig> = {
[UncertaintyTier.VeryLow]: {
label: 'Very Low Uncertainty',
color: '#4caf50',
barColor: 'primary'
},
[UncertaintyTier.Low]: {
label: 'Low Uncertainty',
color: '#8bc34a',
barColor: 'primary'
},
[UncertaintyTier.Medium]: {
label: 'Moderate Uncertainty',
color: '#ffc107',
barColor: 'accent'
},
[UncertaintyTier.High]: {
label: 'High Uncertainty',
color: '#ff9800',
barColor: 'warn'
},
[UncertaintyTier.VeryHigh]: {
label: 'Very High Uncertainty',
color: '#f44336',
barColor: 'warn'
}
};
```
```html
<!-- uncertainty-indicator.component.html -->
<div class="uncertainty-indicator"
[class.compact]="compact"
[matTooltip]="tooltipText">
<div class="indicator-header" *ngIf="showLabel">
<span class="tier-label" [style.color]="tierConfig.color">
{{ tierConfig.label }}
</span>
<span class="completeness-value">{{ completenessPercent }}%</span>
</div>
<mat-progress-bar
[value]="completenessPercent"
[color]="tierConfig.barColor"
mode="determinate">
</mat-progress-bar>
<div class="missing-signals" *ngIf="!compact && score.missingSignals.length > 0">
<span class="missing-label">Missing:</span>
<span class="missing-list">
{{ missingSignalsPreview }}
<span *ngIf="score.missingSignals.length > 3">
+{{ score.missingSignals.length - 3 }} more
</span>
</span>
</div>
</div>
```
### GuardrailsBadge Component
```typescript
// guardrails-badge.component.ts
import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatBadgeModule } from '@angular/material/badge';
import { MatIconModule } from '@angular/material/icon';
import { MatTooltipModule } from '@angular/material/tooltip';
import { GuardRails } from '@core/models/determinization.models';
@Component({
selector: 'stellaops-guardrails-badge',
standalone: true,
imports: [CommonModule, MatBadgeModule, MatIconModule, MatTooltipModule],
templateUrl: './guardrails-badge.component.html',
styleUrls: ['./guardrails-badge.component.scss'],
changeDetection: ChangeDetectionStrategy.OnPush
})
export class GuardrailsBadgeComponent {
@Input({ required: true }) guardRails!: GuardRails;
get activeGuardrailsCount(): number {
let count = 0;
if (this.guardRails.enableRuntimeMonitoring) count++;
if (this.guardRails.alertChannels.length > 0) count++;
if (this.guardRails.epssEscalationThreshold < 1.0) count++;
return count;
}
get tooltipText(): string {
const parts: string[] = [];
if (this.guardRails.enableRuntimeMonitoring) {
parts.push('Runtime monitoring enabled');
}
parts.push(`Review every ${this.guardRails.reviewIntervalDays} days`);
parts.push(`EPSS escalation at ${(this.guardRails.epssEscalationThreshold * 100).toFixed(0)}%`);
if (this.guardRails.alertChannels.length > 0) {
parts.push(`Alerts: ${this.guardRails.alertChannels.join(', ')}`);
}
if (this.guardRails.policyRationale) {
parts.push(`Rationale: ${this.guardRails.policyRationale}`);
}
return parts.join(' | ');
}
}
```
```html
<!-- guardrails-badge.component.html -->
<div class="guardrails-badge" [matTooltip]="tooltipText">
<mat-icon
[matBadge]="activeGuardrailsCount"
matBadgeColor="accent"
matBadgeSize="small">
security
</mat-icon>
<span class="badge-label">Guarded</span>
<div class="guardrails-icons">
<mat-icon *ngIf="guardRails.enableRuntimeMonitoring"
class="guardrail-icon"
matTooltip="Runtime monitoring active">
monitor_heart
</mat-icon>
<mat-icon *ngIf="guardRails.alertChannels.length > 0"
class="guardrail-icon"
matTooltip="Alerts configured">
notifications_active
</mat-icon>
</div>
</div>
```
### DecayProgress Component
```typescript
// decay-progress.component.ts
import { Component, Input, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatProgressBarModule } from '@angular/material/progress-bar';
import { MatTooltipModule } from '@angular/material/tooltip';
import { ObservationDecay } from '@core/models/determinization.models';
import { formatDistanceToNow, parseISO } from 'date-fns';
@Component({
selector: 'stellaops-decay-progress',
standalone: true,
imports: [CommonModule, MatProgressBarModule, MatTooltipModule],
templateUrl: './decay-progress.component.html',
styleUrls: ['./decay-progress.component.scss'],
changeDetection: ChangeDetectionStrategy.OnPush
})
export class DecayProgressComponent {
@Input({ required: true }) decay!: ObservationDecay;
get freshness(): number {
return Math.round(this.decay.decayedMultiplier * 100);
}
get ageText(): string {
return `${this.decay.ageDays.toFixed(1)} days old`;
}
get nextReviewText(): string | null {
if (!this.decay.nextReviewAt) return null;
return formatDistanceToNow(parseISO(this.decay.nextReviewAt), { addSuffix: true });
}
get barColor(): 'primary' | 'accent' | 'warn' {
if (this.decay.isStale) return 'warn';
if (this.decay.decayedMultiplier < 0.7) return 'accent';
return 'primary';
}
get tooltipText(): string {
return `Freshness: ${this.freshness}% | Age: ${this.ageText} | ` +
`Half-life: ${this.decay.halfLifeDays} days` +
(this.decay.isStale ? ' | STALE - needs refresh' : '');
}
}
```
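The decay-progress template is not shown in this sprint; a minimal sketch consistent with the component's getters could look like this:

```html
<!-- decay-progress.component.html (illustrative sketch) -->
<div class="decay-progress" [matTooltip]="tooltipText">
  <mat-progress-bar
    [value]="freshness"
    [color]="barColor"
    mode="determinate">
  </mat-progress-bar>
  <span class="decay-age">{{ ageText }}</span>
  <span *ngIf="nextReviewText" class="decay-review">
    Review {{ nextReviewText }}
  </span>
</div>
```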
### Determinization Service
```typescript
// determinization.service.ts
import { Injectable, inject } from '@angular/core';
import { HttpClient, HttpParams } from '@angular/common/http';
import { Observable } from 'rxjs';
import {
CveObservation,
ObservationState,
ObservationStateTransition
} from '@core/models/determinization.models';
import { ApiConfig } from '@core/config/api.config';
@Injectable({ providedIn: 'root' })
export class DeterminizationService {
private readonly http = inject(HttpClient);
private readonly apiConfig = inject(ApiConfig);
private get baseUrl(): string {
return `${this.apiConfig.baseUrl}/api/v1/observations`;
}
getObservation(cveId: string, purl: string): Observable<CveObservation> {
const params = new HttpParams()
.set('cveId', cveId)
.set('purl', purl);
return this.http.get<CveObservation>(this.baseUrl, { params });
}
getObservationById(id: string): Observable<CveObservation> {
return this.http.get<CveObservation>(`${this.baseUrl}/${id}`);
}
getPendingReview(limit = 50): Observable<CveObservation[]> {
const params = new HttpParams()
.set('state', ObservationState.PendingDeterminization)
.set('limit', limit.toString());
return this.http.get<CveObservation[]>(`${this.baseUrl}/pending-review`, { params });
}
getByState(state: ObservationState, limit = 100): Observable<CveObservation[]> {
const params = new HttpParams()
.set('state', state)
.set('limit', limit.toString());
return this.http.get<CveObservation[]>(this.baseUrl, { params });
}
getTransitionHistory(observationId: string): Observable<ObservationStateTransition[]> {
return this.http.get<ObservationStateTransition[]>(
`${this.baseUrl}/${observationId}/transitions`
);
}
requestReview(observationId: string, reason: string): Observable<void> {
return this.http.post<void>(
`${this.baseUrl}/${observationId}/request-review`,
{ reason }
);
}
suppress(observationId: string, reason: string): Observable<void> {
return this.http.post<void>(
`${this.baseUrl}/${observationId}/suppress`,
{ reason }
);
}
refreshSignals(observationId: string): Observable<CveObservation> {
return this.http.post<CveObservation>(
`${this.baseUrl}/${observationId}/refresh`,
{}
);
}
}
```
### Observation Review Queue Component
```typescript
// observation-review-queue.component.ts
import { Component, OnInit, inject, ChangeDetectionStrategy } from '@angular/core';
import { CommonModule } from '@angular/common';
import { MatTableModule } from '@angular/material/table';
import { MatPaginatorModule, PageEvent } from '@angular/material/paginator';
import { MatButtonModule } from '@angular/material/button';
import { MatIconModule } from '@angular/material/icon';
import { MatMenuModule } from '@angular/material/menu';
import { MatTooltipModule } from '@angular/material/tooltip';
import { RouterLink } from '@angular/router';
import { BehaviorSubject } from 'rxjs';
import { DeterminizationService } from '@core/services/determinization/determinization.service';
import { CveObservation } from '@core/models/determinization.models';
import { ObservationStateChipComponent } from '@shared/components/determinization/observation-state-chip/observation-state-chip.component';
import { UncertaintyIndicatorComponent } from '@shared/components/determinization/uncertainty-indicator/uncertainty-indicator.component';
import { GuardrailsBadgeComponent } from '@shared/components/determinization/guardrails-badge/guardrails-badge.component';
import { DecayProgressComponent } from '@shared/components/determinization/decay-progress/decay-progress.component';
@Component({
selector: 'stellaops-observation-review-queue',
standalone: true,
imports: [
CommonModule,
MatTableModule,
MatPaginatorModule,
MatButtonModule,
MatIconModule,
    MatMenuModule,
    MatTooltipModule,
    RouterLink,
ObservationStateChipComponent,
UncertaintyIndicatorComponent,
GuardrailsBadgeComponent,
DecayProgressComponent
],
templateUrl: './observation-review-queue.component.html',
styleUrls: ['./observation-review-queue.component.scss'],
changeDetection: ChangeDetectionStrategy.OnPush
})
export class ObservationReviewQueueComponent implements OnInit {
private readonly determinizationService = inject(DeterminizationService);
displayedColumns = ['cveId', 'purl', 'state', 'uncertainty', 'freshness', 'actions'];
observations$ = new BehaviorSubject<CveObservation[]>([]);
loading$ = new BehaviorSubject<boolean>(false);
pageSize = 25;
pageIndex = 0;
ngOnInit(): void {
this.loadObservations();
}
loadObservations(): void {
this.loading$.next(true);
this.determinizationService.getPendingReview(this.pageSize)
.subscribe({
next: (observations) => {
this.observations$.next(observations);
this.loading$.next(false);
},
error: () => this.loading$.next(false)
});
}
onPageChange(event: PageEvent): void {
this.pageSize = event.pageSize;
this.pageIndex = event.pageIndex;
this.loadObservations();
}
onRefresh(observation: CveObservation): void {
this.determinizationService.refreshSignals(observation.id)
.subscribe(() => this.loadObservations());
}
onRequestReview(observation: CveObservation): void {
// Open dialog for review request
}
onSuppress(observation: CveObservation): void {
// Open dialog for suppression
}
}
```
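The `onRequestReview` and `onSuppress` handlers above are intentionally stubbed. A minimal sketch of the review path, assuming the component also injects `MatDialog` and that a `ReasonDialogComponent` reason prompt exists (both are assumptions, not existing project code):

```typescript
// Sketch only. Assumes the component adds:
//   private readonly dialog = inject(MatDialog);
// ReasonDialogComponent is a hypothetical dialog that resolves with a
// free-text reason, or undefined when cancelled.
onRequestReview(observation: CveObservation): void {
  this.dialog
    .open(ReasonDialogComponent, { data: { title: 'Request Review' } })
    .afterClosed()
    .subscribe((reason?: string) => {
      if (!reason) {
        return; // cancelled
      }
      this.determinizationService
        .requestReview(observation.id, reason)
        .subscribe(() => this.loadObservations());
    });
}
```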
```html
<!-- observation-review-queue.component.html -->
<div class="review-queue">
<div class="queue-header">
<h2>Pending Determinization Review</h2>
<button mat-icon-button (click)="loadObservations()" matTooltip="Refresh">
<mat-icon>refresh</mat-icon>
</button>
</div>
<table mat-table [dataSource]="observations$ | async" class="queue-table">
<!-- CVE ID Column -->
<ng-container matColumnDef="cveId">
<th mat-header-cell *matHeaderCellDef>CVE</th>
<td mat-cell *matCellDef="let obs">
<a [routerLink]="['/vulnerabilities', obs.cveId]">{{ obs.cveId }}</a>
</td>
</ng-container>
<!-- PURL Column -->
<ng-container matColumnDef="purl">
<th mat-header-cell *matHeaderCellDef>Component</th>
<td mat-cell *matCellDef="let obs" class="purl-cell">
{{ obs.subjectPurl | slice:0:50 }}
</td>
</ng-container>
<!-- State Column -->
<ng-container matColumnDef="state">
<th mat-header-cell *matHeaderCellDef>State</th>
<td mat-cell *matCellDef="let obs">
<stellaops-observation-state-chip [observation]="obs">
</stellaops-observation-state-chip>
</td>
</ng-container>
<!-- Uncertainty Column -->
<ng-container matColumnDef="uncertainty">
<th mat-header-cell *matHeaderCellDef>Evidence</th>
<td mat-cell *matCellDef="let obs">
<stellaops-uncertainty-indicator
[score]="obs.uncertaintyScore"
[compact]="true">
</stellaops-uncertainty-indicator>
</td>
</ng-container>
<!-- Freshness Column -->
<ng-container matColumnDef="freshness">
<th mat-header-cell *matHeaderCellDef>Freshness</th>
<td mat-cell *matCellDef="let obs">
<stellaops-decay-progress [decay]="obs.decay">
</stellaops-decay-progress>
</td>
</ng-container>
<!-- Actions Column -->
<ng-container matColumnDef="actions">
<th mat-header-cell *matHeaderCellDef></th>
<td mat-cell *matCellDef="let obs">
<button mat-icon-button [matMenuTriggerFor]="menu">
<mat-icon>more_vert</mat-icon>
</button>
<mat-menu #menu="matMenu">
<button mat-menu-item (click)="onRefresh(obs)">
<mat-icon>refresh</mat-icon>
<span>Refresh Signals</span>
</button>
<button mat-menu-item (click)="onRequestReview(obs)">
<mat-icon>rate_review</mat-icon>
<span>Request Review</span>
</button>
<button mat-menu-item (click)="onSuppress(obs)">
<mat-icon>visibility_off</mat-icon>
<span>Suppress</span>
</button>
</mat-menu>
</td>
</ng-container>
<tr mat-header-row *matHeaderRowDef="displayedColumns"></tr>
<tr mat-row *matRowDef="let row; columns: displayedColumns;"></tr>
</table>
<mat-paginator
[pageSize]="pageSize"
[pageIndex]="pageIndex"
[pageSizeOptions]="[10, 25, 50, 100]"
(page)="onPageChange($event)">
</mat-paginator>
</div>
```
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | DFE-001 | TODO | DBI-026 | Guild | Create `determinization.models.ts` TypeScript interfaces |
| 2 | DFE-002 | TODO | DFE-001 | Guild | Create `DeterminizationService` with API methods |
| 3 | DFE-003 | TODO | DFE-002 | Guild | Create `ObservationStateChipComponent` |
| 4 | DFE-004 | TODO | DFE-003 | Guild | Create `UncertaintyIndicatorComponent` |
| 5 | DFE-005 | TODO | DFE-004 | Guild | Create `GuardrailsBadgeComponent` |
| 6 | DFE-006 | TODO | DFE-005 | Guild | Create `DecayProgressComponent` |
| 7 | DFE-007 | TODO | DFE-006 | Guild | Create `DeterminizationModule` to export components |
| 8 | DFE-008 | TODO | DFE-007 | Guild | Create `ObservationDetailsPanelComponent` |
| 9 | DFE-009 | TODO | DFE-008 | Guild | Create `ObservationReviewQueueComponent` |
| 10 | DFE-010 | TODO | DFE-009 | Guild | Integrate state chip into existing vulnerability list |
| 11 | DFE-011 | TODO | DFE-010 | Guild | Add uncertainty indicator to vulnerability details |
| 12 | DFE-012 | TODO | DFE-011 | Guild | Add guardrails badge to guarded findings |
| 13 | DFE-013 | TODO | DFE-012 | Guild | Create state transition history timeline component |
| 14 | DFE-014 | TODO | DFE-013 | Guild | Add review queue to navigation |
| 15 | DFE-015 | TODO | DFE-014 | Guild | Write unit tests: ObservationStateChipComponent |
| 16 | DFE-016 | TODO | DFE-015 | Guild | Write unit tests: UncertaintyIndicatorComponent |
| 17 | DFE-017 | TODO | DFE-016 | Guild | Write unit tests: DeterminizationService |
| 18 | DFE-018 | TODO | DFE-017 | Guild | Write Storybook stories for all components |
| 19 | DFE-019 | TODO | DFE-018 | Guild | Add i18n translations for state labels |
| 20 | DFE-020 | TODO | DFE-019 | Guild | Implement dark mode styles |
| 21 | DFE-021 | TODO | DFE-020 | Guild | Add accessibility (ARIA) attributes |
| 22 | DFE-022 | TODO | DFE-021 | Guild | E2E tests: review queue workflow |
| 23 | DFE-023 | TODO | DFE-022 | Guild | Performance optimization: virtual scroll for large lists |
| 24 | DFE-024 | TODO | DFE-023 | Guild | Verify build with `ng build --configuration production` |
## Acceptance Criteria
1. "Unknown (auto-tracking)" chip displays correctly with review ETA
2. Uncertainty indicator shows tier and completeness percentage
3. Guardrails badge shows active guardrail count and details
4. Decay progress shows freshness and staleness warnings
5. Review queue lists pending observations with sorting
6. All components work in dark mode
7. ARIA attributes present for accessibility
8. Storybook stories document all component states
9. Unit tests achieve 80%+ coverage
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Standalone components | Tree-shakeable; modern Angular pattern |
| Material Design | Consistent with existing StellaOps UI |
| date-fns for formatting | Lighter than moment; tree-shakeable |
| Virtual scroll for queue | Performance with large observation counts |

| Risk | Mitigation |
|------|------------|
| API contract drift | TypeScript interfaces from OpenAPI spec |
| Performance with many observations | Pagination; virtual scroll; lazy loading |
| Localization complexity | i18n from day one; extract all strings |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from advisory gap analysis | Planning |
## Next Checkpoints
- 2026-01-15: DFE-001 to DFE-009 complete (core components)
- 2026-01-16: DFE-010 to DFE-014 complete (integration)
- 2026-01-17: DFE-015 to DFE-024 complete (tests, polish)


@@ -0,0 +1,990 @@
# Sprint 20260106_001_005_UNKNOWNS - Provenance Hint Enhancement
## Topic & Scope
Extend the Unknowns module with structured provenance hints that help explain **why** something is unknown and provide hypotheses for resolution, following the advisory's requirement for "provenance hints like: Build-ID match, import table fingerprint, section layout deltas."
- **Working directory:** `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/`
- **Evidence:** ProvenanceHint model, builders, integration with Unknown, tests
## Problem Statement
The product advisory requires:
> **Unknown tagging with provenance hints:**
> - ELF Build-ID / debuglink match; import table fingerprint; section layout deltas.
> - Attach hypotheses like: "Binary matches distro build-ID, likely backport."
Current state:
- `Unknown` model has `Context` as flexible `JsonDocument`
- No structured provenance hint types
- No confidence scoring for hints
- No hypothesis generation for resolution
**Gap:** Unknown.Context lacks structured provenance-specific fields. No way to express "we don't know what this is, but here's evidence that might help identify it."
## Dependencies & Concurrency
- **Depends on:** None (extends existing Unknowns module)
- **Blocks:** SPRINT_20260106_001_004_LB (orchestrator uses provenance hints)
- **Parallel safe:** Extends existing module; no conflicts
## Documentation Prerequisites
- docs/modules/unknowns/architecture.md
- src/Unknowns/AGENTS.md
- Existing Unknown model at `src/Unknowns/__Libraries/StellaOps.Unknowns.Core/Models/`
## Technical Design
### Provenance Hint Types
```csharp
namespace StellaOps.Unknowns.Core.Models;
/// <summary>
/// Classification of provenance hint types.
/// </summary>
public enum ProvenanceHintType
{
/// <summary>ELF/PE Build-ID match against known catalog.</summary>
BuildIdMatch,
/// <summary>Debug link (.gnu_debuglink) reference.</summary>
DebugLink,
/// <summary>Import table fingerprint comparison.</summary>
ImportTableFingerprint,
/// <summary>Export table fingerprint comparison.</summary>
ExportTableFingerprint,
/// <summary>Section layout similarity.</summary>
SectionLayout,
/// <summary>String table signature match.</summary>
StringTableSignature,
/// <summary>Compiler/linker identification.</summary>
CompilerSignature,
/// <summary>Package manager metadata (RPATH, NEEDED, etc.).</summary>
PackageMetadata,
/// <summary>Distro/vendor pattern match.</summary>
DistroPattern,
/// <summary>Version string extraction.</summary>
VersionString,
/// <summary>Symbol name pattern match.</summary>
SymbolPattern,
/// <summary>File path pattern match.</summary>
PathPattern,
/// <summary>Hash match against known corpus.</summary>
CorpusMatch,
/// <summary>SBOM cross-reference.</summary>
SbomCrossReference,
/// <summary>Advisory cross-reference.</summary>
AdvisoryCrossReference
}
/// <summary>
/// Confidence level for a provenance hint.
/// </summary>
public enum HintConfidence
{
/// <summary>Very high confidence (>= 0.9).</summary>
VeryHigh,
/// <summary>High confidence (0.7 - 0.9).</summary>
High,
/// <summary>Medium confidence (0.5 - 0.7).</summary>
Medium,
/// <summary>Low confidence (0.3 - 0.5).</summary>
Low,
/// <summary>Very low confidence (< 0.3).</summary>
VeryLow
}
```
### Provenance Hint Model
```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

namespace StellaOps.Unknowns.Core.Models;
/// <summary>
/// A provenance hint providing evidence about an unknown's identity.
/// </summary>
public sealed record ProvenanceHint
{
/// <summary>Unique hint ID (content-addressed).</summary>
[JsonPropertyName("hint_id")]
public required string HintId { get; init; }
/// <summary>Type of provenance hint.</summary>
[JsonPropertyName("type")]
public required ProvenanceHintType Type { get; init; }
/// <summary>Confidence score (0.0 - 1.0).</summary>
[JsonPropertyName("confidence")]
public required double Confidence { get; init; }
/// <summary>Confidence level classification.</summary>
[JsonPropertyName("confidence_level")]
public required HintConfidence ConfidenceLevel { get; init; }
/// <summary>Human-readable summary of the hint.</summary>
[JsonPropertyName("summary")]
public required string Summary { get; init; }
/// <summary>Hypothesis about the unknown's identity.</summary>
[JsonPropertyName("hypothesis")]
public required string Hypothesis { get; init; }
/// <summary>Type-specific evidence details.</summary>
[JsonPropertyName("evidence")]
public required ProvenanceEvidence Evidence { get; init; }
/// <summary>Suggested resolution actions.</summary>
[JsonPropertyName("suggested_actions")]
public required IReadOnlyList<SuggestedAction> SuggestedActions { get; init; }
/// <summary>When this hint was generated (UTC).</summary>
[JsonPropertyName("generated_at")]
public required DateTimeOffset GeneratedAt { get; init; }
/// <summary>Source of the hint (analyzer, corpus, etc.).</summary>
[JsonPropertyName("source")]
public required string Source { get; init; }
}
/// <summary>
/// Type-specific evidence for a provenance hint.
/// </summary>
public sealed record ProvenanceEvidence
{
/// <summary>Build-ID match details.</summary>
[JsonPropertyName("build_id")]
public BuildIdEvidence? BuildId { get; init; }
/// <summary>Debug link details.</summary>
[JsonPropertyName("debug_link")]
public DebugLinkEvidence? DebugLink { get; init; }
/// <summary>Import table fingerprint details.</summary>
[JsonPropertyName("import_fingerprint")]
public ImportFingerprintEvidence? ImportFingerprint { get; init; }
/// <summary>Export table fingerprint details.</summary>
[JsonPropertyName("export_fingerprint")]
public ExportFingerprintEvidence? ExportFingerprint { get; init; }
/// <summary>Section layout details.</summary>
[JsonPropertyName("section_layout")]
public SectionLayoutEvidence? SectionLayout { get; init; }
/// <summary>Compiler signature details.</summary>
[JsonPropertyName("compiler")]
public CompilerEvidence? Compiler { get; init; }
/// <summary>Distro pattern match details.</summary>
[JsonPropertyName("distro_pattern")]
public DistroPatternEvidence? DistroPattern { get; init; }
/// <summary>Version string extraction details.</summary>
[JsonPropertyName("version_string")]
public VersionStringEvidence? VersionString { get; init; }
/// <summary>Corpus match details.</summary>
[JsonPropertyName("corpus_match")]
public CorpusMatchEvidence? CorpusMatch { get; init; }
/// <summary>Raw evidence as JSON (for extensibility).</summary>
[JsonPropertyName("raw")]
public JsonDocument? Raw { get; init; }
}
/// <summary>Build-ID match evidence.</summary>
public sealed record BuildIdEvidence
{
[JsonPropertyName("build_id")]
public required string BuildId { get; init; }
[JsonPropertyName("build_id_type")]
public required string BuildIdType { get; init; }
[JsonPropertyName("matched_package")]
public string? MatchedPackage { get; init; }
[JsonPropertyName("matched_version")]
public string? MatchedVersion { get; init; }
[JsonPropertyName("matched_distro")]
public string? MatchedDistro { get; init; }
[JsonPropertyName("catalog_source")]
public string? CatalogSource { get; init; }
}
/// <summary>Debug link evidence.</summary>
public sealed record DebugLinkEvidence
{
[JsonPropertyName("debug_link")]
public required string DebugLink { get; init; }
[JsonPropertyName("crc32")]
public uint? Crc32 { get; init; }
[JsonPropertyName("debug_info_found")]
public bool DebugInfoFound { get; init; }
[JsonPropertyName("debug_info_path")]
public string? DebugInfoPath { get; init; }
}
/// <summary>Import table fingerprint evidence.</summary>
public sealed record ImportFingerprintEvidence
{
[JsonPropertyName("fingerprint")]
public required string Fingerprint { get; init; }
[JsonPropertyName("imported_libraries")]
public required IReadOnlyList<string> ImportedLibraries { get; init; }
[JsonPropertyName("import_count")]
public int ImportCount { get; init; }
[JsonPropertyName("matched_fingerprints")]
public IReadOnlyList<FingerprintMatch>? MatchedFingerprints { get; init; }
}
/// <summary>Export table fingerprint evidence.</summary>
public sealed record ExportFingerprintEvidence
{
[JsonPropertyName("fingerprint")]
public required string Fingerprint { get; init; }
[JsonPropertyName("export_count")]
public int ExportCount { get; init; }
[JsonPropertyName("notable_exports")]
public IReadOnlyList<string>? NotableExports { get; init; }
[JsonPropertyName("matched_fingerprints")]
public IReadOnlyList<FingerprintMatch>? MatchedFingerprints { get; init; }
}
/// <summary>Fingerprint match from corpus.</summary>
public sealed record FingerprintMatch
{
[JsonPropertyName("package")]
public required string Package { get; init; }
[JsonPropertyName("version")]
public required string Version { get; init; }
[JsonPropertyName("similarity")]
public required double Similarity { get; init; }
[JsonPropertyName("source")]
public required string Source { get; init; }
}
/// <summary>Section layout evidence.</summary>
public sealed record SectionLayoutEvidence
{
[JsonPropertyName("sections")]
public required IReadOnlyList<SectionInfo> Sections { get; init; }
[JsonPropertyName("layout_hash")]
public required string LayoutHash { get; init; }
[JsonPropertyName("matched_layouts")]
public IReadOnlyList<LayoutMatch>? MatchedLayouts { get; init; }
}
public sealed record SectionInfo
{
[JsonPropertyName("name")]
public required string Name { get; init; }
[JsonPropertyName("type")]
public required string Type { get; init; }
[JsonPropertyName("size")]
public ulong Size { get; init; }
[JsonPropertyName("flags")]
public string? Flags { get; init; }
}
public sealed record LayoutMatch
{
[JsonPropertyName("package")]
public required string Package { get; init; }
[JsonPropertyName("similarity")]
public required double Similarity { get; init; }
}
/// <summary>Compiler signature evidence.</summary>
public sealed record CompilerEvidence
{
[JsonPropertyName("compiler")]
public required string Compiler { get; init; }
[JsonPropertyName("version")]
public string? Version { get; init; }
[JsonPropertyName("flags")]
public IReadOnlyList<string>? Flags { get; init; }
[JsonPropertyName("detection_method")]
public required string DetectionMethod { get; init; }
}
/// <summary>Distro pattern match evidence.</summary>
public sealed record DistroPatternEvidence
{
[JsonPropertyName("distro")]
public required string Distro { get; init; }
[JsonPropertyName("release")]
public string? Release { get; init; }
[JsonPropertyName("pattern_type")]
public required string PatternType { get; init; }
[JsonPropertyName("matched_pattern")]
public required string MatchedPattern { get; init; }
[JsonPropertyName("examples")]
public IReadOnlyList<string>? Examples { get; init; }
}
/// <summary>Version string extraction evidence.</summary>
public sealed record VersionStringEvidence
{
[JsonPropertyName("version_strings")]
public required IReadOnlyList<ExtractedVersionString> VersionStrings { get; init; }
[JsonPropertyName("best_guess")]
public string? BestGuess { get; init; }
}
public sealed record ExtractedVersionString
{
[JsonPropertyName("value")]
public required string Value { get; init; }
[JsonPropertyName("location")]
public required string Location { get; init; }
[JsonPropertyName("confidence")]
public double Confidence { get; init; }
}
/// <summary>Corpus match evidence.</summary>
public sealed record CorpusMatchEvidence
{
[JsonPropertyName("corpus_name")]
public required string CorpusName { get; init; }
[JsonPropertyName("matched_entry")]
public required string MatchedEntry { get; init; }
[JsonPropertyName("match_type")]
public required string MatchType { get; init; }
[JsonPropertyName("similarity")]
public required double Similarity { get; init; }
[JsonPropertyName("metadata")]
public IReadOnlyDictionary<string, string>? Metadata { get; init; }
}
/// <summary>Suggested action for resolving the unknown.</summary>
public sealed record SuggestedAction
{
[JsonPropertyName("action")]
public required string Action { get; init; }
[JsonPropertyName("priority")]
public required int Priority { get; init; }
[JsonPropertyName("effort")]
public required string Effort { get; init; }
[JsonPropertyName("description")]
public required string Description { get; init; }
[JsonPropertyName("link")]
public string? Link { get; init; }
}
```
### Extended Unknown Model
```csharp
namespace StellaOps.Unknowns.Core.Models;
/// <summary>
/// Extended Unknown model with structured provenance hints.
/// </summary>
public sealed record Unknown
{
// ... existing fields ...
/// <summary>Structured provenance hints about this unknown.</summary>
public IReadOnlyList<ProvenanceHint> ProvenanceHints { get; init; } = [];
/// <summary>Best hypothesis based on hints (highest confidence).</summary>
public string? BestHypothesis { get; init; }
/// <summary>Combined confidence from all hints.</summary>
public double? CombinedConfidence { get; init; }
/// <summary>Primary suggested action (highest priority).</summary>
public string? PrimarySuggestedAction { get; init; }
}
```
### Provenance Hint Builder
```csharp
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.Logging;
using StellaOps.Unknowns.Core.Models;

namespace StellaOps.Unknowns.Core.Hints;
/// <summary>
/// Builds provenance hints from various evidence sources.
/// </summary>
public interface IProvenanceHintBuilder
{
/// <summary>Build hint from Build-ID match.</summary>
ProvenanceHint BuildFromBuildId(
string buildId,
string buildIdType,
BuildIdMatchResult? match);
/// <summary>Build hint from import table fingerprint.</summary>
ProvenanceHint BuildFromImportFingerprint(
string fingerprint,
IReadOnlyList<string> importedLibraries,
IReadOnlyList<FingerprintMatch>? matches);
/// <summary>Build hint from section layout.</summary>
ProvenanceHint BuildFromSectionLayout(
IReadOnlyList<SectionInfo> sections,
IReadOnlyList<LayoutMatch>? matches);
/// <summary>Build hint from distro pattern.</summary>
ProvenanceHint BuildFromDistroPattern(
string distro,
string? release,
string patternType,
string matchedPattern);
/// <summary>Build hint from version strings.</summary>
ProvenanceHint BuildFromVersionStrings(
IReadOnlyList<ExtractedVersionString> versionStrings);
/// <summary>Build hint from corpus match.</summary>
ProvenanceHint BuildFromCorpusMatch(
string corpusName,
string matchedEntry,
string matchType,
double similarity,
IReadOnlyDictionary<string, string>? metadata);
/// <summary>Combine multiple hints into a best hypothesis.</summary>
(string Hypothesis, double Confidence) CombineHints(
IReadOnlyList<ProvenanceHint> hints);
}
public sealed class ProvenanceHintBuilder : IProvenanceHintBuilder
{
private readonly TimeProvider _timeProvider;
private readonly ILogger<ProvenanceHintBuilder> _logger;
public ProvenanceHintBuilder(
TimeProvider timeProvider,
ILogger<ProvenanceHintBuilder> logger)
{
_timeProvider = timeProvider;
_logger = logger;
}
public ProvenanceHint BuildFromBuildId(
string buildId,
string buildIdType,
BuildIdMatchResult? match)
{
var confidence = match is not null ? 0.95 : 0.3;
var hypothesis = match is not null
? $"Binary matches {match.Package}@{match.Version} from {match.Distro}"
: $"Build-ID {buildId[..Math.Min(16, buildId.Length)]}... not found in catalog";
var suggestedActions = new List<SuggestedAction>();
if (match is not null)
{
suggestedActions.Add(new SuggestedAction
{
Action = "verify_package",
Priority = 1,
Effort = "low",
Description = $"Verify component is {match.Package}@{match.Version}",
Link = match.AdvisoryLink
});
}
else
{
suggestedActions.Add(new SuggestedAction
{
Action = "catalog_lookup",
Priority = 1,
Effort = "medium",
Description = "Search additional Build-ID catalogs",
Link = null
});
suggestedActions.Add(new SuggestedAction
{
Action = "manual_identification",
Priority = 2,
Effort = "high",
Description = "Manually identify binary using other methods",
Link = null
});
}
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.BuildIdMatch, buildId),
Type = ProvenanceHintType.BuildIdMatch,
Confidence = confidence,
ConfidenceLevel = MapConfidenceLevel(confidence),
Summary = $"Build-ID: {buildId[..Math.Min(16, buildId.Length)]}...",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
BuildId = new BuildIdEvidence
{
BuildId = buildId,
BuildIdType = buildIdType,
MatchedPackage = match?.Package,
MatchedVersion = match?.Version,
MatchedDistro = match?.Distro,
CatalogSource = match?.CatalogSource
}
},
SuggestedActions = suggestedActions,
GeneratedAt = _timeProvider.GetUtcNow(),
Source = "BuildIdAnalyzer"
};
}
public ProvenanceHint BuildFromImportFingerprint(
string fingerprint,
IReadOnlyList<string> importedLibraries,
IReadOnlyList<FingerprintMatch>? matches)
{
var bestMatch = matches?.OrderByDescending(m => m.Similarity).FirstOrDefault();
var confidence = bestMatch?.Similarity ?? 0.2;
var hypothesis = bestMatch is not null
? $"Import pattern matches {bestMatch.Package}@{bestMatch.Version} ({bestMatch.Similarity:P0} similar)"
: $"Import pattern not found in corpus (imports: {string.Join(", ", importedLibraries.Take(3))})";
var suggestedActions = new List<SuggestedAction>();
if (bestMatch is not null && bestMatch.Similarity >= 0.8)
{
suggestedActions.Add(new SuggestedAction
{
Action = "verify_import_match",
Priority = 1,
Effort = "low",
Description = $"Verify component is {bestMatch.Package}",
Link = null
});
}
else
{
suggestedActions.Add(new SuggestedAction
{
Action = "analyze_imports",
Priority = 1,
Effort = "medium",
Description = "Analyze imported libraries for identification",
Link = null
});
}
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.ImportTableFingerprint, fingerprint),
Type = ProvenanceHintType.ImportTableFingerprint,
Confidence = confidence,
ConfidenceLevel = MapConfidenceLevel(confidence),
Summary = $"Import fingerprint: {fingerprint[..Math.Min(16, fingerprint.Length)]}...",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
ImportFingerprint = new ImportFingerprintEvidence
{
Fingerprint = fingerprint,
ImportedLibraries = importedLibraries,
ImportCount = importedLibraries.Count,
MatchedFingerprints = matches
}
},
SuggestedActions = suggestedActions,
GeneratedAt = _timeProvider.GetUtcNow(),
Source = "ImportTableAnalyzer"
};
}
public ProvenanceHint BuildFromSectionLayout(
IReadOnlyList<SectionInfo> sections,
IReadOnlyList<LayoutMatch>? matches)
{
var layoutHash = ComputeLayoutHash(sections);
var bestMatch = matches?.OrderByDescending(m => m.Similarity).FirstOrDefault();
var confidence = bestMatch?.Similarity ?? 0.15;
var hypothesis = bestMatch is not null
? $"Section layout matches {bestMatch.Package} ({bestMatch.Similarity:P0} similar)"
: "Section layout not found in corpus";
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.SectionLayout, layoutHash),
Type = ProvenanceHintType.SectionLayout,
Confidence = confidence,
ConfidenceLevel = MapConfidenceLevel(confidence),
Summary = $"Section layout: {sections.Count} sections",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
SectionLayout = new SectionLayoutEvidence
{
Sections = sections,
LayoutHash = layoutHash,
MatchedLayouts = matches
}
},
SuggestedActions =
[
new SuggestedAction
{
Action = "section_analysis",
Priority = 2,
Effort = "high",
Description = "Detailed section analysis required",
Link = null
}
],
GeneratedAt = _timeProvider.GetUtcNow(),
Source = "SectionLayoutAnalyzer"
};
}
public ProvenanceHint BuildFromDistroPattern(
string distro,
string? release,
string patternType,
string matchedPattern)
{
var confidence = 0.7;
var hypothesis = release is not null
? $"Binary appears to be from {distro} {release}"
: $"Binary appears to be from {distro}";
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.DistroPattern, $"{distro}:{matchedPattern}"),
Type = ProvenanceHintType.DistroPattern,
Confidence = confidence,
ConfidenceLevel = MapConfidenceLevel(confidence),
Summary = $"Distro pattern: {distro}",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
DistroPattern = new DistroPatternEvidence
{
Distro = distro,
Release = release,
PatternType = patternType,
MatchedPattern = matchedPattern
}
},
SuggestedActions =
[
new SuggestedAction
{
Action = "distro_package_lookup",
Priority = 1,
Effort = "low",
Description = $"Search {distro} package repositories",
Link = GetDistroPackageSearchUrl(distro)
}
],
GeneratedAt = _timeProvider.GetUtcNow(),
Source = "DistroPatternAnalyzer"
};
}
public ProvenanceHint BuildFromVersionStrings(
IReadOnlyList<ExtractedVersionString> versionStrings)
{
var bestGuess = versionStrings
.OrderByDescending(v => v.Confidence)
.FirstOrDefault();
var confidence = bestGuess?.Confidence ?? 0.3;
var hypothesis = bestGuess is not null
? $"Version appears to be {bestGuess.Value}"
: "No clear version string found";
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.VersionString,
string.Join(",", versionStrings.Select(v => v.Value))),
Type = ProvenanceHintType.VersionString,
Confidence = confidence,
ConfidenceLevel = MapConfidenceLevel(confidence),
Summary = $"Found {versionStrings.Count} version string(s)",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
VersionString = new VersionStringEvidence
{
VersionStrings = versionStrings,
BestGuess = bestGuess?.Value
}
},
SuggestedActions =
[
new SuggestedAction
{
Action = "version_verification",
Priority = 1,
Effort = "low",
Description = "Verify extracted version against known releases",
Link = null
}
],
GeneratedAt = _timeProvider.GetUtcNow(),
Source = "VersionStringExtractor"
};
}
public ProvenanceHint BuildFromCorpusMatch(
string corpusName,
string matchedEntry,
string matchType,
double similarity,
IReadOnlyDictionary<string, string>? metadata)
{
var hypothesis = similarity >= 0.9
? $"High confidence match: {matchedEntry}"
: $"Possible match: {matchedEntry} ({similarity:P0} similar)";
return new ProvenanceHint
{
HintId = ComputeHintId(ProvenanceHintType.CorpusMatch, $"{corpusName}:{matchedEntry}"),
Type = ProvenanceHintType.CorpusMatch,
Confidence = similarity,
ConfidenceLevel = MapConfidenceLevel(similarity),
Summary = $"Corpus match: {matchedEntry}",
Hypothesis = hypothesis,
Evidence = new ProvenanceEvidence
{
CorpusMatch = new CorpusMatchEvidence
{
CorpusName = corpusName,
MatchedEntry = matchedEntry,
MatchType = matchType,
Similarity = similarity,
Metadata = metadata
}
},
SuggestedActions =
[
new SuggestedAction
{
Action = "verify_corpus_match",
Priority = 1,
Effort = "low",
Description = $"Verify match against {corpusName}",
Link = null
}
],
GeneratedAt = _timeProvider.GetUtcNow(),
Source = $"{corpusName}Matcher"
};
}
public (string Hypothesis, double Confidence) CombineHints(
IReadOnlyList<ProvenanceHint> hints)
{
if (hints.Count == 0)
{
return ("No provenance hints available", 0.0);
}
// Sort by confidence descending
var sorted = hints.OrderByDescending(h => h.Confidence).ToList();
// Best single hypothesis
var bestHint = sorted[0];
// If we have multiple high-confidence hints that agree, boost confidence
var agreeing = sorted
.Where(h => h.Confidence >= 0.5)
.GroupBy(h => ExtractPackageFromHypothesis(h.Hypothesis))
.OrderByDescending(g => g.Count())
.FirstOrDefault();
if (agreeing is not null && agreeing.Count() >= 2)
{
// Multiple hints agree - combine confidence
var combinedConfidence = Math.Min(0.99,
agreeing.Max(h => h.Confidence) + (agreeing.Count() - 1) * 0.1);
return (
$"{agreeing.Key} (confirmed by {agreeing.Count()} evidence sources)",
Math.Round(combinedConfidence, 4)
);
}
return (bestHint.Hypothesis, Math.Round(bestHint.Confidence, 4));
}
private static string ComputeHintId(ProvenanceHintType type, string evidence)
{
var input = $"{type}:{evidence}";
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
return $"hint:sha256:{Convert.ToHexString(hash).ToLowerInvariant()[..24]}";
}
private static HintConfidence MapConfidenceLevel(double confidence)
{
return confidence switch
{
>= 0.9 => HintConfidence.VeryHigh,
>= 0.7 => HintConfidence.High,
>= 0.5 => HintConfidence.Medium,
>= 0.3 => HintConfidence.Low,
_ => HintConfidence.VeryLow
};
}
private static string ComputeLayoutHash(IReadOnlyList<SectionInfo> sections)
{
var normalized = string.Join("|",
sections.OrderBy(s => s.Name).Select(s => $"{s.Name}:{s.Type}:{s.Size}"));
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
return Convert.ToHexString(hash).ToLowerInvariant()[..16];
}
private static string? GetDistroPackageSearchUrl(string distro)
{
return distro.ToLowerInvariant() switch
{
"debian" => "https://packages.debian.org/search",
"ubuntu" => "https://packages.ubuntu.com/",
"rhel" or "centos" => "https://access.redhat.com/downloads",
"alpine" => "https://pkgs.alpinelinux.org/packages",
_ => null
};
}
private static string ExtractPackageFromHypothesis(string hypothesis)
{
// Simple extraction - could be more sophisticated
var match = Regex.Match(hypothesis, @"matches?\s+(\S+)");
return match.Success ? match.Groups[1].Value : hypothesis;
}
}
public sealed record BuildIdMatchResult
{
public required string Package { get; init; }
public required string Version { get; init; }
public required string Distro { get; init; }
public string? CatalogSource { get; init; }
public string? AdvisoryLink { get; init; }
}
```
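To show how the pieces fit together, here is a hedged usage sketch; `Enrich`, its parameters, and the single-hint input are illustrative, not part of the planned API:

```csharp
using System.Collections.Generic;
using System.Linq;
using StellaOps.Unknowns.Core.Hints;
using StellaOps.Unknowns.Core.Models;

public static class ProvenanceHintUsageExample
{
    public static Unknown Enrich(
        Unknown unknown,
        IProvenanceHintBuilder builder,
        string buildId,
        BuildIdMatchResult? buildIdMatch)
    {
        // Build hints from whatever evidence the analyzers produced.
        var hints = new List<ProvenanceHint>
        {
            builder.BuildFromBuildId(buildId, "gnu", buildIdMatch)
        };

        // Collapse all hints into a single best hypothesis and confidence.
        var (hypothesis, confidence) = builder.CombineHints(hints);

        return unknown with
        {
            ProvenanceHints = hints,
            BestHypothesis = hypothesis,
            CombinedConfidence = confidence,
            PrimarySuggestedAction = hints
                .SelectMany(h => h.SuggestedActions)
                .OrderBy(a => a.Priority)
                .FirstOrDefault()?.Description
        };
    }
}
```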
## Delivery Tracker
| # | Task ID | Status | Dependency | Owner | Task Definition |
|---|---------|--------|------------|-------|-----------------|
| 1 | PH-001 | TODO | - | - | Define `ProvenanceHintType` enum (15+ types) |
| 2 | PH-002 | TODO | PH-001 | - | Define `HintConfidence` enum |
| 3 | PH-003 | TODO | PH-002 | - | Define `ProvenanceHint` record |
| 4 | PH-004 | TODO | PH-003 | - | Define `ProvenanceEvidence` and sub-records |
| 5 | PH-005 | TODO | PH-004 | - | Define evidence records: BuildId, DebugLink |
| 6 | PH-006 | TODO | PH-005 | - | Define evidence records: ImportFingerprint, ExportFingerprint |
| 7 | PH-007 | TODO | PH-006 | - | Define evidence records: SectionLayout, Compiler |
| 8 | PH-008 | TODO | PH-007 | - | Define evidence records: DistroPattern, VersionString |
| 9 | PH-009 | TODO | PH-008 | - | Define evidence records: CorpusMatch |
| 10 | PH-010 | TODO | PH-009 | - | Define `SuggestedAction` record |
| 11 | PH-011 | TODO | PH-010 | - | Extend `Unknown` model with `ProvenanceHints` |
| 12 | PH-012 | TODO | PH-011 | - | Define `IProvenanceHintBuilder` interface |
| 13 | PH-013 | TODO | PH-012 | - | Implement `BuildFromBuildId()` |
| 14 | PH-014 | TODO | PH-013 | - | Implement `BuildFromImportFingerprint()` |
| 15 | PH-015 | TODO | PH-014 | - | Implement `BuildFromSectionLayout()` |
| 16 | PH-016 | TODO | PH-015 | - | Implement `BuildFromDistroPattern()` |
| 17 | PH-017 | TODO | PH-016 | - | Implement `BuildFromVersionStrings()` |
| 18 | PH-018 | TODO | PH-017 | - | Implement `BuildFromCorpusMatch()` |
| 19 | PH-019 | TODO | PH-018 | - | Implement `CombineHints()` for best hypothesis |
| 20 | PH-020 | TODO | PH-019 | - | Add service registration extensions |
| 21 | PH-021 | TODO | PH-020 | - | Update Unknown repository to persist hints |
| 22 | PH-022 | TODO | PH-021 | - | Add database migration for provenance_hints table |
| 23 | PH-023 | TODO | PH-022 | - | Write unit tests: hint builders (all types) |
| 24 | PH-024 | TODO | PH-023 | - | Write unit tests: hint combination |
| 25 | PH-025 | TODO | PH-024 | - | Write golden fixture tests for hint serialization |
| 26 | PH-026 | TODO | PH-025 | - | Add JSON schema for ProvenanceHint |
| 27 | PH-027 | TODO | PH-026 | - | Document in docs/modules/unknowns/ |
| 28 | PH-028 | TODO | PH-027 | - | Expose hints via Unknowns.WebService API |
## Acceptance Criteria
1. **Completeness:** All 15 hint types have dedicated evidence records
2. **Confidence Scoring:** All hints have confidence scores (0-1) and levels
3. **Hypothesis Generation:** Each hint produces a human-readable hypothesis
4. **Suggested Actions:** Each hint includes prioritized resolution actions
5. **Combination:** Multiple hints can be combined for best hypothesis
6. **Persistence:** Hints are stored with unknowns in database
7. **Test Coverage:** Unit tests for all builders, golden fixtures for serialization
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| 15+ hint types | Covers common provenance evidence per advisory |
| Content-addressed IDs | Enables deduplication of identical hints |
| Confidence levels | Both numeric and categorical for different use cases |
| Suggested actions | Actionable output for resolution workflow |

| Risk | Mitigation |
|------|------------|
| Low-quality hints | Confidence thresholds; manual review for low confidence |
| Hint explosion | Aggregate/dedupe hints by type |
| Corpus dependency | Graceful degradation without corpus matches |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2026-01-06 | Sprint created from product advisory gap analysis | Planning |


@@ -0,0 +1,168 @@
# Sprint Series 20260106_003 - Verifiable Software Supply Chain Pipeline
## Executive Summary
This sprint series completes the "quiet, verifiable software supply chain pipeline" as outlined in the product advisory. While StellaOps already implements ~85% of the advisory requirements, this series addresses the remaining gaps to deliver a fully integrated, production-ready pipeline from SBOMs to signed evidence bundles.
## Problem Statement
The product advisory outlines a complete software supply chain pipeline with:
- Deterministic per-layer SBOMs with normalization
- VEX-first gating to reduce noise before triage
- DSSE/in-toto attestations for everything
- Traceable event flow with breadcrumbs
- Portable evidence bundles for audits
**Current State Analysis:**
| Capability | Status | Gap |
|------------|--------|-----|
| Deterministic SBOMs | 95% | Per-layer files not exposed, Composition Recipe API missing |
| VEX-first gating | 75% | No explicit "gate" service that blocks/warns before triage |
| DSSE attestations | 90% | Per-layer attestations missing, cross-attestation linking missing |
| Evidence bundles | 85% | No standardized export format with verify commands |
| Event flow | 90% | Router idempotency enforcement not formalized |
## Solution Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ Verifiable Supply Chain Pipeline │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Scanner │───▶│ VEX Gate │───▶│ Attestor │───▶│ Evidence │ │
│ │ (Per-layer │ │ (Verdict + │ │ (Chain │ │ Locker │ │
│ │ SBOMs) │ │ Rationale) │ │ Linking) │ │ (Bundle) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Router (Event Flow) │ │
│ │ - Idempotent keys (artifact digest + stage) │ │
│ │ - Trace records at each hop │ │
│ │ - Timeline queryable by artifact digest │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Evidence Bundle │ │
│ │ Export │ │
│ │ (zip + verify) │ │
│ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
## Sprint Breakdown
| Sprint | Module | Scope | Dependencies |
|--------|--------|-------|--------------|
| [003_001](SPRINT_20260106_003_001_SCANNER_perlayer_sbom_api.md) | Scanner | Per-layer SBOM export + Composition Recipe API | None |
| [003_002](SPRINT_20260106_003_002_SCANNER_vex_gate_service.md) | Scanner/Excititor | VEX-first gating service integration | 003_001 |
| [003_003](SPRINT_20260106_003_003_EVIDENCE_export_bundle.md) | EvidenceLocker | Standardized export with verify commands | 003_001 |
| [003_004](SPRINT_20260106_003_004_ATTESTOR_chain_linking.md) | Attestor | Cross-attestation linking + per-layer attestations | 003_001, 003_002 |
## Dependency Graph
```
┌──────────────────────────────┐
│ SPRINT_20260106_003_001 │
│ Per-layer SBOM + Recipe API │
└──────────────┬───────────────┘
┌──────────────────────┼──────────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│ SPRINT_003_002 │ │ SPRINT_003_003 │ │ │
│ VEX Gate Service │ │ Evidence Export │ │ │
└────────┬──────────┘ └───────────────────┘ │ │
│ │ │
└─────────────────────────────────────┘ │
│ │
▼ │
┌───────────────────┐ │
│ SPRINT_003_004 │◀────────────────────────────┘
│ Cross-Attestation │
│ Linking │
└───────────────────┘
Production Rollout
```
## Key Deliverables
### Sprint 003_001: Per-layer SBOM & Composition Recipe API
- Per-layer CycloneDX/SPDX files stored separately in CAS
- `GET /scans/{id}/layers/{digest}/sbom` API endpoint
- `GET /scans/{id}/composition-recipe` API endpoint
- Deterministic layer ordering with Merkle root in recipe
- CLI: `stella scan sbom --layer <digest> --format cdx|spdx`
### Sprint 003_002: VEX Gate Service
- `IVexGateService` interface with gate decisions: `PASS`, `WARN`, `BLOCK`
- Pre-triage filtering that reduces noise
- Evidence tracking for each gate decision
- Integration with Excititor VEX observations
- Configurable gate policies (exploitable+reachable+no-control = BLOCK)
### Sprint 003_003: Evidence Bundle Export
- Standardized export format: `evidence-bundle-<id>.tar.gz`
- Contents: SBOMs, VEX statements, attestations, public keys, README
- `verify.sh` script embedded in bundle
- `stella evidence export --bundle <id> --output ./audit-bundle.tar.gz`
- Offline verification support
### Sprint 003_004: Cross-Attestation Linking
- SBOM attestation links to VEX attestation via subject reference
- Policy verdict attestation links to both
- Per-layer attestations with layer-specific subjects
- `GET /attestations?artifact=<digest>&chain=true` for full chain retrieval
## Acceptance Criteria (Series)
1. **Determinism**: Same inputs produce identical SBOMs, recipes, and attestation hashes
2. **Traceability**: Any artifact can be traced through the full pipeline via digest
3. **Verifiability**: Evidence bundles can be verified offline without network access
4. **Completeness**: All artifacts (SBOMs, VEX, verdicts, attestations) are included in bundles
5. **Integration**: VEX gate reduces triage noise by at least 50% (measured via test corpus)
## Risk Assessment
| Risk | Impact | Mitigation |
|------|--------|------------|
| Per-layer SBOMs increase storage | Medium | Content-addressable deduplication, TTL for stale layers |
| VEX gate false positives | High | Conservative defaults, policy override mechanism |
| Cross-attestation circular deps | Low | DAG validation at creation time |
| Export bundle size | Medium | Compression, selective export by date range |
## Testing Strategy
- **Unit tests**: Each service with determinism verification
- **Integration tests**: Full pipeline from scan to export
- **Replay tests**: Identical inputs produce identical outputs
- **Corpus tests**: Advisory test corpus for VEX gate accuracy
- **E2E tests**: Air-gapped verification of exported bundles
## Documentation Updates Required
- `docs/modules/scanner/architecture.md` - Per-layer SBOM section
- `docs/modules/evidence-locker/architecture.md` - Export bundle format
- `docs/modules/attestor/architecture.md` - Cross-attestation linking
- `docs/API_CLI_REFERENCE.md` - New endpoints and commands
- `docs/OFFLINE_KIT.md` - Evidence bundle verification
## Related Work
- SPRINT_20260105_002_* (HLC) - Required for timestamp ordering in attestation chains
- SPRINT_20251229_001_002_BE_vex_delta - VEX delta foundation
- Epic 10 (Export Center) - Bundle export workflows
- Epic 19 (Attestor Console) - Attestation verification UI
## Execution Notes
- All changes must maintain backward compatibility
- Feature flags for gradual rollout recommended
- Cross-module changes require coordinated deployment
- CLI commands should support both new and legacy formats during transition


@@ -0,0 +1,230 @@
# SPRINT_20260106_003_001_SCANNER_perlayer_sbom_api
## Sprint Metadata
| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_001 |
| Module | SCANNER |
| Title | Per-layer SBOM Export & Composition Recipe API |
| Working Directory | `src/Scanner/` |
| Dependencies | None |
| Blocking | 003_002, 003_003, 003_004 |
## Objective
Expose per-layer SBOMs as first-class artifacts and add a Composition Recipe API that enables downstream verification of SBOM determinism. This completes Step 1 of the product advisory: "Deterministic SBOMs (per layer, per build)".
## Context
**Current State:**
- `LayerComponentFragment` model tracks components per layer internally
- SBOM composition aggregates fragments into single image-level SBOM
- Composition recipe stored in CAS but not exposed via API
- No mechanism to retrieve SBOM for a specific layer
**Target State:**
- Per-layer SBOMs stored as individual CAS artifacts
- API endpoints to retrieve layer-specific SBOMs
- Composition Recipe API for determinism verification
- CLI support for per-layer SBOM export
## Tasks
### Phase 1: Per-layer SBOM Generation (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Create `ILayerSbomWriter` interface | TODO | `src/Scanner/__Libraries/StellaOps.Scanner.Emit/` |
| T002 | Implement `CycloneDxLayerWriter` for per-layer CDX | TODO | Extends existing writer |
| T003 | Implement `SpdxLayerWriter` for per-layer SPDX | TODO | Extends existing writer |
| T004 | Update `SbomCompositionEngine` to emit layer SBOMs | TODO | Store in CAS with layer digest key |
| T005 | Add layer SBOM paths to `SbomCompositionResult` | TODO | `LayerSboms: ImmutableDictionary<string, SbomRef>` |
| T006 | Unit tests for per-layer SBOM generation | TODO | Determinism tests required |
### Phase 2: Composition Recipe API (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T007 | Define `CompositionRecipeResponse` contract | TODO | Include Merkle root, fragment order, digests |
| T008 | Add `GET /scans/{id}/composition-recipe` endpoint | TODO | Scanner.WebService |
| T009 | Implement `ICompositionRecipeService` | TODO | Retrieves and validates recipe from CAS |
| T010 | Add recipe verification logic | TODO | Verify Merkle root matches layer digests |
| T011 | Integration tests for composition recipe API | TODO | Round-trip determinism verification |
### Phase 3: Per-layer SBOM API (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T012 | Add `GET /scans/{id}/layers` endpoint | TODO | List layers with SBOM availability |
| T013 | Add `GET /scans/{id}/layers/{digest}/sbom` endpoint | TODO | Format param: `cdx`, `spdx` |
| T014 | Add content negotiation for SBOM format | TODO | Accept header support |
| T015 | Implement caching headers for layer SBOMs | TODO | ETag based on content hash |
| T016 | Integration tests for layer SBOM API | TODO | |
### Phase 4: CLI Commands (4 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T017 | Add `stella scan sbom --layer <digest>` command | TODO | `src/Cli/StellaOps.Cli/` |
| T018 | Add `stella scan recipe` command | TODO | Output composition recipe |
| T019 | Add `--verify` flag to recipe command | TODO | Verify recipe against stored SBOMs |
| T020 | CLI integration tests | TODO | |
## Contracts
### CompositionRecipeResponse
```json
{
"scanId": "scan-abc123",
"imageDigest": "sha256:abcdef...",
"createdAt": "2026-01-06T10:30:00.000000Z",
"recipe": {
"version": "1.0.0",
"generatorName": "StellaOps.Scanner",
"generatorVersion": "2026.04",
"layers": [
{
"digest": "sha256:layer1...",
"order": 0,
"fragmentDigest": "sha256:frag1...",
"sbomDigests": {
"cyclonedx": "sha256:cdx1...",
"spdx": "sha256:spdx1..."
},
"componentCount": 42
}
],
"merkleRoot": "sha256:merkle...",
"aggregatedSbomDigests": {
"cyclonedx": "sha256:finalcdx...",
"spdx": "sha256:finalspdx..."
}
}
}
```
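The `merkleRoot` above is what T010 recomputes during verification. A minimal sketch follows, assuming RFC 6962 hashing over the ordered per-layer `fragmentDigest` values with raw digest bytes as leaves; the exact leaf encoding is not fixed by this sprint and is an assumption here.
```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class RecipeMerkle
{
    // RFC 6962 Merkle tree hash over ordered leaves (leaf = raw digest bytes).
    public static byte[] ComputeRoot(IReadOnlyList<byte[]> leaves) => Mth(leaves, 0, leaves.Count);

    private static byte[] Mth(IReadOnlyList<byte[]> leaves, int start, int count)
    {
        if (count == 0)
        {
            return SHA256.HashData(ReadOnlySpan<byte>.Empty); // MTH({})
        }
        if (count == 1)
        {
            var leaf = new byte[1 + leaves[start].Length];
            leaf[0] = 0x00; // leaf prefix
            leaves[start].CopyTo(leaf, 1);
            return SHA256.HashData(leaf);
        }

        var k = 1;
        while (k * 2 < count)
        {
            k *= 2; // largest power of two strictly below count
        }

        var left = Mth(leaves, start, k);
        var right = Mth(leaves, start + k, count - k);
        var node = new byte[1 + left.Length + right.Length];
        node[0] = 0x01; // interior prefix
        left.CopyTo(node, 1);
        right.CopyTo(node, 1 + left.Length);
        return SHA256.HashData(node);
    }
}
```
Verification then lowercases `Convert.ToHexString(root)`, prefixes `sha256:`, and compares against `recipe.merkleRoot`.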
### LayerSbomRef
```csharp
public sealed record LayerSbomRef
{
public required string LayerDigest { get; init; }
public required int Order { get; init; }
public required string FragmentDigest { get; init; }
public required string CycloneDxDigest { get; init; }
public required string CycloneDxCasUri { get; init; }
public required string SpdxDigest { get; init; }
public required string SpdxCasUri { get; init; }
public required int ComponentCount { get; init; }
}
```
## API Endpoints
### GET /api/v1/scans/{scanId}/layers
```
Response 200:
{
"scanId": "...",
"imageDigest": "sha256:...",
"layers": [
{
"digest": "sha256:layer1...",
"order": 0,
"hasSbom": true,
"componentCount": 42
}
]
}
```
### GET /api/v1/scans/{scanId}/layers/{layerDigest}/sbom
```
Query params:
- format: "cdx" | "spdx" (default: "cdx")
Response 200: SBOM content (application/json)
Headers:
- ETag: "<content-digest>"
- X-StellaOps-Layer-Digest: "sha256:..."
- X-StellaOps-Format: "cyclonedx-1.7"
```
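A hedged minimal-API sketch of how T013-T015 could wire these headers; `ILayerSbomStore` and `LayerSbom` are hypothetical names, and the real handler belongs in Scanner.WebService.
```csharp
// Hypothetical store abstraction for this sketch.
public interface ILayerSbomStore
{
    Task<LayerSbom?> GetAsync(string scanId, string layerDigest, string format, CancellationToken ct);
}

public sealed record LayerSbom(byte[] Content, string ContentDigest, string FormatLabel);

// Inside a minimal-API Program.cs: the ETag is the content digest, so layer
// SBOMs (immutable once created) revalidate for free.
app.MapGet("/api/v1/scans/{scanId}/layers/{layerDigest}/sbom",
    async (string scanId, string layerDigest, string? format,
           ILayerSbomStore store, HttpContext http, CancellationToken ct) =>
    {
        var sbom = await store.GetAsync(scanId, layerDigest, format ?? "cdx", ct);
        if (sbom is null)
        {
            return Results.NotFound();
        }

        var etag = $"\"{sbom.ContentDigest}\"";
        if (http.Request.Headers.IfNoneMatch.ToString() == etag)
        {
            return Results.StatusCode(StatusCodes.Status304NotModified);
        }

        http.Response.Headers.ETag = etag;
        http.Response.Headers["X-StellaOps-Layer-Digest"] = layerDigest;
        http.Response.Headers["X-StellaOps-Format"] = sbom.FormatLabel;
        return Results.Bytes(sbom.Content, "application/json");
    });
```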
### GET /api/v1/scans/{scanId}/composition-recipe
```
Response 200: CompositionRecipeResponse (application/json)
```
## CLI Commands
```bash
# List layers with SBOM info
stella scan layers <scan-id>
# Get per-layer SBOM
stella scan sbom <scan-id> --layer sha256:abc123 --format cdx --output layer.cdx.json
# Get composition recipe
stella scan recipe <scan-id> --output recipe.json
# Verify composition recipe against stored SBOMs
stella scan recipe <scan-id> --verify
```
## Storage Schema
Per-layer SBOMs stored in CAS with paths:
```
/evidence/sboms/<image-digest>/layers/<layer-digest>.cdx.json
/evidence/sboms/<image-digest>/layers/<layer-digest>.spdx.json
/evidence/sboms/<image-digest>/recipe.json
```
## Acceptance Criteria
1. **Determinism**: Same image scan produces identical per-layer SBOMs
2. **Completeness**: Every layer in the image has a corresponding SBOM
3. **Verifiability**: Composition recipe Merkle root matches layer SBOM digests
4. **Performance**: Per-layer SBOM retrieval < 100ms (cached)
5. **Backward Compatibility**: Existing SBOM APIs continue to work unchanged
## Test Cases
### Unit Tests
- `LayerSbomWriter` produces deterministic output for identical fragments
- Composition recipe Merkle root computation is RFC 6962 compliant
- Layer ordering is stable (sorted by layer order, not discovery order)
### Integration Tests
- Full scan produces per-layer SBOMs stored in CAS
- API returns correct layer SBOM by digest
- Recipe verification passes for valid scans
- Recipe verification fails for tampered SBOMs
### Determinism Tests
- Two scans of identical images produce identical per-layer SBOM digests
- Composition recipe is identical across runs
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Store per-layer SBOMs in CAS | Content-addressable deduplication handles shared layers |
| Use layer digest as key | Deterministic, unique per layer content |
| Include both CDX and SPDX per layer | Supports customer format preferences |

| Risk | Mitigation |
|------|------------|
| Storage growth with many layers | TTL-based cleanup for orphaned layer SBOMs |
| Cache invalidation complexity | Layer SBOMs are immutable once created |
## Execution Log
| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |


@@ -0,0 +1,310 @@
# SPRINT_20260106_003_002_SCANNER_vex_gate_service
## Sprint Metadata
| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_002 |
| Module | SCANNER/EXCITITOR |
| Title | VEX-first Gating Service |
| Working Directory | `src/Scanner/`, `src/Excititor/` |
| Dependencies | SPRINT_20260106_003_001 |
| Blocking | SPRINT_20260106_003_004 |
## Objective
Implement a VEX-first gating service that filters vulnerability findings before triage, reducing noise by applying VEX statements and configurable policies. This completes Step 2 of the product advisory: "VEX-first gating (reduce noise before triage)".
## Context
**Current State:**
- Excititor ingests VEX statements and stores as immutable observations
- VexLens computes consensus across weighted statements
- Scanner produces findings without pre-filtering
- No explicit "gate" decision before findings reach triage queue
**Target State:**
- `IVexGateService` applies VEX evidence before triage
- Gate decisions: `PASS` (proceed), `WARN` (proceed with flag), `BLOCK` (requires attention)
- Evidence tracking for each gate decision
- Configurable gate policies per tenant
## Tasks
### Phase 1: VEX Gate Core Service (8 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define `VexGateDecision` enum: `Pass`, `Warn`, `Block` | TODO | `src/Scanner/__Libraries/StellaOps.Scanner.Gate/` |
| T002 | Define `VexGateResult` model with evidence | TODO | Include rationale, contributing statements |
| T003 | Define `IVexGateService` interface | TODO | `EvaluateAsync(Finding, CancellationToken)` |
| T004 | Implement `VexGateService` core logic | TODO | Integrates with VexLens consensus |
| T005 | Create `VexGatePolicy` configuration model | TODO | Rules for PASS/WARN/BLOCK decisions |
| T006 | Implement default policy rules | TODO | Per advisory: exploitable+reachable+no-control=BLOCK |
| T007 | Add `IVexGatePolicy` interface | TODO | Pluggable policy evaluation |
| T008 | Unit tests for VexGateService | TODO | |
### Phase 2: Excititor Integration (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T009 | Add `IVexObservationQuery` for gate lookups | TODO | `src/Excititor/__Libraries/` |
| T010 | Implement efficient CVE+PURL batch lookup | TODO | Optimize for gate throughput |
| T011 | Add VEX statement caching for gate operations | TODO | Short TTL, bounded cache |
| T012 | Create `VexGateExcititorAdapter` | TODO | Bridges Scanner → Excititor |
| T013 | Integration tests for Excititor lookups | TODO | |
| T014 | Performance benchmarks for batch evaluation | TODO | Target: 1000 findings/sec |
### Phase 3: Scanner Worker Integration (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T015 | Add VEX gate stage to scan pipeline | TODO | After findings, before triage emit |
| T016 | Update `ScanResult` with gate decisions | TODO | `GatedFindings: ImmutableArray<GatedFinding>` |
| T017 | Add gate metrics to `ScanMetricsCollector` | TODO | pass/warn/block counts |
| T018 | Implement gate bypass for emergency scans | TODO | Feature flag or scan option |
| T019 | Integration tests for gated scan pipeline | TODO | |
### Phase 4: Gate Evidence & API (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Define `GateEvidence` model | TODO | Statement refs, policy rule matched |
| T021 | Add `GET /scans/{id}/gate-results` endpoint | TODO | Scanner.WebService |
| T022 | Add gate evidence to SBOM findings metadata | TODO | Link to VEX statements |
| T023 | Implement gate decision audit logging | TODO | For compliance |
| T024 | Add gate summary to scan completion event | TODO | Router notification |
| T025 | API integration tests | TODO | |
### Phase 5: CLI & Configuration (4 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Add `stella scan gate-policy show` command | TODO | Display current policy |
| T027 | Add `stella scan gate-results <scan-id>` command | TODO | Show gate decisions |
| T028 | Add gate policy to tenant configuration | TODO | `etc/scanner.yaml` |
| T029 | CLI integration tests | TODO | |
## Contracts
### VexGateDecision
```csharp
public enum VexGateDecision
{
Pass, // Finding cleared by VEX evidence - no action needed
Warn, // Finding has partial evidence - proceed with caution
Block // Finding requires attention - exploitable and reachable
}
```
### VexGateResult
```csharp
public sealed record VexGateResult
{
public required VexGateDecision Decision { get; init; }
public required string Rationale { get; init; }
public required string PolicyRuleMatched { get; init; }
public required ImmutableArray<VexStatementRef> ContributingStatements { get; init; }
public required VexGateEvidence Evidence { get; init; }
public required DateTimeOffset EvaluatedAt { get; init; }
}
public sealed record VexGateEvidence
{
public required VexStatus? VendorStatus { get; init; }
public required VexJustificationType? Justification { get; init; }
public required bool IsReachable { get; init; }
public required bool HasCompensatingControl { get; init; }
public required double ConfidenceScore { get; init; }
public required ImmutableArray<string> BackportHints { get; init; }
}
public sealed record VexStatementRef
{
public required string StatementId { get; init; }
public required string IssuerId { get; init; }
public required VexStatus Status { get; init; }
public required DateTimeOffset Timestamp { get; init; }
}
```
### VexGatePolicy
```csharp
public sealed record VexGatePolicy
{
public required ImmutableArray<VexGatePolicyRule> Rules { get; init; }
public required VexGateDecision DefaultDecision { get; init; }
}
public sealed record VexGatePolicyRule
{
public required string RuleId { get; init; }
public required VexGatePolicyCondition Condition { get; init; }
public required VexGateDecision Decision { get; init; }
public required int Priority { get; init; }
}
public sealed record VexGatePolicyCondition
{
public VexStatus? VendorStatus { get; init; }
public bool? IsExploitable { get; init; }
public bool? IsReachable { get; init; }
public bool? HasCompensatingControl { get; init; }
public string[]? SeverityLevels { get; init; }
}
```
### GatedFinding
```csharp
public sealed record GatedFinding
{
public required FindingRef Finding { get; init; }
public required VexGateResult GateResult { get; init; }
}
```
## Default Gate Policy Rules
Per product advisory:
```yaml
# etc/scanner.yaml
vexGate:
enabled: true
rules:
- ruleId: "block-exploitable-reachable"
priority: 100
condition:
isExploitable: true
isReachable: true
hasCompensatingControl: false
decision: Block
- ruleId: "warn-high-not-reachable"
priority: 90
condition:
severityLevels: ["critical", "high"]
isReachable: false
decision: Warn
- ruleId: "pass-vendor-not-affected"
priority: 80
condition:
vendorStatus: NotAffected
decision: Pass
- ruleId: "pass-backport-confirmed"
priority: 70
condition:
vendorStatus: Fixed
# justification implies backport evidence
decision: Pass
defaultDecision: Warn
```
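To pin down the rule semantics, here is a minimal matcher sketch over the contracts above: the highest-priority matching rule wins, otherwise the policy default applies. Exploitability and severity are passed in separately because `VexGateEvidence` carries neither; routing them from the finding's exploit intelligence is an assumption of this sketch.
```csharp
using System;
using System.Linq;

public static class VexGatePolicyEvaluator
{
    public static (VexGateDecision Decision, string RuleId) Evaluate(
        VexGatePolicy policy,
        VexGateEvidence evidence,
        bool isExploitable,
        string severity)
    {
        // Highest-priority matching rule wins.
        foreach (var rule in policy.Rules.OrderByDescending(r => r.Priority))
        {
            if (Matches(rule.Condition, evidence, isExploitable, severity))
            {
                return (rule.Decision, rule.RuleId);
            }
        }

        return (policy.DefaultDecision, "default");
    }

    // A null condition field means "don't care"; every set field must match.
    private static bool Matches(
        VexGatePolicyCondition c, VexGateEvidence e, bool isExploitable, string severity) =>
        (c.VendorStatus is null || c.VendorStatus == e.VendorStatus) &&
        (c.IsExploitable is null || c.IsExploitable == isExploitable) &&
        (c.IsReachable is null || c.IsReachable == e.IsReachable) &&
        (c.HasCompensatingControl is null || c.HasCompensatingControl == e.HasCompensatingControl) &&
        (c.SeverityLevels is null ||
         c.SeverityLevels.Contains(severity, StringComparer.OrdinalIgnoreCase));
}
```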
## API Endpoints
### GET /api/v1/scans/{scanId}/gate-results
```json
{
"scanId": "...",
"gateSummary": {
"totalFindings": 150,
"passed": 100,
"warned": 35,
"blocked": 15,
"evaluatedAt": "2026-01-06T10:30:00Z"
},
"gatedFindings": [
{
"findingId": "...",
"cve": "CVE-2025-12345",
"decision": "Block",
"rationale": "Exploitable + reachable, no compensating control",
"policyRuleMatched": "block-exploitable-reachable",
"evidence": {
"vendorStatus": null,
"isReachable": true,
"hasCompensatingControl": false,
"confidenceScore": 0.95
}
}
]
}
```
## CLI Commands
```bash
# Show current gate policy
stella scan gate-policy show
# Get gate results for a scan
stella scan gate-results <scan-id>
# Get gate results with blocked only
stella scan gate-results <scan-id> --decision Block
# Run scan with gate bypass (emergency)
stella scan start <image> --bypass-gate
```
## Performance Targets
| Metric | Target |
|--------|--------|
| Gate evaluation throughput | >= 1000 findings/sec |
| VEX lookup latency (cached) | < 5ms |
| VEX lookup latency (uncached) | < 50ms |
| Memory overhead per scan | < 10MB for gate state |
## Acceptance Criteria
1. **Noise Reduction**: Gate reduces triage queue by >= 50% on test corpus
2. **Accuracy**: False positive rate < 1% (findings incorrectly passed)
3. **Performance**: Gate evaluation < 1s for typical scan (100 findings)
4. **Traceability**: Every gate decision has auditable evidence
5. **Configurability**: Policy rules can be customized per tenant
## Test Cases
### Unit Tests
- Policy rule matching logic for all conditions
- Default policy produces expected decisions
- Evidence is correctly captured from VEX statements
### Integration Tests
- Gate service queries Excititor correctly
- Scan pipeline applies gate decisions
- Gate results appear in API response
### Corpus Tests (test data from `src/__Tests/__Datasets/`)
- Known "not affected" CVEs are passed
- Known exploitable+reachable CVEs are blocked
- Ambiguous cases are warned
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Gate after findings, before triage | Allows full finding context for decision |
| Default to Warn, not Block | Conservative default avoids blocking pipelines on ambiguous evidence |
| Cache VEX lookups with short TTL | Balance freshness vs performance |

| Risk | Mitigation |
|------|------------|
| VEX data stale at gate time | TTL-based cache invalidation, async refresh |
| Policy misconfiguration | Policy validation at startup, audit logging |
| Gate becomes bottleneck | Parallel evaluation, batch VEX lookups |
## Execution Log
| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |


@@ -0,0 +1,350 @@
# SPRINT_20260106_003_003_EVIDENCE_export_bundle
## Sprint Metadata
| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_003 |
| Module | EVIDENCELOCKER |
| Title | Evidence Bundle Export with Verify Commands |
| Working Directory | `src/EvidenceLocker/` |
| Dependencies | SPRINT_20260106_003_001 |
| Blocking | None (can proceed in parallel with 003_004) |
## Objective
Implement a standardized evidence bundle export format that includes SBOMs, VEX statements, attestations, public keys, and embedded verification scripts. This enables offline audits and air-gapped verification as specified in the product advisory MVP: "Evidence Bundle export (zip/tar) for audits".
## Context
**Current State:**
- EvidenceLocker stores sealed bundles with Merkle integrity
- Bundles contain SBOM, scan results, policy verdicts, attestations
- No standardized export format for external auditors
- No embedded verification commands
**Target State:**
- Standardized `evidence-bundle-<id>.tar.gz` export format
- Embedded `verify.sh` and `verify.ps1` scripts
- README with verification instructions
- Public keys bundled for offline verification
- CLI command for export
## Tasks
### Phase 1: Export Format Definition (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define bundle directory structure | TODO | See "Bundle Structure" below |
| T002 | Create `BundleManifest` model | TODO | Index of all artifacts in bundle |
| T003 | Define `BundleMetadata` model | TODO | Provenance, timestamps, subject |
| T004 | Create bundle format specification doc | TODO | `docs/modules/evidence-locker/export-format.md` |
| T005 | Unit tests for manifest serialization | TODO | Deterministic JSON output (sketched after this table) |
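A sketch of the determinism requirement behind T005, assuming System.Text.Json: property order follows declaration order for a fixed contract, so the remaining hazards are drifting serializer settings and unordered dictionaries.
```csharp
using System.Text.Json;

public static class ManifestSerializer
{
    // One shared options instance so every writer emits identical bytes.
    private static readonly JsonSerializerOptions Options = new()
    {
        WriteIndented = true,
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    };

    public static byte[] Serialize<T>(T manifest)
    {
        // Dictionary-valued members must be materialized in ordinal key order
        // upstream: System.Text.Json preserves insertion order and never sorts.
        return JsonSerializer.SerializeToUtf8Bytes(manifest, Options);
    }
}
```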
### Phase 2: Export Service Implementation (8 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T006 | Define `IEvidenceBundleExporter` interface | TODO | `src/EvidenceLocker/__Libraries/StellaOps.EvidenceLocker.Export/` |
| T007 | Implement `TarGzBundleExporter` | TODO | Creates tar.gz with correct structure |
| T008 | Implement artifact collector (SBOMs) | TODO | Fetches from CAS |
| T009 | Implement artifact collector (VEX) | TODO | Fetches VEX statements |
| T010 | Implement artifact collector (Attestations) | TODO | Fetches DSSE envelopes |
| T011 | Implement public key bundler | TODO | Includes signing keys for verification |
| T012 | Add compression options (gzip, brotli) | TODO | Configurable compression level |
| T013 | Unit tests for export service | TODO | |
### Phase 3: Verify Script Generation (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T014 | Create `verify.sh` template (bash) | TODO | POSIX-compliant |
| T015 | Create `verify.ps1` template (PowerShell) | TODO | Windows support |
| T016 | Implement DSSE verification in scripts | TODO | Uses bundled public keys |
| T017 | Implement Merkle root verification in scripts | TODO | Checks manifest integrity |
| T018 | Implement checksum verification in scripts | TODO | SHA256 of each artifact |
| T019 | Script generation tests | TODO | Generated scripts run correctly |
### Phase 4: API & Worker (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Add `POST /bundles/{id}/export` endpoint | TODO | Triggers async export |
| T021 | Add `GET /bundles/{id}/export/{exportId}` endpoint | TODO | Download exported bundle |
| T022 | Implement export worker for large bundles | TODO | Background processing |
| T023 | Add export status tracking | TODO | pending/processing/ready/failed |
| T024 | API integration tests | TODO | |
### Phase 5: CLI Commands (4 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T025 | Add `stella evidence export` command | TODO | `--bundle <id> --output <path>` |
| T026 | Add `stella evidence verify` command | TODO | Verifies exported bundle |
| T027 | Add progress indicator for large exports | TODO | |
| T028 | CLI integration tests | TODO | |
## Bundle Structure
```
evidence-bundle-<id>/
+-- manifest.json # Bundle manifest with all artifact refs
+-- metadata.json # Bundle metadata (provenance, timestamps)
+-- README.md # Human-readable verification instructions
+-- verify.sh # Bash verification script
+-- verify.ps1 # PowerShell verification script
+-- checksums.sha256 # SHA256 checksums for all artifacts
+-- keys/
| +-- signing-key-001.pem # Public key for DSSE verification
| +-- signing-key-002.pem # Additional keys if multi-sig
| +-- trust-bundle.pem # CA chain if applicable
+-- sboms/
| +-- image.cdx.json # Aggregated CycloneDX SBOM
| +-- image.spdx.json # Aggregated SPDX SBOM
| +-- layers/
| +-- <layer-digest>.cdx.json # Per-layer CycloneDX
| +-- <layer-digest>.spdx.json # Per-layer SPDX
+-- vex/
| +-- statements/
| | +-- <statement-id>.openvex.json
| +-- consensus/
| +-- image-consensus.json # VEX consensus result
+-- attestations/
| +-- sbom.dsse.json # SBOM attestation envelope
| +-- vex.dsse.json # VEX attestation envelope
| +-- policy.dsse.json # Policy verdict attestation
| +-- rekor-proofs/
| +-- <uuid>.proof.json # Rekor inclusion proofs
+-- findings/
| +-- scan-results.json # Vulnerability findings
| +-- gate-results.json # VEX gate decisions
+-- audit/
+-- timeline.ndjson # Audit event timeline
```
## Contracts
### BundleManifest
```json
{
"manifestVersion": "1.0.0",
"bundleId": "eb-2026-01-06-abc123",
"createdAt": "2026-01-06T10:30:00.000000Z",
"subject": {
"type": "container-image",
"digest": "sha256:abcdef...",
"name": "registry.example.com/app:v1.2.3"
},
"artifacts": [
{
"path": "sboms/image.cdx.json",
"type": "sbom",
"format": "cyclonedx-1.7",
"digest": "sha256:...",
"size": 45678
},
{
"path": "attestations/sbom.dsse.json",
"type": "attestation",
"format": "dsse-v1",
"predicateType": "StellaOps.SBOMAttestation@1",
"digest": "sha256:...",
"size": 12345,
"signedBy": ["sha256:keyabc..."]
}
],
"verification": {
"merkleRoot": "sha256:...",
"algorithm": "sha256",
"checksumFile": "checksums.sha256"
}
}
```
### BundleMetadata
```json
{
"bundleId": "eb-2026-01-06-abc123",
"exportedAt": "2026-01-06T10:35:00.000000Z",
"exportedBy": "stella evidence export",
"exportVersion": "2026.04",
"provenance": {
"tenantId": "tenant-xyz",
"scanId": "scan-abc123",
"pipelineId": "pipeline-def456",
"sourceRepository": "https://github.com/example/app",
"sourceCommit": "abc123def456..."
},
"chainInfo": {
"previousBundleId": "eb-2026-01-05-xyz789",
"sequenceNumber": 42
},
"transparency": {
"rekorLogUrl": "https://rekor.sigstore.dev",
"rekorEntryUuids": ["uuid1", "uuid2"]
}
}
```
## Verify Script Logic
### verify.sh (Bash)
```bash
#!/bin/bash
set -euo pipefail
shopt -s nullglob  # an empty rekor-proofs/ directory must not leave a literal glob behind
# NOTE: compute-merkle-root, verify-dsse, and verify-rekor-proof are placeholders;
# the generator substitutes the bundled helper implementations (T016-T018).
BUNDLE_DIR="$(cd "$(dirname "$0")" && pwd)"
MANIFEST="$BUNDLE_DIR/manifest.json"
CHECKSUMS="$BUNDLE_DIR/checksums.sha256"
echo "=== StellaOps Evidence Bundle Verification ==="
echo "Bundle: $(basename "$BUNDLE_DIR")"
echo ""
# Step 1: Verify checksums
echo "[1/4] Verifying artifact checksums..."
cd "$BUNDLE_DIR"
sha256sum -c "$CHECKSUMS" --quiet
echo " OK: All checksums match"
# Step 2: Verify Merkle root
echo "[2/4] Verifying Merkle root..."
COMPUTED_ROOT=$(compute-merkle-root "$CHECKSUMS")
EXPECTED_ROOT=$(jq -r '.verification.merkleRoot' "$MANIFEST")
if [ "$COMPUTED_ROOT" = "$EXPECTED_ROOT" ]; then
echo " OK: Merkle root verified"
else
echo " FAIL: Merkle root mismatch"
exit 1
fi
# Step 3: Verify DSSE signatures
echo "[3/4] Verifying attestation signatures..."
for dsse in "$BUNDLE_DIR"/attestations/*.dsse.json; do
verify-dsse "$dsse" --keys "$BUNDLE_DIR/keys/"
echo " OK: $(basename "$dsse")"
done
# Step 4: Verify Rekor proofs (if online)
echo "[4/4] Verifying Rekor proofs..."
if [ "${OFFLINE:-false}" = "true" ]; then
echo " SKIP: Offline mode, Rekor verification skipped"
else
for proof in "$BUNDLE_DIR"/attestations/rekor-proofs/*.proof.json; do
verify-rekor-proof "$proof"
echo " OK: $(basename "$proof")"
done
fi
echo ""
echo "=== Verification Complete: PASSED ==="
```
## API Endpoints
### POST /api/v1/bundles/{bundleId}/export
```
Request:
{
"format": "tar.gz",
"compression": "gzip",
"includeRekorProofs": true,
"includeLayerSboms": true
}
Response 202:
{
"exportId": "exp-123",
"status": "processing",
"estimatedSize": 1234567,
"statusUrl": "/api/v1/bundles/{bundleId}/export/exp-123"
}
```
### GET /api/v1/bundles/{bundleId}/export/{exportId}
```
Response 200 (when ready):
Headers:
Content-Type: application/gzip
Content-Disposition: attachment; filename="evidence-bundle-eb-123.tar.gz"
Body: <binary tar.gz content>
Response 202 (still processing):
{
"exportId": "exp-123",
"status": "processing",
"progress": 65,
"estimatedTimeRemaining": "30s"
}
```
## CLI Commands
```bash
# Export bundle to file
stella evidence export --bundle eb-2026-01-06-abc123 --output ./audit-bundle.tar.gz
# Export with options
stella evidence export --bundle eb-123 \
--output ./bundle.tar.gz \
--include-layers \
--include-rekor-proofs
# Verify an exported bundle
stella evidence verify ./audit-bundle.tar.gz
# Verify offline (skip Rekor)
stella evidence verify ./audit-bundle.tar.gz --offline
```
## Acceptance Criteria
1. **Completeness**: Bundle includes all specified artifacts (SBOMs, VEX, attestations, keys)
2. **Verifiability**: `verify.sh` and `verify.ps1` run successfully on valid bundles
3. **Offline Support**: Verification works without network access (Rekor proof checks are skipped in offline mode)
4. **Determinism**: Same bundle exported twice produces identical tar.gz (see the sketch after this list)
5. **Documentation**: README explains verification steps for non-technical auditors
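Criterion 4 is the subtle one: tar and gzip both carry timestamps. A minimal sketch of a deterministic writer follows, assuming .NET's `System.Formats.Tar`: entry order, mtime, uid/gid, and mode are pinned, and .NET's `GZipStream` leaves the gzip MTIME field zeroed, so identical inputs yield byte-identical archives.
```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class DeterministicBundleWriter
{
    // Stable entry order plus pinned metadata => reproducible tar.gz bytes.
    public static void Write(string outputPath, IReadOnlyDictionary<string, byte[]> artifacts)
    {
        using var file = File.Create(outputPath);
        using var gzip = new GZipStream(file, CompressionLevel.SmallestSize);
        using var tar = new TarWriter(gzip, TarEntryFormat.Ustar);

        foreach (var (path, content) in artifacts.OrderBy(kv => kv.Key, StringComparer.Ordinal))
        {
            tar.WriteEntry(new UstarTarEntry(TarEntryType.RegularFile, path)
            {
                ModificationTime = DateTimeOffset.UnixEpoch, // pinned timestamp
                Uid = 0,
                Gid = 0,
                Mode = UnixFileMode.UserRead | UnixFileMode.UserWrite
                     | UnixFileMode.GroupRead | UnixFileMode.OtherRead, // 0644
                DataStream = new MemoryStream(content),
            });
        }
    }
}
```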
## Test Cases
### Unit Tests
- Manifest serialization is deterministic
- Merkle root computation matches expected
- Checksum file format is correct
### Integration Tests
- Export service collects all artifacts from CAS
- Generated verify.sh runs correctly on Linux
- Generated verify.ps1 runs correctly on Windows
- Large bundles (>100MB) export without OOM
### E2E Tests
- Full flow: scan -> seal -> export -> verify
- Exported bundle verifies in air-gapped environment
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| tar.gz format | Universal, works on all platforms |
| Embedded verify scripts | No external dependencies for basic verification |
| Include public keys in bundle | Enables offline verification |
| NDJSON for audit timeline | Streaming-friendly, easy to parse |

| Risk | Mitigation |
|------|------------|
| Bundle size too large | Compression, optional layer SBOMs |
| Script compatibility issues | Test on multiple OS versions |
| Key rotation during export | Include all valid keys, document rotation |
## Execution Log
| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |


@@ -0,0 +1,351 @@
# SPRINT_20260106_003_004_ATTESTOR_chain_linking
## Sprint Metadata
| Field | Value |
|-------|-------|
| Sprint ID | 20260106_003_004 |
| Module | ATTESTOR |
| Title | Cross-Attestation Linking & Per-Layer Attestations |
| Working Directory | `src/Attestor/` |
| Dependencies | SPRINT_20260106_003_001, SPRINT_20260106_003_002 |
| Blocking | None |
## Objective
Implement cross-attestation linking (SBOM -> VEX -> Policy chain) and per-layer attestations to complete the attestation chain model specified in Step 3 of the product advisory: "Sign everything (portable, verifiable evidence)".
## Context
**Current State:**
- Attestor creates DSSE envelopes for SBOMs, VEX, scan results, policy verdicts
- Each attestation is independent with subject pointing to artifact digest
- No explicit chain linking between attestations
- Single attestation per image (no per-layer)
**Target State:**
- Cross-attestation linking via in-toto layout references
- Per-layer attestations with layer-specific subjects
- Query API for attestation chains
- Full provenance chain from source to final verdict
## Tasks
### Phase 1: Cross-Attestation Model (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Define `AttestationLink` model | TODO | References between attestations |
| T002 | Define `AttestationChain` model | TODO | Ordered chain with validation |
| T003 | Update `InTotoStatement` to include `materials` refs | TODO | Link to upstream attestations |
| T004 | Create `IAttestationLinkResolver` interface | TODO | Resolve chain from any point |
| T005 | Implement `AttestationChainValidator` | TODO | Validates DAG structure |
| T006 | Unit tests for chain models | TODO | |
### Phase 2: Chain Linking Implementation (7 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T007 | Update SBOM attestation to include source materials | TODO | Commit SHA, layer digests |
| T008 | Update VEX attestation to reference SBOM attestation | TODO | `materials: [{sbom-attestation-digest}]` |
| T009 | Update Policy attestation to reference VEX + SBOM | TODO | Complete chain |
| T010 | Implement `IAttestationChainBuilder` | TODO | Builds chain from components |
| T011 | Add chain validation at submission time | TODO | Reject circular refs |
| T012 | Store chain links in `attestor.entry_links` table | TODO | PostgreSQL |
| T013 | Integration tests for chain building | TODO | |
### Phase 3: Per-Layer Attestations (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T014 | Define `LayerAttestationRequest` model | TODO | Layer digest as subject |
| T015 | Update `IAttestationSigningService` for layers | TODO | Batch layer attestations |
| T016 | Implement `LayerAttestationService` | TODO | Creates per-layer DSSE |
| T017 | Add layer attestations to `SbomCompositionResult` | TODO | From Scanner |
| T018 | Batch signing for efficiency | TODO | Sign all layers in one operation |
| T019 | Unit tests for layer attestations | TODO | |
### Phase 4: Chain Query API (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T020 | Add `GET /attestations?artifact={digest}&chain=true` | TODO | Returns full chain |
| T021 | Add `GET /attestations/{id}/upstream` | TODO | Parent attestations |
| T022 | Add `GET /attestations/{id}/downstream` | TODO | Child attestations |
| T023 | Implement chain traversal with depth limit | TODO | Prevent infinite loops; see the sketch after the schema |
| T024 | Add chain visualization endpoint | TODO | Mermaid/DOT graph output |
| T025 | API integration tests | TODO | |
### Phase 5: CLI & Documentation (4 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Add `stella attest chain <artifact-digest>` command | TODO | Display attestation chain |
| T027 | Add `stella attest layers <scan-id>` command | TODO | List layer attestations |
| T028 | Update attestor architecture docs | TODO | Cross-attestation linking |
| T029 | CLI integration tests | TODO | |
## Contracts
### AttestationLink
```csharp
public sealed record AttestationLink
{
public required string SourceAttestationId { get; init; } // sha256:<hash>
public required string TargetAttestationId { get; init; } // sha256:<hash>
public required AttestationLinkType LinkType { get; init; }
public required DateTimeOffset CreatedAt { get; init; }
}
public enum AttestationLinkType
{
DependsOn, // Target is a material for source
Supersedes, // Source supersedes target (version update)
Aggregates // Source aggregates multiple targets (batch)
}
```
### AttestationChain
```csharp
public sealed record AttestationChain
{
public required string RootAttestationId { get; init; }
public required ImmutableArray<AttestationChainNode> Nodes { get; init; }
public required ImmutableArray<AttestationLink> Links { get; init; }
public required bool IsComplete { get; init; }
public required DateTimeOffset ResolvedAt { get; init; }
}
public sealed record AttestationChainNode
{
public required string AttestationId { get; init; }
public required string PredicateType { get; init; }
public required string SubjectDigest { get; init; }
public required int Depth { get; init; }
public required DateTimeOffset CreatedAt { get; init; }
}
```
### Enhanced InTotoStatement (with materials)
```json
{
"_type": "https://in-toto.io/Statement/v1",
"subject": [
{
"name": "registry.example.com/app@sha256:imageabc...",
"digest": { "sha256": "imageabc..." }
}
],
"predicateType": "StellaOps.PolicyEvaluation@1",
"predicate": {
"verdict": "pass",
"evaluatedAt": "2026-01-06T10:30:00Z",
"policyVersion": "1.2.3"
},
"materials": [
{
"uri": "attestation:sha256:sbom-attest-digest",
"digest": { "sha256": "sbom-attest-digest" },
"annotations": { "predicateType": "StellaOps.SBOMAttestation@1" }
},
{
"uri": "attestation:sha256:vex-attest-digest",
"digest": { "sha256": "vex-attest-digest" },
"annotations": { "predicateType": "StellaOps.VEXAttestation@1" }
}
]
}
```
### LayerAttestationRequest
```csharp
public sealed record LayerAttestationRequest
{
public required string ImageDigest { get; init; }
public required string LayerDigest { get; init; }
public required int LayerOrder { get; init; }
public required string SbomDigest { get; init; }
public required string SbomFormat { get; init; } // "cyclonedx" | "spdx"
}
```
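To ground T015/T018, here is a sketch of turning a batch of requests into one in-toto statement per layer ahead of a single signing call. The predicate type string and predicate layout are assumptions modeled on the statement shape above; only the subject convention matters here.
```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class LayerStatementBuilder
{
    // One statement per layer; the whole batch goes to the signer in one call.
    public static IReadOnlyList<byte[]> Build(IEnumerable<LayerAttestationRequest> layers) =>
        layers
            .OrderBy(l => l.LayerOrder) // deterministic batch order
            .Select(l => JsonSerializer.SerializeToUtf8Bytes(new
            {
                _type = "https://in-toto.io/Statement/v1",
                subject = new[]
                {
                    new
                    {
                        name = $"layer:{l.LayerDigest}",
                        digest = new { sha256 = l.LayerDigest.Replace("sha256:", "") },
                    },
                },
                // Hypothetical predicate type; the real value is fixed in Attestor.
                predicateType = "StellaOps.LayerSBOMAttestation@1",
                predicate = new { l.ImageDigest, l.LayerOrder, l.SbomDigest, l.SbomFormat },
            }))
            .ToList();
}
```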
## Database Schema
### attestor.entry_links
```sql
CREATE TABLE attestor.entry_links (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
source_attestation_id TEXT NOT NULL, -- sha256:<hash>
target_attestation_id TEXT NOT NULL, -- sha256:<hash>
link_type TEXT NOT NULL, -- 'depends_on', 'supersedes', 'aggregates'
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT fk_source FOREIGN KEY (source_attestation_id)
REFERENCES attestor.entries(bundle_sha256) ON DELETE CASCADE,
CONSTRAINT fk_target FOREIGN KEY (target_attestation_id)
REFERENCES attestor.entries(bundle_sha256) ON DELETE CASCADE,
CONSTRAINT no_self_link CHECK (source_attestation_id != target_attestation_id)
);
CREATE INDEX idx_entry_links_source ON attestor.entry_links(source_attestation_id);
CREATE INDEX idx_entry_links_target ON attestor.entry_links(target_attestation_id);
CREATE INDEX idx_entry_links_type ON attestor.entry_links(link_type);
```
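With links in place, chain resolution (T023) reduces to a bounded graph walk. A minimal sketch, assuming a repository delegate over `entry_links`: breadth-first with a visited set and depth cap, so DAG joins are emitted once and accidental cycles cannot loop forever.
```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AttestationChainResolver
{
    public static async Task<IReadOnlyList<(string AttestationId, int Depth)>> ResolveAsync(
        string rootAttestationId,
        Func<string, Task<IReadOnlyList<AttestationLink>>> loadOutgoingLinksAsync,
        int maxDepth = 5)
    {
        var visited = new HashSet<string>(StringComparer.Ordinal) { rootAttestationId };
        var result = new List<(string, int)> { (rootAttestationId, 0) };
        var frontier = new Queue<(string Id, int Depth)>();
        frontier.Enqueue((rootAttestationId, 0));

        while (frontier.Count > 0)
        {
            var (id, depth) = frontier.Dequeue();
            if (depth >= maxDepth)
            {
                continue; // depth cap: stop expanding this branch
            }

            foreach (var link in await loadOutgoingLinksAsync(id))
            {
                if (!visited.Add(link.TargetAttestationId))
                {
                    continue; // already emitted via another path
                }
                result.Add((link.TargetAttestationId, depth + 1));
                frontier.Enqueue((link.TargetAttestationId, depth + 1));
            }
        }

        return result;
    }
}
```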
## API Endpoints
### GET /api/v1/attestations?artifact={digest}&chain=true
```
Response 200:
{
"artifactDigest": "sha256:imageabc...",
"chain": {
"rootAttestationId": "sha256:policy-attest...",
"isComplete": true,
"resolvedAt": "2026-01-06T10:35:00Z",
"nodes": [
{
"attestationId": "sha256:policy-attest...",
"predicateType": "StellaOps.PolicyEvaluation@1",
"depth": 0
},
{
"attestationId": "sha256:vex-attest...",
"predicateType": "StellaOps.VEXAttestation@1",
"depth": 1
},
{
"attestationId": "sha256:sbom-attest...",
"predicateType": "StellaOps.SBOMAttestation@1",
"depth": 2
}
],
"links": [
{
"source": "sha256:policy-attest...",
"target": "sha256:vex-attest...",
"type": "DependsOn"
},
{
"source": "sha256:policy-attest...",
"target": "sha256:sbom-attest...",
"type": "DependsOn"
}
]
}
}
```
### GET /api/v1/attestations/{id}/chain/graph
```
Query params:
- format: "mermaid" | "dot" | "json"
```

Response 200 (format=mermaid):

```mermaid
graph TD
    A[Policy Verdict] -->|depends_on| B[VEX Attestation]
    A -->|depends_on| C[SBOM Attestation]
    B -->|depends_on| C
    C -->|depends_on| D[Layer 0 Attest]
    C -->|depends_on| E[Layer 1 Attest]
```
## Chain Structure Example
```
┌─────────────────────────┐
│ Policy Verdict │
│ Attestation │
│ (root of chain) │
└───────────┬─────────────┘
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ │
┌─────────────────┐ ┌─────────────────┐ │
│ VEX Attestation │ │ Gate Results │ │
│ │ │ Attestation │ │
└────────┬────────┘ └─────────────────┘ │
│ │
▼ ▼
┌─────────────────────────────────────────────┐
│ SBOM Attestation │
│ (image level) │
└───────────┬─────────────┬───────────────────┘
│ │
┌───────┴───────┐ └───────┐
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Layer 0 SBOM │ │ Layer 1 SBOM │ │ Layer N SBOM │
│ Attestation │ │ Attestation │ │ Attestation │
└───────────────┘ └───────────────┘ └───────────────┘
```
## CLI Commands
```bash
# Get attestation chain for an artifact
stella attest chain sha256:imageabc...
# Get chain as graph
stella attest chain sha256:imageabc... --format mermaid
# List layer attestations for a scan
stella attest layers <scan-id>
# Verify complete chain
stella attest verify-chain sha256:imageabc...
```
## Acceptance Criteria
1. **Chain Completeness**: Policy attestation links to all upstream attestations
2. **Per-Layer Coverage**: Every layer has its own attestation
3. **Queryability**: Full chain retrievable from any node
4. **Validation**: Circular references rejected at creation
5. **Performance**: Chain resolution < 100ms for typical depth (5 levels)
## Test Cases
### Unit Tests
- Chain builder creates correct DAG structure
- Link validator detects circular references
- Chain traversal respects depth limits
### Integration Tests
- Full scan produces complete attestation chain
- Chain query returns all linked attestations
- Per-layer attestations stored correctly
### E2E Tests
- End-to-end: scan -> gate -> attestation chain -> export
- Chain verification in exported bundle
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Store links in separate table | Efficient traversal, no attestation mutation |
| Use DAG not tree | Allows multiple parents (SBOM used by VEX and Policy) |
| Batch layer attestations | Performance: one signing operation for all layers |
| Materials field for links | in-toto standard compliance |

| Risk | Mitigation |
|------|------------|
| Chain resolution performance | Depth limit, caching, indexed traversal |
| Circular reference bugs | Validation at insertion, periodic audit |
| Orphaned attestations | Cleanup job for unlinked entries |
## Execution Log
| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from product advisory |


@@ -0,0 +1,283 @@
# SPRINT_20260106_004_001_FE_quiet_triage_ux_integration
## Sprint Metadata
| Field | Value |
|-------|-------|
| Sprint ID | 20260106_004_001 |
| Module | FE (Frontend) |
| Title | Quiet-by-Default Triage UX Integration |
| Working Directory | `src/Web/StellaOps.Web/` |
| Dependencies | None (backend APIs complete) |
| Blocking | None |
| Advisory | `docs-archived/product-advisories/06-Jan-2026 - Quiet-by-Default Triage with Attested Exceptions.md` |
## Objective
Integrate the existing quiet-by-default triage backend APIs into the Angular 17 frontend. The backend infrastructure is complete; this sprint delivers the UX layer that enables users to experience "inbox shows only actionables" with one-click access to the Review lane and evidence export.
## Context
**Current State:**
- Backend APIs fully implemented:
- `GatingReasonService` computes gating status
- `GatingContracts.cs` defines DTOs (`FindingGatingStatusDto`, `GatedBucketsSummaryDto`)
- `ApprovalEndpoints` provides CRUD for approvals
- `TriageStatusEndpoints` serves lane/verdict data
- `EvidenceLocker` provides bundle export
- Frontend has existing findings table but lacks:
- Quiet/Review lane toggle
- Gated bucket summary chips
- Breadcrumb navigation
- Approval workflow modal
**Target State:**
- Default view shows only actionable findings (Quiet lane)
- Banner displays gated bucket counts with one-click filters
- Breadcrumb bar enables image->layer->package->symbol->call-path navigation
- Decision drawer supports mute/ack/exception with signing
- One-click evidence bundle export
## Backend APIs (Already Implemented)
| Endpoint | Purpose |
|----------|---------|
| `GET /api/v1/triage/findings` | Findings with gating status |
| `GET /api/v1/triage/findings/{id}/gating` | Individual gating status |
| `GET /api/v1/triage/scans/{id}/gated-buckets` | Gated bucket summary |
| `POST /api/v1/scans/{id}/approvals` | Create approval |
| `GET /api/v1/scans/{id}/approvals` | List approvals |
| `DELETE /api/v1/scans/{id}/approvals/{findingId}` | Revoke approval |
| `GET /api/v1/evidence/bundles/{id}/export` | Export evidence bundle |
## Tasks
### Phase 1: Lane Toggle & Gated Buckets (8 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T001 | Create `GatingService` Angular service | TODO | Wraps gating API calls |
| T002 | Create `TriageLaneToggle` component | TODO | Quiet/Review toggle button |
| T003 | Create `GatedBucketChips` component | TODO | Displays counts per gating reason |
| T004 | Update `FindingsTableComponent` to filter by lane | TODO | Default to Quiet (non-gated) |
| T005 | Add `IncludeHidden` query param support | TODO | Toggle shows hidden findings |
| T006 | Add `GatingReasonFilter` dropdown | TODO | Filter to specific bucket |
| T007 | Style gated badge indicators | TODO | Visual distinction for gated rows |
| T008 | Unit tests for lane toggle and chips | TODO | |
### Phase 2: Breadcrumb Navigation (6 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T009 | Create `ProvenanceBreadcrumb` component | TODO | Image->Layer->Package->Symbol->CallPath |
| T010 | Create `BreadcrumbNodePopover` component | TODO | Inline attestation chips per hop |
| T011 | Integrate with `ReachGraphSliceService` API | TODO | Fetch call-path data |
| T012 | Add layer SBOM link in breadcrumb | TODO | Click to view layer SBOM |
| T013 | Add symbol-to-function link | TODO | Deep link to ReachGraph mini-map |
| T014 | Unit tests for breadcrumb navigation | TODO | |
### Phase 3: Decision Drawer (7 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T015 | Create `DecisionDrawer` component | TODO | Slide-out panel for decisions |
| T016 | Add decision kind selector | TODO | Mute Reach/Mute VEX/Ack/Exception |
| T017 | Add reason code dropdown | TODO | Controlled vocabulary |
| T018 | Add TTL picker for exceptions | TODO | Date picker with validation |
| T019 | Add policy reference display | TODO | Auto-filled, admin-editable |
| T020 | Implement sign-and-apply flow | TODO | Calls `ApprovalEndpoints` |
| T021 | Add undo toast with revoke link | TODO | 10-second undo window |
### Phase 4: Evidence Export (4 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T022 | Create `ExportEvidenceButton` component | TODO | One-click download |
| T023 | Add export progress indicator | TODO | Async job tracking |
| T024 | Implement bundle download handler | TODO | DSSE-signed bundle |
| T025 | Add "include in bundle" markers | TODO | Per-evidence toggle |
### Phase 5: Integration & Polish (5 tasks)
| ID | Task | Status | Notes |
|----|------|--------|-------|
| T026 | Wire components into findings detail page | TODO | |
| T027 | Add keyboard navigation | TODO | Per TRIAGE_UX_GUIDE.md |
| T028 | Implement high-contrast mode support | TODO | Accessibility requirement |
| T029 | Add TTFS telemetry instrumentation | TODO | Time-to-first-signal metric |
| T030 | E2E tests for complete workflow | TODO | Cypress/Playwright |
## Components
### TriageLaneToggle
```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';

@Component({
  selector: 'stella-triage-lane-toggle',
  template: `
    <div class="lane-toggle">
      <button [class.active]="lane === 'quiet'" (click)="setLane('quiet')">
        Actionable ({{ visibleCount }})
      </button>
      <button [class.active]="lane === 'review'" (click)="setLane('review')">
        Review ({{ hiddenCount }})
      </button>
    </div>
  `
})
export class TriageLaneToggleComponent {
  @Input() visibleCount = 0;
  @Input() hiddenCount = 0;
  @Output() laneChange = new EventEmitter<'quiet' | 'review'>();
  lane: 'quiet' | 'review' = 'quiet';

  setLane(lane: 'quiet' | 'review'): void {
    this.lane = lane;
    this.laneChange.emit(lane);
  }
}
```
### GatedBucketChips
```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';
// GatedBucketsSummaryDto and GatingReason mirror the backend GatingContracts.cs DTOs.

@Component({
  selector: 'stella-gated-bucket-chips',
  template: `
    <div class="bucket-chips">
      <span class="chip" *ngIf="buckets.unreachableCount" (click)="filterBy('Unreachable')">
        Not Reachable: {{ buckets.unreachableCount }}
      </span>
      <span class="chip" *ngIf="buckets.vexNotAffectedCount" (click)="filterBy('VexNotAffected')">
        VEX Not Affected: {{ buckets.vexNotAffectedCount }}
      </span>
      <span class="chip" *ngIf="buckets.backportedCount" (click)="filterBy('Backported')">
        Backported: {{ buckets.backportedCount }}
      </span>
      <!-- ... other buckets -->
    </div>
  `
})
export class GatedBucketChipsComponent {
  @Input() buckets!: GatedBucketsSummaryDto;
  @Output() filterChange = new EventEmitter<GatingReason>();

  filterBy(reason: GatingReason | string): void {
    this.filterChange.emit(reason as GatingReason);
  }
}
```
### ProvenanceBreadcrumb
```typescript
import { Component, EventEmitter, Input, Output } from '@angular/core';
// FindingWithProvenance field names below are assumed; `truncate` is a project pipe.

@Component({
  selector: 'stella-provenance-breadcrumb',
  template: `
    <nav class="breadcrumb-bar">
      <a (click)="navigateTo('image')">{{ imageRef }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('layer')">{{ layerDigest | truncate:12 }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('package')">{{ packagePurl }}</a>
      <span class="separator">></span>
      <a (click)="navigateTo('symbol')">{{ symbolName }}</a>
      <span class="separator">></span>
      <span class="current">{{ callPath }}</span>
    </nav>
  `
})
export class ProvenanceBreadcrumbComponent {
  @Input() finding!: FindingWithProvenance;
  @Output() navigation = new EventEmitter<BreadcrumbNavigation>();

  get imageRef() { return this.finding.imageRef; }
  get layerDigest() { return this.finding.layerDigest; }
  get packagePurl() { return this.finding.packagePurl; }
  get symbolName() { return this.finding.symbolName; }
  get callPath() { return this.finding.callPath; }

  navigateTo(hop: 'image' | 'layer' | 'package' | 'symbol'): void {
    this.navigation.emit({ hop } as BreadcrumbNavigation);
  }
}
```
## Data Flow
```
FindingsPage
├── TriageLaneToggle (quiet/review selection)
│ └── emits laneChange → updates query params
├── GatedBucketChips (bucket counts)
│ └── emits filterChange → adds gating reason filter
├── FindingsTable (filtered list)
│ └── rows show gating badge when applicable
└── FindingDetailPanel (selected finding)
├── VerdictBanner (SHIP/BLOCK/NEEDS_EXCEPTION)
├── StatusChips (reachability, VEX, exploit, gate)
│ └── click → opens evidence panel
├── ProvenanceBreadcrumb (image→call-path)
│ └── click → navigates to hop detail
├── EvidenceRail (artifacts list)
│ └── ExportEvidenceButton
└── ActionsFooter
└── DecisionDrawer (mute/ack/exception)
```
## Styling Requirements
Per `docs/ux/TRIAGE_UX_GUIDE.md`:
- Status conveyed by text + shape (not color only)
- High contrast mode supported
- Keyboard navigation for table rows, chips, evidence list
- Copy-to-clipboard for digests, PURLs, CVE IDs
- Virtual scroll for findings table
## Telemetry (Required Instrumentation)
| Metric | Description |
|--------|-------------|
| `triage.ttfs` | Time from notification click to verdict banner rendered |
| `triage.time_to_proof` | Time from chip click to proof preview shown |
| `triage.mute_reversal_rate` | % of auto-muted findings that become actionable |
| `triage.bundle_export_latency` | Evidence bundle export time |
## Acceptance Criteria
1. **Default Quiet**: Findings list shows only non-gated (actionable) findings by default
2. **One-Click Review**: Single click toggles to Review lane showing all gated findings
3. **Bucket Visibility**: Gated bucket counts always visible, clickable to filter
4. **Breadcrumb Navigation**: Click-through from image to call-path works end-to-end
5. **Decision Persistence**: Mute/ack/exception decisions persist and show undo toast
6. **Evidence Export**: Bundle downloads within 5 seconds for typical findings
7. **Accessibility**: Keyboard navigation and high-contrast mode functional
8. **Performance**: Findings list renders in <2s for 1000 findings (virtual scroll)
## Test Cases
### Unit Tests
- Lane toggle emits correct events
- Bucket chips render correct counts
- Breadcrumb renders all path segments
- Decision drawer validates required fields
- Export button shows progress state
### Integration Tests
- Lane toggle filters API calls correctly
- Bucket click applies gating reason filter
- Decision submission calls approval API
- Export triggers bundle download
### E2E Tests
- Full workflow: view findings -> toggle lane -> select finding -> view breadcrumb -> export evidence
- Approval workflow: select finding -> open drawer -> submit decision -> verify toast -> verify persistence
## Decisions & Risks
| Decision | Rationale |
|----------|-----------|
| Default to Quiet lane | Reduces noise per advisory; Review always one click away |
| Breadcrumb as separate component | Reusable across finding detail and evidence views |
| Virtual scroll for table | Performance requirement for large finding sets |

| Risk | Mitigation |
|------|------------|
| API latency for gated buckets | Cache bucket summary, refresh on lane toggle |
| Complex breadcrumb state | Use route params for deep-linking support |
| Bundle export timeout | Async job with polling, show progress |
## References
- **UX Guide**: `docs/ux/TRIAGE_UX_GUIDE.md`
- **Backend Contracts**: `src/Scanner/StellaOps.Scanner.WebService/Contracts/GatingContracts.cs`
- **Approval API**: `src/Scanner/StellaOps.Scanner.WebService/Endpoints/ApprovalEndpoints.cs`
- **Archived Advisory**: `docs-archived/product-advisories/06-Jan-2026 - Quiet-by-Default Triage with Attested Exceptions.md`
## Execution Log
| Date | Author | Action |
|------|--------|--------|
| 2026-01-06 | Claude | Sprint created from validated product advisory |