save progress

This commit is contained in:
StellaOps Bot
2026-01-06 09:42:02 +02:00
parent 94d68bee8b
commit 37e11918e0
443 changed files with 85863 additions and 897 deletions

View File

@@ -218,7 +218,198 @@ public sealed record VulnFingerprint(
public enum FingerprintType { BasicBlock, ControlFlowGraph, StringReferences, Combined }
```
#### 2.2.5 Binary Vulnerability Service
#### 2.2.5 Semantic Analysis Library
> **Library:** `StellaOps.BinaryIndex.Semantic`
> **Sprint:** 20260105_001_001_BINDEX - Semantic Diffing Phase 1
The Semantic Analysis Library extends fingerprint generation with IR-level semantic matching, enabling detection of semantically equivalent code despite compiler optimizations, instruction reordering, and register allocation differences.
**Key Insight:** Traditional instruction-level fingerprinting loses roughly 15-20% accuracy on optimized binaries. Semantic analysis lifts instructions to B2R2's Intermediate Representation (LowUIR), extracts key-semantics graphs, and uses graph hashing to compute similarity.
##### 2.2.5.1 Architecture
```
Binary Input
v
B2R2 Disassembly → Raw Instructions
v
IR Lifting Service → LowUIR Statements
v
Semantic Graph Extractor → Key-Semantics Graph (KSG)
v
Graph Fingerprinting → Semantic Fingerprint
v
Semantic Matcher → Similarity Score + Deltas
```
##### 2.2.5.2 Core Components
**IR Lifting Service** (`IIrLiftingService`)
Lifts disassembled instructions to B2R2 LowUIR:
```csharp
public interface IIrLiftingService
{
Task<LiftedFunction> LiftToIrAsync(
IReadOnlyList<DisassembledInstruction> instructions,
string functionName,
LiftOptions? options = null,
CancellationToken ct = default);
}
public sealed record LiftedFunction(
string Name,
ImmutableArray<IrStatement> Statements,
ImmutableArray<IrBasicBlock> BasicBlocks);
```
**Semantic Graph Extractor** (`ISemanticGraphExtractor`)
Extracts key-semantics graphs capturing data dependencies, control flow, and memory operations:
```csharp
public interface ISemanticGraphExtractor
{
Task<KeySemanticsGraph> ExtractGraphAsync(
LiftedFunction function,
GraphExtractionOptions? options = null,
CancellationToken ct = default);
}
public sealed record KeySemanticsGraph(
string FunctionName,
ImmutableArray<SemanticNode> Nodes,
ImmutableArray<SemanticEdge> Edges,
GraphProperties Properties);
public enum SemanticNodeType { Compute, Load, Store, Branch, Call, Return, Phi }
public enum SemanticEdgeType { DataDependency, ControlDependency, MemoryDependency }
```
**Semantic Fingerprint Generator** (`ISemanticFingerprintGenerator`)
Generates semantic fingerprints using Weisfeiler-Lehman graph hashing:
```csharp
public interface ISemanticFingerprintGenerator
{
Task<SemanticFingerprint> GenerateAsync(
KeySemanticsGraph graph,
SemanticFingerprintOptions? options = null,
CancellationToken ct = default);
}
public sealed record SemanticFingerprint(
string FunctionName,
string GraphHashHex, // WL graph hash (SHA-256)
string OperationHashHex, // Normalized operation sequence hash
string DataFlowHashHex, // Data dependency pattern hash
int NodeCount,
int EdgeCount,
int CyclomaticComplexity,
ImmutableArray<string> ApiCalls,
SemanticFingerprintAlgorithm Algorithm);
```
**Semantic Matcher** (`ISemanticMatcher`)
Computes semantic similarity with weighted components:
```csharp
public interface ISemanticMatcher
{
Task<SemanticMatchResult> MatchAsync(
SemanticFingerprint a,
SemanticFingerprint b,
MatchOptions? options = null,
CancellationToken ct = default);
Task<SemanticMatchResult> MatchWithDeltasAsync(
SemanticFingerprint a,
SemanticFingerprint b,
MatchOptions? options = null,
CancellationToken ct = default);
}
public sealed record SemanticMatchResult(
decimal Similarity, // 0.00-1.00
decimal GraphSimilarity,
decimal OperationSimilarity,
decimal DataFlowSimilarity,
decimal ApiCallSimilarity,
MatchConfidence Confidence);
```
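Taken together, the four services form a pipeline from disassembly to a similarity verdict. A minimal usage sketch, assuming the services are resolved from DI and `instructionsA`/`instructionsB` come from the existing B2R2 disassembly step (the `provider` variable and function name are illustrative):
```csharp
// Sketch only: end-to-end use of the semantic services defined above.
var lifter = provider.GetRequiredService<IIrLiftingService>();
var extractor = provider.GetRequiredService<ISemanticGraphExtractor>();
var generator = provider.GetRequiredService<ISemanticFingerprintGenerator>();
var matcher = provider.GetRequiredService<ISemanticMatcher>();
// Lift both functions to LowUIR, extract key-semantics graphs, fingerprint, match.
var liftedA = await lifter.LiftToIrAsync(instructionsA, "parse_header", ct: ct);
var liftedB = await lifter.LiftToIrAsync(instructionsB, "parse_header", ct: ct);
var graphA = await extractor.ExtractGraphAsync(liftedA, ct: ct);
var graphB = await extractor.ExtractGraphAsync(liftedB, ct: ct);
var fpA = await generator.GenerateAsync(graphA, ct: ct);
var fpB = await generator.GenerateAsync(graphB, ct: ct);
var result = await matcher.MatchWithDeltasAsync(fpA, fpB, ct: ct);
Console.WriteLine($"Similarity: {result.Similarity:P0} ({result.Confidence})");
```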
##### 2.2.5.3 Algorithm Details
**Weisfeiler-Lehman Graph Hashing:**
- 3 iterations of label propagation
- SHA-256 for final hash computation
- Deterministic node ordering via canonical sort
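A condensed, illustrative sketch of that loop (not the library's actual implementation; node labels are assumed to start from `SemanticNodeType` names):
```csharp
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustrative Weisfeiler-Lehman hashing over a labelled graph; not the
// StellaOps.BinaryIndex.Semantic implementation.
static string ComputeWlHash(
    IReadOnlyList<string> nodeLabels,                 // e.g. "Compute", "Load", "Branch"
    IReadOnlyList<(int From, int To)> edges,
    int iterations = 3)
{
    var labels = nodeLabels.ToArray();
    for (var i = 0; i < iterations; i++)
    {
        var next = new string[labels.Length];
        for (var n = 0; n < labels.Length; n++)
        {
            // New label = own label + sorted labels of all neighbours.
            var neighbours = edges.Where(e => e.From == n).Select(e => labels[e.To])
                .Concat(edges.Where(e => e.To == n).Select(e => labels[e.From]))
                .OrderBy(l => l, StringComparer.Ordinal);
            next[n] = labels[n] + "|" + string.Join(",", neighbours);
        }
        labels = next;
    }
    // Canonical sort gives a deterministic ordering before the final SHA-256.
    var canonical = string.Join("\n", labels.OrderBy(l => l, StringComparer.Ordinal));
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
}
```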
**Similarity Weights (Default):**
| Component | Weight |
|-----------|--------|
| Graph Hash | 0.35 |
| Operation Hash | 0.25 |
| Data Flow Hash | 0.25 |
| API Calls | 0.15 |
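The overall score follows from these weights; a minimal sketch, assuming a linear weighted combination of the component similarities (names are illustrative):
```csharp
// Sketch: combining component similarities with the default weights above.
static decimal CombineSimilarity(decimal graph, decimal operation, decimal dataFlow, decimal apiCalls) =>
    0.35m * graph + 0.25m * operation + 0.25m * dataFlow + 0.15m * apiCalls;
// Example: 0.35*0.90 + 0.25*0.85 + 0.25*0.80 + 0.15*0.60 = 0.8175
var similarity = CombineSimilarity(0.90m, 0.85m, 0.80m, 0.60m);
```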
##### 2.2.5.4 Integration Points
The semantic library integrates with existing BinaryIndex components:
**DeltaSignatureGenerator Extension:**
```csharp
// Optional semantic services via constructor injection
services.AddDeltaSignaturesWithSemantic();
// Extended SymbolSignature with semantic properties
public sealed record SymbolSignature
{
// ... existing properties ...
public string? SemanticHashHex { get; init; }
public ImmutableArray<string> SemanticApiCalls { get; init; }
}
```
**PatchDiffEngine Extension:**
```csharp
// SemanticWeight in HashWeights
public decimal SemanticWeight { get; init; } = 0.2m;
// FunctionFingerprint extended with semantic fingerprint
public SemanticFingerprint? SemanticFingerprint { get; init; }
```
##### 2.2.5.5 Test Coverage
| Category | Tests | Coverage |
|----------|-------|----------|
| Unit Tests (IR lifting, graph extraction, hashing) | 53 | Core algorithms |
| Integration Tests (full pipeline) | 9 | End-to-end flow |
| Golden Corpus (compiler variations) | 11 | Register allocation, optimization, compiler variants |
| Benchmarks (accuracy, performance) | 7 | Baseline metrics |
##### 2.2.5.6 Current Baselines
> **Note:** Baselines reflect foundational implementation; accuracy improves as semantic features mature.
| Metric | Baseline | Target |
|--------|----------|--------|
| Similarity (register allocation variants) | ≥0.55 | ≥0.85 |
| Overall accuracy | ≥40% | ≥70% |
| False positive rate | <10% | <5% |
| P95 fingerprint latency | <100ms | <50ms |
#### 2.2.6 Binary Vulnerability Service
Main query interface for consumers.
@@ -688,8 +879,11 @@ binaryindex:
- Scanner Native Analysis: `src/Scanner/StellaOps.Scanner.Analyzers.Native/`
- Existing Fingerprinting: `src/Scanner/__Libraries/StellaOps.Scanner.EntryTrace/Binary/`
- Build-ID Index: `src/Scanner/StellaOps.Scanner.Analyzers.Native/Index/`
- **Semantic Diffing Sprint:** `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- **Semantic Library:** `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Semantic/`
- **Semantic Tests:** `src/BinaryIndex/__Tests/StellaOps.BinaryIndex.Semantic.Tests/`
---
*Document Version: 1.0.0*
*Last Updated: 2025-12-21*
*Document Version: 1.1.0*
*Last Updated: 2026-01-15*

View File

@@ -0,0 +1,439 @@
# BSim PostgreSQL Database Setup Guide
**Version:** 1.0
**Sprint:** SPRINT_20260105_001_003_BINDEX
**Task:** GHID-011
## Overview
Ghidra's BSim (Binary Similarity) feature requires a separate PostgreSQL database for storing and querying function signatures. This guide covers setup and configuration.
## Architecture
```
┌──────────────────────────────────────────────────────┐
│ StellaOps BinaryIndex │
├──────────────────────────────────────────────────────┤
│ Main Corpus DB │ BSim DB (Ghidra) │
│ (corpus.* schema) │ (separate instance) │
│ │ │
│ - Function metadata │ - BSim signatures │
│ - Fingerprints │ - Feature vectors │
│ - Clusters │ - Similarity index │
│ - CVE associations │ │
└──────────────────────────────────────────────────────┘
```
**Why Separate?**
- BSim uses Ghidra-specific schema and stored procedures
- Different access patterns (corpus: OLTP, BSim: analytical)
- BSim database can be shared across multiple Ghidra instances
- Isolation prevents schema conflicts
## Prerequisites
- PostgreSQL 14+ (BSim requires specific PostgreSQL features)
- Ghidra 11.x with BSim extension
- Network connectivity between BinaryIndex services and BSim database
- At least 10GB storage for initial database (scales with corpus size)
## Database Setup
### 1. Create BSim Database
```bash
# Create database
createdb bsim_corpus
# Create user
psql -c "CREATE USER bsim_user WITH PASSWORD 'secure_password_here';"
psql -c "GRANT ALL PRIVILEGES ON DATABASE bsim_corpus TO bsim_user;"
```
### 2. Initialize BSim Schema
Ghidra provides scripts to initialize the BSim database schema:
```bash
# Set Ghidra home
export GHIDRA_HOME=/opt/ghidra
# Run BSim database initialization
$GHIDRA_HOME/Ghidra/Features/BSim/data/postgresql_init.sh \
--host localhost \
--port 5432 \
--database bsim_corpus \
--user bsim_user \
--password secure_password_here
```
Alternatively, use Ghidra's BSim server setup:
```bash
# Create BSim server configuration
$GHIDRA_HOME/support/bsimServerSetup \
postgresql://localhost:5432/bsim_corpus \
--user bsim_user \
--password secure_password_here
```
### 3. Verify Installation
```bash
# Connect to database
psql -h localhost -U bsim_user -d bsim_corpus
# Check BSim tables exist
\dt
# Expected tables:
# - bsim_functions
# - bsim_executables
# - bsim_vectors
# - bsim_clusters
# etc.
# Exit
\q
```
## Docker Deployment
### Docker Compose Configuration
```yaml
# docker-compose.bsim.yml
version: '3.8'
services:
bsim-postgres:
image: postgres:16
container_name: stellaops-bsim-db
environment:
POSTGRES_DB: bsim_corpus
POSTGRES_USER: bsim_user
POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD}
POSTGRES_INITDB_ARGS: "-E UTF8 --locale=C"
volumes:
- bsim-data:/var/lib/postgresql/data
- ./scripts/init-bsim.sh:/docker-entrypoint-initdb.d/10-init-bsim.sh:ro
ports:
- "5433:5432" # Different port to avoid conflict with main DB
networks:
- stellaops
healthcheck:
test: ["CMD-SHELL", "pg_isready -U bsim_user -d bsim_corpus"]
interval: 10s
timeout: 5s
retries: 5
ghidra-headless:
image: stellaops/ghidra-headless:11.2
container_name: stellaops-ghidra
depends_on:
bsim-postgres:
condition: service_healthy
environment:
BSIM_DB_URL: "postgresql://bsim-postgres:5432/bsim_corpus"
BSIM_DB_USER: bsim_user
BSIM_DB_PASSWORD: ${BSIM_DB_PASSWORD}
JAVA_HOME: /opt/java/openjdk
MAXMEM: 4G
volumes:
- ghidra-projects:/projects
- ghidra-scripts:/scripts
networks:
- stellaops
deploy:
resources:
limits:
cpus: '4'
memory: 8G
volumes:
bsim-data:
driver: local
ghidra-projects:
ghidra-scripts:
networks:
stellaops:
driver: bridge
```
### Initialization Script
Create `scripts/init-bsim.sh`:
```bash
#!/bin/bash
set -e
# Wait for PostgreSQL to be ready
until pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"; do
echo "Waiting for PostgreSQL..."
sleep 2
done
echo "PostgreSQL is ready. Installing BSim schema..."
# Note: Actual BSim schema SQL would be sourced from Ghidra distribution
# This is a placeholder - replace with actual Ghidra BSim schema
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
-- BSim schema will be initialized by Ghidra tools
-- This script just ensures the database is ready
COMMENT ON DATABASE bsim_corpus IS 'Ghidra BSim function signature database';
EOSQL
echo "BSim database initialized successfully"
```
### Start Services
```bash
# Set password
export BSIM_DB_PASSWORD="your_secure_password"
# Start services
docker-compose -f docker-compose.bsim.yml up -d
# Check logs
docker-compose -f docker-compose.bsim.yml logs -f ghidra-headless
```
## Configuration
### BinaryIndex Configuration
Configure BSim connection in `appsettings.json`:
```json
{
"BinaryIndex": {
"Ghidra": {
"Enabled": true,
"GhidraHome": "/opt/ghidra",
"BSim": {
"Enabled": true,
"ConnectionString": "Host=localhost;Port=5433;Database=bsim_corpus;Username=bsim_user;Password=...",
"MinSimilarity": 0.7,
"MaxResults": 10
}
}
}
}
```
### Environment Variables
```bash
# BSim database connection
export STELLAOPS_BSIM_CONNECTION="Host=localhost;Port=5433;Database=bsim_corpus;Username=bsim_user;Password=..."
# BSim feature
export STELLAOPS_BSIM_ENABLED=true
# Query tuning
export STELLAOPS_BSIM_MIN_SIMILARITY=0.7
export STELLAOPS_BSIM_QUERY_TIMEOUT=30
```
## Usage
### Ingesting Functions into BSim
```csharp
using StellaOps.BinaryIndex.Ghidra;
var bsimService = serviceProvider.GetRequiredService<IBSimService>();
// Analyze binary with Ghidra
var ghidraService = serviceProvider.GetRequiredService<IGhidraService>();
var analysis = await ghidraService.AnalyzeAsync(binaryStream, ct: ct);
// Generate BSim signatures
var signatures = await bsimService.GenerateSignaturesAsync(analysis, ct: ct);
// Ingest into BSim database
await bsimService.IngestAsync("glibc", "2.31", signatures, ct);
```
### Querying BSim
```csharp
// Query for similar functions
var queryOptions = new BSimQueryOptions
{
MinSimilarity = 0.7,
MinSignificance = 0.5,
MaxResults = 10
};
var matches = await bsimService.QueryAsync(signature, queryOptions, ct);
foreach (var match in matches)
{
Console.WriteLine($"Match: {match.MatchedLibrary} {match.MatchedVersion} - {match.MatchedFunction}");
Console.WriteLine($"Similarity: {match.Similarity:P2}, Confidence: {match.Confidence:P2}");
}
```
## Maintenance
### Database Vacuum
```bash
# Regular vacuum (run weekly)
psql -h localhost -U bsim_user -d bsim_corpus -c "VACUUM ANALYZE;"
# Full vacuum (run monthly)
psql -h localhost -U bsim_user -d bsim_corpus -c "VACUUM FULL;"
```
### Backup and Restore
```bash
# Backup
pg_dump -h localhost -U bsim_user -d bsim_corpus -F c -f bsim_backup_$(date +%Y%m%d).dump
# Restore
pg_restore -h localhost -U bsim_user -d bsim_corpus -c bsim_backup_20260105.dump
```
### Monitoring
```sql
-- Check database size
SELECT pg_size_pretty(pg_database_size('bsim_corpus'));
-- Check signature count
SELECT COUNT(*) FROM bsim_functions;
-- Check recent ingest activity
SELECT * FROM bsim_ingest_log ORDER BY ingested_at DESC LIMIT 10;
```
## Performance Tuning
### PostgreSQL Configuration
Add to `postgresql.conf`:
```ini
# Memory settings for BSim workload
shared_buffers = 4GB
effective_cache_size = 12GB
work_mem = 256MB
maintenance_work_mem = 1GB
# Query parallelism
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
# Indexes
random_page_cost = 1.1 # For SSD storage
```
### Indexing Strategy
BSim automatically creates required indexes. Monitor slow queries:
```sql
-- Enable query logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s
SELECT pg_reload_conf();
-- Check slow queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
WHERE query LIKE '%bsim%'
ORDER BY mean_exec_time DESC
LIMIT 10;
```
## Troubleshooting
### Connection Refused
```
Error: could not connect to server: Connection refused
```
**Solution:**
1. Verify PostgreSQL is running: `systemctl status postgresql`
2. Check port: `netstat -an | grep 5433`
3. Verify firewall rules
4. Check `pg_hba.conf` for access rules
### Schema Not Found
```
Error: relation "bsim_functions" does not exist
```
**Solution:**
1. Re-run BSim schema initialization
2. Verify Ghidra version compatibility
3. Check BSim extension is installed in Ghidra
### Poor Query Performance
```
Warning: BSim queries taking > 5s
```
**Solution:**
1. Run `VACUUM ANALYZE` on BSim tables
2. Increase `work_mem` for complex queries
3. Check index usage: `EXPLAIN ANALYZE` on slow queries
4. Consider partitioning large tables
## Security Considerations
1. **Network Access:** BSim database should only be accessible from BinaryIndex services and Ghidra instances
2. **Authentication:** Use strong passwords, consider certificate-based authentication
3. **Encryption:** Enable SSL/TLS for database connections in production
4. **Access Control:** Grant minimum necessary privileges
```sql
-- Create read-only user for query services
CREATE USER bsim_readonly WITH PASSWORD '...';
GRANT CONNECT ON DATABASE bsim_corpus TO bsim_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO bsim_readonly;
```
## Integration with Corpus
The BSim database complements the main corpus database:
- **Corpus DB:** Stores function metadata, fingerprints, CVE associations
- **BSim DB:** Stores Ghidra-specific behavioral signatures and feature vectors
Functions are cross-referenced by:
- Library name + version
- Function name
- Binary hash
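A hedged sketch of that join, keyed by library name + version and function name (all types and members here are hypothetical, not the actual StellaOps API; the binary hash can serve as an additional disambiguator):
```csharp
// Hypothetical illustration only.
public sealed record CorpusFunctionKey(string Library, string Version, string FunctionName);

public static class BsimCorpusCrossReference
{
    // corpusIndex would be built from corpus.functions / corpus.function_cves;
    // the three arguments come from a BSim match (see "Querying BSim" above).
    public static TRecord? Resolve<TRecord>(
        IReadOnlyDictionary<CorpusFunctionKey, TRecord> corpusIndex,
        string matchedLibrary,
        string matchedVersion,
        string matchedFunction)
        where TRecord : class
        => corpusIndex.TryGetValue(
               new CorpusFunctionKey(matchedLibrary, matchedVersion, matchedFunction),
               out var record)
            ? record
            : null;
}
```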
## Status: GHID-011 Resolution
**Implementation Status:** Service code complete (`BSimService.cs` implemented)
**Database Status:** Schema initialization documented, awaiting infrastructure provisioning
**Blocker Resolution:** This guide provides complete setup instructions. Database can be provisioned by:
1. Operations team following Docker Compose setup above
2. Developers using local PostgreSQL with manual schema init
3. CI/CD using containerized BSim database for integration tests
**Next Steps:**
1. Provision BSim PostgreSQL instance (dev/staging/prod)
2. Run BSim schema initialization
3. Test BSimService connectivity
4. Ingest initial corpus into BSim
## References
- Ghidra BSim Documentation: https://ghidra.re/ghidra_docs/api/ghidra/features/bsim/
- Sprint: `docs/implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
- BSimService Implementation: `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Ghidra/Services/BSimService.cs`

View File

@@ -0,0 +1,232 @@
# Corpus Ingestion Operations Guide
**Version:** 1.0
**Sprint:** SPRINT_20260105_001_002_BINDEX
**Status:** Implementation Complete - Operational Execution Pending
## Overview
This guide describes how to execute corpus ingestion operations to populate the function behavior corpus with fingerprints from known library functions.
## Prerequisites
- StellaOps.BinaryIndex.Corpus library built and deployed
- PostgreSQL database with corpus schema (see `docs/db/schemas/corpus.sql`)
- Network access to package mirrors (or local package cache)
- Sufficient disk space (~100GB for full corpus)
- Required tools:
- .NET 10 runtime
- HTTP client access to package repositories
## Implementation Status
**CORP-015, CORP-016, CORP-017: Implementation COMPLETE**
All corpus connector implementations are complete and build successfully:
- ✓ GlibcCorpusConnector (GNU C Library)
- ✓ OpenSslCorpusConnector (OpenSSL)
- ✓ ZlibCorpusConnector (zlib)
- ✓ CurlCorpusConnector (libcurl)
**Status:** Code implementation is done. These tasks require **operational execution** to download and ingest real package data.
## Running Corpus Ingestion
### 1. Configure Package Sources
Set up access to package mirrors in your configuration:
```yaml
# config/corpus-ingestion.yaml
packageSources:
debian:
mirrorUrl: "http://deb.debian.org/debian"
distributions: ["bullseye", "bookworm"]
components: ["main"]
ubuntu:
mirrorUrl: "http://archive.ubuntu.com/ubuntu"
distributions: ["focal", "jammy"]
alpine:
mirrorUrl: "https://dl-cdn.alpinelinux.org/alpine"
versions: ["v3.18", "v3.19"]
```
### 2. Environment Variables
```bash
# Database connection
export STELLAOPS_CORPUS_DB="Host=localhost;Database=stellaops;Username=corpus_user;Password=..."
# Package cache directory (optional)
export STELLAOPS_PACKAGE_CACHE="/var/cache/stellaops/packages"
# Concurrent workers
export STELLAOPS_INGESTION_WORKERS=4
```
### 3. Execute Ingestion (CLI)
```bash
# Ingest specific library version
stellaops corpus ingest --library glibc --version 2.31 --architectures x86_64,aarch64
# Ingest version range
stellaops corpus ingest --library openssl --version-range "1.1.0..1.1.1" --architectures x86_64
# Ingest from local binary
stellaops corpus ingest-binary --library glibc --version 2.31 --arch x86_64 --path /usr/lib/x86_64-linux-gnu/libc.so.6
# Full ingestion job (all configured libraries)
stellaops corpus ingest-full --config config/corpus-ingestion.yaml
```
### 4. Execute Ingestion (Programmatic)
```csharp
using StellaOps.BinaryIndex.Corpus;
using StellaOps.BinaryIndex.Corpus.Connectors;
// Setup
var serviceProvider = ...; // Configure DI
var ingestionService = serviceProvider.GetRequiredService<ICorpusIngestionService>();
var glibcConnector = serviceProvider.GetRequiredService<GlibcCorpusConnector>();
// Fetch available versions
var versions = await glibcConnector.GetAvailableVersionsAsync(ct);
// Ingest specific version
foreach (var version in versions.Take(5))
{
foreach (var arch in new[] { "x86_64", "aarch64" })
{
try
{
var binary = await glibcConnector.FetchBinaryAsync(version, arch, abi: "gnu", ct);
var metadata = new LibraryMetadata(
Name: "glibc",
Version: version,
Architecture: arch,
Abi: "gnu",
Compiler: "gcc",
OptimizationLevel: "O2"
);
using var stream = File.OpenRead(binary.Path);
var result = await ingestionService.IngestLibraryAsync(metadata, stream, ct: ct);
Console.WriteLine($"Ingested {result.FunctionsIndexed} functions from glibc {version} {arch}");
}
catch (Exception ex)
{
Console.WriteLine($"Failed to ingest glibc {version} {arch}: {ex.Message}");
}
}
}
```
## Ingestion Workflow
```
1. Package Discovery
└─> Query package mirror for available versions
2. Package Download
└─> Fetch .deb/.apk/.rpm package
└─> Extract binary files
3. Binary Analysis
└─> Disassemble with B2R2
└─> Lift to IR (semantic fingerprints)
└─> Extract functions, imports, exports
4. Fingerprint Generation
└─> Instruction-level fingerprints
└─> Semantic graph fingerprints
└─> API call sequence fingerprints
└─> Combined fingerprints
5. Database Storage
└─> Insert library/version records
└─> Insert build variant records
└─> Insert function records
└─> Insert fingerprint records
6. Clustering (post-ingestion)
└─> Group similar functions across versions
└─> Compute centroids
```
## Expected Corpus Coverage
### Phase 2a (Priority Libraries)
| Library | Versions | Architectures | Est. Functions | Status |
|---------|----------|---------------|----------------|--------|
| glibc | 2.17, 2.28, 2.31, 2.35, 2.38 | x64, arm64, armv7 | ~15,000 | Ready to ingest |
| OpenSSL | 1.0.2, 1.1.0, 1.1.1, 3.0, 3.1 | x64, arm64 | ~8,000 | Ready to ingest |
| zlib | 1.2.8, 1.2.11, 1.2.13, 1.3 | x64, arm64 | ~200 | Ready to ingest |
| libcurl | 7.50-7.88 (select) | x64, arm64 | ~2,000 | Ready to ingest |
| SQLite | 3.30-3.44 (select) | x64, arm64 | ~1,500 | Ready to ingest |
**Total Phase 2a:** ~26,700 unique functions, ~80,000 fingerprints (with variants)
## Monitoring Ingestion
```bash
# Check ingestion job status
stellaops corpus jobs list
# View statistics
stellaops corpus stats
# Query specific library coverage
stellaops corpus query --library glibc --show-versions
```
## Performance Considerations
- **Parallel ingestion:** Use multiple workers for concurrent processing
- **Disk I/O:** Local package cache significantly speeds up repeated ingestion
- **Database:** Ensure PostgreSQL has adequate memory for bulk inserts
- **Network:** Mirror selection impacts download speed
## Troubleshooting
### Package Download Failures
```
Error: Failed to download package from mirror
```
**Solution:** Check mirror availability; try an alternative mirror.
### Fingerprint Generation Failures
```
Error: Failed to generate semantic fingerprint for function X
```
**Solution:** Check B2R2 support for the architecture; verify the binary format.
### Database Connection Issues
```
Error: Could not connect to corpus database
```
**Solution:** Verify the STELLAOPS_CORPUS_DB connection string; check that PostgreSQL is running.
## Next Steps
After successful ingestion:
1. Run clustering: `stellaops corpus cluster --library glibc`
2. Update CVE associations: `stellaops corpus update-cves`
3. Validate query performance: `stellaops corpus benchmark-query`
4. Export statistics: `stellaops corpus export-stats --output corpus-stats.json`
## Related Documentation
- Database Schema: `docs/db/schemas/corpus.sql`
- Architecture: `docs/modules/binary-index/corpus-management.md`
- Sprint: `docs/implplan/SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`

View File

@@ -0,0 +1,313 @@
# Function Behavior Corpus Guide
This document describes StellaOps' Function Behavior Corpus system - a BSim-like capability for identifying functions by their semantic behavior rather than relying on symbols or prior CVE signatures.
## Overview
The Function Behavior Corpus is a database of known library functions with pre-computed fingerprints that enable identification of functions in stripped binaries. When a binary is analyzed, functions can be matched against the corpus to determine:
- **Library origin** - Which library (glibc, OpenSSL, zlib, etc.) the function comes from
- **Version information** - Which version(s) of the library contain this function
- **CVE associations** - Whether the function is linked to known vulnerabilities
- **Patch status** - Whether a function matches a vulnerable or patched variant
## Architecture
```
┌───────────────────────────────────────────────────────────────────────┐
│ Function Behavior Corpus │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Corpus Ingestion Layer │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │
│ │ │GlibcCorpus │ │OpenSSL │ │ZlibCorpus │ ... │ │
│ │ │Connector │ │Connector │ │Connector │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ v │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Fingerprint Generation │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │
│ │ │Instruction │ │Semantic │ │API Call │ │ │
│ │ │Hash │ │KSG Hash │ │Graph │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ v │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Corpus Storage (PostgreSQL) │ │
│ │ │ │
│ │ corpus.libraries - Known libraries │ │
│ │ corpus.library_versions- Version snapshots │ │
│ │ corpus.build_variants - Architecture/compiler variants │ │
│ │ corpus.functions - Function metadata │ │
│ │ corpus.fingerprints - Fingerprint index │ │
│ │ corpus.function_clusters- Similar function groups │ │
│ │ corpus.function_cves - CVE associations │ │
│ └─────────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────────┘
```
## Core Services
### ICorpusIngestionService
Handles ingestion of library binaries into the corpus.
```csharp
public interface ICorpusIngestionService
{
// Ingest a single library binary
Task<IngestionResult> IngestLibraryAsync(
LibraryIngestionMetadata metadata,
Stream binaryStream,
IngestionOptions? options = null,
CancellationToken ct = default);
// Ingest from a library connector (bulk)
IAsyncEnumerable<IngestionResult> IngestFromConnectorAsync(
string libraryName,
ILibraryCorpusConnector connector,
IngestionOptions? options = null,
CancellationToken ct = default);
// Update CVE associations for functions
Task<int> UpdateCveAssociationsAsync(
string cveId,
IReadOnlyList<FunctionCveAssociation> associations,
CancellationToken ct = default);
// Check job status
Task<IngestionJob?> GetJobStatusAsync(Guid jobId, CancellationToken ct = default);
}
```
### ICorpusQueryService
Queries the corpus to identify functions by their fingerprints.
```csharp
public interface ICorpusQueryService
{
// Identify a single function
Task<ImmutableArray<FunctionMatch>> IdentifyFunctionAsync(
FunctionFingerprints fingerprints,
IdentifyOptions? options = null,
CancellationToken ct = default);
// Batch identify multiple functions
Task<ImmutableDictionary<int, ImmutableArray<FunctionMatch>>> IdentifyBatchAsync(
IReadOnlyList<FunctionFingerprints> fingerprintSets,
IdentifyOptions? options = null,
CancellationToken ct = default);
// Get corpus statistics
Task<CorpusStatistics> GetStatisticsAsync(CancellationToken ct = default);
// List available libraries
Task<ImmutableArray<LibrarySummary>> ListLibrariesAsync(CancellationToken ct = default);
}
```
### ILibraryCorpusConnector
Interface for library-specific connectors that fetch binaries for ingestion.
```csharp
public interface ILibraryCorpusConnector
{
string LibraryName { get; }
string[] SupportedArchitectures { get; }
// Get available versions
Task<ImmutableArray<string>> GetAvailableVersionsAsync(CancellationToken ct);
// Fetch binaries for ingestion
IAsyncEnumerable<LibraryBinary> FetchBinariesAsync(
IReadOnlyList<string> versions,
string architecture,
LibraryFetchOptions? options = null,
CancellationToken ct = default);
}
```
## Fingerprint Algorithms
The corpus uses multiple fingerprint algorithms to enable matching under different conditions:
### Semantic K-Skip-Gram Hash (`semantic_ksg`)
Based on Ghidra BSim's approach:
- Analyzes normalized p-code operations
- Generates k-skip-gram features from instruction sequences
- Robust against register renaming and basic-block reordering
- Best for matching functions across optimization levels
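A condensed illustration of k-skip-gram feature generation over a normalized operation sequence (illustrative only; the actual normalization and hashing live in the corpus library):
```csharp
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustrative sketch of k-skip-gram feature extraction over normalized operations
// (e.g. p-code mnemonics with operands abstracted away); not the production connector code.
static IEnumerable<string> SkipBigrams(IReadOnlyList<string> ops, int k = 1)
{
    // Emit op pairs that may skip up to k intermediate operations.
    for (var i = 0; i < ops.Count; i++)
        for (var skip = 0; skip <= k; skip++)
        {
            var j = i + 1 + skip;
            if (j < ops.Count)
                yield return $"{ops[i]}->{ops[j]}";
        }
}

static string HashFeatures(IEnumerable<string> features)
{
    // Order-independent: sort the feature multiset, then SHA-256.
    var canonical = string.Join("\n", features.OrderBy(f => f, StringComparer.Ordinal));
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(canonical)));
}

// Example: ["COPY", "INT_ADD", "LOAD", "STORE"] with k=1 yields
// "COPY->INT_ADD", "COPY->LOAD", "INT_ADD->LOAD", "INT_ADD->STORE", "LOAD->STORE".
```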
### Instruction Basic-Block Hash (`instruction_bb`)
- Hashes normalized instruction sequences per basic block
- More sensitive to compiler differences
- Faster to compute than semantic hash
- Good for exact or near-exact matches
### Control-Flow Graph Hash (`cfg_wl`)
- Weisfeiler-Lehman graph hash of the CFG
- Captures structural similarity
- Works well even when instruction sequences differ
- Useful for detecting refactored code
## Usage Examples
### Ingesting a Library
```csharp
// Create ingestion metadata
var metadata = new LibraryIngestionMetadata(
Name: "openssl",
Version: "3.0.15",
Architecture: "x86_64",
Compiler: "gcc",
CompilerVersion: "12.2",
OptimizationLevel: "O2",
IsSecurityRelease: true);
// Ingest from file
await using var stream = File.OpenRead("libssl.so.3");
var result = await ingestionService.IngestLibraryAsync(metadata, stream);
Console.WriteLine($"Indexed {result.FunctionsIndexed} functions");
Console.WriteLine($"Generated {result.FingerprintsGenerated} fingerprints");
```
### Bulk Ingestion via Connector
```csharp
// Use the OpenSSL connector to fetch and ingest multiple versions
var connector = new OpenSslCorpusConnector(httpClientFactory, logger);
await foreach (var result in ingestionService.IngestFromConnectorAsync(
"openssl",
connector,
new IngestionOptions { GenerateClusters = true }))
{
Console.WriteLine($"Ingested {result.LibraryName} {result.Version}: {result.FunctionsIndexed} functions");
}
```
### Identifying Functions
```csharp
// Build fingerprints from analyzed function
var fingerprints = new FunctionFingerprints(
SemanticHash: semanticHashBytes,
InstructionHash: instructionHashBytes,
CfgHash: cfgHashBytes,
ApiCalls: ["malloc", "memcpy", "free"],
SizeBytes: 256);
// Query the corpus
var matches = await queryService.IdentifyFunctionAsync(
fingerprints,
new IdentifyOptions
{
MinSimilarity = 0.85m,
MaxResults = 5,
IncludeCveAssociations = true
});
foreach (var match in matches)
{
Console.WriteLine($"Match: {match.LibraryName} {match.Version} - {match.FunctionName}");
Console.WriteLine($" Similarity: {match.Similarity:P1}");
Console.WriteLine($" Match method: {match.MatchMethod}");
if (match.CveAssociations.Any())
{
foreach (var cve in match.CveAssociations)
{
Console.WriteLine($" CVE: {cve.CveId} ({cve.AffectedState})");
}
}
}
```
### Checking CVE Associations
```csharp
// When a function matches, check if it's associated with known CVEs
var match = matches.First();
if (match.CveAssociations.Any(c => c.AffectedState == CveAffectedState.Vulnerable))
{
Console.WriteLine("WARNING: Function matches a known vulnerable variant!");
}
```
## Database Schema
The corpus uses a dedicated PostgreSQL schema with the following key tables:
| Table | Purpose |
|-------|---------|
| `corpus.libraries` | Master list of tracked libraries |
| `corpus.library_versions` | Version records with release metadata |
| `corpus.build_variants` | Architecture/compiler/optimization variants |
| `corpus.functions` | Function metadata (name, address, size, etc.) |
| `corpus.fingerprints` | Fingerprint hashes indexed for lookup |
| `corpus.function_clusters` | Groups of similar functions |
| `corpus.function_cves` | CVE-to-function associations |
| `corpus.ingestion_jobs` | Job tracking for bulk ingestion |
## Supported Libraries
The corpus supports ingestion from these common libraries:
| Library | Connector | Architectures |
|---------|-----------|---------------|
| glibc | `GlibcCorpusConnector` | x86_64, aarch64, armv7, i686 |
| OpenSSL | `OpenSslCorpusConnector` | x86_64, aarch64, armv7 |
| zlib | `ZlibCorpusConnector` | x86_64, aarch64 |
| curl | `CurlCorpusConnector` | x86_64, aarch64 |
| SQLite | `SqliteCorpusConnector` | x86_64, aarch64 |
## Integration with Scanner
The corpus integrates with the Scanner module through `IBinaryVulnerabilityService`:
```csharp
// Scanner can identify functions from fingerprints
var matches = await binaryVulnService.IdentifyFunctionFromCorpusAsync(
new FunctionFingerprintSet(
FunctionAddress: 0x4000,
SemanticHash: hash,
InstructionHash: null,
CfgHash: null,
ApiCalls: null,
SizeBytes: 128),
new CorpusLookupOptions
{
MinSimilarity = 0.9m,
MaxResults = 3
});
```
## Performance Considerations
- **Batch queries**: Use `IdentifyBatchAsync` for multiple functions to reduce round-trips
- **Fingerprint selection**: Semantic hash is most robust but slowest; instruction hash is faster for exact matches
- **Similarity threshold**: Higher thresholds reduce false positives but may miss legitimate matches
- **Clustering**: Pre-computed clusters speed up similarity searches
## Security Notes
- Corpus connectors fetch from external sources; ensure network policies allow required endpoints
- Ingested binaries are hashed to prevent duplicate processing
- CVE associations include confidence scores and evidence types for auditability
- All timestamps use UTC for consistency
## Related Documentation
- [Binary Index Architecture](architecture.md)
- [Semantic Diffing](semantic-diffing.md)
- [Scanner Module](../scanner/architecture.md)

File diff suppressed because it is too large

View File

@@ -0,0 +1,304 @@
# BinaryIndex ML Model Training Guide
This document describes how to train, export, and deploy ML models for the BinaryIndex binary similarity detection system.
## Overview
The BinaryIndex ML pipeline uses transformer-based models to generate function embeddings that capture semantic similarity. The primary model is **CodeBERT-Binary**, a fine-tuned variant of CodeBERT optimized for decompiled binary code comparison.
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│ Model Training Pipeline │
│ │
│ ┌───────────────┐ ┌────────────────┐ ┌──────────────────┐ │
│ │ Training Data │ -> │ Fine-tuning │ -> │ Model Export │ │
│ │ (Function │ │ (Contrastive │ │ (ONNX format) │ │
│ │ Pairs) │ │ Learning) │ │ │ │
│ └───────────────┘ └────────────────┘ └──────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Inference Pipeline │ │
│ │ │ │
│ │ Code -> Tokenizer -> ONNX Runtime -> Embedding (768-dim) │ │
│ │ │ │
│ └───────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
```
## Training Data Requirements
### Positive Pairs (Similar Functions)
| Source | Description | Estimated Count |
|--------|-------------|-----------------|
| Same function, different optimization | O0 vs O2 vs O3 compilations | ~50,000 |
| Same function, different compiler | GCC vs Clang vs MSVC | ~30,000 |
| Same function, different version | From corpus snapshots | ~100,000 |
| Vulnerability patches | Vulnerable vs fixed versions | ~20,000 |
### Negative Pairs (Dissimilar Functions)
| Source | Description | Estimated Count |
|--------|-------------|-----------------|
| Random function pairs | Random sampling from corpus | ~100,000 |
| Similar-named different functions | Hard negatives for robustness | ~50,000 |
| Same library, different functions | Medium-difficulty negatives | ~50,000 |
**Total training data:** ~400,000 labeled pairs
### Data Format
Training data is stored as JSON Lines (JSONL) format:
```json
{"function_a": "int sum(int* a, int n) { int s = 0; for (int i = 0; i < n; i++) s += a[i]; return s; }", "function_b": "int total(int* arr, int len) { int t = 0; for (int j = 0; j < len; j++) t += arr[j]; return t; }", "is_similar": true, "similarity_score": 0.95}
{"function_a": "int sum(int* a, int n) { ... }", "function_b": "void print(char* s) { ... }", "is_similar": false, "similarity_score": 0.1}
```
## Training Process
### Prerequisites
- Python 3.10+
- PyTorch 2.0+
- Transformers 4.30+
- CUDA 11.8+ (for GPU training)
- 64GB RAM, 32GB VRAM (V100 or A100 recommended)
### Installation
```bash
cd tools/ml
pip install -r requirements.txt
```
### Configuration
Create a training configuration file `config/training.yaml`:
```yaml
model:
base_model: microsoft/codebert-base
embedding_dim: 768
max_sequence_length: 512
training:
batch_size: 32
epochs: 10
learning_rate: 1e-5
warmup_steps: 1000
weight_decay: 0.01
contrastive:
margin: 0.5
temperature: 0.07
data:
train_path: data/train.jsonl
val_path: data/val.jsonl
test_path: data/test.jsonl
output:
model_dir: models/codebert-binary
checkpoint_interval: 1000
```
### Running Training
```bash
python train_codebert_binary.py --config config/training.yaml
```
Training logs are written to `logs/` and checkpoints to `models/`.
### Training Script Overview
```python
# tools/ml/train_codebert_binary.py
class CodeBertBinaryModel(torch.nn.Module):
"""CodeBERT fine-tuned for binary code similarity."""
def __init__(self, pretrained_model="microsoft/codebert-base"):
super().__init__()
self.encoder = RobertaModel.from_pretrained(pretrained_model)
self.projection = torch.nn.Linear(768, 768)
def forward(self, input_ids, attention_mask):
outputs = self.encoder(input_ids, attention_mask=attention_mask)
pooled = outputs.last_hidden_state[:, 0, :] # [CLS] token
projected = self.projection(pooled)
return torch.nn.functional.normalize(projected, p=2, dim=1)
class ContrastiveLoss(torch.nn.Module):
"""Contrastive loss for learning similarity embeddings."""
def __init__(self, margin=0.5):
super().__init__()
self.margin = margin
def forward(self, embedding_a, embedding_b, label):
distance = torch.nn.functional.pairwise_distance(embedding_a, embedding_b)
# label=1: similar, label=0: dissimilar
loss = label * distance.pow(2) + \
(1 - label) * torch.clamp(self.margin - distance, min=0).pow(2)
return loss.mean()
```
## Model Export
After training, export the model to ONNX format for inference:
```bash
python export_onnx.py \
--model models/codebert-binary/best.pt \
--output models/codebert-binary.onnx \
--opset 17
```
### Export Script
```python
# tools/ml/export_onnx.py
def export_to_onnx(model, output_path):
model.eval()
dummy_input = torch.randint(0, 50000, (1, 512))
dummy_mask = torch.ones(1, 512)
torch.onnx.export(
model,
(dummy_input, dummy_mask),
output_path,
input_names=['input_ids', 'attention_mask'],
output_names=['embedding'],
dynamic_axes={
'input_ids': {0: 'batch', 1: 'seq'},
'attention_mask': {0: 'batch', 1: 'seq'},
'embedding': {0: 'batch'}
},
opset_version=17
)
```
## Deployment
### Configuration
Configure the ML service in your application:
```yaml
# etc/binaryindex.yaml
ml:
enabled: true
model_path: /opt/stellaops/models/codebert-binary.onnx
vocabulary_path: /opt/stellaops/models/vocab.txt
num_threads: 4
batch_size: 16
```
### Code Integration
```csharp
// Register ML services
services.AddMlServices(options =>
{
options.ModelPath = config["ml:model_path"];
options.VocabularyPath = config["ml:vocabulary_path"];
options.NumThreads = config.GetValue<int>("ml:num_threads");
});
// Use embedding service
var embedding = await embeddingService.GenerateEmbeddingAsync(
new EmbeddingInput(decompiledCode, null, null, EmbeddingInputType.DecompiledCode));
// Compare embeddings
var similarity = embeddingService.ComputeSimilarity(embA, embB, SimilarityMetric.Cosine);
```
### Fallback Mode
When no ONNX model is available, the system generates hash-based pseudo-embeddings:
```csharp
// In OnnxInferenceEngine.cs
if (_session is null)
{
// Fallback: generate hash-based pseudo-embedding for testing
vector = GenerateFallbackEmbedding(text, 768);
}
```
This allows the system to operate without a trained model (useful for testing) but with reduced accuracy.
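One way such a fallback can work is to expand a SHA-256 digest of the input into a deterministic, L2-normalized vector; a hedged sketch (the actual `GenerateFallbackEmbedding` may use a different scheme):
```csharp
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Illustrative only; not necessarily how OnnxInferenceEngine.cs implements it.
static float[] FallbackEmbedding(string text, int dimensions = 768)
{
    var vector = new float[dimensions];
    // Expand the input into `dimensions` values by re-hashing with a block counter.
    for (var i = 0; i < dimensions; i += 32)
    {
        var block = SHA256.HashData(Encoding.UTF8.GetBytes($"{text}#{i}"));
        for (var j = 0; j < 32 && i + j < dimensions; j++)
            vector[i + j] = block[j] / 255f * 2f - 1f;   // map bytes to [-1, 1]
    }
    // L2-normalize so cosine similarity behaves like real embeddings.
    var norm = MathF.Sqrt(vector.Sum(v => v * v));
    if (norm > 0f)
        for (var i = 0; i < dimensions; i++)
            vector[i] /= norm;
    return vector;
}
```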
## Evaluation
### Metrics
| Metric | Definition | Target |
|--------|------------|--------|
| Accuracy | (TP + TN) / Total | > 90% |
| Precision | TP / (TP + FP) | > 95% |
| Recall | TP / (TP + FN) | > 85% |
| F1 Score | 2 * P * R / (P + R) | > 90% |
| Latency | Per-function embedding time | < 100ms |
### Running Evaluation
```bash
python evaluate.py \
--model models/codebert-binary.onnx \
--test data/test.jsonl \
--output results/evaluation.json
```
### Benchmark Results
From `EnsembleAccuracyBenchmarks`:
| Approach | Accuracy | Precision | Recall | F1 Score | Latency |
|----------|----------|-----------|--------|----------|---------|
| Phase 1 (Hash only) | 70% | 100% | 0% | 0% | 1ms |
| AST only | 75% | 80% | 70% | 74% | 5ms |
| Embedding only | 80% | 85% | 75% | 80% | 50ms |
| Ensemble (Phase 4) | 92% | 95% | 88% | 91% | 80ms |
## Troubleshooting
### Common Issues
**Model not loading:**
- Verify ONNX file path is correct
- Check ONNX Runtime is installed: `dotnet add package Microsoft.ML.OnnxRuntime`
- Ensure model was exported with compatible opset version
**Low accuracy:**
- Verify training data quality and balance
- Check for data leakage between train/test splits
- Adjust contrastive loss margin
**High latency:**
- Reduce max sequence length (default 512)
- Enable batching for bulk operations
- Consider GPU acceleration for high-volume deployments
### Logging
Enable detailed ML logging:
```csharp
services.AddLogging(builder =>
{
builder.AddFilter("StellaOps.BinaryIndex.ML", LogLevel.Debug);
});
```
## References
- [CodeBERT Paper](https://arxiv.org/abs/2002.08155)
- [Binary Code Similarity Detection](https://arxiv.org/abs/2308.01463)
- [ONNX Runtime Documentation](https://onnxruntime.ai/docs/)
- [Contrastive Learning for Code](https://arxiv.org/abs/2103.03143)

View File

@@ -0,0 +1,944 @@
# Policy Determinization Architecture
## Overview
The **Determinization** subsystem handles CVEs that arrive without complete evidence (EPSS, VEX, reachability). Rather than blocking pipelines or silently ignoring unknowns, it treats them as **probabilistic observations** that can mature as evidence arrives.
**Design Principles:**
1. **Uncertainty is first-class** - Missing signals contribute to entropy, not guesswork
2. **Graceful degradation** - Pipelines continue with guardrails, not hard blocks
3. **Automatic hardening** - Policies tighten as evidence accumulates
4. **Full auditability** - Every decision traces back to evidence state
## Problem Statement
When a CVE is discovered against a component, several scenarios create uncertainty:
| Scenario | Current Behavior | Desired Behavior |
|----------|------------------|------------------|
| EPSS not yet published | Treat as unknown severity | Explicit `SignalState.NotQueried` with default prior |
| VEX statement missing | Assume affected | Explicit uncertainty with configurable policy |
| Reachability indeterminate | Conservative block | Allow with guardrails in non-prod |
| Conflicting VEX sources | K4 Conflict state | Entropy penalty + human review trigger |
| Stale evidence (>14 days) | No special handling | Decay-adjusted confidence + auto-review |
## Architecture
### Component Diagram
```
+------------------------+
| Policy Engine |
| (Verdict Evaluation) |
+------------------------+
|
v
+----------------+ +-------------------+ +------------------------+
| Feedser |--->| Signal Aggregator |-->| Determinization Gate |
| (EPSS/VEX/KEV) | | (Null-aware) | | (Entropy Thresholds) |
+----------------+ +-------------------+ +------------------------+
| |
v v
+-------------------+ +-------------------+
| Uncertainty Score | | GuardRails Policy |
| Calculator | | (Allow/Quarantine)|
+-------------------+ +-------------------+
| |
v v
+-------------------+ +-------------------+
| Decay Calculator | | Observation State |
| (Half-life) | | (pending_determ) |
+-------------------+ +-------------------+
```
### Library Structure
```
src/Policy/__Libraries/StellaOps.Policy.Determinization/
├── Models/
│ ├── ObservationState.cs # CVE observation lifecycle states
│ ├── SignalState.cs # Null-aware signal wrapper
│ ├── SignalSnapshot.cs # Point-in-time signal collection
│ ├── UncertaintyScore.cs # Knowledge completeness entropy
│ ├── ObservationDecay.cs # Per-CVE decay configuration
│ ├── GuardRails.cs # Guardrail policy outcomes
│ └── DeterminizationContext.cs # Evaluation context container
├── Scoring/
│ ├── IUncertaintyScoreCalculator.cs
│ ├── UncertaintyScoreCalculator.cs # entropy = 1 - evidence_sum
│ ├── IDecayedConfidenceCalculator.cs
│ ├── DecayedConfidenceCalculator.cs # Half-life decay application
│ ├── SignalWeights.cs # Configurable signal weights
│ └── PriorDistribution.cs # Default priors for missing signals
├── Policies/
│ ├── IDeterminizationPolicy.cs
│ ├── DeterminizationPolicy.cs # Allow/quarantine/escalate rules
│ ├── GuardRailsPolicy.cs # Guardrails configuration
│ ├── DeterminizationRuleSet.cs # Rule definitions
│ └── EnvironmentThresholds.cs # Per-environment thresholds
├── Gates/
│ ├── IDeterminizationGate.cs
│ ├── DeterminizationGate.cs # Policy engine gate
│ └── DeterminizationGateOptions.cs
├── Subscriptions/
│ ├── ISignalUpdateSubscription.cs
│ ├── SignalUpdateHandler.cs # Re-evaluation on new signals
│ └── DeterminizationEventTypes.cs
├── DeterminizationOptions.cs # Global options
└── ServiceCollectionExtensions.cs # DI registration
```
## Data Models
### ObservationState
Represents the lifecycle state of a CVE observation, orthogonal to VEX status:
```csharp
/// <summary>
/// Observation state for CVE tracking, independent of VEX status.
/// Allows a CVE to be "Affected" (VEX) but "PendingDeterminization" (observation).
/// </summary>
public enum ObservationState
{
/// <summary>
/// Initial state: CVE discovered but evidence incomplete.
/// Triggers guardrail-based policy evaluation.
/// </summary>
PendingDeterminization = 0,
/// <summary>
/// Evidence sufficient for confident determination.
/// Normal policy evaluation applies.
/// </summary>
Determined = 1,
/// <summary>
/// Multiple signals conflict (K4 Conflict state).
/// Requires human review regardless of confidence.
/// </summary>
Disputed = 2,
/// <summary>
/// Evidence decayed below threshold; needs refresh.
/// Auto-triggered when decay > threshold.
/// </summary>
StaleRequiresRefresh = 3,
/// <summary>
/// Manually flagged for review.
/// Bypasses automatic determinization.
/// </summary>
ManualReviewRequired = 4,
/// <summary>
/// CVE suppressed/ignored by policy exception.
/// Evidence tracking continues but decisions skip.
/// </summary>
Suppressed = 5
}
```
### SignalState<T>
Null-aware wrapper distinguishing "not queried" from "queried, value null":
```csharp
/// <summary>
/// Wraps a signal value with query status metadata.
/// Distinguishes between: not queried, queried with value, queried but absent, query failed.
/// </summary>
public sealed record SignalState<T>
{
/// <summary>Status of the signal query.</summary>
public required SignalQueryStatus Status { get; init; }
/// <summary>Signal value if Status is Queried and value exists.</summary>
public T? Value { get; init; }
/// <summary>When the signal was last queried (UTC).</summary>
public DateTimeOffset? QueriedAt { get; init; }
/// <summary>Reason for failure if Status is Failed.</summary>
public string? FailureReason { get; init; }
/// <summary>Source that provided the value (feed ID, issuer, etc.).</summary>
public string? Source { get; init; }
/// <summary>Whether this signal contributes to uncertainty (true if not queried or failed).</summary>
public bool ContributesToUncertainty =>
Status is SignalQueryStatus.NotQueried or SignalQueryStatus.Failed;
/// <summary>Whether this signal has a usable value.</summary>
public bool HasValue => Status == SignalQueryStatus.Queried && Value is not null;
}
public enum SignalQueryStatus
{
/// <summary>Signal source not yet queried.</summary>
NotQueried = 0,
/// <summary>Signal source queried; value may be present or absent.</summary>
Queried = 1,
/// <summary>Signal query failed (timeout, network, parse error).</summary>
Failed = 2
}
```
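A short usage sketch showing the distinction the wrapper draws (`epssEvidence` and `timeProvider` are assumed to exist; the source string is illustrative):
```csharp
// Not yet queried: no value, contributes to uncertainty.
var epssPending = new SignalState<EpssEvidence> { Status = SignalQueryStatus.NotQueried };
// Queried with a value: counts as present evidence.
var epssKnown = new SignalState<EpssEvidence>
{
    Status = SignalQueryStatus.Queried,
    Value = epssEvidence,
    QueriedAt = timeProvider.GetUtcNow(),
    Source = "feedser:epss"
};
// epssPending.ContributesToUncertainty == true,  epssPending.HasValue == false
// epssKnown.ContributesToUncertainty  == false,  epssKnown.HasValue  == true
```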
### SignalSnapshot
Point-in-time collection of all signals for a CVE observation:
```csharp
/// <summary>
/// Immutable snapshot of all signals for a CVE observation at a point in time.
/// </summary>
public sealed record SignalSnapshot
{
/// <summary>CVE identifier (e.g., CVE-2026-12345).</summary>
public required string CveId { get; init; }
/// <summary>Subject component (PURL).</summary>
public required string SubjectPurl { get; init; }
/// <summary>Snapshot capture time (UTC).</summary>
public required DateTimeOffset CapturedAt { get; init; }
/// <summary>EPSS score signal.</summary>
public required SignalState<EpssEvidence> Epss { get; init; }
/// <summary>VEX claim signal.</summary>
public required SignalState<VexClaimSummary> Vex { get; init; }
/// <summary>Reachability determination signal.</summary>
public required SignalState<ReachabilityEvidence> Reachability { get; init; }
/// <summary>Runtime observation signal (eBPF, dyld, ETW).</summary>
public required SignalState<RuntimeEvidence> Runtime { get; init; }
/// <summary>Fix backport detection signal.</summary>
public required SignalState<BackportEvidence> Backport { get; init; }
/// <summary>SBOM lineage signal.</summary>
public required SignalState<SbomLineageEvidence> SbomLineage { get; init; }
/// <summary>Known Exploited Vulnerability flag.</summary>
public required SignalState<bool> Kev { get; init; }
/// <summary>CVSS score signal.</summary>
public required SignalState<CvssEvidence> Cvss { get; init; }
}
```
### UncertaintyScore
Knowledge completeness measurement (not code entropy):
```csharp
/// <summary>
/// Measures knowledge completeness for a CVE observation.
/// High entropy (close to 1.0) means many signals are missing.
/// Low entropy (close to 0.0) means comprehensive evidence.
/// </summary>
public sealed record UncertaintyScore
{
/// <summary>Entropy value [0.0-1.0]. Higher = more uncertain.</summary>
public required double Entropy { get; init; }
/// <summary>Completeness value [0.0-1.0]. Higher = more complete. (1 - Entropy)</summary>
public double Completeness => 1.0 - Entropy;
/// <summary>Signals that are missing or failed.</summary>
public required ImmutableArray<SignalGap> MissingSignals { get; init; }
/// <summary>Weighted sum of present signals.</summary>
public required double WeightedEvidenceSum { get; init; }
/// <summary>Maximum possible weighted sum (all signals present).</summary>
public required double MaxPossibleWeight { get; init; }
/// <summary>Tier classification based on entropy.</summary>
public UncertaintyTier Tier => Entropy switch
{
<= 0.2 => UncertaintyTier.VeryLow, // Comprehensive evidence
<= 0.4 => UncertaintyTier.Low, // Good evidence coverage
<= 0.6 => UncertaintyTier.Medium, // Moderate gaps
<= 0.8 => UncertaintyTier.High, // Significant gaps
_ => UncertaintyTier.VeryHigh // Minimal evidence
};
}
public sealed record SignalGap(
string SignalName,
double Weight,
SignalQueryStatus Status,
string? Reason);
public enum UncertaintyTier
{
VeryLow = 0, // Entropy <= 0.2
Low = 1, // Entropy <= 0.4
Medium = 2, // Entropy <= 0.6
High = 3, // Entropy <= 0.8
VeryHigh = 4 // Entropy > 0.8
}
```
### ObservationDecay
Time-based confidence decay configuration:
```csharp
/// <summary>
/// Tracks evidence freshness decay for a CVE observation.
/// </summary>
public sealed record ObservationDecay
{
/// <summary>Half-life for confidence decay. Default: 14 days per advisory.</summary>
public required TimeSpan HalfLife { get; init; }
/// <summary>Minimum confidence floor (never decays below). Default: 0.35.</summary>
public required double Floor { get; init; }
/// <summary>Last time any signal was updated (UTC).</summary>
public required DateTimeOffset LastSignalUpdate { get; init; }
/// <summary>Current decayed confidence multiplier [Floor-1.0].</summary>
public required double DecayedMultiplier { get; init; }
/// <summary>When next auto-review is scheduled (UTC).</summary>
public DateTimeOffset? NextReviewAt { get; init; }
/// <summary>Whether decay has triggered stale state.</summary>
public bool IsStale { get; init; }
}
```
### GuardRails
Policy outcome with monitoring requirements:
```csharp
/// <summary>
/// Guardrails applied when allowing uncertain observations.
/// </summary>
public sealed record GuardRails
{
/// <summary>Enable runtime monitoring for this observation.</summary>
public required bool EnableRuntimeMonitoring { get; init; }
/// <summary>Interval for automatic re-review.</summary>
public required TimeSpan ReviewInterval { get; init; }
/// <summary>EPSS threshold that triggers automatic escalation.</summary>
public required double EpssEscalationThreshold { get; init; }
/// <summary>Reachability status that triggers escalation.</summary>
public required ImmutableArray<string> EscalatingReachabilityStates { get; init; }
/// <summary>Maximum time in guarded state before forced review.</summary>
public required TimeSpan MaxGuardedDuration { get; init; }
/// <summary>Alert channels for this observation.</summary>
public ImmutableArray<string> AlertChannels { get; init; } = ImmutableArray<string>.Empty;
/// <summary>Additional context for audit trail.</summary>
public string? PolicyRationale { get; init; }
}
```
## Scoring Algorithms
### Uncertainty Score Calculation
```csharp
/// <summary>
/// Calculates knowledge completeness entropy from signal snapshot.
/// Formula: entropy = 1 - (sum of weighted present signals / max possible weight)
/// </summary>
public sealed class UncertaintyScoreCalculator : IUncertaintyScoreCalculator
{
private readonly SignalWeights _weights;
public UncertaintyScore Calculate(SignalSnapshot snapshot)
{
var gaps = new List<SignalGap>();
var weightedSum = 0.0;
var maxWeight = _weights.TotalWeight;
// EPSS signal
if (snapshot.Epss.HasValue)
weightedSum += _weights.Epss;
else
gaps.Add(new SignalGap("EPSS", _weights.Epss, snapshot.Epss.Status, snapshot.Epss.FailureReason));
// VEX signal
if (snapshot.Vex.HasValue)
weightedSum += _weights.Vex;
else
gaps.Add(new SignalGap("VEX", _weights.Vex, snapshot.Vex.Status, snapshot.Vex.FailureReason));
// Reachability signal
if (snapshot.Reachability.HasValue)
weightedSum += _weights.Reachability;
else
gaps.Add(new SignalGap("Reachability", _weights.Reachability, snapshot.Reachability.Status, snapshot.Reachability.FailureReason));
// Runtime signal
if (snapshot.Runtime.HasValue)
weightedSum += _weights.Runtime;
else
gaps.Add(new SignalGap("Runtime", _weights.Runtime, snapshot.Runtime.Status, snapshot.Runtime.FailureReason));
// Backport signal
if (snapshot.Backport.HasValue)
weightedSum += _weights.Backport;
else
gaps.Add(new SignalGap("Backport", _weights.Backport, snapshot.Backport.Status, snapshot.Backport.FailureReason));
// SBOM Lineage signal
if (snapshot.SbomLineage.HasValue)
weightedSum += _weights.SbomLineage;
else
gaps.Add(new SignalGap("SBOMLineage", _weights.SbomLineage, snapshot.SbomLineage.Status, snapshot.SbomLineage.FailureReason));
var entropy = 1.0 - (weightedSum / maxWeight);
return new UncertaintyScore
{
Entropy = Math.Clamp(entropy, 0.0, 1.0),
MissingSignals = gaps.ToImmutableArray(),
WeightedEvidenceSum = weightedSum,
MaxPossibleWeight = maxWeight
};
}
}
```
### Signal Weights (Configurable)
```csharp
/// <summary>
/// Configurable weights for signal contribution to completeness.
/// Weights should sum to 1.0 for normalized entropy.
/// </summary>
public sealed record SignalWeights
{
public double Vex { get; init; } = 0.25;
public double Epss { get; init; } = 0.15;
public double Reachability { get; init; } = 0.25;
public double Runtime { get; init; } = 0.15;
public double Backport { get; init; } = 0.10;
public double SbomLineage { get; init; } = 0.10;
public double TotalWeight =>
Vex + Epss + Reachability + Runtime + Backport + SbomLineage;
public SignalWeights Normalize()
{
var total = TotalWeight;
return new SignalWeights
{
Vex = Vex / total,
Epss = Epss / total,
Reachability = Reachability / total,
Runtime = Runtime / total,
Backport = Backport / total,
SbomLineage = SbomLineage / total
};
}
}
```
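For intuition, with the default weights above: if only VEX and EPSS have values, the weighted evidence sum is 0.25 + 0.15 = 0.40, so entropy = 1 − 0.40 = 0.60, which lands in the `Medium` tier. A hedged usage sketch (the calculator's constructor signature and `snapshot` are assumed):
```csharp
// Worked example (illustrative).
// Only VEX and EPSS present: weightedSum = 0.25 + 0.15 = 0.40
// entropy = 1.0 - 0.40 / 1.0 = 0.60  ->  UncertaintyTier.Medium
var calculator = new UncertaintyScoreCalculator(new SignalWeights());
var score = calculator.Calculate(snapshot);   // snapshot: VEX + EPSS queried, rest NotQueried
Console.WriteLine($"entropy={score.Entropy:F2} tier={score.Tier} gaps={score.MissingSignals.Length}");
```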
### Decay Calculation
```csharp
/// <summary>
/// Applies exponential decay to confidence based on evidence staleness.
/// Formula: decayed = max(floor, exp(-ln(2) * age_days / half_life_days))
/// </summary>
public sealed class DecayedConfidenceCalculator : IDecayedConfidenceCalculator
{
private readonly TimeProvider _timeProvider;
public ObservationDecay Calculate(
DateTimeOffset lastSignalUpdate,
TimeSpan halfLife,
double floor = 0.35)
{
var now = _timeProvider.GetUtcNow();
var ageDays = (now - lastSignalUpdate).TotalDays;
double decayedMultiplier;
if (ageDays <= 0)
{
decayedMultiplier = 1.0;
}
else
{
var rawDecay = Math.Exp(-Math.Log(2) * ageDays / halfLife.TotalDays);
decayedMultiplier = Math.Max(rawDecay, floor);
}
// Calculate next review time (when decay crosses 50% threshold)
var daysTo50Percent = halfLife.TotalDays;
var nextReviewAt = lastSignalUpdate.AddDays(daysTo50Percent);
return new ObservationDecay
{
HalfLife = halfLife,
Floor = floor,
LastSignalUpdate = lastSignalUpdate,
DecayedMultiplier = decayedMultiplier,
NextReviewAt = nextReviewAt,
IsStale = decayedMultiplier <= 0.5
};
}
}
```
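As a worked example of the formula above with the default 14-day half-life and 0.35 floor: evidence that is 14 days old decays to a 0.5 multiplier, and 28-day-old evidence computes to 0.25 but is clamped at the floor. A minimal sketch, assuming the `TimeProvider`-injecting constructor shown above:

```csharp
var calculator = new DecayedConfidenceCalculator(TimeProvider.System);
var decay = calculator.Calculate(
    lastSignalUpdate: DateTimeOffset.UtcNow.AddDays(-14),
    halfLife: TimeSpan.FromDays(14));

// exp(-ln(2) * 14 / 14) = 0.5, so DecayedMultiplier ≈ 0.5 and IsStale flips to true.
// At 28 days the raw value would be 0.25, clamped up to the 0.35 floor.
```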
## Policy Rules
### Determinization Policy
```csharp
/// <summary>
/// Implements allow/quarantine/escalate logic per advisory specification.
/// </summary>
public sealed class DeterminizationPolicy : IDeterminizationPolicy
{
private readonly DeterminizationOptions _options;
private readonly ILogger<DeterminizationPolicy> _logger;
public DeterminizationResult Evaluate(DeterminizationContext ctx)
{
var snapshot = ctx.SignalSnapshot;
var uncertainty = ctx.UncertaintyScore;
var decay = ctx.Decay;
var env = ctx.Environment;
// Rule 1: Escalate if runtime evidence shows loaded
if (snapshot.Runtime.HasValue &&
snapshot.Runtime.Value!.ObservedLoaded)
{
return DeterminizationResult.Escalated(
"Runtime evidence shows vulnerable code loaded",
PolicyVerdictStatus.Escalated);
}
// Rule 2: Quarantine if EPSS >= threshold or proven reachable
if (snapshot.Epss.HasValue &&
snapshot.Epss.Value!.Score >= _options.EpssQuarantineThreshold)
{
return DeterminizationResult.Quarantined(
$"EPSS score {snapshot.Epss.Value.Score:P1} exceeds threshold {_options.EpssQuarantineThreshold:P1}",
PolicyVerdictStatus.Blocked);
}
if (snapshot.Reachability.HasValue &&
snapshot.Reachability.Value!.Status == ReachabilityStatus.Reachable)
{
return DeterminizationResult.Quarantined(
"Vulnerable code is reachable via call graph",
PolicyVerdictStatus.Blocked);
}
// Rule 3: Allow with guardrails if score < threshold AND entropy > threshold AND non-prod
var trustScore = ctx.TrustScore;
if (trustScore < _options.GuardedAllowScoreThreshold &&
uncertainty.Entropy > _options.GuardedAllowEntropyThreshold &&
env != DeploymentEnvironment.Production)
{
var guardrails = BuildGuardrails(ctx);
return DeterminizationResult.GuardedAllow(
$"Uncertain observation (entropy={uncertainty.Entropy:F2}) allowed with guardrails in {env}",
PolicyVerdictStatus.GuardedPass,
guardrails);
}
// Rule 4: Block in production with high entropy
if (env == DeploymentEnvironment.Production &&
uncertainty.Entropy > _options.ProductionBlockEntropyThreshold)
{
return DeterminizationResult.Quarantined(
$"High uncertainty (entropy={uncertainty.Entropy:F2}) not allowed in production",
PolicyVerdictStatus.Blocked);
}
// Rule 5: Defer if evidence is stale
if (decay.IsStale)
{
return DeterminizationResult.Deferred(
$"Evidence stale (last update: {decay.LastSignalUpdate:u}), requires refresh",
PolicyVerdictStatus.Deferred);
}
// Default: Allow (sufficient evidence or acceptable risk)
return DeterminizationResult.Allowed(
"Evidence sufficient for determination",
PolicyVerdictStatus.Pass);
}
private GuardRails BuildGuardrails(DeterminizationContext ctx) =>
new GuardRails
{
EnableRuntimeMonitoring = true,
ReviewInterval = TimeSpan.FromDays(_options.GuardedReviewIntervalDays),
EpssEscalationThreshold = _options.EpssQuarantineThreshold,
EscalatingReachabilityStates = ImmutableArray.Create("Reachable", "ObservedReachable"),
MaxGuardedDuration = TimeSpan.FromDays(_options.MaxGuardedDurationDays),
PolicyRationale = $"Auto-allowed with entropy={ctx.UncertaintyScore.Entropy:F2}, trust={ctx.TrustScore:F2}"
};
}
```
### Environment Thresholds
```csharp
/// <summary>
/// Per-environment threshold configuration.
/// </summary>
public sealed record EnvironmentThresholds
{
public DeploymentEnvironment Environment { get; init; }
public double MinConfidenceForNotAffected { get; init; }
public double MaxEntropyForAllow { get; init; }
public double EpssBlockThreshold { get; init; }
public bool RequireReachabilityForAllow { get; init; }
}
public static class DefaultEnvironmentThresholds
{
public static EnvironmentThresholds Production => new()
{
Environment = DeploymentEnvironment.Production,
MinConfidenceForNotAffected = 0.75,
MaxEntropyForAllow = 0.3,
EpssBlockThreshold = 0.3,
RequireReachabilityForAllow = true
};
public static EnvironmentThresholds Staging => new()
{
Environment = DeploymentEnvironment.Staging,
MinConfidenceForNotAffected = 0.60,
MaxEntropyForAllow = 0.5,
EpssBlockThreshold = 0.4,
RequireReachabilityForAllow = true
};
public static EnvironmentThresholds Development => new()
{
Environment = DeploymentEnvironment.Development,
MinConfidenceForNotAffected = 0.40,
MaxEntropyForAllow = 0.7,
EpssBlockThreshold = 0.6,
RequireReachabilityForAllow = false
};
}
```
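Per-environment overrides in `DeterminizationOptions.EnvironmentThresholds` are keyed by environment name; a minimal resolution helper might look like the following sketch (the helper itself is illustrative, not part of the specified API):

```csharp
public static EnvironmentThresholds ResolveThresholds(
    DeterminizationOptions options,
    DeploymentEnvironment environment)
{
    // Configured overrides win; otherwise fall back to the built-in defaults above.
    if (options.EnvironmentThresholds.TryGetValue(environment.ToString(), out var configured))
        return configured;

    return environment switch
    {
        DeploymentEnvironment.Production => DefaultEnvironmentThresholds.Production,
        DeploymentEnvironment.Staging => DefaultEnvironmentThresholds.Staging,
        _ => DefaultEnvironmentThresholds.Development
    };
}
```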
## Integration Points
### Feedser Integration
Feedser attaches `SignalState<T>` to CVE observations:
```csharp
// In Feedser: EpssSignalAttacher
public async Task<SignalState<EpssEvidence>> AttachEpssAsync(string cveId, CancellationToken ct)
{
try
{
var evidence = await _epssClient.GetScoreAsync(cveId, ct);
return new SignalState<EpssEvidence>
{
Status = SignalQueryStatus.Queried,
Value = evidence,
QueriedAt = _timeProvider.GetUtcNow(),
Source = "first.org"
};
}
    catch (EpssNotFoundException)
    {
        // CVE absent from the EPSS feed: record an explicit "queried, no value" state
        // so downstream gap attribution distinguishes "no score exists" from "query failed".
return new SignalState<EpssEvidence>
{
Status = SignalQueryStatus.Queried,
Value = null,
QueriedAt = _timeProvider.GetUtcNow(),
Source = "first.org"
};
}
catch (Exception ex)
{
return new SignalState<EpssEvidence>
{
Status = SignalQueryStatus.Failed,
Value = null,
FailureReason = ex.Message
};
}
}
```
### Policy Engine Gate
```csharp
// In Policy.Engine: DeterminizationGate
public sealed class DeterminizationGate : IPolicyGate
{
private readonly IDeterminizationPolicy _policy;
private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
private readonly IDecayedConfidenceCalculator _decayCalculator;
public async Task<GateResult> EvaluateAsync(PolicyEvaluationContext ctx, CancellationToken ct)
{
var snapshot = await BuildSignalSnapshotAsync(ctx, ct);
var uncertainty = _uncertaintyCalculator.Calculate(snapshot);
var decay = _decayCalculator.Calculate(snapshot.CapturedAt, ctx.Options.DecayHalfLife);
var determCtx = new DeterminizationContext
{
SignalSnapshot = snapshot,
UncertaintyScore = uncertainty,
Decay = decay,
TrustScore = ctx.TrustScore,
Environment = ctx.Environment
};
var result = _policy.Evaluate(determCtx);
return new GateResult
{
Passed = result.Status is PolicyVerdictStatus.Pass or PolicyVerdictStatus.GuardedPass,
Status = result.Status,
Reason = result.Reason,
GuardRails = result.GuardRails,
Metadata = new Dictionary<string, object>
{
["uncertainty_entropy"] = uncertainty.Entropy,
["uncertainty_tier"] = uncertainty.Tier.ToString(),
["decay_multiplier"] = decay.DecayedMultiplier,
["missing_signals"] = uncertainty.MissingSignals.Select(g => g.SignalName).ToArray()
}
};
}
}
```
### Graph Integration
CVE nodes in the Graph module carry `ObservationState` and `UncertaintyScore`:
```csharp
// Extended CVE node for Graph module
public sealed record CveObservationNode
{
public required string CveId { get; init; }
public required string SubjectPurl { get; init; }
// VEX status (orthogonal to observation state)
public required VexClaimStatus? VexStatus { get; init; }
// Observation lifecycle state
public required ObservationState ObservationState { get; init; }
// Knowledge completeness
public required UncertaintyScore Uncertainty { get; init; }
// Evidence freshness
public required ObservationDecay Decay { get; init; }
// Trust score (from confidence aggregation)
public required double TrustScore { get; init; }
// Policy outcome
public required PolicyVerdictStatus PolicyHint { get; init; }
// Guardrails if GuardedPass
public GuardRails? GuardRails { get; init; }
}
```
## Event-Driven Re-evaluation
When new signals arrive, the system re-evaluates affected observations:
```csharp
public sealed class SignalUpdateHandler : ISignalUpdateSubscription
{
    private readonly IObservationRepository _observations;
    private readonly IUncertaintyScoreCalculator _uncertaintyCalculator;
    private readonly IDeterminizationPolicy _policy;
    private readonly IEventPublisher _events;
public async Task HandleAsync(SignalUpdatedEvent evt, CancellationToken ct)
{
// Find observations affected by this signal
var affected = await _observations.FindByCveAndPurlAsync(evt.CveId, evt.Purl, ct);
foreach (var obs in affected)
{
// Rebuild signal snapshot
var snapshot = await BuildCurrentSnapshotAsync(obs, ct);
// Recalculate uncertainty
var uncertainty = _uncertaintyCalculator.Calculate(snapshot);
// Re-evaluate policy
var result = _policy.Evaluate(new DeterminizationContext
{
SignalSnapshot = snapshot,
UncertaintyScore = uncertainty,
// ... other context
});
// Transition state if needed
var newState = DetermineNewState(obs.ObservationState, result, uncertainty);
if (newState != obs.ObservationState)
{
await _observations.UpdateStateAsync(obs.Id, newState, ct);
await _events.PublishAsync(new ObservationStateChangedEvent(
obs.Id, obs.ObservationState, newState, result.Reason), ct);
}
}
}
private ObservationState DetermineNewState(
ObservationState current,
DeterminizationResult result,
UncertaintyScore uncertainty)
{
// Transition logic
if (result.Status == PolicyVerdictStatus.Escalated)
return ObservationState.ManualReviewRequired;
if (uncertainty.Tier == UncertaintyTier.VeryLow)
return ObservationState.Determined;
if (current == ObservationState.PendingDeterminization &&
uncertainty.Tier <= UncertaintyTier.Low)
return ObservationState.Determined;
return current;
}
}
```
## Configuration
```csharp
public sealed class DeterminizationOptions
{
/// <summary>EPSS score that triggers quarantine (block). Default: 0.4</summary>
public double EpssQuarantineThreshold { get; set; } = 0.4;
/// <summary>Trust score threshold for guarded allow. Default: 0.5</summary>
public double GuardedAllowScoreThreshold { get; set; } = 0.5;
/// <summary>Entropy threshold for guarded allow. Default: 0.4</summary>
public double GuardedAllowEntropyThreshold { get; set; } = 0.4;
/// <summary>Entropy threshold for production block. Default: 0.3</summary>
public double ProductionBlockEntropyThreshold { get; set; } = 0.3;
/// <summary>Half-life for evidence decay in days. Default: 14</summary>
public int DecayHalfLifeDays { get; set; } = 14;
/// <summary>Minimum confidence floor after decay. Default: 0.35</summary>
public double DecayFloor { get; set; } = 0.35;
/// <summary>Review interval for guarded observations in days. Default: 7</summary>
public int GuardedReviewIntervalDays { get; set; } = 7;
/// <summary>Maximum time in guarded state in days. Default: 30</summary>
public int MaxGuardedDurationDays { get; set; } = 30;
/// <summary>Signal weights for uncertainty calculation.</summary>
public SignalWeights SignalWeights { get; set; } = new();
/// <summary>Per-environment threshold overrides.</summary>
public Dictionary<string, EnvironmentThresholds> EnvironmentThresholds { get; set; } = new();
}
```
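These options would typically be bound through the standard options pattern at startup; a minimal sketch, assuming a `Determinization` configuration section (the section name is an assumption):

```csharp
// Program.cs sketch.
services.AddOptions<DeterminizationOptions>()
    .Bind(configuration.GetSection("Determinization"))
    .Validate(
        o => Math.Abs(o.SignalWeights.TotalWeight - 1.0) < 0.001,
        "Signal weights must sum to 1.0 (or be normalized before use)")
    .ValidateOnStart();
```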
## Verdict Status Extension
Extended `PolicyVerdictStatus` enum:
```csharp
public enum PolicyVerdictStatus
{
Pass = 0, // Finding meets policy requirements
GuardedPass = 1, // NEW: Allow with runtime monitoring enabled
Blocked = 2, // Finding fails policy checks; must be remediated
Ignored = 3, // Finding deliberately ignored via exception
Warned = 4, // Finding passes but with warnings
Deferred = 5, // Decision deferred; needs additional evidence
Escalated = 6, // Decision escalated for human review
RequiresVex = 7 // VEX statement required to make decision
}
```
## Metrics & Observability
```csharp
public static class DeterminizationMetrics
{
    private static readonly Meter Meter = new("StellaOps.Determinization"); // meter name illustrative

    // Counters
    public static readonly Counter<int> ObservationsCreated =
        Meter.CreateCounter<int>("stellaops_determinization_observations_created_total");
public static readonly Counter<int> StateTransitions =
Meter.CreateCounter<int>("stellaops_determinization_state_transitions_total");
public static readonly Counter<int> PolicyEvaluations =
Meter.CreateCounter<int>("stellaops_determinization_policy_evaluations_total");
// Histograms
public static readonly Histogram<double> UncertaintyEntropy =
Meter.CreateHistogram<double>("stellaops_determinization_uncertainty_entropy");
public static readonly Histogram<double> DecayMultiplier =
Meter.CreateHistogram<double>("stellaops_determinization_decay_multiplier");
// Gauges
public static readonly ObservableGauge<int> PendingObservations =
Meter.CreateObservableGauge<int>("stellaops_determinization_pending_observations",
() => /* query count */);
public static readonly ObservableGauge<int> StaleObservations =
Meter.CreateObservableGauge<int>("stellaops_determinization_stale_observations",
() => /* query count */);
}
```
## Testing Strategy
| Test Category | Focus Area | Example |
|---------------|------------|---------|
| Unit | Uncertainty calculation | Missing 2 signals = correct entropy |
| Unit | Decay calculation | 14 days = 50% multiplier |
| Unit | Policy rules | EPSS 0.5 + dev = guarded allow |
| Integration | Signal attachment | Feedser EPSS query → SignalState |
| Integration | State transitions | New VEX → PendingDeterminization → Determined |
| Determinism | Same input → same output | Canonical snapshot → reproducible entropy |
| Property | Entropy bounds | Always [0.0, 1.0] |
| Property | Decay monotonicity | Older → lower multiplier |
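For illustration, the entropy-bounds property test in the table above could be sketched as follows (xUnit is assumed; `SignalSnapshotBuilder` is a hypothetical test helper):

```csharp
[Theory]
[InlineData(0)] // no signals present
[InlineData(3)] // half of the signals present
[InlineData(6)] // all signals present
public void Entropy_Is_Always_Within_Bounds(int presentSignalCount)
{
    var snapshot = SignalSnapshotBuilder.WithPresentSignals(presentSignalCount); // hypothetical builder
    var score = new UncertaintyScoreCalculator(new SignalWeights()).Calculate(snapshot);

    Assert.InRange(score.Entropy, 0.0, 1.0);
}
```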
## Security Considerations
1. **No Guessing:** Missing signals use explicit priors, never random values
2. **Audit Trail:** Every state transition logged with evidence snapshot
3. **Conservative Defaults:** Production blocks high entropy; only non-prod allows guardrails
4. **Escalation Path:** Runtime evidence always escalates regardless of other signals
5. **Tamper Detection:** Signal snapshots hashed for integrity verification
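A minimal sketch of point 5, assuming snapshots are canonically serialized before hashing (the serializer configuration and helper name are illustrative):

```csharp
public static string ComputeSnapshotHash(SignalSnapshot snapshot)
{
    // CanonicalJsonOptions is assumed to produce deterministic output (stable property order, no whitespace).
    var canonicalJson = JsonSerializer.Serialize(snapshot, CanonicalJsonOptions);
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonicalJson));
    return Convert.ToHexString(hash).ToLowerInvariant();
}
```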
## References
- Product Advisory: "Unknown CVEs: graceful placeholders, not blockers"
- Existing: `src/Policy/__Libraries/StellaOps.Policy.Unknowns/`
- Existing: `src/Policy/__Libraries/StellaOps.Policy/Confidence/`
- Existing: `src/Excititor/__Libraries/StellaOps.Excititor.Core/TrustVector/`
- OpenVEX Specification: https://openvex.dev/
- EPSS Model: https://www.first.org/epss/

# HLC Queue Ordering Migration Guide
This guide describes how to enable HLC (Hybrid Logical Clock) ordering for the Scheduler queue, transitioning from legacy `(priority, created_at)` ordering to HLC-based ordering with cryptographic chain linking.
## Overview
HLC ordering provides:
- **Deterministic global ordering**: Causal consistency across distributed nodes
- **Cryptographic chain linking**: Audit-safe job sequence proofs
- **Reproducible processing**: Same input produces same chain
## Prerequisites
1. PostgreSQL 16+ with the scheduler schema
2. HLC library dependency (`StellaOps.HybridLogicalClock`)
3. Schema migration `002_hlc_queue_chain.sql` applied
## Migration Phases
### Phase 1: Deploy with Dual-Write Mode
Enable dual-write to populate the new `scheduler_log` table without affecting existing operations.
```yaml
# appsettings.yaml or environment configuration
Scheduler:
Queue:
Hlc:
EnableHlcOrdering: false # Keep using legacy ordering for reads
DualWriteMode: true # Write to both legacy and HLC tables
```
```csharp
// Program.cs or Startup.cs
services.AddOptions<SchedulerQueueOptions>()
.Bind(configuration.GetSection("Scheduler:Queue"))
.ValidateDataAnnotations()
.ValidateOnStart();
// Register HLC services
services.AddHlcSchedulerServices();
// Register HLC clock
services.AddSingleton<IHybridLogicalClock>(sp =>
{
var nodeId = Environment.MachineName; // or use a stable node identifier
return new HybridLogicalClock(nodeId, TimeProvider.System);
});
```
**Verification:**
- Monitor `scheduler_hlc_enqueues_total` metric for dual-write activity
- Verify `scheduler_log` table is being populated
- Check chain verification passes: `scheduler_chain_verifications_total{result="valid"}`
### Phase 2: Backfill Historical Data (Optional)
If you need historical jobs in the HLC chain, backfill from the existing `scheduler.jobs` table:
```sql
-- Backfill script (run during maintenance window)
-- Note: This creates a new chain starting from historical data
-- The chain will not have valid prev_link values for historical entries
INSERT INTO scheduler.scheduler_log (
tenant_id, t_hlc, partition_key, job_id, payload_hash, prev_link, link
)
SELECT
tenant_id,
-- Generate synthetic HLC timestamps based on created_at
-- Format: YYYYMMDDHHMMSS-nodeid-counter
TO_CHAR(created_at AT TIME ZONE 'UTC', 'YYYYMMDDHH24MISS') || '-backfill-' ||
LPAD(ROW_NUMBER() OVER (PARTITION BY tenant_id ORDER BY created_at)::TEXT, 6, '0'),
COALESCE(project_id, ''),
id,
DECODE(payload_digest, 'hex'),
NULL, -- No chain linking for historical data
DECODE(payload_digest, 'hex') -- Use payload_digest as link placeholder
FROM scheduler.jobs
WHERE status IN ('pending', 'scheduled', 'running')
AND NOT EXISTS (
SELECT 1 FROM scheduler.scheduler_log sl
WHERE sl.job_id = jobs.id
)
ORDER BY tenant_id, created_at;
```
### Phase 3: Enable HLC Ordering for Reads
Once dual-write is stable and backfill (if needed) is complete:
```yaml
Scheduler:
Queue:
Hlc:
EnableHlcOrdering: true # Use HLC ordering for reads
DualWriteMode: true # Keep dual-write during transition
VerifyOnDequeue: false # Optional: enable for extra validation
```
**Verification:**
- Monitor dequeue latency (should be similar to legacy)
- Verify job processing order matches HLC order
- Check chain integrity periodically
### Phase 4: Disable Dual-Write Mode
Once confident in HLC ordering:
```yaml
Scheduler:
Queue:
Hlc:
EnableHlcOrdering: true
DualWriteMode: false # Stop writing to legacy table
VerifyOnDequeue: false
```
## Configuration Reference
### SchedulerHlcOptions
| Property | Type | Default | Description |
|----------|------|---------|-------------|
| `EnableHlcOrdering` | bool | false | Use HLC ordering for queue reads |
| `DualWriteMode` | bool | false | Write to both legacy and HLC tables |
| `VerifyOnDequeue` | bool | false | Verify chain integrity on each dequeue |
| `MaxClockDriftMs` | int | 60000 | Maximum allowed clock drift in milliseconds |
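The options class behind this table would look roughly as follows (property names and defaults mirror the table; the class shape itself is a sketch):

```csharp
public sealed class SchedulerHlcOptions
{
    /// <summary>Use HLC ordering for queue reads.</summary>
    public bool EnableHlcOrdering { get; set; }

    /// <summary>Write to both the legacy jobs table and the HLC scheduler_log table.</summary>
    public bool DualWriteMode { get; set; }

    /// <summary>Verify chain integrity on each dequeue.</summary>
    public bool VerifyOnDequeue { get; set; }

    /// <summary>Maximum allowed clock drift in milliseconds.</summary>
    public int MaxClockDriftMs { get; set; } = 60_000;
}
```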
## Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `scheduler_hlc_enqueues_total` | Counter | Total HLC enqueue operations |
| `scheduler_hlc_enqueue_deduplicated_total` | Counter | Deduplicated enqueue operations |
| `scheduler_hlc_enqueue_duration_seconds` | Histogram | Enqueue operation duration |
| `scheduler_hlc_dequeues_total` | Counter | Total HLC dequeue operations |
| `scheduler_hlc_dequeued_entries_total` | Counter | Total entries dequeued |
| `scheduler_chain_verifications_total` | Counter | Chain verification operations |
| `scheduler_chain_verification_issues_total` | Counter | Chain verification issues found |
| `scheduler_batch_snapshots_created_total` | Counter | Batch snapshots created |
## Troubleshooting
### Chain Verification Failures
If chain verification reports issues:
1. Check `scheduler_chain_verification_issues_total` for issue count
2. Query the log for specific issues:
```csharp
var result = await chainVerifier.VerifyAsync(tenantId);
foreach (var issue in result.Issues)
{
logger.LogError(
"Chain issue at job {JobId}: {Type} - {Description}",
issue.JobId, issue.IssueType, issue.Description);
}
```
3. Common causes:
- Database corruption: Restore from backup
- Concurrent writes without proper locking: Check transaction isolation
- Clock drift: Verify `MaxClockDriftMs` setting
### Performance Considerations
- **Index usage**: Ensure `idx_scheduler_log_tenant_hlc` is being used
- **Chain head caching**: The `chain_heads` table provides O(1) access to the latest link
- **Batch sizes**: Adjust dequeue batch size based on workload
## Rollback Procedure
To rollback to legacy ordering:
```yaml
Scheduler:
Queue:
Hlc:
EnableHlcOrdering: false
DualWriteMode: false
```
The `scheduler_log` table can be retained for audit purposes or dropped if no longer needed.
## Related Documentation
- [Scheduler Architecture](architecture.md)
- [HLC Library Documentation](../../__Libraries/StellaOps.HybridLogicalClock/README.md)
- [Product Advisory: Audit-safe Job Queue Ordering](../../product-advisories/audit-safe-job-queue-ordering.md)

# Testing Enhancements Architecture
**Version:** 1.0.0
**Last Updated:** 2026-01-05
**Status:** In Development
## Overview
This document describes the architecture of StellaOps testing enhancements derived from the product advisory "New Testing Enhancements for Stella Ops" (05-Dec-2026). The enhancements address gaps in temporal correctness, policy drift control, replayability, and competitive awareness.
## Problem Statement
> "The next gains for StellaOps testing are no longer about coverage—they're about temporal correctness, policy drift control, replayability, and competitive awareness. Systems that fail now do so quietly, over time, and under sequence pressure."
### Key Gaps Identified
| Gap | Impact | Current State |
|-----|--------|---------------|
| **Temporal Edge Cases** | Silent failures under clock drift, leap seconds, TTL boundaries | TimeProvider exists but no edge case tests |
| **Failure Choreography** | Cascading failures untested | Single-point chaos tests only |
| **Trace Replay** | Assumptions vs. reality mismatch | Replay module underutilized |
| **Policy Drift** | Silent behavior changes | Determinism tests exist but no diff testing |
| **Decision Opacity** | Audit/debug difficulty | Verdicts without explanations |
| **Evidence Gaps** | Test runs not audit-grade | TRX files not in EvidenceLocker |
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────────────┐
│ Testing Enhancements Architecture │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌────────────────┐ ┌────────────────┐ │
│ │ Time-Skew │ │ Trace Replay │ │ Failure │ │
│ │ & Idempotency │ │ & Evidence │ │ Choreography │ │
│ └───────┬────────┘ └───────┬────────┘ └───────┬────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ StellaOps.Testing.* Libraries │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ Temporal │ │ Replay │ │ Chaos │ │ Evidence │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ Policy │ │Explainability│ │ Coverage │ │ConfigDiff│ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Existing Infrastructure │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ TestKit │ │Determinism │ │ Postgres │ │ AirGap │ │ │
│ │ │ │ │ Testing │ │ Testing │ │ Testing │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
```
## Component Architecture
### 1. Temporal Testing (`StellaOps.Testing.Temporal`)
**Purpose:** Simulate temporal edge conditions and verify idempotency.
```
┌─────────────────────────────────────────────────────────────┐
│ Temporal Testing │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ SimulatedTimeProvider│ │ IdempotencyVerifier │ │
│ │ - Advance() │ │ - VerifyAsync() │ │
│ │ - JumpTo() │ │ - VerifyWithRetries│ │
│ │ - SetDrift() │ └─────────────────────┘ │
│ │ - JumpBackward() │ │
│ └─────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │LeapSecondTimeProvider│ │TtlBoundaryTimeProvider│ │
│ │ - AdvanceThrough │ │ - PositionAtExpiry │ │
│ │ LeapSecond() │ │ - GenerateBoundary │ │
│ └─────────────────────┘ │ TestCases() │ │
│ └─────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ClockSkewAssertions │ │
│ │ - AssertHandlesClockJumpForward() │ │
│ │ - AssertHandlesClockJumpBackward() │ │
│ │ - AssertHandlesClockDrift() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
**Key Interfaces:**
- `SimulatedTimeProvider` - Time progression with drift
- `IdempotencyVerifier<T>` - Retry idempotency verification
- `ClockSkewAssertions` - Clock anomaly assertions
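A minimal sketch of how these pieces compose in a test; the system under test (`LeaseRenewalService`) is hypothetical, and exact constructor and method signatures may differ from the final library:

```csharp
[Fact]
public async Task Lease_Renewal_Survives_Backward_Clock_Jump()
{
    var time = new SimulatedTimeProvider(DateTimeOffset.Parse("2026-01-05T00:00:00Z"));
    var sut = new LeaseRenewalService(time); // hypothetical system under test

    await sut.AcquireAsync("job-1");
    time.JumpBackward(TimeSpan.FromMinutes(5)); // simulate an NTP step correction
    await sut.RenewAsync("job-1");

    // The lease must not be treated as expired just because wall-clock time moved backwards.
    Assert.True(sut.IsHeld("job-1"));
}
```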
### 2. Trace Replay & Evidence (`StellaOps.Testing.Replay`, `StellaOps.Testing.Evidence`)
**Purpose:** Replay production traces and link test runs to EvidenceLocker.
```
┌─────────────────────────────────────────────────────────────┐
│ Trace Replay & Evidence │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │TraceAnonymizer │ │ TestEvidenceService │ │
│ │ - AnonymizeAsync│ │ - BeginSessionAsync │ │
│ │ - ValidateAnon │ │ - RecordTestResult │ │
│ └────────┬────────┘ │ - FinalizeSession │ │
│ │ └──────────┬──────────┘ │
│ ▼ │ │
│ ┌─────────────────┐ ▼ │
│ │TraceCorpusManager│ ┌─────────────────────┐ │
│ │ - ImportAsync │ │ EvidenceLocker │ │
│ │ - QueryAsync │ │ (immutable storage)│ │
│ └────────┬─────────┘ └─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ReplayIntegrationTestBase │ │
│ │ - ReplayAndVerifyAsync() │ │
│ │ - ReplayBatchAsync() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
**Data Flow:**
```
Production Traces → Anonymization → Corpus → Replay Tests → Evidence Bundle
```
### 3. Failure Choreography (`StellaOps.Testing.Chaos`)
**Purpose:** Orchestrate sequenced, cascading failure scenarios.
```
┌─────────────────────────────────────────────────────────────┐
│ Failure Choreography │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ FailureChoreographer │ │
│ │ - InjectFailure(componentId, failureType) │ │
│ │ - RecoverComponent(componentId) │ │
│ │ - ExecuteOperation(name, action) │ │
│ │ - AssertCondition(name, condition) │ │
│ │ - ExecuteAsync() → ChoreographyResult │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────────────┐ ┌────────────┐ ┌────────────────┐ │
│ │DatabaseFailure │ │HttpClient │ │ CacheFailure │ │
│ │ Injector │ │ Injector │ │ Injector │ │
│ └────────────────┘ └────────────┘ └────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ ConvergenceTracker │ │
│ │ - CaptureSnapshotAsync() │ │
│ │ - WaitForConvergenceAsync() │ │
│ │ - VerifyConvergenceAsync() │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────┼───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────────────┐ ┌────────────┐ ┌────────────────┐ │
│ │ DatabaseState │ │ Metrics │ │ QueueState │ │
│ │ Probe │ │ Probe │ │ Probe │ │
│ └────────────────┘ └────────────┘ └────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
**Failure Types:**
- `Unavailable` - Component completely down
- `Timeout` - Slow responses
- `Intermittent` - Random failures
- `PartialFailure` - Some operations fail
- `Degraded` - Reduced capacity
- `Flapping` - Alternating up/down
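For example, a cascading database-then-cache failure could be choreographed as follows; the component identifiers, the `scheduler`/`queueProbe` fixtures, and the `ChoreographyResult` property names are assumptions for illustration:

```csharp
[Fact]
public async Task Scheduler_Recovers_From_Database_Then_Cache_Failure()
{
    // scheduler, request, and queueProbe are provided by hypothetical test fixtures.
    var choreographer = new FailureChoreographer();

    choreographer.InjectFailure("postgres-primary", FailureType.Unavailable);
    choreographer.ExecuteOperation("enqueue-scan", () => scheduler.EnqueueScanAsync(request));
    choreographer.InjectFailure("redis-cache", FailureType.Intermittent);
    choreographer.RecoverComponent("postgres-primary");
    choreographer.RecoverComponent("redis-cache");
    choreographer.AssertCondition("queue-drained", () => queueProbe.Depth == 0);

    var result = await choreographer.ExecuteAsync();
    Assert.True(result.Converged, string.Join(", ", result.FailedAssertions));
}
```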
### 4. Policy & Explainability (`StellaOps.Core.Explainability`, `StellaOps.Testing.Policy`)
**Purpose:** Explain automated decisions and test policy changes.
```
┌─────────────────────────────────────────────────────────────┐
│ Policy & Explainability │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ DecisionExplanation │ │
│ │ - DecisionId, DecisionType, DecidedAt │ │
│ │ - Outcome (value, confidence, summary) │ │
│ │ - Factors[] (type, weight, contribution) │ │
│ │ - AppliedRules[] (id, triggered, impact) │ │
│ │ - Metadata (engine version, input hashes) │ │
│ └─────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────────────┐ │
│ │IExplainableDecision│ │ ExplainabilityAssertions│ │
│ │ <TInput, TOutput> │ │ - AssertHasExplanation │ │
│ │ - EvaluateWith │ │ - AssertExplanation │ │
│ │ ExplanationAsync│ │ Reproducible │ │
│ └─────────────────┘ └─────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ PolicyDiffEngine │ │
│ │ - ComputeDiffAsync(baseline, new, inputs) │ │
│ │ → PolicyDiffResult (changed behaviors, deltas) │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ PolicyRegressionTestBase │ │
│ │ - Policy_Change_Produces_Expected_Diff() │ │
│ │ - Policy_Change_No_Unexpected_Regressions() │ │
│ └─────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
**Explainable Services:**
- `ExplainableVexConsensusService`
- `ExplainableRiskScoringService`
- `ExplainablePolicyEngine`
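A regression test built on `PolicyDiffEngine` might look like the sketch below; the policy loader, input corpus, and the shape of the diff result's changed-behavior entries are assumptions:

```csharp
public sealed class SeverityPolicyRegressionTests : PolicyRegressionTestBase
{
    [Fact]
    public async Task Raising_Epss_Threshold_Changes_Only_Expected_Verdicts()
    {
        var baseline = await LoadPolicyAsync("policies/severity-v1.yaml");   // hypothetical loader
        var candidate = await LoadPolicyAsync("policies/severity-v2.yaml");
        var inputs = await LoadRepresentativeFindingsAsync();                // hypothetical corpus

        var diff = await new PolicyDiffEngine().ComputeDiffAsync(baseline, candidate, inputs);

        // Only findings whose EPSS score sits between the old and new thresholds should change verdict.
        Assert.All(diff.ChangedBehaviors, change =>
            Assert.InRange(change.Input.EpssScore, 0.3, 0.4));
    }
}
```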
### 5. Cross-Cutting Standards (`StellaOps.Testing.*`)
**Purpose:** Enforce standards across all testing.
```
┌─────────────────────────────────────────────────────────────┐
│ Cross-Cutting Standards │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ BlastRadius Annotations │ │
│ │ - Auth, Scanning, Evidence, Compliance │ │
│ │ - Advisories, RiskPolicy, Crypto │ │
│ │ - Integrations, Persistence, Api │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ SchemaEvolutionTestBase │ │
│ │ - TestAgainstPreviousSchemaAsync() │ │
│ │ - TestReadBackwardCompatibilityAsync() │ │
│ │ - TestWriteForwardCompatibilityAsync() │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ BranchCoverageEnforcer │ │
│ │ - Validate() → dead paths │ │
│ │ - GenerateDeadPathReport() │ │
│ │ - Exemption mechanism │ │
│ └───────────────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────────────┐ │
│ │ ConfigDiffTestBase │ │
│ │ - TestConfigBehavioralDeltaAsync() │ │
│ │ - TestConfigIsolationAsync() │ │
│ └───────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
```
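As an example of the annotation convention, a test class might be tagged like this (the attribute and enum names are assumptions; the categories come from the list above):

```csharp
[BlastRadius(BlastRadiusArea.Auth, BlastRadiusArea.Persistence)] // hypothetical attribute shape
public sealed class TokenRefreshIdempotencyTests
{
    [Fact]
    public void Refresh_Token_Rotation_Is_Idempotent()
    {
        // Body elided; the annotation is what the test-blast-radius workflow validates on PRs.
    }
}
```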
## Library Structure
```
src/__Tests/__Libraries/
├── StellaOps.Testing.Temporal/
│ ├── SimulatedTimeProvider.cs
│ ├── LeapSecondTimeProvider.cs
│ ├── TtlBoundaryTimeProvider.cs
│ ├── IdempotencyVerifier.cs
│ └── ClockSkewAssertions.cs
├── StellaOps.Testing.Replay/
│ ├── ReplayIntegrationTestBase.cs
│ └── IReplayOrchestrator.cs
├── StellaOps.Testing.Evidence/
│ ├── ITestEvidenceService.cs
│ ├── TestEvidenceService.cs
│ └── XunitEvidenceReporter.cs
├── StellaOps.Testing.Chaos/
│ ├── FailureChoreographer.cs
│ ├── ConvergenceTracker.cs
│ ├── Injectors/
│ │ ├── IFailureInjector.cs
│ │ ├── DatabaseFailureInjector.cs
│ │ ├── HttpClientFailureInjector.cs
│ │ └── CacheFailureInjector.cs
│ └── Probes/
│ ├── IStateProbe.cs
│ ├── DatabaseStateProbe.cs
│ └── MetricsStateProbe.cs
├── StellaOps.Testing.Policy/
│ ├── PolicyDiffEngine.cs
│ ├── PolicyRegressionTestBase.cs
│ └── PolicyVersionControl.cs
├── StellaOps.Testing.Explainability/
│ └── ExplainabilityAssertions.cs
├── StellaOps.Testing.SchemaEvolution/
│ └── SchemaEvolutionTestBase.cs
├── StellaOps.Testing.Coverage/
│ └── BranchCoverageEnforcer.cs
└── StellaOps.Testing.ConfigDiff/
└── ConfigDiffTestBase.cs
```
## CI/CD Integration
### Pipeline Structure
```
┌─────────────────────────────────────────────────────────────┐
│ CI/CD Pipelines │
├─────────────────────────────────────────────────────────────┤
│ │
│ PR-Gating: │
│ ├── test-blast-radius.yml (validate annotations) │
│ ├── policy-diff.yml (policy change validation) │
│ ├── dead-path-detection.yml (coverage enforcement) │
│ └── test-evidence.yml (evidence capture) │
│ │
│ Scheduled: │
│ ├── schema-evolution.yml (backward compat tests) │
│ ├── chaos-choreography.yml (failure choreography) │
│ └── trace-replay.yml (production trace replay) │
│ │
│ On-Demand: │
│ └── rollback-lag.yml (rollback timing measurement) │
│ │
└─────────────────────────────────────────────────────────────┘
```
### Workflow Triggers
| Workflow | Trigger | Purpose |
|----------|---------|---------|
| test-blast-radius | PR (test files) | Validate annotations |
| policy-diff | PR (policy files) | Validate policy changes |
| dead-path-detection | Push/PR | Prevent untested code |
| test-evidence | Push (main) | Store test evidence |
| schema-evolution | Daily | Backward compatibility |
| chaos-choreography | Weekly | Cascading failure tests |
| trace-replay | Weekly | Production trace validation |
| rollback-lag | Manual | Measure rollback timing |
## Implementation Roadmap
### Sprint Schedule
| Sprint | Focus | Duration | Key Deliverables |
|--------|-------|----------|------------------|
| 002_001 | Time-Skew & Idempotency | 3 weeks | Temporal libraries, module tests |
| 002_002 | Trace Replay & Evidence | 3 weeks | Anonymization, evidence linking |
| 002_003 | Failure Choreography | 3 weeks | Choreographer, cascade tests |
| 002_004 | Policy & Explainability | 3 weeks | Explanation schema, diff testing |
| 002_005 | Cross-Cutting Standards | 3 weeks | Annotations, CI enforcement |
### Dependencies
```
002_001 (Temporal) ────┐
002_002 (Replay) ──────┼──→ 002_003 (Choreography) ──→ 002_005 (Cross-Cutting)
│ ↑
002_004 (Policy) ──────┘────────────────────────────────────┘
```
## Success Metrics
| Metric | Baseline | Target | Sprint |
|--------|----------|--------|--------|
| Temporal edge case coverage | ~5% | 80%+ | 002_001 |
| Idempotency test coverage | ~10% | 90%+ | 002_001 |
| Replay test coverage | 0% | 50%+ | 002_002 |
| Test evidence capture | 0% | 100% | 002_002 |
| Choreographed failure scenarios | 0 | 15+ | 002_003 |
| Decisions with explanations | 0% | 100% | 002_004 |
| Policy changes with diff tests | 0% | 100% | 002_004 |
| Tests with blast-radius | ~10% | 100% | 002_005 |
| Dead paths (non-exempt) | Unknown | <50 | 002_005 |
## References
- **Sprint Files:**
- `docs/implplan/SPRINT_20260105_002_001_TEST_time_skew_idempotency.md`
- `docs/implplan/SPRINT_20260105_002_002_TEST_trace_replay_evidence.md`
- `docs/implplan/SPRINT_20260105_002_003_TEST_failure_choreography.md`
- `docs/implplan/SPRINT_20260105_002_004_TEST_policy_explainability.md`
- `docs/implplan/SPRINT_20260105_002_005_TEST_cross_cutting.md`
- **Advisory:** `docs/product-advisories/05-Dec-2026 - New Testing Enhancements for Stella Ops.md`
- **Test Infrastructure:** `src/__Tests/AGENTS.md`