docs consolidation, big sln build fixes, new advisories and sprints/tasks

This commit is contained in:
master
2026-01-05 18:37:04 +02:00
parent d0a7b88398
commit d7bdca6d97
175 changed files with 10322 additions and 307 deletions

View File

@@ -20,7 +20,7 @@ This directory contains architecture documentation for all StellaOps modules.
| [Concelier](./concelier/) | `src/Concelier/` | Vulnerability advisory ingestion and merge engine |
| [Excititor](./excititor/) | `src/Excititor/` | VEX document ingestion and export |
| [VexLens](./vex-lens/) | `src/VexLens/` | VEX consensus computation across issuers |
| [VexHub](./vexhub/) | `src/VexHub/` | VEX distribution and exchange hub |
| [VexHub](./vex-hub/) | `src/VexHub/` | VEX distribution and exchange hub |
| [IssuerDirectory](./issuer-directory/) | `src/IssuerDirectory/` | Issuer trust registry (CSAF publishers) |
| [Feedser](./feedser/) | `src/Feedser/` | Evidence collection library for backport detection |
| [Mirror](./mirror/) | `src/Mirror/` | Vulnerability feed mirror and distribution |
@@ -30,10 +30,10 @@ This directory contains architecture documentation for all StellaOps modules.
| Module | Path | Description |
|--------|------|-------------|
| [Scanner](./scanner/) | `src/Scanner/` | Container scanning with SBOM generation |
| [BinaryIndex](./binaryindex/) | `src/BinaryIndex/` | Binary identity extraction and fingerprinting |
| [BinaryIndex](./binary-index/) | `src/BinaryIndex/` | Binary identity extraction and fingerprinting |
| [AdvisoryAI](./advisory-ai/) | `src/AdvisoryAI/` | AI-assisted advisory analysis |
| [Symbols](./symbols/) | `src/Symbols/` | Symbol resolution and debug information |
| [ReachGraph](./reachgraph/) | `src/ReachGraph/` | Reachability graph service |
| [ReachGraph](./reach-graph/) | `src/ReachGraph/` | Reachability graph service |
### Artifacts & Evidence
@@ -41,18 +41,18 @@ This directory contains architecture documentation for all StellaOps modules.
|--------|------|-------------|
| [Attestor](./attestor/) | `src/Attestor/` | in-toto/DSSE attestation generation |
| [Signer](./signer/) | `src/Signer/` | Cryptographic signing operations |
| [SbomService](./sbomservice/) | `src/SbomService/` | SBOM storage, versioning, and lineage ledger |
| [SbomService](./sbom-service/) | `src/SbomService/` | SBOM storage, versioning, and lineage ledger |
| [EvidenceLocker](./evidence-locker/) | `src/EvidenceLocker/` | Sealed evidence storage and export |
| [ExportCenter](./export-center/) | `src/ExportCenter/` | Batch export and report generation |
| [Provenance](./provenance/) | `src/Provenance/` | SLSA/DSSE attestation tooling |
| [Provcache](./provcache/) | Library | Provenance cache utilities |
| [Provcache](./prov-cache/) | Library | Provenance cache utilities |
### Policy & Risk
| Module | Path | Description |
|--------|------|-------------|
| [Policy](./policy/) | `src/Policy/` | Policy engine with K4 lattice logic |
| [RiskEngine](./riskengine/) | `src/RiskEngine/` | Risk scoring runtime |
| [RiskEngine](./risk-engine/) | `src/RiskEngine/` | Risk scoring runtime |
| [VulnExplorer](./vuln-explorer/) | `src/VulnExplorer/` | Vulnerability exploration and triage |
| [Unknowns](./unknowns/) | `src/Unknowns/` | Unknown component tracking registry |
@@ -65,8 +65,8 @@ This directory contains architecture documentation for all StellaOps modules.
| [TaskRunner](./taskrunner/) | `src/TaskRunner/` | Task pack execution engine |
| [Notify](./notify/) | `src/Notify/` | Notification toolkit (Email, Slack, Teams, Webhooks) |
| [Notifier](./notifier/) | `src/Notifier/` | Notifications Studio host |
| [PacksRegistry](./packsregistry/) | `src/PacksRegistry/` | Task packs registry |
| [TimelineIndexer](./timelineindexer/) | `src/TimelineIndexer/` | Timeline event indexing |
| [PacksRegistry](./packs-registry/) | `src/PacksRegistry/` | Task packs registry |
| [TimelineIndexer](./timeline-indexer/) | `src/TimelineIndexer/` | Timeline event indexing |
| [Replay](./replay/) | `src/Replay/` | Deterministic replay engine |
### Integration

View File

@@ -273,6 +273,6 @@ stella model benchmark llama3-8b-q4km --iterations 10
## Related Documentation
- [Advisory AI Architecture](../architecture.md)
- [Offline Kit Overview](../../../24_OFFLINE_KIT.md)
- [Offline Kit Overview](../../../OFFLINE_KIT.md)
- [AI Attestations](../../../implplan/SPRINT_20251226_018_AI_attestations.md)
- [Replay Semantics](./replay-semantics.md)

View File

@@ -42,7 +42,7 @@ Key settings:
## Related Documentation
- Operations: `./operations/` (if exists)
- Offline Kit: `../../24_OFFLINE_KIT.md`
- Offline Kit: `../../OFFLINE_KIT.md`
- Mirror: `../mirror/`
- ExportCenter: `../export-center/`

View File

@@ -344,5 +344,5 @@ AirGap:
* Evidence reconciliation: `./evidence-reconciliation.md`
* Exporter coordination: `./exporter-cli-coordination.md`
* Mirror DSSE plan: `./mirror-dsse-plan.md`
* Offline Kit: `../../24_OFFLINE_KIT.md`
* Offline Kit: `../../OFFLINE_KIT.md`
* Time anchor schema: `../../airgap/time-anchor-schema.md`

View File

@@ -489,7 +489,7 @@ Content-Disposition: attachment; filename="verdict-{manifestId}.json"
- [Trust Lattice Specification](../excititor/trust-lattice.md)
- [Authority Architecture](./architecture.md)
- [DSSE Signing](../../dev/dsse-signing.md)
- [API Reference](../../09_API_CLI_REFERENCE.md)
- [API Reference](../../API_CLI_REFERENCE.md)
---

View File

@@ -0,0 +1,94 @@
# BinaryIndex
**Status:** Implemented
**Source:** `src/BinaryIndex/`
**Owner:** Scanner Guild + Concelier Guild
## Purpose
BinaryIndex provides vulnerable binary detection independent of package metadata. It addresses the gap where package version strings can lie (backports, custom builds, stripped metadata) through binary-first vulnerability identification using Build-IDs, hash catalogs, and function fingerprints.
## Components
**Libraries:**
- `StellaOps.BinaryIndex.Core` - Core binary identity extraction and matching engine
- `StellaOps.BinaryIndex.Corpus` - Binary-to-advisory mapping database
- `StellaOps.BinaryIndex.Corpus.Debian` - Debian-specific corpus support
- `StellaOps.BinaryIndex.Fingerprints` - Function fingerprint storage and matching (CFG/basic-block hashes)
- `StellaOps.BinaryIndex.FixIndex` - Patch-aware backport handling
- `StellaOps.BinaryIndex.Persistence` - Storage adapters for binary catalogs
## Configuration
Configuration is typically embedded in Scanner and Concelier module settings.
Key features:
- Three-tier binary identification (package/version, Build-ID/hash, function fingerprints)
- Binary identity extraction (Build-ID, PE CodeView GUID, Mach-O UUID)
- Integration with Scanner.Worker for binary lookup
- Offline-first design with deterministic outputs
## Dependencies
- PostgreSQL (integrated with Scanner/Concelier schemas)
- Scanner.Analyzers.Native (for binary disassembly/analysis)
- Concelier (for advisory-to-binary mapping)
## Related Documentation
- Architecture: `./architecture.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`
- Scanner Architecture: `../scanner/architecture.md`
- Concelier Architecture: `../concelier/architecture.md`
## Current Status
Library implementation complete with support for ELF (Build-ID), PE (CodeView GUID), and Mach-O (UUID) binary formats. Integrated into Scanner's native binary analysis pipeline.
---
## Semantic Diffing Roadmap
A major enhancement to BinaryIndex is planned to enable **semantic-level binary diffing** - detecting function equivalence based on behavior rather than syntax. This addresses limitations in current byte/symbol-based matching when dealing with:
- Compiler optimizations (same source, different instructions)
- Stripped binaries (no symbols)
- Cross-compiler builds (GCC vs Clang)
- Obfuscated code
### Planned Phases
| Phase | Description | Impact | Status |
|-------|-------------|--------|--------|
| **Phase 1** | IR-Level Semantic Analysis | +15% accuracy on optimized binaries | Planned |
| **Phase 2** | Function Behavior Corpus | +10% coverage on stripped binaries | Planned |
| **Phase 3** | Ghidra Integration | +5% edge case handling | Planned |
| **Phase 4** | Decompiler & ML Similarity | +10% obfuscation resilience | Planned |
### New Libraries (Planned)
- `StellaOps.BinaryIndex.Semantic` - IR lifting and semantic graph fingerprints
- `StellaOps.BinaryIndex.Corpus` - 30K+ function behavior database
- `StellaOps.BinaryIndex.Ghidra` - Ghidra Headless integration
- `StellaOps.BinaryIndex.Decompiler` - Decompiled code AST comparison
- `StellaOps.BinaryIndex.ML` - CodeBERT-based function embeddings
- `StellaOps.BinaryIndex.Ensemble` - Multi-signal decision fusion
### Expected Outcomes
| Metric | Current | Target |
|--------|---------|--------|
| Patch detection accuracy | ~70% | 92%+ |
| Function identification (stripped) | ~50% | 85%+ |
| False positive rate | ~5% | <2% |
### Sprint Files
- `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- `docs/implplan/SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`
- `docs/implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
- `docs/implplan/SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`
### Architecture Documentation
See `./semantic-diffing.md` for comprehensive architecture documentation.

View File

@@ -3,7 +3,7 @@
> **Ownership:** Scanner Guild + Concelier Guild
> **Status:** DRAFT
> **Version:** 1.0.0
> **Related:** [High-Level Architecture](../../07_HIGH_LEVEL_ARCHITECTURE.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md)
> **Related:** [High-Level Architecture](../../ARCHITECTURE_OVERVIEW.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md)
---

View File

@@ -0,0 +1,564 @@
# Semantic Diffing Architecture
> **Status:** PLANNED
> **Version:** 1.0.0
> **Related Sprints:**
> - `SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
> - `SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`
> - `SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
> - `SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`
---
## 1. Executive Summary
Semantic diffing is an advanced binary analysis capability that detects function equivalence based on **behavior** rather than **syntax**. This enables accurate vulnerability detection in scenarios where traditional byte-level or symbol-based matching fails:
- **Compiler optimizations** - Same source, different instructions
- **Obfuscation** - Intentionally altered code structure
- **Stripped binaries** - No symbols or debug information
- **Cross-compiler** - GCC vs Clang produce different output
- **Backported patches** - Different version, same fix
### Expected Impact
| Capability | Current Accuracy | With Semantic Diffing |
|------------|-----------------|----------------------|
| Patch detection (optimized) | ~70% | 92%+ |
| Function identification (stripped) | ~50% | 85%+ |
| Obfuscation resilience | ~40% | 75%+ |
| False positive rate | ~5% | <2% |
---
## 2. Architecture Overview
```
┌───────────────────────────────────────────────────────────────────────────────┐
│                         Semantic Diffing Architecture                         │
│                                                                               │
│ ┌────────────────────────────────────────────────────────────────────────────┐│
│ │                               Analysis Layer                               ││
│ │                                                                            ││
│ │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        ││
│ │  │ B2R2        │  │ Ghidra      │  │ Decompiler  │  │ ML          │        ││
│ │  │ (Primary)   │  │ (Fallback)  │  │ (Optional)  │  │ (Optional)  │        ││
│ │  │             │  │             │  │             │  │             │        ││
│ │  │ - Disasm    │  │ - P-Code    │  │ - C output  │  │ - CodeBERT  │        ││
│ │  │ - LowUIR    │  │ - BSim      │  │ - AST parse │  │ - GraphSage │        ││
│ │  │ - CFG       │  │ - Ver.Track │  │ - Normalize │  │ - Embedding │        ││
│ │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘        ││
│ │         │                │                │                │               ││
│ └─────────┴────────────────┴────────────────┴────────────────┴───────────────┘│
│                                       │                                       │
│                                       v                                       │
│ ┌────────────────────────────────────────────────────────────────────────────┐│
│ │                             Fingerprint Layer                              ││
│ │                                                                            ││
│ │  ┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐       ││
│ │  │ Instruction       │  │ Semantic          │  │ Decompiled        │       ││
│ │  │ Fingerprint       │  │ Fingerprint       │  │ Fingerprint       │       ││
│ │  │                   │  │                   │  │                   │       ││
│ │  │ - BasicBlock hash │  │ - KSG graph hash  │  │ - AST hash        │       ││
│ │  │ - CFG edge hash   │  │ - WL hash         │  │ - Normalized code │       ││
│ │  │ - String refs     │  │ - DataFlow hash   │  │ - API sequence    │       ││
│ │  │ - Rolling chunks  │  │ - API calls       │  │ - Pattern hash    │       ││
│ │  └───────────────────┘  └───────────────────┘  └───────────────────┘       ││
│ │                                                                            ││
│ │  ┌───────────────────┐  ┌───────────────────┐                              ││
│ │  │ BSim              │  │ ML Embedding      │                              ││
│ │  │ Signature         │  │ Vector            │                              ││
│ │  │                   │  │                   │                              ││
│ │  │ - Feature vector  │  │ - 768-dim float[] │                              ││
│ │  │ - Significance    │  │ - Cosine sim      │                              ││
│ │  └───────────────────┘  └───────────────────┘                              ││
│ │                                                                            ││
│ └────────────────────────────────────────────────────────────────────────────┘│
│                                       │                                       │
│                                       v                                       │
│ ┌────────────────────────────────────────────────────────────────────────────┐│
│ │                               Matching Layer                               ││
│ │                                                                            ││
│ │  ┌──────────────────────────────────────────────────────────────────────┐  ││
│ │  │                       Ensemble Decision Engine                       │  ││
│ │  │                                                                      │  ││
│ │  │  Signal Weights:                                                     │  ││
│ │  │    - Instruction fingerprint: 15%                                    │  ││
│ │  │    - Semantic graph:          25%                                    │  ││
│ │  │    - Decompiled AST:          35%                                    │  ││
│ │  │    - ML embedding:            25%                                    │  ││
│ │  │                                                                      │  ││
│ │  │  Output: Confidence-weighted similarity score                        │  ││
│ │  │                                                                      │  ││
│ │  └──────────────────────────────────────────────────────────────────────┘  ││
│ │                                                                            ││
│ └────────────────────────────────────────────────────────────────────────────┘│
│                                       │                                       │
│                                       v                                       │
│ ┌────────────────────────────────────────────────────────────────────────────┐│
│ │                               Storage Layer                                ││
│ │                                                                            ││
│ │  PostgreSQL                RustFS                    Valkey                ││
│ │  - corpus.* tables         - Fingerprint blobs       - Query cache         ││
│ │  - binaries.* tables       - Model artifacts         - Embedding index     ││
│ │  - BSim database           - Training data                                 ││
│ │                                                                            ││
│ └────────────────────────────────────────────────────────────────────────────┘│
└───────────────────────────────────────────────────────────────────────────────┘
```
---
## 3. Implementation Phases
### Phase 1: IR-Level Semantic Analysis (Foundation)
**Sprint:** `SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
Leverage B2R2's Intermediate Representation (IR) for semantic-level function comparison.
**Key Components:**
- `IrLiftingService` - Lift instructions to LowUIR
- `SemanticGraphExtractor` - Build Key-Semantics Graph (KSG)
- `WeisfeilerLehmanHasher` - Graph fingerprinting
- `SemanticMatcher` - Semantic similarity scoring
**Deliverables:**
- `StellaOps.BinaryIndex.Semantic` library
- 20 tasks, ~3 weeks
### Phase 2: Function Behavior Corpus (Scale)
**Sprint:** `SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`
Build a comprehensive database of known library functions.
**Key Components:**
- Library corpus connectors (glibc, OpenSSL, zlib, curl, SQLite)
- `CorpusIngestionService` - Batch fingerprint generation
- `FunctionClusteringService` - Group similar functions
- `CorpusQueryService` - Function identification
**Deliverables:**
- `StellaOps.BinaryIndex.Corpus` library
- PostgreSQL `corpus.*` schema
- ~30,000 indexed functions
- 22 tasks, ~4 weeks
### Phase 3: Ghidra Integration (Depth)
**Sprint:** `SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
Add Ghidra as a secondary backend for complex cases.
**Key Components:**
- `GhidraHeadlessManager` - Process lifecycle
- `VersionTrackingService` - Multi-correlator diffing
- `GhidriffBridge` - Python interop
- `BSimService` - Behavioral similarity
**Deliverables:**
- `StellaOps.BinaryIndex.Ghidra` library
- Docker image for Ghidra Headless
- 20 tasks, ~4 weeks
### Phase 4: Decompiler & ML (Excellence)
**Sprint:** `SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`
Add decompilation and ML embeddings for the highest-fidelity semantic analysis.
**Key Components:**
- `IDecompilerService` - Ghidra decompilation
- `AstComparisonEngine` - Structural similarity
- `OnnxInferenceEngine` - ML embeddings
- `EnsembleDecisionEngine` - Multi-signal fusion
**Deliverables:**
- `StellaOps.BinaryIndex.Decompiler` library
- `StellaOps.BinaryIndex.ML` library
- Trained CodeBERT-Binary model
- 30 tasks, ~5 weeks
---
## 4. Fingerprint Types
### 4.1 Instruction Fingerprint (Existing)
**Algorithm:** BasicBlock hash + CFG edge hash + String refs hash
**Properties:**
- Fast to compute
- Sensitive to instruction changes
- Good for exact/near-exact matches
**Weight in ensemble:** 15%
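The three-part instruction fingerprint described above can be illustrated with a minimal Python sketch. It is not the existing `StellaOps.BinaryIndex.Fingerprints` implementation: the function names, the SHA-256 choice, and the input shapes (`blocks`, `edges`, `strings`) are all assumptions made for illustration.

```python
import hashlib


def _digest(parts):
    """Hash a list of byte strings, length-prefixing each to avoid ambiguity."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def instruction_fingerprint(blocks, edges, strings):
    """Hypothetical three-part fingerprint.

    blocks:  dict of block id -> raw instruction bytes
    edges:   iterable of (source block id, destination block id)
    strings: iterable of string literals referenced by the function
    """
    # Hash each basic block by content so block ids do not matter.
    block_hashes = {bid: hashlib.sha256(data).hexdigest() for bid, data in blocks.items()}
    block_part = _digest(sorted(h.encode() for h in block_hashes.values()))
    # Describe each CFG edge by the content hashes of its endpoints.
    edge_part = _digest(
        sorted((block_hashes[s] + ">" + block_hashes[d]).encode() for s, d in edges)
    )
    string_part = _digest(sorted(s.encode() for s in strings))
    return {"basic_block": block_part, "cfg_edge": edge_part, "string_refs": string_part}
```

Because every part is a hash of raw bytes, a single changed instruction byte changes the basic-block and CFG-edge parts, which is consistent with the "sensitive to instruction changes" property listed above.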
### 4.2 Semantic Fingerprint (Phase 1)
**Algorithm:** Key-Semantics Graph + Weisfeiler-Lehman hash
**Properties:**
- Captures data/control dependencies
- Resilient to register renaming
- Resilient to instruction reordering
**Weight in ensemble:** 25%
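As a rough illustration of Weisfeiler-Lehman hashing over a labelled graph, the Python sketch below refines node labels from neighbour labels for a few rounds and then hashes the resulting multiset. It is not the planned `WeisfeilerLehmanHasher`: the graph encoding (`adjacency`, `labels`), the iteration count, and the use of SHA-256 are assumptions.

```python
import hashlib


def wl_hash(adjacency, labels, iterations=3):
    """Weisfeiler-Lehman style hash of a labelled directed graph.

    adjacency: dict of node -> list of successor nodes (every node is a key)
    labels:    dict of node -> initial label, e.g. an operation name
    """
    current = dict(labels)
    for _ in range(iterations):
        updated = {}
        for node, successors in adjacency.items():
            # Combine the node's label with the sorted labels of its successors.
            neighbour_labels = sorted(current[n] for n in successors)
            combined = current[node] + "|" + ",".join(neighbour_labels)
            updated[node] = hashlib.sha256(combined.encode()).hexdigest()[:16]
        current = updated
    # Node identities are discarded: only the multiset of final labels is hashed.
    multiset = ",".join(sorted(current.values()))
    return hashlib.sha256(multiset.encode()).hexdigest()
```

Since node identifiers never enter the hash, two graphs that differ only in how nodes are named or ordered hash identically, which is the kind of invariance the properties above rely on.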
### 4.3 Decompiled Fingerprint (Phase 4)
**Algorithm:** Normalized AST hash + Pattern detection
**Properties:**
- Highest semantic fidelity
- Captures algorithmic structure
- Resilient to most optimizations
**Weight in ensemble:** 35%
### 4.4 ML Embedding (Phase 4)
**Algorithm:** CodeBERT-Binary transformer, 768-dim vectors
**Properties:**
- Learned similarity metric
- Captures latent patterns
- Resilient to obfuscation
**Weight in ensemble:** 25%
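Embedding vectors such as the 768-dimensional ones above are typically compared with cosine similarity, which the architecture diagram also names. A minimal pure-Python sketch follows; the function name and the zero-vector handling are assumptions, and a real implementation would more likely use a vectorised library.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length float vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero embedding as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Cosine similarity ignores vector magnitude, so two embeddings pointing in the same direction score 1.0 regardless of scale.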
---
## 5. Matching Pipeline
```mermaid
sequenceDiagram
    participant Client
    participant DiffEngine as PatchDiffEngine
    participant B2R2
    participant Ghidra
    participant Corpus
    participant Ensemble

    Client->>DiffEngine: Compare(oldBinary, newBinary)

    par Parallel Analysis
        DiffEngine->>B2R2: Disassemble + IR lift
        DiffEngine->>Ghidra: Decompile (if needed)
    end

    B2R2-->>DiffEngine: SemanticFingerprints[]
    Ghidra-->>DiffEngine: DecompiledFunctions[]

    DiffEngine->>Corpus: IdentifyFunctions(fingerprints)
    Corpus-->>DiffEngine: FunctionMatches[]

    DiffEngine->>Ensemble: ComputeSimilarity(old, new)
    Ensemble-->>DiffEngine: EnsembleResult

    DiffEngine-->>Client: PatchDiffResult
```
---
## 6. Fallback Strategy
The system uses a tiered fallback strategy:
```
Tier 1: B2R2 IR + Semantic Graph (fast, ~90% coverage)
    │
    │  If confidence < threshold OR architecture unsupported
    v
Tier 2: Ghidra Version Tracking (slower, ~95% coverage)
    │
    │  If function is high-value (CVE-relevant)
    v
Tier 3: Decompiled AST + ML Embedding (slowest, ~99% coverage)
```
**Selection Criteria:**
| Condition | Backend | Reason |
|-----------|---------|--------|
| Standard x64/ARM64 binary | B2R2 only | Fast, accurate |
| Low B2R2 confidence (<0.7) | B2R2 + Ghidra | Validation |
| Exotic architecture | Ghidra only | Better coverage |
| CVE-affected function | Full pipeline | Maximum accuracy |
| Obfuscated binary | ML embedding | Obfuscation resilience |
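The selection criteria can be read as a small decision function. The Python sketch below is one possible encoding, not the planned implementation: the function name, the backend identifiers, the rule precedence (CVE-affected first, then obfuscation, then architecture, then confidence), and treating every architecture other than x86_64 and ARM64 as "exotic" are all assumptions that the table does not specify.

```python
PRIMARY_ARCHS = {"x86_64", "arm64"}  # assumption: the "standard" architectures
LOW_CONFIDENCE = 0.7                 # threshold from the selection table


def select_backends(arch, b2r2_confidence=None, cve_affected=False, obfuscated=False):
    """Return the list of analysis backends to run for one function."""
    if cve_affected:
        # Full pipeline for maximum accuracy.
        return ["b2r2", "ghidra", "decompiler", "ml"]
    if obfuscated:
        return ["ml"]
    if arch not in PRIMARY_ARCHS:
        return ["ghidra"]
    if b2r2_confidence is not None and b2r2_confidence < LOW_CONFIDENCE:
        # Ghidra validates a low-confidence B2R2 result.
        return ["b2r2", "ghidra"]
    return ["b2r2"]
```

Making the precedence explicit matters because the table rows can overlap, for example an obfuscated CVE-affected function on an exotic architecture.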
---
## 7. Corpus Coverage
### Priority Libraries
| Library | Priority | Functions | CVEs |
|---------|----------|-----------|------|
| glibc | Critical | ~15,000 | 50+ |
| OpenSSL | Critical | ~8,000 | 100+ |
| zlib | High | ~200 | 5+ |
| libcurl | High | ~2,000 | 80+ |
| SQLite | High | ~1,500 | 30+ |
| libxml2 | Medium | ~1,200 | 40+ |
| libpng | Medium | ~300 | 10+ |
| expat | Medium | ~150 | 15+ |
### Architecture Coverage
| Architecture | B2R2 | Ghidra | Status |
|--------------|------|--------|--------|
| x86_64 | Excellent | Excellent | Primary |
| ARM64 | Excellent | Excellent | Primary |
| ARM32 | Good | Excellent | Secondary |
| MIPS32 | Fair | Excellent | Fallback |
| MIPS64 | Fair | Excellent | Fallback |
| RISC-V | Good | Good | Emerging |
| PPC32/64 | Fair | Excellent | Fallback |
---
## 8. Performance Characteristics
### Latency Budget
| Operation | Target | Notes |
|-----------|--------|-------|
| B2R2 disassembly | <100ms | Per function |
| IR lifting | <50ms | Per function |
| Semantic fingerprint | <50ms | Per function |
| Ghidra analysis | <30s | Per binary (startup) |
| Decompilation | <500ms | Per function |
| ML inference | <100ms | Per function |
| Ensemble decision | <10ms | Per comparison |
| **Total (Tier 1)** | **<200ms** | Per function |
| **Total (Full)** | **<1s** | Per function |
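The two totals can be cross-checked against the per-function rows with simple arithmetic. The Python sketch below assumes that Tier 1 consists of disassembly, IR lifting, and semantic fingerprinting, and that the per-binary Ghidra startup cost is excluded from the per-function totals; the table does not state either grouping explicitly.

```python
# Per-function latency targets in milliseconds, taken from the table above.
TARGETS_MS = {
    "b2r2_disassembly": 100,
    "ir_lifting": 50,
    "semantic_fingerprint": 50,
    "decompilation": 500,
    "ml_inference": 100,
    "ensemble_decision": 10,
}

# Assumption: the stages that make up the Tier 1 path.
TIER1_STAGES = ("b2r2_disassembly", "ir_lifting", "semantic_fingerprint")


def budget_ms(stages):
    """Sum the latency targets for the given stages."""
    return sum(TARGETS_MS[stage] for stage in stages)


tier1_ms = budget_ms(TIER1_STAGES)  # 100 + 50 + 50 = 200
full_ms = budget_ms(TARGETS_MS)     # 200 + 500 + 100 + 10 = 810
```

Under these assumptions the Tier 1 rows add up to exactly the 200 ms bound, while the full pipeline sums to 810 ms and leaves about 190 ms of headroom under the 1 s target.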
### Memory Budget
| Component | Memory | Notes |
|-----------|--------|-------|
| B2R2 per binary | ~100MB | Scales with binary size |
| Ghidra per project | ~2GB | Persistent cache |
| ML model | ~500MB | ONNX loaded |
| Corpus query cache | ~100MB | LRU eviction |
---
## 9. Integration Points
### 9.1 Scanner Integration
```csharp
// Scanner.Worker uses semantic diffing for binary vulnerability detection
var result = await _binaryVulnerabilityService.LookupByFingerprintAsync(
    fingerprint,
    minSimilarity: 0.85m,
    useSemanticMatching: true, // Enable semantic diffing
    ct);
```
### 9.2 PatchDiffEngine Enhancement
```csharp
// PatchDiffEngine now includes semantic comparison
var diff = await _patchDiffEngine.DiffAsync(
    vulnerableBinary,
    patchedBinary,
    new PatchDiffOptions
    {
        UseSemanticAnalysis = true,
        SemanticThreshold = 0.7m,
        IncludeDecompilation = true,
        IncludeMlEmbedding = true
    },
    ct);
```
### 9.3 DeltaSignature Enhancement
```csharp
// Delta signatures now include semantic fingerprints
var signature = await _deltaSignatureGenerator.GenerateSignaturesAsync(
    binaryStream,
    new DeltaSignatureRequest
    {
        Cve = "CVE-2024-1234",
        TargetSymbols = ["vulnerable_func"],
        IncludeSemanticFingerprint = true,
        IncludeDecompiledHash = true
    },
    ct);
```
---
## 10. Security Considerations
### 10.1 Sandbox Requirements
All binary analysis runs in sandboxed environments:
- Seccomp profile restricting syscalls
- Read-only root filesystem
- No network access during analysis
- Memory/CPU limits
### 10.2 Model Security
ML models are:
- Signed with DSSE attestations
- Verified before loading
- Not user-uploadable (pre-trained only)
### 10.3 Corpus Integrity
Corpus data is:
- Ingested from trusted sources only
- Signed at snapshot level
- Version-controlled with audit trail
---
## 11. Configuration
```yaml
# binaryindex.yaml - Semantic diffing configuration
binaryindex:
  semantic_diffing:
    enabled: true

    # Analysis backends
    backends:
      b2r2:
        enabled: true
        ir_lifting: true
        semantic_graph: true
      ghidra:
        enabled: true
        fallback_only: true
        min_b2r2_confidence: 0.7
        headless_timeout_ms: 30000
      decompiler:
        enabled: true
        high_value_only: true  # Only for CVE-affected functions
      ml:
        enabled: true
        model_path: /models/codebert_binary_v1.onnx
        embedding_dimension: 768

    # Ensemble weights
    ensemble:
      instruction_weight: 0.15
      semantic_weight: 0.25
      decompiled_weight: 0.35
      ml_weight: 0.25
      min_confidence: 0.6

    # Corpus
    corpus:
      auto_update: true
      update_interval_hours: 24
      libraries:
        - glibc
        - openssl
        - zlib
        - curl
        - sqlite

    # Performance
    performance:
      max_parallel_analyses: 4
      cache_ttl_seconds: 3600
      max_function_size_bytes: 1048576  # 1MB
```
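To show how the ensemble weights might combine per-signal similarities, here is a minimal Python sketch of a weighted average. It is not the planned `EnsembleDecisionEngine`: the function name, the signal keys, and in particular the renormalisation over whichever signals are present (so that a Tier 1 run without decompiler or ML output still yields a score in [0, 1]) are assumptions.

```python
# Weights mirror the `ensemble` section of the configuration above.
WEIGHTS = {"instruction": 0.15, "semantic": 0.25, "decompiled": 0.35, "ml": 0.25}
MIN_CONFIDENCE = 0.6  # mirrors `min_confidence`


def ensemble_score(signals, weights=WEIGHTS):
    """Weighted average of the available similarity signals.

    signals: dict of signal name -> similarity in [0, 1], or None if not computed
    """
    available = {name: value for name, value in signals.items() if value is not None}
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        raise ValueError("at least one signal is required")
    # Assumption: renormalise so that missing signals do not drag the score down.
    return sum(weights[name] * value for name, value in available.items()) / total_weight


def is_match(signals):
    """Apply the configured minimum confidence to the combined score."""
    return ensemble_score(signals) >= MIN_CONFIDENCE
```

With all four signals present, similarities of 0.4, 0.8, 0.9, and 0.7 combine to 0.06 + 0.20 + 0.315 + 0.175 = 0.75. The four configured weights sum to 1.0, so no renormalisation takes effect in that case.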
---
## 12. Metrics & Observability
### Metrics
| Metric | Type | Labels |
|--------|------|--------|
| `semantic_diffing_analysis_total` | Counter | backend, result |
| `semantic_diffing_latency_ms` | Histogram | backend, tier |
| `semantic_diffing_accuracy` | Gauge | comparison_type |
| `corpus_functions_total` | Gauge | library |
| `ml_inference_latency_ms` | Histogram | model |
| `ensemble_signal_weight` | Gauge | signal_type |
### Traces
- `semantic_diffing.analyze` - Full analysis span
- `semantic_diffing.b2r2.lift` - IR lifting
- `semantic_diffing.ghidra.decompile` - Decompilation
- `semantic_diffing.ml.inference` - ML embedding
- `semantic_diffing.ensemble.decide` - Ensemble decision
---
## 13. Testing Strategy
### Unit Tests
| Test Suite | Coverage |
|------------|----------|
| `IrLiftingServiceTests` | IR lifting correctness |
| `SemanticGraphExtractorTests` | Graph construction |
| `WeisfeilerLehmanHasherTests` | Hash stability |
| `AstComparisonEngineTests` | AST similarity |
| `OnnxInferenceEngineTests` | ML inference |
| `EnsembleDecisionEngineTests` | Weight combination |
### Integration Tests
| Test Suite | Coverage |
|------------|----------|
| `EndToEndSemanticDiffTests` | Full pipeline |
| `OptimizationResilienceTests` | O0 vs O2 vs O3 |
| `CompilerVariantTests` | GCC vs Clang |
| `GhidraFallbackTests` | Fallback scenarios |
### Golden Corpus Tests
Pre-computed test cases with known results:
- 100 CVE patch pairs (vulnerable -> fixed)
- 50 optimization variant sets
- 25 compiler variant sets
- 25 obfuscation variant sets
---
## 14. Roadmap
| Phase | Status | ETA | Impact |
|-------|--------|-----|--------|
| Phase 1: IR Semantics | Planned | 2026-01-24 | +15% accuracy |
| Phase 2: Corpus | Planned | 2026-02-15 | +10% coverage |
| Phase 3: Ghidra | Planned | 2026-02-28 | +5% edge cases |
| Phase 4: Decompiler/ML | Planned | 2026-03-31 | +10% obfuscation |
| **Total** | | | **+35-40%** |
---
## 15. References
### Internal
- `docs/modules/binary-index/architecture.md`
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Fingerprints/`
### External
- [B2R2 Binary Analysis Framework](https://b2r2.org/)
- [Ghidra Patch Diffing Guide](https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing)
- [ghidriff Tool](https://github.com/clearbluejar/ghidriff)
- [SemDiff Paper (arXiv)](https://arxiv.org/abs/2308.01463)
- [SEI Semantic Equivalence Research](https://www.sei.cmu.edu/annual-reviews/2022-research-review/semantic-equivalence-checking-of-decompiled-binaries/)
---
*Document Version: 1.0.0*
*Last Updated: 2026-01-05*

View File

@@ -1,46 +0,0 @@
# BinaryIndex
**Status:** Implemented
**Source:** `src/BinaryIndex/`
**Owner:** Scanner Guild + Concelier Guild
## Purpose
BinaryIndex provides vulnerable binary detection independent of package metadata. It addresses the gap where package version strings can lie (backports, custom builds, stripped metadata) through binary-first vulnerability identification using Build-IDs, hash catalogs, and function fingerprints.
## Components
**Libraries:**
- `StellaOps.BinaryIndex.Core` - Core binary identity extraction and matching engine
- `StellaOps.BinaryIndex.Corpus` - Binary-to-advisory mapping database
- `StellaOps.BinaryIndex.Corpus.Debian` - Debian-specific corpus support
- `StellaOps.BinaryIndex.Fingerprints` - Function fingerprint storage and matching (CFG/basic-block hashes)
- `StellaOps.BinaryIndex.FixIndex` - Patch-aware backport handling
- `StellaOps.BinaryIndex.Persistence` - Storage adapters for binary catalogs
## Configuration
Configuration is typically embedded in Scanner and Concelier module settings.
Key features:
- Three-tier binary identification (package/version, Build-ID/hash, function fingerprints)
- Binary identity extraction (Build-ID, PE CodeView GUID, Mach-O UUID)
- Integration with Scanner.Worker for binary lookup
- Offline-first design with deterministic outputs
## Dependencies
- PostgreSQL (integrated with Scanner/Concelier schemas)
- Scanner.Analyzers.Native (for binary disassembly/analysis)
- Concelier (for advisory-to-binary mapping)
## Related Documentation
- Architecture: `./architecture.md`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- Scanner Architecture: `../scanner/architecture.md`
- Concelier Architecture: `../concelier/architecture.md`
## Current Status
Library implementation complete with support for ELF (Build-ID), PE (CodeView GUID), and Mach-O (UUID) binary formats. Integrated into Scanner's native binary analysis pipeline.

View File

@@ -418,7 +418,7 @@ Additional notes:
- [Aggregation-Only Contract reference](../../../aoc/aggregation-only-contract.md)
- [Architecture overview](../../platform/architecture-overview.md)
- [Console operator guide](../../../15_UI_GUIDE.md)
- [Console operator guide](../../../UI_GUIDE.md)
- [Authority scopes](../../authority/architecture.md)
- [Task Pack CLI profiles](./packs-profiles.md)

View File

@@ -158,5 +158,5 @@ stella scan replay \
## See Also
- [Deterministic Replay Specification](../../replay/DETERMINISTIC_REPLAY.md)
- [Offline Kit Documentation](../../24_OFFLINE_KIT.md)
- [Offline Kit Documentation](../../OFFLINE_KIT.md)
- [Evidence Bundle Format](./evidence-bundle-format.md)

View File

@@ -490,7 +490,7 @@ When operating in air-gapped environments:
--expected-digest sha256:...
```
For full offline kit support, see the [Offline Kit documentation](../../../24_OFFLINE_KIT.md).
For full offline kit support, see the [Offline Kit documentation](../../../OFFLINE_KIT.md).
---
@@ -499,4 +499,4 @@ For full offline kit support, see the [Offline Kit documentation](../../../24_OF
- [VEX Consensus CLI](./vex-cli.md) - VEX status management
- [Policy Simulation](../../policy/guides/simulation.md) - Policy testing
- [Authentication Guide](./auth-cli.md) - Token management
- [API Reference](../../../09_API_CLI_REFERENCE.md) - Full API documentation
- [API Reference](../../../API_CLI_REFERENCE.md) - Full API documentation

View File

@@ -35,13 +35,13 @@ Concelier ingests signed advisories from **32 advisory connectors** and converts
- Connector runbooks in ./operations/connectors/.
- Mirror operations for Offline Kit parity.
- Grafana dashboards for connector health.
- **Authority toggle rollout (2025-10-22 update).** Follow the phased table and audit checklist in `../../10_CONCELIER_CLI_QUICKSTART.md` when enabling `authority.enabled`/`authority.allowAnonymousFallback`, and cross-check the refreshed `./operations/authority-audit-runbook.md` before enforcement.
- **Authority toggle rollout (2025-10-22 update).** Follow the phased table and audit checklist in `../../CONCELIER_CLI_QUICKSTART.md` when enabling `authority.enabled`/`authority.allowAnonymousFallback`, and cross-check the refreshed `./operations/authority-audit-runbook.md` before enforcement.
## Related resources
- ./operations/conflict-resolution.md
- ./operations/mirror.md
- ./operations/authority-audit-runbook.md
- ../../10_CONCELIER_CLI_QUICKSTART.md (authority integration timeline & smoke tests)
- ../../CONCELIER_CLI_QUICKSTART.md (authority integration timeline & smoke tests)
## Backlog references
- DOCS-LNM-22-001, DOCS-LNM-22-007 in ../../TASKS.md.

View File

@@ -132,7 +132,7 @@ operating offline.
## 4. Locale & Translation Guidance
- Advisories remain in German (`language: "de"`). Preserve wording for provenance and legal accuracy.
- UI localisation: enable the translation bundles documented in `docs/15_UI_GUIDE.md` if English UI copy is required. Operators can overlay machine or human translations, but the canonical database stores the source text.
- UI localisation: enable the translation bundles documented in `docs/UI_GUIDE.md` if English UI copy is required. Operators can overlay machine or human translations, but the canonical database stores the source text.
- Docs guild is compiling a CERT-Bund terminology glossary under `docs/locale/certbund-glossary.md` so downstream teams can reference consistent English equivalents without altering the stored advisories.
---

View File

@@ -42,7 +42,7 @@ Key features:
- Signer Module: `../signer/`
- Attestor Module: `../attestor/`
- Authority Module: `../authority/`
- Air-Gap Operations: `../../24_OFFLINE_KIT.md`
- Air-Gap Operations: `../../OFFLINE_KIT.md`
## Current Status

View File

@@ -295,4 +295,4 @@ For air-gapped deployments:
* Multi-profile signing: `./multi-profile-signing-specification.md`
* Signer module: `../signer/architecture.md`
* Attestor module: `../attestor/architecture.md`
* Offline operations: `../../24_OFFLINE_KIT.md`
* Offline operations: `../../OFFLINE_KIT.md`

View File

@@ -41,7 +41,7 @@ Key settings:
- Operations: `./operations/` (if exists)
- ExportCenter: `../export-center/`
- Attestor: `../attestor/`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`
## Current Status

View File

@@ -152,7 +152,7 @@ Downstream automation reads `manifest.json`/`bundle.json` directly, while `/exci
* Track quota utilisation via HTTP 429 metrics (configure structured logging or OTEL counters when rate limiting triggers).
* Mirror domains can be deployed per tenant (e.g., `tenant-a`, `tenant-b`) with different auth requirements.
* Ensure the underlying artifact stores (`FileSystem`, `S3`, offline bundle) retain artefacts long enough for mirrors to sync.
* For air-gapped mirrors, combine mirror endpoints with the Offline Kit (see `docs/24_OFFLINE_KIT.md`).
* For air-gapped mirrors, combine mirror endpoints with the Offline Kit (see `docs/OFFLINE_KIT.md`).
---

@@ -409,7 +409,7 @@ gates:
| POST | `/api/v1/authority/verdicts/{manifestId}/replay` | Verify replay |
| GET | `/api/v1/authority/verdicts/{manifestId}/download` | Download signed manifest |
See `docs/09_API_CLI_REFERENCE.md` for complete API documentation.
See `docs/API_CLI_REFERENCE.md` for complete API documentation.
---
@@ -506,7 +506,7 @@ Note: Conflict recorded in audit trail
- [Excititor Architecture](./architecture.md)
- [Verdict Manifest Specification](../authority/verdict-manifest.md)
- [Policy Gates Configuration](../policy/architecture.md)
- [API Reference](../../09_API_CLI_REFERENCE.md)
- [API Reference](../../API_CLI_REFERENCE.md)
---

@@ -200,6 +200,6 @@ If encryption enabled, decrypt using age or AES key before verification.
- `docs/modules/export-center/mirror-bundles.md`
- `ops/devops/TASKS.md` (`DEVOPS-EXPORT-36-001`, `DEVOPS-EXPORT-37-001`)
- `docs/aoc/aggregation-only-contract.md`
- `docs/24_OFFLINE_KIT.md`
- `docs/OFFLINE_KIT.md`
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

@@ -40,7 +40,7 @@ See `docs/security/policy-governance.md` and `docs/aoc/aggregation-only-contract
- **Mirror bundles.** `mirror:full` packages raw evidence, normalized indexes, policy snapshots, and provenance in a portable filesystem layout suitable for disconnected environments. `mirror:delta` tracks changes relative to a prior export manifest.
- **No unsanctioned egress.** The exporter respects the platform allowlist. External calls (e.g., OCI pushes) require explicit configuration and are disabled by default for offline installs.
Consult `docs/24_OFFLINE_KIT.md` for Offline Kit delivery and `docs/modules/concelier/operations/mirror.md` for mirror ingestion procedures.
Consult `docs/OFFLINE_KIT.md` for Offline Kit delivery and `docs/modules/concelier/operations/mirror.md` for mirror ingestion procedures.
## Getting started
1. **Choose a profile.** Map requirements to the profile table above. Policy-aware exports need a published policy snapshot.

@@ -0,0 +1,700 @@
# Facet Sealing Architecture
> **Ownership:** Scanner Guild, Policy Guild
> **Audience:** Service owners, platform engineers, security architects
> **Related:** [Platform Architecture](../platform/architecture-overview.md), [Scanner Architecture](../scanner/architecture.md), [Replay Architecture](../replay/architecture.md), [Policy Engine](../policy/architecture.md)
This dossier describes the Facet Sealing subsystem, which provides cryptographically sealed manifests for logical slices of container images, enabling fine-grained drift detection, per-facet quota enforcement, and deterministic change tracking.
---
## 1. Overview
A **Facet** is a declared logical slice of a container image representing a cohesive set of files with shared characteristics:
| Facet Type | Description | Examples |
|------------|-------------|----------|
| `os` | Operating system packages | `/var/lib/dpkg/**`, `/var/lib/rpm/**` |
| `lang/<ecosystem>` | Language-specific dependencies | `node_modules/**`, `site-packages/**`, `vendor/**` |
| `binary` | Native binaries and shared libraries | `/usr/bin/*`, `/lib/**/*.so*` |
| `config` | Configuration files | `/etc/**`, `*.conf`, `*.yaml` |
| `custom` | User-defined patterns | Project-specific paths |
Each facet can be individually **sealed** (cryptographic snapshot) and monitored for **drift** (changes between seals).
---
## 2. System Landscape
```mermaid
graph TD
subgraph Scanner["Scanner Services"]
FE[FacetExtractor]
FH[FacetHasher]
MB[MerkleBuilder]
end
subgraph Storage["Facet Storage"]
FS[(PostgreSQL<br/>facet_seals)]
FC[(CAS<br/>facet_manifests)]
end
subgraph Policy["Policy & Enforcement"]
DC[DriftCalculator]
QE[QuotaEnforcer]
AV[AdmissionValidator]
end
subgraph Signing["Attestation"]
DS[DSSE Signer]
AT[Attestor]
end
subgraph CLI["CLI & Integration"]
SealCmd[stella seal]
DriftCmd[stella drift]
VexCmd[stella vex gen]
Zastava[Zastava Webhook]
end
FE --> FH
FH --> MB
MB --> DS
DS --> FS
DS --> FC
FS --> DC
DC --> QE
QE --> AV
AV --> Zastava
SealCmd --> FE
DriftCmd --> DC
VexCmd --> DC
```
---
## 3. Core Data Models
### 3.1 FacetDefinition
Declares a facet with its extraction patterns and quota constraints:
```csharp
public sealed record FacetDefinition
{
public required string FacetId { get; init; } // e.g., "os", "lang/node", "binary"
public required FacetType Type { get; init; } // OS, LangNode, LangPython, Binary, Config, Custom
public required ImmutableArray<string> IncludeGlobs { get; init; }
public ImmutableArray<string> ExcludeGlobs { get; init; } = [];
public FacetQuota? Quota { get; init; }
}
public enum FacetType
{
OS,
LangNode,
LangPython,
LangGo,
LangRust,
LangJava,
LangDotNet,
Binary,
Config,
Custom
}
```
### 3.2 FacetManifest
Per-facet file manifest with Merkle root:
```csharp
public sealed record FacetManifest
{
public required string FacetId { get; init; }
public required FacetType Type { get; init; }
public required ImmutableArray<FacetFileEntry> Files { get; init; }
public required string MerkleRoot { get; init; } // SHA-256 hex
public required int FileCount { get; init; }
public required long TotalBytes { get; init; }
public required DateTimeOffset ExtractedAt { get; init; }
public required string ExtractorVersion { get; init; }
}
public sealed record FacetFileEntry
{
public required string Path { get; init; } // Normalized POSIX path
public required string ContentHash { get; init; } // SHA-256 hex
public required long Size { get; init; }
public required string Mode { get; init; } // POSIX mode string "0644"
public required DateTimeOffset ModTime { get; init; } // Normalized to UTC
}
```
### 3.3 FacetSeal
DSSE-signed seal combining manifest with metadata:
```csharp
public sealed record FacetSeal
{
public required Guid SealId { get; init; }
public required string ImageRef { get; init; } // registry/repo:tag@sha256:...
public required string ImageDigest { get; init; } // sha256:...
public required FacetManifest Manifest { get; init; }
public required DateTimeOffset SealedAt { get; init; }
public required string SealedBy { get; init; } // Identity/service
public required FacetQuota? AppliedQuota { get; init; }
public required DsseEnvelope Envelope { get; init; }
}
```
### 3.4 FacetQuota
Per-facet change budget:
```csharp
public sealed record FacetQuota
{
public required string FacetId { get; init; }
public double MaxChurnPercent { get; init; } = 5.0; // 0-100
public int MaxChangedFiles { get; init; } = 50;
public int MaxAddedFiles { get; init; } = 25;
public int MaxRemovedFiles { get; init; } = 10;
public QuotaAction OnExceed { get; init; } = QuotaAction.Warn;
}
public enum QuotaAction
{
Warn, // Log warning, allow admission
Block, // Reject admission
RequireVex // Require VEX justification before admission
}
```
### 3.5 FacetDrift
Drift calculation result between two seals:
```csharp
public sealed record FacetDrift
{
public required string FacetId { get; init; }
public required Guid BaselineSealId { get; init; }
public required Guid CurrentSealId { get; init; }
public required ImmutableArray<DriftEntry> Added { get; init; }
public required ImmutableArray<DriftEntry> Removed { get; init; }
public required ImmutableArray<DriftEntry> Modified { get; init; }
public required DriftScore Score { get; init; }
public required QuotaVerdict QuotaVerdict { get; init; }
}
public sealed record DriftEntry
{
public required string Path { get; init; }
public string? OldHash { get; init; }
public string? NewHash { get; init; }
public long? OldSize { get; init; }
public long? NewSize { get; init; }
public DriftCause Cause { get; init; } = DriftCause.Unknown;
}
public enum DriftCause
{
Unknown,
PackageUpdate,
ConfigChange,
BinaryRebuild,
NewDependency,
RemovedDependency,
SecurityPatch
}
public sealed record DriftScore
{
public required int TotalChanges { get; init; }
public required double ChurnPercent { get; init; }
public required int AddedCount { get; init; }
public required int RemovedCount { get; init; }
public required int ModifiedCount { get; init; }
}
public sealed record QuotaVerdict
{
public required bool Passed { get; init; }
public required ImmutableArray<QuotaViolation> Violations { get; init; }
public required QuotaAction RecommendedAction { get; init; }
}
public sealed record QuotaViolation
{
public required string QuotaField { get; init; } // e.g., "MaxChurnPercent"
public required double Limit { get; init; }
public required double Actual { get; init; }
public required string Message { get; init; }
}
```
---
## 4. Component Architecture
### 4.1 FacetExtractor
Extracts file entries from container images based on facet definitions:
```csharp
public interface IFacetExtractor
{
Task<FacetManifest> ExtractAsync(
string imageRef,
FacetDefinition definition,
CancellationToken ct = default);
Task<ImmutableArray<FacetManifest>> ExtractAllAsync(
string imageRef,
ImmutableArray<FacetDefinition> definitions,
CancellationToken ct = default);
}
```
Implementation notes:
- Uses existing `ISurfaceReader` for container layer traversal
- Normalizes paths to POSIX format (forward slashes, no trailing slashes)
- Computes SHA-256 content hashes for each file
- Normalizes timestamps to UTC, mode to POSIX string
- Sorts files lexicographically for deterministic ordering
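The normalization and ordering rules above can be sketched as follows. This is a minimal illustration rather than the actual extractor: `NormalizePath` is a hypothetical helper, and ordinal comparison is an assumed reading of "lexicographically" (it keeps the order independent of the host culture).

```csharp
using System;
using System.Linq;

// Hypothetical helper: forward slashes only, no trailing slash (the root "/" is kept).
string NormalizePath(string path)
{
    var posix = path.Replace('\\', '/');
    return posix.Length > 1 ? posix.TrimEnd('/') : posix;
}

var raw = new[] { "usr\\lib\\libssl.so.3", "/etc/apt/", "/etc/Zoneinfo", "/etc/apt/sources.list" };

// Ordinal ordering is culture-independent, so every host produces the same sequence.
var ordered = raw.Select(NormalizePath)
                 .OrderBy(p => p, StringComparer.Ordinal)
                 .ToArray();

foreach (var p in ordered)
    Console.WriteLine(p);
```

Ordinal ordering sorts uppercase before lowercase, which may look unusual but is stable across platforms and locales.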
### 4.2 FacetHasher
Computes Merkle tree for facet file entries:
```csharp
public interface IFacetHasher
{
FacetMerkleResult ComputeMerkle(ImmutableArray<FacetFileEntry> files);
}
public sealed record FacetMerkleResult
{
public required string Root { get; init; }
public required ImmutableArray<string> LeafHashes { get; init; }
public required ImmutableArray<MerkleProofNode> Proof { get; init; }
}
```
Implementation notes:
- Leaf hash = SHA-256(path || contentHash || size || mode)
- Binary Merkle tree with lexicographic leaf ordering
- Empty facet produces well-known empty root hash
- Proof enables verification of individual file membership
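The leaf and tree rules above can be sketched as follows. This is an illustration under stated assumptions, not the shipped `FacetHasher`: the newline separator between leaf fields, the promotion of an unpaired trailing node, and SHA-256 of zero bytes as the empty root are all assumptions, because the notes do not fix these details.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Leaf = SHA-256(path || contentHash || size || mode).
// The "\n" field separator is an assumption that keeps the concatenation unambiguous.
byte[] LeafHash(string path, string contentHash, long size, string mode) =>
    SHA256.HashData(Encoding.UTF8.GetBytes(
        string.Join("\n", path, contentHash, size.ToString(CultureInfo.InvariantCulture), mode)));

string MerkleRoot(IEnumerable<(string Path, string ContentHash, long Size, string Mode)> entries)
{
    // Ordinal leaf ordering makes the root independent of input order.
    var level = entries
        .OrderBy(e => e.Path, StringComparer.Ordinal)
        .Select(e => LeafHash(e.Path, e.ContentHash, e.Size, e.Mode))
        .ToList();

    // Assumed empty root: SHA-256 over zero bytes.
    if (level.Count == 0)
        return Convert.ToHexString(SHA256.HashData(Array.Empty<byte>())).ToLowerInvariant();

    while (level.Count > 1)
    {
        var next = new List<byte[]>();
        for (var i = 0; i < level.Count; i += 2)
        {
            // Hash adjacent pairs; an unpaired trailing node is promoted unchanged (assumption).
            next.Add(i + 1 < level.Count
                ? SHA256.HashData(level[i].Concat(level[i + 1]).ToArray())
                : level[i]);
        }
        level = next;
    }
    return Convert.ToHexString(level[0]).ToLowerInvariant();
}

var sample = new[]
{
    (Path: "/usr/bin/curl", ContentHash: "sha256:bbb", Size: 4096L, Mode: "0755"),
    (Path: "/etc/os-release", ContentHash: "sha256:aaa", Size: 256L, Mode: "0644"),
};
Console.WriteLine(MerkleRoot(sample));
```

Because leaves are sorted before hashing, two extractions that list the same files in a different order yield the same root.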
### 4.3 FacetSealStore
PostgreSQL storage for sealed facet manifests:
```sql
-- Core seal storage
CREATE TABLE facet_seals (
seal_id UUID PRIMARY KEY,
tenant TEXT NOT NULL,
image_ref TEXT NOT NULL,
image_digest TEXT NOT NULL,
facet_id TEXT NOT NULL,
facet_type TEXT NOT NULL,
merkle_root TEXT NOT NULL,
file_count INTEGER NOT NULL,
total_bytes BIGINT NOT NULL,
sealed_at TIMESTAMPTZ NOT NULL,
sealed_by TEXT NOT NULL,
quota_json JSONB,
manifest_cas TEXT NOT NULL, -- CAS URI to full manifest
dsse_envelope JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT uq_facet_seal UNIQUE (tenant, image_digest, facet_id)
);
CREATE INDEX ix_facet_seals_image ON facet_seals (tenant, image_digest);
CREATE INDEX ix_facet_seals_merkle ON facet_seals (merkle_root);
-- Drift history
CREATE TABLE facet_drift_history (
drift_id UUID PRIMARY KEY,
tenant TEXT NOT NULL,
baseline_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
current_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
facet_id TEXT NOT NULL,
drift_score_json JSONB NOT NULL,
quota_verdict_json JSONB NOT NULL,
computed_at TIMESTAMPTZ NOT NULL,
CONSTRAINT uq_drift_pair UNIQUE (baseline_seal_id, current_seal_id)
);
```
### 4.4 DriftCalculator
Computes drift between baseline and current seals:
```csharp
public interface IDriftCalculator
{
Task<FacetDrift> CalculateAsync(
Guid baselineSealId,
Guid currentSealId,
CancellationToken ct = default);
Task<ImmutableArray<FacetDrift>> CalculateAllAsync(
string imageDigestBaseline,
string imageDigestCurrent,
CancellationToken ct = default);
}
```
Implementation notes:
- Retrieves manifests from CAS via seal metadata
- Performs set difference operations on file paths
- Detects modifications via content hash comparison
- Attributes drift causes where determinable (e.g., package manager metadata)
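The set operations above can be sketched with two path-to-hash maps. This is a simplified illustration: the dictionary shape and the choice of the baseline file count as the churn denominator are assumptions, since the document does not define how `ChurnPercent` is computed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Baseline and current manifests reduced to path -> content hash (hypothetical shape).
var baseline = new Dictionary<string, string>
{
    ["/etc/os-release"] = "sha256:aaa",
    ["/usr/bin/curl"] = "sha256:bbb",
    ["/usr/bin/wget"] = "sha256:ccc",
};
var current = new Dictionary<string, string>
{
    ["/etc/os-release"] = "sha256:aaa",
    ["/usr/bin/curl"] = "sha256:ddd", // content changed
    ["/usr/bin/jq"] = "sha256:eee",   // new file
};

// Set differences on paths, then hash comparison on the intersection.
var added = current.Keys.Except(baseline.Keys).OrderBy(p => p, StringComparer.Ordinal).ToList();
var removed = baseline.Keys.Except(current.Keys).OrderBy(p => p, StringComparer.Ordinal).ToList();
var modified = baseline.Keys.Intersect(current.Keys)
    .Where(p => baseline[p] != current[p])
    .OrderBy(p => p, StringComparer.Ordinal)
    .ToList();

var totalChanges = added.Count + removed.Count + modified.Count;
// Assumed definition: churn relative to the baseline file count.
var churnPercent = baseline.Count == 0 ? 0.0 : 100.0 * totalChanges / baseline.Count;

Console.WriteLine($"added={added.Count} removed={removed.Count} modified={modified.Count} churn={churnPercent}");
```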
### 4.5 QuotaEnforcer
Evaluates drift against quota constraints:
```csharp
public interface IQuotaEnforcer
{
QuotaVerdict Evaluate(FacetDrift drift, FacetQuota quota);
Task<ImmutableArray<QuotaVerdict>> EvaluateAllAsync(
ImmutableArray<FacetDrift> drifts,
ImmutableDictionary<string, FacetQuota> quotas,
CancellationToken ct = default);
}
```
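A quota evaluation along these lines might look like the sketch below. It uses plain tuples instead of the `FacetDrift` and `FacetQuota` records so it stays self-contained, and it assumes that `MaxChangedFiles` is compared against the modified-file count; the document does not say whether the total change count is meant instead.

```csharp
using System;
using System.Collections.Generic;

// Returns one (field, limit, actual) entry per exceeded limit.
List<(string Field, double Limit, double Actual)> EvaluateQuota(
    (double ChurnPercent, int Added, int Removed, int Modified) score,
    (double MaxChurnPercent, int MaxChangedFiles, int MaxAddedFiles, int MaxRemovedFiles) quota)
{
    var violations = new List<(string Field, double Limit, double Actual)>();

    void Check(string field, double limit, double actual)
    {
        if (actual > limit)
            violations.Add((field, limit, actual));
    }

    Check("MaxChurnPercent", quota.MaxChurnPercent, score.ChurnPercent);
    // Mapping MaxChangedFiles to the modified count is an assumption.
    Check("MaxChangedFiles", quota.MaxChangedFiles, score.Modified);
    Check("MaxAddedFiles", quota.MaxAddedFiles, score.Added);
    Check("MaxRemovedFiles", quota.MaxRemovedFiles, score.Removed);
    return violations;
}

var result = EvaluateQuota(
    score: (ChurnPercent: 7.5, Added: 3, Removed: 12, Modified: 40),
    quota: (MaxChurnPercent: 5.0, MaxChangedFiles: 50, MaxAddedFiles: 25, MaxRemovedFiles: 10));

foreach (var v in result)
    Console.WriteLine($"{v.Field}: limit {v.Limit}, actual {v.Actual}");
```

A verdict passes when the returned list is empty; otherwise the quota's `OnExceed` action would decide what happens next.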
### 4.6 AdmissionValidator
Zastava webhook integration for admission control:
```csharp
public interface IFacetAdmissionValidator
{
Task<AdmissionResult> ValidateAsync(
AdmissionRequest request,
CancellationToken ct = default);
}
public sealed record AdmissionResult
{
public required bool Allowed { get; init; }
public string? Message { get; init; }
public ImmutableArray<QuotaViolation> Violations { get; init; } = [];
public string? RequiredVexStatement { get; init; }
}
```
---
## 5. DSSE Envelope Structure
Facet seals use DSSE (Dead Simple Signing Envelope) for cryptographic binding:
```json
{
"payloadType": "application/vnd.stellaops.facet-seal.v1+json",
"payload": "<base64url-encoded canonical JSON of FacetSeal>",
"signatures": [
{
"keyid": "sha256:abc123...",
"sig": "<base64url-encoded signature>"
}
]
}
```
Payload structure (canonical JSON, RFC 8785):
```json
{
"_type": "https://stellaops.io/FacetSeal/v1",
"facetId": "os",
"facetType": "OS",
"imageDigest": "sha256:abc123...",
"imageRef": "registry.example.com/app:v1.2.3",
"manifest": {
"extractedAt": "2026-01-05T10:00:00.000Z",
"extractorVersion": "1.0.0",
"fileCount": 1234,
"files": [
{
"contentHash": "sha256:...",
"mode": "0644",
"modTime": "2026-01-01T00:00:00.000Z",
"path": "/etc/os-release",
"size": 256
}
],
"merkleRoot": "sha256:def456...",
"totalBytes": 1048576
},
"quota": {
"maxAddedFiles": 25,
"maxChangedFiles": 50,
"maxChurnPercent": 5.0,
"maxRemovedFiles": 10,
"onExceed": "Warn"
},
"sealId": "550e8400-e29b-41d4-a716-446655440000",
"sealedAt": "2026-01-05T10:05:00.000Z",
"sealedBy": "scanner-worker-01"
}
```
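Per the DSSE specification, the signature in the envelope is computed over the pre-authentication encoding (PAE) of the payload type and the raw payload bytes, not over the base64 text. A minimal sketch of that encoding follows; the sample body is a stand-in for the canonical `FacetSeal` JSON, and signing itself is left out.

```csharp
using System;
using System.Linq;
using System.Text;

// DSSE pre-authentication encoding:
// "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body, with LEN as ASCII decimal byte counts.
byte[] Pae(string payloadType, byte[] payload)
{
    var typeBytes = Encoding.UTF8.GetBytes(payloadType);
    var prefix = Encoding.UTF8.GetBytes(
        FormattableString.Invariant($"DSSEv1 {typeBytes.Length} {payloadType} {payload.Length} "));
    return prefix.Concat(payload).ToArray();
}

var body = Encoding.UTF8.GetBytes("{\"facetId\":\"os\"}"); // stand-in for the canonical payload
var toSign = Pae("application/vnd.stellaops.facet-seal.v1+json", body);
Console.WriteLine(Encoding.UTF8.GetString(toSign));
```

A verifier rebuilds the same byte string from the decoded payload and the `payloadType` field before checking each signature.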
---
## 6. Default Facet Definitions
Standard facet definitions applied when no custom configuration is provided:
```yaml
# Default facet configuration
facets:
  - facetId: os
    type: OS
    includeGlobs:
      - /var/lib/dpkg/**
      - /var/lib/rpm/**
      - /var/lib/pacman/**
      - /var/lib/apk/**
      - /var/cache/apt/**
      - /etc/apt/**
      - /etc/yum.repos.d/**
    excludeGlobs:
      - "**/*.log"
    quota:
      maxChurnPercent: 5.0
      maxChangedFiles: 100
      onExceed: Warn
  - facetId: lang/node
    type: LangNode
    includeGlobs:
      - "**/node_modules/**"
      - "**/package.json"
      - "**/package-lock.json"
      - "**/yarn.lock"
      - "**/pnpm-lock.yaml"
    quota:
      maxChurnPercent: 10.0
      maxChangedFiles: 500
      onExceed: RequireVex
  - facetId: lang/python
    type: LangPython
    includeGlobs:
      - "**/site-packages/**"
      - "**/dist-packages/**"
      - "**/requirements.txt"
      - "**/Pipfile.lock"
      - "**/poetry.lock"
    quota:
      maxChurnPercent: 10.0
      maxChangedFiles: 200
      onExceed: Warn
  - facetId: lang/go
    type: LangGo
    includeGlobs:
      - "**/go.mod"
      - "**/go.sum"
      - "**/vendor/**"
    quota:
      maxChurnPercent: 15.0
      maxChangedFiles: 100
      onExceed: Warn
  - facetId: binary
    type: Binary
    includeGlobs:
      - /usr/bin/*
      - /usr/sbin/*
      - /bin/*
      - /sbin/*
      - /usr/lib/**/*.so*
      - /lib/**/*.so*
      - /usr/local/bin/*
    excludeGlobs:
      - "**/*.py"
      - "**/*.sh"
    quota:
      maxChurnPercent: 2.0
      maxChangedFiles: 20
      onExceed: Block
  - facetId: config
    type: Config
    includeGlobs:
      - /etc/**
      - "**/*.conf"
      - "**/*.cfg"
      - "**/*.ini"
      - "**/*.yaml"
      - "**/*.yml"
      - "**/*.json"
    excludeGlobs:
      - /etc/passwd
      - /etc/shadow
      - /etc/group
      - "**/*.log"
    quota:
      maxChurnPercent: 20.0
      maxChangedFiles: 50
      onExceed: Warn
```
---
## 7. Integration Points
### 7.1 Scanner Integration
Scanner invokes facet extraction during scan:
```csharp
// In ScanOrchestrator
var facetDefs = await _facetConfigLoader.LoadAsync(scanRequest.FacetConfig, ct);
var manifests = await _facetExtractor.ExtractAllAsync(imageRef, facetDefs, ct);
foreach (var manifest in manifests)
{
var seal = await _facetSealer.SealAsync(manifest, scanRequest, ct);
await _facetSealStore.SaveAsync(seal, ct);
}
```
### 7.2 CLI Integration
```bash
# Seal all facets for an image
stella seal myregistry.io/app:v1.2.3 --output seals.json
# Seal specific facets
stella seal myregistry.io/app:v1.2.3 --facet os --facet lang/node
# Check drift between two image versions
stella drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4 --format json
# Generate VEX from drift
stella vex gen --from-drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4
```
### 7.3 Zastava Webhook Integration
```csharp
// In FacetAdmissionValidator
// Allow/Deny/RequireVex are assumed factory helpers on AdmissionResult (not defined in section 4.6);
// they are named so they do not clash with the record's Allowed property.
public async Task<AdmissionResult> ValidateAsync(AdmissionRequest request, CancellationToken ct)
{
    // Find baseline seals (latest approved), one per facet
    var baseline = await _sealStore.GetLatestApprovedAsync(request.ImageRef, ct);
    if (baseline is null)
        return AdmissionResult.Allow("No baseline seal found, skipping facet check");
    // Extract current facets
    var currentManifests = await _extractor.ExtractAllAsync(request.ImageRef, _defaultFacets, ct);
    // Calculate drift and evaluate the baseline quota for each facet
    var violations = new List<QuotaViolation>();
    var maxAction = QuotaAction.Warn;
    foreach (var manifest in currentManifests)
    {
        var baselineSeal = baseline.FirstOrDefault(s => s.Manifest.FacetId == manifest.FacetId);
        if (baselineSeal?.AppliedQuota is null)
            continue; // No baseline or no quota for this facet
        // Assumes a calculator overload that compares a stored seal with a fresh manifest;
        // the interface in section 4.4 takes two seal IDs.
        var drift = await _driftCalculator.CalculateAsync(baselineSeal, manifest, ct);
        var verdict = _quotaEnforcer.Evaluate(drift, baselineSeal.AppliedQuota);
        if (verdict.Passed)
            continue;
        violations.AddRange(verdict.Violations);
        // Compare by severity rather than enum order, so Block outranks RequireVex
        if (Severity(verdict.RecommendedAction) > Severity(maxAction))
            maxAction = verdict.RecommendedAction;
    }
    return maxAction switch
    {
        QuotaAction.Block => AdmissionResult.Deny(violations),
        QuotaAction.RequireVex => AdmissionResult.RequireVex(violations),
        _ => AdmissionResult.Allow(violations)
    };
}
// Severity ranking: Warn < RequireVex < Block
private static int Severity(QuotaAction action) => action switch
{
    QuotaAction.Block => 2,
    QuotaAction.RequireVex => 1,
    _ => 0
};
```
---
## 8. Observability
### 8.1 Metrics
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `facet_seal_total` | Counter | `tenant`, `facet_type`, `status` | Total seals created |
| `facet_seal_duration_seconds` | Histogram | `facet_type` | Time to create seal |
| `facet_drift_score` | Gauge | `tenant`, `facet_id`, `image` | Current drift score |
| `facet_quota_violations_total` | Counter | `tenant`, `facet_id`, `quota_field` | Quota violations |
| `facet_admission_decisions_total` | Counter | `tenant`, `decision`, `facet_id` | Admission decisions |
### 8.2 Traces
```
facet.extract - Facet file extraction from image
facet.hash - Merkle tree computation
facet.seal - DSSE signing
facet.drift.compute - Drift calculation
facet.quota.evaluate - Quota enforcement
facet.admission - Admission validation
```
### 8.3 Logs
Structured log fields:
- `facetId`: Facet identifier
- `imageRef`: Container image reference
- `imageDigest`: Image content digest
- `merkleRoot`: Facet Merkle root
- `driftScore`: Computed drift percentage
- `quotaVerdict`: Pass/fail status
---
## 9. Security Considerations
1. **Signature Verification**: All seals must be DSSE-signed with keys managed by the Authority service
2. **Tenant Isolation**: Seals are scoped to tenants; cross-tenant access is prohibited
3. **Immutability**: Once created, seals cannot be modified; only superseded by new seals
4. **Audit Trail**: All seal operations are logged with correlation IDs
5. **Key Rotation**: Signing keys support rotation; old signatures remain valid with archived keys
---
## 10. References
- [DSSE Specification](https://github.com/secure-systems-lab/dsse)
- [RFC 8785 - JSON Canonicalization](https://tools.ietf.org/html/rfc8785)
- [Scanner Architecture](../scanner/architecture.md)
- [Attestor Architecture](../attestor/architecture.md)
- [Policy Engine Architecture](../policy/architecture.md)
- [Replay Architecture](../replay/architecture.md)
---
*Last updated: 2026-01-05*

@@ -165,7 +165,7 @@ jobs:
- [Tenant Isolation & Redaction](../tenant-isolation-redaction.md)
- [Findings Ledger Deployment](../deployment.md)
- [Offline Kit Operations](../../../24_OFFLINE_KIT.md)
- [Offline Kit Operations](../../../OFFLINE_KIT.md)
---

@@ -42,7 +42,7 @@ Key settings:
- Architecture: `./architecture.md`
- Router Module: `../router/`
- Authority Module: `../authority/`
- API Reference: `../../09_API_CLI_REFERENCE.md`
- API Reference: `../../API_CLI_REFERENCE.md`
## Current Status

@@ -38,7 +38,7 @@ Key features:
- AirGap Module: `../airgap/`
- ExportCenter: `../export-center/`
- Offline Kit: `../../24_OFFLINE_KIT.md`
- Offline Kit: `../../OFFLINE_KIT.md`
- Operations: `./operations/` (if exists)
## Current Status

@@ -1,7 +1,7 @@
# StellaOps Architecture Overview (Sprint19)
# StellaOps Architecture Overview
> **Ownership:** Architecture Guild • Docs Guild
> **Audience:** Service owners, platform engineers, solution architects
> **Ownership:** Architecture Guild • Docs Guild
> **Audience:** Service owners, platform engineers, solution architects
> **Related:** [High-Level Architecture](../../ARCHITECTURE_REFERENCE.md), [Concelier Architecture](../concelier/architecture.md), [Policy Engine Architecture](../policy/architecture.md), [Aggregation-Only Contract](../../aoc/aggregation-only-contract.md)
This dossier summarises the end-to-end runtime topology after the Aggregation-Only Contract (AOC) rollout. It highlights where raw facts live, how ingest services enforce guardrails, and how downstream components consume those facts to derive policy decisions and user-facing experiences.
@@ -27,7 +27,7 @@ This dossier summarises the end-to-end runtime topology after the Aggregation-On
> Evaluate public scanner incidents? The [Ecosystem Test Cases](../product-advisories/30-Nov-2025 - Ecosystem Test Cases for StellaOps.md) document five hardened regressions (Grype credential leak, Trivy offline schema, SBOM parity, Grype instability) that you can turn into acceptance tests today.
## 1 · System landscape
## 1 · System landscape
```mermaid
graph TD
@@ -94,7 +94,7 @@ Key boundaries:
---
## 2 · Aggregation-Only Contract focus
## 2 · Aggregation-Only Contract focus
### 2.1 Responsibilities at the boundary
@@ -146,7 +146,7 @@ sequenceDiagram
---
## 3 · Data & control flow highlights
## 3 · Data & control flow highlights
1. **Ingestion:** Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to `AOCWriteGuard`. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to PostgreSQL.
2. **Verification:** `stella aoc verify` (CLI/CI) and `/aoc/verify` endpoints replay guard checks against stored documents, mapping `ERR_AOC_00x` codes to exit codes for automation.
@@ -156,7 +156,7 @@ sequenceDiagram
---
## 4 · Offline & disaster readiness
## 4 · Offline & disaster readiness
- **Offline Kit:** Packages raw PostgreSQL snapshots (`advisory_raw`, `vex_raw`) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
- **Recovery:** Supersedes chains allow rollback to prior revisions without mutating rows. Disaster exercises must rehearse restoring from snapshot, replaying logical replication into Policy Engine, and re-validating guard compliance.
@@ -164,9 +164,9 @@ sequenceDiagram
---
## 5 · Replay CAS & deterministic bundles
## 5 · Replay CAS & deterministic bundles
- **Replay CAS:** Content-addressed storage lives under `cas://replay/<sha256-prefix>/<digest>.tar.zst`. Writers must use [StellaOps.Replay.Core](../../src/__Libraries/StellaOps.Replay.Core/AGENTS.md) helpers to ensure lexicographic file ordering, POSIX mode normalisation (0644/0755), LF newlines, zstd level19 compression, and shard-by-prefix CAS URIs (`BuildCasUri`). Bundle metadata (size, hash, created) feeds the platform-wide `replay_bundles` collection defined in `docs/db/replay-schema.md`.
- **Replay CAS:** Content-addressed storage lives under `cas://replay/<sha256-prefix>/<digest>.tar.zst`. Writers must use [StellaOps.Replay.Core](../../src/__Libraries/StellaOps.Replay.Core/AGENTS.md) helpers to ensure lexicographic file ordering, POSIX mode normalisation (0644/0755), LF newlines, zstd level 19 compression, and shard-by-prefix CAS URIs (`BuildCasUri`). Bundle metadata (size, hash, created) feeds the platform-wide `replay_bundles` collection defined in `docs/db/replay-schema.md`.
- **Artifacts:** Each recorded scan stores three bundles:
1. `manifest.json` (canonical JSON, hashed and signed via DSSE).
2. `inputbundle.tar.zst` (feeds, policies, tools, environment snapshot).
@@ -175,11 +175,11 @@ sequenceDiagram
- **Reachability subtree:** When reachability recording is enabled, Scanner uploads graphs & runtime traces under `cas://replay/<scan-id>/reachability/graphs/` and `cas://replay/<scan-id>/reachability/traces/`. Manifest references (StellaOps.Replay.Core) bind these URIs along with analyzer hashes so Replay + Signals can rehydrate explainability evidence deterministically.
- **Storage tiers:** Primary storage is PostgreSQL (`replay_runs`, `replay_subjects`) plus the CAS bucket. Evidence Locker mirrors bundles for long-term retention and legal hold workflows (`docs/modules/evidence-locker/architecture.md`). Offline kits package bundles under `offline/replay/<scan-id>` with detached DSSE envelopes for air-gapped verification.
- **APIs & ownership:** Scanner WebService produces the bundles via `record` mode, Scanner Worker emits Merkle metadata, Signer/Authority provide DSSE signatures, Attestor anchors manifests to Rekor, CLI/Evidence Locker handle retrieval, and Docs Guild maintains runbooks. Responsibilities are tracked in `docs/implplan/SPRINT_185_shared_replay_primitives.md` through `SPRINT_187_evidence_locker_cli_integration.md`.
- **Operational policies:** Retention defaults to 180days for hot CAS storage and 2years for cold Evidence Locker copies. Rotation and pruning follow the checklist in `docs/runbooks/replay_ops.md`.
- **Operational policies:** Retention defaults to 180 days for hot CAS storage and 2 years for cold Evidence Locker copies. Rotation and pruning follow the checklist in `docs/runbooks/replay_ops.md`.
---
## 6 · References
## 6 · References
- [Aggregation-Only Contract reference](../../aoc/aggregation-only-contract.md)
- [Concelier architecture](../concelier/architecture.md)
@@ -194,7 +194,7 @@ sequenceDiagram
---
## 7 · Compliance checklist
## 7 · Compliance checklist
- [ ] AOC guard enabled for all Concelier and Excititor write paths in production.
- [ ] PostgreSQL schema constraints deployed for `advisory_raw` and `vex_raw`; logical replication scoped per tenant.
@@ -208,4 +208,4 @@ sequenceDiagram
---
*Last updated: 2025-12-23 (Testing strategy links and catalog).*
*Last updated: 2026-01-05 (Removed dated sprint reference).*

@@ -617,6 +617,6 @@ CREATE INDEX idx_revocations_time ON provcache.prov_revocations(revoked_at);
- **[Provcache Architecture Guide](architecture.md)** - Detailed architecture, invalidation flows, and API reference
- [Policy Engine Architecture](../policy/README.md)
- [TrustLattice Engine](../policy/design/policy-deterministic-evaluator.md)
- [Offline Kit Documentation](../../24_OFFLINE_KIT.md)
- [Offline Kit Documentation](../../OFFLINE_KIT.md)
- [Air-Gap Controller](../airgap/README.md)
- [Authority Key Rotation](../authority/README.md)

@@ -0,0 +1,606 @@
# Replay Proof Schema
> **Ownership:** Replay Guild, Scanner Guild, Attestor Guild
> **Audience:** Service owners, platform engineers, auditors, compliance teams
> **Related:** [Platform Architecture](../platform/architecture-overview.md), [Replay Architecture](./architecture.md), [Facet Sealing](../facet/architecture.md), [DSSE Specification](https://github.com/secure-systems-lab/dsse)
This document defines the schema for Replay Proofs - compact, cryptographically verifiable artifacts that attest to deterministic policy evaluation outcomes.
---
## 1. Overview
A **Replay Proof** is a DSSE-signed artifact that proves a policy evaluation produced a specific verdict given a specific set of inputs. Replay proofs enable:
- **Audit trails**: Compact proof that a verdict was computed correctly
- **Determinism verification**: Re-running with same inputs produces identical output
- **Time-travel debugging**: Understand why a past decision was made
- **Compliance evidence**: Cryptographic proof for regulatory requirements
---
## 2. Replay Bundle Structure
A complete replay bundle consists of three artifacts stored in CAS:
```
cas://replay/<run-id>/
  manifest.json         # DSSE-signed manifest (this document's focus)
  inputbundle.tar.zst   # Compressed input artifacts
  outputbundle.tar.zst  # Compressed output artifacts
```
### 2.1 Directory Layout
```
<run-id>/
  manifest.json
  inputbundle.tar.zst
    feeds/
      nvd/<date>.json
      osv/<date>.json
      ghsa/<date>.json
    policy/
      bundle.tar
      version.json
    sboms/
      <sbom-id>.spdx.json
      <sbom-id>.cdx.json
    vex/
      <vex-id>.openvex.json
    config/
      lattice.json
      feature-flags.json
    seeds/
      random-seeds.json
      clock-offsets.json
  outputbundle.tar.zst
    verdicts/
      <verdict-id>.json
    findings/
      <finding-id>.json
    merkle/
      verdict-tree.json
      finding-tree.json
    logs/
      replay.log
      trace.json
```
---
## 3. Core Schema Definitions
### 3.1 ReplayProof
The primary proof artifact - a compact summary suitable for verification:
```csharp
public sealed record ReplayProof
{
// Identity
public required Guid ProofId { get; init; }
public required Guid RunId { get; init; }
public required string Subject { get; init; } // Image digest or SBOM ID
// Input digest
public required KnowledgeSnapshotDigest InputDigest { get; init; }
// Output digest
public required VerdictDigest OutputDigest { get; init; }
// Execution metadata
public required ExecutionMetadata Execution { get; init; }
// CAS references
public required BundleReferences Bundles { get; init; }
// Signature
public required DateTimeOffset SignedAt { get; init; }
public required string SignedBy { get; init; }
}
```
### 3.2 KnowledgeSnapshotDigest
Cryptographic digest of all inputs:
```csharp
public sealed record KnowledgeSnapshotDigest
{
// Component digests
public required string SbomsDigest { get; init; } // SHA-256 of sorted SBOM hashes
public required string VexDigest { get; init; } // SHA-256 of sorted VEX hashes
public required string FeedsDigest { get; init; } // SHA-256 of feed version manifest
public required string PolicyDigest { get; init; } // SHA-256 of policy bundle
public required string LatticeDigest { get; init; } // SHA-256 of lattice config
public required string SeedsDigest { get; init; } // SHA-256 of random seeds
// Combined root
public required string RootDigest { get; init; } // SHA-256 of all component digests
// Counts for quick comparison
public required int SbomCount { get; init; }
public required int VexCount { get; init; }
public required int FeedCount { get; init; }
}
```
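The component digests and the combined root can be sketched as below. This is an illustration only: the newline joiner and the fixed order in which component digests feed the root are assumptions, because the record comments state what is hashed but not the exact byte layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

string Sha256Hex(string text) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text))).ToLowerInvariant();

// "SHA-256 of sorted hashes": sort ordinally so input order does not matter.
// The "\n" joiner is an assumption.
string DigestOfSorted(IEnumerable<string> hashes) =>
    Sha256Hex(string.Join("\n", hashes.OrderBy(h => h, StringComparer.Ordinal)));

var sbomsDigest = DigestOfSorted(new[] { "sha256:bbb", "sha256:aaa" });
var vexDigest = DigestOfSorted(new[] { "sha256:ccc" });

// Combined root over component digests in a fixed, documented order (assumed here: SBOMs, then VEX).
var rootDigest = Sha256Hex(string.Join("\n", sbomsDigest, vexDigest));
Console.WriteLine(rootDigest);
```

Comparing `RootDigest` values is then enough to tell whether two runs saw the same inputs; the per-component digests narrow down which input class changed.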
### 3.3 VerdictDigest
Cryptographic digest of all outputs:
```csharp
public sealed record VerdictDigest
{
public required string VerdictMerkleRoot { get; init; } // Merkle root of verdicts
public required string FindingMerkleRoot { get; init; } // Merkle root of findings
public required int VerdictCount { get; init; }
public required int FindingCount { get; init; }
public required VerdictSummary Summary { get; init; }
}
public sealed record VerdictSummary
{
public required int Critical { get; init; }
public required int High { get; init; }
public required int Medium { get; init; }
public required int Low { get; init; }
public required int Informational { get; init; }
public required int Suppressed { get; init; }
public required int Total { get; init; }
}
```
### 3.4 ExecutionMetadata
Execution environment and timing:
```csharp
public sealed record ExecutionMetadata
{
    // Timing
    public required DateTimeOffset StartedAt { get; init; }
    public required DateTimeOffset CompletedAt { get; init; }
    public required long DurationMs { get; init; }

    // Engine version
    public required EngineVersion Engine { get; init; }

    // Environment
    public required string HostId { get; init; }
    public required string RuntimeVersion { get; init; }  // e.g., ".NET 10.0.0"
    public required string Platform { get; init; }        // e.g., "linux-x64"

    // Determinism markers
    public required bool DeterministicMode { get; init; }
    public required string ClockMode { get; init; }       // "frozen", "simulated", "real"
    public required string RandomMode { get; init; }      // "seeded", "recorded", "real"
}

public sealed record EngineVersion
{
    public required string Name { get; init; }          // e.g., "PolicyEngine"
    public required string Version { get; init; }       // e.g., "2.1.0"
    public required string SourceDigest { get; init; }  // SHA-256 of engine source/binary
}
```
### 3.5 BundleReferences
CAS URIs to full bundles:
```csharp
public sealed record BundleReferences
{
    public required string ManifestUri { get; init; }         // cas://replay/<run-id>/manifest.json
    public required string InputBundleUri { get; init; }      // cas://replay/<run-id>/inputbundle.tar.zst
    public required string OutputBundleUri { get; init; }     // cas://replay/<run-id>/outputbundle.tar.zst
    public required string ManifestDigest { get; init; }      // SHA-256 of manifest.json
    public required string InputBundleDigest { get; init; }   // SHA-256 of inputbundle.tar.zst
    public required string OutputBundleDigest { get; init; }  // SHA-256 of outputbundle.tar.zst
    public required long InputBundleSize { get; init; }
    public required long OutputBundleSize { get; init; }
}
```
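The URI layout in the comments above (`cas://replay/<run-id>/<file>`) can be validated before a download is attempted. This is a minimal sketch that assumes exactly that two-segment layout; `CasUriSketch` and `TryParse` are illustrative names, not an existing StellaOps API.

```csharp
using System;

public static class CasUriSketch
{
    private const string Prefix = "cas://replay/";

    // Splits "cas://replay/<run-id>/<file>" into run ID and file name.
    // Returns false for any other shape (assumed layout, see lead-in).
    public static bool TryParse(string uri, out string runId, out string fileName)
    {
        runId = string.Empty;
        fileName = string.Empty;

        if (!uri.StartsWith(Prefix, StringComparison.Ordinal))
            return false;

        var parts = uri.Substring(Prefix.Length).Split('/');
        if (parts.Length != 2 || parts[0].Length == 0 || parts[1].Length == 0)
            return false;

        runId = parts[0];
        fileName = parts[1];
        return true;
    }
}
```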
---
## 4. DSSE Envelope
Replay proofs are wrapped in DSSE envelopes for cryptographic binding:
```json
{
  "payloadType": "application/vnd.stellaops.replay-proof.v1+json",
  "payload": "<base64url-encoded canonical JSON>",
  "signatures": [
    {
      "keyid": "sha256:abc123...",
      "sig": "<base64url-encoded signature>"
    }
  ]
}
```
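In DSSE the signature does not cover the raw payload. It covers the Pre-Authentication Encoding, `PAE(payloadType, payload)`, defined by the DSSE specification as `"DSSEv1" SP LEN(type) SP type SP LEN(body) SP body`, with lengths as ASCII decimal byte counts. The sketch below illustrates that encoding; the class name is hypothetical and this is not the project's verifier.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DssePaeSketch
{
    // Builds the byte string that a DSSE signer signs and a verifier checks.
    public static byte[] Encode(string payloadType, byte[] payload)
    {
        var typeLength = Encoding.UTF8.GetByteCount(payloadType);
        var header = Encoding.UTF8.GetBytes(
            "DSSEv1 "
            + typeLength.ToString(CultureInfo.InvariantCulture) + " "
            + payloadType + " "
            + payload.Length.ToString(CultureInfo.InvariantCulture) + " ");

        var result = new byte[header.Length + payload.Length];
        Buffer.BlockCopy(header, 0, result, 0, header.Length);
        Buffer.BlockCopy(payload, 0, result, header.Length, payload.Length);
        return result;
    }
}
```

Because the payload type is part of the signed bytes, a signature made for one `payloadType` cannot be replayed under another.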
### 4.1 Payload Type URI
- **v1**: `application/vnd.stellaops.replay-proof.v1+json`
- **in-toto compatible**: `https://stellaops.io/ReplayProof/v1`
### 4.2 Canonical JSON Encoding
Payloads MUST be encoded using RFC 8785 canonical JSON:
1. Keys sorted lexicographically by UTF-16 code units (the ordering RFC 8785 specifies)
2. No whitespace between structural characters
3. No trailing commas
4. Numbers without unnecessary decimal points or exponents
5. Strings with minimal escaping (only required characters)
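
The rules above can be sketched with `System.Text.Json`. This is an illustrative approximation, not a conformant RFC 8785 implementation: it handles integers only (RFC 8785 number formatting follows ECMAScript rules, which are out of scope here), and string escaping is delegated to the relaxed encoder, which has not been checked against every RFC 8785 case. `CanonicalJsonSketch` is a hypothetical name.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.Encodings.Web;
using System.Text.Json;

public static class CanonicalJsonSketch
{
    private static readonly JsonSerializerOptions StringOptions =
        new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };

    public static string Canonicalize(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var sb = new StringBuilder();
        Write(doc.RootElement, sb);
        return sb.ToString();
    }

    private static void Write(JsonElement element, StringBuilder sb)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                sb.Append('{');
                var firstProp = true;
                // Ordinal comparison on .NET strings compares UTF-16 code units.
                foreach (var prop in element.EnumerateObject().OrderBy(x => x.Name, StringComparer.Ordinal))
                {
                    if (!firstProp) sb.Append(',');
                    firstProp = false;
                    sb.Append(JsonSerializer.Serialize(prop.Name, StringOptions)).Append(':');
                    Write(prop.Value, sb);
                }
                sb.Append('}');
                break;
            case JsonValueKind.Array:
                sb.Append('[');
                var firstItem = true;
                foreach (var item in element.EnumerateArray())
                {
                    if (!firstItem) sb.Append(',');
                    firstItem = false;
                    Write(item, sb);
                }
                sb.Append(']');
                break;
            case JsonValueKind.String:
                sb.Append(JsonSerializer.Serialize(element.GetString(), StringOptions));
                break;
            case JsonValueKind.Number:
                // Sketch limitation: throws for non-integer numbers.
                sb.Append(element.GetInt64().ToString(CultureInfo.InvariantCulture));
                break;
            case JsonValueKind.True: sb.Append("true"); break;
            case JsonValueKind.False: sb.Append("false"); break;
            default: sb.Append("null"); break;
        }
    }
}
```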
---
## 5. Full Manifest Schema
The `manifest.json` file contains the complete proof plus additional metadata:
```json
{
  "_type": "https://stellaops.io/ReplayManifest/v1",
  "proofId": "550e8400-e29b-41d4-a716-446655440000",
  "runId": "660e8400-e29b-41d4-a716-446655440001",
  "subject": "sha256:abc123def456...",
  "tenant": "acme-corp",
  "inputDigest": {
    "sbomsDigest": "sha256:111...",
    "vexDigest": "sha256:222...",
    "feedsDigest": "sha256:333...",
    "policyDigest": "sha256:444...",
    "latticeDigest": "sha256:555...",
    "seedsDigest": "sha256:666...",
    "rootDigest": "sha256:aaa...",
    "sbomCount": 1,
    "vexCount": 5,
    "feedCount": 3
  },
  "outputDigest": {
    "verdictMerkleRoot": "sha256:bbb...",
    "findingMerkleRoot": "sha256:ccc...",
    "verdictCount": 42,
    "findingCount": 156,
    "summary": {
      "critical": 2,
      "high": 8,
      "medium": 25,
      "low": 12,
      "informational": 3,
      "suppressed": 106,
      "total": 156
    }
  },
  "execution": {
    "startedAt": "2026-01-05T10:00:00.000Z",
    "completedAt": "2026-01-05T10:00:05.123Z",
    "durationMs": 5123,
    "engine": {
      "name": "PolicyEngine",
      "version": "2.1.0",
      "sourceDigest": "sha256:engine123..."
    },
    "hostId": "scanner-worker-01",
    "runtimeVersion": ".NET 10.0.0",
    "platform": "linux-x64",
    "deterministicMode": true,
    "clockMode": "frozen",
    "randomMode": "seeded"
  },
  "bundles": {
    "manifestUri": "cas://replay/660e8400.../manifest.json",
    "inputBundleUri": "cas://replay/660e8400.../inputbundle.tar.zst",
    "outputBundleUri": "cas://replay/660e8400.../outputbundle.tar.zst",
    "manifestDigest": "sha256:manifest...",
    "inputBundleDigest": "sha256:input...",
    "outputBundleDigest": "sha256:output...",
    "inputBundleSize": 10485760,
    "outputBundleSize": 2097152
  },
  "signedAt": "2026-01-05T10:00:06.000Z",
  "signedBy": "scanner-worker-01"
}
```
---
## 6. Verification Protocol
### 6.1 Quick Verification (Proof Only)
Verify the DSSE signature and check digest consistency:
```csharp
public async Task<VerificationResult> VerifyProofAsync(
    ReplayProof proof,
    DsseEnvelope envelope,
    CancellationToken ct)
{
    // 1. Verify DSSE signature
    var sigValid = await _dsseVerifier.VerifyAsync(envelope, ct);
    if (!sigValid)
        return VerificationResult.Failed("DSSE signature invalid");

    // 2. Verify input digest consistency
    var inputRoot = ComputeInputRoot(
        proof.InputDigest.SbomsDigest,
        proof.InputDigest.VexDigest,
        proof.InputDigest.FeedsDigest,
        proof.InputDigest.PolicyDigest,
        proof.InputDigest.LatticeDigest,
        proof.InputDigest.SeedsDigest);
    if (inputRoot != proof.InputDigest.RootDigest)
        return VerificationResult.Failed("Input root digest mismatch");

    return VerificationResult.Passed();
}
```
### 6.2 Full Verification (With Replay)
Download bundles and re-execute to verify determinism:
```csharp
public async Task<VerificationResult> VerifyWithReplayAsync(
    ReplayProof proof,
    DsseEnvelope envelope,
    CancellationToken ct)
{
    // 1. Quick verification first (DSSE signature + input root consistency)
    var quickResult = await VerifyProofAsync(proof, envelope, ct);
    if (!quickResult.Passed)
        return quickResult;

    // 2. Download bundles from CAS
    var inputBundle = await _cas.DownloadAsync(proof.Bundles.InputBundleUri, ct);
    var outputBundle = await _cas.DownloadAsync(proof.Bundles.OutputBundleUri, ct);

    // 3. Verify bundle digests
    if (ComputeDigest(inputBundle) != proof.Bundles.InputBundleDigest)
        return VerificationResult.Failed("Input bundle digest mismatch");
    if (ComputeDigest(outputBundle) != proof.Bundles.OutputBundleDigest)
        return VerificationResult.Failed("Output bundle digest mismatch");

    // 4. Extract and verify individual input digests
    var inputs = await ExtractInputsAsync(inputBundle, ct);
    var computedInputDigest = ComputeKnowledgeDigest(inputs);
    if (computedInputDigest.RootDigest != proof.InputDigest.RootDigest)
        return VerificationResult.Failed("Computed input digest mismatch");

    // 5. Re-execute policy evaluation
    var replayResult = await _replayEngine.ExecuteAsync(inputs, ct);

    // 6. Compare output digests
    var computedOutputDigest = ComputeVerdictDigest(replayResult);
    if (computedOutputDigest.VerdictMerkleRoot != proof.OutputDigest.VerdictMerkleRoot)
        return VerificationResult.Failed("Verdict Merkle root mismatch - non-deterministic!");
    if (computedOutputDigest.FindingMerkleRoot != proof.OutputDigest.FindingMerkleRoot)
        return VerificationResult.Failed("Finding Merkle root mismatch - non-deterministic!");

    return VerificationResult.Passed();
}
```
---
## 7. Digest Computation
### 7.1 Input Root Digest
```csharp
public string ComputeInputRoot(
    string sbomsDigest,
    string vexDigest,
    string feedsDigest,
    string policyDigest,
    string latticeDigest,
    string seedsDigest)
{
    // Concatenate in fixed order with separators
    var combined = string.Join("|",
        sbomsDigest,
        vexDigest,
        feedsDigest,
        policyDigest,
        latticeDigest,
        seedsDigest);

    return ComputeSha256(combined);
}
```
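`ComputeSha256` is not defined in this section. A minimal sketch follows, assuming the input is UTF-8 encoded and the result is lowercase hex with a `sha256:` prefix, matching the digest strings in the example manifest; both choices are assumptions rather than confirmed behavior, and `DigestSketch` is a hypothetical name.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class DigestSketch
{
    // Assumed output format: "sha256:" + lowercase hex of SHA-256 over UTF-8 bytes.
    public static string ComputeSha256(string input)
    {
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return "sha256:" + Convert.ToHexString(hash).ToLowerInvariant();
    }

    // Same fixed-order, '|'-separated concatenation as ComputeInputRoot above.
    public static string ComputeInputRoot(params string[] componentDigests) =>
        ComputeSha256(string.Join("|", componentDigests));
}
```

Because the components are joined in a fixed order, swapping any two digests changes the root, which is what lets the quick verification detect a mislabeled component.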
### 7.2 SBOM Collection Digest
```csharp
public string ComputeSbomsDigest(IEnumerable<SbomRef> sboms)
{
    // Sort by ID for determinism
    var sorted = sboms.OrderBy(s => s.SbomId, StringComparer.Ordinal);

    // Concatenate hashes
    var combined = string.Join("|", sorted.Select(s => s.ContentHash));

    return ComputeSha256(combined);
}
```
### 7.3 Verdict Merkle Root
```csharp
public string ComputeVerdictMerkleRoot(IEnumerable<Verdict> verdicts)
{
    // Sort by verdict ID for determinism
    var sorted = verdicts.OrderBy(v => v.VerdictId, StringComparer.Ordinal);

    // Compute leaf hashes
    var leaves = sorted.Select(v => ComputeVerdictLeafHash(v)).ToArray();

    // Build Merkle tree
    return MerkleTreeBuilder.ComputeRoot(leaves);
}

private string ComputeVerdictLeafHash(Verdict verdict)
{
    var canonical = CanonicalJsonSerializer.Serialize(verdict);
    return ComputeSha256(canonical);
}
```
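`MerkleTreeBuilder.ComputeRoot` is referenced but not specified here. The sketch below shows one possible construction: leaves are plain hex digests (no `sha256:` prefix), each parent is SHA-256 over the two child hashes concatenated, and an unpaired node is promoted unchanged to the next level. The odd-node rule, the absence of leaf/node domain separation, and the empty-input result are all assumptions; the real builder may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

public static class MerkleSketch
{
    public static string ComputeRoot(IReadOnlyList<string> leafHexHashes)
    {
        // Assumption: an empty set hashes to SHA-256 of the empty byte string.
        if (leafHexHashes.Count == 0)
            return Convert.ToHexString(SHA256.HashData(Array.Empty<byte>())).ToLowerInvariant();

        var level = leafHexHashes.Select(h => Convert.FromHexString(h)).ToList();
        while (level.Count > 1)
        {
            var next = new List<byte[]>((level.Count + 1) / 2);
            for (var i = 0; i < level.Count; i += 2)
            {
                if (i + 1 == level.Count)
                {
                    next.Add(level[i]); // unpaired node is promoted as-is (assumed rule)
                    continue;
                }

                var pair = new byte[level[i].Length + level[i + 1].Length];
                level[i].CopyTo(pair, 0);
                level[i + 1].CopyTo(pair, level[i].Length);
                next.Add(SHA256.HashData(pair));
            }
            level = next;
        }

        return Convert.ToHexString(level[0]).ToLowerInvariant();
    }
}
```

Whatever construction is used, the verifier and the producer must agree on it exactly, since any difference in pairing or padding yields a different root for the same verdicts.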
---
## 8. Database Schema
```sql
-- Replay proof storage
CREATE TABLE replay_proofs (
    proof_id            UUID PRIMARY KEY,
    run_id              UUID NOT NULL,
    tenant              TEXT NOT NULL,
    subject             TEXT NOT NULL,
    input_root_digest   TEXT NOT NULL,
    output_verdict_root TEXT NOT NULL,
    output_finding_root TEXT NOT NULL,
    execution_json      JSONB NOT NULL,
    bundles_json        JSONB NOT NULL,
    dsse_envelope       JSONB NOT NULL,
    signed_at           TIMESTAMPTZ NOT NULL,
    signed_by           TEXT NOT NULL,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    CONSTRAINT uq_replay_run UNIQUE (run_id)
);

CREATE INDEX ix_replay_proofs_tenant ON replay_proofs (tenant, created_at DESC);
CREATE INDEX ix_replay_proofs_subject ON replay_proofs (subject);
CREATE INDEX ix_replay_proofs_input ON replay_proofs (input_root_digest);

-- Replay verification log
CREATE TABLE replay_verifications (
    verification_id   UUID PRIMARY KEY,
    proof_id          UUID NOT NULL,
    tenant            TEXT NOT NULL,
    verification_type TEXT NOT NULL, -- 'quick', 'full'
    passed            BOOLEAN NOT NULL,
    failure_reason    TEXT,
    duration_ms       BIGINT NOT NULL,
    verified_at       TIMESTAMPTZ NOT NULL,
    verified_by       TEXT NOT NULL,
    CONSTRAINT fk_proof FOREIGN KEY (proof_id) REFERENCES replay_proofs(proof_id)
);

CREATE INDEX ix_replay_verifications_proof ON replay_verifications (proof_id);
```
---
## 9. CLI Integration
```bash
# Verify a replay proof (quick: DSSE signature and digest consistency, no replay)
stella verify --proof proof.json
# Verify with full replay
stella verify --proof proof.json --replay
# Verify from CAS URI
stella verify --bundle cas://replay/660e8400.../manifest.json
# Export proof for audit
stella replay export --run-id 660e8400-... --output proof.json
# List proofs for an image
stella replay list --subject sha256:abc123...
# Diff two replay results
stella replay diff --run-id-a 660e8400... --run-id-b 770e8400...
```
---
## 10. API Endpoints
```http
# Get proof by run ID
GET /api/v1/replay/{runId}/proof
Response: ReplayProof (JSON)
# Verify proof
POST /api/v1/replay/{runId}/verify
Request: { "type": "quick" | "full" }
Response: VerificationResult
# List proofs for subject
GET /api/v1/replay/proofs?subject={digest}&tenant={tenant}
Response: ReplayProofSummary[]
# Download bundle
GET /api/v1/replay/{runId}/bundles/{type}
Response: Binary stream (tar.zst)
# Compare two runs
GET /api/v1/replay/diff?runIdA={id}&runIdB={id}
Response: ReplayDiffResult
```
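Subject digests contain a `:` and should be percent-encoded when placed in the query string of the list endpoint. A small sketch of building that relative URL follows; `ReplayApiUrls` is a hypothetical helper, not part of any published client.

```csharp
using System;

public static class ReplayApiUrls
{
    // Builds the relative URL for GET /api/v1/replay/proofs with escaped query values.
    public static string ListProofs(string subjectDigest, string tenant) =>
        "/api/v1/replay/proofs?subject=" + Uri.EscapeDataString(subjectDigest)
        + "&tenant=" + Uri.EscapeDataString(tenant);
}
```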
---
## 11. Error Codes
| Code | Description |
|------|-------------|
| `REPLAY_001` | Proof not found |
| `REPLAY_002` | DSSE signature verification failed |
| `REPLAY_003` | Input digest mismatch |
| `REPLAY_004` | Output digest mismatch (non-deterministic) |
| `REPLAY_005` | Bundle not found in CAS |
| `REPLAY_006` | Bundle digest mismatch |
| `REPLAY_007` | Engine version mismatch |
| `REPLAY_008` | Replay execution failed |
| `REPLAY_009` | Insufficient permissions |
| `REPLAY_010` | Bundle format invalid |
---
## 12. Migration from v0
If upgrading from pre-v1 replay bundles:
1. **Schema migration**: Run `migrate-replay-schema.sql`
2. **Re-sign existing proofs**: Use `stella replay migrate --sign` to add DSSE envelopes
3. **Verify migration**: Run `stella replay verify --all` to check integrity
4. **Update consumers**: Point to new `/api/v1/replay` endpoints
---
## 13. Security Considerations
1. **Key Management**: Signing keys managed by Authority service with rotation support
2. **Tenant Isolation**: Proofs scoped to tenants; cross-tenant access prohibited
3. **Integrity**: All digests use SHA-256; Merkle proofs enable partial verification
4. **Immutability**: Proofs cannot be modified once signed
5. **Audit**: All verification attempts logged with correlation IDs
6. **Air-gap**: Proofs and bundles can be exported for offline verification
---
## 14. References
- [DSSE Specification](https://github.com/secure-systems-lab/dsse)
- [RFC 8785 - JSON Canonicalization](https://tools.ietf.org/html/rfc8785)
- [in-toto Attestation Framework](https://github.com/in-toto/attestation)
- [SLSA Provenance](https://slsa.dev/provenance)
- [Platform Architecture](../platform/architecture-overview.md)
- [Facet Sealing Architecture](../facet/architecture.md)
---
*Last updated: 2026-01-05*

View File

@@ -320,4 +320,4 @@ When schemas/adapters change:
- Sprint: `docs/implplan/SPRINT_0186_0001_0001_record_deterministic_execution.md` (SC10)
- Roadmap: `docs/modules/scanner/design/standards-convergence-roadmap.md` (SC1)
- Governance: `docs/modules/scanner/design/schema-governance.md` (SC9)
- Offline Operation: `docs/24_OFFLINE_KIT.md`
- Offline Operation: `docs/OFFLINE_KIT.md`

View File

@@ -277,4 +277,4 @@ Stripped binaries may lack Build-IDs. Options:
- [BinaryIndex Architecture](../../binaryindex/architecture.md)
- [Scanner Architecture](../architecture.md)
- [Proof Chain Specification](../../attestor/proof-chain-specification.md)
- [CLI Reference](../../../09_API_CLI_REFERENCE.md)
- [CLI Reference](../../../API_CLI_REFERENCE.md)

View File

@@ -411,4 +411,4 @@ var payload = await _payloadStore.GetAsync(artifact.Uri, ct);
- [Surface.FS Design](../design/surface-fs.md)
- [Surface.Env Design](../design/surface-env.md)
- [Surface.Validation Guide](./surface-validation-extensibility.md)
- [Offline Kit Documentation](../../../../24_OFFLINE_KIT.md)
- [Offline Kit Documentation](../../../../OFFLINE_KIT.md)

View File

@@ -23,7 +23,7 @@
| Rekor v2 (managed or self-hosted) | Transparency log providing UUIDs + inclusion proofs. | `docs/ops/rekor/README.md` (if self-hosted) |
| `StellaOps.Scanner` (WebService/Worker) | Requests attestations per scan, stores Rekor metadata next to SBOM artefacts. | `docs/modules/scanner/architecture.md` |
| Export Center | Packages DSSE payloads + proofs into Offline Kit bundles and mirrors license notices. | `docs/modules/export-center/architecture.md` |
| Policy Engine + CLI | Enforce “attested only” promotion, expose CLI verification verbs. | `docs/modules/policy/architecture.md`, `docs/09_API_CLI_REFERENCE.md` |
| Policy Engine + CLI | Enforce “attested only” promotion, expose CLI verification verbs. | `docs/modules/policy/architecture.md`, `docs/API_CLI_REFERENCE.md` |
---
@@ -210,4 +210,4 @@ stellaops-cli attest verify --envelope artifacts/scan123/attest/sbom.dsse.json \
- Scanner architecture (§Signer → Attestor → Rekor): `docs/modules/scanner/architecture.md`
- Export Center profiles: `docs/modules/export-center/architecture.md`
- Policy Engine predicates: `docs/modules/policy/architecture.md`
- CLI reference: `docs/09_API_CLI_REFERENCE.md`
- CLI reference: `docs/API_CLI_REFERENCE.md`

View File

@@ -371,5 +371,5 @@ The bundle was created without the `--sign` flag. Either:
- `docs/modules/policy/secret-leak-detection-readiness.md`
- `docs/benchmarks/scanner/deep-dives/secrets.md`
- `docs/modules/scanner/design/surface-secrets.md`
- `docs/07_HIGH_LEVEL_ARCHITECTURE.md` - Runtime inventory (Scanner)
- `docs/ARCHITECTURE_OVERVIEW.md` - Runtime inventory (Scanner)
- [Secrets Bundle Rotation](./secrets-bundle-rotation.md)

View File

@@ -39,7 +39,7 @@ Key features:
## Related Documentation
- API Reference: `../../09_API_CLI_REFERENCE.md`
- API Reference: `../../API_CLI_REFERENCE.md`
- OpenAPI Specs: `../../api/` (if exists)
- CLI: `../cli/`
- Gateway: `../gateway/`

View File

@@ -51,7 +51,7 @@ Key settings:
- Architecture: `./architecture.md`
- Policy Engine: `../policy/`
- VexLens: `../vex-lens/`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`
## Current Status

View File

@@ -44,7 +44,7 @@ Snapshot functionality is implemented across multiple modules:
- AirGap: `../airgap/`
- ExportCenter: `../export-center/`
- Replay: `../replay/` (if exists)
- Offline Kit: `../../24_OFFLINE_KIT.md`
- Offline Kit: `../../OFFLINE_KIT.md`
## Current Status

View File

@@ -32,7 +32,7 @@ The Console presents operator dashboards for scans, policies, VEX evidence, runt
- Auth smoke tests in `operations/auth-smoke.md`.
- Observability runbook + dashboard stub in `operations/observability.md` and `operations/dashboards/console-ui-observability.json` (offline import).
- Console architecture doc for layout and SSE fan-out.
- Operator guide: `../../15_UI_GUIDE.md`. Accessibility: `../../accessibility.md`. Security: `../../security/`.
- Operator guide: `../../UI_GUIDE.md`. Accessibility: `../../accessibility.md`. Security: `../../security/`.
## Related resources
- ./operations/auth-smoke.md

View File

@@ -4,7 +4,7 @@
> **Ownership:** Console Guild • Docs Guild
> **Delivery scope:** `StellaOps.Web` Angular workspace, Console Web Gateway routes (`/console/*`), Downloads manifest surfacing, SSE fan-out for Scheduler & telemetry.
> **Related docs:** [Console operator guide](../../15_UI_GUIDE.md), [Admin workflows](../../console/admin-tenants.md), [Air-gap workflows](../../console/airgap.md), [Console security posture](../../security/console-security.md), [Console observability](../../console/observability.md), [UI telemetry](../../observability/ui-telemetry.md), [Deployment guide](../../deploy/console.md)
> **Related docs:** [Console operator guide](../../UI_GUIDE.md), [Admin workflows](../../console/admin-tenants.md), [Air-gap workflows](../../console/airgap.md), [Console security posture](../../security/console-security.md), [Console observability](../../console/observability.md), [UI telemetry](../../observability/ui-telemetry.md), [Deployment guide](../../deploy/console.md)
This dossier describes the end-to-end architecture of the StellaOps Console as delivered in Sprint 23. It covers the Angular workspace layout, API/gateway integration points, live-update channels, performance budgets, offline workflows, and observability hooks needed to keep the console deterministic and air-gap friendly.

View File

@@ -414,6 +414,6 @@ Deep-dive into the cryptographic attestation chain, showing DSSE envelopes and R
## References
- `docs/db/SPECIFICATION.md` Section 5.6-5.8 Schema definitions
- `docs/24_OFFLINE_KIT.md` Section 2.2 Proof replay workflow
- `docs/OFFLINE_KIT.md` Section 2.2 Proof replay workflow
- `SPRINT_3500_0001_0001_deeper_moat_master.md` Feature requirements
- `docs/modules/ui/architecture.md` Console architecture