docs consolidation, big sln build fixes, new advisories and sprints/tasks
@@ -20,7 +20,7 @@ This directory contains architecture documentation for all StellaOps modules.
 | [Concelier](./concelier/) | `src/Concelier/` | Vulnerability advisory ingestion and merge engine |
 | [Excititor](./excititor/) | `src/Excititor/` | VEX document ingestion and export |
 | [VexLens](./vex-lens/) | `src/VexLens/` | VEX consensus computation across issuers |
-| [VexHub](./vexhub/) | `src/VexHub/` | VEX distribution and exchange hub |
+| [VexHub](./vex-hub/) | `src/VexHub/` | VEX distribution and exchange hub |
 | [IssuerDirectory](./issuer-directory/) | `src/IssuerDirectory/` | Issuer trust registry (CSAF publishers) |
 | [Feedser](./feedser/) | `src/Feedser/` | Evidence collection library for backport detection |
 | [Mirror](./mirror/) | `src/Mirror/` | Vulnerability feed mirror and distribution |
@@ -30,10 +30,10 @@ This directory contains architecture documentation for all StellaOps modules.
 | Module | Path | Description |
 |--------|------|-------------|
 | [Scanner](./scanner/) | `src/Scanner/` | Container scanning with SBOM generation |
-| [BinaryIndex](./binaryindex/) | `src/BinaryIndex/` | Binary identity extraction and fingerprinting |
+| [BinaryIndex](./binary-index/) | `src/BinaryIndex/` | Binary identity extraction and fingerprinting |
 | [AdvisoryAI](./advisory-ai/) | `src/AdvisoryAI/` | AI-assisted advisory analysis |
 | [Symbols](./symbols/) | `src/Symbols/` | Symbol resolution and debug information |
-| [ReachGraph](./reachgraph/) | `src/ReachGraph/` | Reachability graph service |
+| [ReachGraph](./reach-graph/) | `src/ReachGraph/` | Reachability graph service |

 ### Artifacts & Evidence

@@ -41,18 +41,18 @@ This directory contains architecture documentation for all StellaOps modules.
 |--------|------|-------------|
 | [Attestor](./attestor/) | `src/Attestor/` | in-toto/DSSE attestation generation |
 | [Signer](./signer/) | `src/Signer/` | Cryptographic signing operations |
-| [SbomService](./sbomservice/) | `src/SbomService/` | SBOM storage, versioning, and lineage ledger |
+| [SbomService](./sbom-service/) | `src/SbomService/` | SBOM storage, versioning, and lineage ledger |
 | [EvidenceLocker](./evidence-locker/) | `src/EvidenceLocker/` | Sealed evidence storage and export |
 | [ExportCenter](./export-center/) | `src/ExportCenter/` | Batch export and report generation |
 | [Provenance](./provenance/) | `src/Provenance/` | SLSA/DSSE attestation tooling |
-| [Provcache](./provcache/) | Library | Provenance cache utilities |
+| [Provcache](./prov-cache/) | Library | Provenance cache utilities |

 ### Policy & Risk

 | Module | Path | Description |
 |--------|------|-------------|
 | [Policy](./policy/) | `src/Policy/` | Policy engine with K4 lattice logic |
-| [RiskEngine](./riskengine/) | `src/RiskEngine/` | Risk scoring runtime |
+| [RiskEngine](./risk-engine/) | `src/RiskEngine/` | Risk scoring runtime |
 | [VulnExplorer](./vuln-explorer/) | `src/VulnExplorer/` | Vulnerability exploration and triage |
 | [Unknowns](./unknowns/) | `src/Unknowns/` | Unknown component tracking registry |

@@ -65,8 +65,8 @@ This directory contains architecture documentation for all StellaOps modules.
 | [TaskRunner](./taskrunner/) | `src/TaskRunner/` | Task pack execution engine |
 | [Notify](./notify/) | `src/Notify/` | Notification toolkit (Email, Slack, Teams, Webhooks) |
 | [Notifier](./notifier/) | `src/Notifier/` | Notifications Studio host |
-| [PacksRegistry](./packsregistry/) | `src/PacksRegistry/` | Task packs registry |
-| [TimelineIndexer](./timelineindexer/) | `src/TimelineIndexer/` | Timeline event indexing |
+| [PacksRegistry](./packs-registry/) | `src/PacksRegistry/` | Task packs registry |
+| [TimelineIndexer](./timeline-indexer/) | `src/TimelineIndexer/` | Timeline event indexing |
 | [Replay](./replay/) | `src/Replay/` | Deterministic replay engine |

 ### Integration
@@ -273,6 +273,6 @@ stella model benchmark llama3-8b-q4km --iterations 10
 ## Related Documentation

 - [Advisory AI Architecture](../architecture.md)
-- [Offline Kit Overview](../../../24_OFFLINE_KIT.md)
+- [Offline Kit Overview](../../../OFFLINE_KIT.md)
 - [AI Attestations](../../../implplan/SPRINT_20251226_018_AI_attestations.md)
 - [Replay Semantics](./replay-semantics.md)
@@ -42,7 +42,7 @@ Key settings:
 ## Related Documentation

 - Operations: `./operations/` (if exists)
-- Offline Kit: `../../24_OFFLINE_KIT.md`
+- Offline Kit: `../../OFFLINE_KIT.md`
 - Mirror: `../mirror/`
 - ExportCenter: `../export-center/`

@@ -344,5 +344,5 @@ AirGap:
 * Evidence reconciliation: `./evidence-reconciliation.md`
 * Exporter coordination: `./exporter-cli-coordination.md`
 * Mirror DSSE plan: `./mirror-dsse-plan.md`
-* Offline Kit: `../../24_OFFLINE_KIT.md`
+* Offline Kit: `../../OFFLINE_KIT.md`
 * Time anchor schema: `../../airgap/time-anchor-schema.md`
@@ -489,7 +489,7 @@ Content-Disposition: attachment; filename="verdict-{manifestId}.json"
 - [Trust Lattice Specification](../excititor/trust-lattice.md)
 - [Authority Architecture](./architecture.md)
 - [DSSE Signing](../../dev/dsse-signing.md)
-- [API Reference](../../09_API_CLI_REFERENCE.md)
+- [API Reference](../../API_CLI_REFERENCE.md)

 ---

docs/modules/binary-index/README.md (new file, 94 lines)
@@ -0,0 +1,94 @@
# BinaryIndex

**Status:** Implemented
**Source:** `src/BinaryIndex/`
**Owner:** Scanner Guild + Concelier Guild

## Purpose

BinaryIndex provides vulnerable binary detection independent of package metadata. It addresses the gap where package version strings can lie (backports, custom builds, stripped metadata) through binary-first vulnerability identification using Build-IDs, hash catalogs, and function fingerprints.

## Components

**Libraries:**
- `StellaOps.BinaryIndex.Core` - Core binary identity extraction and matching engine
- `StellaOps.BinaryIndex.Corpus` - Binary-to-advisory mapping database
- `StellaOps.BinaryIndex.Corpus.Debian` - Debian-specific corpus support
- `StellaOps.BinaryIndex.Fingerprints` - Function fingerprint storage and matching (CFG/basic-block hashes)
- `StellaOps.BinaryIndex.FixIndex` - Patch-aware backport handling
- `StellaOps.BinaryIndex.Persistence` - Storage adapters for binary catalogs

## Configuration

Configuration is typically embedded in Scanner and Concelier module settings.

Key features:
- Three-tier binary identification (package/version, Build-ID/hash, function fingerprints)
- Binary identity extraction (Build-ID, PE CodeView GUID, Mach-O UUID)
- Integration with Scanner.Worker for binary lookup
- Offline-first design with deterministic outputs
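
The three-tier identification listed above can be pictured as a cascade that tries each tier in the order given and reports which one produced the match. The Python sketch below is illustrative only: `BinaryIdentity`, `identify`, and the three catalog dictionaries are hypothetical names, not the BinaryIndex API, and the advisory IDs in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BinaryIdentity:
    """Hypothetical container for the signals extracted from one binary."""
    package: Optional[str] = None          # e.g. "openssl"
    version: Optional[str] = None          # package version string
    build_id: Optional[str] = None         # ELF Build-ID, PE CodeView GUID, or Mach-O UUID
    fingerprints: frozenset = frozenset()  # function fingerprint hashes


def identify(identity, by_package, by_build_id, by_fingerprint):
    """Return (tier, advisories) from the first tier that yields a match."""
    hit = by_package.get((identity.package, identity.version))
    if hit:
        return "package/version", list(hit)
    hit = by_build_id.get(identity.build_id)
    if hit:
        return "build-id/hash", list(hit)
    # Sort the merged advisory IDs so the output is deterministic.
    hits = sorted({adv for fp in identity.fingerprints
                   for adv in by_fingerprint.get(fp, ())})
    if hits:
        return "function-fingerprint", hits
    return "none", []
```

A stripped binary with no package metadata and an unknown Build-ID falls through to the fingerprint tier, for example `identify(BinaryIdentity(fingerprints=frozenset({"fp1"})), {}, {}, {"fp1": ["ADV-1"]})`.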

## Dependencies

- PostgreSQL (integrated with Scanner/Concelier schemas)
- Scanner.Analyzers.Native (for binary disassembly/analysis)
- Concelier (for advisory-to-binary mapping)

## Related Documentation

- Architecture: `./architecture.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`
- Scanner Architecture: `../scanner/architecture.md`
- Concelier Architecture: `../concelier/architecture.md`

## Current Status

Library implementation complete with support for ELF (Build-ID), PE (CodeView GUID), and Mach-O (UUID) binary formats. Integrated into Scanner's native binary analysis pipeline.

---

## Semantic Diffing Roadmap

A major enhancement to BinaryIndex is planned to enable **semantic-level binary diffing** - detecting function equivalence based on behavior rather than syntax. This addresses limitations in current byte/symbol-based matching when dealing with:

- Compiler optimizations (same source, different instructions)
- Stripped binaries (no symbols)
- Cross-compiler builds (GCC vs Clang)
- Obfuscated code

### Planned Phases

| Phase | Description | Impact | Status |
|-------|-------------|--------|--------|
| **Phase 1** | IR-Level Semantic Analysis | +15% accuracy on optimized binaries | Planned |
| **Phase 2** | Function Behavior Corpus | +10% coverage on stripped binaries | Planned |
| **Phase 3** | Ghidra Integration | +5% edge case handling | Planned |
| **Phase 4** | Decompiler & ML Similarity | +10% obfuscation resilience | Planned |

### New Libraries (Planned)

- `StellaOps.BinaryIndex.Semantic` - IR lifting and semantic graph fingerprints
- `StellaOps.BinaryIndex.Corpus` - 30K+ function behavior database
- `StellaOps.BinaryIndex.Ghidra` - Ghidra Headless integration
- `StellaOps.BinaryIndex.Decompiler` - Decompiled code AST comparison
- `StellaOps.BinaryIndex.ML` - CodeBERT-based function embeddings
- `StellaOps.BinaryIndex.Ensemble` - Multi-signal decision fusion

### Expected Outcomes

| Metric | Current | Target |
|--------|---------|--------|
| Patch detection accuracy | ~70% | 92%+ |
| Function identification (stripped) | ~50% | 85%+ |
| False positive rate | ~5% | <2% |

### Sprint Files

- `docs/implplan/SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
- `docs/implplan/SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`
- `docs/implplan/SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
- `docs/implplan/SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`

### Architecture Documentation

See `./semantic-diffing.md` for comprehensive architecture documentation.
@@ -3,7 +3,7 @@
 > **Ownership:** Scanner Guild + Concelier Guild
 > **Status:** DRAFT
 > **Version:** 1.0.0
-> **Related:** [High-Level Architecture](../../07_HIGH_LEVEL_ARCHITECTURE.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md)
+> **Related:** [High-Level Architecture](../../ARCHITECTURE_OVERVIEW.md), [Scanner Architecture](../scanner/architecture.md), [Concelier Architecture](../concelier/architecture.md)

 ---

docs/modules/binary-index/semantic-diffing.md (new file, 564 lines)
@@ -0,0 +1,564 @@
# Semantic Diffing Architecture

> **Status:** PLANNED
> **Version:** 1.0.0
> **Related Sprints:**
> - `SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`
> - `SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`
> - `SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`
> - `SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`

---

## 1. Executive Summary

Semantic diffing is an advanced binary analysis capability that detects function equivalence based on **behavior** rather than **syntax**. This enables accurate vulnerability detection in scenarios where traditional byte-level or symbol-based matching fails:

- **Compiler optimizations** - Same source, different instructions
- **Obfuscation** - Intentionally altered code structure
- **Stripped binaries** - No symbols or debug information
- **Cross-compiler** - GCC vs Clang produce different output
- **Backported patches** - Different version, same fix

### Expected Impact

| Capability | Current Accuracy | With Semantic Diffing |
|------------|-----------------|----------------------|
| Patch detection (optimized) | ~70% | 92%+ |
| Function identification (stripped) | ~50% | 85%+ |
| Obfuscation resilience | ~40% | 75%+ |
| False positive rate | ~5% | <2% |

---

## 2. Architecture Overview

```
Semantic Diffing Architecture

Analysis Layer
  B2R2 (Primary)    Ghidra (Fallback)   Decompiler (Optional)   ML (Optional)
  - Disasm          - P-Code            - C output              - CodeBERT
  - LowUIR          - BSim              - AST parse             - GraphSage
  - CFG             - Ver.Track         - Normalize             - Embedding
                              |
                              v
Fingerprint Layer
  Instruction Fingerprint   Semantic Fingerprint      Decompiled Fingerprint
  - BasicBlock hash         - KSG graph hash          - AST hash
  - CFG edge hash           - WL hash                 - Normalized code
  - String refs             - DataFlow hash           - API sequence
  - Rolling chunks          - API calls               - Pattern hash

  BSim Signature            ML Embedding Vector
  - Feature vector          - 768-dim float[]
  - Significance            - Cosine sim
                              |
                              v
Matching Layer
  Ensemble Decision Engine
    Signal Weights:
    - Instruction fingerprint: 15%
    - Semantic graph:          25%
    - Decompiled AST:          35%
    - ML embedding:            25%
    Output: Confidence-weighted similarity score
                              |
                              v
Storage Layer
  PostgreSQL                RustFS                    Valkey
  - corpus.* tables         - Fingerprint blobs       - Query cache
  - binaries.* tables       - Model artifacts         - Embedding index
  - BSim database           - Training data
```

---

## 3. Implementation Phases

### Phase 1: IR-Level Semantic Analysis (Foundation)

**Sprint:** `SPRINT_20260105_001_001_BINDEX_semdiff_ir_semantics.md`

Leverage B2R2's Intermediate Representation (IR) for semantic-level function comparison.

**Key Components:**
- `IrLiftingService` - Lift instructions to LowUIR
- `SemanticGraphExtractor` - Build Key-Semantics Graph (KSG)
- `WeisfeilerLehmanHasher` - Graph fingerprinting
- `SemanticMatcher` - Semantic similarity scoring

**Deliverables:**
- `StellaOps.BinaryIndex.Semantic` library
- 20 tasks, ~3 weeks

### Phase 2: Function Behavior Corpus (Scale)

**Sprint:** `SPRINT_20260105_001_002_BINDEX_semdiff_corpus.md`

Build a comprehensive database of known library functions.

**Key Components:**
- Library corpus connectors (glibc, OpenSSL, zlib, curl, SQLite)
- `CorpusIngestionService` - Batch fingerprint generation
- `FunctionClusteringService` - Group similar functions
- `CorpusQueryService` - Function identification

**Deliverables:**
- `StellaOps.BinaryIndex.Corpus` library
- PostgreSQL `corpus.*` schema
- ~30,000 indexed functions
- 22 tasks, ~4 weeks

### Phase 3: Ghidra Integration (Depth)

**Sprint:** `SPRINT_20260105_001_003_BINDEX_semdiff_ghidra.md`

Add Ghidra as a secondary backend for complex cases.

**Key Components:**
- `GhidraHeadlessManager` - Process lifecycle
- `VersionTrackingService` - Multi-correlator diffing
- `GhidriffBridge` - Python interop
- `BSimService` - Behavioral similarity

**Deliverables:**
- `StellaOps.BinaryIndex.Ghidra` library
- Docker image for Ghidra Headless
- 20 tasks, ~4 weeks

### Phase 4: Decompiler & ML (Excellence)

**Sprint:** `SPRINT_20260105_001_004_BINDEX_semdiff_decompiler_ml.md`

Add the highest-fidelity semantic analysis: decompiled-code comparison and ML embeddings.

**Key Components:**
- `IDecompilerService` - Ghidra decompilation
- `AstComparisonEngine` - Structural similarity
- `OnnxInferenceEngine` - ML embeddings
- `EnsembleDecisionEngine` - Multi-signal fusion

**Deliverables:**
- `StellaOps.BinaryIndex.Decompiler` library
- `StellaOps.BinaryIndex.ML` library
- Trained CodeBERT-Binary model
- 30 tasks, ~5 weeks

---

## 4. Fingerprint Types

### 4.1 Instruction Fingerprint (Existing)

**Algorithm:** BasicBlock hash + CFG edge hash + String refs hash

**Properties:**
- Fast to compute
- Sensitive to instruction changes
- Good for exact/near-exact matches

**Weight in ensemble:** 15%
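
As a rough illustration of how the three components named above might be combined, the Python sketch below hashes basic-block bytes, CFG edges, and string references separately. It is a minimal sketch under stated assumptions: `instruction_fingerprint`, its input shapes, and the use of SHA-256 are hypothetical choices for illustration, not the existing `StellaOps.BinaryIndex.Fingerprints` implementation.

```python
import hashlib


def _digest(parts):
    """Hash an ordered list of strings into one hex digest."""
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()


def instruction_fingerprint(blocks, edges, strings):
    """blocks: {block_id: bytes}; edges: [(src_id, dst_id)]; strings: iterable of str."""
    block_hash = {bid: hashlib.sha256(code).hexdigest() for bid, code in blocks.items()}
    # Sort so that the order in which blocks are enumerated does not matter.
    basic_block = _digest(sorted(block_hash.values()))
    # Describe each edge by the hashes of its endpoints, not by block ids.
    cfg_edge = _digest(sorted(block_hash[s] + ">" + block_hash[d] for s, d in edges))
    string_refs = _digest(sorted(set(strings)))
    return {"basic_block": basic_block, "cfg_edge": cfg_edge, "string_refs": string_refs}
```

Because the block hash is taken over raw instruction bytes, changing a single byte changes the `basic_block` component, which matches the "sensitive to instruction changes" property listed above.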

### 4.2 Semantic Fingerprint (Phase 1)

**Algorithm:** Key-Semantics Graph + Weisfeiler-Lehman hash

**Properties:**
- Captures data/control dependencies
- Resilient to register renaming
- Resilient to instruction reordering

**Weight in ensemble:** 25%
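
The Weisfeiler-Lehman step can be sketched as iterative label refinement over a directed graph. The Python below is a generic WL hash, not the planned `WeisfeilerLehmanHasher`; the function name, the three-round default, and the use of operation kinds (`load`, `add`, `store`) as node labels are assumptions made for illustration.

```python
import hashlib


def _h(text):
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def wl_hash(labels, edges, rounds=3):
    """labels: {node: str}; edges: iterable of directed (src, dst) pairs."""
    succ = {n: [] for n in labels}
    pred = {n: [] for n in labels}
    for s, d in edges:
        succ[s].append(d)
        pred[d].append(s)
    current = {n: _h(lab) for n, lab in labels.items()}
    for _ in range(rounds):
        # Each node absorbs the sorted labels of its successors and predecessors.
        current = {
            n: _h(current[n]
                  + "|out:" + ",".join(sorted(current[m] for m in succ[n]))
                  + "|in:" + ",".join(sorted(current[m] for m in pred[n])))
            for n in labels
        }
    # The graph hash covers the sorted multiset of final node labels.
    return _h(",".join(sorted(current.values())))
```

Node identifiers never enter the hash, so renaming nodes (or registers, if labels carry only operation kinds) leaves the result unchanged. Note that WL hashing is a heuristic: equal hashes do not prove the graphs are isomorphic.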

### 4.3 Decompiled Fingerprint (Phase 4)

**Algorithm:** Normalized AST hash + Pattern detection

**Properties:**
- Highest semantic fidelity
- Captures algorithmic structure
- Resilient to most optimizations

**Weight in ensemble:** 35%
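
The idea behind normalization can be shown with a simplified stand-in. The Python sketch below works on tokens rather than a real AST: it renames identifiers by order of first appearance and ignores whitespace before hashing. The function name, keyword list, and tokenizer are assumptions for illustration; the planned `AstComparisonEngine` is expected to operate on parsed decompiler output instead.

```python
import hashlib
import re

# Small, incomplete keyword set; enough for the illustration.
C_KEYWORDS = {"if", "else", "for", "while", "return",
              "int", "char", "void", "long", "unsigned"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def normalized_hash(code):
    """Hash C-like source after renaming identifiers positionally."""
    names = {}
    out = []
    for tok in TOKEN.findall(code):
        if tok in C_KEYWORDS:
            out.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            # First identifier seen becomes id0, the next new one id1, and so on.
            out.append(names.setdefault(tok, f"id{len(names)}"))
        else:
            out.append(tok)  # literals, operators, punctuation
    return hashlib.sha256(" ".join(out).encode()).hexdigest()
```

Two decompilations that differ only in variable names and layout hash identically, while a changed operator produces a different hash.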

### 4.4 ML Embedding (Phase 4)

**Algorithm:** CodeBERT-Binary transformer, 768-dim vectors

**Properties:**
- Learned similarity metric
- Captures latent patterns
- Resilient to obfuscation

**Weight in ensemble:** 25%
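
The architecture diagram lists cosine similarity as the comparison for embedding vectors. A minimal Python sketch of that metric follows; the function name and the zero-vector convention (return 0.0) are assumptions, and a production path would more likely use a vectorized library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

The metric depends only on direction, so scaling one embedding by a positive constant does not change the score.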
|
||||
|
||||
---
|
||||
## 5. Matching Pipeline

```mermaid
sequenceDiagram
    participant Client
    participant DiffEngine as PatchDiffEngine
    participant B2R2
    participant Ghidra
    participant Corpus
    participant Ensemble

    Client->>DiffEngine: Compare(oldBinary, newBinary)

    par Parallel Analysis
        DiffEngine->>B2R2: Disassemble + IR lift
        DiffEngine->>Ghidra: Decompile (if needed)
    end

    B2R2-->>DiffEngine: SemanticFingerprints[]
    Ghidra-->>DiffEngine: DecompiledFunctions[]

    DiffEngine->>Corpus: IdentifyFunctions(fingerprints)
    Corpus-->>DiffEngine: FunctionMatches[]

    DiffEngine->>Ensemble: ComputeSimilarity(old, new)
    Ensemble-->>DiffEngine: EnsembleResult

    DiffEngine-->>Client: PatchDiffResult
```

---

## 6. Fallback Strategy

The system uses a tiered fallback strategy:

```
Tier 1: B2R2 IR + Semantic Graph (fast, ~90% coverage)
    │
    │ If confidence < threshold OR architecture unsupported
    v
Tier 2: Ghidra Version Tracking (slower, ~95% coverage)
    │
    │ If function is high-value (CVE-relevant)
    v
Tier 3: Decompiled AST + ML Embedding (slowest, ~99% coverage)
```

**Selection Criteria:**

| Condition | Backend | Reason |
|-----------|---------|--------|
| Standard x64/ARM64 binary | B2R2 only | Fast, accurate |
| Low B2R2 confidence (<0.7) | B2R2 + Ghidra | Validation |
| Exotic architecture | Ghidra only | Better coverage |
| CVE-affected function | Full pipeline | Maximum accuracy |
| Obfuscated binary | ML embedding | Obfuscation resilience |
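
The selection criteria can be read as a small decision function. The Python sketch below is one possible encoding, not the planned implementation: the function name, the backend identifiers, the precedence between rows (CVE-affected first, then architecture, then confidence), and the treatment of every non-primary architecture as "exotic" are all assumptions, since the table does not say how overlapping conditions combine.

```python
def select_backends(arch, b2r2_confidence=1.0, cve_affected=False, obfuscated=False,
                    min_b2r2_confidence=0.7, primary_arches=("x86_64", "arm64")):
    """Return the list of analysis backends to run for one function."""
    if cve_affected:
        # Full pipeline for CVE-affected functions.
        return ["b2r2", "ghidra", "decompiler", "ml"]
    if arch not in primary_arches:
        backends = ["ghidra"]            # exotic architecture
    elif b2r2_confidence < min_b2r2_confidence:
        backends = ["b2r2", "ghidra"]    # low confidence: validate with Ghidra
    else:
        backends = ["b2r2"]              # standard case
    if obfuscated:
        backends.append("ml")            # add ML embedding for obfuscated code
    return backends
```

For example, `select_backends("arm64", b2r2_confidence=0.55)` falls below the 0.7 threshold from the table and returns both B2R2 and Ghidra.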
|
||||
|
||||
---
|
||||
## 7. Corpus Coverage

### Priority Libraries

| Library | Priority | Functions | CVEs |
|---------|----------|-----------|------|
| glibc | Critical | ~15,000 | 50+ |
| OpenSSL | Critical | ~8,000 | 100+ |
| zlib | High | ~200 | 5+ |
| libcurl | High | ~2,000 | 80+ |
| SQLite | High | ~1,500 | 30+ |
| libxml2 | Medium | ~1,200 | 40+ |
| libpng | Medium | ~300 | 10+ |
| expat | Medium | ~150 | 15+ |
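
As a quick consistency check, the approximate function counts in this table can be summed and compared with the "~30,000 indexed functions" deliverable of Phase 2. The Python below only restates the table's figures; the dictionary and variable names are illustrative.

```python
# Approximate function counts copied from the Priority Libraries table.
functions = {
    "glibc": 15_000, "OpenSSL": 8_000, "zlib": 200, "libcurl": 2_000,
    "SQLite": 1_500, "libxml2": 1_200, "libpng": 300, "expat": 150,
}

all_libraries = sum(functions.values())
# The five libraries named as Phase 2 corpus connectors.
phase2_connectors = sum(functions[k] for k in ("glibc", "OpenSSL", "zlib", "libcurl", "SQLite"))

print(all_libraries)      # 28350
print(phase2_connectors)  # 26700
```

Both totals are in the same range as the Phase 2 target, with the eight-library sum closest to the stated ~30,000.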

### Architecture Coverage

| Architecture | B2R2 | Ghidra | Status |
|--------------|------|--------|--------|
| x86_64 | Excellent | Excellent | Primary |
| ARM64 | Excellent | Excellent | Primary |
| ARM32 | Good | Excellent | Secondary |
| MIPS32 | Fair | Excellent | Fallback |
| MIPS64 | Fair | Excellent | Fallback |
| RISC-V | Good | Good | Emerging |
| PPC32/64 | Fair | Excellent | Fallback |

---

## 8. Performance Characteristics

### Latency Budget

| Operation | Target | Notes |
|-----------|--------|-------|
| B2R2 disassembly | <100ms | Per function |
| IR lifting | <50ms | Per function |
| Semantic fingerprint | <50ms | Per function |
| Ghidra analysis | <30s | Per binary (startup) |
| Decompilation | <500ms | Per function |
| ML inference | <100ms | Per function |
| Ensemble decision | <10ms | Per comparison |
| **Total (Tier 1)** | **<200ms** | Per function |
| **Total (Full)** | **<1s** | Per function |
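
The two totals can be checked against the per-operation targets. The grouping below is an assumption, since the table does not state which operations make up each total: Tier 1 is read as disassembly, IR lifting, and semantic fingerprinting, and the full pipeline adds decompilation, ML inference, and the ensemble decision. The per-binary Ghidra startup (<30 s) is left out because it is not a per-function cost.

```python
# Upper bounds in milliseconds, copied from the Latency Budget table.
budget_ms = {
    "b2r2_disassembly": 100,
    "ir_lifting": 50,
    "semantic_fingerprint": 50,
    "decompilation": 500,
    "ml_inference": 100,
    "ensemble_decision": 10,
}

tier1_ms = sum(budget_ms[k] for k in ("b2r2_disassembly", "ir_lifting", "semantic_fingerprint"))
full_ms = sum(budget_ms.values())

print(tier1_ms)  # 200
print(full_ms)   # 810
```

Under this reading the Tier 1 components exactly fill the 200 ms budget, and the full pipeline leaves roughly 190 ms of headroom below the 1 s target.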

### Memory Budget

| Component | Memory | Notes |
|-----------|--------|-------|
| B2R2 per binary | ~100MB | Scales with binary size |
| Ghidra per project | ~2GB | Persistent cache |
| ML model | ~500MB | ONNX loaded |
| Corpus query cache | ~100MB | LRU eviction |

---

## 9. Integration Points

### 9.1 Scanner Integration

```csharp
// Scanner.Worker uses semantic diffing for binary vulnerability detection
var result = await _binaryVulnerabilityService.LookupByFingerprintAsync(
    fingerprint,
    minSimilarity: 0.85m,
    useSemanticMatching: true,  // Enable semantic diffing
    ct);
```

### 9.2 PatchDiffEngine Enhancement

```csharp
// PatchDiffEngine now includes semantic comparison
var diff = await _patchDiffEngine.DiffAsync(
    vulnerableBinary,
    patchedBinary,
    new PatchDiffOptions
    {
        UseSemanticAnalysis = true,
        SemanticThreshold = 0.7m,
        IncludeDecompilation = true,
        IncludeMlEmbedding = true
    },
    ct);
```

### 9.3 DeltaSignature Enhancement

```csharp
// Delta signatures now include semantic fingerprints
var signature = await _deltaSignatureGenerator.GenerateSignaturesAsync(
    binaryStream,
    new DeltaSignatureRequest
    {
        Cve = "CVE-2024-1234",
        TargetSymbols = ["vulnerable_func"],
        IncludeSemanticFingerprint = true,
        IncludeDecompiledHash = true
    },
    ct);
```

---

## 10. Security Considerations

### 10.1 Sandbox Requirements

All binary analysis runs in sandboxed environments:
- Seccomp profile restricting syscalls
- Read-only root filesystem
- No network access during analysis
- Memory/CPU limits

### 10.2 Model Security

ML models are:
- Signed with DSSE attestations
- Verified before loading
- Not user-uploadable (pre-trained only)

### 10.3 Corpus Integrity

Corpus data is:
- Ingested from trusted sources only
- Signed at snapshot level
- Version-controlled with audit trail

---

## 11. Configuration

```yaml
# binaryindex.yaml - Semantic diffing configuration
binaryindex:
  semantic_diffing:
    enabled: true

    # Analysis backends
    backends:
      b2r2:
        enabled: true
        ir_lifting: true
        semantic_graph: true
      ghidra:
        enabled: true
        fallback_only: true
        min_b2r2_confidence: 0.7
        headless_timeout_ms: 30000
      decompiler:
        enabled: true
        high_value_only: true  # Only for CVE-affected functions
      ml:
        enabled: true
        model_path: /models/codebert_binary_v1.onnx
        embedding_dimension: 768

    # Ensemble weights
    ensemble:
      instruction_weight: 0.15
      semantic_weight: 0.25
      decompiled_weight: 0.35
      ml_weight: 0.25
      min_confidence: 0.6

    # Corpus
    corpus:
      auto_update: true
      update_interval_hours: 24
      libraries:
        - glibc
        - openssl
        - zlib
        - curl
        - sqlite

    # Performance
    performance:
      max_parallel_analyses: 4
      cache_ttl_seconds: 3600
      max_function_size_bytes: 1048576  # 1MB
```
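
To make the ensemble weights concrete, the Python sketch below combines per-signal similarities into one score using the four weights from the configuration above. It is a sketch under assumptions, not the planned `EnsembleDecisionEngine`: the function name, the signal keys, and in particular the renormalization over available signals (so that a Tier 1 run without decompiler or ML output still yields a score in [0, 1]) are illustrative choices the document does not specify.

```python
# Weights copied from the ensemble section of the configuration.
WEIGHTS = {"instruction": 0.15, "semantic": 0.25, "decompiled": 0.35, "ml": 0.25}


def ensemble_score(signals, weights=WEIGHTS):
    """signals: {name: similarity in [0, 1] or None}. Missing signals are skipped."""
    used = {k: w for k, w in weights.items() if signals.get(k) is not None}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no signals available")
    # Weighted mean over the signals that are present.
    return sum(signals[k] * w for k, w in used.items()) / total
```

With all four signals present, `ensemble_score({"instruction": 0.4, "semantic": 0.8, "decompiled": 0.9, "ml": 0.7})` evaluates to 0.06 + 0.20 + 0.315 + 0.175 = 0.75 (up to floating-point rounding).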
|
||||
|
||||
---
|
||||
## 12. Metrics & Observability

### Metrics

| Metric | Type | Labels |
|--------|------|--------|
| `semantic_diffing_analysis_total` | Counter | backend, result |
| `semantic_diffing_latency_ms` | Histogram | backend, tier |
| `semantic_diffing_accuracy` | Gauge | comparison_type |
| `corpus_functions_total` | Gauge | library |
| `ml_inference_latency_ms` | Histogram | model |
| `ensemble_signal_weight` | Gauge | signal_type |

### Traces

- `semantic_diffing.analyze` - Full analysis span
- `semantic_diffing.b2r2.lift` - IR lifting
- `semantic_diffing.ghidra.decompile` - Decompilation
- `semantic_diffing.ml.inference` - ML embedding
- `semantic_diffing.ensemble.decide` - Ensemble decision

---

## 13. Testing Strategy

### Unit Tests

| Test Suite | Coverage |
|------------|----------|
| `IrLiftingServiceTests` | IR lifting correctness |
| `SemanticGraphExtractorTests` | Graph construction |
| `WeisfeilerLehmanHasherTests` | Hash stability |
| `AstComparisonEngineTests` | AST similarity |
| `OnnxInferenceEngineTests` | ML inference |
| `EnsembleDecisionEngineTests` | Weight combination |

### Integration Tests

| Test Suite | Coverage |
|------------|----------|
| `EndToEndSemanticDiffTests` | Full pipeline |
| `OptimizationResilienceTests` | O0 vs O2 vs O3 |
| `CompilerVariantTests` | GCC vs Clang |
| `GhidraFallbackTests` | Fallback scenarios |

### Golden Corpus Tests

Pre-computed test cases with known results:
- 100 CVE patch pairs (vulnerable -> fixed)
- 50 optimization variant sets
- 25 compiler variant sets
- 25 obfuscation variant sets
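
One way such labeled cases could feed the accuracy and false-positive targets quoted earlier is a plain confusion-matrix tally. The document does not define how those metrics are computed, so the Python sketch below assumes the standard definitions; the function name and the case format are hypothetical.

```python
def evaluate(cases):
    """cases: iterable of (expected_match, predicted_match) booleans."""
    tp = fp = tn = fn = 0
    for expected, predicted in cases:
        if predicted and expected:
            tp += 1
        elif predicted and not expected:
            fp += 1
        elif expected:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of true non-matches that were wrongly reported as matches.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }
```

For a hypothetical run with 90 correct matches, 10 missed matches, 98 correct rejections, and 2 false alarms, this yields an accuracy of 0.94 and a false positive rate of 0.02.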
|
||||
|
||||
---
|
||||
## 14. Roadmap

| Phase | Status | ETA | Impact |
|-------|--------|-----|--------|
| Phase 1: IR Semantics | Planned | 2026-01-24 | +15% accuracy |
| Phase 2: Corpus | Planned | 2026-02-15 | +10% coverage |
| Phase 3: Ghidra | Planned | 2026-02-28 | +5% edge cases |
| Phase 4: Decompiler/ML | Planned | 2026-03-31 | +10% obfuscation |
| **Total** | | | **+35-40%** |

---

## 15. References

### Internal

- `docs/modules/binary-index/architecture.md`
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.DeltaSig/`
- `src/BinaryIndex/__Libraries/StellaOps.BinaryIndex.Fingerprints/`

### External

- [B2R2 Binary Analysis Framework](https://b2r2.org/)
- [Ghidra Patch Diffing Guide](https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing)
- [ghidriff Tool](https://github.com/clearbluejar/ghidriff)
- [SemDiff Paper (arXiv)](https://arxiv.org/abs/2308.01463)
- [SEI Semantic Equivalence Research](https://www.sei.cmu.edu/annual-reviews/2022-research-review/semantic-equivalence-checking-of-decompiled-binaries/)

---

*Document Version: 1.0.0*
*Last Updated: 2026-01-05*
@@ -1,46 +0,0 @@
# BinaryIndex

**Status:** Implemented
**Source:** `src/BinaryIndex/`
**Owner:** Scanner Guild + Concelier Guild

## Purpose

BinaryIndex provides vulnerable binary detection independent of package metadata. It addresses the gap where package version strings can lie (backports, custom builds, stripped metadata) through binary-first vulnerability identification using Build-IDs, hash catalogs, and function fingerprints.

## Components

**Libraries:**
- `StellaOps.BinaryIndex.Core` - Core binary identity extraction and matching engine
- `StellaOps.BinaryIndex.Corpus` - Binary-to-advisory mapping database
- `StellaOps.BinaryIndex.Corpus.Debian` - Debian-specific corpus support
- `StellaOps.BinaryIndex.Fingerprints` - Function fingerprint storage and matching (CFG/basic-block hashes)
- `StellaOps.BinaryIndex.FixIndex` - Patch-aware backport handling
- `StellaOps.BinaryIndex.Persistence` - Storage adapters for binary catalogs

## Configuration

Configuration is typically embedded in Scanner and Concelier module settings.

Key features:
- Three-tier binary identification (package/version, Build-ID/hash, function fingerprints)
- Binary identity extraction (Build-ID, PE CodeView GUID, Mach-O UUID)
- Integration with Scanner.Worker for binary lookup
- Offline-first design with deterministic outputs

## Dependencies

- PostgreSQL (integrated with Scanner/Concelier schemas)
- Scanner.Analyzers.Native (for binary disassembly/analysis)
- Concelier (for advisory-to-binary mapping)

## Related Documentation

- Architecture: `./architecture.md`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- Scanner Architecture: `../scanner/architecture.md`
- Concelier Architecture: `../concelier/architecture.md`

## Current Status

Library implementation complete with support for ELF (Build-ID), PE (CodeView GUID), and Mach-O (UUID) binary formats. Integrated into Scanner's native binary analysis pipeline.
@@ -418,7 +418,7 @@ Additional notes:

- [Aggregation-Only Contract reference](../../../aoc/aggregation-only-contract.md)
- [Architecture overview](../../platform/architecture-overview.md)
- [Console operator guide](../../../15_UI_GUIDE.md)
- [Console operator guide](../../../UI_GUIDE.md)
- [Authority scopes](../../authority/architecture.md)
- [Task Pack CLI profiles](./packs-profiles.md)

@@ -158,5 +158,5 @@ stella scan replay \
## See Also

- [Deterministic Replay Specification](../../replay/DETERMINISTIC_REPLAY.md)
- [Offline Kit Documentation](../../24_OFFLINE_KIT.md)
- [Offline Kit Documentation](../../OFFLINE_KIT.md)
- [Evidence Bundle Format](./evidence-bundle-format.md)

@@ -490,7 +490,7 @@ When operating in air-gapped environments:
--expected-digest sha256:...
```

For full offline kit support, see the [Offline Kit documentation](../../../24_OFFLINE_KIT.md).
For full offline kit support, see the [Offline Kit documentation](../../../OFFLINE_KIT.md).

---

@@ -499,4 +499,4 @@ For full offline kit support, see the [Offline Kit documentation](../../../24_OF
- [VEX Consensus CLI](./vex-cli.md) - VEX status management
- [Policy Simulation](../../policy/guides/simulation.md) - Policy testing
- [Authentication Guide](./auth-cli.md) - Token management
- [API Reference](../../../09_API_CLI_REFERENCE.md) - Full API documentation
- [API Reference](../../../API_CLI_REFERENCE.md) - Full API documentation

@@ -35,13 +35,13 @@ Concelier ingests signed advisories from **32 advisory connectors** and converts
- Connector runbooks in ./operations/connectors/.
- Mirror operations for Offline Kit parity.
- Grafana dashboards for connector health.
- **Authority toggle rollout (2025-10-22 update).** Follow the phased table and audit checklist in `../../10_CONCELIER_CLI_QUICKSTART.md` when enabling `authority.enabled`/`authority.allowAnonymousFallback`, and cross-check the refreshed `./operations/authority-audit-runbook.md` before enforcement.
- **Authority toggle rollout (2025-10-22 update).** Follow the phased table and audit checklist in `../../CONCELIER_CLI_QUICKSTART.md` when enabling `authority.enabled`/`authority.allowAnonymousFallback`, and cross-check the refreshed `./operations/authority-audit-runbook.md` before enforcement.

## Related resources
- ./operations/conflict-resolution.md
- ./operations/mirror.md
- ./operations/authority-audit-runbook.md
- ../../10_CONCELIER_CLI_QUICKSTART.md (authority integration timeline & smoke tests)
- ../../CONCELIER_CLI_QUICKSTART.md (authority integration timeline & smoke tests)

## Backlog references
- DOCS-LNM-22-001, DOCS-LNM-22-007 in ../../TASKS.md.

@@ -132,7 +132,7 @@ operating offline.
## 4. Locale & Translation Guidance

- Advisories remain in German (`language: "de"`). Preserve wording for provenance and legal accuracy.
- UI localisation: enable the translation bundles documented in `docs/15_UI_GUIDE.md` if English UI copy is required. Operators can overlay machine or human translations, but the canonical database stores the source text.
- UI localisation: enable the translation bundles documented in `docs/UI_GUIDE.md` if English UI copy is required. Operators can overlay machine or human translations, but the canonical database stores the source text.
- Docs guild is compiling a CERT-Bund terminology glossary under `docs/locale/certbund-glossary.md` so downstream teams can reference consistent English equivalents without altering the stored advisories.

---

@@ -42,7 +42,7 @@ Key features:
- Signer Module: `../signer/`
- Attestor Module: `../attestor/`
- Authority Module: `../authority/`
- Air-Gap Operations: `../../24_OFFLINE_KIT.md`
- Air-Gap Operations: `../../OFFLINE_KIT.md`

## Current Status

@@ -295,4 +295,4 @@ For air-gapped deployments:
* Multi-profile signing: `./multi-profile-signing-specification.md`
* Signer module: `../signer/architecture.md`
* Attestor module: `../attestor/architecture.md`
* Offline operations: `../../24_OFFLINE_KIT.md`
* Offline operations: `../../OFFLINE_KIT.md`

@@ -41,7 +41,7 @@ Key settings:
- Operations: `./operations/` (if exists)
- ExportCenter: `../export-center/`
- Attestor: `../attestor/`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`

## Current Status

@@ -152,7 +152,7 @@ Downstream automation reads `manifest.json`/`bundle.json` directly, while `/exci
* Track quota utilisation via HTTP 429 metrics (configure structured logging or OTEL counters when rate limiting triggers).
* Mirror domains can be deployed per tenant (e.g., `tenant-a`, `tenant-b`) with different auth requirements.
* Ensure the underlying artifact stores (`FileSystem`, `S3`, offline bundle) retain artefacts long enough for mirrors to sync.
* For air-gapped mirrors, combine mirror endpoints with the Offline Kit (see `docs/24_OFFLINE_KIT.md`).
* For air-gapped mirrors, combine mirror endpoints with the Offline Kit (see `docs/OFFLINE_KIT.md`).

---

@@ -409,7 +409,7 @@ gates:
| POST | `/api/v1/authority/verdicts/{manifestId}/replay` | Verify replay |
| GET | `/api/v1/authority/verdicts/{manifestId}/download` | Download signed manifest |

See `docs/09_API_CLI_REFERENCE.md` for complete API documentation.
See `docs/API_CLI_REFERENCE.md` for complete API documentation.

---

@@ -506,7 +506,7 @@ Note: Conflict recorded in audit trail
- [Excititor Architecture](./architecture.md)
- [Verdict Manifest Specification](../authority/verdict-manifest.md)
- [Policy Gates Configuration](../policy/architecture.md)
- [API Reference](../../09_API_CLI_REFERENCE.md)
- [API Reference](../../API_CLI_REFERENCE.md)

---

@@ -200,6 +200,6 @@ If encryption enabled, decrypt using age or AES key before verification.
- `docs/modules/export-center/mirror-bundles.md`
- `ops/devops/TASKS.md` (`DEVOPS-EXPORT-36-001`, `DEVOPS-EXPORT-37-001`)
- `docs/aoc/aggregation-only-contract.md`
- `docs/24_OFFLINE_KIT.md`
- `docs/OFFLINE_KIT.md`

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

@@ -40,7 +40,7 @@ See `docs/security/policy-governance.md` and `docs/aoc/aggregation-only-contract
- **Mirror bundles.** `mirror:full` packages raw evidence, normalized indexes, policy snapshots, and provenance in a portable filesystem layout suitable for disconnected environments. `mirror:delta` tracks changes relative to a prior export manifest.
- **No unsanctioned egress.** The exporter respects the platform allowlist. External calls (e.g., OCI pushes) require explicit configuration and are disabled by default for offline installs.

Consult `docs/24_OFFLINE_KIT.md` for Offline Kit delivery and `docs/modules/concelier/operations/mirror.md` for mirror ingestion procedures.
Consult `docs/OFFLINE_KIT.md` for Offline Kit delivery and `docs/modules/concelier/operations/mirror.md` for mirror ingestion procedures.

## Getting started
1. **Choose a profile.** Map requirements to the profile table above. Policy-aware exports need a published policy snapshot.

700
docs/modules/facet/architecture.md
Normal file
@@ -0,0 +1,700 @@
# Facet Sealing Architecture

> **Ownership:** Scanner Guild, Policy Guild
> **Audience:** Service owners, platform engineers, security architects
> **Related:** [Platform Architecture](../platform/architecture-overview.md), [Scanner Architecture](../scanner/architecture.md), [Replay Architecture](../replay/architecture.md), [Policy Engine](../policy/architecture.md)

This dossier describes the Facet Sealing subsystem, which provides cryptographically sealed manifests for logical slices of container images, enabling fine-grained drift detection, per-facet quota enforcement, and deterministic change tracking.

---

## 1. Overview

A **Facet** is a declared logical slice of a container image representing a cohesive set of files with shared characteristics:

| Facet Type | Description | Examples |
|------------|-------------|----------|
| `os` | Operating system packages | `/var/lib/dpkg/**`, `/var/lib/rpm/**` |
| `lang/<ecosystem>` | Language-specific dependencies | `node_modules/**`, `site-packages/**`, `vendor/**` |
| `binary` | Native binaries and shared libraries | `/usr/bin/*`, `/lib/**/*.so*` |
| `config` | Configuration files | `/etc/**`, `*.conf`, `*.yaml` |
| `custom` | User-defined patterns | Project-specific paths |

Each facet can be individually **sealed** (cryptographic snapshot) and monitored for **drift** (changes between seals).

---

## 2. System Landscape

```mermaid
graph TD
    subgraph Scanner["Scanner Services"]
        FE[FacetExtractor]
        FH[FacetHasher]
        MB[MerkleBuilder]
    end

    subgraph Storage["Facet Storage"]
        FS[(PostgreSQL<br/>facet_seals)]
        FC[(CAS<br/>facet_manifests)]
    end

    subgraph Policy["Policy & Enforcement"]
        DC[DriftCalculator]
        QE[QuotaEnforcer]
        AV[AdmissionValidator]
    end

    subgraph Signing["Attestation"]
        DS[DSSE Signer]
        AT[Attestor]
    end

    subgraph CLI["CLI & Integration"]
        SealCmd[stella seal]
        DriftCmd[stella drift]
        VexCmd[stella vex gen]
        Zastava[Zastava Webhook]
    end

    FE --> FH
    FH --> MB
    MB --> DS
    DS --> FS
    DS --> FC
    FS --> DC
    DC --> QE
    QE --> AV
    AV --> Zastava
    SealCmd --> FE
    DriftCmd --> DC
    VexCmd --> DC
```

---

## 3. Core Data Models

### 3.1 FacetDefinition

Declares a facet with its extraction patterns and quota constraints:

```csharp
public sealed record FacetDefinition
{
    public required string FacetId { get; init; }  // e.g., "os", "lang/node", "binary"
    public required FacetType Type { get; init; }  // OS, LangNode, LangPython, Binary, Config, Custom
    public required ImmutableArray<string> IncludeGlobs { get; init; }
    public ImmutableArray<string> ExcludeGlobs { get; init; } = [];
    public FacetQuota? Quota { get; init; }
}

public enum FacetType
{
    OS,
    LangNode,
    LangPython,
    LangGo,
    LangRust,
    LangJava,
    LangDotNet,
    Binary,
    Config,
    Custom
}
```

### 3.2 FacetManifest

Per-facet file manifest with Merkle root:

```csharp
public sealed record FacetManifest
{
    public required string FacetId { get; init; }
    public required FacetType Type { get; init; }
    public required ImmutableArray<FacetFileEntry> Files { get; init; }
    public required string MerkleRoot { get; init; }  // SHA-256 hex
    public required int FileCount { get; init; }
    public required long TotalBytes { get; init; }
    public required DateTimeOffset ExtractedAt { get; init; }
    public required string ExtractorVersion { get; init; }
}

public sealed record FacetFileEntry
{
    public required string Path { get; init; }  // Normalized POSIX path
    public required string ContentHash { get; init; }  // SHA-256 hex
    public required long Size { get; init; }
    public required string Mode { get; init; }  // POSIX mode string "0644"
    public required DateTimeOffset ModTime { get; init; }  // Normalized to UTC
}
```

### 3.3 FacetSeal

DSSE-signed seal combining manifest with metadata:

```csharp
public sealed record FacetSeal
{
    public required Guid SealId { get; init; }
    public required string ImageRef { get; init; }  // registry/repo:tag@sha256:...
    public required string ImageDigest { get; init; }  // sha256:...
    public required FacetManifest Manifest { get; init; }
    public required DateTimeOffset SealedAt { get; init; }
    public required string SealedBy { get; init; }  // Identity/service
    public required FacetQuota? AppliedQuota { get; init; }
    public required DsseEnvelope Envelope { get; init; }
}
```

### 3.4 FacetQuota

Per-facet change budget:

```csharp
public sealed record FacetQuota
{
    public required string FacetId { get; init; }
    public double MaxChurnPercent { get; init; } = 5.0;  // 0-100
    public int MaxChangedFiles { get; init; } = 50;
    public int MaxAddedFiles { get; init; } = 25;
    public int MaxRemovedFiles { get; init; } = 10;
    public QuotaAction OnExceed { get; init; } = QuotaAction.Warn;
}

public enum QuotaAction
{
    Warn,       // Log warning, allow admission
    Block,      // Reject admission
    RequireVex  // Require VEX justification before admission
}
```

### 3.5 FacetDrift

Drift calculation result between two seals:

```csharp
public sealed record FacetDrift
{
    public required string FacetId { get; init; }
    public required Guid BaselineSealId { get; init; }
    public required Guid CurrentSealId { get; init; }
    public required ImmutableArray<DriftEntry> Added { get; init; }
    public required ImmutableArray<DriftEntry> Removed { get; init; }
    public required ImmutableArray<DriftEntry> Modified { get; init; }
    public required DriftScore Score { get; init; }
    public required QuotaVerdict QuotaVerdict { get; init; }
}

public sealed record DriftEntry
{
    public required string Path { get; init; }
    public string? OldHash { get; init; }
    public string? NewHash { get; init; }
    public long? OldSize { get; init; }
    public long? NewSize { get; init; }
    public DriftCause Cause { get; init; } = DriftCause.Unknown;
}

public enum DriftCause
{
    Unknown,
    PackageUpdate,
    ConfigChange,
    BinaryRebuild,
    NewDependency,
    RemovedDependency,
    SecurityPatch
}

public sealed record DriftScore
{
    public required int TotalChanges { get; init; }
    public required double ChurnPercent { get; init; }
    public required int AddedCount { get; init; }
    public required int RemovedCount { get; init; }
    public required int ModifiedCount { get; init; }
}

public sealed record QuotaVerdict
{
    public required bool Passed { get; init; }
    public required ImmutableArray<QuotaViolation> Violations { get; init; }
    public required QuotaAction RecommendedAction { get; init; }
}

public sealed record QuotaViolation
{
    public required string QuotaField { get; init; }  // e.g., "MaxChurnPercent"
    public required double Limit { get; init; }
    public required double Actual { get; init; }
    public required string Message { get; init; }
}
```

---

## 4. Component Architecture

### 4.1 FacetExtractor

Extracts file entries from container images based on facet definitions:

```csharp
public interface IFacetExtractor
{
    Task<FacetManifest> ExtractAsync(
        string imageRef,
        FacetDefinition definition,
        CancellationToken ct = default);

    Task<ImmutableArray<FacetManifest>> ExtractAllAsync(
        string imageRef,
        ImmutableArray<FacetDefinition> definitions,
        CancellationToken ct = default);
}
```

Implementation notes:
- Uses existing `ISurfaceReader` for container layer traversal
- Normalizes paths to POSIX format (forward slashes, no trailing slashes)
- Computes SHA-256 content hashes for each file
- Normalizes timestamps to UTC, mode to POSIX string
- Sorts files lexicographically for deterministic ordering
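
The normalization rules in the notes above can be sketched as follows. This is an illustrative Python sketch, not the `IFacetExtractor` implementation: the helper names (`normalize_path`, `normalize_mode`, `normalize_mtime`, `sort_entries`) are hypothetical, and it assumes that image paths are rooted at `/` and that "lexicographically" means ordinal (code point) comparison.

```python
from datetime import datetime, timedelta, timezone


def normalize_path(raw: str) -> str:
    """Return a rooted POSIX path: forward slashes, no empty or '.' segments, no trailing slash."""
    parts = [p for p in raw.replace("\\", "/").split("/") if p not in ("", ".")]
    return "/" + "/".join(parts)


def normalize_mode(st_mode: int) -> str:
    """Keep only the permission bits and render them as a 4-digit octal string."""
    return format(st_mode & 0o7777, "04o")


def normalize_mtime(mtime: datetime) -> datetime:
    """Convert an offset-aware timestamp to UTC."""
    return mtime.astimezone(timezone.utc)


def sort_entries(entries: list[dict]) -> list[dict]:
    """Order entries by path using code point comparison (assumed), independent of locale."""
    return sorted(entries, key=lambda e: e["path"])
```

Sorting by code point rather than by a culture-aware collation keeps the manifest order identical across hosts, which matters because the Merkle root depends on leaf order.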

### 4.2 FacetHasher

Computes Merkle tree for facet file entries:

```csharp
public interface IFacetHasher
{
    FacetMerkleResult ComputeMerkle(ImmutableArray<FacetFileEntry> files);
}

public sealed record FacetMerkleResult
{
    public required string Root { get; init; }
    public required ImmutableArray<string> LeafHashes { get; init; }
    public required ImmutableArray<MerkleProofNode> Proof { get; init; }
}
```

Implementation notes:
- Leaf hash = SHA-256(path || contentHash || size || mode)
- Binary Merkle tree with lexicographic leaf ordering
- Empty facet produces well-known empty root hash
- Proof enables verification of individual file membership
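
A minimal Python sketch of the Merkle construction described above, under several assumptions that the notes do not pin down: leaf fields are joined with a NUL separator before hashing, an odd node is paired with a copy of itself, and the empty root is SHA-256 of the empty byte string. The function names are hypothetical, and the real `IFacetHasher` may differ on each of these points.

```python
import hashlib


def leaf_hash(path: str, content_hash: str, size: int, mode: str) -> bytes:
    """Hash one file entry; the NUL separator keeps field boundaries unambiguous (assumed encoding)."""
    data = "\0".join([path, content_hash, str(size), mode]).encode("utf-8")
    return hashlib.sha256(data).digest()


def _pad(level: list[bytes]) -> list[bytes]:
    """Duplicate the last node when the level has an odd length (assumed policy)."""
    return level + [level[-1]] if len(level) % 2 else level


def _parents(level: list[bytes]) -> list[bytes]:
    return [hashlib.sha256(level[i] + level[i + 1]).digest() for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root over leaves that the caller has already sorted by path."""
    if not leaves:
        return hashlib.sha256(b"").digest()  # assumed "well-known empty root"
    level = list(leaves)
    while len(level) > 1:
        level = _parents(_pad(level))
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag is True when the sibling sits on the left."""
    proof = []
    level = list(leaves)
    while len(level) > 1:
        level = _pad(level)
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _parents(level)
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    node = leaf
    for sibling, sibling_is_left in proof:
        node = hashlib.sha256(sibling + node if sibling_is_left else node + sibling).digest()
    return node == root
```

With a proof in hand, a verifier can confirm that a single file belongs to a sealed facet without downloading the full manifest.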

### 4.3 FacetSealStore

PostgreSQL storage for sealed facet manifests:

```sql
-- Core seal storage
CREATE TABLE facet_seals (
    seal_id UUID PRIMARY KEY,
    tenant TEXT NOT NULL,
    image_ref TEXT NOT NULL,
    image_digest TEXT NOT NULL,
    facet_id TEXT NOT NULL,
    facet_type TEXT NOT NULL,
    merkle_root TEXT NOT NULL,
    file_count INTEGER NOT NULL,
    total_bytes BIGINT NOT NULL,
    sealed_at TIMESTAMPTZ NOT NULL,
    sealed_by TEXT NOT NULL,
    quota_json JSONB,
    manifest_cas TEXT NOT NULL, -- CAS URI to full manifest
    dsse_envelope JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    CONSTRAINT uq_facet_seal UNIQUE (tenant, image_digest, facet_id)
);

CREATE INDEX ix_facet_seals_image ON facet_seals (tenant, image_digest);
CREATE INDEX ix_facet_seals_merkle ON facet_seals (merkle_root);

-- Drift history
CREATE TABLE facet_drift_history (
    drift_id UUID PRIMARY KEY,
    tenant TEXT NOT NULL,
    baseline_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
    current_seal_id UUID NOT NULL REFERENCES facet_seals(seal_id),
    facet_id TEXT NOT NULL,
    drift_score_json JSONB NOT NULL,
    quota_verdict_json JSONB NOT NULL,
    computed_at TIMESTAMPTZ NOT NULL,

    CONSTRAINT uq_drift_pair UNIQUE (baseline_seal_id, current_seal_id)
);
```

### 4.4 DriftCalculator

Computes drift between baseline and current seals:

```csharp
public interface IDriftCalculator
{
    Task<FacetDrift> CalculateAsync(
        Guid baselineSealId,
        Guid currentSealId,
        CancellationToken ct = default);

    Task<ImmutableArray<FacetDrift>> CalculateAllAsync(
        string imageDigestBaseline,
        string imageDigestCurrent,
        CancellationToken ct = default);
}
```

Implementation notes:
- Retrieves manifests from CAS via seal metadata
- Performs set difference operations on file paths
- Detects modifications via content hash comparison
- Attributes drift causes where determinable (e.g., package manager metadata)
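
The set-difference step in the notes above can be sketched in a few lines of Python. This is only an illustration: `compute_drift` is a hypothetical name, manifests are reduced to `path -> content hash` dictionaries, and the churn formula (total changes divided by the baseline file count) is an assumption, since the document does not define how `ChurnPercent` is derived.

```python
def compute_drift(baseline: dict[str, str], current: dict[str, str]) -> dict:
    """Compare two manifests given as {path: content_hash} mappings."""
    added = sorted(current.keys() - baseline.keys())
    removed = sorted(baseline.keys() - current.keys())
    # A path present on both sides counts as modified when its content hash differs.
    modified = sorted(p for p in baseline.keys() & current.keys() if baseline[p] != current[p])

    total = len(added) + len(removed) + len(modified)
    if baseline:
        churn = 100.0 * total / len(baseline)  # assumed definition of churn
    else:
        churn = 100.0 if total else 0.0

    return {
        "added": added,
        "removed": removed,
        "modified": modified,
        "total_changes": total,
        "churn_percent": churn,
    }
```

Sorting the three result lists keeps the drift report deterministic, in line with the lexicographic ordering used for manifests.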

### 4.5 QuotaEnforcer

Evaluates drift against quota constraints:

```csharp
public interface IQuotaEnforcer
{
    QuotaVerdict Evaluate(FacetDrift drift, FacetQuota quota);

    Task<ImmutableArray<QuotaVerdict>> EvaluateAllAsync(
        ImmutableArray<FacetDrift> drifts,
        ImmutableDictionary<string, FacetQuota> quotas,
        CancellationToken ct = default);
}
```
```
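
One way to read the `Evaluate` contract is sketched below in Python. It is a hedged illustration, not the `IQuotaEnforcer` implementation: the snake_case names are hypothetical, `MaxChangedFiles` is assumed to bound the modified-file count (it could equally bound total changes), and a passing verdict is assumed to report `Warn` as its recommended action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetQuota:
    # Defaults mirror the FacetQuota record in section 3.4.
    max_churn_percent: float = 5.0
    max_changed_files: int = 50
    max_added_files: int = 25
    max_removed_files: int = 10
    on_exceed: str = "Warn"  # "Warn" | "Block" | "RequireVex"


def evaluate(score: dict, quota: FacetQuota) -> dict:
    """Compare a drift score against each quota field and collect violations."""
    checks = [
        ("MaxChurnPercent", quota.max_churn_percent, score["churn_percent"]),
        ("MaxChangedFiles", quota.max_changed_files, score["modified_count"]),  # assumed mapping
        ("MaxAddedFiles", quota.max_added_files, score["added_count"]),
        ("MaxRemovedFiles", quota.max_removed_files, score["removed_count"]),
    ]
    violations = [
        {
            "quota_field": field,
            "limit": float(limit),
            "actual": float(actual),
            "message": f"{field} exceeded: {actual} > {limit}",
        }
        for field, limit, actual in checks
        if actual > limit
    ]
    return {
        "passed": not violations,
        "violations": violations,
        # Assumption: a passing verdict carries the least severe action.
        "recommended_action": quota.on_exceed if violations else "Warn",
    }
```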

### 4.6 AdmissionValidator

Zastava webhook integration for admission control:

```csharp
public interface IFacetAdmissionValidator
{
    Task<AdmissionResult> ValidateAsync(
        AdmissionRequest request,
        CancellationToken ct = default);
}

public sealed record AdmissionResult
{
    public required bool Allowed { get; init; }
    public string? Message { get; init; }
    public ImmutableArray<QuotaViolation> Violations { get; init; } = [];
    public string? RequiredVexStatement { get; init; }
}
```

---

## 5. DSSE Envelope Structure

Facet seals use DSSE (Dead Simple Signing Envelope) for cryptographic binding:

```json
{
  "payloadType": "application/vnd.stellaops.facet-seal.v1+json",
  "payload": "<base64url-encoded canonical JSON of FacetSeal>",
  "signatures": [
    {
      "keyid": "sha256:abc123...",
      "sig": "<base64url-encoded signature>"
    }
  ]
}
```

Payload structure (canonical JSON, RFC 8785):
```json
{
  "_type": "https://stellaops.io/FacetSeal/v1",
  "facetId": "os",
  "facetType": "OS",
  "imageDigest": "sha256:abc123...",
  "imageRef": "registry.example.com/app:v1.2.3",
  "manifest": {
    "extractedAt": "2026-01-05T10:00:00.000Z",
    "extractorVersion": "1.0.0",
    "fileCount": 1234,
    "files": [
      {
        "contentHash": "sha256:...",
        "mode": "0644",
        "modTime": "2026-01-01T00:00:00.000Z",
        "path": "/etc/os-release",
        "size": 256
      }
    ],
    "merkleRoot": "sha256:def456...",
    "totalBytes": 1048576
  },
  "quota": {
    "maxAddedFiles": 25,
    "maxChangedFiles": 50,
    "maxChurnPercent": 5.0,
    "maxRemovedFiles": 10,
    "onExceed": "Warn"
  },
  "sealId": "550e8400-e29b-41d4-a716-446655440000",
  "sealedAt": "2026-01-05T10:05:00.000Z",
  "sealedBy": "scanner-worker-01"
}
```
```
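
To make the binding concrete, the sketch below builds and verifies an envelope of the shape shown above. DSSE signs the pre-authentication encoding (PAE) of the payload type and payload rather than the raw payload. Two parts are stand-ins and should be read as assumptions: HMAC-SHA256 replaces the real signing keys managed by the Authority service, and `json.dumps` with sorted keys only approximates RFC 8785 canonicalization (it agrees for simple ASCII-keyed objects but does not implement the full number and key-ordering rules). The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.stellaops.facet-seal.v1+json"


def canonical_json(obj) -> bytes:
    """Approximate canonical form: sorted keys, no insignificant whitespace (not full RFC 8785)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: 'DSSEv1' SP len(type) SP type SP len(body) SP body."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_envelope(seal: dict, key: bytes, key_id: str) -> dict:
    payload = canonical_json(seal)
    # Stand-in signer: HMAC-SHA256 over the PAE bytes (a real deployment would use an asymmetric key).
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.urlsafe_b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": key_id, "sig": base64.urlsafe_b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope: dict, key: bytes) -> bool:
    payload = base64.urlsafe_b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    return any(
        hmac.compare_digest(expected, base64.urlsafe_b64decode(s["sig"]))
        for s in envelope["signatures"]
    )
```

Because the payload type is part of the signed bytes, a signature produced for a facet seal cannot be replayed under a different `payloadType`.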

---

## 6. Default Facet Definitions

Standard facet definitions applied when no custom configuration is provided:

```yaml
# Default facet configuration
facets:
  - facetId: os
    type: OS
    includeGlobs:
      - /var/lib/dpkg/**
      - /var/lib/rpm/**
      - /var/lib/pacman/**
      - /var/lib/apk/**
      - /var/cache/apt/**
      - /etc/apt/**
      - /etc/yum.repos.d/**
    excludeGlobs:
      - "**/*.log"
    quota:
      maxChurnPercent: 5.0
      maxChangedFiles: 100
      onExceed: Warn

  - facetId: lang/node
    type: LangNode
    includeGlobs:
      - "**/node_modules/**"
      - "**/package.json"
      - "**/package-lock.json"
      - "**/yarn.lock"
      - "**/pnpm-lock.yaml"
    quota:
      maxChurnPercent: 10.0
      maxChangedFiles: 500
      onExceed: RequireVex

  - facetId: lang/python
    type: LangPython
    includeGlobs:
      - "**/site-packages/**"
      - "**/dist-packages/**"
      - "**/requirements.txt"
      - "**/Pipfile.lock"
      - "**/poetry.lock"
    quota:
      maxChurnPercent: 10.0
      maxChangedFiles: 200
      onExceed: Warn

  - facetId: lang/go
    type: LangGo
    includeGlobs:
      - "**/go.mod"
      - "**/go.sum"
      - "**/vendor/**"
    quota:
      maxChurnPercent: 15.0
      maxChangedFiles: 100
      onExceed: Warn

  - facetId: binary
    type: Binary
    includeGlobs:
      - /usr/bin/*
      - /usr/sbin/*
      - /bin/*
      - /sbin/*
      - /usr/lib/**/*.so*
      - /lib/**/*.so*
      - /usr/local/bin/*
    excludeGlobs:
      - "**/*.py"
      - "**/*.sh"
    quota:
      maxChurnPercent: 2.0
      maxChangedFiles: 20
      onExceed: Block

  - facetId: config
    type: Config
    includeGlobs:
      - /etc/**
      - "**/*.conf"
      - "**/*.cfg"
      - "**/*.ini"
      - "**/*.yaml"
      - "**/*.yml"
      - "**/*.json"
    excludeGlobs:
      - /etc/passwd
      - /etc/shadow
      - /etc/group
      - "**/*.log"
    quota:
      maxChurnPercent: 20.0
      maxChangedFiles: 50
      onExceed: Warn
```
```
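
How `includeGlobs` and `excludeGlobs` select files can be sketched as below. The glob semantics are an assumption, since the document does not specify them: `**` is taken to span directory levels, `*` to stay within a single path segment, and an exclude match to override any include match. `glob_to_regex` and `in_facet` are hypothetical helper names.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a facet glob into an anchored regex under the assumed semantics."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def in_facet(path: str, include_globs, exclude_globs=()) -> bool:
    """A path belongs to a facet when some include glob matches and no exclude glob does."""
    included = any(glob_to_regex(g).match(path) for g in include_globs)
    excluded = any(glob_to_regex(g).match(path) for g in exclude_globs)
    return included and not excluded
```

Under these semantics `/usr/bin/*` matches direct children of `/usr/bin` only, while `/usr/lib/**/*.so*` also reaches shared objects in nested directories such as `/usr/lib/x86_64-linux-gnu/`.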

---

## 7. Integration Points

### 7.1 Scanner Integration

Scanner invokes facet extraction during scan:

```csharp
// In ScanOrchestrator
var facetDefs = await _facetConfigLoader.LoadAsync(scanRequest.FacetConfig, ct);
var manifests = await _facetExtractor.ExtractAllAsync(imageRef, facetDefs, ct);

foreach (var manifest in manifests)
{
    var seal = await _facetSealer.SealAsync(manifest, scanRequest, ct);
    await _facetSealStore.SaveAsync(seal, ct);
}
```

### 7.2 CLI Integration

```bash
# Seal all facets for an image
stella seal myregistry.io/app:v1.2.3 --output seals.json

# Seal specific facets
stella seal myregistry.io/app:v1.2.3 --facet os --facet lang/node

# Check drift between two image versions
stella drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4 --format json

# Generate VEX from drift
stella vex gen --from-drift myregistry.io/app:v1.2.3 myregistry.io/app:v1.2.4
```

### 7.3 Zastava Webhook Integration

```csharp
// In FacetAdmissionValidator
public async Task<AdmissionResult> ValidateAsync(AdmissionRequest request, CancellationToken ct)
{
    // Find baseline seal (latest approved)
    var baseline = await _sealStore.GetLatestApprovedAsync(request.ImageRef, ct);
    if (baseline is null)
        return AdmissionResult.Allowed("No baseline seal found, skipping facet check");

    // Extract current facets
    var currentManifests = await _extractor.ExtractAllAsync(request.ImageRef, _defaultFacets, ct);

    // Calculate drift for each facet
    var drifts = new List<FacetDrift>();
    foreach (var manifest in currentManifests)
    {
        var baselineSeal = baseline.FirstOrDefault(s => s.FacetId == manifest.FacetId);
        if (baselineSeal is not null)
        {
            var drift = await _driftCalculator.CalculateAsync(baselineSeal, manifest, ct);
            drifts.Add(drift);
        }
    }

    // Evaluate quotas
    var violations = new List<QuotaViolation>();
    QuotaAction maxAction = QuotaAction.Warn;

    foreach (var drift in drifts)
    {
        var verdict = _quotaEnforcer.Evaluate(drift, drift.AppliedQuota);
        if (!verdict.Passed)
        {
            violations.AddRange(verdict.Violations);
            if (verdict.RecommendedAction > maxAction)
                maxAction = verdict.RecommendedAction;
        }
    }

    return maxAction switch
    {
        QuotaAction.Block => AdmissionResult.Denied(violations),
        QuotaAction.RequireVex => AdmissionResult.RequiresVex(violations),
        _ => AdmissionResult.Allowed(violations)
    };
}
```
```
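
When several facets fail their quotas, the admission decision depends on how the recommended actions are ranked. The Python sketch below makes that ranking explicit. It assumes that `Block` should outrank `RequireVex`, which is a design assumption rather than something the document states; note that the `QuotaAction` enum in section 3.4 declares `Block` before `RequireVex`, so a plain numeric comparison of enum values would rank them the other way. The names `SEVERITY`, `most_severe`, and `admission_decision` are hypothetical.

```python
# Assumed severity ranking: a hard block wins over a VEX requirement.
SEVERITY = {"Warn": 0, "RequireVex": 1, "Block": 2}


def most_severe(actions) -> str:
    """Pick the highest-ranked action; default to 'Warn' when nothing failed."""
    return max(actions, key=lambda a: SEVERITY[a], default="Warn")


def admission_decision(verdicts: list[dict]) -> dict:
    """Fold per-facet quota verdicts into one admission outcome."""
    failed = [v for v in verdicts if not v["passed"]]
    action = most_severe(v["recommended_action"] for v in failed)
    violations = [x for v in failed for x in v["violations"]]
    return {"allowed": action == "Warn", "action": action, "violations": violations}
```

An explicit ranking table keeps the outcome independent of enum declaration order, so reordering or extending the actions later does not silently change admission behavior.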

---

## 8. Observability

### 8.1 Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `facet_seal_total` | Counter | `tenant`, `facet_type`, `status` | Total seals created |
| `facet_seal_duration_seconds` | Histogram | `facet_type` | Time to create seal |
| `facet_drift_score` | Gauge | `tenant`, `facet_id`, `image` | Current drift score |
| `facet_quota_violations_total` | Counter | `tenant`, `facet_id`, `quota_field` | Quota violations |
| `facet_admission_decisions_total` | Counter | `tenant`, `decision`, `facet_id` | Admission decisions |

### 8.2 Traces

```
facet.extract         - Facet file extraction from image
facet.hash            - Merkle tree computation
facet.seal            - DSSE signing
facet.drift.compute   - Drift calculation
facet.quota.evaluate  - Quota enforcement
facet.admission       - Admission validation
```

### 8.3 Logs

Structured log fields:
- `facetId`: Facet identifier
- `imageRef`: Container image reference
- `imageDigest`: Image content digest
- `merkleRoot`: Facet Merkle root
- `driftScore`: Computed drift percentage
- `quotaVerdict`: Pass/fail status

---

## 9. Security Considerations

1. **Signature Verification**: All seals must be DSSE-signed with keys managed by Authority service
2. **Tenant Isolation**: Seals are scoped to tenants; cross-tenant access is prohibited
3. **Immutability**: Once created, seals cannot be modified; only superseded by new seals
4. **Audit Trail**: All seal operations are logged with correlation IDs
5. **Key Rotation**: Signing keys support rotation; old signatures remain valid with archived keys

---

## 10. References

- [DSSE Specification](https://github.com/secure-systems-lab/dsse)
- [RFC 8785 - JSON Canonicalization](https://tools.ietf.org/html/rfc8785)
- [Scanner Architecture](../scanner/architecture.md)
- [Attestor Architecture](../attestor/architecture.md)
- [Policy Engine Architecture](../policy/architecture.md)
- [Replay Architecture](../replay/architecture.md)

---

*Last updated: 2026-01-05*

@@ -165,7 +165,7 @@ jobs:

- [Tenant Isolation & Redaction](../tenant-isolation-redaction.md)
- [Findings Ledger Deployment](../deployment.md)
- [Offline Kit Operations](../../../24_OFFLINE_KIT.md)
- [Offline Kit Operations](../../../OFFLINE_KIT.md)

---

@@ -42,7 +42,7 @@ Key settings:
- Architecture: `./architecture.md`
- Router Module: `../router/`
- Authority Module: `../authority/`
- API Reference: `../../09_API_CLI_REFERENCE.md`
- API Reference: `../../API_CLI_REFERENCE.md`

## Current Status

@@ -38,7 +38,7 @@ Key features:

- AirGap Module: `../airgap/`
- ExportCenter: `../export-center/`
- Offline Kit: `../../24_OFFLINE_KIT.md`
- Offline Kit: `../../OFFLINE_KIT.md`
- Operations: `./operations/` (if exists)

## Current Status

@@ -1,7 +1,7 @@
# StellaOps Architecture Overview (Sprint 19)
# StellaOps Architecture Overview

> **Ownership:** Architecture Guild • Docs Guild
> **Audience:** Service owners, platform engineers, solution architects
> **Ownership:** Architecture Guild • Docs Guild
> **Audience:** Service owners, platform engineers, solution architects
> **Related:** [High-Level Architecture](../../ARCHITECTURE_REFERENCE.md), [Concelier Architecture](../concelier/architecture.md), [Policy Engine Architecture](../policy/architecture.md), [Aggregation-Only Contract](../../aoc/aggregation-only-contract.md)

This dossier summarises the end-to-end runtime topology after the Aggregation-Only Contract (AOC) rollout. It highlights where raw facts live, how ingest services enforce guardrails, and how downstream components consume those facts to derive policy decisions and user-facing experiences.

@@ -27,7 +27,7 @@ This dossier summarises the end-to-end runtime topology after the Aggregation-On

> Evaluate public scanner incidents? The [Ecosystem Test Cases](../product-advisories/30-Nov-2025 - Ecosystem Test Cases for StellaOps.md) document five hardened regressions (Grype credential leak, Trivy offline schema, SBOM parity, Grype instability) that you can turn into acceptance tests today.

## 1 · System landscape
## 1 · System landscape

```mermaid
graph TD

@@ -94,7 +94,7 @@ Key boundaries:

---

## 2 · Aggregation-Only Contract focus
## 2 · Aggregation-Only Contract focus

### 2.1 Responsibilities at the boundary

@@ -146,7 +146,7 @@ sequenceDiagram

---

## 3 · Data & control flow highlights
## 3 · Data & control flow highlights

1. **Ingestion:** Concelier / Excititor connectors fetch upstream documents, compute linksets, and hand payloads to `AOCWriteGuard`. Guards validate schema, provenance, forbidden fields, supersedes pointers, and append-only rules before writing to PostgreSQL.
|
||||
2. **Verification:** `stella aoc verify` (CLI/CI) and `/aoc/verify` endpoints replay guard checks against stored documents, mapping `ERR_AOC_00x` codes to exit codes for automation.
|
||||
@@ -156,7 +156,7 @@ sequenceDiagram
|
||||
|
||||
---
|
||||
|
||||
## 4 · Offline & disaster readiness
|
||||
## 4 · Offline & disaster readiness
|
||||
|
||||
- **Offline Kit:** Packages raw PostgreSQL snapshots (`advisory_raw`, `vex_raw`) plus guard configuration and CLI verifier binaries so air-gapped sites can re-run AOC checks before promotion.
|
||||
- **Recovery:** Supersedes chains allow rollback to prior revisions without mutating rows. Disaster exercises must rehearse restoring from snapshot, replaying logical replication into Policy Engine, and re-validating guard compliance.
|
||||
@@ -164,9 +164,9 @@ sequenceDiagram
|
||||
|
||||
---
|
||||
|
||||
## 5 · Replay CAS & deterministic bundles
|
||||
## 5 · Replay CAS & deterministic bundles
|
||||
|
||||
- **Replay CAS:** Content-addressed storage lives under `cas://replay/<sha256-prefix>/<digest>.tar.zst`. Writers must use [StellaOps.Replay.Core](../../src/__Libraries/StellaOps.Replay.Core/AGENTS.md) helpers to ensure lexicographic file ordering, POSIX mode normalisation (0644/0755), LF newlines, zstd level 19 compression, and shard-by-prefix CAS URIs (`BuildCasUri`). Bundle metadata (size, hash, created) feeds the platform-wide `replay_bundles` collection defined in `docs/db/replay-schema.md`.
|
||||
- **Replay CAS:** Content-addressed storage lives under `cas://replay/<sha256-prefix>/<digest>.tar.zst`. Writers must use [StellaOps.Replay.Core](../../src/__Libraries/StellaOps.Replay.Core/AGENTS.md) helpers to ensure lexicographic file ordering, POSIX mode normalisation (0644/0755), LF newlines, zstd level 19 compression, and shard-by-prefix CAS URIs (`BuildCasUri`). Bundle metadata (size, hash, created) feeds the platform-wide `replay_bundles` collection defined in `docs/db/replay-schema.md`.
|
||||
- **Artifacts:** Each recorded scan stores three bundles:
|
||||
1. `manifest.json` (canonical JSON, hashed and signed via DSSE).
|
||||
2. `inputbundle.tar.zst` (feeds, policies, tools, environment snapshot).
|
||||
@@ -175,11 +175,11 @@ sequenceDiagram
|
||||
- **Reachability subtree:** When reachability recording is enabled, Scanner uploads graphs & runtime traces under `cas://replay/<scan-id>/reachability/graphs/` and `cas://replay/<scan-id>/reachability/traces/`. Manifest references (StellaOps.Replay.Core) bind these URIs along with analyzer hashes so Replay + Signals can rehydrate explainability evidence deterministically.
|
||||
- **Storage tiers:** Primary storage is PostgreSQL (`replay_runs`, `replay_subjects`) plus the CAS bucket. Evidence Locker mirrors bundles for long-term retention and legal hold workflows (`docs/modules/evidence-locker/architecture.md`). Offline kits package bundles under `offline/replay/<scan-id>` with detached DSSE envelopes for air-gapped verification.
|
||||
- **APIs & ownership:** Scanner WebService produces the bundles via `record` mode, Scanner Worker emits Merkle metadata, Signer/Authority provide DSSE signatures, Attestor anchors manifests to Rekor, CLI/Evidence Locker handle retrieval, and Docs Guild maintains runbooks. Responsibilities are tracked in `docs/implplan/SPRINT_185_shared_replay_primitives.md` through `SPRINT_187_evidence_locker_cli_integration.md`.
|
||||
- **Operational policies:** Retention defaults to 180 days for hot CAS storage and 2 years for cold Evidence Locker copies. Rotation and pruning follow the checklist in `docs/runbooks/replay_ops.md`.
|
||||
- **Operational policies:** Retention defaults to 180 days for hot CAS storage and 2 years for cold Evidence Locker copies. Rotation and pruning follow the checklist in `docs/runbooks/replay_ops.md`.
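
The shard-by-prefix layout `cas://replay/<sha256-prefix>/<digest>.tar.zst` described above can be illustrated with a small sketch. It is written in Python for illustration only and is not the `BuildCasUri` helper from `StellaOps.Replay.Core`. The function name `build_cas_uri`, the two-character prefix length, and the handling of an optional `sha256:` prefix are assumptions.

```python
HEX_DIGITS = set("0123456789abcdef")


def build_cas_uri(digest: str, prefix_len: int = 2) -> str:
    """Build a shard-by-prefix replay CAS URI from a SHA-256 digest.

    Assumption: the shard directory is the first `prefix_len` hex characters
    of the digest. The real prefix length is not stated in this document.
    """
    hex_digest = digest.lower().removeprefix("sha256:")
    if len(hex_digest) != 64 or not set(hex_digest) <= HEX_DIGITS:
        raise ValueError("expected a 64-character hex SHA-256 digest")
    return f"cas://replay/{hex_digest[:prefix_len]}/{hex_digest}.tar.zst"
```

Sharding by digest prefix keeps any single directory or key range from growing without bound, which is likely why the layout is used here.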
|
||||
|
||||
---
|
||||
|
||||
## 6 · References
|
||||
## 6 · References
|
||||
|
||||
- [Aggregation-Only Contract reference](../../aoc/aggregation-only-contract.md)
|
||||
- [Concelier architecture](../concelier/architecture.md)
|
||||
@@ -194,7 +194,7 @@ sequenceDiagram
|
||||
|
||||
---
|
||||
|
||||
## 7 · Compliance checklist
|
||||
## 7 · Compliance checklist
|
||||
|
||||
- [ ] AOC guard enabled for all Concelier and Excititor write paths in production.
|
||||
- [ ] PostgreSQL schema constraints deployed for `advisory_raw` and `vex_raw`; logical replication scoped per tenant.
|
||||
@@ -208,4 +208,4 @@ sequenceDiagram
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2025-12-23 (Testing strategy links and catalog).*
|
||||
*Last updated: 2026-01-05 (Removed dated sprint reference).*
|
||||
|
||||
@@ -617,6 +617,6 @@ CREATE INDEX idx_revocations_time ON provcache.prov_revocations(revoked_at);
|
||||
- **[Provcache Architecture Guide](architecture.md)** - Detailed architecture, invalidation flows, and API reference
|
||||
- [Policy Engine Architecture](../policy/README.md)
|
||||
- [TrustLattice Engine](../policy/design/policy-deterministic-evaluator.md)
|
||||
- [Offline Kit Documentation](../../24_OFFLINE_KIT.md)
|
||||
- [Offline Kit Documentation](../../OFFLINE_KIT.md)
|
||||
- [Air-Gap Controller](../airgap/README.md)
|
||||
- [Authority Key Rotation](../authority/README.md)
|
||||
606
docs/modules/replay/replay-proof-schema.md
Normal file
@@ -0,0 +1,606 @@
|
||||
# Replay Proof Schema
|
||||
|
||||
> **Ownership:** Replay Guild, Scanner Guild, Attestor Guild
|
||||
> **Audience:** Service owners, platform engineers, auditors, compliance teams
|
||||
> **Related:** [Platform Architecture](../platform/architecture-overview.md), [Replay Architecture](./architecture.md), [Facet Sealing](../facet/architecture.md), [DSSE Specification](https://github.com/secure-systems-lab/dsse)
|
||||
|
||||
This document defines the schema for Replay Proofs - compact, cryptographically verifiable artifacts that attest to deterministic policy evaluation outcomes.
|
||||
|
||||
---
|
||||
|
||||
## 1. Overview
|
||||
|
||||
A **Replay Proof** is a DSSE-signed artifact that proves a policy evaluation produced a specific verdict given a specific set of inputs. Replay proofs enable:
|
||||
|
||||
- **Audit trails**: Compact proof that a verdict was computed correctly
- **Determinism verification**: Re-running with same inputs produces identical output
- **Time-travel debugging**: Understand why a past decision was made
- **Compliance evidence**: Cryptographic proof for regulatory requirements
|
||||
|
||||
---
|
||||
|
||||
## 2. Replay Bundle Structure
|
||||
|
||||
A complete replay bundle consists of three artifacts stored in CAS:
|
||||
|
||||
```
cas://replay/<run-id>/
  manifest.json          # DSSE-signed manifest (this document's focus)
  inputbundle.tar.zst    # Compressed input artifacts
  outputbundle.tar.zst   # Compressed output artifacts
```
|
||||
|
||||
### 2.1 Directory Layout
|
||||
|
||||
```
<run-id>/
  manifest.json
  inputbundle.tar.zst
    feeds/
      nvd/<date>.json
      osv/<date>.json
      ghsa/<date>.json
    policy/
      bundle.tar
      version.json
    sboms/
      <sbom-id>.spdx.json
      <sbom-id>.cdx.json
    vex/
      <vex-id>.openvex.json
    config/
      lattice.json
      feature-flags.json
    seeds/
      random-seeds.json
      clock-offsets.json
  outputbundle.tar.zst
    verdicts/
      <verdict-id>.json
    findings/
      <finding-id>.json
    merkle/
      verdict-tree.json
      finding-tree.json
    logs/
      replay.log
      trace.json
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Core Schema Definitions
|
||||
|
||||
### 3.1 ReplayProof
|
||||
|
||||
The primary proof artifact - a compact summary suitable for verification:
|
||||
|
||||
```csharp
public sealed record ReplayProof
{
    // Identity
    public required Guid ProofId { get; init; }
    public required Guid RunId { get; init; }
    public required string Subject { get; init; }  // Image digest or SBOM ID

    // Input digest
    public required KnowledgeSnapshotDigest InputDigest { get; init; }

    // Output digest
    public required VerdictDigest OutputDigest { get; init; }

    // Execution metadata
    public required ExecutionMetadata Execution { get; init; }

    // CAS references
    public required BundleReferences Bundles { get; init; }

    // Signature
    public required DateTimeOffset SignedAt { get; init; }
    public required string SignedBy { get; init; }
}
```
|
||||
|
||||
### 3.2 KnowledgeSnapshotDigest
|
||||
|
||||
Cryptographic digest of all inputs:
|
||||
|
||||
```csharp
public sealed record KnowledgeSnapshotDigest
{
    // Component digests
    public required string SbomsDigest { get; init; }    // SHA-256 of sorted SBOM hashes
    public required string VexDigest { get; init; }      // SHA-256 of sorted VEX hashes
    public required string FeedsDigest { get; init; }    // SHA-256 of feed version manifest
    public required string PolicyDigest { get; init; }   // SHA-256 of policy bundle
    public required string LatticeDigest { get; init; }  // SHA-256 of lattice config
    public required string SeedsDigest { get; init; }    // SHA-256 of random seeds

    // Combined root
    public required string RootDigest { get; init; }     // SHA-256 of all component digests

    // Counts for quick comparison
    public required int SbomCount { get; init; }
    public required int VexCount { get; init; }
    public required int FeedCount { get; init; }
}
```
|
||||
|
||||
### 3.3 VerdictDigest
|
||||
|
||||
Cryptographic digest of all outputs:
|
||||
|
||||
```csharp
public sealed record VerdictDigest
{
    public required string VerdictMerkleRoot { get; init; }  // Merkle root of verdicts
    public required string FindingMerkleRoot { get; init; }  // Merkle root of findings
    public required int VerdictCount { get; init; }
    public required int FindingCount { get; init; }
    public required VerdictSummary Summary { get; init; }
}

public sealed record VerdictSummary
{
    public required int Critical { get; init; }
    public required int High { get; init; }
    public required int Medium { get; init; }
    public required int Low { get; init; }
    public required int Informational { get; init; }
    public required int Suppressed { get; init; }
    public required int Total { get; init; }
}
```
|
||||
|
||||
### 3.4 ExecutionMetadata
|
||||
|
||||
Execution environment and timing:
|
||||
|
||||
```csharp
public sealed record ExecutionMetadata
{
    // Timing
    public required DateTimeOffset StartedAt { get; init; }
    public required DateTimeOffset CompletedAt { get; init; }
    public required long DurationMs { get; init; }

    // Engine version
    public required EngineVersion Engine { get; init; }

    // Environment
    public required string HostId { get; init; }
    public required string RuntimeVersion { get; init; }  // e.g., ".NET 10.0.0"
    public required string Platform { get; init; }        // e.g., "linux-x64"

    // Determinism markers
    public required bool DeterministicMode { get; init; }
    public required string ClockMode { get; init; }       // "frozen", "simulated", "real"
    public required string RandomMode { get; init; }      // "seeded", "recorded", "real"
}

public sealed record EngineVersion
{
    public required string Name { get; init; }          // e.g., "PolicyEngine"
    public required string Version { get; init; }       // e.g., "2.1.0"
    public required string SourceDigest { get; init; }  // SHA-256 of engine source/binary
}
```
|
||||
|
||||
### 3.5 BundleReferences
|
||||
|
||||
CAS URIs to full bundles:
|
||||
|
||||
```csharp
public sealed record BundleReferences
{
    public required string ManifestUri { get; init; }         // cas://replay/<run-id>/manifest.json
    public required string InputBundleUri { get; init; }      // cas://replay/<run-id>/inputbundle.tar.zst
    public required string OutputBundleUri { get; init; }     // cas://replay/<run-id>/outputbundle.tar.zst
    public required string ManifestDigest { get; init; }      // SHA-256 of manifest.json
    public required string InputBundleDigest { get; init; }   // SHA-256 of inputbundle.tar.zst
    public required string OutputBundleDigest { get; init; }  // SHA-256 of outputbundle.tar.zst
    public required long InputBundleSize { get; init; }
    public required long OutputBundleSize { get; init; }
}
```
|
||||
|
||||
---
|
||||
|
||||
## 4. DSSE Envelope
|
||||
|
||||
Replay proofs are wrapped in DSSE envelopes for cryptographic binding:
|
||||
|
||||
```json
{
  "payloadType": "application/vnd.stellaops.replay-proof.v1+json",
  "payload": "<base64url-encoded canonical JSON>",
  "signatures": [
    {
      "keyid": "sha256:abc123...",
      "sig": "<base64url-encoded signature>"
    }
  ]
}
```
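
In DSSE, the signature is computed over the Pre-Authentication Encoding (PAE) of the payload type and the raw payload bytes, not over the JSON envelope itself. The following Python sketch shows that encoding and how a verifier might recover the payload bytes from an envelope shaped like the one above. The helper names `pae` and `decode_payload` are illustrative and are not part of any StellaOps library; signature verification itself (key lookup, algorithm) is out of scope here.

```python
import base64


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding:
    "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
    Lengths are ASCII decimal byte counts.
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def decode_payload(envelope: dict) -> bytes:
    """Decode the base64url payload, tolerating stripped '=' padding."""
    data = envelope["payload"]
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))


def bytes_to_verify(envelope: dict) -> bytes:
    """Return the exact byte string a DSSE signature should cover."""
    return pae(envelope["payloadType"], decode_payload(envelope))
```

Because the payload type is bound into the signed bytes, a signature made for one payload type cannot be replayed against an envelope that claims another.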
|
||||
|
||||
### 4.1 Payload Type URI
|
||||
|
||||
- **v1**: `application/vnd.stellaops.replay-proof.v1+json`
|
||||
- **in-toto compatible**: `https://stellaops.io/ReplayProof/v1`
|
||||
|
||||
### 4.2 Canonical JSON Encoding
|
||||
|
||||
Payloads MUST be encoded using RFC 8785 canonical JSON:
|
||||
|
||||
1. Keys sorted lexicographically by UTF-16 code units
2. No whitespace between structural characters
3. No trailing commas
4. Numbers without unnecessary decimal points or exponents
5. Strings with minimal escaping (only required characters)
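
The rules above can be sketched as a small serializer. This is a simplified Python illustration, not a complete RFC 8785 implementation: it handles objects, arrays, strings, integers, booleans and null, and deliberately rejects floating-point values because RFC 8785 requires ECMAScript-style number formatting that this sketch does not attempt. Key ordering follows RFC 8785, which compares property names as UTF-16 code units. The function name `canonicalize` is an assumption.

```python
import json


def canonicalize(value) -> str:
    """Serialize a JSON-like value deterministically (simplified RFC 8785)."""
    if value is None or isinstance(value, (bool, str)):
        # json.dumps emits null/true/false and minimally escaped strings.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        raise TypeError("floats need RFC 8785 number formatting; not covered here")
    if isinstance(value, list):
        return "[" + ",".join(canonicalize(item) for item in value) + "]"
    if isinstance(value, dict):
        # Big-endian UTF-16 bytes compare in the same order as UTF-16 code units.
        keys = sorted(value, key=lambda k: k.encode("utf-16-be"))
        members = (
            json.dumps(k, ensure_ascii=False) + ":" + canonicalize(value[k])
            for k in keys
        )
        return "{" + ",".join(members) + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Two producers that serialize the same logical proof this way emit identical bytes, which is what makes the signed payload and its digests reproducible.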
|
||||
|
||||
---
|
||||
|
||||
## 5. Full Manifest Schema
|
||||
|
||||
The `manifest.json` file contains the complete proof plus additional metadata:
|
||||
|
||||
```json
{
  "_type": "https://stellaops.io/ReplayManifest/v1",
  "proofId": "550e8400-e29b-41d4-a716-446655440000",
  "runId": "660e8400-e29b-41d4-a716-446655440001",
  "subject": "sha256:abc123def456...",
  "tenant": "acme-corp",
  "inputDigest": {
    "sbomsDigest": "sha256:111...",
    "vexDigest": "sha256:222...",
    "feedsDigest": "sha256:333...",
    "policyDigest": "sha256:444...",
    "latticeDigest": "sha256:555...",
    "seedsDigest": "sha256:666...",
    "rootDigest": "sha256:aaa...",
    "sbomCount": 1,
    "vexCount": 5,
    "feedCount": 3
  },
  "outputDigest": {
    "verdictMerkleRoot": "sha256:bbb...",
    "findingMerkleRoot": "sha256:ccc...",
    "verdictCount": 42,
    "findingCount": 156,
    "summary": {
      "critical": 2,
      "high": 8,
      "medium": 25,
      "low": 12,
      "informational": 3,
      "suppressed": 106,
      "total": 156
    }
  },
  "execution": {
    "startedAt": "2026-01-05T10:00:00.000Z",
    "completedAt": "2026-01-05T10:00:05.123Z",
    "durationMs": 5123,
    "engine": {
      "name": "PolicyEngine",
      "version": "2.1.0",
      "sourceDigest": "sha256:engine123..."
    },
    "hostId": "scanner-worker-01",
    "runtimeVersion": ".NET 10.0.0",
    "platform": "linux-x64",
    "deterministicMode": true,
    "clockMode": "frozen",
    "randomMode": "seeded"
  },
  "bundles": {
    "manifestUri": "cas://replay/660e8400.../manifest.json",
    "inputBundleUri": "cas://replay/660e8400.../inputbundle.tar.zst",
    "outputBundleUri": "cas://replay/660e8400.../outputbundle.tar.zst",
    "manifestDigest": "sha256:manifest...",
    "inputBundleDigest": "sha256:input...",
    "outputBundleDigest": "sha256:output...",
    "inputBundleSize": 10485760,
    "outputBundleSize": 2097152
  },
  "signedAt": "2026-01-05T10:00:06.000Z",
  "signedBy": "scanner-worker-01"
}
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Verification Protocol
|
||||
|
||||
### 6.1 Quick Verification (Proof Only)
|
||||
|
||||
Verify the DSSE signature and check digest consistency:
|
||||
|
||||
```csharp
public async Task<VerificationResult> VerifyProofAsync(
    ReplayProof proof,
    DsseEnvelope envelope,
    CancellationToken ct)
{
    // 1. Verify DSSE signature
    var sigValid = await _dsseVerifier.VerifyAsync(envelope, ct);
    if (!sigValid)
        return VerificationResult.Failed("DSSE signature invalid");

    // 2. Verify input digest consistency
    var inputRoot = ComputeInputRoot(
        proof.InputDigest.SbomsDigest,
        proof.InputDigest.VexDigest,
        proof.InputDigest.FeedsDigest,
        proof.InputDigest.PolicyDigest,
        proof.InputDigest.LatticeDigest,
        proof.InputDigest.SeedsDigest);

    if (inputRoot != proof.InputDigest.RootDigest)
        return VerificationResult.Failed("Input root digest mismatch");

    return VerificationResult.Passed();
}
```
|
||||
|
||||
### 6.2 Full Verification (With Replay)
|
||||
|
||||
Download bundles and re-execute to verify determinism:
|
||||
|
||||
```csharp
public async Task<VerificationResult> VerifyWithReplayAsync(
    ReplayProof proof,
    DsseEnvelope envelope,
    CancellationToken ct)
{
    // 1. Quick verification first (signature + input digest consistency)
    var quickResult = await VerifyProofAsync(proof, envelope, ct);
    if (!quickResult.Passed)
        return quickResult;

    // 2. Download bundles from CAS
    var inputBundle = await _cas.DownloadAsync(proof.Bundles.InputBundleUri, ct);
    var outputBundle = await _cas.DownloadAsync(proof.Bundles.OutputBundleUri, ct);

    // 3. Verify bundle digests
    if (ComputeDigest(inputBundle) != proof.Bundles.InputBundleDigest)
        return VerificationResult.Failed("Input bundle digest mismatch");
    if (ComputeDigest(outputBundle) != proof.Bundles.OutputBundleDigest)
        return VerificationResult.Failed("Output bundle digest mismatch");

    // 4. Extract and verify individual input digests
    var inputs = await ExtractInputsAsync(inputBundle, ct);
    var computedInputDigest = ComputeKnowledgeDigest(inputs);
    if (computedInputDigest.RootDigest != proof.InputDigest.RootDigest)
        return VerificationResult.Failed("Computed input digest mismatch");

    // 5. Re-execute policy evaluation
    var replayResult = await _replayEngine.ExecuteAsync(inputs, ct);

    // 6. Compare output digests
    var computedOutputDigest = ComputeVerdictDigest(replayResult);
    if (computedOutputDigest.VerdictMerkleRoot != proof.OutputDigest.VerdictMerkleRoot)
        return VerificationResult.Failed("Verdict Merkle root mismatch - non-deterministic!");

    if (computedOutputDigest.FindingMerkleRoot != proof.OutputDigest.FindingMerkleRoot)
        return VerificationResult.Failed("Finding Merkle root mismatch - non-deterministic!");

    return VerificationResult.Passed();
}
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Digest Computation
|
||||
|
||||
### 7.1 Input Root Digest
|
||||
|
||||
```csharp
public string ComputeInputRoot(
    string sbomsDigest,
    string vexDigest,
    string feedsDigest,
    string policyDigest,
    string latticeDigest,
    string seedsDigest)
{
    // Concatenate in fixed order with separators
    var combined = string.Join("|",
        sbomsDigest,
        vexDigest,
        feedsDigest,
        policyDigest,
        latticeDigest,
        seedsDigest);

    return ComputeSha256(combined);
}
```
|
||||
|
||||
### 7.2 SBOM Collection Digest
|
||||
|
||||
```csharp
public string ComputeSbomsDigest(IEnumerable<SbomRef> sboms)
{
    // Sort by ID for determinism
    var sorted = sboms.OrderBy(s => s.SbomId, StringComparer.Ordinal);

    // Concatenate hashes
    var combined = string.Join("|", sorted.Select(s => s.ContentHash));

    return ComputeSha256(combined);
}
```
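
The two digest functions above both rely on a `ComputeSha256` helper that this document does not define. The Python sketch below mirrors their logic under stated assumptions: the joined string is hashed as UTF-8 bytes, and the result is lowercase hex with a `sha256:` prefix. The names `sha256_of_text`, `sboms_digest` and `input_root` are illustrative. Python sorts strings by code point whereas `StringComparer.Ordinal` compares UTF-16 code units; the two orders agree for ASCII identifiers.

```python
import hashlib


def sha256_of_text(text: str) -> str:
    """Assumed stand-in for ComputeSha256: UTF-8 bytes in, prefixed hex out."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def sboms_digest(sboms: list[tuple[str, str]]) -> str:
    """Digest a collection of (sbom_id, content_hash) pairs, order-independent."""
    ordered = sorted(sboms, key=lambda pair: pair[0])
    return sha256_of_text("|".join(content_hash for _, content_hash in ordered))


def input_root(sboms: str, vex: str, feeds: str,
               policy: str, lattice: str, seeds: str) -> str:
    """Combine the six component digests in the fixed order used in 7.1."""
    return sha256_of_text("|".join([sboms, vex, feeds, policy, lattice, seeds]))
```

Sorting before joining is what makes the collection digest independent of the order in which SBOMs were discovered; the root, by contrast, depends on the fixed argument order.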
|
||||
|
||||
### 7.3 Verdict Merkle Root
|
||||
|
||||
```csharp
public string ComputeVerdictMerkleRoot(IEnumerable<Verdict> verdicts)
{
    // Sort by verdict ID for determinism
    var sorted = verdicts.OrderBy(v => v.VerdictId, StringComparer.Ordinal);

    // Compute leaf hashes
    var leaves = sorted.Select(v => ComputeVerdictLeafHash(v)).ToArray();

    // Build Merkle tree
    return MerkleTreeBuilder.ComputeRoot(leaves);
}

private string ComputeVerdictLeafHash(Verdict verdict)
{
    var canonical = CanonicalJsonSerializer.Serialize(verdict);
    return ComputeSha256(canonical);
}
```
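
`MerkleTreeBuilder.ComputeRoot` is referenced above but its construction is not specified in this document. The Python sketch below shows one common binary Merkle construction so the idea is concrete. It rests on several assumptions that may differ from the real builder: parent nodes are SHA-256 over the concatenated raw child hashes, an odd node at any level is paired with itself, and an empty leaf set hashes to SHA-256 of the empty string.

```python
import hashlib


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold already-sorted leaf hashes into a single root hash."""
    if not leaves:
        return hashlib.sha256(b"").digest()  # assumed convention for "no verdicts"
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed: duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Whatever the exact pairing rule, it has to be fixed and documented alongside the leaf ordering, because any difference changes the root and would surface as a false non-determinism failure during replay.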
|
||||
|
||||
---
|
||||
|
||||
## 8. Database Schema
|
||||
|
||||
```sql
-- Replay proof storage
CREATE TABLE replay_proofs (
    proof_id            UUID PRIMARY KEY,
    run_id              UUID NOT NULL,
    tenant              TEXT NOT NULL,
    subject             TEXT NOT NULL,
    input_root_digest   TEXT NOT NULL,
    output_verdict_root TEXT NOT NULL,
    output_finding_root TEXT NOT NULL,
    execution_json      JSONB NOT NULL,
    bundles_json        JSONB NOT NULL,
    dsse_envelope       JSONB NOT NULL,
    signed_at           TIMESTAMPTZ NOT NULL,
    signed_by           TEXT NOT NULL,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT NOW(),

    CONSTRAINT uq_replay_run UNIQUE (run_id)
);

CREATE INDEX ix_replay_proofs_tenant ON replay_proofs (tenant, created_at DESC);
CREATE INDEX ix_replay_proofs_subject ON replay_proofs (subject);
CREATE INDEX ix_replay_proofs_input ON replay_proofs (input_root_digest);

-- Replay verification log
CREATE TABLE replay_verifications (
    verification_id   UUID PRIMARY KEY,
    proof_id          UUID NOT NULL,
    tenant            TEXT NOT NULL,
    verification_type TEXT NOT NULL,  -- 'quick', 'full'
    passed            BOOLEAN NOT NULL,
    failure_reason    TEXT,
    duration_ms       BIGINT NOT NULL,
    verified_at       TIMESTAMPTZ NOT NULL,
    verified_by       TEXT NOT NULL,

    -- Single named foreign key (an inline REFERENCES on proof_id would duplicate it)
    CONSTRAINT fk_proof FOREIGN KEY (proof_id) REFERENCES replay_proofs(proof_id)
);

CREATE INDEX ix_replay_verifications_proof ON replay_verifications (proof_id);
```
|
||||
|
||||
---
|
||||
|
||||
## 9. CLI Integration
|
||||
|
||||
```bash
# Verify a replay proof (quick - signature and digest consistency only)
stella verify --proof proof.json

# Verify with full replay
stella verify --proof proof.json --replay

# Verify from CAS URI
stella verify --bundle cas://replay/660e8400.../manifest.json

# Export proof for audit
stella replay export --run-id 660e8400-... --output proof.json

# List proofs for an image
stella replay list --subject sha256:abc123...

# Diff two replay results
stella replay diff --run-id-a 660e8400... --run-id-b 770e8400...
```
|
||||
|
||||
---
|
||||
|
||||
## 10. API Endpoints
|
||||
|
||||
```http
# Get proof by run ID
GET /api/v1/replay/{runId}/proof
Response: ReplayProof (JSON)

# Verify proof
POST /api/v1/replay/{runId}/verify
Request: { "type": "quick" | "full" }
Response: VerificationResult

# List proofs for subject
GET /api/v1/replay/proofs?subject={digest}&tenant={tenant}
Response: ReplayProofSummary[]

# Download bundle
GET /api/v1/replay/{runId}/bundles/{type}
Response: Binary stream (tar.zst)

# Compare two runs
GET /api/v1/replay/diff?runIdA={id}&runIdB={id}
Response: ReplayDiffResult
```
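
As a rough illustration of calling the verify endpoint listed above, the Python sketch below only builds the HTTP request and does not send it. The base URL is a placeholder, the function name `build_verify_request` is hypothetical, and authentication headers are omitted because this document does not describe them.

```python
import json
from urllib.request import Request


def build_verify_request(base_url: str, run_id: str,
                         verification_type: str = "quick") -> Request:
    """Build POST /api/v1/replay/{runId}/verify with a {"type": ...} body."""
    if verification_type not in ("quick", "full"):
        raise ValueError("verification_type must be 'quick' or 'full'")
    body = json.dumps({"type": verification_type}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/v1/replay/{run_id}/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A caller would pass the result to `urllib.request.urlopen` (or an equivalent HTTP client) and parse the `VerificationResult` JSON from the response.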
|
||||
|
||||
---
|
||||
|
||||
## 11. Error Codes
|
||||
|
||||
| Code | Description |
|------|-------------|
| `REPLAY_001` | Proof not found |
| `REPLAY_002` | DSSE signature verification failed |
| `REPLAY_003` | Input digest mismatch |
| `REPLAY_004` | Output digest mismatch (non-deterministic) |
| `REPLAY_005` | Bundle not found in CAS |
| `REPLAY_006` | Bundle digest mismatch |
| `REPLAY_007` | Engine version mismatch |
| `REPLAY_008` | Replay execution failed |
| `REPLAY_009` | Insufficient permissions |
| `REPLAY_010` | Bundle format invalid |
|
||||
|
||||
---
|
||||
|
||||
## 12. Migration from v0
|
||||
|
||||
If upgrading from pre-v1 replay bundles:
|
||||
|
||||
1. **Schema migration**: Run `migrate-replay-schema.sql`
2. **Re-sign existing proofs**: Use `stella replay migrate --sign` to add DSSE envelopes
3. **Verify migration**: Run `stella replay verify --all` to check integrity
4. **Update consumers**: Point to new `/api/v1/replay` endpoints
|
||||
|
||||
---
|
||||
|
||||
## 13. Security Considerations
|
||||
|
||||
1. **Key Management**: Signing keys managed by Authority service with rotation support
2. **Tenant Isolation**: Proofs scoped to tenants; cross-tenant access prohibited
3. **Integrity**: All digests use SHA-256; Merkle proofs enable partial verification
4. **Immutability**: Proofs cannot be modified once signed
5. **Audit**: All verification attempts logged with correlation IDs
6. **Air-gap**: Proofs and bundles can be exported for offline verification
|
||||
|
||||
---
|
||||
|
||||
## 14. References
|
||||
|
||||
- [DSSE Specification](https://github.com/secure-systems-lab/dsse)
|
||||
- [RFC 8785 - JSON Canonicalization](https://tools.ietf.org/html/rfc8785)
|
||||
- [in-toto Attestation Framework](https://github.com/in-toto/attestation)
|
||||
- [SLSA Provenance](https://slsa.dev/provenance)
|
||||
- [Platform Architecture](../platform/architecture-overview.md)
|
||||
- [Facet Sealing Architecture](../facet/architecture.md)
|
||||
|
||||
---
|
||||
|
||||
*Last updated: 2026-01-05*
|
||||
@@ -320,4 +320,4 @@ When schemas/adapters change:
|
||||
- Sprint: `docs/implplan/SPRINT_0186_0001_0001_record_deterministic_execution.md` (SC10)
|
||||
- Roadmap: `docs/modules/scanner/design/standards-convergence-roadmap.md` (SC1)
|
||||
- Governance: `docs/modules/scanner/design/schema-governance.md` (SC9)
|
||||
- Offline Operation: `docs/24_OFFLINE_KIT.md`
|
||||
- Offline Operation: `docs/OFFLINE_KIT.md`
|
||||
|
||||
@@ -277,4 +277,4 @@ Stripped binaries may lack Build-IDs. Options:
|
||||
- [BinaryIndex Architecture](../../binaryindex/architecture.md)
|
||||
- [Scanner Architecture](../architecture.md)
|
||||
- [Proof Chain Specification](../../attestor/proof-chain-specification.md)
|
||||
- [CLI Reference](../../../09_API_CLI_REFERENCE.md)
|
||||
- [CLI Reference](../../../API_CLI_REFERENCE.md)
|
||||
|
||||
@@ -411,4 +411,4 @@ var payload = await _payloadStore.GetAsync(artifact.Uri, ct);
|
||||
- [Surface.FS Design](../design/surface-fs.md)
|
||||
- [Surface.Env Design](../design/surface-env.md)
|
||||
- [Surface.Validation Guide](./surface-validation-extensibility.md)
|
||||
- [Offline Kit Documentation](../../../../24_OFFLINE_KIT.md)
|
||||
- [Offline Kit Documentation](../../../../OFFLINE_KIT.md)
|
||||
|
||||
@@ -23,7 +23,7 @@
|
||||
| Rekor v2 (managed or self-hosted) | Transparency log providing UUIDs + inclusion proofs. | `docs/ops/rekor/README.md` (if self-hosted) |
|
||||
| `StellaOps.Scanner` (WebService/Worker) | Requests attestations per scan, stores Rekor metadata next to SBOM artefacts. | `docs/modules/scanner/architecture.md` |
|
||||
| Export Center | Packages DSSE payloads + proofs into Offline Kit bundles and mirrors license notices. | `docs/modules/export-center/architecture.md` |
|
||||
| Policy Engine + CLI | Enforce “attested only” promotion, expose CLI verification verbs. | `docs/modules/policy/architecture.md`, `docs/09_API_CLI_REFERENCE.md` |
|
||||
| Policy Engine + CLI | Enforce “attested only” promotion, expose CLI verification verbs. | `docs/modules/policy/architecture.md`, `docs/API_CLI_REFERENCE.md` |
|
||||
|
||||
---
|
||||
|
||||
@@ -210,4 +210,4 @@ stellaops-cli attest verify --envelope artifacts/scan123/attest/sbom.dsse.json \
|
||||
- Scanner architecture (§Signer → Attestor → Rekor): `docs/modules/scanner/architecture.md`
|
||||
- Export Center profiles: `docs/modules/export-center/architecture.md`
|
||||
- Policy Engine predicates: `docs/modules/policy/architecture.md`
|
||||
- CLI reference: `docs/09_API_CLI_REFERENCE.md`
|
||||
- CLI reference: `docs/API_CLI_REFERENCE.md`
|
||||
|
||||
@@ -371,5 +371,5 @@ The bundle was created without the `--sign` flag. Either:
|
||||
- `docs/modules/policy/secret-leak-detection-readiness.md`
|
||||
- `docs/benchmarks/scanner/deep-dives/secrets.md`
|
||||
- `docs/modules/scanner/design/surface-secrets.md`
|
||||
- `docs/07_HIGH_LEVEL_ARCHITECTURE.md` - Runtime inventory (Scanner)
|
||||
- `docs/ARCHITECTURE_OVERVIEW.md` - Runtime inventory (Scanner)
|
||||
- [Secrets Bundle Rotation](./secrets-bundle-rotation.md)
|
||||
|
||||
@@ -39,7 +39,7 @@ Key features:
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- API Reference: `../../09_API_CLI_REFERENCE.md`
|
||||
- API Reference: `../../API_CLI_REFERENCE.md`
|
||||
- OpenAPI Specs: `../../api/` (if exists)
|
||||
- CLI: `../cli/`
|
||||
- Gateway: `../gateway/`
|
||||
|
||||
@@ -51,7 +51,7 @@ Key settings:
- Architecture: `./architecture.md`
- Policy Engine: `../policy/`
- VexLens: `../vex-lens/`
- High-Level Architecture: `../../07_HIGH_LEVEL_ARCHITECTURE.md`
- High-Level Architecture: `../../ARCHITECTURE_OVERVIEW.md`

## Current Status

@@ -44,7 +44,7 @@ Snapshot functionality is implemented across multiple modules:
- AirGap: `../airgap/`
- ExportCenter: `../export-center/`
- Replay: `../replay/` (if exists)
- Offline Kit: `../../24_OFFLINE_KIT.md`
- Offline Kit: `../../OFFLINE_KIT.md`

## Current Status

@@ -32,7 +32,7 @@ The Console presents operator dashboards for scans, policies, VEX evidence, runt
- Auth smoke tests in `operations/auth-smoke.md`.
- Observability runbook + dashboard stub in `operations/observability.md` and `operations/dashboards/console-ui-observability.json` (offline import).
- Console architecture doc for layout and SSE fan-out.
- Operator guide: `../../15_UI_GUIDE.md`. Accessibility: `../../accessibility.md`. Security: `../../security/`.
- Operator guide: `../../UI_GUIDE.md`. Accessibility: `../../accessibility.md`. Security: `../../security/`.

## Related resources
- ./operations/auth-smoke.md
@@ -4,7 +4,7 @@

> **Ownership:** Console Guild • Docs Guild
> **Delivery scope:** `StellaOps.Web` Angular workspace, Console Web Gateway routes (`/console/*`), Downloads manifest surfacing, SSE fan-out for Scheduler & telemetry.
> **Related docs:** [Console operator guide](../../15_UI_GUIDE.md), [Admin workflows](../../console/admin-tenants.md), [Air-gap workflows](../../console/airgap.md), [Console security posture](../../security/console-security.md), [Console observability](../../console/observability.md), [UI telemetry](../../observability/ui-telemetry.md), [Deployment guide](../../deploy/console.md)
> **Related docs:** [Console operator guide](../../UI_GUIDE.md), [Admin workflows](../../console/admin-tenants.md), [Air-gap workflows](../../console/airgap.md), [Console security posture](../../security/console-security.md), [Console observability](../../console/observability.md), [UI telemetry](../../observability/ui-telemetry.md), [Deployment guide](../../deploy/console.md)

This dossier describes the end-to-end architecture of the StellaOps Console as delivered in Sprint 23. It covers the Angular workspace layout, API/gateway integration points, live-update channels, performance budgets, offline workflows, and observability hooks needed to keep the console deterministic and air-gap friendly.

@@ -414,6 +414,6 @@ Deep-dive into the cryptographic attestation chain, showing DSSE envelopes and R
## References

- `docs/db/SPECIFICATION.md` Section 5.6-5.8 — Schema definitions
- `docs/24_OFFLINE_KIT.md` Section 2.2 — Proof replay workflow
- `docs/OFFLINE_KIT.md` Section 2.2 — Proof replay workflow
- `SPRINT_3500_0001_0001_deeper_moat_master.md` — Feature requirements
- `docs/modules/ui/architecture.md` — Console architecture