# Resolved Execution Graph (REG) Evidence Architecture
> **Status:** Proposed
> **Sprint Series:** 8100.0012.*
> **Last Updated:** 2025-12-24
## Overview
This document describes the architectural enhancements to StellaOps' evidence and attestation subsystems based on the **Merkle-Hash REG** (Resolved Execution Graph) pattern. The core insight: when every node in an execution graph is identified by a **Merkle hash of its normalized content**, evidence can be attached to the *content itself* rather than to brittle positional indices.
## Motivation
### Problem Statement
StellaOps currently has robust foundations for content-addressed identifiers, Merkle trees, and attestations. However, three gaps limit the system's verifiability:
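1. **Canonicalizer versioning:** Hashes carry no canonicalization version; if the canonicalization logic changes, identical content can hash differently and hashes produced by different versions cannot be told apart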
2. **Fragmented evidence models:** Different modules use different evidence structures
3. **Implicit graph roots:** Merkle roots are computed but not independently attested
### Target Benefits
| Benefit | Description |
|---------|-------------|
| **Reproducible proofs** | Verifiers re-hash node content and check against stored hashes—no fragile indices |
| **Dedup & reuse** | Identical content across pipelines collapses to one ID; one piece of evidence justifies many occurrences |
| **Audits + time travel** | Snapshot the graph's Merkle root; any replay matching the root guarantees identical nodes & evidence |
| **Offline verification** | Attestations are self-contained; no network required to verify |
| **Exception stability** | Exceptions bind to content hashes, not "row 42"; stable across rebuilds |
## Architecture
### Component Diagram
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         StellaOps REG Architecture                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     │
│  │ StellaOps.       │     │ StellaOps.       │     │ StellaOps.       │     │
│  │ Canonical.Json   │────▶│ Evidence.Core    │◀────│ Attestor.        │     │
│  │                  │     │                  │     │ GraphRoot        │     │
│  │ • CanonVersion   │     │ • IEvidence      │     │                  │     │
│  │ • CanonJson      │     │ • EvidenceRecord │     │ • IGraphRoot-    │     │
│  │ • Versioned      │     │ • IEvidenceStore │     │   Attestor       │     │
│  │   Hashing        │     │ • Adapters       │     │ • Verification   │     │
│  └──────────────────┘     └──────────────────┘     └──────────────────┘     │
│            │                        │                        │              │
│            └────────────────────────┼────────────────────────┘              │
│                                     │                                       │
│                                     ▼                                       │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                       Content-Addressed Store                        │   │
│  │                                                                      │   │
│  │  Evidence by subject_node_id      │  Graph roots by root_hash        │   │
│  │  ────────────────────────────     │  ──────────────────────────      │   │
│  │  sha256:abc... → [evidence]       │  sha256:xyz... → attestation     │   │
│  │  sha256:def... → [evidence]       │  sha256:uvw... → attestation     │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     │
│  │ Scanner          │     │ Policy           │     │ Excititor        │     │
│  │ ──────────────   │     │ ──────────────   │     │ ──────────────   │     │
│  │ • EvidenceBundle │────▶│ • Exception      │────▶│ • VexObservation │     │
│  │ • ProofSpine     │     │   Application    │     │ • VexLinkset     │     │
│  │ • RichGraph      │     │ • PolicyVerdict  │     │                  │     │
│  └──────────────────┘     └──────────────────┘     └──────────────────┘     │
│            │                        │                        │              │
│            └────────────────────────┼────────────────────────┘              │
│                                     │                                       │
│                                     ▼                                       │
│                           ┌──────────────────┐                              │
│                           │ Unified IEvidence│                              │
│                           │   via Adapters   │                              │
│                           └──────────────────┘                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
### Data Flow
```
1. Content Creation
   ─────────────────
   Component/Node → Canonicalize(content, version) → SHA-256 → node_id

2. Evidence Attachment
   ────────────────────
   Analysis Result → IEvidence { subject_node_id, type, payload, provenance }
                   → EvidenceId = hash(subject + type + payload + provenance)
                   → Store(evidence)

3. Graph Construction
   ───────────────────
   Nodes + Edges → Sort(node_ids) + Sort(edge_ids) → Merkle Tree → root_hash

4. Graph Attestation
   ──────────────────
   GraphRootAttestationRequest → GraphRootAttestor
                               → In-Toto Statement { subject: [root, artifact] }
                               → DSSE Sign
                               → Optional: Rekor Publish

5. Verification
   ─────────────
   Download attestation → Verify signature
                        → Fetch nodes/edges by ID
                        → Recompute Merkle root
                        → Compare with attested root
```
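
Step 3 of the flow can be sketched in TypeScript as follows. This is a minimal illustration, not the StellaOps implementation: the document does not specify the tree construction, so the leaf and parent prefixes, the promotion of an unpaired node, and the choice to hash sorted node IDs followed by sorted edge IDs are all assumptions. `merkleRoot` and `graphRootHash` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hash raw bytes with SHA-256 and return the digest bytes.
function sha256(data: Buffer): Buffer {
  return createHash("sha256").update(data).digest();
}

// Assumed construction (not taken from the StellaOps source):
//   leaf   = SHA-256(0x00 || utf8(id))
//   parent = SHA-256(0x01 || left || right)
// An unpaired node at the end of a level is promoted unchanged.
function merkleRoot(ids: string[]): string {
  if (ids.length === 0) {
    throw new Error("cannot compute a Merkle root over zero IDs");
  }
  let level: Buffer[] = ids.map((id) =>
    sha256(Buffer.concat([Buffer.from([0x00]), Buffer.from(id, "utf8")]))
  );
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      if (i + 1 < level.length) {
        next.push(
          sha256(Buffer.concat([Buffer.from([0x01]), level[i], level[i + 1]]))
        );
      } else {
        next.push(level[i]);
      }
    }
    level = next;
  }
  return "sha256:" + level[0].toString("hex");
}

// Sort node and edge IDs independently, then hash nodes followed by edges.
function graphRootHash(nodeIds: string[], edgeIds: string[]): string {
  const sortedNodes = [...nodeIds].sort();
  const sortedEdges = [...edgeIds].sort();
  return merkleRoot([...sortedNodes, ...sortedEdges]);
}
```

Because both ID lists are sorted before hashing, the root in this sketch does not depend on the order in which nodes and edges were discovered.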
## Implementation Sprints
| Sprint | Title | Dependency | Key Deliverables |
|--------|-------|------------|------------------|
| 8100.0012.0001 | Canonicalizer Versioning | None | `CanonVersion`, `CanonicalizeVersioned()`, backward compatibility |
| 8100.0012.0002 | Unified Evidence Model | 0001 | `IEvidence`, `EvidenceRecord`, `IEvidenceStore`, adapters |
| 8100.0012.0003 | Graph Root Attestation | 0001, 0002 | `IGraphRootAttestor`, in-toto statements, Rekor integration |
### Sprint Sequence Diagram
```
Week 1-2          Week 3-4          Week 5-6          Week 7-8
────────          ────────          ────────          ────────
   │                 │                 │                 │
   │  0001: Canon    │                 │                 │
   │  Versioning     │                 │                 │
   │  ┌─────────┐    │                 │                 │
   │  │Wave 0-1 │────┼─▶ 0002: Unified │                 │
   │  │Wave 2-3 │    │   Evidence      │                 │
   │  │Wave 4   │    │   ┌─────────┐   │                 │
   │  └─────────┘    │   │Wave 0-1 │───┼─▶ 0003: Graph   │
   │                 │   │Wave 2-3 │   │   Root Attest   │
   │                 │   │Wave 4   │   │   ┌─────────┐   │
   │                 │   └─────────┘   │   │Wave 0-1 │   │
   │                 │                 │   │Wave 2-4 │   │
   │                 │                 │   │Wave 5   │   │
   │                 │                 │   └─────────┘   │
   ▼                 ▼                 ▼                 ▼
```
## Technical Specifications
### Canonicalization Version Marker
```json
{
  "_canonVersion": "stella:canon:v1",
  "evidenceSource": "stellaops/scanner/reachability",
  "sbomEntryId": "sha256:abc123...:pkg:npm/lodash@4.17.21",
  ...
}
```
The `_canonVersion` field:
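- Uses an underscore prefix so that it sorts ahead of keys starting with a lowercase ASCII letter (`_` is 0x5F, `a` is 0x61); keys starting with a digit or an uppercase letter would still sort before it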
- Identifies the exact canonicalization algorithm
- Enables graceful version migration
- Is included in all content-addressed hash computations
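
A minimal TypeScript sketch of how the marker could take part in hashing is shown below. The `canonicalize` function is a simplified stand-in for RFC 8785 (sorted keys, no whitespace; edge cases such as non-finite numbers are not handled), and `hashVersioned` is a hypothetical name, not the `CanonicalizeVersioned()` API listed in the sprint table.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified stand-in for RFC 8785: object keys sorted by UTF-16 code unit,
// no insignificant whitespace, primitives serialized via JSON.stringify.
function canonicalize(value: Json): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const keys = Object.keys(value).sort();
    const members = keys.map(
      (key) => JSON.stringify(key) + ":" + canonicalize(value[key])
    );
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value);
}

const CANON_VERSION = "stella:canon:v1";

// Hypothetical helper: inject the version marker, canonicalize, then hash.
function hashVersioned(
  content: { [key: string]: Json },
  version: string = CANON_VERSION
): string {
  const withMarker: { [key: string]: Json } = { ...content, _canonVersion: version };
  const bytes = Buffer.from(canonicalize(withMarker), "utf8");
  return "sha256:" + createHash("sha256").update(bytes).digest("hex");
}
```

Since the marker is part of the hashed bytes, the same content hashed under two canonicalization versions yields two different IDs, which is what keeps the versions from being confused with each other.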
### Unified Evidence Record
```typescript
interface IEvidence {
  // Content-addressed binding
  subjectNodeId: string;        // "sha256:{hex}" - what this evidence is about
  evidenceId: string;           // "sha256:{hex}" - ID of this evidence record

  // Type and payload
  evidenceType: EvidenceType;   // reachability, scan, policy, vex, etc.
  payload: Uint8Array;          // Canonical JSON bytes
  payloadSchemaVersion: string;

  // Attestation
  signatures: EvidenceSignature[];

  // Provenance
  provenance: {
    generatorId: string;
    generatorVersion: string;
    generatedAt: string;        // ISO 8601 timestamp (DateTimeOffset in .NET)
    inputsDigest?: string;
    correlationId?: string;
  };

  // External storage
  externalPayloadCid?: string;
}
```
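
The data flow defines `EvidenceId = hash(subject + type + payload + provenance)` without fixing the byte layout. The sketch below shows one way to compute it, assuming each field is fed to SHA-256 with a 4-byte length prefix so that field boundaries stay unambiguous. The field order, the length prefix, and the name `computeEvidenceId` are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Provenance fields as listed in IEvidence; generatedAt is an ISO 8601 string.
interface Provenance {
  generatorId: string;
  generatorVersion: string;
  generatedAt: string;
  inputsDigest?: string;
  correlationId?: string;
}

// Hypothetical helper: hash every field with a 4-byte big-endian length
// prefix so that ("ab", "c") and ("a", "bc") produce different byte streams.
function computeEvidenceId(
  subjectNodeId: string,
  evidenceType: string,
  payload: Uint8Array,
  provenance: Provenance
): string {
  const utf8 = (s: string): Uint8Array => Buffer.from(s, "utf8");
  const fields: Uint8Array[] = [
    utf8(subjectNodeId),
    utf8(evidenceType),
    payload,
    utf8(provenance.generatorId),
    utf8(provenance.generatorVersion),
    utf8(provenance.generatedAt),
    utf8(provenance.inputsDigest ?? ""),
    utf8(provenance.correlationId ?? ""),
  ];
  const hash = createHash("sha256");
  for (const field of fields) {
    const length = Buffer.alloc(4);
    length.writeUInt32BE(field.length, 0);
    hash.update(length);
    hash.update(field);
  }
  return "sha256:" + hash.digest("hex");
}
```

Note that under this sketch `generatedAt` is part of the ID, so two otherwise identical runs at different times produce different evidence IDs.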
### Graph Root Attestation (In-Toto)
```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "sha256:abc123...",
      "digest": { "sha256": "abc123..." }
    },
    {
      "name": "sha256:def456...",
      "digest": { "sha256": "def456..." }
    }
  ],
  "predicateType": "https://stella-ops.org/attestation/graph-root/v1",
  "predicate": {
    "graphType": "ResolvedExecutionGraph",
    "rootHash": "sha256:abc123...",
    "nodeCount": 1247,
    "edgeCount": 3891,
    "nodeIds": ["sha256:...", ...],
    "edgeIds": ["sha256:...", ...],
    "inputs": {
      "policyDigest": "sha256:...",
      "feedsDigest": "sha256:...",
      "toolchainDigest": "sha256:...",
      "paramsDigest": "sha256:..."
    },
    "evidenceIds": ["sha256:...", ...],
    "canonVersion": "stella:canon:v1",
    "computedAt": "2025-12-24T10:30:00Z",
    "computedBy": "stellaops/attestor/graph-root",
    "computedByVersion": "1.0.0"
  }
}
```
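
When a statement like this is wrapped in a DSSE envelope, the signature covers the pre-authentication encoding (PAE) of the payload, not the raw JSON. The sketch below implements PAE as defined by the DSSE specification and applies it to a statement using the in-toto payload type `application/vnd.in-toto+json`. How StellaOps serializes the statement before signing is not stated in this document, so `JSON.stringify` is a placeholder and `signingInput` is a hypothetical name.

```typescript
// DSSE pre-authentication encoding:
//   "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body
// LEN is the byte length written as ASCII decimal, SP is a single space.
function pae(payloadType: string, payload: Uint8Array): Uint8Array {
  const encoder = new TextEncoder();
  const typeLength = encoder.encode(payloadType).length;
  const header = encoder.encode(
    `DSSEv1 ${typeLength} ${payloadType} ${payload.length} `
  );
  const out = new Uint8Array(header.length + payload.length);
  out.set(header, 0);
  out.set(payload, header.length);
  return out;
}

const IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json";

// Hypothetical helper: the bytes a signer would sign for a given statement.
// JSON.stringify stands in for whatever serialization the attestor uses.
function signingInput(statement: object): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(statement));
  return pae(IN_TOTO_PAYLOAD_TYPE, payload);
}
```

A verifier rebuilds the same PAE bytes from the envelope's `payloadType` and decoded `payload` before checking the signature, so the payload is never re-serialized on the verifying side.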
## Verification Guarantees
### What Can Be Verified
| Claim | Verification Method |
|-------|---------------------|
| "This graph contains exactly these nodes" | Recompute Merkle root from node IDs |
| "This evidence applies to node X" | Check evidence.subjectNodeId == node.id |
| "This attestation was signed by key K" | Verify DSSE envelope signature |
| "This graph was published to transparency log" | Check Rekor inclusion proof |
| "These inputs produced this graph" | Check inputs.* digests match expectations |
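
Some of these checks are purely structural and need no cryptography. The sketch below covers only those: that `rootHash` matches a statement subject, that the counts agree with the ID lists, and that each evidence record is listed in the attestation and targets an attested node. It assumes `nodeIds` and `edgeIds` carry the full ID lists; the interfaces and `checkStructure` are hypothetical. Signature verification and Merkle root recomputation are deliberately left out.

```typescript
interface Subject {
  name: string;
  digest: { sha256: string };
}

interface GraphRootPredicate {
  rootHash: string;
  nodeCount: number;
  edgeCount: number;
  nodeIds: string[];
  edgeIds: string[];
  evidenceIds: string[];
}

interface EvidenceRef {
  evidenceId: string;
  subjectNodeId: string;
}

// Returns a list of failure reasons; an empty list means all checks passed.
function checkStructure(
  subjects: Subject[],
  predicate: GraphRootPredicate,
  evidence: EvidenceRef[]
): string[] {
  const failures: string[] = [];
  const rootIsSubject = subjects.some(
    (s) => `sha256:${s.digest.sha256}` === predicate.rootHash
  );
  if (!rootIsSubject) {
    failures.push("rootHash does not match any statement subject");
  }
  if (predicate.nodeCount !== predicate.nodeIds.length) {
    failures.push("nodeCount does not match the number of nodeIds");
  }
  if (predicate.edgeCount !== predicate.edgeIds.length) {
    failures.push("edgeCount does not match the number of edgeIds");
  }
  const nodeIds = new Set(predicate.nodeIds);
  const attestedEvidence = new Set(predicate.evidenceIds);
  for (const e of evidence) {
    if (!attestedEvidence.has(e.evidenceId)) {
      failures.push(`evidence ${e.evidenceId} is not listed in the attestation`);
    }
    if (!nodeIds.has(e.subjectNodeId)) {
      failures.push(`evidence ${e.evidenceId} targets unknown node ${e.subjectNodeId}`);
    }
  }
  return failures;
}
```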
### What Cannot Be Verified (Without Additional Trust)
| Claim | Reason | Mitigation |
|-------|--------|------------|
| "The analysis was performed correctly" | Semantic, not structural | Trust the generator; audit toolchain |
| "No evidence was omitted" | Attester controls content | Require known evidence types; policy enforcement |
| "The inputs are authentic" | External dependency | Chain attestations; verify feed signatures |
## Migration Path
### Phase 1: Parallel Operation (Sprints 0001-0003)
- New versioned hashing alongside legacy
- New evidence model with adapters for existing types
- Graph root attestations optional
### Phase 2: Gradual Adoption (Post-Sprints)
- Emit migration warnings for legacy hashes
- Prefer IEvidence in new code
- Enable graph root attestations by default
### Phase 3: Deprecation (Future)
- Remove legacy hash acceptance
- Require IEvidence for all evidence storage
- Require graph root attestations for verification
## Compatibility Considerations
### Existing Attestations
Attestations generated before canonicalizer versioning remain valid:
- Verification detects legacy format (no `_canonVersion` field)
- Falls back to legacy canonicalization
- Logs warning for migration tracking
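
The fallback described above could look roughly like this. The sketch only decides which canonicalizer to use; the set of supported versions, the warning text, and the name `selectCanonicalizer` are assumptions for illustration.

```typescript
interface CanonSelection {
  kind: "legacy" | "versioned";
  version?: string;
  warning?: string;
}

// Assumed list of versions this verifier understands.
const SUPPORTED_VERSIONS = new Set<string>(["stella:canon:v1"]);

// Pick a canonicalizer based on the presence of the _canonVersion marker.
function selectCanonicalizer(doc: Record<string, unknown>): CanonSelection {
  const marker = doc["_canonVersion"];
  if (typeof marker === "string") {
    if (!SUPPORTED_VERSIONS.has(marker)) {
      // An unknown version is an error, not a reason to fall back to legacy.
      throw new Error(`unsupported canonicalization version: ${marker}`);
    }
    return { kind: "versioned", version: marker };
  }
  // No marker: treat as a pre-versioning attestation and flag it for migration.
  return {
    kind: "legacy",
    warning: "no _canonVersion field; verified with legacy canonicalization",
  };
}
```

Rejecting unknown versions instead of silently falling back keeps a newer attestation from being verified with the wrong algorithm.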
### Existing Evidence
Existing evidence types (`EvidenceBundle`, `EvidenceStatement`, etc.) continue working:
- Adapters convert to `IEvidence` on demand
- Original types remain in storage
- Gradual migration via write-through
### API Stability
Public APIs remain backward compatible:
- New methods added, not changed
- Optional parameters for new features
- Explicit opt-in for graph root attestations
## Performance Considerations
| Operation | Impact | Mitigation |
|-----------|--------|------------|
| Version field injection | ~100 bytes per hash | Negligible |
| Merkle tree computation | O(n log n) for n nodes | Existing algorithm; no change |
| Graph root attestation | +1 DSSE sign per graph | Batching; async |
| Evidence store queries | Index on subject_node_id | Composite index; pagination |
| Rekor publishing | Network latency | Optional; async; batching |
## Security Considerations
### Threat Model
| Threat | Mitigation |
|--------|------------|
| Hash collision attacks | SHA-256 (roughly 128-bit collision resistance); version namespacing |
| Signature forgery | DSSE with ECDSA/EdDSA; key rotation |
| Evidence tampering | Content-addressed storage; Merkle verification |
| Replay attacks | Timestamp in provenance; Rekor log index |
| Canonicalization attacks | RFC 8785 compliance; explicit versioning |
### Key Management
Graph root attestations use the existing Signer module:
- Keys managed via Authority plugin
- Rotation policy applies
- Certificate chains optional for external verification
## References
- [RFC 8785 - JSON Canonicalization Scheme (JCS)](https://datatracker.ietf.org/doc/html/rfc8785)
- [in-toto Attestation Framework](https://github.com/in-toto/attestation)
- [DSSE - Dead Simple Signing Envelope](https://github.com/secure-systems-lab/dsse)
- [Sigstore Rekor](https://docs.sigstore.dev/rekor/overview/)
- [Merkle Tree - Wikipedia](https://en.wikipedia.org/wiki/Merkle_tree)
## Appendix A: File Locations
| Component | Path |
|-----------|------|
| CanonVersion | `src/__Libraries/StellaOps.Canonical.Json/CanonVersion.cs` |
| CanonJson (versioned) | `src/__Libraries/StellaOps.Canonical.Json/CanonJson.cs` |
| IEvidence | `src/__Libraries/StellaOps.Evidence.Core/IEvidence.cs` |
| EvidenceRecord | `src/__Libraries/StellaOps.Evidence.Core/EvidenceRecord.cs` |
| IEvidenceStore | `src/__Libraries/StellaOps.Evidence.Core/IEvidenceStore.cs` |
| Adapters | `src/__Libraries/StellaOps.Evidence.Core/Adapters/` |
| IGraphRootAttestor | `src/Attestor/__Libraries/StellaOps.Attestor.GraphRoot/IGraphRootAttestor.cs` |
| GraphRootAttestation | `src/Attestor/__Libraries/StellaOps.Attestor.GraphRoot/Models/` |
## Appendix B: Example Evidence Queries
```csharp
// Get all reachability evidence for a package
var evidence = await evidenceStore.GetBySubjectAsync(
    subjectNodeId: "sha256:abc123...pkg:npm/lodash@4.17.21",
    typeFilter: EvidenceType.Reachability);

// Verify a graph root attestation
var result = await graphRootAttestor.VerifyAsync(
    envelope: downloadedEnvelope,
    nodes: fetchedNodes,
    edges: fetchedEdges);
if (!result.IsValid)
    throw new VerificationException(result.FailureReason);

// Check if evidence exists before creating duplicate
if (!await evidenceStore.ExistsAsync(subjectNodeId, EvidenceType.Scan))
{
    await evidenceStore.StoreAsync(newEvidence);
}
```
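
To make the semantics of these queries concrete, here is a small in-memory store sketched in TypeScript, matching the language of the `IEvidence` definition. It indexes records by `subjectNodeId` and deduplicates by `evidenceId`. The class, its synchronous methods, and the trimmed-down record type are hypothetical and do not mirror the real `IEvidenceStore` signatures.

```typescript
// Trimmed-down record: only the fields the queries need.
interface StoredEvidence {
  evidenceId: string;
  subjectNodeId: string;
  evidenceType: string;
}

class InMemoryEvidenceStore {
  // subjectNodeId → (evidenceId → record)
  private readonly bySubject = new Map<string, Map<string, StoredEvidence>>();

  // Returns false when a record with the same evidenceId is already stored.
  store(record: StoredEvidence): boolean {
    let records = this.bySubject.get(record.subjectNodeId);
    if (records === undefined) {
      records = new Map<string, StoredEvidence>();
      this.bySubject.set(record.subjectNodeId, records);
    }
    if (records.has(record.evidenceId)) {
      return false;
    }
    records.set(record.evidenceId, record);
    return true;
  }

  // All evidence for a node, optionally narrowed to one evidence type.
  getBySubject(subjectNodeId: string, typeFilter?: string): StoredEvidence[] {
    const records = this.bySubject.get(subjectNodeId);
    const all = records === undefined ? [] : Array.from(records.values());
    return typeFilter === undefined
      ? all
      : all.filter((r) => r.evidenceType === typeFilter);
  }

  exists(subjectNodeId: string, evidenceType: string): boolean {
    return this.getBySubject(subjectNodeId, evidenceType).length > 0;
  }
}
```

Because records are keyed by content-derived IDs, storing the same evidence twice is a no-op in this sketch, which is the deduplication behavior the content-addressed design is meant to provide.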