# Provcache Module
> **Status: Implemented** — Core library shipped in Sprint 8200.0001.0001. API endpoints, caching, invalidation and write-behind queue are operational. Policy Engine integration pending architectural review.
> Provenance Cache — Maximizing Trust Evidence Density
## Overview
Provcache is a caching layer that maximizes "provenance density" — the amount of trustworthy evidence retained per byte — enabling faster security decisions, offline replays, and smaller air-gap bundles.
### Key Benefits
- **Trust Latency**: Warm cache lookups return in single-digit milliseconds
- **Bandwidth Efficiency**: Avoid re-fetching bulky SBOMs/attestations
- **Offline Operation**: Decisions usable without full SBOM/VEX payloads
- **Audit Transparency**: Full evidence chain verifiable via Merkle proofs
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        Policy Evaluator                         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐  │
│  │   VeriKey   │───▶│  Provcache  │───▶│ TrustLatticeEngine  │  │
│  │   Builder   │    │   Service   │    │   (if cache miss)   │  │
│  └─────────────┘    └─────────────┘    └─────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│             Provcache Store             │
│  ┌─────────────┐    ┌────────────────┐  │
│  │   Valkey    │◀──▶│    Postgres    │  │
│  │ (read-thru) │    │ (write-behind) │  │
│  └─────────────┘    └────────────────┘  │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│          Evidence Chunk Store           │
│  ┌─────────────────────────────────────┐│
│  │  prov_evidence_chunks (Postgres)    ││
│  │  - Chunked SBOM/VEX/CallGraph       ││
│  │  - Merkle tree verification         ││
│  └─────────────────────────────────────┘│
└─────────────────────────────────────────┘
```
## Core Concepts
### VeriKey (Provenance Identity Key)
A composite hash that uniquely identifies a provenance decision context:
```
VeriKey = SHA256(
    "v1|" ||              // Version prefix for compatibility
    source_hash ||        // Image/artifact digest
    "|" ||
    sbom_hash ||          // Canonical SBOM hash
    "|" ||
    vex_hash_set_hash ||  // Sorted VEX statement hashes
    "|" ||
    merge_policy_hash ||  // PolicyBundle hash
    "|" ||
    signer_set_hash ||    // Signer certificate hashes
    "|" ||
    time_window           // Epoch bucket
)
```
**Why each component?**
| Component | Purpose |
|-----------|---------|
| `source_hash` | Different artifacts → different keys |
| `sbom_hash` | SBOM changes (new packages) → new key |
| `vex_hash_set_hash` | VEX updates → new key |
| `merge_policy_hash` | Policy changes → new key |
| `signer_set_hash` | Key rotation → new key (security) |
| `time_window` | Temporal bucketing → controlled expiry |
#### VeriKey Composition Rules
1. **Hash Normalization**: All input hashes are normalized to lowercase with `sha256:` prefix stripped if present
2. **Set Hash Computation**: For VEX statements and signer certificates:
   - Individual hashes are sorted lexicographically (ordinal)
   - Sorted hashes are concatenated with a `|` delimiter
   - The result is SHA256-hashed
   - Empty sets use well-known sentinels (`"empty-vex-set"`, `"empty-signer-set"`)
3. **Time Window Computation**: The timestamp is floored to the start of its bucket, `floor(timestamp.Ticks / bucket.Ticks) * bucket.Ticks`, and rendered in UTC ISO-8601 format
4. **Output Format**: `sha256:<64-char-lowercase-hex>`
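
The composition rules above can be sketched in a few lines. The following Python is a language-neutral illustration, not the library's C# implementation: the function names are hypothetical, the empty-set sentinel is assumed to be hashed like any other input, and the time bucket is computed from Unix seconds rather than .NET ticks (the two agree for bucket sizes that divide a day evenly).

```python
import hashlib
from datetime import datetime, timedelta, timezone


def normalize(h: str) -> str:
    """Rule 1: lowercase and strip an optional 'sha256:' prefix."""
    h = h.strip().lower()
    return h[len("sha256:"):] if h.startswith("sha256:") else h


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def set_hash(hashes, empty_sentinel: str) -> str:
    """Rule 2: sort ordinally, join with '|', hash the result."""
    if not hashes:
        # Assumption: the sentinel string is hashed like any other input.
        return sha256_hex(empty_sentinel)
    return sha256_hex("|".join(sorted(normalize(h) for h in hashes)))


def time_window(ts: datetime, bucket: timedelta) -> str:
    """Rule 3: floor the timestamp to the start of its bucket (UTC)."""
    bucket_s = int(bucket.total_seconds())
    epoch_s = int(ts.astimezone(timezone.utc).timestamp())
    start = (epoch_s // bucket_s) * bucket_s
    return datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def veri_key(source_hash, sbom_hash, vex_hashes, policy_hash, signer_hashes,
             ts, bucket=timedelta(hours=1)) -> str:
    """Rule 4: join the components with '|' and emit 'sha256:<hex>'."""
    parts = [
        "v1",
        normalize(source_hash),
        normalize(sbom_hash),
        set_hash(vex_hashes, "empty-vex-set"),
        normalize(policy_hash),
        set_hash(signer_hashes, "empty-signer-set"),
        time_window(ts, bucket),
    ]
    return "sha256:" + sha256_hex("|".join(parts))
```

Because inputs are normalized and sets are sorted before hashing, two callers that pass the same hashes in a different order or letter case obtain the same key, while a timestamp in the next bucket yields a different one.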
#### Code Example
```csharp
var veriKey = new VeriKeyBuilder(options)
    .WithSourceHash("sha256:abc123...")                  // Image digest
    .WithSbomHash("sha256:def456...")                    // SBOM digest
    .WithVexStatementHashes(["sha256:v1", "sha256:v2"])  // Sorted automatically
    .WithMergePolicyHash("sha256:policy...")             // Policy bundle
    .WithCertificateHashes(["sha256:cert1"])             // Signer certs
    .WithTimeWindow(DateTimeOffset.UtcNow)               // Auto-bucketed
    .Build();
// Returns: "sha256:789abc..."
```
### DecisionDigest
Canonicalized representation of an evaluation result:
```json
{
  "digestVersion": "v1",
  "veriKey": "sha256:abc123...",
  "verdictHash": "sha256:def456...",
  "proofRoot": "sha256:789abc...",
  "replaySeed": {
    "feedIds": ["cve-2024", "ghsa-2024"],
    "ruleIds": ["default-policy-v2"]
  },
  "trustScore": 85,
  "createdAt": "2025-12-24T12:00:00Z",
  "expiresAt": "2025-12-25T12:00:00Z"
}
```
### Trust Score
A composite score (0-100) indicating decision confidence:
| Component | Weight | Calculation |
|-----------|--------|-------------|
| Reachability | 25% | Call graph coverage, entry points analyzed |
| SBOM Completeness | 20% | Package count, license data presence |
| VEX Coverage | 20% | Vendor statements, justifications |
| Policy Freshness | 15% | Time since last policy update |
| Signer Trust | 20% | Key age, reputation, chain validity |
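
As a worked illustration of the weighting, the sketch below combines five per-component scores into one composite value. It assumes each component has already been scored on a 0-100 scale (how those sub-scores are derived is not specified here), and the dictionary keys and function name are hypothetical.

```python
# Weights in percent, taken from the table above; they sum to 100.
WEIGHTS = {
    "reachability": 25,
    "sbom_completeness": 20,
    "vex_coverage": 20,
    "policy_freshness": 15,
    "signer_trust": 20,
}


def trust_score(components: dict) -> int:
    """Weighted average of 0-100 component scores, rounded half up."""
    total = 0
    for name, weight in WEIGHTS.items():
        score = components.get(name, 0)      # a missing component counts as 0
        score = min(max(int(score), 0), 100)  # clamp to the 0-100 range
        total += weight * score
    # Integer arithmetic avoids floating-point rounding surprises.
    return (total + 50) // 100


example = {
    "reachability": 80,
    "sbom_completeness": 90,
    "vex_coverage": 100,
    "policy_freshness": 60,
    "signer_trust": 90,
}
print(trust_score(example))
```

With these inputs the weighted sum is 2000 + 1800 + 2000 + 900 + 1800 = 8500, which gives a composite score of 85.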
### Evidence Chunks
Large evidence (SBOM, VEX, call graphs) is stored in fixed-size chunks:
- **Default size**: 64 KB per chunk
- **Merkle verification**: Each chunk is a Merkle leaf
- **Lazy fetch**: Only fetch chunks needed for audit
- **LRU eviction**: Old chunks evicted under storage pressure
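
A minimal sketch of chunking and root computation follows. The exact Merkle construction used by Provcache is not described in this document, so the tree shape here (SHA-256 leaves, pairwise hashing, last node duplicated on odd levels) is an assumption chosen for illustration, as are the function names.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # default chunk size from the list above


def split_chunks(blob: bytes, size: int = CHUNK_SIZE) -> list:
    """Split evidence into fixed-size chunks; the last one may be shorter."""
    return [blob[i:i + size] for i in range(0, len(blob), size)]


def merkle_root(chunks: list) -> str:
    """Hash each chunk into a leaf, then hash pairs upward to a single root."""
    level = [hashlib.sha256(c).digest() for c in chunks]
    if not level:
        # Assumption: an empty evidence set hashes the empty byte string.
        level = [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return "sha256:" + level[0].hex()
```

Any change to a single chunk changes its leaf hash and therefore the root, which is what allows a stored `proofRoot` to detect tampering.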
## API Reference
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| GET | `/v1/provcache/{veriKey}` | Lookup cached decision |
| POST | `/v1/provcache` | Store decision (idempotent) |
| POST | `/v1/provcache/invalidate` | Invalidate by pattern |
| GET | `/v1/proofs/{proofRoot}` | List evidence chunks |
| GET | `/v1/proofs/{proofRoot}/chunks/{index}` | Download chunk |
### Cache Lookup Flow
```mermaid
sequenceDiagram
    participant Client
    participant PolicyEngine
    participant Provcache
    participant Valkey
    participant Postgres
    participant TrustLattice
    Client->>PolicyEngine: Evaluate(artifact)
    PolicyEngine->>Provcache: Get(VeriKey)
    Provcache->>Valkey: GET verikey
    alt Cache Hit
        Valkey-->>Provcache: DecisionDigest
        Provcache-->>PolicyEngine: CacheResult(hit)
        PolicyEngine-->>Client: Decision (cached)
    else Cache Miss
        Valkey-->>Provcache: null
        Provcache->>Postgres: SELECT * FROM provcache_items
        alt DB Hit
            Postgres-->>Provcache: ProvcacheEntry
            Provcache->>Valkey: SET (backfill)
            Provcache-->>PolicyEngine: CacheResult(hit, source=postgres)
        else DB Miss
            Postgres-->>Provcache: null
            Provcache-->>PolicyEngine: CacheResult(miss)
            PolicyEngine->>TrustLattice: Evaluate
            TrustLattice-->>PolicyEngine: EvaluationResult
            PolicyEngine->>Provcache: Set(VeriKey, DecisionDigest)
            Provcache->>Valkey: SET
            Provcache->>Postgres: INSERT (async)
            PolicyEngine-->>Client: Decision (computed)
        end
    end
```
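
The same flow can be expressed as a small in-memory model. This is a sketch only: two dictionaries stand in for Valkey and Postgres, the write to the second store is synchronous here whereas the real service queues it, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CacheResult:
    digest: Optional[dict]
    source: str  # "valkey", "postgres", or "miss"


class ProvcacheSketch:
    """In-memory stand-in: two dicts play the roles of Valkey and Postgres."""

    def __init__(self) -> None:
        self.valkey = {}
        self.postgres = {}

    def get(self, veri_key: str) -> CacheResult:
        if veri_key in self.valkey:
            return CacheResult(self.valkey[veri_key], "valkey")
        if veri_key in self.postgres:
            # DB hit: backfill the fast tier before returning.
            self.valkey[veri_key] = self.postgres[veri_key]
            return CacheResult(self.postgres[veri_key], "postgres")
        return CacheResult(None, "miss")

    def set(self, veri_key: str, digest: dict) -> None:
        self.valkey[veri_key] = digest
        self.postgres[veri_key] = digest  # the real service writes this asynchronously


def evaluate(cache: ProvcacheSketch, veri_key: str, compute: Callable[[], dict]):
    """Return (digest, source); call compute() only when both tiers miss."""
    result = cache.get(veri_key)
    if result.digest is not None:
        return result.digest, result.source
    digest = compute()
    cache.set(veri_key, digest)
    return digest, "computed"
```

The expensive evaluation runs once per key; later lookups are served from the fast tier, or from the durable tier with a backfill if the fast tier has lost the entry.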
## Invalidation
> **See also**: [architecture.md](architecture.md#invalidation-mechanisms) for detailed invalidation flow diagrams.
### Automatic Invalidation Triggers
| Trigger | Event | Scope | Implementation |
|---------|-------|-------|----------------|
| Signer Revocation | `SignerRevokedEvent` | All entries with matching `signer_set_hash` | `SignerSetInvalidator` |
| Feed Epoch Advance | `FeedEpochAdvancedEvent` | Entries with older `feed_epoch` | `FeedEpochInvalidator` |
| Policy Update | `PolicyUpdatedEvent` | Entries with matching `policy_hash` | `PolicyHashInvalidator` |
| TTL Expiry | Background job | Entries past `expires_at` | `TtlExpirationService` |
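
The scope column maps naturally onto one predicate per trigger. The sketch below selects the entries each trigger would invalidate; the `Entry` type and trigger names are hypothetical, and comparing `feed_epoch` values as strings assumes epochs are formatted so that lexicographic order matches chronological order.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Entry:
    veri_key: str
    signer_set_hash: str
    policy_hash: str
    feed_epoch: str
    expires_at: datetime


def select_invalidated(entries, trigger: str, value) -> list:
    """Return the VeriKeys that fall in scope for the given trigger."""
    predicates = {
        "signer_revoked": lambda e: e.signer_set_hash == value,
        "feed_epoch_advanced": lambda e: e.feed_epoch < value,  # older epochs only
        "policy_updated": lambda e: e.policy_hash == value,
        "ttl_expired": lambda e: e.expires_at <= value,         # value is "now"
    }
    if trigger not in predicates:
        raise ValueError(f"unknown trigger: {trigger}")
    matches = predicates[trigger]
    return [e.veri_key for e in entries if matches(e)]
```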
### Invalidation Interfaces
```csharp
// Main invalidator interface
public interface IProvcacheInvalidator
{
    Task<int> InvalidateAsync(
        InvalidationCriteria criteria,
        string reason,
        string? correlationId = null,
        CancellationToken cancellationToken = default);
}
// Revocation ledger for audit trail
public interface IRevocationLedger
{
    Task RecordAsync(RevocationEntry entry, CancellationToken ct = default);
    Task<IReadOnlyList<RevocationEntry>> GetEntriesSinceAsync(long sinceSeqNo, int limit = 1000, CancellationToken ct = default);
    Task<RevocationLedgerStats> GetStatsAsync(CancellationToken ct = default);
}
```
### Manual Invalidation
```bash
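# PROVCACHE_URL is a placeholder for the API base URL of your deployment.
# Invalidate by signer
curl -X POST "$PROVCACHE_URL/v1/provcache/invalidate" \
  -H "Content-Type: application/json" \
  -d '{
    "by": "signer_set_hash",
    "value": "sha256:revoked-signer...",
    "reason": "key-compromise"
  }'
# Invalidate by policy
curl -X POST "$PROVCACHE_URL/v1/provcache/invalidate" \
  -H "Content-Type: application/json" \
  -d '{
    "by": "policy_hash",
    "value": "sha256:old-policy...",
    "reason": "policy-update"
  }'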
```
### Revocation Replay
Nodes can replay missed revocation events after restart or network partition:
```csharp
var replayService = services.GetRequiredService<IRevocationReplayService>();
var checkpoint = await replayService.GetCheckpointAsync();
var result = await replayService.ReplayFromAsync(
    sinceSeqNo: checkpoint,
    new RevocationReplayOptions { BatchSize = 1000 });
// result.EntriesReplayed, result.TotalInvalidations
```
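
Conceptually, replay reads ledger entries after the last checkpoint in batches and applies each one. The sketch below models that loop with a plain list as the ledger; it assumes the checkpoint is exclusive (entries with a sequence number strictly greater than it are replayed), and all names are hypothetical rather than the service's API.

```python
def replay_from(ledger, since_seq_no: int, apply, batch_size: int = 1000) -> dict:
    """Replay ledger entries with seq_no > since_seq_no, batch by batch.

    `apply(entry)` performs the invalidation and returns how many cache
    entries it removed.
    """
    ordered = sorted(ledger, key=lambda e: e["seq_no"])
    checkpoint = since_seq_no
    replayed = 0
    invalidated = 0
    while True:
        batch = [e for e in ordered if e["seq_no"] > checkpoint][:batch_size]
        if not batch:
            break
        for entry in batch:
            invalidated += apply(entry)
            replayed += 1
        # Advance the checkpoint only after the whole batch has been applied.
        checkpoint = batch[-1]["seq_no"]
    return {
        "entries_replayed": replayed,
        "total_invalidations": invalidated,
        "checkpoint": checkpoint,
    }
```

Persisting the returned checkpoint lets a node resume from the same point after the next restart without reapplying earlier revocations.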
## Air-Gap Integration
> **See also**: [architecture.md](architecture.md#air-gap-exportimport) for bundle format specification and architecture diagrams.
### Export Workflow
```bash
# Export minimal proof (digest only)
stella prov export --verikey sha256:abc123 --density lite
# Export with evidence chunks
stella prov export --verikey sha256:abc123 --density standard
# Export full evidence
stella prov export --verikey sha256:abc123 --density strict --sign
```
### Import Workflow
```bash
# Import and verify Merkle root
stella prov import --input proof.bundle
# Import with lazy chunk fetch (connected mode)
stella prov import --input proof-lite.json --lazy-fetch --backend https://api.stellaops.com
# Import with lazy fetch from file directory (sneakernet mode)
stella prov import --input proof-lite.json --lazy-fetch --chunks-dir /mnt/usb/evidence
```
### Density Levels
| Level | Contents | Size | Use Case | Lazy Fetch Support |
|-------|----------|------|----------|--------------------|
| `lite` | DecisionDigest + ProofRoot + Manifest | ~2 KB | Quick verification | Required |
| `standard` | + First N chunks (~10%) | ~200 KB | Normal audit | Partial (remaining chunks) |
| `strict` | + All chunks | Variable | Full compliance | Not needed |
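
To make the three levels concrete, the sketch below picks which chunk indices an export would include. The exact rule behind "first N chunks (~10%)" is not specified in this document, so rounding up to a whole chunk is an assumption, and the function name is hypothetical.

```python
def chunks_for_density(level: str, total_chunks: int, standard_percent: int = 10) -> list:
    """Return the chunk indices bundled with an export at the given density."""
    if level == "lite":
        return []  # digest, proof root and manifest only
    if level == "standard":
        # Assumption: take the first ~10% of chunks, rounded up to a whole chunk.
        count = (total_chunks * standard_percent + 99) // 100
        return list(range(min(count, total_chunks)))
    if level == "strict":
        return list(range(total_chunks))
    raise ValueError(f"unknown density level: {level}")
```

Every index not returned here is a candidate for lazy fetching on the importing side.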
### Lazy Evidence Fetching
For `lite` and `standard` density exports, missing chunks can be fetched on demand:
```csharp
// HTTP fetcher (connected mode)
var httpFetcher = new HttpChunkFetcher(
    new Uri("https://api.stellaops.com"), logger);
// File fetcher (air-gapped/sneakernet mode)
var fileFetcher = new FileChunkFetcher(
    basePath: "/mnt/usb/evidence", logger);
// Orchestrate fetch + verify + store
var orchestrator = new LazyFetchOrchestrator(repository, logger);
var result = await orchestrator.FetchAndStoreAsync(
    proofRoot: "sha256:...",
    httpFetcher, // or fileFetcher in air-gapped mode
    new LazyFetchOptions
    {
        VerifyOnFetch = true,
        BatchSize = 100,
        MaxChunks = 1000
    });
```
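
The fetch, verify and store steps can be modelled independently of the transport. In the sketch below the manifest is assumed to be a list of expected SHA-256 hex digests indexed by chunk number, `fetch` is any callable returning the chunk bytes or `None`, and the local store is a dictionary; these shapes and names are illustrative, not the orchestrator's real types.

```python
import hashlib


def fetch_and_store(manifest_hashes, store: dict, fetch,
                    verify_on_fetch: bool = True, max_chunks: int = 1000):
    """Fetch chunks missing from `store`, verify them, and keep the good ones.

    Returns (fetched_indices, failed_indices).
    """
    fetched, failed = [], []
    for index, expected in enumerate(manifest_hashes):
        if index in store:
            continue  # already present locally
        if len(fetched) >= max_chunks:
            break
        data = fetch(index)
        if data is None:
            failed.append(index)  # source does not have this chunk
            continue
        if verify_on_fetch and hashlib.sha256(data).hexdigest() != expected:
            failed.append(index)  # hash mismatch: reject tampered or corrupt data
            continue
        store[index] = data
        fetched.append(index)
    return fetched, failed
```

Passing an HTTP client wrapper or a directory reader as `fetch` gives the connected and sneakernet variants without changing the verification logic.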
### Sneakernet Export for Chunked Evidence
```csharp
// Export evidence chunks to file system for transport
await fileFetcher.ExportEvidenceChunksToFilesAsync(
    manifest,
    chunks,
    outputDirectory: "/mnt/usb/evidence");
```
## Configuration
### C# Configuration Class
The `ProvcacheOptions` class (section name: `"Provcache"`) exposes the following settings:
| Property | Type | Default | Validation | Description |
|----------|------|---------|------------|-------------|
| `DefaultTtl` | `TimeSpan` | 24h | 1min–7d | Default time-to-live for cache entries |
| `MaxTtl` | `TimeSpan` | 7d | 1min–30d | Maximum allowed TTL regardless of request |
| `TimeWindowBucket` | `TimeSpan` | 1h | 1min–24h | Time window bucket for VeriKey computation |
| `ValkeyKeyPrefix` | `string` | `"stellaops:prov:"` | — | Key prefix for Valkey storage |
| `EnableWriteBehind` | `bool` | `true` | — | Enable async Postgres persistence |
| `WriteBehindFlushInterval` | `TimeSpan` | 5s | 1s–5min | Interval for flushing write-behind queue |
| `WriteBehindMaxBatchSize` | `int` | 100 | 1–10000 | Maximum batch size per flush |
| `WriteBehindQueueCapacity` | `int` | 10000 | 100–1M | Max queue capacity (blocks when full) |
| `WriteBehindMaxRetries` | `int` | 3 | 0–10 | Retry attempts for failed writes |
| `ChunkSize` | `int` | 65536 | 1KB–1MB | Evidence chunk size in bytes |
| `MaxChunksPerEntry` | `int` | 1000 | 1–100000 | Max chunks per cache entry |
| `AllowCacheBypass` | `bool` | `true` | — | Allow clients to force re-evaluation |
| `DigestVersion` | `string` | `"v1"` | — | Serialization version for digests |
| `HashAlgorithm` | `string` | `"SHA256"` | — | Hash algorithm for VeriKey/digest |
| `EnableValkeyCache` | `bool` | `true` | — | Enable Valkey layer (false = Postgres only) |
| `SlidingExpiration` | `bool` | `false` | — | Refresh TTL on cache hits |
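
The interaction between `DefaultTtl` and `MaxTtl` can be illustrated with a short sketch. It assumes a requested TTL above `MaxTtl` is clamped rather than rejected, which is one reading of "regardless of request"; the function name is hypothetical.

```python
from datetime import timedelta
from typing import Optional


def effective_ttl(requested: Optional[timedelta],
                  default_ttl: timedelta = timedelta(hours=24),
                  max_ttl: timedelta = timedelta(days=7)) -> timedelta:
    """Pick the TTL for a new entry: the default when none is requested,
    never more than max_ttl."""
    ttl = default_ttl if requested is None else requested
    if ttl <= timedelta(0):
        raise ValueError("TTL must be positive")
    return min(ttl, max_ttl)
```

For example, a request for 30 days resolves to 7 days under the defaults, and a request with no TTL resolves to 24 hours.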
### appsettings.json Example
```json
{
  "Provcache": {
    "DefaultTtl": "24:00:00",
    "MaxTtl": "7.00:00:00",
    "TimeWindowBucket": "01:00:00",
    "ValkeyKeyPrefix": "stellaops:prov:",
    "EnableWriteBehind": true,
    "WriteBehindFlushInterval": "00:00:05",
    "WriteBehindMaxBatchSize": 100,
    "WriteBehindQueueCapacity": 10000,
    "WriteBehindMaxRetries": 3,
    "ChunkSize": 65536,
    "MaxChunksPerEntry": 1000,
    "AllowCacheBypass": true,
    "DigestVersion": "v1",
    "HashAlgorithm": "SHA256",
    "EnableValkeyCache": true,
    "SlidingExpiration": false
  }
}
```
### YAML Example (Helm/Kubernetes)
```yaml
provcache:
  # TTL configuration
  defaultTtl: 24h
  maxTtl: 168h  # 7 days
  timeWindowBucket: 1h
  # Storage
  valkeyKeyPrefix: "stellaops:prov:"
  enableWriteBehind: true
  writeBehindFlushInterval: 5s
  writeBehindMaxBatchSize: 100
  # Evidence chunking
  chunkSize: 65536  # 64 KB
  maxChunksPerEntry: 1000
  # Behavior
  allowCacheBypass: true
  digestVersion: "v1"
```
### Dependency Injection Registration
```csharp
// In Program.cs or Startup.cs
services.AddProvcache(configuration);
// Or with explicit configuration
services.AddProvcache(options =>
{
    options.DefaultTtl = TimeSpan.FromHours(12);
    options.EnableWriteBehind = true;
    options.WriteBehindMaxBatchSize = 200;
});
```
## Observability
### Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `provcache_requests_total` | Counter | Total cache requests |
| `provcache_hits_total` | Counter | Cache hits |
| `provcache_misses_total` | Counter | Cache misses |
| `provcache_latency_seconds` | Histogram | Operation latency |
| `provcache_items_count` | Gauge | Current item count |
| `provcache_invalidations_total` | Counter | Invalidation count |
### Alerts
```yaml
# Low cache hit rate
- alert: ProvcacheLowHitRate
  expr: rate(provcache_hits_total[5m]) / rate(provcache_requests_total[5m]) < 0.5
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Provcache hit rate below 50%"
# High invalidation rate
- alert: ProvcacheHighInvalidationRate
  expr: rate(provcache_invalidations_total[5m]) > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High cache invalidation rate"
```
## Security Considerations
### Signer-Aware Caching
The `signer_set_hash` is part of the VeriKey, ensuring:
- Key rotation → new cache entries
- Key revocation → immediate invalidation
- No stale decisions from compromised signers
### Merkle Verification
All evidence chunks are Merkle-verified:
- `ProofRoot` = Merkle root of all chunks
- Individual chunks verifiable without full tree
- Tamper detection on import
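
Verifying one chunk without the full tree relies on an inclusion proof: the sibling hashes along the path from the leaf to the root. The sketch below shows the idea under an assumed tree shape (SHA-256 leaves, pairwise hashing, last node duplicated on odd levels); Provcache's actual proof format is not described in this document, and the function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_proof(chunks, index: int):
    """Return (root, proof) for chunks[index].

    proof is a list of (sibling_hash, sibling_is_left) pairs, leaf to root.
    """
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [_h(c) for c in chunks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_chunk(chunk: bytes, proof, root: bytes) -> bool:
    """Recompute the path to the root using only the chunk and its proof."""
    node = _h(chunk)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

The verifier needs only the chunk, a logarithmic number of sibling hashes, and the trusted `ProofRoot`, so an auditor can check a lazily fetched chunk without downloading the rest of the evidence.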
### Audit Trail
All invalidations are logged to the `provcache.prov_revocations` table:
```sql
SELECT * FROM provcache.prov_revocations
WHERE revoked_at > NOW() - INTERVAL '24 hours'
ORDER BY revoked_at DESC;
```
## Database Schema
### provcache_items
```sql
CREATE TABLE provcache.provcache_items (
    verikey         TEXT PRIMARY KEY,
    digest_version  TEXT NOT NULL,
    verdict_hash    TEXT NOT NULL,
    proof_root      TEXT NOT NULL,
    replay_seed     JSONB NOT NULL,
    policy_hash     TEXT NOT NULL,
    signer_set_hash TEXT NOT NULL,
    feed_epoch      TEXT NOT NULL,
    trust_score     INTEGER NOT NULL,
    hit_count       BIGINT DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL,
    expires_at      TIMESTAMPTZ NOT NULL,
    updated_at      TIMESTAMPTZ NOT NULL
);
```
### prov_evidence_chunks
```sql
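CREATE TABLE provcache.prov_evidence_chunks (
    chunk_id     UUID PRIMARY KEY,
    -- PostgreSQL accepts this foreign key only if provcache_items.proof_root
    -- carries a UNIQUE constraint or index (not shown in the table above).
    proof_root   TEXT NOT NULL REFERENCES provcache.provcache_items(proof_root),
    chunk_index  INTEGER NOT NULL,
    chunk_hash   TEXT NOT NULL,
    blob         BYTEA NOT NULL,
    blob_size    INTEGER NOT NULL,
    content_type TEXT NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL
);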
```
### prov_revocations
```sql
CREATE TABLE provcache.prov_revocations (
    seq_no              BIGSERIAL PRIMARY KEY,
    revocation_id       UUID NOT NULL UNIQUE,
    revocation_type     VARCHAR(32) NOT NULL, -- signer, feed_epoch, policy, explicit, expiration
    revoked_key         VARCHAR(512) NOT NULL,
    reason              VARCHAR(1024),
    entries_invalidated INTEGER NOT NULL,
    source              VARCHAR(128) NOT NULL,
    correlation_id      VARCHAR(128),
    revoked_at          TIMESTAMPTZ NOT NULL,
    metadata            JSONB,
    CONSTRAINT chk_revocation_type CHECK (
        revocation_type IN ('signer', 'feed_epoch', 'policy', 'explicit', 'expiration')
    )
);
CREATE INDEX idx_revocations_type ON provcache.prov_revocations(revocation_type);
CREATE INDEX idx_revocations_key ON provcache.prov_revocations(revoked_key);
CREATE INDEX idx_revocations_time ON provcache.prov_revocations(revoked_at);
```
## Implementation Status
### Completed (Sprint 8200.0001.0001 - Core Backend)
| Component | Path | Status |
|-----------|------|--------|
| Core Models | `src/__Libraries/StellaOps.Provcache/Models/` | ✅ Done |
| VeriKeyBuilder | `src/__Libraries/StellaOps.Provcache/VeriKeyBuilder.cs` | ✅ Done |
| DecisionDigest | `src/__Libraries/StellaOps.Provcache/DecisionDigest.cs` | ✅ Done |
| Caching Layer | `src/__Libraries/StellaOps.Provcache/Caching/` | ✅ Done |
| WriteBehindQueue | `src/__Libraries/StellaOps.Provcache/Persistence/` | ✅ Done |
| API Endpoints | `src/__Libraries/StellaOps.Provcache.Api/` | ✅ Done |
| Unit Tests (53) | `src/__Libraries/__Tests/StellaOps.Provcache.Tests/` | ✅ Done |
### Completed (Sprint 8200.0001.0002 - Invalidation & Air-Gap)
| Component | Path | Status |
|-----------|------|--------|
| Invalidation Interfaces | `src/__Libraries/StellaOps.Provcache/Invalidation/` | ✅ Done |
| Repository Invalidation Methods | `IEvidenceChunkRepository.Delete*Async()` | ✅ Done |
| Export Interfaces | `src/__Libraries/StellaOps.Provcache/Export/` | ✅ Done |
| IMinimalProofExporter | `Export/IMinimalProofExporter.cs` | ✅ Done |
| MinimalProofExporter | `Export/MinimalProofExporter.cs` | ✅ Done |
| Lazy Fetch - ILazyEvidenceFetcher | `LazyFetch/ILazyEvidenceFetcher.cs` | ✅ Done |
| Lazy Fetch - HttpChunkFetcher | `LazyFetch/HttpChunkFetcher.cs` | ✅ Done |
| Lazy Fetch - FileChunkFetcher | `LazyFetch/FileChunkFetcher.cs` | ✅ Done |
| Lazy Fetch - LazyFetchOrchestrator | `LazyFetch/LazyFetchOrchestrator.cs` | ✅ Done |
| Revocation - IRevocationLedger | `Revocation/IRevocationLedger.cs` | ✅ Done |
| Revocation - InMemoryRevocationLedger | `Revocation/InMemoryRevocationLedger.cs` | ✅ Done |
| Revocation - RevocationReplayService | `Revocation/RevocationReplayService.cs` | ✅ Done |
| ProvRevocationEntity | `Entities/ProvRevocationEntity.cs` | ✅ Done |
| Unit Tests (124 total) | `src/__Libraries/__Tests/StellaOps.Provcache.Tests/` | ✅ Done |
### Blocked
| Component | Reason |
|-----------|--------|
| Policy Engine Integration | `PolicyEvaluator` is `internal sealed`; requires architectural review to expose injection points for `IProvcacheService` |
| CLI e2e Tests | `AddSimRemoteCryptoProvider` method missing in CLI codebase |
### Pending
| Component | Sprint |
|-----------|--------|
| Authority Event Integration | 8200.0001.0002 (BLOCKED - Authority needs event publishing) |
| Concelier Event Integration | 8200.0001.0002 (BLOCKED - Concelier needs event publishing) |
| PostgresRevocationLedger | Future (requires EF Core integration) |
| UI Badges & Proof Tree | 8200.0001.0003 |
| Grafana Dashboards | 8200.0001.0003 |
## Implementation Sprints
| Sprint | Focus | Key Deliverables |
|--------|-------|------------------|
| [8200.0001.0001](../../implplan/SPRINT_8200_0001_0001_provcache_core_backend.md) | Core Backend | VeriKey, DecisionDigest, Valkey+Postgres, API |
| [8200.0001.0002](../../implplan/SPRINT_8200_0001_0002_provcache_invalidation_airgap.md) | Invalidation & Air-Gap | Signer revocation, feed epochs, CLI export/import |
| [8200.0001.0003](../../implplan/SPRINT_8200_0001_0003_provcache_ux_observability.md) | UX & Observability | UI badges, proof tree, Grafana, OCI attestation |
## Related Documentation
- **[Provcache Architecture Guide](architecture.md)** - Detailed architecture, invalidation flows, and API reference
- [Policy Engine Architecture](../policy/README.md)
- [TrustLattice Engine](../policy/design/policy-deterministic-evaluator.md)
- [Offline Kit Documentation](../../24_OFFLINE_KIT.md)
- [Air-Gap Controller](../airgap/README.md)
- [Authority Key Rotation](../authority/README.md)