
Provcache Module

Status: Implemented — Core library shipped in Sprint 8200.0001.0001. API endpoints, caching, invalidation and write-behind queue are operational. Policy Engine integration pending architectural review.

Provenance Cache — Maximizing Trust Evidence Density

Overview

Provcache is a caching layer that maximizes "provenance density" — the amount of trustworthy evidence retained per byte — enabling faster security decisions, offline replays, and smaller air-gap bundles.

Key Benefits

  • Trust Latency: Warm cache lookups return in single-digit milliseconds
  • Bandwidth Efficiency: Avoid re-fetching bulky SBOMs/attestations
  • Offline Operation: Decisions usable without full SBOM/VEX payloads
  • Audit Transparency: Full evidence chain verifiable via Merkle proofs

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Policy Evaluator                         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐ │
│  │ VeriKey     │───▶│ Provcache   │───▶│ TrustLatticeEngine │ │
│  │ Builder     │    │ Service     │    │ (if cache miss)     │ │
│  └─────────────┘    └─────────────┘    └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
        ┌─────────────────────────────────────────┐
        │            Provcache Store              │
        │  ┌─────────────┐    ┌────────────────┐ │
        │  │   Valkey    │◀──▶│   Postgres     │ │
        │  │ (read-thru) │    │ (write-behind) │ │
        │  └─────────────┘    └────────────────┘ │
        └─────────────────────────────────────────┘
                              │
                              ▼
        ┌─────────────────────────────────────────┐
        │         Evidence Chunk Store            │
        │  ┌─────────────────────────────────────┐│
        │  │  prov_evidence_chunks (Postgres)    ││
        │  │  - Chunked SBOM/VEX/CallGraph       ││
        │  │  - Merkle tree verification         ││
        │  └─────────────────────────────────────┘│
        └─────────────────────────────────────────┘

Core Concepts

VeriKey (Provenance Identity Key)

A composite hash that uniquely identifies a provenance decision context:

VeriKey = SHA256(
    "v1|"              ||  // Version prefix for compatibility
    source_hash        ||  // Image/artifact digest
    "|"                ||
    sbom_hash          ||  // Canonical SBOM hash
    "|"                ||
    vex_hash_set_hash  ||  // Sorted VEX statement hashes
    "|"                ||
    merge_policy_hash  ||  // PolicyBundle hash
    "|"                ||
    signer_set_hash    ||  // Signer certificate hashes
    "|"                ||
    time_window            // Epoch bucket
)

Why each component?

| Component | Purpose |
|---|---|
| source_hash | Different artifacts → different keys |
| sbom_hash | SBOM changes (new packages) → new key |
| vex_hash_set_hash | VEX updates → new key |
| merge_policy_hash | Policy changes → new key |
| signer_set_hash | Key rotation → new key (security) |
| time_window | Temporal bucketing → controlled expiry |

VeriKey Composition Rules

  1. Hash Normalization: All input hashes are normalized to lowercase with sha256: prefix stripped if present
  2. Set Hash Computation: For VEX statements and signer certificates:
    • Individual hashes are sorted lexicographically (ordinal)
    • Sorted hashes are concatenated with | delimiter
    • Result is SHA256-hashed
    • Empty sets use well-known sentinels ("empty-vex-set", "empty-signer-set")
  3. Time Window Computation: The timestamp is floored to the start of its bucket, floor(timestamp.Ticks / bucket.Ticks) * bucket.Ticks, and the result is formatted as a UTC ISO-8601 string
  4. Output Format: sha256:<64-char-lowercase-hex>
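The composition rules above can be sketched in a few lines of Python. This is an illustration only, not the VeriKeyBuilder implementation: the helper names are hypothetical, the empty-set sentinel is assumed to be hashed like any other input, and the time window is floored against the Unix epoch, which should agree with the tick-based formula for bucket sizes that divide 24 hours evenly.

```python
import hashlib
from datetime import datetime, timezone


def normalize(h: str) -> str:
    """Rule 1: lowercase the hash and strip an optional 'sha256:' prefix."""
    h = h.strip().lower()
    return h[len("sha256:"):] if h.startswith("sha256:") else h


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def set_hash(hashes, empty_sentinel: str) -> str:
    """Rule 2: sort ordinally, join with '|', then hash."""
    if not hashes:
        # Assumption: the sentinel string is hashed rather than used verbatim.
        return sha256_hex(empty_sentinel)
    return sha256_hex("|".join(sorted(normalize(h) for h in hashes)))


def time_window(ts: datetime, bucket_seconds: int) -> str:
    """Rule 3: floor the timestamp to the start of its bucket (UTC)."""
    epoch = int(ts.timestamp())
    start = epoch - (epoch % bucket_seconds)
    return datetime.fromtimestamp(start, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_verikey(source, sbom, vex_hashes, policy, cert_hashes, ts, bucket_seconds=3600):
    parts = [
        "v1",                                       # version prefix
        normalize(source),
        normalize(sbom),
        set_hash(vex_hashes, "empty-vex-set"),
        normalize(policy),
        set_hash(cert_hashes, "empty-signer-set"),
        time_window(ts, bucket_seconds),
    ]
    return "sha256:" + sha256_hex("|".join(parts))  # Rule 4: output format
```

Because inputs are normalized and set members are sorted before hashing, two callers that supply the same hashes in a different order or letter case should arrive at the same key.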

Code Example

var veriKey = new VeriKeyBuilder(options)
    .WithSourceHash("sha256:abc123...")           // Image digest
    .WithSbomHash("sha256:def456...")             // SBOM digest
    .WithVexStatementHashes(["sha256:v1", "sha256:v2"])  // Sorted automatically
    .WithMergePolicyHash("sha256:policy...")      // Policy bundle
    .WithCertificateHashes(["sha256:cert1"])      // Signer certs
    .WithTimeWindow(DateTimeOffset.UtcNow)        // Auto-bucketed
    .Build();
// Returns: "sha256:789abc..."

DecisionDigest

Canonicalized representation of an evaluation result:

{
  "digestVersion": "v1",
  "veriKey": "sha256:abc123...",
  "verdictHash": "sha256:def456...",
  "proofRoot": "sha256:789abc...",
  "replaySeed": {
    "feedIds": ["cve-2024", "ghsa-2024"],
    "ruleIds": ["default-policy-v2"]
  },
  "trustScore": 85,
  "createdAt": "2025-12-24T12:00:00Z",
  "expiresAt": "2025-12-25T12:00:00Z"
}

Trust Score

A composite score (0-100) indicating decision confidence:

| Component | Weight | Calculation |
|---|---|---|
| Reachability | 25% | Call graph coverage, entry points analyzed |
| SBOM Completeness | 20% | Package count, license data presence |
| VEX Coverage | 20% | Vendor statements, justifications |
| Policy Freshness | 15% | Time since last policy update |
| Signer Trust | 20% | Key age, reputation, chain validity |
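As a rough illustration of how the weights combine, the sketch below assumes each component has already been scored on a 0-100 scale; how those sub-scores are derived is not covered here, and the function and key names are hypothetical.

```python
# Weights in percent, taken from the table above (they sum to 100).
WEIGHTS = {
    "reachability": 25,
    "sbom_completeness": 20,
    "vex_coverage": 20,
    "policy_freshness": 15,
    "signer_trust": 20,
}


def trust_score(components: dict) -> int:
    """Weighted average of per-component scores, rounded half up to an integer."""
    total = 0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in 0..100, got {value}")
        total += weight * value
    return (total + 50) // 100  # integer arithmetic avoids float rounding surprises


score = trust_score({
    "reachability": 80,
    "sbom_completeness": 90,
    "vex_coverage": 100,
    "policy_freshness": 60,
    "signer_trust": 85,
})
print(score)  # 84
```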

Evidence Chunks

Large evidence (SBOM, VEX, call graphs) is stored in fixed-size chunks:

  • Default size: 64 KB per chunk
  • Merkle verification: Each chunk is a Merkle leaf
  • Lazy fetch: Only fetch chunks needed for audit
  • LRU eviction: Old chunks evicted under storage pressure
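The chunking scheme can be pictured with the following sketch. The tree construction is an assumption made for illustration (SHA-256 leaves, pairwise hashing, last node duplicated on odd levels); the actual Provcache tree layout may differ.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # matches the documented default of 65536 bytes


def split_chunks(blob: bytes, size: int = CHUNK_SIZE):
    """Cut evidence into fixed-size chunks; the final chunk may be shorter."""
    return [blob[i:i + size] for i in range(0, len(blob), size)]


def merkle_root(chunks) -> str:
    """Hash each chunk as a leaf, then fold pairs upward until one node remains."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [hashlib.sha256(c).digest() for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return "sha256:" + level[0].hex()


evidence = bytes(range(256)) * 600          # 153600 bytes of sample data
chunks = split_chunks(evidence)
print(len(chunks), merkle_root(chunks)[:15])
```

Any change to a single chunk changes its leaf hash and therefore the root, which is what makes the root usable as a compact commitment to the full evidence set.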

API Reference

Endpoints

| Method | Path | Description |
|---|---|---|
| GET | /v1/provcache/{veriKey} | Lookup cached decision |
| POST | /v1/provcache | Store decision (idempotent) |
| POST | /v1/provcache/invalidate | Invalidate by pattern |
| GET | /v1/proofs/{proofRoot} | List evidence chunks |
| GET | /v1/proofs/{proofRoot}/chunks/{index} | Download chunk |

Cache Lookup Flow

sequenceDiagram
    participant Client
    participant PolicyEngine
    participant Provcache
    participant Valkey
    participant Postgres
    participant TrustLattice

    Client->>PolicyEngine: Evaluate(artifact)
    PolicyEngine->>Provcache: Get(VeriKey)
    Provcache->>Valkey: GET verikey
    alt Cache Hit
        Valkey-->>Provcache: DecisionDigest
        Provcache-->>PolicyEngine: CacheResult(hit)
        PolicyEngine-->>Client: Decision (cached)
    else Cache Miss
        Valkey-->>Provcache: null
        Provcache->>Postgres: SELECT * FROM provcache_items
        alt DB Hit
            Postgres-->>Provcache: ProvcacheEntry
            Provcache->>Valkey: SET (backfill)
            Provcache-->>PolicyEngine: CacheResult(hit, source=postgres)
        else DB Miss
            Postgres-->>Provcache: null
            Provcache-->>PolicyEngine: CacheResult(miss)
            PolicyEngine->>TrustLattice: Evaluate
            TrustLattice-->>PolicyEngine: EvaluationResult
            PolicyEngine->>Provcache: Set(VeriKey, DecisionDigest)
            Provcache->>Valkey: SET
            Provcache->>Postgres: INSERT (async)
            PolicyEngine-->>Client: Decision (computed)
        end
    end
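The same flow, reduced to an in-memory sketch: plain dictionaries stand in for Valkey and Postgres, and a list stands in for the write-behind queue. The class and function names are hypothetical and do not mirror the IProvcacheService API.

```python
class ProvcacheSketch:
    """Two-tier lookup: fast cache first, durable store second, backfill on DB hit."""

    def __init__(self):
        self.valkey = {}        # stand-in for the Valkey layer
        self.postgres = {}      # stand-in for provcache_items
        self.write_behind = []  # pending async inserts

    def get(self, verikey):
        if verikey in self.valkey:
            return {"hit": True, "source": "valkey", "digest": self.valkey[verikey]}
        if verikey in self.postgres:
            digest = self.postgres[verikey]
            self.valkey[verikey] = digest  # backfill the fast layer
            return {"hit": True, "source": "postgres", "digest": digest}
        return {"hit": False, "source": None, "digest": None}

    def set(self, verikey, digest):
        self.valkey[verikey] = digest
        self.write_behind.append((verikey, digest))  # persisted later

    def flush(self):
        for verikey, digest in self.write_behind:
            self.postgres[verikey] = digest
        self.write_behind.clear()


def evaluate(cache, verikey, compute):
    """Return (digest, source); run the expensive evaluation only on a full miss."""
    result = cache.get(verikey)
    if result["hit"]:
        return result["digest"], result["source"]
    digest = compute()
    cache.set(verikey, digest)
    return digest, "computed"
```

The first call for a key reports "computed", repeat calls report "valkey", and after the fast layer is cleared a flushed entry comes back with source "postgres" and is backfilled.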

Invalidation

Automatic Invalidation Triggers

| Trigger | Event | Scope |
|---|---|---|
| Signer Revocation | SignerRevokedEvent | All entries with matching signer_set_hash |
| Feed Epoch Advance | FeedEpochAdvancedEvent | Entries with older feed_epoch |
| Policy Update | PolicyUpdatedEvent | Entries with matching policy_hash |
| TTL Expiry | Background job | Entries past expires_at |

Manual Invalidation

# Invalidate by signer
POST /v1/provcache/invalidate
{
  "by": "signer_set_hash",
  "value": "sha256:revoked-signer...",
  "reason": "key-compromise"
}

# Invalidate by policy
POST /v1/provcache/invalidate
{
  "by": "policy_hash",
  "value": "sha256:old-policy...",
  "reason": "policy-update"
}
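What an exact-match invalidation amounts to can be sketched as below. This is an in-memory illustration under stated assumptions: the function name is hypothetical, only the two selectors shown in the requests above are accepted, and the audit record simply reuses the column names of the prov_revocations table.

```python
ALLOWED_SELECTORS = {"signer_set_hash", "policy_hash"}


def invalidate(entries: dict, by: str, value: str, reason: str, audit_log: list) -> int:
    """Remove every entry whose `by` field equals `value`; return the number removed."""
    if by not in ALLOWED_SELECTORS:
        raise ValueError(f"unsupported selector: {by}")
    doomed = [key for key, entry in entries.items() if entry.get(by) == value]
    for key in doomed:
        del entries[key]
    audit_log.append({
        "revocation_type": by,
        "target_hash": value,
        "reason": reason,
        "entries_affected": len(doomed),
    })
    return len(doomed)


entries = {
    "sha256:k1": {"signer_set_hash": "sha256:s1", "policy_hash": "sha256:p1"},
    "sha256:k2": {"signer_set_hash": "sha256:s1", "policy_hash": "sha256:p2"},
    "sha256:k3": {"signer_set_hash": "sha256:s2", "policy_hash": "sha256:p1"},
}
audit_log = []
removed = invalidate(entries, "signer_set_hash", "sha256:s1", "key-compromise", audit_log)
print(removed, sorted(entries))  # 2 ['sha256:k3']
```

Feed epoch invalidation compares against older epochs rather than an exact value, so it would need a different predicate than the equality check used here.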

Air-Gap Integration

Export Workflow

# Export minimal proof (digest only)
stella prov export --verikey sha256:abc123 --density lite

# Export with evidence chunks
stella prov export --verikey sha256:abc123 --density standard

# Export full evidence
stella prov export --verikey sha256:abc123 --density strict --sign

Import Workflow

# Import and verify Merkle root
stella prov import --input proof.bundle

# Import with lazy chunk fetch
stella prov import --input proof-lite.json --lazy-fetch --backend https://api.stellaops.com

Density Levels

| Level | Contents | Size | Use Case |
|---|---|---|---|
| lite | DecisionDigest + ProofRoot | ~2 KB | Quick verification |
| standard | + First N chunks | ~200 KB | Normal audit |
| strict | + All chunks | Variable | Full compliance |
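The relationship between the three levels can be sketched as a simple selection over the chunk list. The bundle layout and the value of N are not specified in this document, so the dictionary shape and the first_n parameter below are placeholders.

```python
def select_bundle(density: str, digest: dict, chunks: list, first_n: int = 3) -> dict:
    """Pick which evidence chunks travel with the digest for a given density level."""
    if density == "lite":
        selected = []                 # digest and proof root only
    elif density == "standard":
        selected = chunks[:first_n]   # first N chunks (N is a placeholder here)
    elif density == "strict":
        selected = list(chunks)       # everything
    else:
        raise ValueError(f"unknown density level: {density}")
    return {"digest": digest, "proofRoot": digest["proofRoot"], "chunks": selected}


digest = {"veriKey": "sha256:abc123", "proofRoot": "sha256:789abc"}
chunks = [b"c0", b"c1", b"c2", b"c3", b"c4"]
for level in ("lite", "standard", "strict"):
    print(level, len(select_bundle(level, digest, chunks)["chunks"]))
```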

Configuration

C# Configuration Class

The ProvcacheOptions class (section name: "Provcache") exposes the following settings:

| Property | Type | Default | Validation | Description |
|---|---|---|---|---|
| DefaultTtl | TimeSpan | 24h | 1min-7d | Default time-to-live for cache entries |
| MaxTtl | TimeSpan | 7d | 1min-30d | Maximum allowed TTL regardless of request |
| TimeWindowBucket | TimeSpan | 1h | 1min-24h | Time window bucket for VeriKey computation |
| ValkeyKeyPrefix | string | "stellaops:prov:" | | Key prefix for Valkey storage |
| EnableWriteBehind | bool | true | | Enable async Postgres persistence |
| WriteBehindFlushInterval | TimeSpan | 5s | 1s-5min | Interval for flushing write-behind queue |
| WriteBehindMaxBatchSize | int | 100 | 1-10000 | Maximum batch size per flush |
| WriteBehindQueueCapacity | int | 10000 | 100-1M | Max queue capacity (blocks when full) |
| WriteBehindMaxRetries | int | 3 | 0-10 | Retry attempts for failed writes |
| ChunkSize | int | 65536 | 1KB-1MB | Evidence chunk size in bytes |
| MaxChunksPerEntry | int | 1000 | 1-100000 | Max chunks per cache entry |
| AllowCacheBypass | bool | true | | Allow clients to force re-evaluation |
| DigestVersion | string | "v1" | | Serialization version for digests |
| HashAlgorithm | string | "SHA256" | | Hash algorithm for VeriKey/digest |
| EnableValkeyCache | bool | true | | Enable Valkey layer (false = Postgres only) |
| SlidingExpiration | bool | false | | Refresh TTL on cache hits |
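How DefaultTtl and MaxTtl interact can be illustrated as follows. The function names are hypothetical, and the clamping behaviour is inferred from the "regardless of request" wording rather than taken from the implementation.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=24)  # DefaultTtl
MAX_TTL = timedelta(days=7)        # MaxTtl


def effective_ttl(requested=None) -> timedelta:
    """Fall back to the default when no TTL is requested; never exceed the maximum."""
    ttl = DEFAULT_TTL if requested is None else requested
    if ttl <= timedelta(0):
        raise ValueError("TTL must be positive")
    return min(ttl, MAX_TTL)


def expires_at(created_at: datetime, requested=None) -> datetime:
    return created_at + effective_ttl(requested)


created = datetime(2025, 12, 24, 12, 0, tzinfo=timezone.utc)
print(expires_at(created).isoformat())                      # 2025-12-25T12:00:00+00:00
print(expires_at(created, timedelta(days=30)).isoformat())  # 2025-12-31T12:00:00+00:00
```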

appsettings.json Example

{
  "Provcache": {
    "DefaultTtl": "24:00:00",
    "MaxTtl": "7.00:00:00",
    "TimeWindowBucket": "01:00:00",
    "ValkeyKeyPrefix": "stellaops:prov:",
    "EnableWriteBehind": true,
    "WriteBehindFlushInterval": "00:00:05",
    "WriteBehindMaxBatchSize": 100,
    "WriteBehindQueueCapacity": 10000,
    "WriteBehindMaxRetries": 3,
    "ChunkSize": 65536,
    "MaxChunksPerEntry": 1000,
    "AllowCacheBypass": true,
    "DigestVersion": "v1",
    "HashAlgorithm": "SHA256",
    "EnableValkeyCache": true,
    "SlidingExpiration": false
  }
}
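The write-behind settings describe a bounded queue that is drained in batches. A minimal sketch of that behaviour, assuming one initial attempt plus WriteBehindMaxRetries retries per batch and leaving the periodic flush timer out; the class is hypothetical, and what the real queue does with a batch that exhausts its retries is not specified here, so the sketch simply re-raises.

```python
import queue


class WriteBehindQueue:
    def __init__(self, capacity=10000, max_batch_size=100, max_retries=3):
        self._q = queue.Queue(maxsize=capacity)
        self.max_batch_size = max_batch_size
        self.max_retries = max_retries

    def enqueue(self, item, timeout=None):
        # Blocks when the queue is full; raises queue.Full if a timeout elapses.
        self._q.put(item, timeout=timeout)

    def flush_once(self, write_batch) -> int:
        """Drain up to max_batch_size items and hand them to write_batch."""
        batch = []
        while len(batch) < self.max_batch_size:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        if not batch:
            return 0
        for attempt in range(self.max_retries + 1):
            try:
                write_batch(batch)
                return len(batch)
            except Exception:
                if attempt == self.max_retries:
                    raise  # assumption: surface the failure after the last retry
        return 0
```

In a running service flush_once would be invoked every WriteBehindFlushInterval; here it is called by hand so the batching is easy to observe.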

YAML Example (Helm/Kubernetes)

provcache:
  # TTL configuration
  defaultTtl: 24h
  maxTtl: 168h  # 7 days
  timeWindowBucket: 1h

  # Storage
  valkeyKeyPrefix: "stellaops:prov:"
  enableWriteBehind: true
  writeBehindFlushInterval: 5s
  writeBehindMaxBatchSize: 100

  # Evidence chunking
  chunkSize: 65536  # 64 KB
  maxChunksPerEntry: 1000

  # Behavior
  allowCacheBypass: true
  digestVersion: "v1"

Dependency Injection Registration

// In Program.cs or Startup.cs
services.AddProvcache(configuration);

// Or with explicit configuration
services.AddProvcache(options =>
{
    options.DefaultTtl = TimeSpan.FromHours(12);
    options.EnableWriteBehind = true;
    options.WriteBehindMaxBatchSize = 200;
});

Observability

Metrics

| Metric | Type | Description |
|---|---|---|
| provcache_requests_total | Counter | Total cache requests |
| provcache_hits_total | Counter | Cache hits |
| provcache_misses_total | Counter | Cache misses |
| provcache_latency_seconds | Histogram | Operation latency |
| provcache_items_count | Gauge | Current item count |
| provcache_invalidations_total | Counter | Invalidation count |

Alerts

# Low cache hit rate
- alert: ProvcacheLowHitRate
  expr: rate(provcache_hits_total[5m]) / rate(provcache_requests_total[5m]) < 0.5
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Provcache hit rate below 50%"

# High invalidation rate
- alert: ProvcacheHighInvalidationRate
  expr: rate(provcache_invalidations_total[5m]) > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High cache invalidation rate"
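The hit-rate expression divides the growth of the hit counter by the growth of the request counter over the same window. The sketch below works through that arithmetic on two hypothetical counter samples; it is not a replacement for the Prometheus rate() function, which also handles counter resets and extrapolation.

```python
def hit_ratio(prev: dict, curr: dict):
    """Ratio of new hits to new requests between two counter samples."""
    new_requests = curr["requests"] - prev["requests"]
    new_hits = curr["hits"] - prev["hits"]
    if new_requests <= 0:
        return None  # no traffic in the window, ratio undefined
    return new_hits / new_requests


prev = {"requests": 1000, "hits": 700}
curr = {"requests": 1200, "hits": 780}
ratio = hit_ratio(prev, curr)
print(ratio, ratio < 0.5)  # 0.4 True
```

In this example the lifetime hit rate is still 65%, but the window rate is 40%, so the ProvcacheLowHitRate condition would hold for that window.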

Security Considerations

Signer-Aware Caching

The signer_set_hash is part of the VeriKey, ensuring:

  • Key rotation → new cache entries
  • Key revocation → immediate invalidation
  • No stale decisions from compromised signers

Merkle Verification

All evidence chunks are Merkle-verified:

  • ProofRoot = Merkle root of all chunks
  • Individual chunks verifiable without full tree
  • Tamper detection on import
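Verifying a single chunk without the full tree relies on an audit path: the sibling hashes between the leaf and the root. The sketch below assumes a plain binary SHA-256 tree with the last node duplicated on odd levels; the actual proof format used by Provcache is not described here, so all names are illustrative.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(chunks):
    """Return all tree levels, leaves first, root last."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # assumption: pad odd levels by duplication
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def audit_path(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify_chunk(chunk: bytes, path, root: bytes) -> bool:
    """Recompute the root from one chunk and its audit path."""
    node = _h(chunk)
    for sibling, sibling_is_left in path:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


chunks = [b"sbom-part-0", b"sbom-part-1", b"vex-part-0"]
levels = build_levels(chunks)
root = levels[-1][0]
proof = audit_path(levels, 2)
print(verify_chunk(chunks[2], proof, root))    # True
print(verify_chunk(b"tampered", proof, root))  # False
```

The proof for one chunk grows with the logarithm of the chunk count, which is why a lite bundle can defer chunk download and still check each chunk as it arrives.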

Audit Trail

All invalidations are logged to the provcache.prov_revocations table:

SELECT * FROM provcache.prov_revocations
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;

Database Schema

provcache_items

CREATE TABLE provcache.provcache_items (
    verikey         TEXT PRIMARY KEY,
    digest_version  TEXT NOT NULL,
    verdict_hash    TEXT NOT NULL,
    proof_root      TEXT NOT NULL,
    replay_seed     JSONB NOT NULL,
    policy_hash     TEXT NOT NULL,
    signer_set_hash TEXT NOT NULL,
    feed_epoch      TEXT NOT NULL,
    trust_score     INTEGER NOT NULL,
    hit_count       BIGINT DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL,
    expires_at      TIMESTAMPTZ NOT NULL,
    updated_at      TIMESTAMPTZ NOT NULL
);

prov_evidence_chunks

CREATE TABLE provcache.prov_evidence_chunks (
    chunk_id     UUID PRIMARY KEY,
    proof_root   TEXT NOT NULL REFERENCES provcache.provcache_items(proof_root),  -- PostgreSQL requires the referenced column to carry a UNIQUE or PRIMARY KEY constraint
    chunk_index  INTEGER NOT NULL,
    chunk_hash   TEXT NOT NULL,
    blob         BYTEA NOT NULL,
    blob_size    INTEGER NOT NULL,
    content_type TEXT NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL
);

prov_revocations

CREATE TABLE provcache.prov_revocations (
    revocation_id    UUID PRIMARY KEY,
    revocation_type  TEXT NOT NULL,
    target_hash      TEXT NOT NULL,
    reason           TEXT,
    actor            TEXT,
    entries_affected BIGINT NOT NULL,
    created_at       TIMESTAMPTZ NOT NULL
);

Implementation Status

Completed (Sprint 8200.0001.0001)

| Component | Path | Status |
|---|---|---|
| Core Models | src/__Libraries/StellaOps.Provcache/Models/ | Done |
| VeriKeyBuilder | src/__Libraries/StellaOps.Provcache/VeriKeyBuilder.cs | Done |
| DecisionDigest | src/__Libraries/StellaOps.Provcache/DecisionDigest.cs | Done |
| Caching Layer | src/__Libraries/StellaOps.Provcache/Caching/ | Done |
| WriteBehindQueue | src/__Libraries/StellaOps.Provcache/Persistence/ | Done |
| API Endpoints | src/__Libraries/StellaOps.Provcache.Api/ | Done |
| Unit Tests (53) | src/__Libraries/__Tests/StellaOps.Provcache.Tests/ | Done |

Blocked

| Component | Reason |
|---|---|
| Policy Engine Integration | PolicyEvaluator is internal sealed; requires architectural review to expose injection points for IProvcacheService |

Pending

| Component | Sprint |
|---|---|
| Signer Revocation Events | 8200.0001.0002 |
| CLI Export/Import | 8200.0001.0002 |
| UI Badges & Proof Tree | 8200.0001.0003 |
| Grafana Dashboards | 8200.0001.0003 |

Implementation Sprints

| Sprint | Focus | Key Deliverables |
|---|---|---|
| 8200.0001.0001 | Core Backend | VeriKey, DecisionDigest, Valkey+Postgres, API |
| 8200.0001.0002 | Invalidation & Air-Gap | Signer revocation, feed epochs, CLI export/import |
| 8200.0001.0003 | UX & Observability | UI badges, proof tree, Grafana, OCI attestation |