git.stella-ops.org/docs/modules/provcache/README.md
2025-12-24 14:19:46 +02:00

Provcache Module

Provenance Cache — Maximizing Trust Evidence Density

Overview

Provcache is a caching layer that maximizes "provenance density" — the amount of trustworthy evidence retained per byte — enabling faster security decisions, offline replays, and smaller air-gap bundles.

Key Benefits

  • Trust Latency: Warm cache lookups return in single-digit milliseconds
  • Bandwidth Efficiency: Avoid re-fetching bulky SBOMs/attestations
  • Offline Operation: Decisions usable without full SBOM/VEX payloads
  • Audit Transparency: Full evidence chain verifiable via Merkle proofs

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Policy Evaluator                         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐ │
│  │ VeriKey     │───▶│ Provcache   │───▶│ TrustLatticeEngine │ │
│  │ Builder     │    │ Service     │    │ (if cache miss)     │ │
│  └─────────────┘    └─────────────┘    └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
        ┌─────────────────────────────────────────┐
        │            Provcache Store              │
        │  ┌─────────────┐    ┌────────────────┐ │
        │  │   Valkey    │◀──▶│   Postgres     │ │
        │  │ (read-thru) │    │ (write-behind) │ │
        │  └─────────────┘    └────────────────┘ │
        └─────────────────────────────────────────┘
                              │
                              ▼
        ┌─────────────────────────────────────────┐
        │         Evidence Chunk Store            │
        │  ┌─────────────────────────────────────┐│
        │  │  prov_evidence_chunks (Postgres)    ││
        │  │  - Chunked SBOM/VEX/CallGraph       ││
        │  │  - Merkle tree verification         ││
        │  └─────────────────────────────────────┘│
        └─────────────────────────────────────────┘

Core Concepts

VeriKey (Provenance Identity Key)

A composite hash that uniquely identifies a provenance decision context:

VeriKey = SHA256(
    source_hash        ||  // Image/artifact digest
    sbom_hash          ||  // Canonical SBOM hash
    vex_hash_set_hash  ||  // Sorted VEX statement hashes
    merge_policy_hash  ||  // PolicyBundle hash
    signer_set_hash    ||  // Signer certificate hashes
    time_window            // Epoch bucket
)

Why each component?

| Component         | Purpose                                 |
|-------------------|-----------------------------------------|
| source_hash       | Different artifacts → different keys    |
| sbom_hash         | SBOM changes (new packages) → new key   |
| vex_hash_set_hash | VEX updates → new key                   |
| merge_policy_hash | Policy changes → new key                |
| signer_set_hash   | Key rotation → new key (security)       |
| time_window       | Temporal bucketing → controlled expiry  |
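
The composition above can be sketched in a few lines of Python. This is an illustration only: the source gives the component order but not the byte encoding, so the UTF-8 encoding, the length prefix on each field, and the helper names `compute_verikey` and `hash_vex_set` are all assumptions.

```python
import hashlib


def hash_vex_set(vex_statement_hashes):
    """Order-independent hash of a set of VEX statement hashes (sorted first)."""
    joined = "\n".join(sorted(vex_statement_hashes))
    return "sha256:" + hashlib.sha256(joined.encode("utf-8")).hexdigest()


def compute_verikey(source_hash, sbom_hash, vex_hash_set_hash,
                    merge_policy_hash, signer_set_hash, time_window):
    """Hash the six components in the documented order."""
    hasher = hashlib.sha256()
    for part in (source_hash, sbom_hash, vex_hash_set_hash,
                 merge_policy_hash, signer_set_hash, time_window):
        data = str(part).encode("utf-8")
        # Assumption: a 4-byte length prefix keeps field boundaries unambiguous,
        # so ("ab", "c") and ("a", "bc") cannot produce the same input stream.
        hasher.update(len(data).to_bytes(4, "big"))
        hasher.update(data)
    return "sha256:" + hasher.hexdigest()
```

Changing any single component, for example after a signer rotation, yields a different key, which is what lets stale entries fall out of use without an explicit delete.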

DecisionDigest

Canonicalized representation of an evaluation result:

{
  "digestVersion": "v1",
  "veriKey": "sha256:abc123...",
  "verdictHash": "sha256:def456...",
  "proofRoot": "sha256:789abc...",
  "replaySeed": {
    "feedIds": ["cve-2024", "ghsa-2024"],
    "ruleIds": ["default-policy-v2"]
  },
  "trustScore": 85,
  "createdAt": "2025-12-24T12:00:00Z",
  "expiresAt": "2025-12-25T12:00:00Z"
}
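
The source does not say which canonicalization scheme produces the digest, so the following is only a sketch of the general idea: serialize with a fixed key order and no insignificant whitespace, then hash the bytes. The functions `canonical_json` and `digest_of` are hypothetical names, and sorted keys with compact separators are a simplified stand-in for a full scheme such as RFC 8785.

```python
import hashlib
import json


def canonical_json(obj):
    """Serialize with sorted keys and compact separators (simplified canonical form)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest_of(obj):
    """SHA-256 over the canonical bytes, in the 'sha256:<hex>' style used above."""
    return "sha256:" + hashlib.sha256(canonical_json(obj)).hexdigest()
```

With a stable serialization, two evaluations that produce the same logical result hash to the same value regardless of the order in which fields were populated.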

Trust Score

A composite score (0-100) indicating decision confidence:

| Component         | Weight | Calculation                                |
|-------------------|--------|--------------------------------------------|
| Reachability      | 25%    | Call graph coverage, entry points analyzed |
| SBOM Completeness | 20%    | Package count, license data presence       |
| VEX Coverage      | 20%    | Vendor statements, justifications          |
| Policy Freshness  | 15%    | Time since last policy update              |
| Signer Trust      | 20%    | Key age, reputation, chain validity        |
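
As a worked example of the weighting, the sketch below combines five per-component scores into the composite. It assumes each component is itself scored 0-100 and that the result is rounded to the nearest integer; neither detail is stated in the source, and the key names are illustrative.

```python
# Weights in percent, taken from the table above.
WEIGHTS = {
    "reachability": 25,
    "sbom_completeness": 20,
    "vex_coverage": 20,
    "policy_freshness": 15,
    "signer_trust": 20,
}


def trust_score(component_scores):
    """Weighted average of component scores (each assumed to be 0-100)."""
    missing = WEIGHTS.keys() - component_scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    weighted = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    # Integer arithmetic with round-half-up avoids floating-point surprises.
    return (weighted + 50) // 100
```

For instance, scores of 80, 90, 70, 100, and 90 (in table order) give (2000 + 1800 + 1400 + 1500 + 1800) / 100 = 85.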

Evidence Chunks

Large evidence (SBOM, VEX, call graphs) is stored in fixed-size chunks:

  • Default size: 64 KB per chunk
  • Merkle verification: Each chunk is a Merkle leaf
  • Lazy fetch: Only fetch chunks needed for audit
  • LRU eviction: Old chunks evicted under storage pressure

API Reference

Endpoints

| Method | Path                                  | Description                 |
|--------|---------------------------------------|-----------------------------|
| GET    | /v1/provcache/{veriKey}               | Lookup cached decision      |
| POST   | /v1/provcache                         | Store decision (idempotent) |
| POST   | /v1/provcache/invalidate              | Invalidate by pattern       |
| GET    | /v1/proofs/{proofRoot}                | List evidence chunks        |
| GET    | /v1/proofs/{proofRoot}/chunks/{index} | Download chunk              |

Cache Lookup Flow

sequenceDiagram
    participant Client
    participant PolicyEngine
    participant Provcache
    participant Valkey
    participant Postgres
    participant TrustLattice

    Client->>PolicyEngine: Evaluate(artifact)
    PolicyEngine->>Provcache: Get(VeriKey)
    Provcache->>Valkey: GET verikey
    alt Cache Hit
        Valkey-->>Provcache: DecisionDigest
        Provcache-->>PolicyEngine: CacheResult(hit)
        PolicyEngine-->>Client: Decision (cached)
    else Cache Miss
        Valkey-->>Provcache: null
        Provcache->>Postgres: SELECT * FROM provcache_items
        alt DB Hit
            Postgres-->>Provcache: ProvcacheEntry
            Provcache->>Valkey: SET (backfill)
            Provcache-->>PolicyEngine: CacheResult(hit, source=postgres)
        else DB Miss
            Postgres-->>Provcache: null
            Provcache-->>PolicyEngine: CacheResult(miss)
            PolicyEngine->>TrustLattice: Evaluate
            TrustLattice-->>PolicyEngine: EvaluationResult
            PolicyEngine->>Provcache: Set(VeriKey, DecisionDigest)
            Provcache->>Valkey: SET
            Provcache->>Postgres: INSERT (async)
            PolicyEngine-->>Client: Decision (computed)
        end
    end
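
The same flow can be sketched in Python with two dictionaries standing in for Valkey and Postgres. The class `ProvcacheSketch` and the function `decide` are hypothetical names for illustration; in particular, the real Postgres write is asynchronous (write-behind), while this sketch writes synchronously to stay short.

```python
class ProvcacheSketch:
    """Two-tier lookup: fast tier first, durable tier second, with backfill."""

    def __init__(self):
        self.valkey = {}    # stand-in for the Valkey tier
        self.postgres = {}  # stand-in for the provcache_items table

    def get(self, verikey):
        digest = self.valkey.get(verikey)
        if digest is not None:
            return digest, "valkey"
        digest = self.postgres.get(verikey)
        if digest is not None:
            self.valkey[verikey] = digest  # backfill the fast tier
            return digest, "postgres"
        return None, "miss"

    def set(self, verikey, digest):
        self.valkey[verikey] = digest
        self.postgres[verikey] = digest  # asynchronous in the real service


def decide(cache, verikey, evaluate):
    """Return (digest, source); call evaluate(verikey) only on a full miss."""
    digest, source = cache.get(verikey)
    if digest is not None:
        return digest, source
    digest = evaluate(verikey)
    cache.set(verikey, digest)
    return digest, "computed"
```

The expensive evaluation runs at most once per VeriKey as long as either tier still holds the entry.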

Invalidation

Automatic Invalidation Triggers

| Trigger            | Event                  | Scope                                     |
|--------------------|------------------------|-------------------------------------------|
| Signer Revocation  | SignerRevokedEvent     | All entries with matching signer_set_hash |
| Feed Epoch Advance | FeedEpochAdvancedEvent | Entries with older feed_epoch             |
| Policy Update      | PolicyUpdatedEvent     | Entries with matching policy_hash         |
| TTL Expiry         | Background job         | Entries past expires_at                   |

Manual Invalidation

# Invalidate by signer
POST /v1/provcache/invalidate
{
  "by": "signer_set_hash",
  "value": "sha256:revoked-signer...",
  "reason": "key-compromise"
}

# Invalidate by policy
POST /v1/provcache/invalidate
{
  "by": "policy_hash",
  "value": "sha256:old-policy...",
  "reason": "policy-update"
}
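
To make the effect of these requests concrete, here is a sketch of what the handler might do against in-memory data. It assumes that `by` names a column of `provcache_items` and that each call appends one audit row shaped like `prov_revocations`; the function name `invalidate` and the mapping of `by` to `revocation_type` are illustrative guesses, not the service's actual implementation.

```python
import uuid
from datetime import datetime, timezone

# Fields shown in the request examples above.
ALLOWED_FIELDS = {"signer_set_hash", "policy_hash"}


def invalidate(items, revocations, by, value, reason=None, actor=None):
    """Remove matching entries and append one audit record; return the count."""
    if by not in ALLOWED_FIELDS:
        raise ValueError(f"unsupported invalidation field: {by}")
    doomed = [key for key, row in items.items() if row.get(by) == value]
    for key in doomed:
        del items[key]
    revocations.append({
        "revocation_id": str(uuid.uuid4()),
        "revocation_type": by,
        "target_hash": value,
        "reason": reason,
        "actor": actor,
        "entries_affected": len(doomed),
        "created_at": datetime.now(timezone.utc),
    })
    return len(doomed)
```

An audit row is written even when no entries match, so a revocation that affected nothing is still visible later.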

Air-Gap Integration

Export Workflow

# Export minimal proof (digest only)
stella prov export --verikey sha256:abc123 --density lite

# Export with evidence chunks
stella prov export --verikey sha256:abc123 --density standard

# Export full evidence
stella prov export --verikey sha256:abc123 --density strict --sign

Import Workflow

# Import and verify Merkle root
stella prov import --input proof.bundle

# Import with lazy chunk fetch
stella prov import --input proof-lite.json --lazy-fetch --backend https://api.stellaops.com

Density Levels

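| Level    | Contents                   | Size     | Use Case           |
|----------|----------------------------|----------|--------------------|
| lite     | DecisionDigest + ProofRoot | ~2 KB    | Quick verification |
| standard | + First N chunks           | ~200 KB  | Normal audit       |
| strict   | + All chunks               | Variable | Full compliance    |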
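
The three levels differ only in how many evidence chunks travel with the digest, which the sketch below illustrates. The bundle layout, the function name `build_bundle`, and the default of three chunks for `standard` (roughly 200 KB at 64 KB per chunk) are assumptions; the source does not define N or the on-disk format.

```python
def build_bundle(decision_digest, chunks, density, first_n=3):
    """Select the chunks to ship for a given density level."""
    if density == "lite":
        selected = []
    elif density == "standard":
        selected = list(chunks[:first_n])  # first_n is an assumed default
    elif density == "strict":
        selected = list(chunks)
    else:
        raise ValueError(f"unknown density level: {density}")
    return {
        "decisionDigest": decision_digest,
        "proofRoot": decision_digest["proofRoot"],
        "chunks": selected,
    }
```

A `lite` bundle can still be verified later because the proof root it carries commits to every chunk, including the ones left behind.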

Configuration

provcache:
  # TTL configuration
  defaultTtl: 24h
  maxTtl: 168h  # 7 days
  timeWindowBucket: 1h

  # Storage
  valkeyKeyPrefix: "stellaops:prov:"
  enableWriteBehind: true
  writeBehindFlushInterval: 5s
  writeBehindMaxBatchSize: 100

  # Evidence chunking
  chunkSize: 65536  # 64 KB
  maxChunksPerEntry: 1000

  # Behavior
  allowCacheBypass: true
  digestVersion: "v1"
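
The TTL and bucket settings interact as follows, sketched under two assumptions the source does not confirm: the `time_window` component is the start of the bucket in epoch seconds, and a requested TTL above `maxTtl` is clamped rather than rejected. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(hours=24)         # defaultTtl
MAX_TTL = timedelta(hours=168)            # maxTtl (7 days)
TIME_WINDOW_BUCKET = timedelta(hours=1)   # timeWindowBucket


def time_window(at, bucket=TIME_WINDOW_BUCKET):
    """Start of the bucket containing `at`, as whole epoch seconds."""
    bucket_seconds = int(bucket.total_seconds())
    epoch = int(at.timestamp())
    return epoch - epoch % bucket_seconds


def expires_at(created_at, requested_ttl=None):
    """Apply the default TTL, or clamp a requested TTL to the maximum."""
    ttl = DEFAULT_TTL if requested_ttl is None else min(requested_ttl, MAX_TTL)
    return created_at + ttl
```

With a one-hour bucket, two evaluations of the same inputs at 12:05 and 12:55 share a VeriKey, while one at 13:00 gets a new key.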

Observability

Metrics

| Metric                        | Type      | Description          |
|-------------------------------|-----------|----------------------|
| provcache_requests_total      | Counter   | Total cache requests |
| provcache_hits_total          | Counter   | Cache hits           |
| provcache_misses_total        | Counter   | Cache misses         |
| provcache_latency_seconds     | Histogram | Operation latency    |
| provcache_items_count         | Gauge     | Current item count   |
| provcache_invalidations_total | Counter   | Invalidation count   |

Alerts

# Low cache hit rate
- alert: ProvcacheLowHitRate
  expr: rate(provcache_hits_total[5m]) / rate(provcache_requests_total[5m]) < 0.5
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Provcache hit rate below 50%"

# High invalidation rate
- alert: ProvcacheHighInvalidationRate
  expr: rate(provcache_invalidations_total[5m]) > 100
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High cache invalidation rate"

Security Considerations

Signer-Aware Caching

The signer_set_hash is part of the VeriKey, ensuring:

  • Key rotation → new cache entries
  • Key revocation → immediate invalidation
  • No stale decisions from compromised signers

Merkle Verification

All evidence chunks are Merkle-verified:

  • ProofRoot = Merkle root of all chunks
  • Individual chunks verifiable without full tree
  • Tamper detection on import
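
Verifying one chunk without the full tree means checking an inclusion proof: the sibling hashes along the path from the leaf to the root. The sketch below assumes a binary SHA-256 tree that pads odd levels by repeating the last node; the source does not specify the tree layout, and the function names are illustrative.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def build_levels(chunks):
    """Return all tree levels, leaves first, padding odd levels with the last node."""
    level = [_h(chunk) for chunk in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_chunk(chunk, proof, root):
    """Recompute the root from one chunk and its proof."""
    node = _h(chunk)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

A verifier holding only the proof root, one chunk, and a proof of logarithmic length can detect tampering without downloading the remaining chunks.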

Audit Trail

All invalidations are logged to the provcache.prov_revocations table:

SELECT * FROM provcache.prov_revocations
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;

Database Schema

provcache_items

CREATE TABLE provcache.provcache_items (
    verikey         TEXT PRIMARY KEY,
    digest_version  TEXT NOT NULL,
    verdict_hash    TEXT NOT NULL,
    proof_root      TEXT NOT NULL,
    replay_seed     JSONB NOT NULL,
    policy_hash     TEXT NOT NULL,
    signer_set_hash TEXT NOT NULL,
    feed_epoch      TEXT NOT NULL,
    trust_score     INTEGER NOT NULL,
    hit_count       BIGINT DEFAULT 0,
    created_at      TIMESTAMPTZ NOT NULL,
    expires_at      TIMESTAMPTZ NOT NULL,
    updated_at      TIMESTAMPTZ NOT NULL
);

prov_evidence_chunks

CREATE TABLE provcache.prov_evidence_chunks (
    chunk_id     UUID PRIMARY KEY,
    -- Note: PostgreSQL only accepts this foreign key if proof_root carries a
    -- UNIQUE constraint (or is the primary key) in provcache_items.
    proof_root   TEXT NOT NULL REFERENCES provcache.provcache_items(proof_root),
    chunk_index  INTEGER NOT NULL,
    chunk_hash   TEXT NOT NULL,
    blob         BYTEA NOT NULL,
    blob_size    INTEGER NOT NULL,
    content_type TEXT NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL
);

prov_revocations

CREATE TABLE provcache.prov_revocations (
    revocation_id    UUID PRIMARY KEY,
    revocation_type  TEXT NOT NULL,
    target_hash      TEXT NOT NULL,
    reason           TEXT,
    actor            TEXT,
    entries_affected BIGINT NOT NULL,
    created_at       TIMESTAMPTZ NOT NULL
);

Implementation Sprints

| Sprint         | Focus                  | Key Deliverables                                    |
|----------------|------------------------|-----------------------------------------------------|
| 8200.0001.0001 | Core Backend           | VeriKey, DecisionDigest, Valkey+Postgres, API       |
| 8200.0001.0002 | Invalidation & Air-Gap | Signer revocation, feed epochs, CLI export/import   |
| 8200.0001.0003 | UX & Observability     | UI badges, proof tree, Grafana, OCI attestation     |