git.stella-ops.org/docs/modules/provcache/architecture.md
2025-12-25 23:10:09 +02:00

Provcache Architecture Guide

Detailed architecture documentation for the Provenance Cache module

Overview

Provcache is a caching layer that maximizes "provenance density": the amount of trustworthy evidence retained per byte. This document covers its internal architecture, invalidation mechanisms, air-gap support, and replay capabilities.

Table of Contents

  1. Cache Architecture
  2. Invalidation Mechanisms
  3. Evidence Chunk Storage
  4. Air-Gap Export/Import
  5. Lazy Evidence Fetching
  6. Revocation Ledger
  7. API Reference

Cache Architecture

Storage Layers

┌───────────────────────────────────────────────────────────────┐
│                    Application Layer                          │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────┐   │
│  │ VeriKey     │───▶│ Provcache   │───▶│ Policy Engine   │   │
│  │ Builder     │    │ Service     │    │ (cache miss)    │   │
│  └─────────────┘    └─────────────┘    └─────────────────┘   │
└───────────────────────────────────────────────────────────────┘
                             │
                             ▼
┌───────────────────────────────────────────────────────────────┐
│                     Caching Layer                             │
│  ┌─────────────────┐         ┌──────────────────────────┐    │
│  │     Valkey      │◀───────▶│      PostgreSQL          │    │
│  │  (read-through) │         │   (write-behind queue)   │    │
│  │                 │         │                          │    │
│  │  • Hot cache    │         │  • provcache_items       │    │
│  │  • Sub-ms reads │         │  • prov_evidence_chunks  │    │
│  │  • TTL-based    │         │  • prov_revocations      │    │
│  └─────────────────┘         └──────────────────────────┘    │
└───────────────────────────────────────────────────────────────┘

Key Components

| Component | Purpose |
|-----------|---------|
| IProvcacheService | Main service interface for cache operations |
| IProvcacheStore | Storage abstraction (Valkey + Postgres) |
| WriteBehindQueue | Async persistence to Postgres |
| IEvidenceChunker | Splits large evidence into Merkle-verified chunks |
| IRevocationLedger | Audit trail for all invalidation events |
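
The interplay between the hot cache and the write-behind queue can be sketched as follows. This is an in-memory illustration only: the dictionaries stand in for Valkey and PostgreSQL, and every name in it (hot, durable, pending, Get, Set, Flush) is hypothetical rather than part of the Provcache API.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins: 'hot' plays Valkey, 'durable' plays PostgreSQL.
var hot = new Dictionary<string, string>();
var durable = new Dictionary<string, string>();
var pending = new Queue<KeyValuePair<string, string>>(); // write-behind queue

// Read-through: serve from the hot cache, otherwise fall back to the
// durable store and repopulate the hot cache on the way out.
string? Get(string veriKey)
{
    if (hot.TryGetValue(veriKey, out var cached)) return cached;
    if (durable.TryGetValue(veriKey, out var stored))
    {
        hot[veriKey] = stored;
        return stored;
    }
    return null; // full miss: the caller would consult the policy engine
}

// Write-behind: update the hot cache immediately and defer persistence.
void Set(string veriKey, string verdict)
{
    hot[veriKey] = verdict;
    pending.Enqueue(new KeyValuePair<string, string>(veriKey, verdict));
}

// Drain the queue into the durable store; returns the number of writes flushed.
int Flush()
{
    int flushed = 0;
    while (pending.TryDequeue(out var item))
    {
        durable[item.Key] = item.Value;
        flushed++;
    }
    return flushed;
}

Set("sha256:abc123", "verdict-1");
Console.WriteLine(durable.ContainsKey("sha256:abc123")); // False: not persisted yet
Console.WriteLine(Flush());                               // 1
hot.Clear();                                              // simulate TTL expiry
Console.WriteLine(Get("sha256:abc123"));                  // verdict-1 (read-through)
```

The trade-off this illustrates is that a write is visible to readers before it is durable, so a crash between Set and Flush would lose the queued write.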

Invalidation Mechanisms

Provcache supports multiple invalidation triggers to ensure cache consistency when upstream data changes.

Automatic Invalidation

1. Signer Revocation

When a signing key is compromised or rotated:

┌─────────────┐     SignerRevokedEvent     ┌──────────────────┐
│  Authority  │ ──────────────────────────▶│ SignerSet        │
│  Module     │                            │ Invalidator      │
└─────────────┘                            └────────┬─────────┘
                                                    │
                                                    ▼
                                           DELETE FROM provcache_items
                                           WHERE signer_set_hash = ?

Implementation: SignerSetInvalidator subscribes to SignerRevokedEvent and invalidates every entry whose signer_set_hash matches the signer set containing the revoked key.

2. Feed Epoch Advancement

When vulnerability feeds are updated:

┌─────────────┐    FeedEpochAdvancedEvent   ┌──────────────────┐
│  Concelier  │ ───────────────────────────▶│ FeedEpoch        │
│  Module     │                             │ Invalidator      │
└─────────────┘                             └────────┬─────────┘
                                                     │
                                                     ▼
                                            DELETE FROM provcache_items
                                            WHERE feed_epoch < ?

Implementation: FeedEpochInvalidator compares epochs using semantic versioning or ISO timestamps.
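
A minimal sketch of such an epoch comparison is shown below, assuming .NET 6 or later. CompareEpochs is a hypothetical helper, not the FeedEpochInvalidator implementation. It handles plain dotted numeric versions through System.Version and does not handle semantic-version pre-release or build suffixes. It also assumes both epochs use the same scheme.

```csharp
using System;
using System.Globalization;

// Returns a negative value, zero, or a positive value, like IComparable.CompareTo.
int CompareEpochs(string left, string right)
{
    // Dotted numeric versions first: System.Version compares component by
    // component, so 1.10.0 sorts after 1.2.0 (a string comparison would not).
    if (Version.TryParse(left, out var lv) && Version.TryParse(right, out var rv))
        return lv.CompareTo(rv);

    // Otherwise treat both values as ISO 8601 timestamps.
    if (DateTimeOffset.TryParse(left, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal, out var lt) &&
        DateTimeOffset.TryParse(right, CultureInfo.InvariantCulture, DateTimeStyles.AssumeUniversal, out var rt))
        return lt.CompareTo(rt);

    throw new FormatException($"Epochs '{left}' and '{right}' are not comparable.");
}

// An entry is stale when its epoch sorts before the new one (feed_epoch < ?).
Console.WriteLine(CompareEpochs("1.2.0", "1.10.0") < 0);
Console.WriteLine(CompareEpochs("2025-01-15T10:30:00Z", "2025-01-14T00:00:00Z") > 0);
```

Both lines print True.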

3. Policy Updates

When policy bundles change:

┌─────────────┐      PolicyUpdatedEvent     ┌──────────────────┐
│   Policy    │ ───────────────────────────▶│ PolicyHash       │
│   Engine    │                             │ Invalidator      │
└─────────────┘                             └────────┬─────────┘
                                                     │
                                                     ▼
                                            DELETE FROM provcache_items
                                            WHERE policy_hash = ?

Invalidation Recording

All invalidation events are recorded in the revocation ledger for audit and replay:

public interface IProvcacheInvalidator
{
    Task<int> InvalidateAsync(
        InvalidationCriteria criteria,
        string reason,
        string? correlationId = null,
        CancellationToken cancellationToken = default);
}

The ledger entry includes:

  • Revocation type (signer, feed_epoch, policy, explicit)
  • The revoked key (for example, a signer set hash, feed epoch, or policy hash)
  • Number of entries invalidated
  • Timestamp and correlation ID for tracing
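
The pairing of invalidation and ledger recording can be illustrated with a small in-memory sketch. The lists below are hypothetical stand-ins for provcache_items and the revocation ledger, and InvalidateBySignerSet is an illustrative helper, not an implementation of IProvcacheInvalidator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-ins for provcache_items and the ledger.
var items = new List<(string VeriKey, string SignerSetHash)>
{
    ("sha256:aaa", "signers-1"),
    ("sha256:bbb", "signers-1"),
    ("sha256:ccc", "signers-2"),
};
var ledger = new List<(string Type, string RevokedKey, int EntriesInvalidated,
                       DateTimeOffset RevokedAt, string CorrelationId)>();

// Remove every matching entry, then record what happened. The ledger entry is
// written even when nothing matched, so the event itself stays auditable.
int InvalidateBySignerSet(string signerSetHash, string correlationId)
{
    int removed = items.RemoveAll(item => item.SignerSetHash == signerSetHash);
    ledger.Add(("signer", signerSetHash, removed, DateTimeOffset.UtcNow, correlationId));
    return removed;
}

int count = InvalidateBySignerSet("signers-1", "corr-42");
Console.WriteLine($"{count} invalidated, {items.Count} remaining, {ledger.Count} ledger entry");
// Prints: 2 invalidated, 1 remaining, 1 ledger entry
```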

Evidence Chunk Storage

Large evidence (SBOMs, VEX documents, call graphs) is split into fixed-size chunks (set by ChunkSize, 65536 bytes in the sample configuration) and verified with a Merkle tree.

Chunking Process

┌─────────────────────────────────────────────────────────────────┐
│                    Original Evidence                             │
│    [ 2.3 MB SPDX SBOM JSON ]                                    │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ IEvidenceChunker.ChunkAsync()
┌─────────────────────────────────────────────────────────────────┐
│  Chunk 0 (64KB)  │  Chunk 1 (64KB)  │ ... │  Chunk N (partial)  │
│  hash: abc123    │  hash: def456    │     │  hash: xyz789       │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ Merkle tree construction
┌─────────────────────────────────────────────────────────────────┐
│                        Proof Root                                │
│              sha256:merkle_root_of_all_chunks                    │
└─────────────────────────────────────────────────────────────────┘
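
The chunk-then-hash flow above can be sketched as follows, assuming .NET 6 or later. The tree construction is an assumption made for illustration: sibling hashes are concatenated and hashed with SHA-256, and an unpaired node is promoted unchanged. The actual IEvidenceChunker may pair, pad, or domain-separate nodes differently, and SplitIntoChunks and ComputeMerkleRoot are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Split the payload into fixed-size chunks; the last chunk may be shorter.
List<byte[]> SplitIntoChunks(byte[] data, int chunkSize)
{
    var result = new List<byte[]>();
    for (int offset = 0; offset < data.Length; offset += chunkSize)
    {
        int length = Math.Min(chunkSize, data.Length - offset);
        result.Add(data.AsSpan(offset, length).ToArray());
    }
    return result;
}

// Fold leaf hashes pairwise until a single root remains.
byte[] ComputeMerkleRoot(IEnumerable<byte[]> leaves)
{
    var level = leaves.ToList();
    if (level.Count == 0) throw new ArgumentException("At least one leaf is required.");
    while (level.Count > 1)
    {
        var next = new List<byte[]>();
        for (int i = 0; i < level.Count; i += 2)
        {
            next.Add(i + 1 < level.Count
                ? SHA256.HashData(level[i].Concat(level[i + 1]).ToArray())
                : level[i]); // unpaired node is promoted unchanged (assumption)
        }
        level = next;
    }
    return level[0];
}

var evidence = new byte[150_000];
new Random(42).NextBytes(evidence);

var chunks = SplitIntoChunks(evidence, 65_536);
var leafHashes = chunks.Select(c => SHA256.HashData(c)).ToList();
string proofRoot = "sha256:" + Convert.ToHexString(ComputeMerkleRoot(leafHashes)).ToLowerInvariant();

Console.WriteLine($"{chunks.Count} chunks, last chunk {chunks[^1].Length} bytes");
// Prints: 3 chunks, last chunk 18928 bytes
Console.WriteLine(proofRoot);
```

Because every chunk hash feeds into the root, changing any single byte of the evidence changes the proof root.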

Database Schema

CREATE TABLE provcache.prov_evidence_chunks (
    chunk_id        UUID PRIMARY KEY,
    proof_root      VARCHAR(128) NOT NULL,
    chunk_index     INTEGER NOT NULL,
    chunk_hash      VARCHAR(128) NOT NULL,
    blob            BYTEA NOT NULL,
    blob_size       INTEGER NOT NULL,
    content_type    VARCHAR(64) NOT NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    
    CONSTRAINT uk_proof_chunk UNIQUE (proof_root, chunk_index)
);

CREATE INDEX idx_evidence_proof_root ON provcache.prov_evidence_chunks(proof_root);

Paging API

Evidence can be retrieved in pages to manage memory:

GET /api/v1/proofs/{proofRoot}?page=0&pageSize=10

The response includes chunk metadata without blob data, so clients can fetch specific chunks on demand.
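
As a rough sketch of the paging arithmetic, the helper below maps a zero-based page number to the window of chunk indices it would cover. It assumes that page and pageSize count chunk metadata entries, which the request above suggests but does not state. ChunkWindow is a hypothetical name.

```csharp
using System;

// Zero-based page -> (first chunk index, number of chunks on that page).
// Count is 0 when the page lies past the end of the manifest.
(int First, int Count) ChunkWindow(int page, int pageSize, int totalChunks)
{
    int first = page * pageSize;
    int count = Math.Clamp(totalChunks - first, 0, pageSize);
    return (first, count);
}

int totalPages = (42 + 10 - 1) / 10;            // ceiling division
Console.WriteLine(totalPages);                   // 5
Console.WriteLine(ChunkWindow(0, 10, 42));       // (0, 10)
Console.WriteLine(ChunkWindow(4, 10, 42));       // (40, 2)
Console.WriteLine(ChunkWindow(5, 10, 42).Count); // 0
```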


Air-Gap Export/Import

Provcache supports air-gapped environments through minimal proof bundles.

Bundle Format (v1)

{
    "version": "v1",
    "exportedAt": "2025-01-15T10:30:00Z",
    "density": "standard",
    "digest": {
        "veriKey": "sha256:...",
        "verdictHash": "sha256:...",
        "proofRoot": "sha256:...",
        "trustScore": 85
    },
    "manifest": {
        "proofRoot": "sha256:...",
        "totalChunks": 42,
        "totalSize": 2752512,
        "chunks": [...]
    },
    "chunks": [
        {
            "index": 0,
            "data": "base64...",
            "hash": "sha256:..."
        }
    ],
    "signature": {
        "algorithm": "ECDSA-P256",
        "signature": "base64...",
        "signedAt": "2025-01-15T10:30:01Z"
    }
}

Density Levels

| Level | Contents | Typical Size | Use Case |
|-------|----------|--------------|----------|
| Lite | Digest + ProofRoot + Manifest | ~2 KB | Quick verification, requires lazy fetch for full evidence |
| Standard | + First 10% of chunks | ~200 KB | Normal audits, balance of size vs completeness |
| Strict | + All chunks | Variable | Full compliance, no network needed |
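
The density levels translate into a number of embedded chunks roughly as sketched below. ChunksToInclude is a hypothetical helper, and rounding the 10% share up is an assumption. The exporter may round differently.

```csharp
using System;

// Number of chunks embedded in a bundle for a given density level.
int ChunksToInclude(string density, int totalChunks) => density switch
{
    "lite" => 0,                          // manifest only
    "standard" => (totalChunks + 9) / 10, // ceil(10%), assumed rounding
    "strict" => totalChunks,              // everything
    _ => throw new ArgumentException($"Unknown density '{density}'.", nameof(density)),
};

foreach (var level in new[] { "lite", "standard", "strict" })
    Console.WriteLine($"{level}: {ChunksToInclude(level, 42)} of 42 chunks");
// Prints:
// lite: 0 of 42 chunks
// standard: 5 of 42 chunks
// strict: 42 of 42 chunks
```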

Export Example

var exporter = serviceProvider.GetRequiredService<IMinimalProofExporter>();

// Lite export (manifest only)
var liteBundle = await exporter.ExportAsync(
    veriKey: "sha256:abc123",
    new MinimalProofExportOptions { Density = ProofDensity.Lite });

// Signed strict export
var strictBundle = await exporter.ExportAsync(
    veriKey: "sha256:abc123",
    new MinimalProofExportOptions 
    { 
        Density = ProofDensity.Strict,
        SignBundle = true,
        Signer = signerInstance
    });

Import and Verification

var result = await exporter.ImportAsync(bundle);

if (result.DigestVerified && result.ChunksVerified)
{
    // Digest and chunk hashes verified; safe to store the entry
    await provcache.UpsertAsync(result.Entry);
}
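
Chunk verification during import can be sketched as below, assuming .NET 6 or later. The sketch assumes that a chunk's hash field is "sha256:" followed by the hex SHA-256 digest of the base64-decoded data, which matches the bundle format shown earlier but is not confirmed by it. VerifyChunk is a hypothetical helper.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// True when the decoded chunk data hashes to the expected "sha256:<hex>" value.
bool VerifyChunk(string base64Data, string expectedHash)
{
    const string prefix = "sha256:";
    if (!expectedHash.StartsWith(prefix, StringComparison.Ordinal)) return false;
    try
    {
        byte[] data = Convert.FromBase64String(base64Data);
        byte[] expected = Convert.FromHexString(expectedHash.Substring(prefix.Length));
        return CryptographicOperations.FixedTimeEquals(SHA256.HashData(data), expected);
    }
    catch (FormatException)
    {
        return false; // malformed base64 or hex counts as a failed check
    }
}

byte[] payload = Encoding.UTF8.GetBytes("example chunk payload");
string encoded = Convert.ToBase64String(payload);
string hash = "sha256:" + Convert.ToHexString(SHA256.HashData(payload)).ToLowerInvariant();
string forged = Convert.ToBase64String(Encoding.UTF8.GetBytes("tampered payload"));

Console.WriteLine(VerifyChunk(encoded, hash)); // True
Console.WriteLine(VerifyChunk(forged, hash));  // False
```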

Lazy Evidence Fetching

For lite bundles, missing chunks can be fetched on demand, either from a connected backend over HTTP or from files on local media.

Fetcher Architecture

┌──────────────────────┐
│ ILazyEvidenceFetcher │
└─────────┬────────────┘
          │
    ┌─────┴─────┐
    │           │
    ▼           ▼
┌─────────┐  ┌──────────┐
│  HTTP   │  │   File   │
│ Fetcher │  │ Fetcher  │
└─────────┘  └──────────┘

HTTP Fetcher (Connected Mode)

var fetcher = new HttpChunkFetcher(
    new Uri("https://api.stellaops.com"),
    logger);

var orchestrator = new LazyFetchOrchestrator(repository, logger);

var result = await orchestrator.FetchAndStoreAsync(
    proofRoot: "sha256:...",
    fetcher,
    new LazyFetchOptions
    {
        VerifyOnFetch = true,
        BatchSize = 100
    });

File Fetcher (Sneakernet Mode)

For fully air-gapped environments:

  1. Export full evidence to USB drive
  2. Transport to isolated network
  3. Import using file fetcher

var fetcher = new FileChunkFetcher(
    basePath: "/mnt/usb/evidence",
    logger);

var result = await orchestrator.FetchAndStoreAsync(proofRoot, fetcher);
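
Whichever fetcher is used, the orchestration step amounts to working out which chunks are still missing and requesting them in batches. A minimal sketch of that bookkeeping follows, assuming .NET 6 or later. MissingChunkIndices is a hypothetical helper, and the batch size of 10 is chosen only to keep the output short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Indices listed in the manifest (0..totalChunks-1) that are not stored locally.
List<int> MissingChunkIndices(int totalChunks, IEnumerable<int> storedIndices)
{
    var stored = new HashSet<int>(storedIndices);
    return Enumerable.Range(0, totalChunks).Where(i => !stored.Contains(i)).ToList();
}

// A bundle that shipped chunks 0..4 of a 42-chunk proof.
var missing = MissingChunkIndices(42, Enumerable.Range(0, 5));
var batches = missing.Chunk(10).ToList(); // at most 10 chunks per request

Console.WriteLine($"{missing.Count} chunks missing, fetched in {batches.Count} batches");
// Prints: 37 chunks missing, fetched in 4 batches
Console.WriteLine($"last batch: {string.Join(",", batches[^1])}");
// Prints: last batch: 35,36,37,38,39,40,41
```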

Revocation Ledger

The revocation ledger provides a complete audit trail of all invalidation events.

Schema

CREATE TABLE provcache.prov_revocations (
    seq_no              BIGSERIAL PRIMARY KEY,
    revocation_id       UUID NOT NULL,
    revocation_type     VARCHAR(32) NOT NULL,
    revoked_key         VARCHAR(512) NOT NULL,
    reason              VARCHAR(1024),
    entries_invalidated INTEGER NOT NULL,
    source              VARCHAR(128) NOT NULL,
    correlation_id      VARCHAR(128),
    revoked_at          TIMESTAMPTZ NOT NULL,
    metadata            JSONB
);

Replay for Catch-Up

After a restart or network partition, a node can replay the revocations it missed:

var replayService = serviceProvider.GetRequiredService<IRevocationReplayService>();

// Get last checkpoint
var checkpoint = await replayService.GetCheckpointAsync();

// Replay from checkpoint
var result = await replayService.ReplayFromAsync(
    sinceSeqNo: checkpoint,
    new RevocationReplayOptions
    {
        BatchSize = 1000,
        SaveCheckpointPerBatch = true
    });

Console.WriteLine($"Replayed {result.EntriesReplayed} revocations, {result.TotalInvalidations} entries invalidated");
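
The checkpointed replay loop behind this call can be sketched as follows. The in-memory list is a hypothetical stand-in for prov_revocations ordered by seq_no, and Replay is an illustrative helper, not the IRevocationReplayService implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical ledger rows; a real node would query prov_revocations instead.
var ledgerRows = Enumerable.Range(1, 25)
    .Select(n => (SeqNo: (long)n, RevokedKey: $"key-{n}"))
    .ToList();
var applied = new List<string>();
long checkpoint = 10; // highest seq_no this node has already applied

// Replay everything after the checkpoint in batches, advancing the checkpoint
// after each batch so that a crash resumes from the last completed batch.
int Replay(int batchSize)
{
    int replayed = 0;
    while (true)
    {
        var batch = ledgerRows
            .Where(r => r.SeqNo > checkpoint)
            .OrderBy(r => r.SeqNo)
            .Take(batchSize)
            .ToList();
        if (batch.Count == 0) return replayed;

        foreach (var row in batch)
        {
            applied.Add(row.RevokedKey); // a real node would invalidate cache entries here
            replayed++;
        }
        checkpoint = batch[^1].SeqNo;
    }
}

int replayedCount = Replay(batchSize: 10);
Console.WriteLine($"Replayed {replayedCount} revocations, checkpoint now {checkpoint}");
// Prints: Replayed 15 revocations, checkpoint now 25
```

Because the checkpoint only moves forward after a batch completes, replaying the same range twice is harmless as long as applying a revocation is idempotent.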

Statistics

var ledger = serviceProvider.GetRequiredService<IRevocationLedger>();
var stats = await ledger.GetStatsAsync();

// stats.TotalEntries - total revocation events
// stats.EntriesByType - breakdown by type (signer, feed_epoch, etc.)
// stats.TotalEntriesInvalidated - sum of all invalidated cache entries

API Reference

Evidence Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/v1/proofs/{proofRoot} | GET | Get paged evidence chunks |
| /api/v1/proofs/{proofRoot}/manifest | GET | Get chunk manifest |
| /api/v1/proofs/{proofRoot}/chunks/{index} | GET | Get specific chunk |
| /api/v1/proofs/{proofRoot}/verify | POST | Verify Merkle proof |

Invalidation Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/v1/provcache/invalidate | POST | Manual invalidation |
| /api/v1/provcache/revocations | GET | List revocation history |
| /api/v1/provcache/stats | GET | Cache statistics |

CLI Commands

# Export commands
stella prov export --verikey <key> --density <lite|standard|strict> [--output <file>] [--sign]

# Import commands
stella prov import <file> [--lazy-fetch] [--backend <url>] [--chunks-dir <path>]

# Verify commands
stella prov verify <file> [--signer-cert <cert>]

Configuration

Key settings in appsettings.json:

{
    "Provcache": {
        "ChunkSize": 65536,
        "MaxChunksPerEntry": 1000,
        "DefaultTtl": "24:00:00",
        "EnableWriteBehind": true,
        "WriteBehindFlushInterval": "00:00:05"
    }
}

See README.md for the full configuration reference.