git.stella-ops.org/docs-archived/product-advisories/18-Dec-2025/18-Dec-2025 - Designing a Layered EPSS v4 Database.md
2026-01-05 16:02:11 +02:00


ARCHIVED ADVISORY

Archived: 2025-12-18
Status: IMPLEMENTED
Analysis: Plan file C:\Users\vlindos\.claude\plans\quizzical-hugging-hearth.md

Implementation Summary

This advisory was analyzed and merged into the existing EPSS implementation plan:

  • Master Plan: IMPL_3410_epss_v4_integration_master_plan.md updated with raw + signal layer schemas
  • Sprint: SPRINT_3413_0001_0001_epss_live_enrichment.md created with 30 tasks (original 14 + 16 from advisory)
  • Migrations Created:
    • 011_epss_raw_layer.sql - Full JSONB payload storage (~5GB/year)
    • 012_epss_signal_layer.sql - Tenant-scoped signals with dedupe_key and explain_hash

Gap Analysis Result

| Advisory Proposal | Decision | Rationale |
|---|---|---|
| Raw feed layer (Layer 1) | IMPLEMENTED | Full JSONB storage for deterministic replay |
| Normalized layer (Layer 2) | ALIGNED | Already existed in IMPL_3410 |
| Signal-ready layer (Layer 3) | IMPLEMENTED | Tenant-scoped signals, model change detection |
| Multi-model support | DEFERRED | No customer demand |
| Meta-predictor training | SKIPPED | Out of scope (ML complexity) |
| A/B testing | SKIPPED | Infrastructure overhead |

Key Enhancements Implemented

  1. Raw Feed Layer (epss_raw table) - Stores full CSV payload as JSONB for replay
  2. Signal-Ready Layer (epss_signal table) - Tenant-scoped actionable events
  3. Model Version Change Detection - Suppresses noisy deltas on model updates
  4. Explain Hash - Deterministic SHA-256 for audit trail
  5. Risk Band Mapping - CRITICAL/HIGH/MEDIUM/LOW based on percentile
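The percentile-based risk band mapping in item 5 could look roughly like the Python sketch below. The cut-off values and the `risk_band` name are assumptions for illustration only; the advisory does not state the thresholds actually used.

```python
# Hypothetical percentile cut-offs, highest first. The advisory does not
# state the real thresholds; tune these to your own risk appetite.
BANDS = [
    (0.99, "CRITICAL"),
    (0.90, "HIGH"),
    (0.50, "MEDIUM"),
]


def risk_band(percentile: float) -> str:
    """Map an EPSS percentile in [0, 1] to a risk band."""
    if not 0.0 <= percentile <= 1.0:
        raise ValueError(f"percentile out of range: {percentile}")
    for floor, band in BANDS:
        if percentile >= floor:
            return band
    return "LOW"
```

Keeping the cut-offs in one ordered table makes the mapping easy to audit and to change without touching the lookup logic.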

Original Advisory Content

Here's a compact, practical blueprint for bringing EPSS into your stack without chaos: a 3-layer ingestion model that keeps raw data, produces clean probabilities, and emits "signal-ready" events your risk engine can use immediately.


Why this matters (super short)

  • EPSS = predicted probability that a vulnerability will be exploited in the wild within the next 30 days.
  • Mixing "raw EPSS feed" directly into decisions makes audits, rollbacks, and model upgrades painful.
  • A layered model lets you version probability evolution, compare vendors, and train meta-predictors on deltas (how risk changes over time), not just on snapshots.

The three layers (and how they map to Stella Ops)

  1. Raw feed layer (immutable)
  • Goal: Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
  • Stella modules: Concelier (preserve-prune source) writes; Authority handles signatures/hashes.
  • Storage: postgres.epss_raw (partitioned by day; the minimal schema below omits partitioning); a payload column for the untouched payload; SHA-256 of the source file.
  • Why: Full provenance + deterministic replay.
  2. Normalized probabilistic layer
  • Goal: Clean, typed tables keyed by cve_id, with probability, percentile, model_version, asof_ts.
  • Stella modules: Excititor (transform); Policy Engine reads.
  • Storage: postgres.epss_prob with a surrogate id plus a unique key on (cve_id, model_version, asof_ts); delta fields vs the previous asof_ts are computed at query time.
  • Extras: Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side-by-side.
  3. Signal-ready layer (risk engine contracts)
  • Goal: Pre-chewed "events" your Signals/Router can route instantly.
  • What's inside: Only the fields needed for gating and UI: cve_id, prob_now, prob_delta, percentile, risk_band, explain_hash.
  • Emit: first_signal, risk_increase, risk_decrease, quieted with idempotent event keys.
  • Stella modules: Signals publishes, Router fan-outs, Timeline records; Notify handles subscriptions.
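As an illustration of the signal layer's event types and idempotent keys, here is a minimal Python sketch. The `Observation` type, the `classify` and `event_key` helpers, both threshold values, and the meaning given to `quieted` (a drop that lands below a small floor) are all assumptions; the advisory names the events but does not define them. The sketch also suppresses deltas across a model version change, since scores from different models are not directly comparable.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    cve_id: str
    model_version: str
    asof: str           # ISO-8601 timestamp of the EPSS snapshot
    probability: float


def classify(prev: Optional[Observation], cur: Observation,
             delta_threshold: float = 0.05,
             quiet_floor: float = 0.01) -> Optional[str]:
    """Return the event type for `cur`, or None when nothing should be emitted."""
    if prev is None:
        return "first_signal"
    if prev.model_version != cur.model_version:
        # A model update shifts many scores at once; treat the delta as noise.
        return None
    delta = cur.probability - prev.probability
    if delta >= delta_threshold:
        return "risk_increase"
    if delta <= -delta_threshold:
        # Assumed semantics: "quieted" means the score fell below the floor.
        return "quieted" if cur.probability < quiet_floor else "risk_decrease"
    return None


def event_key(event_type: str, obs: Observation) -> str:
    """Idempotent key: identical inputs always produce the same key."""
    raw = "|".join([event_type, obs.cve_id, obs.model_version, obs.asof])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Because the key depends only on the event type, CVE, model version, and snapshot time, a consumer can drop any event whose key it has already processed, which makes re-running the emitter safe.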

Minimal Postgres schema (ready to paste)

-- 1) Raw (immutable)
create table epss_raw (
  id bigserial primary key,
  source_uri text not null,
  ingestion_ts timestamptz not null default now(),
  asof_date date not null,
  payload jsonb not null,
  payload_sha256 bytea not null unique  -- raw SHA guard: the same file cannot be stored twice
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
  id bigserial primary key,
  cve_id text not null,
  model_version text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  percentile double precision,
  features jsonb,
  unique (cve_id, model_version, asof_ts)
);

-- 3) Signal-ready
create table epss_signal (
  signal_id bigserial primary key,
  cve_id text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  prob_delta double precision,
  risk_band text not null,
  model_version text not null,
  explain_hash bytea not null,
  unique (cve_id, model_version, asof_ts)
);

C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)

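// Assumed context (not shown in the original): `http` is an HttpClient and `pg` is a
// Dapper-style connection; needs System.Security.Cryptography, System.Text and Dapper.

// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate) {
    // Assumes src returns uncompressed UTF-8 text (decompress first if the feed is gzipped).
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);
    // to_jsonb(text) wraps the payload as a JSON string, so CSV input is stored verbatim
    // instead of failing a direct jsonb cast.
    await pg.ExecuteAsync(
        @"insert into epss_raw(source_uri, asof_date, payload, payload_sha256)
          values (@u,@d,to_jsonb(@p::text),@s)
          on conflict do nothing",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion) {
    // payload #>> '{}' returns the stored payload as text (unwrapping a JSON string).
    var raws = await pg.QueryAsync<string>(
        "select payload #>> '{}' from epss_raw where asof_date=@d", new { d = asOfDate });
    foreach (var payload in raws) {
        foreach (var row in ParseCsvOrJson(payload)) {
            // Assumes row.Features is a JSON string or null, hence the explicit jsonb cast.
            await pg.ExecuteAsync(
              @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                values (@cve,@mv,@ts,@prob,@pct,@feat::jsonb)
                on conflict do nothing",
              new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}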

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold) {
    var rows = await pg.QueryAsync(@"select cve_id, asof_ts, probability,
        probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
      from epss_prob where model_version=@mv", new { mv = modelVersion });

    foreach (var r in rows) {
        var band = Band(r.probability);
        if (Math.Abs(r.prob_delta ?? 0) >= deltaThreshold) {
            var explainHash = DeterministicExplainHash(r);
            // ExecuteAsync returns the affected row count: 0 means the signal already exists.
            var inserted = await pg.ExecuteAsync(@"insert into epss_signal
                (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
                values (@c,@t,@p,@d,@b,@mv,@h)
                on conflict do nothing",
                new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = r.prob_delta, b = band, mv = modelVersion, h = explainHash });

            // Publish only for newly inserted signals so reruns do not re-emit old events.
            if (inserted == 1) {
                await bus.PublishAsync("risk.epss.delta", new {
                    cve = r.cve_id, ts = r.asof_ts, prob = r.probability, delta = r.prob_delta, band, model = modelVersion, explain = Convert.ToHexString(explainHash)
                });
            }
        }
    }
}

Versioning & experiments (the secret sauce)

  • Model namespace: EPSS-4.0-<regressor-name>-<date> so you can run multiple variants in parallel.
  • Delta-training: Train a small meta-predictor on delta-probability to forecast "risk jumps in next N days."
  • A/B in production: Route model_version=x to 50% of projects; compare MTTA to patch and false-alarm rate.
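One way to route a fixed share of projects to a model variant is deterministic hash bucketing, sketched below in Python. The `ab_bucket` function, the experiment label, and the "treatment"/"control" names are hypothetical; the advisory only says that 50% of projects should see `model_version=x`.

```python
import hashlib


def ab_bucket(project_id: str, experiment: str, treatment_pct: int = 50) -> str:
    """Assign a project to 'treatment' or 'control' without storing any state.

    Hashing the experiment name together with the project id gives each
    experiment its own independent split of the same project population.
    """
    digest = hashlib.sha256(f"{experiment}:{project_id}".encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % 100   # stable slot in 0..99
    return "treatment" if slot < treatment_pct else "control"
```

Because the assignment is a pure function of its inputs, a project stays in the same arm across runs and services, which keeps before/after comparisons clean.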

Policy & UI wiring (quick contracts)

Policy gates (OPA/Rego or internal rules):

  • Block if risk_band in {HIGH, CRITICAL} AND prob_delta >= 0.1 in last 72h.
  • Soften if asset not reachable or mitigated by VEX.
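The two gate rules above can be expressed as a small decision function. This Python sketch assumes a signal dictionary carrying `risk_band`, `prob_delta`, and `asof_ts`, and it interprets "soften" as downgrading a block to a warning; both the shape of the input and the `warn` outcome are assumptions, not part of the advisory.

```python
from datetime import datetime, timedelta

BLOCKING_BANDS = {"HIGH", "CRITICAL"}
DELTA_LIMIT = 0.1                  # from the gate rule: prob_delta >= 0.1
WINDOW = timedelta(hours=72)       # from the gate rule: in last 72h


def decide(signal: dict, now: datetime,
           reachable: bool = True, vex_mitigated: bool = False) -> str:
    """Return 'block', 'warn', or 'allow' for one EPSS signal.

    `now` and signal['asof_ts'] must both be naive or both be timezone-aware.
    """
    recent = now - signal["asof_ts"] <= WINDOW
    spiking = (signal.get("prob_delta") or 0.0) >= DELTA_LIMIT
    if signal["risk_band"] in BLOCKING_BANDS and spiking and recent:
        # Assumed reading of "soften": downgrade the block to a warning.
        if not reachable or vex_mitigated:
            return "warn"
        return "block"
    return "allow"
```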

UI (Evidence pane):

  • Show sparkline of EPSS over time, highlight last delta.
  • "Why now?" button reveals explain_hash -> deterministic evidence payload.
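For explain_hash to resolve to a deterministic evidence payload, the payload needs a canonical serialization before hashing. A minimal Python sketch, assuming the evidence is a JSON-serializable dictionary (the field names in the usage line are illustrative, not the advisory's schema):

```python
import hashlib
import json


def explain_hash(evidence: dict) -> bytes:
    """SHA-256 over a canonical JSON encoding of the evidence payload."""
    canonical = json.dumps(
        evidence,
        sort_keys=True,             # key order must not affect the hash
        separators=(",", ":"),      # no insignificant whitespace
        ensure_ascii=True,          # stable byte encoding for non-ASCII text
    )
    return hashlib.sha256(canonical.encode("utf-8")).digest()


# Usage: the same facts hash identically regardless of insertion order.
h = explain_hash({"cve_id": "CVE-2024-0001", "model_version": "v4", "prob_delta": 0.15})
```

If several services in different languages compute the hash, they would all need to agree on the same canonical form (number formatting in particular), otherwise equal evidence can produce different hashes.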

Ops & reliability

  • Daily ingestion with idempotent runs (raw SHA guard).
  • Backfills: re-normalize from epss_raw for any new model without re-downloading.
  • Deterministic replay: export (raw, transform code hash, model_version) alongside results.
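The raw SHA guard for idempotent daily runs can be illustrated with an in-memory stand-in for epss_raw. The `RawStore` class below is hypothetical and only models the guard itself; a real implementation would rely on a database constraint rather than a Python dictionary.

```python
import hashlib


class RawStore:
    """In-memory stand-in for epss_raw, keyed by the payload's SHA-256."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def ingest(self, source_uri: str, asof_date: str, payload: bytes) -> bool:
        """Store the payload once; return False if the same bytes were seen before."""
        sha = hashlib.sha256(payload).hexdigest()
        if sha in self._rows:
            return False            # rerun of the same file: nothing to do
        self._rows[sha] = {
            "source_uri": source_uri,
            "asof_date": asof_date,
            "payload": payload,
        }
        return True

    def __len__(self) -> int:
        return len(self._rows)
```

Returning a boolean lets the caller skip normalization and signal emission when a run brings no new data.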