# ARCHIVED ADVISORY

> **Archived**: 2025-12-18
> **Status**: IMPLEMENTED
> **Analysis**: Plan file `C:\Users\vlindos\.claude\plans\quizzical-hugging-hearth.md`
>
> ## Implementation Summary
>
> This advisory was analyzed and merged into the existing EPSS implementation plan:
>
> - **Master Plan**: `IMPL_3410_epss_v4_integration_master_plan.md` updated with raw + signal layer schemas
> - **Sprint**: `SPRINT_3413_0001_0001_epss_live_enrichment.md` created with 30 tasks (original 14 + 16 from advisory)
> - **Migrations Created**:
>   - `011_epss_raw_layer.sql` - Full JSONB payload storage (~5GB/year)
>   - `012_epss_signal_layer.sql` - Tenant-scoped signals with dedupe_key and explain_hash
>
> ## Gap Analysis Result
>
> | Advisory Proposal | Decision | Rationale |
> |-------------------|----------|-----------|
> | Raw feed layer (Layer 1) | IMPLEMENTED | Full JSONB storage for deterministic replay |
> | Normalized layer (Layer 2) | ALIGNED | Already existed in IMPL_3410 |
> | Signal-ready layer (Layer 3) | IMPLEMENTED | Tenant-scoped signals, model change detection |
> | Multi-model support | DEFERRED | No customer demand |
> | Meta-predictor training | SKIPPED | Out of scope (ML complexity) |
> | A/B testing | SKIPPED | Infrastructure overhead |
>
> ## Key Enhancements Implemented
>
> 1. **Raw Feed Layer** (`epss_raw` table) - Stores the full CSV payload as JSONB for replay
> 2. **Signal-Ready Layer** (`epss_signal` table) - Tenant-scoped actionable events
> 3. **Model Version Change Detection** - Suppresses noisy deltas on model updates
> 4. **Explain Hash** - Deterministic SHA-256 for the audit trail
> 5. **Risk Band Mapping** - CRITICAL/HIGH/MEDIUM/LOW based on percentile

---

# Original Advisory Content

Here's a compact, practical blueprint for bringing **EPSS** into your stack without chaos: a **3-layer ingestion model** that keeps raw data, produces clean probabilities, and emits "signal-ready" events your risk engine can use immediately.

---

# Why this matters (super short)

* **EPSS** = the predicted probability that a vuln will be exploited soon.
* Mixing the raw EPSS feed directly into decisions makes audits, rollbacks, and model upgrades painful.
* A **layered model** lets you **version probability evolution**, compare vendors, and train **meta-predictors on deltas** (how risk changes over time), not just on snapshots.

---

# The three layers (and how they map to Stella Ops)

1. **Raw feed layer (immutable)**

   * **Goal:** Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
   * **Stella modules:** `Concelier` (preserve-prune source) writes; `Authority` handles signatures/hashes.
   * **Storage:** `postgres.epss_raw` (partitioned by day); a blob column for the untouched payload; SHA-256 of the source file.
   * **Why:** Full provenance + deterministic replay.

2. **Normalized probabilistic layer**

   * **Goal:** Clean, typed tables keyed by `cve_id`, with **probability, percentile, model_version, asof_ts**.
   * **Stella modules:** `Excititor` (transform); `Policy Engine` reads.
   * **Storage:** `postgres.epss_prob` with a **composite unique key** `(cve_id, model_version, asof_ts)` and computed **delta fields** vs the previous `asof_ts`.
   * **Extras:** Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side by side.

3. **Signal-ready layer (risk engine contracts)**

   * **Goal:** Pre-chewed "events" your **Signals/Router** can route instantly.
   * **What's inside:** Only the fields needed for gating and UI: `cve_id`, `prob_now`, `prob_delta`, `percentile`, `risk_band`, `explain_hash` (see the sketch after this list).
   * **Emit:** `first_signal`, `risk_increase`, `risk_decrease`, `quieted` with **idempotent event keys**.
   * **Stella modules:** `Signals` publishes, `Router` fans out, `Timeline` records; `Notify` handles subscriptions.
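The advisory describes the layer-3 contract only in prose; here is a minimal C# sketch of what such an event could look like. The record name `EpssSignalEvent`, the property types, and the dedupe-key format are illustrative assumptions; only the field list, the event types, and the dedupe/explain-hash ideas come from the advisory and the implementation summary above.

```csharp
using System;

// Hypothetical layer-3 event contract: field names mirror the epss_signal columns and the
// advisory's field list; the record name, property types, and dedupe-key format are illustrative.
public sealed record EpssSignalEvent(
    string CveId,
    string ModelVersion,
    DateTimeOffset AsOfTs,
    double ProbNow,
    double? ProbDelta,
    double? Percentile,
    string RiskBand,        // CRITICAL / HIGH / MEDIUM / LOW
    string ExplainHashHex,  // hex-encoded SHA-256 (explain_hash)
    string EventType)       // first_signal | risk_increase | risk_decrease | quieted
{
    // Idempotent event key: the same (cve, model, as-of, type) tuple always yields the
    // same key, so replaying a day's emission cannot double-publish downstream.
    public string DedupeKey => $"{CveId}:{ModelVersion}:{AsOfTs:O}:{EventType}";
}
```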
---

# Minimal Postgres schema (ready to paste)

```sql
-- 1) Raw (immutable)
create table epss_raw (
  id bigserial primary key,
  source_uri text not null,
  ingestion_ts timestamptz not null default now(),
  asof_date date not null,
  payload jsonb not null,
  payload_sha256 bytea not null
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
  id bigserial primary key,
  cve_id text not null,
  model_version text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  percentile double precision,
  features jsonb,
  unique (cve_id, model_version, asof_ts)
);

-- 3) Signal-ready
create table epss_signal (
  signal_id bigserial primary key,
  cve_id text not null,
  asof_ts timestamptz not null,
  probability double precision not null,
  prob_delta double precision,
  risk_band text not null,
  model_version text not null,
  explain_hash bytea not null,
  unique (cve_id, model_version, asof_ts)
);
```

---

# C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)

```csharp
// Assumes injected dependencies: an HttpClient `http`, a Dapper-backed Npgsql connection `pg`,
// and a message bus `bus`. ParseCsvOrJson is the feed adapter and is left to the implementer.

// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate)
{
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);
    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u, @d, @p::jsonb, @s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion)
{
    var raws = await pg.QueryAsync<string>(
        "select payload from epss_raw where asof_date = @d", new { d = asOfDate });

    foreach (var payload in raws)
    {
        foreach (var row in ParseCsvOrJson(payload))
        {
            await pg.ExecuteAsync(
                @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                  values (@cve, @mv, @ts, @prob, @pct, @feat)
                  on conflict do nothing",
                new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold)
{
    // Dynamic Dapper rows keep the skeleton short; a typed row record would be tidier.
    var rows = await pg.QueryAsync(
        @"select cve_id, asof_ts, probability,
                 probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
          from epss_prob where model_version = @mv",
        new { mv = modelVersion });

    foreach (var r in rows)
    {
        var delta = (double?)r.prob_delta ?? 0d; // lag() is null on the first observation
        if (Math.Abs(delta) < deltaThreshold) continue;

        var band = Band((double)r.probability);
        var explainHash = DeterministicExplainHash(r);

        await pg.ExecuteAsync(
            @"insert into epss_signal
                (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
              values (@c, @t, @p, @d, @b, @mv, @h)
              on conflict do nothing",
            new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = (double?)r.prob_delta, b = band, mv = modelVersion, h = explainHash });

        await bus.PublishAsync("risk.epss.delta", new
        {
            cve = r.cve_id,
            ts = r.asof_ts,
            prob = r.probability,
            delta = r.prob_delta,
            band,
            model = modelVersion,
            explain = Convert.ToHexString(explainHash)
        });
    }
}
```
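The skeleton calls `Band` and `DeterministicExplainHash` without defining them. Below is a minimal sketch, assuming these helpers live alongside the worker methods (or are pulled in via `using static`). The band cut-offs are placeholders (the advisory fixes only the band names), and the hash helper here takes explicit fields rather than the whole row; the canonical string format is likewise an assumption, chosen only so that identical inputs always reproduce the same hash.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class EpssSignalHelpers
{
    // Example cut-offs only: the advisory fixes the band names (CRITICAL/HIGH/MEDIUM/LOW),
    // not the thresholds, so treat these numbers as placeholders for your policy.
    public static string Band(double probability) => probability switch
    {
        >= 0.7 => "CRITICAL",
        >= 0.3 => "HIGH",
        >= 0.1 => "MEDIUM",
        _ => "LOW"
    };

    // Deterministic SHA-256 over a canonical, culture-invariant rendering of the row,
    // so the same inputs always reproduce the same explain_hash for the audit trail.
    public static byte[] DeterministicExplainHash(
        string cveId, string modelVersion, DateTimeOffset asOfTs, double probability, double? probDelta)
    {
        var canonical = string.Join("|",
            cveId,
            modelVersion,
            asOfTs.ToUniversalTime().ToString("O"),
            probability.ToString("R", CultureInfo.InvariantCulture),
            (probDelta ?? 0d).ToString("R", CultureInfo.InvariantCulture));

        return SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    }
}
```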
---

# Versioning & experiments (the secret sauce)

* **Model namespace:** `EPSS-4.0-<variant>` so you can run multiple variants in parallel.
* **Delta-training:** Train a small meta-predictor on **delta-probability** to forecast **"risk jumps in the next N days."**
* **A/B in production:** Route `model_version=x` to 50% of projects; compare **MTTA to patch** and **false-alarm rate**.

---

# Policy & UI wiring (quick contracts)

**Policy gates** (OPA/Rego or internal rules):

* Block if `risk_band in {HIGH, CRITICAL}` **AND** `prob_delta >= 0.1` in the last 72h.
* Soften if the asset is not reachable or is mitigated by VEX.

**UI (Evidence pane):**

* Show a **sparkline of EPSS over time**, highlighting the last delta.
* A "Why now?" button reveals **explain_hash** -> deterministic evidence payload.

---

# Ops & reliability

* Daily ingestion with **idempotent** runs (raw SHA guard), as sketched below.
* Backfills: re-normalize from `epss_raw` for any new model without re-downloading.
* **Deterministic replay:** export `(raw, transform code hash, model_version)` alongside results.
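The "raw SHA guard" is not shown in the ingestion skeleton; here is one possible sketch of an idempotent daily run, assuming the same injected `http` and Dapper-style `pg` as above. The method name `TryIngestRawAsync` and the exists-query are illustrative, not part of the advisory.

```csharp
// Idempotency guard for the daily raw ingestion: skip the insert when a payload with the
// same SHA-256 has already been stored for that as-of date, so re-runs are no-ops.
public async Task<bool> TryIngestRawAsync(Uri src, DateOnly asOfDate)
{
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);

    var alreadyStored = await pg.ExecuteScalarAsync<bool>(
        "select exists (select 1 from epss_raw where asof_date = @d and payload_sha256 = @s)",
        new { d = asOfDate, s = sha });

    if (alreadyStored)
        return false; // same day, same file: nothing to do, the run stays idempotent

    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u, @d, @p::jsonb, @s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });

    return true;
}
```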