ARCHIVED ADVISORY
Archived: 2025-12-18
Status: IMPLEMENTED
Analysis: Plan file C:\Users\vlindos\.claude\plans\quizzical-hugging-hearth.md
Implementation Summary
This advisory was analyzed and merged into the existing EPSS implementation plan:
- Master Plan: IMPL_3410_epss_v4_integration_master_plan.md updated with raw + signal layer schemas
- Sprint: SPRINT_3413_0001_0001_epss_live_enrichment.md created with 30 tasks (original 14 + 16 from advisory)
- Migrations Created:
  - 011_epss_raw_layer.sql - Full JSONB payload storage (~5GB/year)
  - 012_epss_signal_layer.sql - Tenant-scoped signals with dedupe_key and explain_hash
Gap Analysis Result
| Advisory Proposal | Decision | Rationale |
|---|---|---|
| Raw feed layer (Layer 1) | IMPLEMENTED | Full JSONB storage for deterministic replay |
| Normalized layer (Layer 2) | ALIGNED | Already existed in IMPL_3410 |
| Signal-ready layer (Layer 3) | IMPLEMENTED | Tenant-scoped signals, model change detection |
| Multi-model support | DEFERRED | No customer demand |
| Meta-predictor training | SKIPPED | Out of scope (ML complexity) |
| A/B testing | SKIPPED | Infrastructure overhead |
Key Enhancements Implemented
- Raw Feed Layer (epss_raw table) - Stores full CSV payload as JSONB for replay
- Signal-Ready Layer (epss_signal table) - Tenant-scoped actionable events
- Model Version Change Detection - Suppresses noisy deltas on model updates
- Explain Hash - Deterministic SHA-256 for audit trail
- Risk Band Mapping - CRITICAL/HIGH/MEDIUM/LOW based on percentile
Original Advisory Content
Here's a compact, practical blueprint for bringing EPSS into your stack without chaos: a 3-layer ingestion model that keeps raw data, produces clean probabilities, and emits "signal-ready" events your risk engine can use immediately.
Why this matters (super short)
- EPSS = predicted probability that a vuln will be exploited in the wild within the next 30 days.
- Mixing "raw EPSS feed" directly into decisions makes audits, rollbacks, and model upgrades painful.
- A layered model lets you version probability evolution, compare vendors, and train meta-predictors on deltas (how risk changes over time), not just on snapshots.
The three layers (and how they map to Stella Ops)
- Raw feed layer (immutable)
  - Goal: Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
  - Stella modules: Concelier (preserve-prune source) writes; Authority handles signatures/hashes.
  - Storage: postgres.epss_raw (partitioned by day); blob column for the untouched payload; SHA-256 of source file.
  - Why: Full provenance + deterministic replay.
- Normalized probabilistic layer
  - Goal: Clean, typed tables keyed by cve_id, with probability, percentile, model_version, asof_ts.
  - Stella modules: Excititor (transform); Policy Engine reads.
  - Storage: postgres.epss_prob with a unique composite key (cve_id, model_version, asof_ts) and computed delta fields vs previous asof_ts.
  - Extras: Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side-by-side.
- Signal-ready layer (risk engine contracts)
  - Goal: Pre-chewed "events" your Signals/Router can route instantly.
  - What's inside: Only the fields needed for gating and UI: cve_id, prob_now, prob_delta, percentile, risk_band, explain_hash.
  - Emit: first_signal, risk_increase, risk_decrease, quieted with idempotent event keys.
  - Stella modules: Signals publishes, Router fan-outs, Timeline records; Notify handles subscriptions.
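The four event kinds listed above could be derived roughly as follows. This is a minimal sketch, not the Stella Ops implementation: the `EpssEventClassifier` name, the reading of `quieted` as "a change smaller than the threshold", and the key format are all assumptions made for illustration.

```csharp
using System;
using System.Globalization;

public enum EpssEventKind { FirstSignal, RiskIncrease, RiskDecrease, Quieted }

public static class EpssEventClassifier
{
    // Hypothetical rule: no previous score => first_signal; a move of at least
    // `threshold` => risk_increase / risk_decrease; anything smaller => quieted.
    public static EpssEventKind Classify(double? previous, double current, double threshold)
    {
        if (previous is null) return EpssEventKind.FirstSignal;
        double delta = current - previous.Value;
        if (delta >= threshold) return EpssEventKind.RiskIncrease;
        if (delta <= -threshold) return EpssEventKind.RiskDecrease;
        return EpssEventKind.Quieted;
    }

    // Idempotent event key: the same (cve, model, date, kind) always yields the
    // same string, so a re-run can be dropped by a unique constraint or by a
    // consumer-side dedupe. The "|"-joined layout is an assumption.
    public static string EventKey(string cveId, string modelVersion, DateOnly asOf, EpssEventKind kind)
    {
        string day = asOf.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return $"{cveId}|{modelVersion}|{day}|{kind}";
    }
}
```

Because the key contains no run-specific data (no ingestion timestamp, no sequence number), replaying the same day produces the same keys.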
Minimal Postgres schema (ready to paste)
-- 1) Raw (immutable)
create table epss_raw (
    id             bigserial primary key,
    source_uri     text not null,
    ingestion_ts   timestamptz not null default now(),
    asof_date      date not null,
    payload        jsonb not null,
    payload_sha256 bytea not null
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
    id            bigserial primary key,
    cve_id        text not null,
    model_version text not null,
    asof_ts       timestamptz not null,
    probability   double precision not null,
    percentile    double precision,
    features      jsonb,
    unique (cve_id, model_version, asof_ts)
);

-- 3) Signal-ready
create table epss_signal (
    signal_id     bigserial primary key,
    cve_id        text not null,
    asof_ts       timestamptz not null,
    probability   double precision not null,
    prob_delta    double precision,
    risk_band     text not null,
    model_version text not null,
    explain_hash  bytea not null,
    unique (cve_id, model_version, asof_ts)
);
C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)
// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate) {
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes);
    // to_jsonb(text) stores the untouched payload as a JSON string, so a CSV
    // feed does not fail the way a direct ::jsonb cast would.
    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u,@d,to_jsonb(@p::text),@s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion) {
    // payload #>> '{}' unwraps the stored JSON string back into the original text.
    var raws = await pg.QueryAsync<string>("select payload #>> '{}' from epss_raw where asof_date=@d", new { d = asOfDate });
    foreach (var payload in raws) {
        foreach (var row in ParseCsvOrJson(payload)) {
            // Assumes row.Features is a JSON string (or null), hence the explicit cast.
            await pg.ExecuteAsync(
                @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                  values (@cve,@mv,@ts,@prob,@pct,@feat::jsonb)
                  on conflict do nothing",
                new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold) {
    var rows = await pg.QueryAsync(@"select cve_id, asof_ts, probability,
        probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
        from epss_prob where model_version=@mv", new { mv = modelVersion });
    foreach (var r in rows) {
        // prob_delta is null for the first observation of a CVE; treat it as "no change".
        if (Math.Abs(r.prob_delta ?? 0) < deltaThreshold) continue;
        string band = Band(r.probability);
        byte[] explainHash = DeterministicExplainHash(r);
        int inserted = await pg.ExecuteAsync(@"insert into epss_signal
            (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
            values (@c,@t,@p,@d,@b,@mv,@h)
            on conflict do nothing",
            new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = r.prob_delta, b = band, mv = modelVersion, h = explainHash });
        // Publish only when the row was newly inserted, so re-runs do not re-emit events.
        if (inserted == 0) continue;
        await bus.PublishAsync("risk.epss.delta", new {
            cve = r.cve_id, ts = r.asof_ts, prob = r.probability, delta = r.prob_delta, band, model = modelVersion, explain = Convert.ToHexString(explainHash)
        });
    }
}
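The skeleton calls `Band` and `DeterministicExplainHash` without defining them. One possible shape is sketched below, under stated assumptions: the band cut-offs are placeholders (the advisory gives none), the helper takes explicit fields rather than the dynamic row used in the skeleton, and the newline-joined canonical form is just one way to make the hash input deterministic.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class EpssSignalHelpers
{
    // Hypothetical cut-offs; tune them to your own policy.
    public static string Band(double probability) => probability switch
    {
        >= 0.70 => "CRITICAL",
        >= 0.40 => "HIGH",
        >= 0.10 => "MEDIUM",
        _ => "LOW",
    };

    // Hashes a canonical rendering of the inputs: fixed field order, UTC
    // timestamps, invariant culture, round-trippable doubles. The same inputs
    // therefore always produce the same 32-byte SHA-256 digest.
    public static byte[] DeterministicExplainHash(
        string cveId, string modelVersion, DateTimeOffset asOf, double probability, double? probDelta)
    {
        string canonical = string.Join("\n",
            cveId,
            modelVersion,
            asOf.ToUniversalTime().ToString("O", CultureInfo.InvariantCulture),
            probability.ToString("R", CultureInfo.InvariantCulture),
            probDelta?.ToString("R", CultureInfo.InvariantCulture) ?? "null");
        return SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    }
}
```

Normalizing the timestamp to UTC matters here: two `DateTimeOffset` values for the same instant but with different offsets would otherwise hash differently.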
Versioning & experiments (the secret sauce)
- Model namespace: EPSS-4.0-<regressor-name>-<date> so you can run multiple variants in parallel.
- Delta-training: Train a small meta-predictor on delta-probability to forecast "risk jumps in next N days."
- A/B in production: Route model_version=x to 50% of projects; compare MTTA to patch and false-alarm rate.
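As an illustration of the namespace and the 50% split, the sketch below builds a version label and assigns each project to a stable bucket by hashing its id. Everything here is an assumption for illustration: the `ModelRouting` class, the `yyyy-MM-dd` date format, and the use of a SHA-256 prefix for bucketing are not taken from the advisory.

```csharp
using System;
using System.Buffers.Binary;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class ModelRouting
{
    // Builds a label following the EPSS-4.0-<regressor-name>-<date> pattern.
    // The date format is an assumption; the advisory does not specify one.
    public static string ModelVersion(string regressorName, DateOnly date)
    {
        string day = date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        return $"EPSS-4.0-{regressorName}-{day}";
    }

    // Stable bucket in [0, 100): the first four bytes of SHA-256(projectId),
    // read as a big-endian unsigned integer, modulo 100.
    public static int Bucket(string projectId)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(projectId));
        uint value = BinaryPrimitives.ReadUInt32BigEndian(hash);
        return (int)(value % 100);
    }

    // Projects whose bucket falls below candidatePercent get the candidate model.
    public static string Route(string projectId, string control, string candidate, int candidatePercent = 50) =>
        Bucket(projectId) < candidatePercent ? candidate : control;
}
```

Hashing the project id, rather than drawing a random number per request, keeps a project on the same variant across runs, which is what makes a before/after comparison meaningful.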
Policy & UI wiring (quick contracts)
Policy gates (OPA/Rego or internal rules):
- Block if risk_band in {HIGH, CRITICAL} AND prob_delta >= 0.1 in last 72h.
- Soften if asset not reachable or mitigated by VEX.
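The two gate rules above can be expressed as a small pure function. This is a sketch under assumptions, not an OPA/Rego policy or the Policy Engine's actual contract: the `EpssSignal` record, the `GateDecision` enum, and the choice to map "soften" to a `Warn` outcome are all hypothetical.

```csharp
using System;

public enum GateDecision { Allow, Warn, Block }

// Hypothetical minimal view of a signal row, limited to what the gate needs.
public sealed record EpssSignal(string CveId, DateTimeOffset AsOf, string RiskBand, double ProbDelta);

public static class EpssGate
{
    public static GateDecision Evaluate(EpssSignal s, DateTimeOffset now, bool reachable, bool mitigatedByVex)
    {
        bool highBand = s.RiskBand is "HIGH" or "CRITICAL";
        bool recentJump = s.ProbDelta >= 0.1 && now - s.AsOf <= TimeSpan.FromHours(72);

        // Rule 1 does not fire: nothing to block.
        if (!(highBand && recentJump)) return GateDecision.Allow;

        // Rule 2: soften (here: downgrade to a warning) when the asset is not
        // reachable or a VEX statement says it is mitigated.
        return (!reachable || mitigatedByVex) ? GateDecision.Warn : GateDecision.Block;
    }
}
```

Passing `now` in as a parameter, instead of reading the clock inside the function, keeps the gate deterministic and easy to replay.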
UI (Evidence pane):
- Show sparkline of EPSS over time, highlight last delta.
- "Why now?" button reveals explain_hash -> deterministic evidence payload.
Ops & reliability
- Daily ingestion with idempotent runs (raw SHA guard).
- Backfills: re-normalize from epss_raw for any new model without re-downloading.
- Deterministic replay: export (raw, transform code hash, model_version) alongside results.
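The "raw SHA guard" for idempotent daily runs can be pictured as a set of payload digests that rejects bytes it has already seen. The in-memory `RawShaGuard` class below is a hypothetical stand-in for illustration only; in a real deployment the same check would more likely be a lookup or unique index on epss_raw.payload_sha256.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public sealed class RawShaGuard
{
    // Hex-encoded SHA-256 digests of every payload accepted so far.
    private readonly HashSet<string> _seen = new();

    // Returns true when the payload is new and should be ingested,
    // false when identical bytes were already accepted.
    public bool TryAccept(byte[] payload)
    {
        string sha = Convert.ToHexString(SHA256.HashData(payload));
        return _seen.Add(sha);
    }
}
```

A second run over the same file then becomes a no-op, while a re-published file with different bytes is still ingested as a new raw row.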