save progress

This commit is contained in:
StellaOps Bot
2025-12-18 09:53:46 +02:00
parent 28823a8960
commit 7d5250238c
87 changed files with 9750 additions and 2026 deletions

# ARCHIVED ADVISORY
> **Archived**: 2025-12-18
> **Status**: IMPLEMENTED
> **Analysis**: Plan file `C:\Users\vlindos\.claude\plans\quizzical-hugging-hearth.md`
>
> ## Implementation Summary
>
> This advisory was analyzed and merged into the existing EPSS implementation plan:
>
> - **Master Plan**: `IMPL_3410_epss_v4_integration_master_plan.md` updated with raw + signal layer schemas
> - **Sprint**: `SPRINT_3413_0001_0001_epss_live_enrichment.md` created with 30 tasks (original 14 + 16 from advisory)
> - **Migrations Created**:
> - `011_epss_raw_layer.sql` - Full JSONB payload storage (~5GB/year)
> - `012_epss_signal_layer.sql` - Tenant-scoped signals with dedupe_key and explain_hash
>
> ## Gap Analysis Result
>
> | Advisory Proposal | Decision | Rationale |
> |-------------------|----------|-----------|
> | Raw feed layer (Layer 1) | IMPLEMENTED | Full JSONB storage for deterministic replay |
> | Normalized layer (Layer 2) | ALIGNED | Already existed in IMPL_3410 |
> | Signal-ready layer (Layer 3) | IMPLEMENTED | Tenant-scoped signals, model change detection |
> | Multi-model support | DEFERRED | No customer demand |
> | Meta-predictor training | SKIPPED | Out of scope (ML complexity) |
> | A/B testing | SKIPPED | Infrastructure overhead |
>
> ## Key Enhancements Implemented
>
> 1. **Raw Feed Layer** (`epss_raw` table) - Stores full CSV payload as JSONB for replay
> 2. **Signal-Ready Layer** (`epss_signal` table) - Tenant-scoped actionable events
> 3. **Model Version Change Detection** - Suppresses noisy deltas on model updates
> 4. **Explain Hash** - Deterministic SHA-256 for audit trail
> 5. **Risk Band Mapping** - CRITICAL/HIGH/MEDIUM/LOW based on percentile
---
# Original Advisory Content
Here's a compact, practical blueprint for bringing **EPSS** into your stack without chaos: a **3-layer ingestion model** that keeps raw data, produces clean probabilities, and emits "signal-ready" events your risk engine can use immediately.

---
# Why this matters (super short)
* **EPSS** = predicted probability a vuln will be exploited soon.
* Mixing "raw EPSS feed" directly into decisions makes audits, rollbacks, and model upgrades painful.
* A **layered model** lets you **version probability evolution**, compare vendors, and train **meta-predictors on deltas** (how risk changes over time), not just on snapshots.
---
# The three layers (and how they map to Stella Ops)
1. **Raw feed layer (immutable)**
   * **Goal:** Store exactly what the provider sent (EPSS v4 CSV/JSON, schema drift and all).
   * **Stella modules:** `Concelier` (preserve-prune source) writes; `Authority` handles signatures/hashes.
   * **Storage:** `postgres.epss_raw` (partitioned by day); blob column for the untouched payload; SHA-256 of the source file.
   * **Why:** Full provenance + deterministic replay.
2. **Normalized probabilistic layer**
   * **Goal:** Clean, typed tables keyed by `cve_id`, with **probability, percentile, model_version, asof_ts**.
   * **Stella modules:** `Excititor` (transform); `Policy Engine` reads.
   * **Storage:** `postgres.epss_prob` with a **unique natural key** `(cve_id, model_version, asof_ts)` and computed **delta fields** vs the previous `asof_ts`.
   * **Extras:** Keep optional vendor columns (e.g., FIRST, custom regressors) to compare models side by side.
3. **Signal-ready layer (risk engine contracts)**
   * **Goal:** Pre-chewed "events" your **Signals/Router** can route instantly.
   * **What's inside:** Only the fields needed for gating and UI: `cve_id`, `prob_now`, `prob_delta`, `percentile`, `risk_band`, `explain_hash`.
   * **Emit:** `first_signal`, `risk_increase`, `risk_decrease`, `quieted` with **idempotent event keys**.
   * **Stella modules:** `Signals` publishes, `Router` fans out, `Timeline` records; `Notify` handles subscriptions.
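
The `risk_band` field above has to come from somewhere; a minimal Python sketch of a percentile-to-band mapping (the cut-offs below are illustrative assumptions, not the values any migration mandates):

```python
def risk_band(percentile: float) -> str:
    """Map an EPSS percentile (0.0-1.0) to a coarse risk band.

    The thresholds here are hypothetical examples; tune them (or read
    them from config) rather than treating these numbers as canonical.
    """
    if percentile >= 0.99:
        return "CRITICAL"
    if percentile >= 0.90:
        return "HIGH"
    if percentile >= 0.50:
        return "MEDIUM"
    return "LOW"
```

Driving the band off the percentile rather than the raw probability keeps bands comparable across model upgrades, since each model re-normalizes its own percentile distribution.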
---
# Minimal Postgres schema (ready to paste)
```sql
-- 1) Raw (immutable)
create table epss_raw (
  id              bigserial primary key,
  source_uri      text not null,
  ingestion_ts    timestamptz not null default now(),
  asof_date       date not null,
  payload         jsonb not null,
  payload_sha256  bytea not null
);
create index on epss_raw (asof_date);

-- 2) Normalized
create table epss_prob (
  id             bigserial primary key,
  cve_id         text not null,
  model_version  text not null,
  asof_ts        timestamptz not null,
  probability    double precision not null,
  percentile     double precision,
  features       jsonb,
  unique (cve_id, model_version, asof_ts)
);

-- 3) Signal-ready
create table epss_signal (
  signal_id      bigserial primary key,
  cve_id         text not null,
  asof_ts        timestamptz not null,
  probability    double precision not null,
  prob_delta     double precision,
  risk_band      text not null,
  model_version  text not null,
  explain_hash   bytea not null,
  unique (cve_id, model_version, asof_ts)
);
```
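
Feeding `epss_prob` requires parsing the raw feed. FIRST's public daily EPSS CSV starts with a `#model_version:...,score_date:...` comment line followed by `cve,epss,percentile` rows; a hedged Python sketch of that parse (column layout assumed from the public feed; verify against the actual file):

```python
import csv
import io

def parse_epss_csv(text: str):
    """Parse a FIRST-style EPSS CSV into (model_version, score_date, rows).

    Assumes the public feed layout: a leading '#model_version:v...,score_date:...'
    comment line, then a 'cve,epss,percentile' header. Schema drift is exactly
    why the raw layer keeps the untouched payload for re-parsing.
    """
    lines = text.splitlines()
    meta = {}
    if lines and lines[0].startswith("#"):
        # Metadata line: '#key:value,key:value' -> dict.
        for part in lines[0].lstrip("#").split(","):
            key, _, value = part.partition(":")
            meta[key.strip()] = value.strip()
        lines = lines[1:]
    rows = []
    for rec in csv.DictReader(io.StringIO("\n".join(lines))):
        rows.append({
            "cve_id": rec["cve"],
            "probability": float(rec["epss"]),
            "percentile": float(rec["percentile"]),
        })
    return meta.get("model_version"), meta.get("score_date"), rows
```

This is the shape of work the undefined `ParseCsvOrJson` helper in the C# skeleton below would do on the .NET side.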
---
# C# ingestion skeleton (StellaOps.Scanner.Worker.DotNet style)
```csharp
// Skeleton only: assumes an injected HttpClient (http), a Dapper-style
// connection (pg), and a message bus (bus), plus
// using System.Security.Cryptography; using System.Text;

// 1) Fetch & store raw (Concelier)
public async Task IngestRawAsync(Uri src, DateOnly asOfDate)
{
    var bytes = await http.GetByteArrayAsync(src);
    var sha = SHA256.HashData(bytes); // raw SHA guard for idempotent re-runs
    await pg.ExecuteAsync(
        "insert into epss_raw(source_uri, asof_date, payload, payload_sha256) values (@u,@d,@p::jsonb,@s)",
        new { u = src.ToString(), d = asOfDate, p = Encoding.UTF8.GetString(bytes), s = sha });
}

// 2) Normalize (Excititor)
public async Task NormalizeAsync(DateOnly asOfDate, string modelVersion)
{
    // Single-column select maps cleanly to QueryAsync<string>.
    var raws = await pg.QueryAsync<string>(
        "select payload from epss_raw where asof_date=@d", new { d = asOfDate });
    foreach (var payload in raws)
    {
        foreach (var row in ParseCsvOrJson(payload)) // provider-specific parser
        {
            await pg.ExecuteAsync(
                @"insert into epss_prob(cve_id, model_version, asof_ts, probability, percentile, features)
                  values (@cve,@mv,@ts,@prob,@pct,@feat)
                  on conflict do nothing", // idempotent: replays are no-ops
                new { cve = row.Cve, mv = modelVersion, ts = row.AsOf, prob = row.Prob, pct = row.Pctl, feat = row.Features });
        }
    }
}

// 3) Emit signal-ready (Signals)
public async Task EmitSignalsAsync(string modelVersion, double deltaThreshold)
{
    // Window function computes each CVE's delta vs its previous snapshot.
    var rows = await pg.QueryAsync(@"select cve_id, asof_ts, probability,
        probability - lag(probability) over (partition by cve_id, model_version order by asof_ts) as prob_delta
        from epss_prob where model_version=@mv", new { mv = modelVersion });
    foreach (var r in rows)
    {
        var band = Band(r.probability); // probability/percentile -> risk band
        if (Math.Abs(r.prob_delta ?? 0) >= deltaThreshold)
        {
            var explainHash = DeterministicExplainHash(r); // SHA-256 over canonical evidence
            await pg.ExecuteAsync(@"insert into epss_signal
                (cve_id, asof_ts, probability, prob_delta, risk_band, model_version, explain_hash)
                values (@c,@t,@p,@d,@b,@mv,@h)
                on conflict do nothing",
                new { c = r.cve_id, t = r.asof_ts, p = r.probability, d = r.prob_delta, b = band, mv = modelVersion, h = explainHash });
            await bus.PublishAsync("risk.epss.delta", new {
                cve = r.cve_id, ts = r.asof_ts, prob = r.probability, delta = r.prob_delta,
                band, model = modelVersion, explain = Convert.ToHexString(explainHash)
            });
        }
    }
}
```
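
`DeterministicExplainHash` above is left undefined. One way to make it deterministic is to hash a canonical JSON rendering (sorted keys, fixed separators) of the evidence fields; a Python sketch of that idea, with an illustrative field set:

```python
import hashlib
import json

def explain_hash(cve_id: str, model_version: str, asof_ts: str,
                 probability: float, prob_delta) -> bytes:
    """SHA-256 over a canonical evidence payload.

    Sorted keys and fixed separators remove serialization variance, so the
    same inputs always hash to the same 32 bytes. The field set here is
    illustrative; a real evidence payload may carry more context.
    """
    evidence = {
        "cve": cve_id,
        "model": model_version,
        "asof": asof_ts,
        "prob": probability,
        "delta": prob_delta,
    }
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()
```

The same hash recomputed at audit time from the stored normalized row proves the signal's evidence was not altered after emission.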
---
# Versioning & experiments (the secret sauce)
* **Model namespace:** `EPSS-4.0-<regressor-name>-<date>` so you can run multiple variants in parallel.
* **Delta-training:** Train a small meta-predictor on **delta-probability** to forecast **"risk jumps in next N days."**
* **A/B in production:** Route `model_version=x` to 50% of projects; compare **mean time-to-patch** and **false-alarm rate**.
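
The delta-training idea can be made concrete: from a per-CVE probability time series, build features from recent day-over-day deltas and a label for "probability jumps within the next N days". A toy Python sketch of the feature/label construction (all thresholds are arbitrary examples, not tuned values):

```python
def delta_training_rows(series, horizon=7, jump=0.1, window=3):
    """Turn a chronological list of (day, probability) into (features, label) rows.

    features: the last `window` day-over-day deltas ending at day i.
    label: 1 if probability rises by >= `jump` within the next `horizon`
    days, else 0. Thresholds are illustrative; tune on real feed history.
    """
    probs = [p for _, p in series]
    deltas = [b - a for a, b in zip(probs, probs[1:])]
    rows = []
    for i in range(window, len(probs)):
        feats = deltas[i - window:i]          # past-only features
        future = probs[i + 1:i + 1 + horizon]  # look-ahead for the label
        label = int(any(f - probs[i] >= jump for f in future))
        rows.append((feats, label))
    return rows
```

Any small classifier fit on these rows is the "meta-predictor on deltas"; the point is that features come strictly from the past and the label strictly from the future, so there is no leakage.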
---
# Policy & UI wiring (quick contracts)
**Policy gates** (OPA/Rego or internal rules):
* Block if `risk_band in {HIGH, CRITICAL}` **AND** `prob_delta >= 0.1` in last 72h.
* Soften if asset not reachable or mitigated by VEX.
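
The gate above is expressible as a small pure function; a Python sketch (the 72h and 0.1 thresholds come from the rule above; the reachability and VEX inputs are assumed to arrive as booleans from upstream services):

```python
from datetime import datetime, timedelta, timezone

def should_block(risk_band: str, prob_delta: float, delta_ts: datetime,
                 reachable: bool, vex_mitigated: bool,
                 now: datetime = None) -> bool:
    """Block when a HIGH/CRITICAL vuln's probability jumped recently,
    softened when the asset is unreachable or a VEX statement mitigates it."""
    now = now or datetime.now(timezone.utc)
    recent = (now - delta_ts) <= timedelta(hours=72)
    hard = risk_band in {"HIGH", "CRITICAL"} and prob_delta >= 0.1 and recent
    return hard and reachable and not vex_mitigated
```

The same predicate translates line-for-line into a Rego rule if the gate lives in OPA instead of application code.
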

**UI (Evidence pane):**
* Show **sparkline of EPSS over time**, highlight last delta.
* "Why now?" button reveals **explain_hash** -> deterministic evidence payload.
---
# Ops & reliability
* Daily ingestion with **idempotent** runs (raw SHA guard).
* Backfills: re-normalize from `epss_raw` for any new model without re-downloading.
* **Deterministic replay:** export `(raw, transform code hash, model_version)` alongside results.
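
The "raw SHA guard" can be as simple as checking the payload hash before insert; a Python sketch against an in-memory seen-set (a real run would query `epss_raw.payload_sha256` instead):

```python
import hashlib

class RawIngestGuard:
    """Skip re-ingesting a payload whose SHA-256 was already stored.

    In production the seen-set would be a select against
    epss_raw.payload_sha256; a set suffices to show the idempotency logic.
    """
    def __init__(self):
        self._seen = set()

    def ingest(self, payload: bytes) -> bool:
        """Return True if the payload was new and 'stored', False if a dup."""
        digest = hashlib.sha256(payload).digest()
        if digest in self._seen:
            return False
        self._seen.add(digest)  # stand-in for the insert into epss_raw
        return True
```

Because the guard keys on content, a re-download of an identical feed file is a no-op, while a provider-side correction (same date, new bytes) is ingested as a new raw row.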