# EPSS Integration Architecture
> **Advisory Source**: `docs/product-advisories/16-Dec-2025 - Merging EPSS v4 with CVSS v4 Frameworks.md`
> **Last Updated**: 2025-12-17
> **Status**: Approved for Implementation
---
## Executive Summary
EPSS (Exploit Prediction Scoring System) is a **probabilistic model** that estimates the likelihood a given CVE will be exploited in the wild over the next ~30 days. This document defines how StellaOps integrates EPSS as a first-class risk signal.
**Key Distinction**:
- **CVSS v4**: Deterministic measurement of *severity* (0-10)
- **EPSS**: Dynamic, data-driven *probability of exploitation* (0-1)
EPSS does **not** replace CVSS or VEX; it provides complementary probabilistic threat intelligence.

---
## 1. Design Principles
### 1.1 EPSS as Probabilistic Signal
| Signal Type | Nature | Source |
|-------------|--------|--------|
| CVSS v4 | Deterministic impact | NVD, vendor |
| EPSS | Probabilistic threat | FIRST daily feeds |
| VEX | Vendor intent | Vendor statements |
| Runtime context | Actual exposure | StellaOps scanner |

**Rule**: EPSS *modulates confidence*, never asserts truth.
### 1.2 Architectural Constraints
1. **Append-only time-series**: Never overwrite historical EPSS data
2. **Deterministic replay**: Every scan stores the EPSS snapshot reference used
3. **Idempotent ingestion**: Safe to re-run for same date
4. **Postgres as source of truth**: Valkey is optional cache only
5. **Air-gap compatible**: Manual import via signed bundles
---
## 2. Data Model
### 2.1 Core Tables
#### Import Provenance
```sql
CREATE TABLE epss_import_runs (
    import_run_id        UUID PRIMARY KEY,
    model_date           DATE NOT NULL,
    source_uri           TEXT NOT NULL,
    retrieved_at         TIMESTAMPTZ NOT NULL,
    file_sha256          TEXT NOT NULL,
    decompressed_sha256  TEXT NULL,
    row_count            INT NOT NULL,
    model_version_tag    TEXT NULL,
    published_date       DATE NULL,
    status               TEXT NOT NULL, -- SUCCEEDED / FAILED
    error                TEXT NULL,
    UNIQUE (model_date)
);
```
#### Time-Series Scores (Partitioned)
```sql
CREATE TABLE epss_scores (
    model_date     DATE NOT NULL,
    cve_id         TEXT NOT NULL,
    epss_score     DOUBLE PRECISION NOT NULL,
    percentile     DOUBLE PRECISION NOT NULL,
    import_run_id  UUID NOT NULL REFERENCES epss_import_runs(import_run_id),
    PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
#### Current Projection (Fast Lookup)
```sql
CREATE TABLE epss_current (
    cve_id         TEXT PRIMARY KEY,
    epss_score     DOUBLE PRECISION NOT NULL,
    percentile     DOUBLE PRECISION NOT NULL,
    model_date     DATE NOT NULL,
    import_run_id  UUID NOT NULL
);
CREATE INDEX idx_epss_current_score_desc ON epss_current (epss_score DESC);
CREATE INDEX idx_epss_current_percentile_desc ON epss_current (percentile DESC);
```
#### Change Detection
```sql
CREATE TABLE epss_changes (
    model_date      DATE NOT NULL,
    cve_id          TEXT NOT NULL,
    old_score       DOUBLE PRECISION NULL,
    new_score       DOUBLE PRECISION NOT NULL,
    delta_score     DOUBLE PRECISION NULL,
    old_percentile  DOUBLE PRECISION NULL,
    new_percentile  DOUBLE PRECISION NOT NULL,
    flags           INT NOT NULL, -- bitmask, see 2.2 Flags Bitmask
    PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```
### 2.2 Flags Bitmask
| Flag | Value | Meaning |
|------|-------|---------|
| NEW_SCORED | 0x01 | CVE newly scored (not in previous day) |
| CROSSED_HIGH | 0x02 | Score crossed above high threshold |
| CROSSED_LOW | 0x04 | Score crossed below high threshold |
| BIG_JUMP_UP | 0x08 | Score rose by more than 0.10 since the previous day |
| BIG_JUMP_DOWN | 0x10 | Score fell by more than 0.10 since the previous day |
| TOP_PERCENTILE | 0x20 | Entered top 5% |
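
As a minimal sketch of how these flags might be derived for one CVE, the Python below compares yesterday's and today's values. The bit values come from the table above. The function name, its parameters, and the choice to treat the "high threshold" as the HighScore default of 0.50 are assumptions made for illustration, not part of the specification.

```python
from enum import IntFlag


class EpssChangeFlags(IntFlag):
    """Bit values copied from the flags table above."""
    NEW_SCORED = 0x01
    CROSSED_HIGH = 0x02
    CROSSED_LOW = 0x04
    BIG_JUMP_UP = 0x08
    BIG_JUMP_DOWN = 0x10
    TOP_PERCENTILE = 0x20


def compute_flags(old_score, new_score, old_percentile, new_percentile,
                  high_score=0.50, top_percentile=0.95, big_jump=0.10):
    """Hypothetical helper: old_* are None when the CVE had no score the day before."""
    flags = EpssChangeFlags(0)
    if old_score is None:
        flags |= EpssChangeFlags.NEW_SCORED
    else:
        # Assumption: "high threshold" means the HighScore threshold.
        if old_score < high_score <= new_score:
            flags |= EpssChangeFlags.CROSSED_HIGH
        if new_score < high_score <= old_score:
            flags |= EpssChangeFlags.CROSSED_LOW
        delta = new_score - old_score
        if delta > big_jump:
            flags |= EpssChangeFlags.BIG_JUMP_UP
        elif -delta > big_jump:
            flags |= EpssChangeFlags.BIG_JUMP_DOWN
    # TOP_PERCENTILE marks entry into the top band, not continued membership.
    was_top = old_percentile is not None and old_percentile >= top_percentile
    if new_percentile >= top_percentile and not was_top:
        flags |= EpssChangeFlags.TOP_PERCENTILE
    return flags
```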
---
## 3. Service Architecture
### 3.1 Component Responsibilities
```
┌─────────────────────────────────────────────────────────────────┐
│                         EPSS DATA FLOW                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │  Scheduler   │────►│  Concelier   │────►│   Scanner    │     │
│  │  (triggers)  │     │   (ingest)   │     │  (evidence)  │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│         │                    │                    │             │
│         │                    ▼                    │             │
│         │             ┌──────────────┐            │             │
│         │             │   Postgres   │◄───────────┘             │
│         │             │   (truth)    │                          │
│         │             └──────────────┘                          │
│         │                    │                                  │
│         ▼                    ▼                                  │
│  ┌──────────────┐     ┌──────────────┐                          │
│  │    Notify    │◄────│  Excititor   │                          │
│  │   (alerts)   │     │ (VEX tasks)  │                          │
│  └──────────────┘     └──────────────┘                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
| Component | Responsibility |
|-----------|----------------|
| **Scheduler** | Triggers daily EPSS import job |
| **Concelier** | Downloads/imports EPSS, stores facts, computes delta, emits events |
| **Scanner** | Attaches EPSS-at-scan as immutable evidence, uses for scoring |
| **Excititor** | Creates VEX tasks when EPSS is high and VEX missing |
| **Notify** | Sends alerts on priority changes |
### 3.2 Event Flow
```
Scheduler
  → epss.ingest(date)
    → Concelier (ingest)
      → epss.updated
        → Notify (optional daily summary)
        → Concelier (enrichment)
          → vuln.priority.changed
            → Notify (targeted alerts)
            → Excititor (VEX task creation)
```
---
## 4. Ingestion Pipeline
### 4.1 Data Source
FIRST publishes daily CSV snapshots at:
```
https://epss.empiricalsecurity.com/epss_scores-YYYY-MM-DD.csv.gz
```
Each file contains ~300k CVE records with:
- `cve` - CVE ID
- `epss` - Score (0.00000-1.00000)
- `percentile` - Rank vs all CVEs
### 4.2 Ingestion Steps
1. **Scheduler** triggers daily job for date D
2. **Download** `epss_scores-D.csv.gz`
3. **Decompress** stream
4. **Parse** header comment for model version/date
5. **Validate** that scores and percentiles lie in [0,1] and that percentile is monotonic in score (a higher score never has a lower percentile)
6. **Bulk load** into TEMP staging table
7. **Transaction**:
   - Insert `epss_import_runs`
   - Insert into `epss_scores` partition
   - Compute `epss_changes` by comparing staging vs `epss_current`
   - Upsert `epss_current`
   - Enqueue `epss.updated` event
8. **Commit**
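
The decompress, parse, and validate steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Concelier implementation: it assumes the file starts with a `#key:value,key:value` comment line followed by a `cve,epss,percentile` header, it reads the whole payload into memory rather than streaming, and the function names are hypothetical.

```python
import csv
import gzip


def parse_epss_csv(gz_bytes):
    """Return (metadata, rows) from a gzipped EPSS CSV payload.

    rows is a list of (cve_id, score, percentile) tuples.
    """
    lines = gzip.decompress(gz_bytes).decode("utf-8").splitlines()
    meta = {}
    # Assumed layout: an optional leading comment such as
    # "#model_version:<tag>,score_date:<timestamp>".
    if lines and lines[0].startswith("#"):
        for part in lines[0][1:].split(","):
            key, _, value = part.partition(":")
            meta[key.strip()] = value.strip()
        lines = lines[1:]
    rows = []
    for rec in csv.DictReader(lines):
        score = float(rec["epss"])
        percentile = float(rec["percentile"])
        if not (0.0 <= score <= 1.0 and 0.0 <= percentile <= 1.0):
            raise ValueError(f"value out of range for {rec['cve']}")
        rows.append((rec["cve"], score, percentile))
    return meta, rows


def percentiles_monotonic(rows):
    """True if percentile never decreases as score increases."""
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    return all(a[2] <= b[2] for a, b in zip(ordered, ordered[1:]))
```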
### 4.3 Air-Gap Import
Accept local bundle containing:
- `epss_scores-YYYY-MM-DD.csv.gz`
- `manifest.json` with sha256, source attribution, DSSE signature
Same pipeline, with `source_uri = bundle://...`.
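
Before a bundle enters the pipeline, its digest can be checked against the manifest. The Python sketch below covers only the SHA-256 comparison. The manifest field name `sha256` and the function name are assumptions, and DSSE signature verification is deliberately left out.

```python
import hashlib
import json


def verify_bundle_digest(csv_gz_bytes, manifest_json):
    """Compare the archive's SHA-256 with the manifest; return the digest on success.

    Assumes the manifest is a JSON object with a hex digest under "sha256".
    """
    manifest = json.loads(manifest_json)
    expected = manifest["sha256"].strip().lower()
    actual = hashlib.sha256(csv_gz_bytes).hexdigest()
    if actual != expected:
        raise ValueError(f"digest mismatch: expected {expected}, got {actual}")
    return actual
```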
---
## 5. Enrichment Rules
### 5.1 New Scan Findings (Immutable)
Store EPSS "as-of" scan time:
```csharp
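// EPSS values captured at scan time; immutable so the scan can be replayed later.
public record ScanEpssEvidence
{
    public double EpssScoreAtScan { get; init; }        // probability, 0-1
    public double EpssPercentileAtScan { get; init; }   // rank, 0-1
    public DateOnly EpssModelDateAtScan { get; init; }  // model_date of the snapshot used
    public Guid EpssImportRunIdAtScan { get; init; }    // epss_import_runs.import_run_id
}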
```
This supports deterministic replay even if EPSS changes later.
### 5.2 Existing Findings (Live Triage)
Maintain mutable "current EPSS" on vulnerability instances:
- **scan_finding_evidence**: Immutable EPSS-at-scan
- **vuln_instance_triage**: Current EPSS + band (for live triage)
### 5.3 Efficient Delta Targeting
On `epss.updated(D)`:
1. Read `epss_changes` where flags indicate material change
2. Find impacted vulnerability instances by CVE
3. Update only those instances
4. Emit `vuln.priority.changed` only if band crossed
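
The four steps above could look roughly like the Python sketch below. The document does not define bands, so the two-band scheme (`HIGH` / `NORMAL`), the dictionary shapes, the default mask, and the function names are all assumptions chosen to keep the example small.

```python
def band_for(score, percentile, high_score=0.50, high_percentile=0.95):
    """Hypothetical two-band scheme built from the default thresholds."""
    if score >= high_score or percentile >= high_percentile:
        return "HIGH"
    return "NORMAL"


def apply_epss_changes(changes, instances_by_cve, material_mask=0x3F):
    """Update only instances whose CVE changed materially; return emitted events.

    changes: dicts with cve_id, new_score, new_percentile, flags.
    instances_by_cve: maps cve_id to a list of mutable instance dicts
    that carry at least "id" and "band".
    """
    events = []
    for change in changes:
        if (change["flags"] & material_mask) == 0:
            continue  # step 1: skip rows without a material change
        for inst in instances_by_cve.get(change["cve_id"], []):  # step 2
            old_band = inst["band"]
            new_band = band_for(change["new_score"], change["new_percentile"])
            inst.update(epss_score=change["new_score"],
                        epss_percentile=change["new_percentile"],
                        band=new_band)  # step 3
            if new_band != old_band:  # step 4: emit only on a band crossing
                events.append({"type": "vuln.priority.changed",
                               "instance_id": inst["id"],
                               "old_band": old_band,
                               "new_band": new_band})
    return events
```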
---
## 6. Notification Policy
### 6.1 Default Thresholds
| Threshold | Default | Description |
|-----------|---------|-------------|
| HighPercentile | 0.95 | Top 5% of all CVEs |
| HighScore | 0.50 | 50% exploitation probability |
| BigJumpDelta | 0.10 | Meaningful daily change |
### 6.2 Trigger Conditions
1. **Newly scored** CVE in inventory AND `percentile >= HighPercentile`
2. Existing CVE **crosses above** HighPercentile or HighScore
3. Delta > BigJumpDelta AND CVE in runtime-exposed assets
All thresholds are org-configurable.
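
One way to read the three trigger conditions is as a single predicate. The Python sketch below is an interpretation, not the Notify implementation: the names are hypothetical, "Delta" is assumed to mean the signed day-over-day score increase, and every trigger is assumed to require the CVE to be in the organisation's inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Defaults from the table in 6.1; an org would override these."""
    high_percentile: float = 0.95
    high_score: float = 0.50
    big_jump_delta: float = 0.10


def should_notify(old_score, new_score, old_percentile, new_percentile,
                  in_inventory, runtime_exposed, t=Thresholds()):
    """old_score and old_percentile are None for a newly scored CVE."""
    if not in_inventory:
        return False
    if old_score is None:
        # Condition 1: newly scored and already in the top percentile band.
        return new_percentile >= t.high_percentile
    # Condition 2: crossed above either high threshold.
    crossed_percentile = old_percentile < t.high_percentile <= new_percentile
    crossed_score = old_score < t.high_score <= new_score
    if crossed_percentile or crossed_score:
        return True
    # Condition 3: big upward jump on a runtime-exposed asset.
    return (new_score - old_score) > t.big_jump_delta and runtime_exposed
```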
---
## 7. Trust Lattice Integration
### 7.1 Scoring Rule Example
```
IF cvss_base >= 8.0
  AND epss_score >= 0.35
  AND runtime_exposed = true
→ priority = IMMEDIATE_ATTENTION
```
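
For readers who prefer code to rule notation, the example rule translates directly into a small Python predicate. The function name and the `None` return for non-matching findings are assumptions; the thresholds are the ones in the rule above.

```python
def priority_for(cvss_base, epss_score, runtime_exposed):
    """Return "IMMEDIATE_ATTENTION" when the example rule matches, else None."""
    if cvss_base >= 8.0 and epss_score >= 0.35 and runtime_exposed:
        return "IMMEDIATE_ATTENTION"
    return None
```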
### 7.2 Score Weights
| Factor | Default Weight | Range |
|--------|---------------|-------|
| CVSS | 0.25 | 0.0-1.0 |
| EPSS | 0.25 | 0.0-1.0 |
| Reachability | 0.25 | 0.0-1.0 |
| Freshness | 0.15 | 0.0-1.0 |
| Frequency | 0.10 | 0.0-1.0 |
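
The document lists weights but does not say how they are combined. Purely as an illustration, the Python sketch below assumes a weighted average over factor values already normalised to 0-1 (for example, CVSS base score divided by 10). The function name and factor keys are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "cvss": 0.25,
    "epss": 0.25,
    "reachability": 0.25,
    "freshness": 0.15,
    "frequency": 0.10,
}


def composite_score(factors, weights=DEFAULT_WEIGHTS):
    """Weighted average of factor values in [0, 1] (assumed combination rule)."""
    for name, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {name} outside 0.0-1.0")
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"factor {name} outside 0.0-1.0")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Dividing by the total keeps the result in [0, 1] even if an org's
    # custom weights do not sum to exactly 1.
    return sum(weights[name] * factors[name] for name in weights) / total
```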
---
## 8. API Surface
### 8.1 Internal API Endpoints
| Endpoint | Description |
|----------|-------------|
| `GET /epss/current?cve=...` | Bulk lookup current EPSS |
| `GET /epss/history?cve=...&days=180` | Historical time-series |
| `GET /epss/top?order=epss&limit=100` | Top CVEs by score |
| `GET /epss/changes?date=...` | Daily change report |
### 8.2 UI Requirements
For each vulnerability instance:
- EPSS score + percentile
- Model date
- Trend delta vs previous scan date
- Filter chips: "High EPSS", "Rising EPSS", "High CVSS + High EPSS"
- Evidence panel showing EPSS-at-scan vs current EPSS
---
## 9. Implementation Checklist
### Phase 1: Data Foundation
- [ ] DB migrations: tables + partitions + indexes
- [ ] Concelier ingestion job: online download + bundle import
### Phase 2: Integration
- [ ] epss_current + epss_changes projection
- [ ] Scanner.WebService: attach EPSS-at-scan evidence
- [ ] Bulk lookup API
### Phase 3: Enrichment
- [ ] Concelier enrichment job: update triage projections
- [ ] Notify subscription to vuln.priority.changed
### Phase 4: UI/UX
- [ ] EPSS fields in vulnerability detail
- [ ] Filters and sort by exploit likelihood
- [ ] Trend visualization
### Phase 5: Operations
- [ ] Backfill tool (last 180 days)
- [ ] Ops runbook: schedules, manual re-run, air-gap import
---
## 10. Anti-Patterns to Avoid
| Anti-Pattern | Why It's Wrong |
|--------------|----------------|
| Storing only latest EPSS | Breaks auditability and replay |
| Mixing EPSS into CVE table | EPSS is signal, not vulnerability data |
| Treating EPSS as severity | EPSS is probability, not impact |
| Alerting on every daily fluctuation | Creates alert fatigue |
| Recomputing EPSS internally | Use FIRST's authoritative data |
---
## Related Documents
- [Unknowns API Documentation](../api/unknowns-api.md)
- [Score Replay API](../api/score-replay-api.md)
- [Trust Lattice Architecture](../modules/scanner/architecture.md)