# EPSS Integration Architecture

> **Advisory Source**: `docs/product-advisories/16-Dec-2025 - Merging EPSS v4 with CVSS v4 Frameworks.md`
> **Last Updated**: 2025-12-17
> **Status**: Approved for Implementation

---

## Executive Summary

EPSS (Exploit Prediction Scoring System) is a **probabilistic model** that estimates the likelihood a given CVE will be exploited in the wild over the next ~30 days. This document defines how StellaOps integrates EPSS as a first-class risk signal.

**Key Distinction**:
- **CVSS v4**: Deterministic measurement of *severity* (0-10)
- **EPSS**: Dynamic, data-driven *probability of exploitation* (0-1)

EPSS does **not** replace CVSS or VEX; it provides complementary probabilistic threat intelligence.

---

## 1. Design Principles

### 1.1 EPSS as Probabilistic Signal

| Signal Type | Nature | Source |
|-------------|--------|--------|
| CVSS v4 | Deterministic impact | NVD, vendor |
| EPSS | Probabilistic threat | FIRST daily feeds |
| VEX | Vendor intent | Vendor statements |
| Runtime context | Actual exposure | StellaOps scanner |

**Rule**: EPSS *modulates confidence*, never asserts truth.

### 1.2 Architectural Constraints

1. **Append-only time-series**: Never overwrite historical EPSS data
2. **Deterministic replay**: Every scan stores a reference to the EPSS snapshot it used
3. **Idempotent ingestion**: Safe to re-run for the same date
4. **Postgres as source of truth**: Valkey is an optional cache only
5. **Air-gap compatible**: Manual import via signed bundles

---

## 2. Data Model

### 2.1 Core Tables

#### Import Provenance

```sql
CREATE TABLE epss_import_runs (
    import_run_id        UUID PRIMARY KEY,
    model_date           DATE NOT NULL,
    source_uri           TEXT NOT NULL,
    retrieved_at         TIMESTAMPTZ NOT NULL,
    file_sha256          TEXT NOT NULL,
    decompressed_sha256  TEXT NULL,
    row_count            INT NOT NULL,
    model_version_tag    TEXT NULL,
    published_date       DATE NULL,
    status               TEXT NOT NULL,  -- SUCCEEDED / FAILED
    error                TEXT NULL,
    UNIQUE (model_date)
);
```

#### Time-Series Scores (Partitioned)

```sql
CREATE TABLE epss_scores (
    model_date     DATE NOT NULL,
    cve_id         TEXT NOT NULL,
    epss_score     DOUBLE PRECISION NOT NULL,
    percentile     DOUBLE PRECISION NOT NULL,
    import_run_id  UUID NOT NULL REFERENCES epss_import_runs(import_run_id),
    PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```

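Because `epss_scores` is declared with `PARTITION BY RANGE (model_date)`, a matching partition has to exist before a day's rows can be inserted. The sketch below generates the DDL for one calendar month. The monthly granularity, the `epss_scores_YYYY_MM` naming scheme, and the `EpssPartitionDdl` helper are assumptions made for illustration; this document does not prescribe a partition size.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: builds the DDL for one monthly partition.
public static class EpssPartitionDdl
{
    public static string ForMonth(DateOnly anyDayInMonth, string parentTable = "epss_scores")
    {
        var from = new DateOnly(anyDayInMonth.Year, anyDayInMonth.Month, 1);
        var to = from.AddMonths(1); // upper bound is exclusive in Postgres range partitions

        var inv = CultureInfo.InvariantCulture;
        string name = $"{parentTable}_{from.ToString("yyyy_MM", inv)}";

        return $"CREATE TABLE IF NOT EXISTS {name} PARTITION OF {parentTable} " +
               $"FOR VALUES FROM ('{from.ToString("yyyy-MM-dd", inv)}') " +
               $"TO ('{to.ToString("yyyy-MM-dd", inv)}');";
    }
}
```

For example, `EpssPartitionDdl.ForMonth(new DateOnly(2025, 12, 17))` yields a statement creating `epss_scores_2025_12` covering `2025-12-01` up to (but excluding) `2026-01-01`. The same helper could be pointed at `epss_changes`, which is partitioned the same way.
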
#### Current Projection (Fast Lookup)

```sql
CREATE TABLE epss_current (
    cve_id         TEXT PRIMARY KEY,
    epss_score     DOUBLE PRECISION NOT NULL,
    percentile     DOUBLE PRECISION NOT NULL,
    model_date     DATE NOT NULL,
    import_run_id  UUID NOT NULL
);

CREATE INDEX idx_epss_current_score_desc ON epss_current (epss_score DESC);
CREATE INDEX idx_epss_current_percentile_desc ON epss_current (percentile DESC);
```

#### Change Detection

```sql
CREATE TABLE epss_changes (
    model_date      DATE NOT NULL,
    cve_id          TEXT NOT NULL,
    old_score       DOUBLE PRECISION NULL,
    new_score       DOUBLE PRECISION NOT NULL,
    delta_score     DOUBLE PRECISION NULL,
    old_percentile  DOUBLE PRECISION NULL,
    new_percentile  DOUBLE PRECISION NOT NULL,
    flags           INT NOT NULL,  -- bitmask, see 2.2 (NEW_SCORED, CROSSED_HIGH, BIG_JUMP_UP, ...)
    PRIMARY KEY (model_date, cve_id)
) PARTITION BY RANGE (model_date);
```

### 2.2 Flags Bitmask

| Flag | Value | Meaning |
|------|-------|---------|
| NEW_SCORED | 0x01 | CVE newly scored (not in previous day) |
| CROSSED_HIGH | 0x02 | Score crossed above high threshold |
| CROSSED_LOW | 0x04 | Score crossed below high threshold |
| BIG_JUMP_UP | 0x08 | Delta > 0.10 upward |
| BIG_JUMP_DOWN | 0x10 | Delta > 0.10 downward |
| TOP_PERCENTILE | 0x20 | Entered top 5% |

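As a minimal sketch, the bitmask above maps naturally onto a C# `[Flags]` enum. The classifier below is illustrative only: the type and parameter names are hypothetical, and it assumes that the "high threshold" is the `HighScore` default (0.50) and that "top 5%" corresponds to a percentile of 0.95, both taken from section 6.1.

```csharp
using System;

[Flags]
public enum EpssChangeFlags
{
    None = 0,
    NewScored = 0x01,
    CrossedHigh = 0x02,
    CrossedLow = 0x04,
    BigJumpUp = 0x08,
    BigJumpDown = 0x10,
    TopPercentile = 0x20,
}

// Hypothetical classifier; thresholds default to the values in section 6.1.
public static class EpssChangeClassifier
{
    public static EpssChangeFlags Classify(
        double? oldScore, double newScore,
        double? oldPercentile, double newPercentile,
        double highScore = 0.50, double bigJumpDelta = 0.10, double topPercentile = 0.95)
    {
        var flags = EpssChangeFlags.None;

        if (oldScore is null)
        {
            // No row for this CVE on the previous day.
            flags |= EpssChangeFlags.NewScored;
        }
        else
        {
            double delta = newScore - oldScore.Value;
            if (oldScore.Value < highScore && newScore >= highScore) flags |= EpssChangeFlags.CrossedHigh;
            if (oldScore.Value >= highScore && newScore < highScore) flags |= EpssChangeFlags.CrossedLow;
            if (delta > bigJumpDelta) flags |= EpssChangeFlags.BigJumpUp;
            if (delta < -bigJumpDelta) flags |= EpssChangeFlags.BigJumpDown;
        }

        // "Entered" is read as: previously below the cut-off, or previously unscored.
        bool wasTop = oldPercentile is not null && oldPercentile.Value >= topPercentile;
        if (newPercentile >= topPercentile && !wasTop) flags |= EpssChangeFlags.TopPercentile;

        return flags;
    }
}
```

A CVE moving from score 0.30 (percentile 0.90) to 0.62 (percentile 0.97) would be classified as `CrossedHigh | BigJumpUp | TopPercentile`, stored in `epss_changes.flags` as the integer 42 (0x2A).
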
---

## 3. Service Architecture

### 3.1 Component Responsibilities

```
┌─────────────────────────────────────────────────────────────────┐
│                         EPSS DATA FLOW                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     │
│  │  Scheduler   │────►│  Concelier   │────►│   Scanner    │     │
│  │  (triggers)  │     │   (ingest)   │     │  (evidence)  │     │
│  └──────────────┘     └──────────────┘     └──────────────┘     │
│         │                    │                    │             │
│         │                    ▼                    │             │
│         │             ┌──────────────┐            │             │
│         │             │   Postgres   │◄───────────┘             │
│         │             │   (truth)    │                          │
│         │             └──────────────┘                          │
│         │                    │                                  │
│         ▼                    ▼                                  │
│  ┌──────────────┐     ┌──────────────┐                          │
│  │    Notify    │◄────│  Excititor   │                          │
│  │   (alerts)   │     │ (VEX tasks)  │                          │
│  └──────────────┘     └──────────────┘                          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

| Component | Responsibility |
|-----------|----------------|
| **Scheduler** | Triggers daily EPSS import job |
| **Concelier** | Downloads/imports EPSS, stores facts, computes delta, emits events |
| **Scanner** | Attaches EPSS-at-scan as immutable evidence, uses for scoring |
| **Excititor** | Creates VEX tasks when EPSS is high and VEX missing |
| **Notify** | Sends alerts on priority changes |

### 3.2 Event Flow

```
Scheduler
  → epss.ingest(date)
    → Concelier (ingest)
      → epss.updated
        → Notify (optional daily summary)
        → Concelier (enrichment)
          → vuln.priority.changed
            → Notify (targeted alerts)
            → Excititor (VEX task creation)
```

---

## 4. Ingestion Pipeline

### 4.1 Data Source

FIRST publishes daily CSV snapshots at:

```
https://epss.empiricalsecurity.com/epss_scores-YYYY-MM-DD.csv.gz
```

Each file contains ~300k CVE records with:
- `cve` - CVE ID
- `epss` - Score (0.00000–1.00000)
- `percentile` - Rank relative to all scored CVEs (0–1)

### 4.2 Ingestion Steps

1. **Scheduler** triggers the daily job for date D
2. **Download** `epss_scores-D.csv.gz`
3. **Decompress** the stream
4. **Parse** the header comment for model version/date
5. **Validate** that scores lie in [0,1] and that percentile is monotonic (non-decreasing) in score
6. **Bulk load** into a TEMP staging table
7. **Transaction**:
   - Insert `epss_import_runs`
   - Insert into `epss_scores` partition
   - Compute `epss_changes` by comparing staging vs `epss_current`
   - Upsert `epss_current`
   - Enqueue `epss.updated` event
8. **Commit**

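The parse and validate steps (4 and 5) can be sketched as follows. This is a minimal illustration, not the Concelier implementation: the `EpssRow` and `EpssCsvValidator` names are hypothetical, and the sketch assumes the decompressed file starts with a `#` comment line followed by a `cve,epss,percentile` column header.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public readonly record struct EpssRow(string CveId, double Score, double Percentile);

// Hypothetical validator for decompressed EPSS CSV lines.
public static class EpssCsvValidator
{
    // Parses data lines; skips blank lines, the '#' comment, and the column header.
    public static List<EpssRow> Parse(IEnumerable<string> lines)
    {
        var rows = new List<EpssRow>();
        foreach (var line in lines)
        {
            if (line.Length == 0 || line[0] == '#' ||
                line.StartsWith("cve,", StringComparison.Ordinal)) continue;

            var parts = line.Split(',');
            if (parts.Length != 3) throw new FormatException($"Unexpected column count: {line}");

            rows.Add(new EpssRow(
                parts[0],
                double.Parse(parts[1], CultureInfo.InvariantCulture),
                double.Parse(parts[2], CultureInfo.InvariantCulture)));
        }
        return rows;
    }

    // Throws if any value is outside [0,1] (or NaN), or if percentile
    // decreases while score increases.
    public static void Validate(IReadOnlyCollection<EpssRow> rows)
    {
        foreach (var r in rows)
        {
            bool scoreOk = r.Score >= 0.0 && r.Score <= 1.0;
            bool pctOk = r.Percentile >= 0.0 && r.Percentile <= 1.0;
            if (!scoreOk || !pctOk)
                throw new InvalidOperationException($"Out-of-range value for {r.CveId}");
        }

        double previous = 0.0;
        foreach (var r in rows.OrderBy(x => x.Score).ThenBy(x => x.Percentile))
        {
            if (r.Percentile < previous)
                throw new InvalidOperationException($"Percentile not monotonic at {r.CveId}");
            previous = r.Percentile;
        }
    }
}
```

Running validation before the bulk load keeps a malformed file from ever reaching the transaction in step 7, so a rejected import leaves `epss_current` untouched.
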
### 4.3 Air-Gap Import

Accept local bundle containing:
- `epss_scores-YYYY-MM-DD.csv.gz`
- `manifest.json` with sha256, source attribution, DSSE signature

Same pipeline, with `source_uri = bundle://...`.

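Before a bundle enters the pipeline, its payload digest can be checked against the value recorded in the manifest. The sketch below covers only the SHA-256 comparison; DSSE signature verification is out of scope here, and `BundleIntegrity` is a hypothetical helper name. How the expected digest is read from `manifest.json` is left open, since this document does not define the manifest schema.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: compares a bundle payload against an expected SHA-256 digest.
public static class BundleIntegrity
{
    // Returns the lowercase hex SHA-256 of the stream's remaining content.
    public static string Sha256Hex(Stream content)
    {
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(content)).ToLowerInvariant();
    }

    // Case-insensitive comparison, since hex digests may be written in either case.
    public static bool Matches(Stream content, string expectedSha256Hex) =>
        string.Equals(Sha256Hex(content), expectedSha256Hex.Trim(), StringComparison.OrdinalIgnoreCase);
}
```

The computed digest is the same value the pipeline would record in `epss_import_runs.file_sha256`, so an air-gapped import stays comparable with an online import of the same file.
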
---

## 5. Enrichment Rules

### 5.1 New Scan Findings (Immutable)

Store EPSS "as-of" scan time:

```csharp
public record ScanEpssEvidence
{
    public double EpssScoreAtScan { get; init; }
    public double EpssPercentileAtScan { get; init; }
    public DateOnly EpssModelDateAtScan { get; init; }
    public Guid EpssImportRunIdAtScan { get; init; }
}
```

This supports deterministic replay even if EPSS changes later.

### 5.2 Existing Findings (Live Triage)

Maintain mutable "current EPSS" on vulnerability instances:
- **scan_finding_evidence**: Immutable EPSS-at-scan
- **vuln_instance_triage**: Current EPSS + band (for live triage)

### 5.3 Efficient Delta Targeting

On `epss.updated(D)`:
1. Read `epss_changes` where flags indicate material change
2. Find impacted vulnerability instances by CVE
3. Update only those instances
4. Emit `vuln.priority.changed` only if band crossed

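Step 4 depends on a notion of "band" that this document does not define in detail. A minimal sketch, assuming three bands derived from the section 6.1 thresholds plus a hypothetical `elevatedScore` cut-off of 0.10, might look like this:

```csharp
// Hypothetical banding; only the High cut-offs come from section 6.1.
public enum EpssBand { Low, Elevated, High }

public static class EpssBands
{
    public static EpssBand Classify(
        double score, double percentile,
        double highScore = 0.50, double highPercentile = 0.95,
        double elevatedScore = 0.10) // assumed value, not from this document
    {
        if (score >= highScore || percentile >= highPercentile) return EpssBand.High;
        return score >= elevatedScore ? EpssBand.Elevated : EpssBand.Low;
    }

    // An event is warranted only when the band actually changes,
    // not on every score fluctuation within a band.
    public static bool ShouldEmitPriorityChanged(EpssBand previous, EpssBand current) =>
        previous != current;
}
```

Comparing bands rather than raw scores is what keeps small daily movements (for example 0.20 to 0.22) from producing `vuln.priority.changed` events.
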
---

## 6. Notification Policy

### 6.1 Default Thresholds

| Threshold | Default | Description |
|-----------|---------|-------------|
| HighPercentile | 0.95 | Top 5% of all CVEs |
| HighScore | 0.50 | 50% exploitation probability |
| BigJumpDelta | 0.10 | Meaningful daily change |

### 6.2 Trigger Conditions

1. **Newly scored** CVE in inventory AND `percentile >= HighPercentile`
2. Existing CVE **crosses above** HighPercentile or HighScore
3. Delta > BigJumpDelta AND CVE in runtime-exposed assets

All thresholds are org-configurable.

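The three trigger conditions can be expressed as a single predicate over a change record. The sketch below is illustrative: `EpssNotifyOptions`, `EpssChange`, and `EpssNotifyPolicy` are hypothetical names, and condition 3 is read as an upward delta, which is an assumption since the text does not state a direction.

```csharp
// Hypothetical options type carrying the org-configurable defaults from 6.1.
public sealed record EpssNotifyOptions
{
    public double HighPercentile { get; init; } = 0.95;
    public double HighScore { get; init; } = 0.50;
    public double BigJumpDelta { get; init; } = 0.10;
}

// Hypothetical change record; the Old* values are null for newly scored CVEs.
public sealed record EpssChange(
    double? OldScore, double NewScore,
    double? OldPercentile, double NewPercentile,
    bool InInventory, bool RuntimeExposed);

public static class EpssNotifyPolicy
{
    public static bool ShouldNotify(EpssChange c, EpssNotifyOptions o)
    {
        // Condition 1: newly scored CVE in inventory, already in the top percentile.
        if (c.OldScore is null)
            return c.InInventory && c.NewPercentile >= o.HighPercentile;

        // Condition 2: existing CVE crosses above either threshold.
        bool crossedPercentile =
            (c.OldPercentile ?? 0.0) < o.HighPercentile && c.NewPercentile >= o.HighPercentile;
        bool crossedScore = c.OldScore.Value < o.HighScore && c.NewScore >= o.HighScore;
        if (crossedPercentile || crossedScore) return true;

        // Condition 3: large upward move on a runtime-exposed asset (direction assumed).
        return (c.NewScore - c.OldScore.Value) > o.BigJumpDelta && c.RuntimeExposed;
    }
}
```

An organization overriding a default would pass, for example, `new EpssNotifyOptions { HighScore = 0.35 }`.
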
---

## 7. Trust Lattice Integration

### 7.1 Scoring Rule Example

```
IF cvss_base >= 8.0
  AND epss_score >= 0.35
  AND runtime_exposed = true
→ priority = IMMEDIATE_ATTENTION
```

### 7.2 Score Weights

| Factor | Default Weight | Range |
|--------|---------------|-------|
| CVSS | 0.25 | 0.0-1.0 |
| EPSS | 0.25 | 0.0-1.0 |
| Reachability | 0.25 | 0.0-1.0 |
| Freshness | 0.15 | 0.0-1.0 |
| Frequency | 0.10 | 0.0-1.0 |

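The table lists weights but not how the factors are combined. One plausible reading is a normalized weighted sum, sketched below. The combination formula, the scaling of CVSS by 10, and the `ScoreWeights` and `CompositeScore` names are all assumptions for illustration; the Trust Lattice documentation remains the authority on the actual scoring.

```csharp
#nullable enable
using System;

// Defaults mirror the table above; they sum to 1.0.
public sealed record ScoreWeights(
    double Cvss = 0.25, double Epss = 0.25, double Reachability = 0.25,
    double Freshness = 0.15, double Frequency = 0.10);

// Hypothetical combiner: normalized weighted sum of factors scaled to [0,1].
public static class CompositeScore
{
    public static double Compute(
        double cvssBase, double epssScore, double reachability,
        double freshness, double frequency, ScoreWeights? weights = null)
    {
        var w = weights ?? new ScoreWeights();
        double total = w.Cvss + w.Epss + w.Reachability + w.Freshness + w.Frequency;
        if (total <= 0.0)
            throw new ArgumentException("Weights must sum to a positive value.", nameof(weights));

        double sum =
            w.Cvss * Clamp01(cvssBase / 10.0) +   // CVSS is 0-10, rescaled to 0-1
            w.Epss * Clamp01(epssScore) +
            w.Reachability * Clamp01(reachability) +
            w.Freshness * Clamp01(freshness) +
            w.Frequency * Clamp01(frequency);

        // Dividing by the total keeps the result in [0,1] even if an
        // organization configures weights that do not sum to 1.
        return sum / total;
    }

    private static double Clamp01(double v) => Math.Clamp(v, 0.0, 1.0);
}
```

With every factor at its maximum (CVSS 10.0, all others 1.0) the result is 1.0, and with every factor at zero it is 0.0, regardless of the configured weights.
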
---

## 8. API Surface

### 8.1 Internal API Endpoints

| Endpoint | Description |
|----------|-------------|
| `GET /epss/current?cve=...` | Bulk lookup current EPSS |
| `GET /epss/history?cve=...&days=180` | Historical time-series |
| `GET /epss/top?order=epss&limit=100` | Top CVEs by score |
| `GET /epss/changes?date=...` | Daily change report |

### 8.2 UI Requirements

For each vulnerability instance:
- EPSS score + percentile
- Model date
- Trend delta vs previous scan date
- Filter chips: "High EPSS", "Rising EPSS", "High CVSS + High EPSS"
- Evidence panel showing EPSS-at-scan vs current EPSS

---

## 9. Implementation Checklist

### Phase 1: Data Foundation
- [ ] DB migrations: tables + partitions + indexes
- [ ] Concelier ingestion job: online download + bundle import

### Phase 2: Integration
- [ ] epss_current + epss_changes projection
- [ ] Scanner.WebService: attach EPSS-at-scan evidence
- [ ] Bulk lookup API

### Phase 3: Enrichment
- [ ] Concelier enrichment job: update triage projections
- [ ] Notify subscription to vuln.priority.changed

### Phase 4: UI/UX
- [ ] EPSS fields in vulnerability detail
- [ ] Filters and sort by exploit likelihood
- [ ] Trend visualization

### Phase 5: Operations
- [ ] Backfill tool (last 180 days)
- [ ] Ops runbook: schedules, manual re-run, air-gap import

---

## 10. Anti-Patterns to Avoid

| Anti-Pattern | Why It's Wrong |
|--------------|----------------|
| Storing only latest EPSS | Breaks auditability and replay |
| Mixing EPSS into CVE table | EPSS is signal, not vulnerability data |
| Treating EPSS as severity | EPSS is probability, not impact |
| Alerting on every daily fluctuation | Creates alert fatigue |
| Recomputing EPSS internally | Use FIRST's authoritative data |

---

## Related Documents

- [Unknowns API Documentation](../api/unknowns-api.md)
- [Score Replay API](../api/score-replay-api.md)
- [Trust Lattice Architecture](../modules/scanner/architecture.md)