# Unknowns Ranking Algorithm Reference

This document describes the multi-factor scoring algorithm used to rank and triage unknowns in the StellaOps Signals module.

## Purpose

When reachability analysis encounters unresolved symbols, edges, or package identities, these are recorded as **unknowns**. The ranking algorithm prioritizes unknowns by computing a composite score from five factors, then assigns each to a triage band (HOT/WARM/COLD) that determines rescan scheduling and escalation policies.

## Scoring Formula

The composite score is computed as:

```
Score = wP × P + wE × E + wU × U + wC × C + wS × S
```

Where:
- **P** = Popularity (deployment impact)
- **E** = Exploit potential (CVE severity)
- **U** = Uncertainty density (flag accumulation)
- **C** = Centrality (graph position importance)
- **S** = Staleness (evidence age)

All factors are normalized to [0.0, 1.0] before weighting. The final score is clamped to [0.0, 1.0].

### Default Weights

| Factor | Weight | Description |
|--------|--------|-------------|
| wP | 0.25 | Popularity weight |
| wE | 0.25 | Exploit potential weight |
| wU | 0.25 | Uncertainty density weight |
| wC | 0.15 | Centrality weight |
| wS | 0.10 | Staleness weight |

Weights must sum to 1.0 and are configurable via `Signals:UnknownsScoring` settings.

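The weighted sum and the sum-to-1.0 constraint can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: the names `Weights`, `clamp01`, and `composite_score` are hypothetical, and the floating-point tolerance used for the weight check is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Factor weights; the defaults mirror the Default Weights table."""
    popularity: float = 0.25
    exploit: float = 0.25
    uncertainty: float = 0.25
    centrality: float = 0.15
    staleness: float = 0.10

    def validate(self, tolerance: float = 1e-9) -> None:
        # The tolerance absorbs floating-point rounding (assumed value).
        total = (self.popularity + self.exploit + self.uncertainty
                 + self.centrality + self.staleness)
        if abs(total - 1.0) > tolerance:
            raise ValueError(f"weights must sum to 1.0, got {total}")


def clamp01(value: float) -> float:
    """Clamp a value to the closed interval [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def composite_score(p: float, e: float, u: float, c: float, s: float,
                    weights: Weights = Weights()) -> float:
    """Score = wP*P + wE*E + wU*U + wC*C + wS*S, clamped to [0, 1]."""
    weights.validate()
    raw = (weights.popularity * clamp01(p)
           + weights.exploit * clamp01(e)
           + weights.uncertainty * clamp01(u)
           + weights.centrality * clamp01(c)
           + weights.staleness * clamp01(s))
    return clamp01(raw)
```

Rejecting weights that do not sum to 1.0 keeps the score comparable against fixed band thresholds.
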
## Factor Details

### Factor P: Popularity (Deployment Impact)

Measures how widely the unknown's package is deployed across monitored environments.

**Formula:**
```
P = min(1, log10(1 + deploymentCount) / log10(1 + maxDeployments))
```

**Parameters:**
- `deploymentCount`: Number of deployments referencing the package (from `deploy_refs` table)
- `maxDeployments`: Normalization ceiling (default: 100)

**Rationale:** Logarithmic scaling prevents a single highly-deployed package from dominating scores while still prioritizing widely-used dependencies.

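As a rough sketch of the popularity formula (the function name and the input guards are assumptions, not taken from the module):

```python
import math


def popularity(deployment_count: int, max_deployments: int = 100) -> float:
    """P = min(1, log10(1 + deploymentCount) / log10(1 + maxDeployments))."""
    if max_deployments <= 0:
        raise ValueError("max_deployments must be positive")
    if deployment_count <= 0:
        # No known deployments contributes nothing (assumed guard).
        return 0.0
    ratio = math.log10(1 + deployment_count) / math.log10(1 + max_deployments)
    return min(1.0, ratio)
```

With the default ceiling of 100, a package with 42 deployments scores about 0.81, and any count at or above 100 saturates at 1.0.
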
### Factor E: Exploit Potential (CVE Severity)

Estimates the consequence severity if the unknown resolves to a vulnerable component.

**Current Implementation:**
- Returns 0.5 (medium potential) when no CVE association exists
- Future: Integrate KEV lookup, EPSS scores, and exploit database references

**Planned Enhancements:**
- CVE severity mapping (Critical=1.0, High=0.8, Medium=0.5, Low=0.2)
- KEV (Known Exploited Vulnerabilities) flag boost
- EPSS (Exploit Prediction Scoring System) integration

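The planned severity mapping could look roughly like the sketch below. It illustrates the listed values only and is not the current implementation: the function name, the case-insensitive lookup, and the fallback to 0.5 for unrecognized labels are assumptions.

```python
from typing import Optional

# Values from the planned CVE severity mapping above.
SEVERITY_SCORES = {
    "critical": 1.0,
    "high": 0.8,
    "medium": 0.5,
    "low": 0.2,
}

# Returned when no CVE association exists (current documented behavior).
DEFAULT_EXPLOIT_POTENTIAL = 0.5


def exploit_potential(severity: Optional[str]) -> float:
    """Map a CVE severity label to E; fall back to 0.5 when unknown."""
    if severity is None:
        return DEFAULT_EXPLOIT_POTENTIAL
    # Falling back to the default for unrecognized labels is an assumption.
    return SEVERITY_SCORES.get(severity.strip().lower(),
                               DEFAULT_EXPLOIT_POTENTIAL)
```
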
### Factor U: Uncertainty Density (Flag Accumulation)

Aggregates uncertainty signals from multiple sources. Each flag contributes a weighted penalty.

**Flag Weights:**

| Flag | Weight | Description |
|------|--------|-------------|
| `NoProvenanceAnchor` | 0.30 | Cannot verify package source |
| `VersionRange` | 0.25 | Version specified as range, not exact |
| `DynamicCallTarget` | 0.25 | Reflection, eval, or dynamic dispatch |
| `ConflictingFeeds` | 0.20 | Contradictory info from different feeds |
| `ExternalAssembly` | 0.20 | Assembly outside analysis scope |
| `MissingVector` | 0.15 | No CVSS vector for severity assessment |
| `UnreachableSourceAdvisory` | 0.10 | Source advisory URL unreachable |

**Formula:**
```
U = min(1.0, sum(activeFlags × flagWeight))
```

**Example:**
- NoProvenanceAnchor (0.30) + VersionRange (0.25) + MissingVector (0.15) = 0.70

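A minimal sketch of the flag accumulation, using the default weights from the table. The function name, the de-duplication of repeated flags, and the sorted summation order are assumptions made for this example.

```python
from typing import Iterable

# Default flag weights from the table above.
FLAG_WEIGHTS = {
    "NoProvenanceAnchor": 0.30,
    "VersionRange": 0.25,
    "DynamicCallTarget": 0.25,
    "ConflictingFeeds": 0.20,
    "ExternalAssembly": 0.20,
    "MissingVector": 0.15,
    "UnreachableSourceAdvisory": 0.10,
}


def uncertainty(active_flags: Iterable[str]) -> float:
    """U = min(1.0, sum of the weights of all active flags)."""
    # Each flag counts once; sorting fixes the summation order so the
    # floating-point result does not depend on input order.
    unique_flags = sorted(set(active_flags))
    # An unrecognized flag name raises KeyError rather than being ignored.
    total = sum(FLAG_WEIGHTS[flag] for flag in unique_flags)
    return min(1.0, total)
```

Because all seven default weights sum to 1.45, an unknown carrying every flag is capped at 1.0.
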
### Factor C: Centrality (Graph Position Importance)

Measures the unknown's position importance in the call graph using betweenness centrality.

**Formula:**
```
C = min(1.0, betweenness / maxBetweenness)
```

**Parameters:**
- `betweenness`: Raw betweenness centrality from graph analysis
- `maxBetweenness`: Normalization ceiling (default: 1000)

**Rationale:** High-betweenness nodes appear on many shortest paths, meaning they're likely to be reached regardless of entry point.

**Related Metrics:**
- `DegreeCentrality`: Number of incoming + outgoing edges (stored but not used in score)
- `BetweennessCentrality`: Raw betweenness value (stored for debugging)

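The normalization step is small enough to sketch directly. The function name and the guards against non-positive inputs are assumptions. The raw betweenness value is taken as given, since graph analysis happens elsewhere.

```python
def centrality(betweenness: float, max_betweenness: float = 1000.0) -> float:
    """C = min(1.0, betweenness / maxBetweenness)."""
    if max_betweenness <= 0:
        raise ValueError("max_betweenness must be positive")
    # Negative raw values are treated as 0 (assumed guard).
    return max(0.0, min(1.0, betweenness / max_betweenness))
```

For example, a raw betweenness of 250 against the default ceiling of 1000 gives C = 0.25.
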
### Factor S: Staleness (Evidence Age)

Measures the age of the evidence: the number of days elapsed since the last successful analysis.

**Formula:**
```
S = min(1.0, daysSinceLastAnalysis / maxDays)
```

Optional exponential-decay variant:
```
S = 1 - exp(-daysSinceLastAnalysis / tau)
```

**Parameters:**
- `daysSinceLastAnalysis`: Days since `LastAnalyzedAt` timestamp
- `maxDays`: Staleness ceiling (default: 14 days)
- `tau`: Decay constant for exponential model (default: 14)

**Special Cases:**
- Never analyzed (`LastAnalyzedAt` is null): S = 1.0 (maximum staleness)

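Both staleness variants can be sketched as follows. The function names are hypothetical, and counting whole elapsed days (rather than fractional days) is an assumption.

```python
import math
from datetime import datetime
from typing import Optional


def staleness_linear(days: float, max_days: float = 14.0) -> float:
    """S = min(1.0, daysSinceLastAnalysis / maxDays)."""
    return min(1.0, max(0.0, days) / max_days)


def staleness_exponential(days: float, tau: float = 14.0) -> float:
    """S = 1 - exp(-daysSinceLastAnalysis / tau)."""
    return 1.0 - math.exp(-max(0.0, days) / tau)


def staleness(last_analyzed_at: Optional[datetime], now: datetime,
              max_days: float = 14.0) -> float:
    """Linear staleness from timestamps; never analyzed means S = 1.0."""
    if last_analyzed_at is None:
        return 1.0
    # Whole elapsed days only (assumption).
    days = (now - last_analyzed_at).days
    return staleness_linear(days, max_days)
```

The two models behave differently near the ceiling. At 14 days the linear form is already at 1.0, while the exponential form with `tau = 14` is at about 0.63 and only approaches 1.0 asymptotically.
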
## Band Assignment

Based on the composite score, unknowns are assigned to triage bands:

| Band | Threshold | Rescan Policy | Description |
|------|-----------|---------------|-------------|
| **HOT** | Score >= 0.70 | 15 minutes | Immediate rescan + VEX escalation |
| **WARM** | 0.40 <= Score < 0.70 | 24 hours | Scheduled rescan within 12-72h |
| **COLD** | Score < 0.40 | 7 days | Weekly batch processing |

Thresholds are configurable:
```yaml
Signals:
  UnknownsScoring:
    HotThreshold: 0.70
    WarmThreshold: 0.40
```

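The threshold comparison can be sketched as below. The function name is hypothetical. The lowercase band names follow the `band` column constraint in the database schema, and the inclusive lower bounds follow the table above.

```python
def assign_band(score: float, hot_threshold: float = 0.70,
                warm_threshold: float = 0.40) -> str:
    """Map a composite score to a triage band."""
    if warm_threshold > hot_threshold:
        raise ValueError("warm_threshold must not exceed hot_threshold")
    if score >= hot_threshold:
        return "hot"
    if score >= warm_threshold:
        return "warm"
    return "cold"
```
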
## Scheduler Integration

The `UnknownsRescanWorker` processes unknowns based on their band:

### HOT Band Processing
- Poll interval: 1 minute
- Batch size: 10 items
- Action: Trigger immediate rescan via `IRescanOrchestrator`
- On failure: Exponential backoff, max 3 retries before demotion to WARM

### WARM Band Processing
- Poll interval: 5 minutes
- Batch size: 50 items
- Scheduled window: 12-72 hours based on score within band
- On failure: Increment `RescanAttempts`, re-queue with delay

### COLD Band Processing
- Schedule: Weekly on configurable day (default: Sunday)
- Batch size: 500 items
- Action: Batch rescan job submission
- On failure: Log and retry next week

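This document does not specify how a score maps onto the 12-72 hour WARM window. One plausible reading is a linear interpolation in which higher scores are rescanned sooner. The sketch below is purely an assumption, and `warm_rescan_delay_hours` is a hypothetical name.

```python
def warm_rescan_delay_hours(score: float,
                            warm_threshold: float = 0.40,
                            hot_threshold: float = 0.70,
                            min_hours: float = 12.0,
                            max_hours: float = 72.0) -> float:
    """Assumed linear mapping: bottom of band -> 72h, top of band -> 12h."""
    if hot_threshold <= warm_threshold:
        raise ValueError("hot_threshold must exceed warm_threshold")
    # Position of the score inside the WARM band, clamped to [0, 1].
    position = (score - warm_threshold) / (hot_threshold - warm_threshold)
    position = max(0.0, min(1.0, position))
    return max_hours - position * (max_hours - min_hours)
```

Under this assumption, a score in the middle of the band (0.55) would be scheduled roughly 42 hours out.
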
## Normalization Trace

Each scored unknown includes a `NormalizationTrace` for debugging and replay:

```json
{
  "rawPopularity": 42,
  "normalizedPopularity": 0.81,
  "popularityFormula": "min(1, log10(1 + 42) / log10(1 + 100))",

  "rawExploitPotential": 0.5,
  "normalizedExploitPotential": 0.5,

  "rawUncertainty": 0.55,
  "normalizedUncertainty": 0.55,
  "activeFlags": ["NoProvenanceAnchor", "VersionRange"],

  "rawCentrality": 250.0,
  "normalizedCentrality": 0.25,

  "rawStaleness": 7,
  "normalizedStaleness": 0.5,

  "weights": {
    "wP": 0.25,
    "wE": 0.25,
    "wU": 0.25,
    "wC": 0.15,
    "wS": 0.10
  },
  "finalScore": 0.55,
  "assignedBand": "Warm",
  "computedAt": "2025-12-15T10:00:00Z"
}
```

**Replay Capability:** Given the trace, the score can be recomputed from the normalized values and weights:
```
Score = 0.25×0.81 + 0.25×0.5 + 0.25×0.55 + 0.15×0.25 + 0.10×0.5
      = 0.2025 + 0.125 + 0.1375 + 0.0375 + 0.05
      = 0.5525 ≈ 0.55
```

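A replay check over a stored trace might look like the following sketch. The field names come from the trace format above. The function names, the two-decimal rounding tolerance, and the sample values (which are illustrative and not the trace shown above) are assumptions.

```python
import json

# Pairs of (normalized factor field, weight key) from the trace format.
FACTOR_FIELDS = [
    ("normalizedPopularity", "wP"),
    ("normalizedExploitPotential", "wE"),
    ("normalizedUncertainty", "wU"),
    ("normalizedCentrality", "wC"),
    ("normalizedStaleness", "wS"),
]


def replay_score(trace: dict) -> float:
    """Recompute the composite score from a trace's own values."""
    weights = trace["weights"]
    total = sum(weights[w] * trace[field] for field, w in FACTOR_FIELDS)
    return max(0.0, min(1.0, total))


def trace_is_consistent(trace: dict, tolerance: float = 0.005) -> bool:
    """True when finalScore matches the replay within rounding error."""
    return abs(replay_score(trace) - trace["finalScore"]) <= tolerance


# Illustrative trace with made-up values.
sample = json.loads("""
{
  "normalizedPopularity": 0.6,
  "normalizedExploitPotential": 0.5,
  "normalizedUncertainty": 0.55,
  "normalizedCentrality": 0.25,
  "normalizedStaleness": 0.5,
  "weights": {"wP": 0.25, "wE": 0.25, "wU": 0.25, "wC": 0.15, "wS": 0.10},
  "finalScore": 0.50
}
""")
print(trace_is_consistent(sample))
```
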
## API Endpoints

### Query Unknowns by Band

```
GET /api/signals/unknowns?band=hot&limit=50&offset=0
```

Response:
```json
{
  "items": [
    {
      "id": "unk-123",
      "subjectKey": "myapp|1.0.0",
      "purl": "pkg:npm/lodash@4.17.21",
      "score": 0.82,
      "band": "Hot",
      "flags": { "noProvenanceAnchor": true, "versionRange": true },
      "nextScheduledRescan": "2025-12-15T10:15:00Z"
    }
  ],
  "total": 15,
  "hasMore": false
}
```

### Get Score Explanation

```
GET /api/signals/unknowns/{id}/explain
```

Response:
```json
{
  "unknown": { /* full UnknownSymbolDocument */ },
  "normalizationTrace": { /* trace object */ },
  "factorBreakdown": {
    "popularity": { "raw": 42, "normalized": 0.81, "weighted": 0.2025 },
    "exploitPotential": { "raw": 0.5, "normalized": 0.5, "weighted": 0.125 },
    "uncertainty": { "raw": 0.55, "normalized": 0.55, "weighted": 0.1375 },
    "centrality": { "raw": 250, "normalized": 0.25, "weighted": 0.0375 },
    "staleness": { "raw": 7, "normalized": 0.5, "weighted": 0.05 }
  },
  "bandThresholds": { "hot": 0.70, "warm": 0.40 }
}
```

## Configuration Reference

```yaml
Signals:
  UnknownsScoring:
    # Factor weights (must sum to 1.0)
    WeightPopularity: 0.25
    WeightExploitPotential: 0.25
    WeightUncertainty: 0.25
    WeightCentrality: 0.15
    WeightStaleness: 0.10

    # Popularity normalization
    PopularityMaxDeployments: 100

    # Uncertainty flag weights
    FlagWeightNoProvenance: 0.30
    FlagWeightVersionRange: 0.25
    FlagWeightConflictingFeeds: 0.20
    FlagWeightMissingVector: 0.15
    FlagWeightUnreachableSource: 0.10
    FlagWeightDynamicTarget: 0.25
    FlagWeightExternalAssembly: 0.20

    # Centrality normalization
    CentralityMaxBetweenness: 1000.0

    # Staleness normalization
    StalenessMaxDays: 14
    StalenessTau: 14  # For exponential decay

    # Band thresholds
    HotThreshold: 0.70
    WarmThreshold: 0.40

    # Rescan scheduling
    HotRescanMinutes: 15
    WarmRescanHours: 24
    ColdRescanDays: 7

  UnknownsDecay:
    # Nightly batch decay
    BatchEnabled: true
    MaxSubjectsPerBatch: 1000
    ColdBatchDay: Sunday
```

## Determinism Requirements

The scoring algorithm is fully deterministic:

1. **Same inputs produce identical scores** - Given identical `UnknownSymbolDocument`, deployment counts, and graph metrics, the score will always be the same
2. **Normalization trace enables replay** - The trace contains all raw values and weights needed to reproduce the score
3. **Timestamps use UTC ISO 8601** - All `ComputedAt`, `LastAnalyzedAt`, and `NextScheduledRescan` timestamps are UTC
4. **Weights logged per computation** - The trace includes the exact weights used, allowing audit of configuration changes

## Database Schema

```sql
-- Unknowns table (enhanced)
CREATE TABLE signals.unknowns (
    id UUID PRIMARY KEY,
    subject_key TEXT NOT NULL,
    purl TEXT,
    symbol_id TEXT,
    callgraph_id TEXT,

    -- Scoring factors
    popularity_score FLOAT DEFAULT 0,
    deployment_count INT DEFAULT 0,
    exploit_potential_score FLOAT DEFAULT 0,
    uncertainty_score FLOAT DEFAULT 0,
    centrality_score FLOAT DEFAULT 0,
    degree_centrality INT DEFAULT 0,
    betweenness_centrality FLOAT DEFAULT 0,
    staleness_score FLOAT DEFAULT 0,
    days_since_last_analysis INT DEFAULT 0,

    -- Composite score and band
    score FLOAT DEFAULT 0,
    band TEXT DEFAULT 'cold' CHECK (band IN ('hot', 'warm', 'cold')),

    -- Metadata
    flags JSONB DEFAULT '{}',
    normalization_trace JSONB,
    rescan_attempts INT DEFAULT 0,
    last_rescan_result TEXT,

    -- Timestamps
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_analyzed_at TIMESTAMPTZ,
    next_scheduled_rescan TIMESTAMPTZ
);

-- Indexes for band-based queries
CREATE INDEX idx_unknowns_band ON signals.unknowns(band);
CREATE INDEX idx_unknowns_score ON signals.unknowns(score DESC);
CREATE INDEX idx_unknowns_next_rescan ON signals.unknowns(next_scheduled_rescan)
    WHERE next_scheduled_rescan IS NOT NULL;
CREATE INDEX idx_unknowns_subject ON signals.unknowns(subject_key);
```

## Metrics and Observability

The following metrics are exposed for monitoring:

| Metric | Type | Description |
|--------|------|-------------|
| `signals_unknowns_total` | Gauge | Total unknowns by band |
| `signals_unknowns_rescans_total` | Counter | Rescans triggered by band |
| `signals_unknowns_scoring_duration_seconds` | Histogram | Scoring computation time |
| `signals_unknowns_band_transitions_total` | Counter | Band changes (e.g., WARM->HOT) |

---

## Runtime Updated Events

> Sprint: SPRINT_20260112_008_SIGNALS_runtime_telemetry_events

When runtime observations change for a CVE and product pair, the Signals module emits `runtime.updated` events to drive policy reanalysis of unknowns.

### Event Types

| Event Type | Constant | Description |
|------------|----------|-------------|
| `runtime.updated` | `RuntimeEventTypes.Updated` | Runtime observations changed for a subject |
| `runtime.ingested` | `RuntimeEventTypes.Ingested` | New runtime observation batch ingested |
| `runtime.confirmed` | `RuntimeEventTypes.Confirmed` | Runtime fact confirmed by additional evidence |
| `runtime.exploit_detected` | `RuntimeEventTypes.ExploitDetected` | Exploit behavior detected at runtime |

### Update Types

| Type | Description |
|------|-------------|
| `NewObservation` | First runtime observation for a subject |
| `StateChange` | Reachability state changed from previous observation |
| `ConfidenceIncrease` | Additional hits increased confidence score |
| `NewCallPath` | Previously unseen call path observed |
| `ExploitTelemetry` | Exploit behavior detected (always triggers reanalysis) |

### Event Schema

```jsonc
{
  "eventId": "sha256:abc123...",           // Deterministic based on content
  "eventType": "runtime.updated",
  "version": "1.0.0",
  "tenant": "default",
  "cveId": "CVE-2026-1234",                // Optional
  "purl": "pkg:npm/lodash@4.17.21",        // Optional
  "subjectKey": "cve:CVE-2026-1234|purl:pkg:npm/lodash@4.17.21",
  "callgraphId": "cg-scan-001",
  "evidenceDigest": "sha256:def456...",    // Digest of runtime evidence
  "updateType": "NewCallPath",
  "previousState": "observed",             // Null for new observations
  "newState": "observed",
  "confidence": 0.85,                      // 0.0-1.0
  "fromRuntime": true,
  "runtimeMethod": "ebpf",                 // "ebpf", "agent", "probe"
  "observedNodeHashes": ["sha256:...", "sha256:..."],
  "pathHash": "sha256:...",                // Optional
  "triggerReanalysis": true,
  "reanalysisReason": "New call path observed at runtime",
  "occurredAtUtc": "2026-01-15T10:30:00Z",
  "traceId": "abc123"                      // Optional correlation ID
}
```

### Reanalysis Triggers

The `triggerReanalysis` flag is set to `true` when:

1. **Exploit telemetry detected** (always triggers)
2. **State change** from previous observation
3. **High-confidence runtime observation** (confidence >= 0.8 and fromRuntime=true)
4. **New observation** (no previous runtime data)

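The four trigger conditions can be expressed as a single predicate. The sketch below is illustrative only: the function name is hypothetical, and treating a null `previousState` as the marker for "no previous runtime data" is an assumption based on the event schema comment.

```python
from typing import Optional


def should_trigger_reanalysis(update_type: str,
                              previous_state: Optional[str],
                              new_state: str,
                              confidence: float,
                              from_runtime: bool) -> bool:
    """Return True when any documented reanalysis trigger applies."""
    # 1. Exploit telemetry always triggers.
    if update_type == "ExploitTelemetry":
        return True
    # 4. New observation: no previous runtime data (assumed: null state).
    if previous_state is None:
        return True
    # 2. State change from the previous observation.
    if previous_state != new_state:
        return True
    # 3. High-confidence runtime observation.
    if from_runtime and confidence >= 0.8:
        return True
    return False
```

A `NewCallPath` update with unchanged state `observed`, confidence 0.85, and `fromRuntime=true` triggers reanalysis under rule 3.
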
### Event Emission Points

Runtime updated events are emitted from:

1. `RuntimeFactsIngestionService.IngestAsync` - After runtime facts are persisted
2. `ReachabilityScoringService` - When scores are recomputed with new runtime data

### Deterministic Event IDs

Event IDs are computed deterministically using SHA-256 of:
- `subjectKey`
- `evidenceDigest`
- `occurredAtUtc` (ISO 8601 format)

This ensures idempotent event handling and deduplication.

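A sketch of how such an ID could be derived is shown below. The exact canonical encoding is not specified here, so the newline separator, the second-precision `Z`-suffixed timestamp, and the `sha256:` prefix (taken from the event schema example) are all assumptions. `compute_event_id` and `format_utc` are hypothetical names.

```python
import hashlib
from datetime import datetime, timezone


def format_utc(ts: datetime) -> str:
    """Render a timezone-aware timestamp as UTC ISO 8601 with 'Z'."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def compute_event_id(subject_key: str, evidence_digest: str,
                     occurred_at: datetime) -> str:
    """Hash the three identifying fields into a stable event ID."""
    # Newline-joined canonical form (assumed; subject keys contain '|').
    canonical = "\n".join(
        [subject_key, evidence_digest, format_utc(occurred_at)]
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "sha256:" + digest
```

Because the ID depends only on these three fields, re-emitting the same observation yields the same ID, so consumers can deduplicate on it.
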
## Related Documentation

- [Unknowns Registry](./unknowns-registry.md) - Data model and API for unknowns
- [Reachability Analysis](./reachability.md) - Reachability scoring integration
- [Callgraph Schema](./callgraph-formats.md) - Graph structure for centrality computation