# FN-Drift Metrics Reference
> **Sprint:** SPRINT_3404_0001_0001
> **Module:** Scanner Storage / Telemetry
## Overview
False-Negative Drift (FN-Drift) measures how often a finding's classification changes from `unaffected` or `unknown` to `affected` on rescan. The metric supports:
- **Accuracy Assessment**: Tracking scanner reliability over time
- **SLO Compliance**: Meeting false-negative rate targets
- **Root Cause Analysis**: Stratified analysis by drift cause
- **Feed Quality**: Identifying problematic vulnerability feeds
## Metrics
### Gauges (30-day rolling window)
| Metric | Type | Description |
|--------|------|-------------|
| `scanner.fn_drift.percent` | Gauge | 30-day rolling FN-Drift percentage |
| `scanner.fn_drift.transitions_30d` | Gauge | Total FN transitions in last 30 days |
| `scanner.fn_drift.evaluated_30d` | Gauge | Total findings evaluated in last 30 days |
| `scanner.fn_drift.cause.feed_delta` | Gauge | FN transitions caused by feed updates |
| `scanner.fn_drift.cause.rule_delta` | Gauge | FN transitions caused by rule changes |
| `scanner.fn_drift.cause.lattice_delta` | Gauge | FN transitions caused by VEX lattice changes |
| `scanner.fn_drift.cause.reachability_delta` | Gauge | FN transitions caused by reachability changes |
| `scanner.fn_drift.cause.engine` | Gauge | FN transitions caused by engine changes (should be ~0) |
### Counters (all-time)
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `scanner.classification_changes_total` | Counter | `cause` | Total classification status changes |
| `scanner.fn_transitions_total` | Counter | `cause` | Total false-negative transitions |
## Classification Statuses
| Status | Description |
|--------|-------------|
| `new` | First scan, no previous status |
| `unaffected` | Confirmed not affected |
| `unknown` | Status unknown/uncertain |
| `affected` | Confirmed affected |
| `fixed` | Previously affected, now fixed |
## Drift Causes
| Cause | Description | Expected Impact |
|-------|-------------|-----------------|
| `feed_delta` | Vulnerability feed updated (NVD, GHSA, OVAL) | High - most common cause |
| `rule_delta` | Policy rules changed | Medium - controlled by policy team |
| `lattice_delta` | VEX lattice state changed | Medium - VEX updates |
| `reachability_delta` | Reachability analysis changed | Low - improved analysis |
| `engine` | Scanner engine change | ~0 - determinism violation if >0 |
| `other` | Unknown/unclassified cause | Low - investigate if high |
## FN-Drift Definition
A **False-Negative Transition** occurs when:
- Previous status was `unaffected` or `unknown`
- New status is `affected`

Such a transition means an earlier scan did not report the finding as vulnerable (`unaffected` or `unknown`), while a later scan classifies it as `affected`: the earlier result was a false negative.
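
The transition rule above can be sketched as a small predicate. This is an illustrative Python sketch under stated assumptions, not the scanner's implementation: the helper name `is_fn_transition` is borrowed from the schema column of the same name, and its body is only a reading of the two conditions listed in this section.

```python
# Hypothetical helper: encodes the FN-transition rule as defined in this section.
FN_SOURCE_STATUSES = frozenset({"unaffected", "unknown"})


def is_fn_transition(previous_status: str, new_status: str) -> bool:
    """Return True when a status change counts as a false-negative transition."""
    return previous_status in FN_SOURCE_STATUSES and new_status == "affected"
```

Under this reading, `unknown` to `affected` counts as an FN transition, while `new` to `affected` (first scan) and `affected` to `fixed` do not.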
### FN-Drift Rate Calculation
```
FN-Drift % = (FN Transitions / Total Reclassified) × 100
```
Where:
- **FN Transitions**: Count of `(unaffected|unknown) → affected` changes
- **Total Reclassified**: Count of all status changes, excluding first-scan entries whose previous status is `new`
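
As a worked example of the formula, the Python sketch below computes the rate from the two counts. The function name `fn_drift_percent` is hypothetical, and returning 0.0 for an empty window is an assumption; this document does not say how the scanner handles a zero denominator.

```python
def fn_drift_percent(fn_transitions: int, total_reclassified: int) -> float:
    """FN-Drift % = (FN Transitions / Total Reclassified) x 100."""
    if fn_transitions < 0 or total_reclassified < 0:
        raise ValueError("counts must be non-negative")
    if fn_transitions > total_reclassified:
        raise ValueError("FN transitions cannot exceed total reclassified")
    if total_reclassified == 0:
        return 0.0  # assumption: report 0% when nothing was reclassified
    return 100.0 * fn_transitions / total_reclassified
```

For instance, 12 FN transitions out of 800 reclassified findings gives `fn_drift_percent(12, 800) == 1.5`, which falls in the Warning band of the SLO table.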
## SLO Thresholds
| SLO Level | FN-Drift Threshold | Alert Severity |
|-----------|-------------------|----------------|
| Target | < 1.0% | None |
| Warning | 1.0% - 2.5% | Warning |
| Critical | > 2.5% | Critical |
| Engine Drift | > 0% | Page |
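
To illustrate how the bands in the table partition the metric, the following Python sketch maps a percentage to an SLO level. The function names `slo_level` and `engine_drift_pages` are hypothetical; the boundary handling (1.0% and 2.5% both fall in Warning) is read directly from the table.

```python
def slo_level(fn_drift_pct: float) -> str:
    """Map an FN-Drift percentage to the SLO level from the table above."""
    if fn_drift_pct < 1.0:
        return "target"      # < 1.0%: no alert
    if fn_drift_pct <= 2.5:
        return "warning"     # 1.0% - 2.5%
    return "critical"        # > 2.5%


def engine_drift_pages(engine_fn_transitions: int) -> bool:
    """Any engine-caused FN transition is treated as page-worthy."""
    return engine_fn_transitions > 0
```

Engine drift is evaluated separately from the percentage bands, so a deployment can be within Target overall and still page on a single engine-caused transition.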
### Alerting Rules
```yaml
# Example Prometheus alerting rules
groups:
  - name: fn-drift
    rules:
      - alert: FnDriftWarning
        # Warning band starts at 1.0% inclusive (see SLO Thresholds)
        expr: scanner_fn_drift_percent >= 1.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "FN-Drift rate above warning threshold"
      - alert: FnDriftCritical
        expr: scanner_fn_drift_percent > 2.5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "FN-Drift rate above critical threshold"
      - alert: EngineDriftDetected
        expr: scanner_fn_drift_cause_engine > 0
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Engine-caused FN drift detected - determinism violation"
```
## Dashboard Queries
### FN-Drift Trend (Grafana)
```promql
# 30-day rolling FN-Drift percentage
scanner_fn_drift_percent
# FN transitions by cause
sum by (cause) (rate(scanner_fn_transitions_total[1h]))
# Classification changes rate
sum by (cause) (rate(scanner_classification_changes_total[1h]))
```
### Drift Cause Breakdown
```promql
# Pie chart of drift causes
topk(5,
  sum by (cause) (
    increase(scanner_fn_transitions_total[24h])
  )
)
```
## Database Schema
### classification_history Table
```sql
CREATE TABLE scanner.classification_history (
    id BIGSERIAL PRIMARY KEY,
    artifact_digest TEXT NOT NULL,
    vuln_id TEXT NOT NULL,
    package_purl TEXT NOT NULL,
    tenant_id UUID NOT NULL,
    manifest_id UUID NOT NULL,
    execution_id UUID NOT NULL,
    previous_status TEXT NOT NULL,
    new_status TEXT NOT NULL,
    is_fn_transition BOOLEAN GENERATED ALWAYS AS (...) STORED,
    cause TEXT NOT NULL,
    cause_detail JSONB,
    changed_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
### fn_drift_stats Materialized View
Aggregated daily statistics for efficient dashboard queries:
- Day bucket
- Tenant ID
- Cause breakdown
- FN count and percentage
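
The view definition itself is not reproduced here. As a rough sketch of the kind of aggregation the list describes, the Python below groups classification-history rows by day and tenant and derives the FN count, percentage, and per-cause breakdown. The function name `daily_fn_stats`, the UTC day bucket, the grouping key, and the output shape are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

FN_SOURCE_STATUSES = frozenset({"unaffected", "unknown"})


def daily_fn_stats(rows):
    """Aggregate history rows into per-(day, tenant) FN-Drift statistics.

    Each row is a dict with the keys changed_at (timezone-aware datetime),
    tenant_id, cause, previous_status and new_status, mirroring columns of
    scanner.classification_history. Hypothetical shape, not the real view.
    """
    buckets = defaultdict(
        lambda: {"total": 0, "fn_count": 0, "by_cause": defaultdict(int)}
    )
    for row in rows:
        if row["previous_status"] == "new":
            continue  # first-scan entries are excluded from the denominator
        day = row["changed_at"].astimezone(timezone.utc).date()  # assumed UTC bucket
        bucket = buckets[(day, row["tenant_id"])]
        bucket["total"] += 1
        if (row["previous_status"] in FN_SOURCE_STATUSES
                and row["new_status"] == "affected"):
            bucket["fn_count"] += 1
            bucket["by_cause"][row["cause"]] += 1

    return {
        key: {
            "total": b["total"],
            "fn_count": b["fn_count"],
            "fn_percent": 100.0 * b["fn_count"] / b["total"],
            "by_cause": dict(b["by_cause"]),
        }
        for key, b in buckets.items()
    }


# Illustrative input: one FN transition, one ordinary change, one first scan.
ts = datetime(2025, 12, 16, 10, 0, tzinfo=timezone.utc)
example = daily_fn_stats([
    {"changed_at": ts, "tenant_id": "tenant-a", "cause": "feed_delta",
     "previous_status": "unknown", "new_status": "affected"},
    {"changed_at": ts, "tenant_id": "tenant-a", "cause": "rule_delta",
     "previous_status": "affected", "new_status": "fixed"},
    {"changed_at": ts, "tenant_id": "tenant-a", "cause": "feed_delta",
     "previous_status": "new", "new_status": "affected"},
])
```

In this example the first-scan row is skipped, so the single bucket holds two reclassifications, one of which is an FN transition attributed to `feed_delta`.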
## Related Documentation
- [Determinism Technical Reference](../product-advisories/14-Dec-2025%20-%20Determinism%20and%20Reproducibility%20Technical%20Reference.md) - Section 13.2
- [Scanner Architecture](../modules/scanner/architecture.md)
- [Telemetry Stack](../modules/telemetry/architecture.md)