# Tiered Precision Curves for Scanner Accuracy

**Advisory:** 16-Dec-2025 - Measuring Progress with Tiered Precision Curves
**Status:** Processing
**Related Sprints:** SPRINT_3500_0003_0001 (Ground-Truth Corpus)

## Executive Summary

This advisory introduces a tiered approach to measuring scanner accuracy that prevents metric gaming. By tracking precision and recall separately for three evidence tiers (Imported, Executed, Tainted→Sink), we ensure that improvements in one tier cannot hide regressions in another.

## Key Concepts

### Evidence Tiers

| Tier | Description | Risk Level | Typical Volume |
|------|-------------|------------|----------------|
| **Imported** | Vulnerability exists in a dependency | Lowest | High |
| **Executed** | Vulnerable code or dependency actually runs | Medium | Medium |
| **Tainted→Sink** | User-controlled data reaches a sink | Highest | Low |

### Tier Precedence

When a finding has multiple evidence types, the highest tier wins:
1. `tainted_sink` (highest)
2. `executed`
3. `imported`

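
The precedence rule can be sketched in a few lines of Python. This is only an illustration: the `TIER_RANK` table and the `highest_tier` helper are hypothetical names, and the numeric ranks carry no meaning beyond ordering.

```python
# Hypothetical rank table; only the relative order comes from the list above.
TIER_RANK = {"imported": 1, "executed": 2, "tainted_sink": 3}


def highest_tier(tiers):
    """Return the highest-precedence tier among the given tier names."""
    tiers = list(tiers)
    if not tiers:
        raise ValueError("a finding needs at least one evidence tier")
    # Unknown tier names raise KeyError rather than being silently ranked.
    return max(tiers, key=TIER_RANK.__getitem__)
```

For example, a finding with both `imported` and `executed` evidence resolves to `executed`.
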
## Implementation Components

### 1. Evidence Schema (`eval` schema)

```sql
-- Shorthand table signatures; column types are omitted.

-- Ground truth samples
eval.sample(sample_id, name, repo_path, commit_sha, language, scenario, entrypoints)

-- Expected findings
eval.expected_finding(expected_id, sample_id, vuln_key, tier, rule_key, sink_class)

-- Evaluation runs
eval.run(eval_run_id, scanner_version, rules_hash, concelier_snapshot_hash)

-- Observed results
eval.observed_finding(observed_id, eval_run_id, sample_id, vuln_key, tier, score, rule_key, evidence)

-- Computed metrics
eval.metrics(eval_run_id, tier, op_point, precision, recall, f1, pr_auc, latency_p50_ms)
```

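
To show how the `precision`, `recall`, and `f1` columns of `eval.metrics` might be derived from the expected and observed tables, here is a minimal Python sketch. It assumes that an observed finding matches an expected one when `sample_id`, `vuln_key`, and `tier` all agree; the advisory does not specify the matching rule, and the `Finding` tuple and `tier_metrics` function are hypothetical.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for rows of eval.expected_finding
# and eval.observed_finding (only the columns used for matching).
Finding = namedtuple("Finding", "sample_id vuln_key tier")


def tier_metrics(expected, observed, tier):
    """Compute precision, recall and F1 for a single evidence tier."""
    exp = {(f.sample_id, f.vuln_key) for f in expected if f.tier == tier}
    obs = {(f.sample_id, f.vuln_key) for f in observed if f.tier == tier}
    tp = len(exp & obs)  # observed findings that were expected
    precision = tp / len(obs) if obs else 0.0
    recall = tp / len(exp) if exp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tier": tier, "precision": precision, "recall": recall, "f1": f1}
```

Because each tier is filtered before matching, a gain in `imported` cannot mask a loss in `tainted_sink`.
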
### 2. Scanner Worker Changes

Workers emit evidence primitives:
- `DependencyEvidence { purl, version, lockfile_path }`
- `ReachabilityEvidence { entrypoint, call_path[], confidence }`
- `TaintEvidence { source, sink, sanitizers[], dataflow_path[], confidence }`

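
As a rough illustration of these shapes, the primitives could be modelled as immutable records. The field names come from the list above; the field types (strings, tuples of strings, a float confidence) are assumptions, since the advisory does not state them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DependencyEvidence:
    purl: str
    version: str
    lockfile_path: str


@dataclass(frozen=True)
class ReachabilityEvidence:
    entrypoint: str
    call_path: Tuple[str, ...]  # assumed: ordered symbol names
    confidence: float           # assumed: a score, range not specified


@dataclass(frozen=True)
class TaintEvidence:
    source: str
    sink: str
    sanitizers: Tuple[str, ...]     # assumed: sanitizer names on the path
    dataflow_path: Tuple[str, ...]  # assumed: ordered dataflow steps
    confidence: float
```

Frozen records keep evidence hashable and comparable, which helps when evidence for the same `vuln_key` is later merged.
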
### 3. Scanner WebService Changes

WebService performs tiering:
- Merge evidence for the same `vuln_key`
- Run reachability/taint algorithms
- Assign `evidence_tier` deterministically
- Persist normalized findings

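
The merge and tier-assignment steps might look like the following sketch. It leaves out the reachability and taint analysis itself and assumes each evidence item has already been labelled with a kind; the kind names, the `KIND_TO_TIER` mapping, and `assign_tiers` are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from evidence kind to the tier it supports.
KIND_TO_TIER = {
    "dependency": "imported",
    "reachability": "executed",
    "taint": "tainted_sink",
}
TIER_RANK = {"imported": 1, "executed": 2, "tainted_sink": 3}


def assign_tiers(evidence_items):
    """Merge (vuln_key, kind) pairs and pick one tier per vuln_key.

    The result depends only on the set of evidence, not on arrival order,
    which is one way to make the assignment deterministic.
    """
    merged = defaultdict(set)
    for vuln_key, kind in evidence_items:
        merged[vuln_key].add(KIND_TO_TIER[kind])
    return {
        vuln_key: max(tiers, key=TIER_RANK.__getitem__)
        for vuln_key, tiers in sorted(merged.items())
    }
```
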
### 4. Evaluator CLI

New tool `StellaOps.Scanner.Evaluation.Cli`:
- `import-corpus` - Load samples and expected findings
- `run` - Trigger scans using replay manifest
- `compute` - Calculate per-tier PR curves
- `report` - Generate markdown artifacts

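
The per-tier curve that `compute` produces could be calculated along the lines below. This Python sketch is not the CLI's implementation: it assumes each observed finding in a tier carries a `score` and a flag saying whether it matched an expected finding, and it approximates PR-AUC as average precision (precision weighted by the recall gained at each threshold). The function names are hypothetical.

```python
def pr_curve(scored, total_expected):
    """Build precision/recall points for one tier.

    scored: list of (score, matched) pairs for observed findings.
    total_expected: number of expected findings in the tier, including
    those the scanner never reported.
    Returns a list of (threshold, precision, recall), highest score first.
    """
    ranked = sorted(scored, key=lambda item: -item[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Findings with the same score fall on the same side of a cut.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        recall = tp / total_expected if total_expected else 0.0
        points.append((threshold, precision, recall))
    return points


def pr_auc(points):
    """Average-precision estimate of the area under the PR curve."""
    area = 0.0
    prev_recall = 0.0
    for _, precision, recall in points:
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area
```

Passing `total_expected` separately keeps missed findings in the recall denominator, so a scanner cannot raise recall simply by reporting less.
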
### 5. CI Gates

Fail builds when:
- PR-AUC for `imported` drops by more than 2%
- PR-AUC for `executed` or `tainted_sink` drops by more than 1%
- FP rate in `tainted_sink` exceeds 5% at Recall ≥ 0.7

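
A gate check along these lines could run at the end of an evaluation job. The sketch reads the percentages as absolute drops in PR-AUC against a stored baseline, which is an assumption (the advisory does not say absolute or relative), and it takes the `tainted_sink` FP rate at the stated recall as an already-measured input. All names are hypothetical.

```python
# Assumed interpretation: limits are absolute PR-AUC drops versus baseline.
MAX_AUC_DROP = {"imported": 0.02, "executed": 0.01, "tainted_sink": 0.01}
MAX_TAINTED_FP_RATE = 0.05


def gate_failures(baseline_auc, current_auc, tainted_fp_rate):
    """Return human-readable reasons to fail the build (empty if none)."""
    failures = []
    for tier, limit in MAX_AUC_DROP.items():
        drop = baseline_auc[tier] - current_auc[tier]
        if drop > limit:
            failures.append(
                f"PR-AUC({tier}) dropped by {drop:.3f} (limit {limit:.2f})"
            )
    if tainted_fp_rate > MAX_TAINTED_FP_RATE:
        failures.append(
            f"FP rate in tainted_sink is {tainted_fp_rate:.3f} "
            f"(limit {MAX_TAINTED_FP_RATE:.2f})"
        )
    return failures
```

A CI step would exit non-zero whenever the returned list is non-empty and print each reason.
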
## Operating Points

| Tier | Target Recall | Purpose |
|------|--------------|---------|
| `imported` | ≥ 0.60 | Broad coverage |
| `executed` | ≥ 0.70 | Material risk |
| `tainted_sink` | ≥ 0.80 | Actionable findings |

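
One way to turn a target recall into a concrete operating point is to pick the strictest score threshold that still meets the target. The sketch below assumes a curve given as `(threshold, precision, recall)` points; the selection rule and the `operating_point` helper are illustrative, not something the advisory prescribes.

```python
# Target recall per tier, copied from the table above.
TARGET_RECALL = {"imported": 0.60, "executed": 0.70, "tainted_sink": 0.80}


def operating_point(curve, target_recall):
    """Return the highest-threshold point whose recall meets the target.

    curve: list of (threshold, precision, recall) points.
    Returns None when no point on the curve reaches the target recall.
    """
    eligible = [point for point in curve if point[2] >= target_recall]
    if not eligible:
        return None
    return max(eligible, key=lambda point: point[0])
```

The precision at the returned point is then the figure to track over releases for that tier.
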
## Integration with Existing Systems

### Concelier
- Stores advisory data; does not perform tiering
- Tag advisories with sink classes when available

### Excititor (VEX)
- Include `tier` in VEX statements
- Allow per-tier thresholds in policy
- Preserve pruning provenance

### Notify
- Gate alerts on tiered thresholds
- Page only on `tainted_sink` at operating point

### UI
- Show tier badge on findings
- Default sort: `tainted_sink` > `executed` > `imported`
- Display evidence summary (entrypoint, path length, sink class)

## Success Criteria

1. We can demonstrate a release in which overall precision stayed flat but tainted→sink PR-AUC improved
2. On-call noise is reduced via tier-gated paging
3. TTFS p95 for tainted→sink stays within budget

## Related Documentation

- [Ground-Truth Corpus Sprint](../implplan/SPRINT_3500_0003_0001_ground_truth_corpus_ci_gates.md)
- [Scanner Architecture](../modules/scanner/architecture.md)
- [Reachability Analysis](./14-Dec-2025%20-%20Reachability%20Analysis%20Technical%20Reference.md)

## Overlap Analysis

This advisory **extends** the ground-truth corpus work (SPRINT_3500_0003_0001) with:
- Tiered precision tracking (new)
- Per-tier operating points (new)
- CI gates based on tier-specific AUC (enhancement)
- Integration with Notify for tier-gated alerts (new)

No contradictions with existing implementations were found.