Confidence Decay Controls · Signals Runtime

Compiled: 2025-12-01 (UTC)
Scope: Close U1–U10 gaps from docs/product-advisories/31-Nov-2025 FINDINGS.md for confidence decay of unknowns/signals.
Status: Draft for review on 2025-12-03; to be signed (DSSE) after sign-off.

Decisions (U1–U10)

  • τ governance (U1): All τ values live in confidence_decay_config.yaml, change-controlled via DSSE-signed PRs; allowable τ range 1–90 days. Changes require dual approval (Signals + Policy), recorded in history.
  • Floor / freeze (U2): confidence_floor per severity; is_confidence_frozen=true when SLA-bound or manual pin. Floors: Critical 0.60, High 0.45, Medium 0.30, Low 0.20. Freeze auto-expires at freeze_until.
  • Weighted signals (U3): Signal taxonomy with weights: exploit=1.0, customer_incident=0.9, threat_intel=0.7, code_change=0.4, artifact_refresh=0.3, metadata_touch=0.1. last_signal_weighted_at uses max(weighted timestamp).
  • Time source / drift (U4): All timestamps are in UTC; decay falls back to a monotonic clock; events more than 5 minutes in the future or more than 30 days backdated are rejected, and corrections are logged.
  • Deterministic recompute (U5): Nightly job at 03:00 UTC recomputes decay for all items; emits decay_snapshot_YYYY-MM-DD.ndjson with SHA256 and checksum record. On-read recompute only if snapshot is older than 24h.
  • SLA coupling (U6): Items with active SLA clamp to sla_floor (0.60 Critical, 0.50 High) until SLA met. SLA flag and floor are emitted in API.
  • Uncertainty linkage (U7): Confidence is capped by (1 - uncertainty_score); if uncertainty_score ≥0.4, band forced to "under_review" and alerts fire.
  • Backfill & migration (U8): Initial migration seeds last_signal_at from latest activity; default τ from entity profile; dry-run impact report required; backfill script outputs before/after bands.
  • API/UX surfacing (U9): New fields: confidence, confidence_band (critical/high/medium/low/under_review), tau_days, is_frozen, confidence_floor, uncertainty_score, last_signal_weighted_at. Sort default: priority * confidence.
  • Observability & alerts (U10): Counters/gauges: confidence_recalc_latency, items_below_floor, signals_weighted_by_type{type}, decay_snapshots_age_hours, uncertainty_forced_under_review. Alerts on missing nightly snapshot, decay drift >1 band, or SLA items below floor.
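
The decisions above fix floors, SLA clamps, the uncertainty cap, and freezing, but they do not state the decay curve itself. The following is a minimal sketch that assumes exponential decay, exp(-age_days / τ), and assumes the uncertainty cap is applied after the floors (so the cap wins when the two conflict). The function name `decayed_confidence` and its parameters are hypothetical; only the floor values and field meanings come from this document.

```python
import math
from datetime import datetime, timezone

# Floors taken from U2 and U6 of this document.
CONFIDENCE_FLOOR = {"critical": 0.60, "high": 0.45, "medium": 0.30, "low": 0.20}
SLA_FLOOR = {"critical": 0.60, "high": 0.50}


def decayed_confidence(last_signal_weighted_at, now, tau_days, severity, *,
                       sla_active=False, uncertainty_score=0.0,
                       frozen_confidence=None, freeze_until=None):
    """Return confidence rounded to 3 decimals (hypothetical helper)."""
    # U2: a frozen item keeps its pinned value until freeze_until passes.
    if frozen_confidence is not None and freeze_until is not None and now < freeze_until:
        return round(frozen_confidence, 3)

    # ASSUMPTION: exponential decay; the document does not name the curve.
    age_days = max((now - last_signal_weighted_at).total_seconds(), 0.0) / 86400.0
    raw = math.exp(-age_days / tau_days)

    # U2/U6: clamp to the severity floor, raised to the SLA floor while an SLA is active.
    floor = CONFIDENCE_FLOOR[severity]
    if sla_active:
        floor = max(floor, SLA_FLOOR.get(severity, floor))
    value = max(raw, floor)

    # U7: cap by (1 - uncertainty_score). ASSUMPTION: the cap overrides the floor.
    value = min(value, 1.0 - uncertainty_score)
    return round(value, 3)


now = datetime(2025, 12, 1, tzinfo=timezone.utc)
last = datetime(2025, 11, 10, tzinfo=timezone.utc)  # 21 days earlier
print(decayed_confidence(last, now, tau_days=21, severity="medium"))  # exp(-1) rounds to 0.368
```

Whether the uncertainty cap or the floor should take precedence is a policy question the sketch only guesses at; swapping the last two steps gives the other ordering.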

Reference Config (draft)

version: 1
updated_at: 2025-12-01T00:00:00Z
entities:
  vulnerability:
    tau_days: 21
    tau_min: 7
    tau_max: 90
    confidence_floor: {critical: 0.60, high: 0.45, medium: 0.30, low: 0.20}
    sla_floor: {critical: 0.60, high: 0.50}
    freeze_default_days: 30
  incident:
    tau_days: 14
    tau_min: 3
    tau_max: 60
signals_taxonomy:
  exploit: 1.0
  customer_incident: 0.9
  threat_intel: 0.7
  code_change: 0.4
  artifact_refresh: 0.3
  metadata_touch: 0.1
time:
  reject_future_minutes: 5
  reject_backdated_days: 30
recompute:
  schedule_utc: "03:00"
  snapshot_retention_days: 30
observability:
  alerts:
    missing_snapshot_hours: 26
    sla_floor_breach: true
    uncertainty_band_force: 0.4
signing:
  predicate: stella.ops/confidenceDecayConfig@v1
  dsse_required: true
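
The `time` section of the config can be read as an acceptance window for incoming event timestamps. The sketch below shows one way to apply it; `check_event_time` and its return strings are hypothetical names, and it assumes the limits are strict (exactly 5 minutes ahead or exactly 30 days back is still accepted), matching the ">" wording in U4.

```python
from datetime import datetime, timedelta, timezone

# Values mirror time.reject_future_minutes and time.reject_backdated_days.
REJECT_FUTURE = timedelta(minutes=5)
REJECT_BACKDATED = timedelta(days=30)


def check_event_time(event_time, now):
    """Classify an event timestamp as accept / reject_future / reject_backdated."""
    if event_time.tzinfo is None:
        # U4 requires UTC; a naive timestamp cannot be normalized safely.
        raise ValueError("event_time must be timezone-aware")
    event_utc = event_time.astimezone(timezone.utc)
    if event_utc - now > REJECT_FUTURE:
        return "reject_future"
    if now - event_utc > REJECT_BACKDATED:
        return "reject_backdated"
    return "accept"


now = datetime(2025, 12, 1, 12, 0, tzinfo=timezone.utc)
print(check_event_time(now + timedelta(minutes=6), now))  # reject_future
```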

Operational Rules

  • Config changes must produce a new DSSE envelope and update the checksum in the nightly snapshot header.
  • Nightly job writes decay_snapshot_<date>.ndjson (sorted by item_id) plus SHA256SUMS; both stored in Evidence Locker.
  • Any on-read recompute must emit an audit log with reasons (stale snapshot or forced recalculation).
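
As an illustration of the snapshot rule, the sketch below builds the NDJSON payload sorted by item_id and a matching SHA256SUMS line. The function `build_snapshot` and the record fields other than `item_id` are hypothetical, and the checksum line assumes the common `sha256sum` layout (hex digest, two spaces, file name); the document does not specify the SHA256SUMS format.

```python
import hashlib
import json


def build_snapshot(items, date_str):
    """Return (file_name, ndjson_bytes, sha256sums_line) for one nightly snapshot."""
    name = f"decay_snapshot_{date_str}.ndjson"
    # Sort by item_id and serialize with sorted keys so the bytes are reproducible.
    ordered = sorted(items, key=lambda item: item["item_id"])
    payload = "".join(
        json.dumps(item, sort_keys=True, separators=(",", ":")) + "\n"
        for item in ordered
    ).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    # ASSUMPTION: sha256sum-style line, "<hex digest>  <file name>".
    sums_line = f"{digest}  {name}\n"
    return name, payload, sums_line


name, payload, sums_line = build_snapshot(
    [{"item_id": "b", "confidence": 0.45}, {"item_id": "a", "confidence": 0.6}],
    "2025-12-01",
)
print(name)  # decay_snapshot_2025-12-01.ndjson
```

Because the payload depends only on the sorted, canonically serialized records, two runs over the same items yield the same digest regardless of input order.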

Migration Playbook

  1. Run dry-run backfill: compute bands with proposed config; write decay_backfill_diff.ndjson (before/after bands, delta) and checksum.
  2. Get dual approval; sign confidence_decay_config.yaml with DSSE predicate above.
  3. Apply config, execute full recompute, publish snapshot + checksums, update observability dashboard baselines.
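
Step 1 of the playbook can be sketched as a pure function over two result sets. The layout of each diff row (`before_band`, `after_band`, `delta`) is an assumption based on the description "before/after bands, delta"; `backfill_diff` and the input shape are hypothetical.

```python
import json


def backfill_diff(before, after):
    """Return NDJSON rows for items present in both runs, sorted by item_id.

    `before` and `after` map item_id -> {"confidence": float, "band": str}.
    """
    rows = []
    for item_id in sorted(before.keys() & after.keys()):
        old, new = before[item_id], after[item_id]
        rows.append(json.dumps({
            "item_id": item_id,
            "before_band": old["band"],
            "after_band": new["band"],
            # Delta in confidence, fixed to 3 decimals like other outputs.
            "delta": round(new["confidence"] - old["confidence"], 3),
        }, sort_keys=True))
    return rows


before = {"v1": {"confidence": 0.5, "band": "medium"}}
after = {"v1": {"confidence": 0.6, "band": "high"}}
for row in backfill_diff(before, after):
    print(row)
```

Items that appear in only one of the two runs are skipped here; a real report would likely list them separately.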

API Notes

  • Add the new fields to Signals API and CLI responses; ensure canonical serialization (sorted keys, UTC timestamps, decimals fixed to 3 places) to avoid hash drift.
  • Bands map: >=0.75 critical, >=0.55 high, >=0.35 medium, >=0.20 low, else under_review.
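
The band thresholds and the serialization rules can be sketched together. The helper names `confidence_band` and `canonical_json` are hypothetical. Two assumptions are worth flagging: the U7 override (uncertainty_score ≥ 0.4 forces "under_review") is checked before the thresholds, and floats are emitted as 3-decimal strings, because a JSON number cannot guarantee trailing zeros; the document does not say how the fixed decimals are encoded.

```python
import json
from datetime import datetime, timezone

# Thresholds from the band mapping above, highest first.
BANDS = [(0.75, "critical"), (0.55, "high"), (0.35, "medium"), (0.20, "low")]


def confidence_band(confidence, uncertainty_score=0.0):
    """Map a confidence value to its band, honoring the U7 uncertainty override."""
    if uncertainty_score >= 0.4:
        return "under_review"
    for threshold, band in BANDS:
        if confidence >= threshold:
            return band
    return "under_review"


def canonical_json(record):
    """Serialize a flat record with sorted keys, UTC timestamps, 3dp decimals."""
    def normalize(value):
        if isinstance(value, bool):
            return value
        if isinstance(value, float):
            return f"{value:.3f}"  # ASSUMPTION: decimals are carried as strings
        if isinstance(value, datetime):
            # Expects timezone-aware datetimes; converts them to UTC.
            return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return value

    normalized = {key: normalize(value) for key, value in record.items()}
    return json.dumps(normalized, sort_keys=True, separators=(",", ":"))


print(confidence_band(0.6))  # high
```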

Evidence & Storage

  • Store config DSSE, snapshots, and backfill reports in Evidence Locker with retention class signals-decay-config.
  • For offline kits, include latest config DSSE + last 3 snapshots and checksums.

Open Items for Review (2025-12-03)

  • Confirm weights for threat_intel vs exploit; adjust if customer data suggests different ordering.
  • Confirm under_review threshold (currently uncertainty ≥0.4).
  • Align with Policy on SLA floors for High severity (0.50 proposed).