Here’s a simple, low‑friction way to keep priorities fresh without constant manual grooming: **let confidence decay over time**.

$$\mathrm{confidence}(t) = e^{-t/\tau}$$

# Exponential confidence decay (what & why)

* **Idea:** Every item (task, lead, bug, doc, hypothesis) has a confidence score that **automatically shrinks with time** if you don’t touch it.
* **Formula:** `confidence(t) = e^(−t/τ)` where `t` is days since last signal (edit, comment, commit, new data), and **τ (“tau”)** is the decay constant.
* **Rule of thumb:** With **τ = 30 days**, at **t = 30** the confidence is **e^(−1) ≈ 0.37**, about a **63% drop**. This surfaces long‑ignored items *gradually*, not with harsh “stale/expired” flips.

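To get a feel for the curve, here is a minimal Python sketch of the formula. The function names `confidence` and `half_life` are illustrative, not part of any existing codebase. One handy consequence of the formula: confidence halves every `τ · ln 2` days, which is roughly 20.8 days for τ = 30.

```python
import math

def confidence(t_days: float, tau_days: float) -> float:
    """Exponential decay: 1.0 at t = 0, e^-1 (about 0.37) at t = tau."""
    return math.exp(-t_days / tau_days)

def half_life(tau_days: float) -> float:
    """Days until confidence falls to 0.5, i.e. tau * ln 2."""
    return tau_days * math.log(2)

# Print the decay curve at a few ages for tau = 30 days.
for t in (0, 7, 30, 60, 90):
    print(f"t={t:>2}d  confidence={confidence(t, 30):.2f}")
print(f"half-life for tau=30: {half_life(30):.1f} days")
```
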
# How to use it in practice

* **Signals that reset t → 0:** comment on the ticket, new benchmark, fresh log sample, doc update, CI run, new market news.
* **Sort queues by:** `priority × confidence(t)` (or severity × confidence). Quiet items drift down; truly active ones stay up.
* **Escalation bands:**

  * `>0.6` = green (recently touched)
  * `0.3–0.6` = amber (review soon)
  * `<0.3` = red (poke or close)

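As a rough illustration of the sort rule, the sketch below ranks three made-up items by `priority × confidence(t)`. The `Item` class, the item names, and the default τ of 30 days are assumptions for the example only.

```python
import math
from dataclasses import dataclass

@dataclass
class Item:
    name: str
    priority: float           # base priority, e.g. 1-100
    days_since_signal: float
    tau_days: float = 30.0    # assumed default, matching the rule of thumb

def effective_score(item: Item) -> float:
    """priority x confidence(t), with confidence(t) = e^(-t/tau)."""
    return item.priority * math.exp(-item.days_since_signal / item.tau_days)

queue = [
    Item("quiet-critical", priority=90, days_since_signal=60),
    Item("active-medium", priority=50, days_since_signal=2),
    Item("recent-low", priority=20, days_since_signal=0),
]

# Highest effective score first.
ranked = sorted(queue, key=effective_score, reverse=True)
for item in ranked:
    print(f"{item.name:15s} {effective_score(item):6.1f}")
```

Here the critical item that has been silent for 60 days drops to the bottom, below a low-priority item touched today. If that feels too aggressive for your queue, that is a hint to use a larger τ.
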
# Quick presets

* **Fast‑moving queues (incidents, hot leads):** τ = **7–14** days
* **Engineering tasks / product docs:** τ = **30** days
* **Research bets / roadmaps:** τ = **60–90** days

# For your world (Stella Ops + ops/dev work)

* **Vuln tickets:** `risk_score = CVSS × reachability × e^(−t/30)`
* **Roadmap epics:** `value_score = impact × e^(−t/60)` to re‑rank quarterly.
* **Docs:** show a badge “freshness: 42%” derived from last edit age to nudge updates.

# Minimal implementation sketch

* Store per‑item: `last_signal_at`, `base_priority`.
* Compute on read:

```
days = (now - last_signal_at).days
conf = exp(-days / tau)
score = base_priority * conf
```

* Recompute in your API layer or materialize nightly; no cron spam needed.

If you want, I can draft a tiny C# helper (and SQL snippet) you can drop into your issue service to add `confidence(t)` and color bands to your lists.

Perfect, let’s turn the idea into something your devs can actually build.

Below is an implementation plan you can drop into a ticket or PRD, with clear phases, data model changes, APIs, and some sample code (C# + SQL). I’ll also sprinkle in Stella Ops–specific notes.

---

## 0. Scope & Objectives

**Goal:** Introduce `confidence(t)` as an automatic freshness factor that decays with time and is used to rank and highlight work.

We’ll apply it to:

* Vulnerabilities (Stella Ops)
* General issues / tasks / epics
* (Optional) Docs, leads, hypotheses later

**Core behavior:**

* Each item has:

  * A base priority / risk (from severity, business impact, etc.)
  * A timestamp of last signal (meaningful activity)
  * A decay constant τ (tau) in days
* Effective priority = `base_priority × confidence(t)`
* `confidence(t) = exp(− t / τ)` where `t` = days since last_signal

---

## 1. Data Model Changes

### 1.1. Add fields to core “work item” tables

For each relevant table (`Issues`, `Vulnerabilities`, `Epics`, …):

**New columns:**

* `base_priority` (FLOAT or INT)

  * Example: 1–100, or derived from severity.
* `last_signal_at` (DATETIME, NOT NULL, default = `created_at`)
* `tau_days` (FLOAT, nullable, falls back to type default)
* (Optional) `confidence_cached` (FLOAT, for materialized score)
* (Optional) `is_confidence_frozen` (BOOL, default FALSE)
  For pinned items that should not decay.

**Example Postgres migration (Issues):**

```sql
-- Note: DEFAULT NOW() stamps existing rows with the migration time, not NULL;
-- backfill them from created_at afterwards.
ALTER TABLE issues
    ADD COLUMN base_priority DOUBLE PRECISION,
    ADD COLUMN last_signal_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    ADD COLUMN tau_days DOUBLE PRECISION,
    ADD COLUMN confidence_cached DOUBLE PRECISION,
    ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;
```

For Stella Ops:

```sql
ALTER TABLE vulnerabilities
    ADD COLUMN base_risk DOUBLE PRECISION,
    ADD COLUMN last_signal_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    ADD COLUMN tau_days DOUBLE PRECISION,
    ADD COLUMN confidence_cached DOUBLE PRECISION,
    ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;
```

### 1.2. Add a config table for τ per entity type

```sql
CREATE TABLE confidence_decay_config (
    id SERIAL PRIMARY KEY,
    -- One row per type: 'incident', 'issue', 'vulnerability', 'epic', 'doc'
    entity_type TEXT NOT NULL UNIQUE,
    tau_days_default DOUBLE PRECISION NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

INSERT INTO confidence_decay_config (entity_type, tau_days_default) VALUES
    ('incident', 7),
    ('vulnerability', 30),
    ('issue', 30),
    ('epic', 60),
    ('doc', 90);
```

---

## 2. Define “signal” events & instrumentation

We need a standardized way to say: “this item got activity → reset last_signal_at”.

### 2.1. Signals that should reset `last_signal_at`

For **issues / epics:**

* New comment
* Status change (e.g., Open → In Progress)
* Field change that matters (severity, owner, milestone)
* Attachment added
* Link to PR added or updated
* New CI failure linked

For **vulnerabilities (Stella Ops):**

* New scanner result attached or status updated (e.g., “Verified”, “False Positive”)
* New evidence (PoC, exploit notes)
* SLA override change
* Assignment / ownership change
* Integration events (e.g., PR merge that references the vuln)

For **docs (if you do it):**

* Any edit
* Comment/annotation

### 2.2. Implement a shared helper to record a signal

**Service-level helper (pseudocode / C#-ish):**

```csharp
public interface IConfidenceSignalService
{
    Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null);
}

public class ConfidenceSignalService : IConfidenceSignalService
{
    private readonly IWorkItemRepository _repo;
    private readonly IConfidenceConfigService _config;

    // Dependencies are injected; without this constructor the fields stay null.
    public ConfidenceSignalService(IWorkItemRepository repo, IConfidenceConfigService config)
    {
        _repo = repo;
        _config = config;
    }

    public async Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null)
    {
        var now = signalTimeUtc ?? DateTime.UtcNow;
        var item = await _repo.GetByIdAsync(type, itemId);
        if (item == null) return;

        // Reset the decay clock.
        item.LastSignalAt = now;

        // Items without an explicit tau pick up the type default on first signal.
        if (item.TauDays == null)
        {
            item.TauDays = await _config.GetDefaultTauAsync(type);
        }

        await _repo.UpdateAsync(item);
    }
}
```

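One thing the helper does not cover is out-of-order events: a delayed webhook carrying an old timestamp would move `last_signal_at` backwards and make the item look staler than it is. Below is a small Python sketch of a guard for that case. The `WorkItem` class and `record_signal` function are hypothetical stand-ins, and ignoring older signals is an assumption about the behavior you want, not something the plan above prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class WorkItem:
    item_id: str
    last_signal_at: datetime

def record_signal(item: WorkItem, signal_time: Optional[datetime] = None) -> bool:
    """Move last_signal_at forward; return True if the timestamp changed.

    Signals at or before the stored timestamp are ignored, so a late-arriving
    event cannot rewind the decay clock.
    """
    when = signal_time or datetime.now(timezone.utc)
    if when <= item.last_signal_at:
        return False
    item.last_signal_at = when
    return True

item = WorkItem("ISS-123", datetime(2025, 11, 1, 10, 0, tzinfo=timezone.utc))
fresh = record_signal(item, datetime(2025, 11, 5, 9, 0, tzinfo=timezone.utc))   # newer: applied
stale = record_signal(item, datetime(2025, 10, 20, 9, 0, tzinfo=timezone.utc))  # older: ignored
```
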
### 2.3. Wire signals into existing flows

Create small tasks for devs like:

* **ISS-01:** Call `RecordSignalAsync` on:

  * New issue comment handler
  * Issue status update handler
  * Issue field update handler (severity/priority/owner)
* **VULN-01:** Call `RecordSignalAsync` when:

  * New scanner result ingested for a vuln
  * Vulnerability status, SLA, or owner changes
  * New exploit evidence is attached

---

## 3. Confidence & scoring calculation

### 3.1. Shared confidence function

Definition:

```csharp
using System;

public static class ConfidenceMath
{
    // t = days since last signal
    public static double ConfidenceScore(DateTime lastSignalAtUtc, double tauDays, DateTime? nowUtc = null)
    {
        var now = nowUtc ?? DateTime.UtcNow;
        var tDays = (now - lastSignalAtUtc).TotalDays;

        if (tDays <= 0) return 1.0;   // signal is now or in the future (clock skew)
        if (tauDays <= 0) return 1.0; // guard / fallback

        var score = Math.Exp(-tDays / tauDays);

        // Optional: never drop below a tiny floor, so items never "disappear"
        const double floor = 0.01;
        return Math.Max(score, floor);
    }
}
```

### 3.2. Effective priority formulas

**Generic issues / tasks:**

```csharp
double effectiveScore = issue.BasePriority
    * ConfidenceMath.ConfidenceScore(issue.LastSignalAt, issue.TauDays ?? defaultTau);
```

**Vulnerabilities (Stella Ops):**

Let’s define:

* `severity_weight`: map CVSS or severity string to numeric (e.g. Critical=100, High=80, Medium=50, Low=20).
* `reachability`: 0–1 (e.g. from your reachability analysis).
* `exploitability`: 0–1 (optional, based on known exploits).
* `confidence`: as above.

```csharp
double baseRisk = severityWeight * reachability * exploitability; // or simpler: severityWeight * reachability
double conf = ConfidenceMath.ConfidenceScore(vuln.LastSignalAt, vuln.TauDays ?? defaultTau);
double effectiveRisk = baseRisk * conf;
```

Store `baseRisk` in `vulnerabilities.base_risk`, and compute `effectiveRisk` on the fly or via a job.

### 3.3. SQL implementation (optional for server-side sorting)

**Postgres example:**

```sql
-- t_days = age in days
-- tau = tau_days
-- score = exp(-t_days / tau)

SELECT
    i.*,
    i.base_priority *
        GREATEST(
            EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
            0.01
        ) AS effective_priority
FROM issues i
ORDER BY effective_priority DESC;
```

You can wrap that in a view:

```sql
CREATE VIEW issues_with_confidence AS
SELECT
    i.*,
    GREATEST(
        EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
        0.01
    ) AS confidence,
    i.base_priority *
        GREATEST(
            EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
            0.01
        ) AS effective_priority
FROM issues i;
```

---

## 4. Caching & performance

You have two options:

### 4.1. Compute on read (simplest to start)

* Use the helper function in your service layer or a DB view.
* Pros:

  * No jobs, always fresh.
* Cons:

  * Slight CPU cost on heavy lists.

**Plan:** Start with this. If you see perf issues, move to 4.2.

### 4.2. Periodic materialization job (optional later)

Add a scheduled job (e.g. hourly) that:

1. Selects all active items.
2. Computes `confidence` and `effective_priority`.
3. Writes to `confidence_cached` and `effective_priority_cached` (if you add such a column).

The service then sorts by the cached values.

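The three job steps could look roughly like this in Python. This is only a sketch: the rows are plain dicts standing in for database records, `materialize` is a hypothetical name, and the default τ of 30 and the 0.01 floor are borrowed from the examples earlier in this plan.

```python
import math
from datetime import datetime, timedelta, timezone

FLOOR = 0.01  # same floor value as the C# confidence helper

def materialize(rows, now, default_tau=30.0):
    """Recompute cached scores in place for a batch of active items.

    `rows` is a list of dicts standing in for database rows (an assumption
    made for this sketch; a real job would read and write through your ORM).
    """
    for row in rows:
        tau = row.get("tau_days") or default_tau
        age_days = max((now - row["last_signal_at"]).total_seconds() / 86400, 0.0)
        conf = max(math.exp(-age_days / tau), FLOOR)
        row["confidence_cached"] = conf
        row["effective_priority_cached"] = row["base_priority"] * conf
    return rows

now = datetime(2025, 12, 1, tzinfo=timezone.utc)
rows = [
    {"id": 1, "base_priority": 80, "tau_days": None, "last_signal_at": now - timedelta(days=30)},
    {"id": 2, "base_priority": 50, "tau_days": 7.0, "last_signal_at": now},
]
materialize(rows, now)
```

Keep in mind that cached values go stale between runs, so an hourly job means scores can lag by up to an hour. With τ measured in days that lag is usually negligible.
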

---

## 5. Backfill & migration

### 5.1. Initial backfill script

For existing records:

* Set `last_signal_at` to `created_at`. The migration in 1.1 adds the column as `NOT NULL DEFAULT NOW()`, so existing rows hold the migration timestamp rather than NULL, and a `WHERE last_signal_at IS NULL` filter would match nothing.
* Derive `base_priority` / `base_risk` from existing severity fields.
* Set `tau_days` from config.

**Example:**

```sql
-- Run once, right after the migration and before signal handlers go live,
-- so that no real signal timestamps are overwritten.
UPDATE issues
SET last_signal_at = created_at;

UPDATE issues
SET base_priority = CASE severity
    WHEN 'critical' THEN 100
    WHEN 'high' THEN 80
    WHEN 'medium' THEN 50
    WHEN 'low' THEN 20
    ELSE 10
END
WHERE base_priority IS NULL;

UPDATE issues i
SET tau_days = c.tau_days_default
FROM confidence_decay_config c
WHERE c.entity_type = 'issue'
  AND i.tau_days IS NULL;
```

Do similarly for `vulnerabilities` using severity / CVSS.

### 5.2. Sanity checks

Add a small script/test to verify:

* Newly created items → `confidence ≈ 1.0`.
* 30-day-old items with τ=30 → `confidence ≈ 0.37`.
* Ordering changes when you edit/comment on items.

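Those three checks translate fairly directly into a script. The version below is a Python sketch with its own stand-in `confidence` function, so it verifies the math rather than your actual service. The check names and sample priorities are made up for illustration.

```python
import math
from datetime import datetime, timedelta, timezone

def confidence(last_signal_at, tau_days, now):
    """Python stand-in for the confidence function, used only for these checks."""
    t_days = (now - last_signal_at).total_seconds() / 86400
    if t_days <= 0 or tau_days <= 0:
        return 1.0
    return max(math.exp(-t_days / tau_days), 0.01)

now = datetime(2025, 12, 1, tzinfo=timezone.utc)

def check_new_item():
    assert confidence(now, 30, now) == 1.0

def check_thirty_days():
    c = confidence(now - timedelta(days=30), 30, now)
    assert abs(c - 0.37) < 0.005

def check_signal_reorders():
    stale = 80 * confidence(now - timedelta(days=45), 30, now)
    other = 50 * confidence(now - timedelta(days=5), 30, now)
    assert other > stale                      # the quiet item ranks lower
    touched = 80 * confidence(now, 30, now)   # a comment resets t to 0
    assert touched > other                    # and it jumps back to the top

for check in (check_new_item, check_thirty_days, check_signal_reorders):
    check()
```
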

---

## 6. API & Query Layer

### 6.1. New sorting options

Update list APIs:

* Accept parameter: `sort=effective_priority` or `sort=confidence`.
* Default sort for some views:

  * Vulnerabilities backlog: `sort=effective_risk` (risk × confidence).
  * Issues backlog: `sort=effective_priority`.

**Example REST API contract:**

`GET /api/issues?sort=effective_priority&state=open`

**Response fields (additions):**

```json
{
  "id": "ISS-123",
  "title": "Fix login bug",
  "base_priority": 80,
  "last_signal_at": "2025-11-01T10:00:00Z",
  "tau_days": 30,
  "confidence": 0.63,
  "effective_priority": 50.4,
  "confidence_band": "green"
}
```

### 6.2. Confidence banding (for UI)

Define bands server-side (easy to change):

* Green: `confidence >= 0.6`
* Amber: `0.3 ≤ confidence < 0.6`
* Red: `confidence < 0.3`

You can compute on server:

```csharp
string ConfidenceBand(double confidence) =>
    confidence >= 0.6 ? "green"
    : confidence >= 0.3 ? "amber"
    : "red";
```

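It can help to translate the band thresholds into days of silence, since that is how people will experience them. Solving `e^(−t/τ) = c` for `t` gives `t = −τ · ln c`. The Python sketch below applies that to the per-type τ defaults from the config table; `days_until` is an illustrative helper name.

```python
import math

# Thresholds from the banding rules; tau values are the per-type defaults.
GREEN_MIN, AMBER_MIN = 0.6, 0.3

def days_until(threshold: float, tau_days: float) -> float:
    """Solve e^(-t/tau) = threshold for t."""
    return -tau_days * math.log(threshold)

for name, tau in (("incident", 7), ("issue", 30), ("epic", 60), ("doc", 90)):
    amber_at = days_until(GREEN_MIN, tau)
    red_at = days_until(AMBER_MIN, tau)
    print(f"{name:9s} tau={tau:>2}  amber after {amber_at:5.1f}d  red after {red_at:5.1f}d")
```

For τ = 30 this works out to amber after about 15 days and red after about 36 days without a signal.
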

---

## 7. UI / UX changes

### 7.1. List views (issues / vulns / epics)

For each item row:

* Show a small freshness pill:

  * Text: `Active`, `Review soon`, `Stale`
  * Derived from confidence band.
  * Tooltip:

    * “Confidence 90%. Last activity 3 days ago. τ = 30 days.”

* Sort default: by `effective_priority` / `effective_risk`.

* Filters:

  * `Freshness: [All | Active | Review soon | Stale]`
  * Optionally: “Show stale only” toggle.

**Example labels:**

* Green: “Active (confidence 82%)”
* Amber: “Review soon (confidence 45%)”
* Red: “Stale (confidence 18%)”

### 7.2. Detail views

On an issue / vuln page:

* Add a “Confidence” section:

  * “Confidence: **67%**”
  * “Last signal: **12 days ago**”
  * “Decay τ: **30 days**”
  * “Effective priority: **Base 80 × 0.67 ≈ 54**”

* (Optional) small mini-chart (text-only or simple bar) showing approximate decay, but not necessary for first iteration.

### 7.3. Admin / settings UI

Add an internal settings page:

* Table of entity types with editable τ:

| Entity type   | τ (days) | Notes                        |
| ------------- | -------- | ---------------------------- |
| Incident      | 7        | Fast-moving                  |
| Vulnerability | 30       | Standard risk review cadence |
| Issue         | 30       | Sprint-level decay           |
| Epic          | 60       | Quarterly                    |
| Doc           | 90       | Slow decay                   |

* Optionally: toggle to pin item (`is_confidence_frozen`) from UI.

---

## 8. Stella Ops–specific behavior

For vulnerabilities:

### 8.1. Base risk calculation

Ingested fields you likely already have:

* `cvss_score` or `severity`
* `reachable` (true/false or numeric)
* (Optional) `exploit_available` (bool) or exploitability score
* `asset_criticality` (1–5)

Define `base_risk` as:

```text
severity_weight = f(cvss_score or severity)
reachability    = reachable ? 1.0 : 0.5           -- example
exploitability  = exploit_available ? 1.0 : 0.7
asset_factor    = 0.5 + 0.1 * asset_criticality   -- 1 → 0.6, 5 → 1.0

base_risk = severity_weight * reachability * exploitability * asset_factor
```

Store `base_risk` on vuln row.

Then:

```text
effective_risk = base_risk * confidence(t)
```

Use `effective_risk` for backlog ordering and SLA dashboards.

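To make the arithmetic concrete, here is a Python sketch of the same formula. The severity mapping reuses the example weights from section 3.2, and the function names are illustrative. The weights and factors are tuning knobs, so treat the numbers as placeholders.

```python
import math

# Example weights from section 3.2; adjust to your own severity scale.
SEVERITY_WEIGHT = {"critical": 100, "high": 80, "medium": 50, "low": 20}

def base_risk(severity, reachable, exploit_available, asset_criticality):
    reachability = 1.0 if reachable else 0.5
    exploitability = 1.0 if exploit_available else 0.7
    asset_factor = 0.5 + 0.1 * asset_criticality   # 1 -> 0.6, 5 -> 1.0
    return SEVERITY_WEIGHT[severity] * reachability * exploitability * asset_factor

def effective_risk(risk, days_since_signal, tau_days=30.0):
    """base_risk x confidence(t)."""
    return risk * math.exp(-days_since_signal / tau_days)

# High severity, reachable, no known exploit, most critical asset class.
risk = base_risk("high", reachable=True, exploit_available=False, asset_criticality=5)
print(f"base {risk:.1f}, after 30 quiet days {effective_risk(risk, 30):.1f}")
```
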
### 8.2. Signals for vulns

Make sure these all call `RecordSignalAsync(WorkItemType.Vulnerability, vulnId)`:

* New scan result for same vuln (re-detected).
* Change status to “In Progress”, “Ready for Deploy”, “Verified Fixed”, etc.
* Assigning an owner.
* Attaching PoC / exploit details.

### 8.3. Vuln UI copy ideas

* Pill text:

  * “Risk: 850 (confidence 68%)”
  * “Last analyst activity 11 days ago”

* In backlog view: show **Effective Risk** as main sort, with a smaller subtext “Base 1200 × Confidence 71%”.

---

## 9. Rollout plan

### Phase 1 – Infrastructure (backend-only)

* [ ] DB migrations & config table
* [ ] Implement `ConfidenceMath` and helper functions
* [ ] Implement `IConfidenceSignalService`
* [ ] Wire signals into key flows (comments, state changes, scanner ingestion)
* [ ] Add `confidence` and `effective_priority/risk` to API responses
* [ ] Backfill script + dry run in staging

### Phase 2 – Internal UI & feature flag

* [ ] Add optional sorting by effective score to internal/staff views
* [ ] Add confidence pill (hidden behind feature flag `confidence_decay_v1`)
* [ ] Dogfood internally:

  * Do items bubble up/down as expected?
  * Are any items “disappearing” because decay is too aggressive?

### Phase 3 – Parameter tuning

* [ ] Adjust τ per type based on feedback:

  * If things decay too fast → increase τ
  * If queues rarely change → decrease τ
* [ ] Decide on confidence floor (0.01? 0.05?) so nothing goes to literal 0.

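When tuning, it is often easier to start from a statement like “confidence should halve every two weeks” or “an untouched item should turn red after three weeks” and derive τ from that. A small Python sketch, with hypothetical helper names:

```python
import math

def tau_for_half_life(half_life_days: float) -> float:
    """tau such that confidence halves every `half_life_days` (tau = h / ln 2)."""
    return half_life_days / math.log(2)

def tau_for_band(days: float, threshold: float) -> float:
    """tau such that confidence reaches `threshold` after `days` of silence."""
    return -days / math.log(threshold)

print(round(tau_for_half_life(14), 1))   # tau for a two-week half-life
print(round(tau_for_band(21, 0.3), 1))   # tau so items turn red (0.3) after three weeks
```
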
### Phase 4 – General release

* [ ] Make effective score the default sort for key views:

  * Vulnerabilities backlog
  * Issues backlog
* [ ] Document behavior for users (help center / inline tooltip)
* [ ] Add admin UI to tweak τ per entity type.

---

## 10. Edge cases & safeguards

* **New items**

  * `last_signal_at = created_at`, confidence = 1.0.
* **Pinned items**

  * If `is_confidence_frozen = true` → treat confidence as 1.0.
* **Items without τ**

  * Always fall back to the entity type default.
* **Timezones**

  * Always store & compute in UTC.
* **Very old items**

  * Floor the confidence so they’re still visible when explicitly searched.

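All five safeguards fit into one small function. The Python sketch below is one way to combine them. `safe_confidence` and `TYPE_DEFAULT_TAU` are illustrative names, the defaults mirror the config table, and rejecting naive timestamps (instead of assuming UTC) is a design choice of this sketch.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional

TYPE_DEFAULT_TAU = {"incident": 7.0, "vulnerability": 30.0, "issue": 30.0,
                    "epic": 60.0, "doc": 90.0}
FLOOR = 0.01

def safe_confidence(entity_type: str, last_signal_at: datetime, now: datetime,
                    tau_days: Optional[float] = None, frozen: bool = False) -> float:
    if frozen:
        return 1.0                                    # pinned items do not decay
    tau = tau_days or TYPE_DEFAULT_TAU[entity_type]   # fall back to the type default
    if last_signal_at.tzinfo is None or now.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    # Normalize both sides to UTC before taking the difference.
    delta = now.astimezone(timezone.utc) - last_signal_at.astimezone(timezone.utc)
    t_days = delta.total_seconds() / 86400
    if t_days <= 0:
        return 1.0                                    # new item, or clock skew
    return max(math.exp(-t_days / tau), FLOOR)        # very old items keep a floor

now = datetime(2025, 12, 1, tzinfo=timezone.utc)
ten_years_ago = now - timedelta(days=3650)
print(safe_confidence("issue", ten_years_ago, now))               # floored
print(safe_confidence("issue", ten_years_ago, now, frozen=True))  # pinned
```
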

---

If you want, I can turn this into:

* A short **technical design doc** (with sections: Problem, Proposal, Alternatives, Rollout).
* Or a **set of Jira tickets** grouped by backend / frontend / infra that your team can pick up directly.