Half‑Life Confidence Decay for Unknowns

Product advisory, 25 November 2025

Here's a simple, low-friction way to keep priorities fresh without constant manual grooming: let confidence decay over time.

[Figure: a small curve sloping down over time, illustrating exponential decay]

Exponential confidence decay (what & why)

  • Idea: Every item (task, lead, bug, doc, hypothesis) has a confidence score that automatically shrinks with time if you don't touch it.
  • Formula: confidence(t) = e^(-t/τ), where t is days since the last signal (edit, comment, commit, new data) and τ ("tau") is the decay constant.
  • Rule of thumb: With τ = 30 days, at t = 30 the confidence is e^(-1) ≈ 0.37, about a 63% drop. This surfaces long-ignored items gradually, not with harsh "stale/expired" flips.
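To make the curve concrete, here is a minimal Python sketch (the doc's own pseudocode below is Python-flavored); the function and values are illustrative, not a shipped API:

```python
import math

def confidence(t_days: float, tau_days: float = 30.0) -> float:
    """Exponential decay: confidence(t) = e^(-t / tau)."""
    return math.exp(-t_days / tau_days)

# With tau = 30, the half-life is tau * ln(2) ≈ 20.8 days.
for t in (0, 15, 30, 60, 90):
    print(f"day {t}: {confidence(t):.2f}")
# day 0: 1.00, day 15: 0.61, day 30: 0.37, day 60: 0.14, day 90: 0.05
```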

How to use it in practice

  • Signals that reset t → 0: comment on the ticket, new benchmark, fresh log sample, doc update, CI run, new market news.

  • Sort queues by: priority × confidence(t) (or severity × confidence). Quiet items drift down; truly active ones stay up.

  • Escalation bands:

    • >0.6 = green (recently touched)
    • 0.3–0.6 = amber (review soon)
    • <0.3 = red (poke or close)
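The sort rule and bands above fit in a few lines. A Python sketch with a hypothetical queue (titles and priorities invented for illustration; thresholds as listed):

```python
import math

def confidence(days_quiet: float, tau_days: float = 30.0) -> float:
    return math.exp(-max(days_quiet, 0.0) / tau_days)

def band(conf: float) -> str:
    # Escalation bands from the list above
    return "green" if conf >= 0.6 else "amber" if conf >= 0.3 else "red"

# Hypothetical queue items: (title, base_priority, days since last signal)
queue = [("old refactor", 90, 75), ("hot bug", 60, 1), ("flaky test", 70, 20)]
ranked = sorted(
    ((title, prio * confidence(days), band(confidence(days)))
     for title, prio, days in queue),
    key=lambda row: row[1],
    reverse=True,
)
for title, score, b in ranked:
    print(f"{title}: {score:.1f} ({b})")
```

Note how the high-base-priority "old refactor" drifts below the recently touched items purely through decay.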

Quick presets

  • Fast-moving queues (incidents, hot leads): τ = 7–14 days
  • Engineering tasks / product docs: τ = 30 days
  • Research bets / roadmaps: τ = 60–90 days

For your world (StellaOps + ops/dev work)

  • Vuln tickets: risk_score = CVSS × reachability × e^(-t/30)
  • Roadmap epics: value_score = impact × e^(-t/60) to rerank quarterly.
  • Docs: show a badge “freshness: 42%” derived from last edit age to nudge updates.

Minimal implementation sketch

  • Store per-item: last_signal_at, base_priority.

  • Compute on read:

    days = (now - last_signal_at).days
    conf  = exp(-days / tau)
    score = base_priority * conf
    
  • Recompute in your API layer or materialize nightly; no cron spam needed.

The rest of this advisory drafts a tiny C# helper (and SQL snippets) you can drop into your issue service to add confidence(t) and color bands to your lists. Let's turn the idea into something your devs can actually build.

Below is an implementation plan you can drop into a ticketing/PRD, with clear phases, data model changes, APIs, and some sample code (C# + SQL). I'll also sprinkle in StellaOps-specific notes.


0. Scope & Objectives

Goal: Introduce confidence(t) as an automatic freshness factor that decays with time and is used to rank and highlight work.

We'll apply it to:

  • Vulnerabilities (StellaOps)
  • General issues / tasks / epics
  • (Optional) Docs, leads, hypotheses later

Core behavior:

  • Each item has:

    • A base priority / risk (from severity, business impact, etc.)
    • A timestamp of last signal (meaningful activity)
    • A decay rate τ (tau) in days
  • Effective priority = base_priority × confidence(t)

  • confidence(t) = exp(-t / τ), where t = days since last_signal


1. Data Model Changes

1.1. Add fields to core “work item” tables

For each relevant table (Issues, Vulnerabilities, Epics, …):

New columns:

  • base_priority (FLOAT or INT)

    • Example: 1–100, or derived from severity.
  • last_signal_at (DATETIME, NOT NULL, default = created_at)

  • tau_days (FLOAT, nullable, falls back to type default)

  • (Optional) confidence_score_cached (FLOAT, for materialized score)

  • (Optional) is_confidence_frozen (BOOL, default FALSE), for pinned items that should not decay.

Example Postgres migration (Issues):

ALTER TABLE issues
  ADD COLUMN base_priority       DOUBLE PRECISION,
  ADD COLUMN last_signal_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  ADD COLUMN tau_days            DOUBLE PRECISION,
  ADD COLUMN confidence_cached   DOUBLE PRECISION,
  ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;

For StellaOps:

ALTER TABLE vulnerabilities
  ADD COLUMN base_risk           DOUBLE PRECISION,
  ADD COLUMN last_signal_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  ADD COLUMN tau_days            DOUBLE PRECISION,
  ADD COLUMN confidence_cached   DOUBLE PRECISION,
  ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;

1.2. Add a config table for τ per entity type

CREATE TABLE confidence_decay_config (
  id                SERIAL PRIMARY KEY,
  entity_type       TEXT NOT NULL, -- 'issue', 'vulnerability', 'epic', 'doc'
  tau_days_default  DOUBLE PRECISION NOT NULL,
  created_at        TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at        TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

INSERT INTO confidence_decay_config (entity_type, tau_days_default) VALUES
('incident', 7),
('vulnerability', 30),
('issue', 30),
('epic', 60),
('doc', 90);

2. Define “signal” events & instrumentation

We need a standardized way to say: “this item got activity → reset last_signal_at”.

2.1. Signals that should reset last_signal_at

For issues / epics:

  • New comment
  • Status change (e.g., Open → In Progress)
  • Field change that matters (severity, owner, milestone)
  • Attachment added
  • Link to PR added or updated
  • New CI failure linked

For vulnerabilities (StellaOps):

  • New scanner result attached or status updated (e.g., “Verified”, “False Positive”)
  • New evidence (PoC, exploit notes)
  • SLA override change
  • Assignment / ownership change
  • Integration events (e.g., PR merge that references the vuln)

For docs (if you do it):

  • Any edit
  • Comment/annotation

2.2. Implement a shared helper to record a signal

Service-level helper (pseudocode / C#-ish):

public interface IConfidenceSignalService
{
    Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null);
}

public class ConfidenceSignalService : IConfidenceSignalService
{
    private readonly IWorkItemRepository _repo;
    private readonly IConfidenceConfigService _config;

    public ConfidenceSignalService(IWorkItemRepository repo, IConfidenceConfigService config)
    {
        _repo = repo;
        _config = config;
    }

    public async Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null)
    {
        var now = signalTimeUtc ?? DateTime.UtcNow;
        var item = await _repo.GetByIdAsync(type, itemId);
        if (item == null) return;

        item.LastSignalAt = now;

        // Fall back to the per-entity-type default tau the first time we see this item.
        if (item.TauDays == null)
        {
            item.TauDays = await _config.GetDefaultTauAsync(type);
        }

        await _repo.UpdateAsync(item);
    }
}

2.3. Wire signals into existing flows

Create small tasks for devs like:

  • ISS-01: Call RecordSignalAsync on:

    • New issue comment handler
    • Issue status update handler
    • Issue field update handler (severity/priority/owner)
  • VULN-01: Call RecordSignalAsync when:

    • New scanner result ingested for a vuln
    • Vulnerability status, SLA, or owner changes
    • New exploit evidence is attached

3. Confidence & scoring calculation

3.1. Shared confidence function

Definition:

public static class ConfidenceMath
{
    // t = days since last signal
    public static double ConfidenceScore(DateTime lastSignalAtUtc, double tauDays, DateTime? nowUtc = null)
    {
        var now = nowUtc ?? DateTime.UtcNow;
        var tDays = (now - lastSignalAtUtc).TotalDays;

        if (tDays <= 0) return 1.0;
        if (tauDays <= 0) return 1.0; // guard / fallback

        var score = Math.Exp(-tDays / tauDays);

        // Optional: never drop below a tiny floor, so items never "disappear"
        const double floor = 0.01;
        return Math.Max(score, floor);
    }
}

3.2. Effective priority formulas

Generic issues / tasks:

double effectiveScore = issue.BasePriority * ConfidenceMath.ConfidenceScore(issue.LastSignalAt, issue.TauDays ?? defaultTau);

Vulnerabilities (StellaOps):

Let's define:

  • severity_weight: map CVSS or severity string to a numeric weight (e.g. Critical=100, High=80, Medium=50, Low=20).
  • reachability: 0–1 (e.g. from your reachability analysis).
  • exploitability: 0–1 (optional, based on known exploits).
  • confidence: as above.

double baseRisk = severityWeight * reachability * exploitability; // or simpler: severityWeight * reachability
double conf = ConfidenceMath.ConfidenceScore(vuln.LastSignalAt, vuln.TauDays ?? defaultTau);
double effectiveRisk = baseRisk * conf;

Store baseRisk in vulnerabilities.base_risk, and compute effectiveRisk on the fly or via a job.

3.3. SQL implementation (optional for server-side sorting)

Postgres example:

-- t_days = age in days
-- tau = tau_days
-- score = exp(-t_days / tau)

SELECT
  i.*,
  i.base_priority * 
    GREATEST(
      EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
      0.01
    ) AS effective_priority
FROM issues i
ORDER BY effective_priority DESC;

You can wrap that in a view:

CREATE VIEW issues_with_confidence AS
SELECT
  i.*,
  GREATEST(
    EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
    0.01
  ) AS confidence,
  i.base_priority *
    GREATEST(
      EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
      0.01
    ) AS effective_priority
FROM issues i;

4. Caching & performance

You have two options:

4.1. Compute on read (simplest to start)

  • Use the helper function in your service layer or a DB view.

  • Pros:

    • No jobs, always fresh.
  • Cons:

    • Slight CPU cost on heavy lists.

Plan: Start with this. If you see perf issues, move to 4.2.

4.2. Periodic materialization job (optional later)

Add a scheduled job (e.g. hourly) that:

  1. Selects all active items.
  2. Computes confidence_score and effective_priority.
  3. Writes to confidence_cached and effective_priority_cached (if you add such a column).

Service then sorts by cached values.
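Those three steps can be sketched as a pure function over a batch of rows (Python; the dict keys mirror the column names from the migrations above and the DB round-trip is omitted — this is an assumption-laden sketch, not the real job):

```python
import math
from datetime import datetime, timedelta, timezone

def materialize(items, tau_defaults, now=None):
    """Recompute cached confidence / effective priority for active items.

    items: dicts with last_signal_at, base_priority, tau_days,
           entity_type, is_confidence_frozen (assumed column names).
    tau_defaults: {entity_type: tau_days_default}, loaded from
                  confidence_decay_config.
    """
    now = now or datetime.now(timezone.utc)
    for item in items:
        if item.get("is_confidence_frozen"):
            conf = 1.0  # pinned items never decay
        else:
            tau = item.get("tau_days") or tau_defaults[item["entity_type"]]
            days = (now - item["last_signal_at"]).total_seconds() / 86400.0
            conf = max(math.exp(-max(days, 0.0) / tau), 0.01)  # 0.01 floor
        item["confidence_cached"] = conf
        item["effective_priority_cached"] = item["base_priority"] * conf
    return items
```

The same loop works whether the job runs hourly or nightly; only the staleness of the cached columns changes.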


5. Backfill & migration

5.1. Initial backfill script

For existing records:

  • If last_signal_at is NULL → set to created_at.
  • Derive base_priority / base_risk from existing severity fields.
  • Set tau_days from config.

Example:

UPDATE issues
SET last_signal_at = created_at
WHERE last_signal_at IS NULL;

UPDATE issues
SET base_priority = CASE severity
  WHEN 'critical' THEN 100
  WHEN 'high'     THEN 80
  WHEN 'medium'   THEN 50
  WHEN 'low'      THEN 20
  ELSE 10
END
WHERE base_priority IS NULL;

UPDATE issues i
SET tau_days = c.tau_days_default
FROM confidence_decay_config c
WHERE c.entity_type = 'issue'
  AND i.tau_days IS NULL;

Do similarly for vulnerabilities using severity / CVSS.

5.2. Sanity checks

Add a small script/test to verify:

  • Newly created items → confidence ≈ 1.0.
  • 30-day-old items with τ=30 → confidence ≈ 0.37.
  • Ordering changes when you edit/comment on items.
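The three checks above translate almost directly into a small script (Python sketch of the math only, not of the real service):

```python
import math
from datetime import datetime, timedelta, timezone

def confidence(last_signal_at, tau_days, now):
    days = (now - last_signal_at).total_seconds() / 86400.0
    return math.exp(-max(days, 0.0) / tau_days)

now = datetime.now(timezone.utc)

# Newly created item -> confidence == 1.0
assert confidence(now, 30, now) == 1.0

# 30-day-old item with tau = 30 -> e^-1 ≈ 0.37
assert abs(confidence(now - timedelta(days=30), 30, now) - 0.37) < 0.01

# A fresh signal lifts an item above an untouched peer
untouched = confidence(now - timedelta(days=20), 30, now)
just_touched = confidence(now - timedelta(minutes=5), 30, now)
assert just_touched > untouched
print("sanity checks passed")
```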

6. API & Query Layer

6.1. New sorting options

Update list APIs:

  • Accept parameter: sort=effective_priority or sort=confidence.

  • Default sort for some views:

    • Vulnerabilities backlog: sort=effective_risk (risk × confidence).
    • Issues backlog: sort=effective_priority.

Example REST API contract:

GET /api/issues?sort=effective_priority&state=open

Response fields (additions):

{
  "id": "ISS-123",
  "title": "Fix login bug",
  "base_priority": 80,
  "last_signal_at": "2025-11-01T10:00:00Z",
  "tau_days": 30,
  "confidence": 0.63,
  "effective_priority": 50.4,
  "confidence_band": "amber"
}

6.2. Confidence banding (for UI)

Define bands server-side (easy to change):

  • Green: confidence >= 0.6
  • Amber: 0.3 ≤ confidence < 0.6
  • Red: confidence < 0.3

You can compute on server:

string ConfidenceBand(double confidence) =>
    confidence >= 0.6 ? "green"
  : confidence >= 0.3 ? "amber"
  : "red";

7. UI / UX changes

7.1. List views (issues / vulns / epics)

For each item row:

  • Show a small freshness pill:

    • Text: Active, Review soon, Stale

    • Derived from confidence band.

    • Tooltip:

      • “Confidence 78%. Last activity 3 days ago. τ = 30 days.”
  • Sort default: by effective_priority / effective_risk.

  • Filters:

    • Freshness: [All | Active | Review soon | Stale]
    • Optionally: “Show stale only” toggle.

Example labels:

  • Green: “Active (confidence 82%)”
  • Amber: “Review soon (confidence 45%)”
  • Red: “Stale (confidence 18%)”

7.2. Detail views

On an issue / vuln page:

  • Add a “Confidence” section:

    • Confidence: 52%
    • Last signal: 12 days ago
    • Decay τ: 30 days
    • Effective priority: Base 80 × 0.52 ≈ 42
  • (Optional) small mini-chart (text-only or simple bar) showing approximate decay, but not necessary for first iteration.

7.3. Admin / settings UI

Add an internal settings page:

  • Table of entity types with editable τ:

    | Entity type   | τ (days) | Notes                        |
    |---------------|----------|------------------------------|
    | Incident      | 7        | Fast-moving                  |
    | Vulnerability | 30       | Standard risk review cadence |
    | Issue         | 30       | Sprint-level decay           |
    | Epic          | 60       | Quarterly                    |
    | Doc           | 90       | Slow decay                   |
  • Optionally: toggle to pin item (is_confidence_frozen) from UI.


8. StellaOps-specific behavior

For vulnerabilities:

8.1. Base risk calculation

Ingested fields you likely already have:

  • cvss_score or severity
  • reachable (true/false or numeric)
  • (Optional) exploit_available (bool) or exploitability score
  • asset_criticality (1–5)

Define base_risk as:

severity_weight = f(cvss_score or severity)
reachability    = reachable ? 1.0 : 0.5   -- example
exploitability  = exploit_available ? 1.0 : 0.7
asset_factor    = 1 + 0.125 * (asset_criticality - 1)  -- 1 → 1.0, 5 → 1.5

base_risk = severity_weight * reachability * exploitability * asset_factor

Store base_risk on vuln row.

Then:

effective_risk = base_risk * confidence(t)

Use effective_risk for backlog ordering and SLA dashboards.
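Putting 8.1 together, a worked example (Python; the weights mirror the pseudocode above, and the asset factor is scaled to hit the stated 1 → 1.0, 5 → 1.5 endpoints — all values are illustrative defaults, not shipped constants):

```python
import math

SEVERITY_WEIGHT = {"critical": 100, "high": 80, "medium": 50, "low": 20}

def base_risk(severity, reachable, exploit_available, asset_criticality):
    weight = SEVERITY_WEIGHT[severity]
    reachability = 1.0 if reachable else 0.5
    exploitability = 1.0 if exploit_available else 0.7
    asset_factor = 1.0 + 0.125 * (asset_criticality - 1)  # 1 -> 1.0, 5 -> 1.5
    return weight * reachability * exploitability * asset_factor

def effective_risk(base, days_since_signal, tau_days=30.0):
    return base * math.exp(-days_since_signal / tau_days)

# Critical, reachable, known exploit, top-tier asset, quiet for 10 days:
b = base_risk("critical", True, True, 5)   # 100 * 1.0 * 1.0 * 1.5 = 150.0
print(round(effective_risk(b, 10), 1))     # ≈ 107.5
```

The decayed figure (≈ 107.5 vs. base 150) is exactly the "Base × Confidence" subtext proposed for the backlog UI in 8.3.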

8.2. Signals for vulns

Make sure these all call RecordSignalAsync(Vulnerability, vulnId):

  • New scan result for same vuln (re-detected).
  • Change status to “In Progress”, “Ready for Deploy”, “Verified Fixed”, etc.
  • Assigning an owner.
  • Attaching PoC / exploit details.

8.3. Vuln UI copy ideas

  • Pill text:

    • “Risk: 850 (confidence 68%)”
    • “Last analyst activity 11 days ago”
  • In backlog view: show Effective Risk as main sort, with a smaller subtext “Base 1200 × Confidence 71%”.


9. Rollout plan

Phase 1: Infrastructure (backend-only)

  • DB migrations & config table
  • Implement ConfidenceMath and helper functions
  • Implement IConfidenceSignalService
  • Wire signals into key flows (comments, state changes, scanner ingestion)
  • Add confidence and effective_priority/risk to API responses
  • Backfill script + dry run in staging

Phase 2: Internal UI & feature flag

  • Add optional sorting by effective score to internal/staff views

  • Add confidence pill (hidden behind feature flag confidence_decay_v1)

  • Dogfood internally:

    • Do items bubble up/down as expected?
    • Are any items “disappearing” because decay is too aggressive?

Phase 3: Parameter tuning

  • Adjust τ per type based on feedback:

    • If things decay too fast → increase τ
    • If queues rarely change → decrease τ
  • Decide on confidence floor (0.01? 0.05?) so nothing goes to literal 0.

Phase 4: General release

  • Make effective score the default sort for key views:

    • Vulnerabilities backlog
    • Issues backlog
  • Document behavior for users (help center / inline tooltip)

  • Add admin UI to tweak τ per entity type.


10. Edge cases & safeguards

  • New items

    • last_signal_at = created_at, confidence = 1.0.
  • Pinned items

    • If is_confidence_frozen = true → treat confidence as 1.0.
  • Items without τ

    • Always fallback to entity type default.
  • Timezones

    • Always store & compute in UTC.
  • Very old items

    • Floor the confidence so they're still visible when explicitly searched.

If you want, I can turn this into:

  • A short technical design doc (with sections: Problem, Proposal, Alternatives, Rollout).
  • Or a set of Jira tickets grouped by backend / frontend / infra that your team can pick up directly.