Half‑Life Confidence Decay for Unknowns

Product advisory, 25 November 2025

Here's a simple, low-friction way to keep priorities fresh without constant manual grooming: let confidence decay over time.

[Figure: a small curve sloping down over time, illustrating exponential decay]

Exponential confidence decay (what & why)

  • Idea: Every item (task, lead, bug, doc, hypothesis) has a confidence score that automatically shrinks with time if you don't touch it.
  • Formula: confidence(t) = e^(-t/τ), where t is days since the last signal (edit, comment, commit, new data) and τ ("tau") is the decay constant.
  • Rule of thumb: With τ = 30 days, at t = 30 the confidence is e^(-1) ≈ 0.37, about a 63% drop. This surfaces long-ignored items gradually, not with harsh "stale/expired" flips.
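To make the curve concrete, here is a minimal Python sketch (the doc's own pseudocode below is Python-flavored); the function and values are illustrative, not a shipped API:

```python
import math

def confidence(t_days: float, tau_days: float = 30.0) -> float:
    """Exponential decay: confidence(t) = e^(-t / tau)."""
    return math.exp(-t_days / tau_days)

# With tau = 30, the half-life is tau * ln(2) ≈ 20.8 days.
for t in (0, 15, 30, 60, 90):
    print(f"day {t}: {confidence(t):.2f}")
# day 0: 1.00, day 15: 0.61, day 30: 0.37, day 60: 0.14, day 90: 0.05
```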

How to use it in practice

  • Signals that reset t → 0: comment on the ticket, new benchmark, fresh log sample, doc update, CI run, new market news.

  • Sort queues by: priority × confidence(t) (or severity × confidence). Quiet items drift down; truly active ones stay up.

  • Escalation bands:

    • >0.6 = green (recently touched)
    • 0.3–0.6 = amber (review soon)
    • <0.3 = red (poke or close)
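The sort rule and bands above fit in a few lines. A Python sketch with a hypothetical queue (titles and priorities invented for illustration; thresholds as listed):

```python
import math

def confidence(days_quiet: float, tau_days: float = 30.0) -> float:
    return math.exp(-max(days_quiet, 0.0) / tau_days)

def band(conf: float) -> str:
    # Escalation bands from the list above
    return "green" if conf >= 0.6 else "amber" if conf >= 0.3 else "red"

# Hypothetical queue items: (title, base_priority, days since last signal)
queue = [("old refactor", 90, 75), ("hot bug", 60, 1), ("flaky test", 70, 20)]
ranked = sorted(
    ((title, prio * confidence(days), band(confidence(days)))
     for title, prio, days in queue),
    key=lambda row: row[1],
    reverse=True,
)
for title, score, b in ranked:
    print(f"{title}: {score:.1f} ({b})")
```

Note how the high-base-priority "old refactor" drifts below the recently touched items purely through decay.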

Quick presets

  • Fast-moving queues (incidents, hot leads): τ = 7–14 days
  • Engineering tasks / product docs: τ = 30 days
  • Research bets / roadmaps: τ = 60–90 days

For your world (StellaOps + ops/dev work)

  • Vuln tickets: risk_score = CVSS × reachability × e^(-t/30)
  • Roadmap epics: value_score = impact × e^(-t/60) to rerank quarterly.
  • Docs: show a badge “freshness: 42%” derived from last edit age to nudge updates.

Minimal implementation sketch

  • Store per-item: last_signal_at, base_priority.

  • Compute on read:

    days = (now - last_signal_at).days
    conf  = exp(-days / tau)
    score = base_priority * conf
    
  • Recompute in your API layer or materialize nightly; no cron spam needed.

The rest of this advisory drafts a tiny C# helper (and SQL snippets) you can drop into your issue service to add confidence(t) and color bands to your lists. Let's turn the idea into something your devs can actually build.

Below is an implementation plan you can drop into a ticketing/PRD, with clear phases, data model changes, APIs, and some sample code (C# + SQL). I'll also sprinkle in StellaOps-specific notes.


0. Scope & Objectives

Goal: Introduce confidence(t) as an automatic freshness factor that decays with time and is used to rank and highlight work.

We'll apply it to:

  • Vulnerabilities (StellaOps)
  • General issues / tasks / epics
  • (Optional) Docs, leads, hypotheses later

Core behavior:

  • Each item has:

    • A base priority / risk (from severity, business impact, etc.)
    • A timestamp of last signal (meaningful activity)
    • A decay rate τ (tau) in days
  • Effective priority = base_priority × confidence(t)

  • confidence(t) = exp(-t / τ), where t = days since last_signal


1. Data Model Changes

1.1. Add fields to core “work item” tables

For each relevant table (Issues, Vulnerabilities, Epics, …):

New columns:

  • base_priority (FLOAT or INT)

    • Example: 1–100, or derived from severity.
  • last_signal_at (DATETIME, NOT NULL, default = created_at)

  • tau_days (FLOAT, nullable, falls back to type default)

  • (Optional) confidence_score_cached (FLOAT, for materialized score)

  • (Optional) is_confidence_frozen (BOOL, default FALSE), for pinned items that should not decay.

Example Postgres migration (Issues):

ALTER TABLE issues
  ADD COLUMN base_priority       DOUBLE PRECISION,
  ADD COLUMN last_signal_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  ADD COLUMN tau_days            DOUBLE PRECISION,
  ADD COLUMN confidence_cached   DOUBLE PRECISION,
  ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;

For StellaOps:

ALTER TABLE vulnerabilities
  ADD COLUMN base_risk           DOUBLE PRECISION,
  ADD COLUMN last_signal_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  ADD COLUMN tau_days            DOUBLE PRECISION,
  ADD COLUMN confidence_cached   DOUBLE PRECISION,
  ADD COLUMN is_confidence_frozen BOOLEAN NOT NULL DEFAULT FALSE;

1.2. Add a config table for τ per entity type

CREATE TABLE confidence_decay_config (
  id                SERIAL PRIMARY KEY,
  entity_type       TEXT NOT NULL, -- 'issue', 'vulnerability', 'epic', 'doc'
  tau_days_default  DOUBLE PRECISION NOT NULL,
  created_at        TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at        TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

INSERT INTO confidence_decay_config (entity_type, tau_days_default) VALUES
('incident', 7),
('vulnerability', 30),
('issue', 30),
('epic', 60),
('doc', 90);

2. Define “signal” events & instrumentation

We need a standardized way to say: “this item got activity → reset last_signal_at”.

2.1. Signals that should reset last_signal_at

For issues / epics:

  • New comment
  • Status change (e.g., Open → In Progress)
  • Field change that matters (severity, owner, milestone)
  • Attachment added
  • Link to PR added or updated
  • New CI failure linked

For vulnerabilities (StellaOps):

  • New scanner result attached or status updated (e.g., “Verified”, “False Positive”)
  • New evidence (PoC, exploit notes)
  • SLA override change
  • Assignment / ownership change
  • Integration events (e.g., PR merge that references the vuln)

For docs (if you do it):

  • Any edit
  • Comment/annotation

2.2. Implement a shared helper to record a signal

Service-level helper (pseudocode / C#-ish):

public interface IConfidenceSignalService
{
    Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null);
}

public class ConfidenceSignalService : IConfidenceSignalService
{
    private readonly IWorkItemRepository _repo;
    private readonly IConfidenceConfigService _config;

    public ConfidenceSignalService(IWorkItemRepository repo, IConfidenceConfigService config)
    {
        _repo = repo;
        _config = config;
    }

    public async Task RecordSignalAsync(WorkItemType type, Guid itemId, DateTime? signalTimeUtc = null)
    {
        var now = signalTimeUtc ?? DateTime.UtcNow;
        var item = await _repo.GetByIdAsync(type, itemId);
        if (item == null) return;

        item.LastSignalAt = now;

        // Fall back to the per-entity-type default tau the first time we see this item.
        if (item.TauDays == null)
        {
            item.TauDays = await _config.GetDefaultTauAsync(type);
        }

        await _repo.UpdateAsync(item);
    }
}

2.3. Wire signals into existing flows

Create small tasks for devs like:

  • ISS-01: Call RecordSignalAsync on:

    • New issue comment handler
    • Issue status update handler
    • Issue field update handler (severity/priority/owner)
  • VULN-01: Call RecordSignalAsync when:

    • New scanner result ingested for a vuln
    • Vulnerability status, SLA, or owner changes
    • New exploit evidence is attached

3. Confidence & scoring calculation

3.1. Shared confidence function

Definition:

public static class ConfidenceMath
{
    // t = days since last signal
    public static double ConfidenceScore(DateTime lastSignalAtUtc, double tauDays, DateTime? nowUtc = null)
    {
        var now = nowUtc ?? DateTime.UtcNow;
        var tDays = (now - lastSignalAtUtc).TotalDays;

        if (tDays <= 0) return 1.0;
        if (tauDays <= 0) return 1.0; // guard / fallback

        var score = Math.Exp(-tDays / tauDays);

        // Optional: never drop below a tiny floor, so items never "disappear"
        const double floor = 0.01;
        return Math.Max(score, floor);
    }
}

3.2. Effective priority formulas

Generic issues / tasks:

double effectiveScore = issue.BasePriority * ConfidenceMath.ConfidenceScore(issue.LastSignalAt, issue.TauDays ?? defaultTau);

Vulnerabilities (StellaOps):

Let's define:

  • severity_weight: map CVSS or severity string to a numeric weight (e.g. Critical=100, High=80, Medium=50, Low=20).
  • reachability: 0–1 (e.g. from your reachability analysis).
  • exploitability: 0–1 (optional, based on known exploits).
  • confidence: as above.

double baseRisk = severityWeight * reachability * exploitability; // or simpler: severityWeight * reachability
double conf = ConfidenceMath.ConfidenceScore(vuln.LastSignalAt, vuln.TauDays ?? defaultTau);
double effectiveRisk = baseRisk * conf;

Store baseRisk in vulnerabilities.base_risk, and compute effectiveRisk on the fly or via a job.

3.3. SQL implementation (optional for server-side sorting)

Postgres example:

-- t_days = age in days
-- tau = tau_days
-- score = exp(-t_days / tau)

SELECT
  i.*,
  i.base_priority * 
    GREATEST(
      EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
      0.01
    ) AS effective_priority
FROM issues i
ORDER BY effective_priority DESC;

You can wrap that in a view:

CREATE VIEW issues_with_confidence AS
SELECT
  i.*,
  GREATEST(
    EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
    0.01
  ) AS confidence,
  i.base_priority *
    GREATEST(
      EXP(- EXTRACT(EPOCH FROM (NOW() - i.last_signal_at)) / (86400 * COALESCE(i.tau_days, 30))),
      0.01
    ) AS effective_priority
FROM issues i;

4. Caching & performance

You have two options:

4.1. Compute on read (simplest to start)

  • Use the helper function in your service layer or a DB view.

  • Pros:

    • No jobs, always fresh.
  • Cons:

    • Slight CPU cost on heavy lists.

Plan: Start with this. If you see perf issues, move to 4.2.

4.2. Periodic materialization job (optional later)

Add a scheduled job (e.g. hourly) that:

  1. Selects all active items.
  2. Computes confidence_score and effective_priority.
  3. Writes to confidence_cached and effective_priority_cached (if you add such a column).

Service then sorts by cached values.
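Those three steps can be sketched as a pure function over a batch of rows (Python; the dict keys mirror the column names from the migrations above and the DB round-trip is omitted — this is an assumption-laden sketch, not the real job):

```python
import math
from datetime import datetime, timedelta, timezone

def materialize(items, tau_defaults, now=None):
    """Recompute cached confidence / effective priority for active items.

    items: dicts with last_signal_at, base_priority, tau_days,
           entity_type, is_confidence_frozen (assumed column names).
    tau_defaults: {entity_type: tau_days_default}, loaded from
                  confidence_decay_config.
    """
    now = now or datetime.now(timezone.utc)
    for item in items:
        if item.get("is_confidence_frozen"):
            conf = 1.0  # pinned items never decay
        else:
            tau = item.get("tau_days") or tau_defaults[item["entity_type"]]
            days = (now - item["last_signal_at"]).total_seconds() / 86400.0
            conf = max(math.exp(-max(days, 0.0) / tau), 0.01)  # 0.01 floor
        item["confidence_cached"] = conf
        item["effective_priority_cached"] = item["base_priority"] * conf
    return items
```

The same loop works whether the job runs hourly or nightly; only the staleness of the cached columns changes.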


5. Backfill & migration

5.1. Initial backfill script

For existing records:

  • If last_signal_at is NULL → set to created_at.
  • Derive base_priority / base_risk from existing severity fields.
  • Set tau_days from config.

Example:

UPDATE issues
SET last_signal_at = created_at
WHERE last_signal_at IS NULL;

UPDATE issues
SET base_priority = CASE severity
  WHEN 'critical' THEN 100
  WHEN 'high'     THEN 80
  WHEN 'medium'   THEN 50
  WHEN 'low'      THEN 20
  ELSE 10
END
WHERE base_priority IS NULL;

UPDATE issues i
SET tau_days = c.tau_days_default
FROM confidence_decay_config c
WHERE c.entity_type = 'issue'
  AND i.tau_days IS NULL;

Do similarly for vulnerabilities using severity / CVSS.

5.2. Sanity checks

Add a small script/test to verify:

  • Newly created items → confidence ≈ 1.0.
  • 30-day-old items with τ=30 → confidence ≈ 0.37.
  • Ordering changes when you edit/comment on items.
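The three checks above translate almost directly into a small script (Python sketch of the math only, not of the real service):

```python
import math
from datetime import datetime, timedelta, timezone

def confidence(last_signal_at, tau_days, now):
    days = (now - last_signal_at).total_seconds() / 86400.0
    return math.exp(-max(days, 0.0) / tau_days)

now = datetime.now(timezone.utc)

# Newly created item -> confidence == 1.0
assert confidence(now, 30, now) == 1.0

# 30-day-old item with tau = 30 -> e^-1 ≈ 0.37
assert abs(confidence(now - timedelta(days=30), 30, now) - 0.37) < 0.01

# A fresh signal lifts an item above an untouched peer
untouched = confidence(now - timedelta(days=20), 30, now)
just_touched = confidence(now - timedelta(minutes=5), 30, now)
assert just_touched > untouched
print("sanity checks passed")
```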

6. API & Query Layer

6.1. New sorting options

Update list APIs:

  • Accept parameter: sort=effective_priority or sort=confidence.

  • Default sort for some views:

    • Vulnerabilities backlog: sort=effective_risk (risk × confidence).
    • Issues backlog: sort=effective_priority.

Example REST API contract:

GET /api/issues?sort=effective_priority&state=open

Response fields (additions):

{
  "id": "ISS-123",
  "title": "Fix login bug",
  "base_priority": 80,
  "last_signal_at": "2025-11-01T10:00:00Z",
  "tau_days": 30,
  "confidence": 0.63,
  "effective_priority": 50.4,
  "confidence_band": "amber"
}

6.2. Confidence banding (for UI)

Define bands server-side (easy to change):

  • Green: confidence >= 0.6
  • Amber: 0.3 ≤ confidence < 0.6
  • Red: confidence < 0.3

You can compute on server:

string ConfidenceBand(double confidence) =>
    confidence >= 0.6 ? "green"
  : confidence >= 0.3 ? "amber"
  : "red";

7. UI / UX changes

7.1. List views (issues / vulns / epics)

For each item row:

  • Show a small freshness pill:

    • Text: Active, Review soon, Stale

    • Derived from confidence band.

    • Tooltip:

      • “Confidence 78%. Last activity 3 days ago. τ = 30 days.”
  • Sort default: by effective_priority / effective_risk.

  • Filters:

    • Freshness: [All | Active | Review soon | Stale]
    • Optionally: “Show stale only” toggle.

Example labels:

  • Green: “Active (confidence 82%)”
  • Amber: “Review soon (confidence 45%)”
  • Red: “Stale (confidence 18%)”

7.2. Detail views

On an issue / vuln page:

  • Add a “Confidence” section:

    • Confidence: 52%
    • Last signal: 12 days ago
    • Decay τ: 30 days
    • Effective priority: Base 80 × 0.52 ≈ 42
  • (Optional) small mini-chart (text-only or simple bar) showing approximate decay, but not necessary for first iteration.

7.3. Admin / settings UI

Add an internal settings page:

  • Table of entity types with editable τ:

    | Entity type   | τ (days) | Notes                        |
    |---------------|----------|------------------------------|
    | Incident      | 7        | Fast-moving                  |
    | Vulnerability | 30       | Standard risk review cadence |
    | Issue         | 30       | Sprint-level decay           |
    | Epic          | 60       | Quarterly                    |
    | Doc           | 90       | Slow decay                   |
  • Optionally: toggle to pin item (is_confidence_frozen) from UI.


8. StellaOps-specific behavior

For vulnerabilities:

8.1. Base risk calculation

Ingested fields you likely already have:

  • cvss_score or severity
  • reachable (true/false or numeric)
  • (Optional) exploit_available (bool) or exploitability score
  • asset_criticality (1–5)

Define base_risk as:

severity_weight = f(cvss_score or severity)
reachability    = reachable ? 1.0 : 0.5   -- example
exploitability  = exploit_available ? 1.0 : 0.7
asset_factor    = 1 + 0.125 * (asset_criticality - 1)  -- 1 → 1.0, 5 → 1.5

base_risk = severity_weight * reachability * exploitability * asset_factor

Store base_risk on vuln row.

Then:

effective_risk = base_risk * confidence(t)

Use effective_risk for backlog ordering and SLA dashboards.
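Putting 8.1 together, a worked example (Python; the weights mirror the pseudocode above, and the asset factor is scaled to hit the stated 1 → 1.0, 5 → 1.5 endpoints — all values are illustrative defaults, not shipped constants):

```python
import math

SEVERITY_WEIGHT = {"critical": 100, "high": 80, "medium": 50, "low": 20}

def base_risk(severity, reachable, exploit_available, asset_criticality):
    weight = SEVERITY_WEIGHT[severity]
    reachability = 1.0 if reachable else 0.5
    exploitability = 1.0 if exploit_available else 0.7
    asset_factor = 1.0 + 0.125 * (asset_criticality - 1)  # 1 -> 1.0, 5 -> 1.5
    return weight * reachability * exploitability * asset_factor

def effective_risk(base, days_since_signal, tau_days=30.0):
    return base * math.exp(-days_since_signal / tau_days)

# Critical, reachable, known exploit, top-tier asset, quiet for 10 days:
b = base_risk("critical", True, True, 5)   # 100 * 1.0 * 1.0 * 1.5 = 150.0
print(round(effective_risk(b, 10), 1))     # ≈ 107.5
```

The decayed figure (≈ 107.5 vs. base 150) is exactly the "Base × Confidence" subtext proposed for the backlog UI in 8.3.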

8.2. Signals for vulns

Make sure these all call RecordSignalAsync(Vulnerability, vulnId):

  • New scan result for same vuln (re-detected).
  • Change status to “In Progress”, “Ready for Deploy”, “Verified Fixed”, etc.
  • Assigning an owner.
  • Attaching PoC / exploit details.

8.3. Vuln UI copy ideas

  • Pill text:

    • “Risk: 850 (confidence 68%)”
    • “Last analyst activity 11 days ago”
  • In backlog view: show Effective Risk as main sort, with a smaller subtext “Base 1200 × Confidence 71%”.


9. Rollout plan

Phase 1: Infrastructure (backend-only)

  • DB migrations & config table
  • Implement ConfidenceMath and helper functions
  • Implement IConfidenceSignalService
  • Wire signals into key flows (comments, state changes, scanner ingestion)
  • Add confidence and effective_priority/risk to API responses
  • Backfill script + dry run in staging

Phase 2: Internal UI & feature flag

  • Add optional sorting by effective score to internal/staff views

  • Add confidence pill (hidden behind feature flag confidence_decay_v1)

  • Dogfood internally:

    • Do items bubble up/down as expected?
    • Are any items “disappearing” because decay is too aggressive?

Phase 3: Parameter tuning

  • Adjust τ per type based on feedback:

    • If things decay too fast → increase τ
    • If queues rarely change → decrease τ
  • Decide on confidence floor (0.01? 0.05?) so nothing goes to literal 0.

Phase 4: General release

  • Make effective score the default sort for key views:

    • Vulnerabilities backlog
    • Issues backlog
  • Document behavior for users (help center / inline tooltip)

  • Add admin UI to tweak τ per entity type.


10. Edge cases & safeguards

  • New items

    • last_signal_at = created_at, confidence = 1.0.
  • Pinned items

    • If is_confidence_frozen = true → treat confidence as 1.0.
  • Items without τ

    • Always fallback to entity type default.
  • Timezones

    • Always store & compute in UTC.
  • Very old items

    • Floor the confidence so they're still visible when explicitly searched.

If you want, I can turn this into:

  • A short technical design doc (with sections: Problem, Proposal, Alternatives, Rollout).
  • Or a set of Jira tickets grouped by backend / frontend / infra that your team can pick up directly.