Uncertainty States & Entropy Scoring

Status: Implemented v0 for reachability facts (Signals). Owners: Signals Guild · Policy Guild · UI Guild.

StellaOps treats missing data and untrusted evidence as first-class uncertainty states, not silent false negatives. Signals persists uncertainty state entries alongside reachability facts and derives a deterministic riskScore that increases when entropy is high.


1. Core states (extensible)

| Code | Name | Meaning |
|------|------|---------|
| U1 | MissingSymbolResolution | Unresolved symbols/edges prevent a complete reachability proof. |
| U2 | MissingPurl | Package identity/version is ambiguous (lockfile absent, heuristics only). |
| U3 | UntrustedAdvisory | Advisory source lacks provenance/corroboration. |
| U4 | Unknown | No analyzers have processed this subject; baseline uncertainty. |

Each state records:

  • entropy (0..1)
  • evidence[] list pointing to analyzers/heuristics/sources
  • optional timestamp (UTC)

1.1 Uncertainty Tiers (v1 — Sprint 0401)

Uncertainty states are grouped into tiers that determine policy thresholds and UI treatment.

Tier Definitions

| Tier | Entropy Range | States | Risk Modifier | Policy Implication |
|------|---------------|--------|---------------|--------------------|
| T1 (High) | 0.7 - 1.0 | U1 (high), U4 | +50% | Block "not_affected", require human review |
| T2 (Medium) | 0.4 - 0.69 | U1 (medium), U2 | +25% | Warn on "not_affected", flag for review |
| T3 (Low) | 0.1 - 0.39 | U2 (low), U3 | +10% | Allow "not_affected" with advisory note |
| T4 (Negligible) | 0.0 - 0.09 | U3 (low) | +0% | Normal processing, no special handling |

Tier Assignment Rules

  1. U1 (MissingSymbolResolution):

    • entropy >= 0.7 → T1 (>30% unknowns in the call graph)
    • 0.4 <= entropy < 0.7 → T2 (15-30% unknowns)
    • entropy < 0.4 → T3 (<15% unknowns)
  2. U2 (MissingPurl):

    • entropy >= 0.5 → T2 (>50% packages unresolved)
    • entropy < 0.5 → T3 (<50% packages unresolved)
  3. U3 (UntrustedAdvisory):

    • entropy >= 0.6 → T3 (no corroboration)
    • entropy < 0.6 → T4 (partial corroboration)
  4. U4 (Unknown):

    • Always T1 (no analysis performed = maximum uncertainty)
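
The assignment rules above can be sketched as a small lookup function. This is a minimal illustration, not the UncertaintyTierCalculator implementation; the function name `assign_tier` and the choice to raise on unknown codes or out-of-range entropy are assumptions made for the example.

```python
def assign_tier(code: str, entropy: float) -> str:
    """Map a state code and its entropy to a tier, following the rules above.

    Hypothetical helper: the name and the error handling are illustrative only.
    """
    if not 0.0 <= entropy <= 1.0:
        raise ValueError(f"entropy must be within [0, 1], got {entropy}")

    if code == "U1":  # MissingSymbolResolution
        if entropy >= 0.7:
            return "T1"
        if entropy >= 0.4:
            return "T2"
        return "T3"
    if code == "U2":  # MissingPurl
        return "T2" if entropy >= 0.5 else "T3"
    if code == "U3":  # UntrustedAdvisory
        return "T3" if entropy >= 0.6 else "T4"
    if code == "U4":  # Unknown: always the most severe tier
        return "T1"
    raise ValueError(f"unknown uncertainty code: {code}")
```

For instance, `assign_tier("U1", 0.72)` yields T1, while `assign_tier("U3", 0.45)` yields T4 because U3 only reaches T3 at entropy 0.6 or above.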

Aggregate Tier Calculation

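When multiple uncertainty states exist, the aggregate tier is the most severe tier among them. Tiers are ordered by severity (T1 > T2 > T3 > T4), so max below selects the most severe tier, not the largest tier number: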

aggregateTier = max(tier(state) for state in uncertainty.states)
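
Because the tier labels sort in the opposite direction to their severity, an explicit severity rank avoids ambiguity. The sketch below assumes a hypothetical `aggregate_tier` helper; returning `None` for an empty state list is an assumption, since this document does not define the aggregate tier when no states exist.

```python
from typing import Iterable, Optional

# Lower rank means more severe; T1 is the most severe tier.
TIER_SEVERITY = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}


def aggregate_tier(tiers: Iterable[str]) -> Optional[str]:
    """Return the most severe tier, or None when there are no states (assumed)."""
    tiers = list(tiers)
    if not tiers:
        return None
    # min() over the severity rank picks the most severe tier.
    return min(tiers, key=lambda tier: TIER_SEVERITY[tier])
```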

2. JSON shape

{
  "uncertainty": {
    "states": [
      {
        "code": "U1",
        "name": "MissingSymbolResolution",
        "entropy": 0.72,
        "timestamp": "2025-11-12T14:12:00Z",
        "evidence": [
          {
            "type": "UnknownsRegistry",
            "sourceId": "signals.unknowns",
            "detail": "unknownsCount=12;unknownsPressure=0.375"
          }
        ]
      }
    ]
  }
}
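
A consumer of this shape might load it into typed records and reject out-of-range entropy early. The following is a sketch only: the class and function names are hypothetical, and treating `timestamp` and `evidence` as optional on input is an assumption based on the field list in §1.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Evidence:
    type: str
    sourceId: str
    detail: str


@dataclass
class UncertaintyState:
    code: str
    name: str
    entropy: float
    evidence: List[Evidence] = field(default_factory=list)
    timestamp: Optional[str] = None  # UTC, optional per section 1


def parse_states(document: str) -> List[UncertaintyState]:
    """Parse the uncertainty.states array and validate the entropy range."""
    payload = json.loads(document)
    states = []
    for raw in payload.get("uncertainty", {}).get("states", []):
        entropy = float(raw["entropy"])
        if not 0.0 <= entropy <= 1.0:
            raise ValueError(f"entropy out of range for {raw['code']}: {entropy}")
        evidence = [
            Evidence(e["type"], e["sourceId"], e["detail"])
            for e in raw.get("evidence", [])
        ]
        states.append(
            UncertaintyState(
                code=raw["code"],
                name=raw["name"],
                entropy=entropy,
                evidence=evidence,
                timestamp=raw.get("timestamp"),
            )
        )
    return states


SAMPLE = """
{
  "uncertainty": {
    "states": [
      {
        "code": "U1",
        "name": "MissingSymbolResolution",
        "entropy": 0.72,
        "timestamp": "2025-11-12T14:12:00Z",
        "evidence": [
          {
            "type": "UnknownsRegistry",
            "sourceId": "signals.unknowns",
            "detail": "unknownsCount=12;unknownsPressure=0.375"
          }
        ]
      }
    ]
  }
}
"""

states = parse_states(SAMPLE)
```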

3. Risk score math (Signals)

Signals computes a riskScore deterministically during reachability recompute:

meanEntropy  = avg(uncertainty.states[].entropy)              // 0 when no states
entropyBoost = clamp(meanEntropy * k, 0 .. boostCeiling)
riskScore    = clamp(baseScore * (1 + entropyBoost), 0 .. 1)

Where:

  • baseScore is the average of per-target reachability state scores (before unknowns penalty).
  • k defaults to 0.5 (SignalsOptions:Scoring:UncertaintyEntropyMultiplier).
  • boostCeiling defaults to 0.5 (SignalsOptions:Scoring:UncertaintyBoostCeiling).
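
The three formulas above translate directly into code. This is a minimal sketch using the documented defaults; the names `clamp` and `risk_score_v0` are illustrative and do not refer to the actual Signals implementation.

```python
from typing import Sequence


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def risk_score_v0(
    base_score: float,
    entropies: Sequence[float],
    k: float = 0.5,              # UncertaintyEntropyMultiplier default
    boost_ceiling: float = 0.5,  # UncertaintyBoostCeiling default
) -> float:
    """Compute riskScore from baseScore and the per-state entropies."""
    # meanEntropy is 0 when there are no uncertainty states.
    mean_entropy = sum(entropies) / len(entropies) if entropies else 0.0
    entropy_boost = clamp(mean_entropy * k, 0.0, boost_ceiling)
    return clamp(base_score * (1.0 + entropy_boost), 0.0, 1.0)
```

With the defaults, the boost reaches its ceiling of 0.5 only when the mean entropy is 1.0, so the score can grow by at most 50% before the final clamp to 1.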

4. Policy guidance (high level)

Uncertainty should bias decisions away from "not affected" when evidence is missing:

  • High entropy (for example, U1 with entropy >= 0.7) should move the finding to under_investigation and drive remediation (upload symbols, run probes, close unknowns).
  • Low entropy should allow normal confidence-based gates.

See docs/reachability/lattice.md for the current reachability score model and docs/api/signals/reachability-contract.md for the Signals contract.


5. Tier-Based Risk Score (v1 — Sprint 0401)

Risk Score Formula

Building on §3, the v1 risk score incorporates tier-based modifiers:

tierModifier = {
  T1: 0.50,
  T2: 0.25,
  T3: 0.10,
  T4: 0.00
}[aggregateTier]

riskScore = clamp(baseScore * (1 + tierModifier + entropyBoost), 0 .. 1)

Where:

  • baseScore is the average of per-target reachability state scores
  • tierModifier is the tier-based risk increase
  • entropyBoost is the existing entropy-based boost (§3)

Example Calculation

Given:
  - baseScore = 0.4 (moderate reachability)
  - uncertainty.states = [
      {code: "U1", entropy: 0.72},  // T1 tier (U1, entropy >= 0.7)
      {code: "U3", entropy: 0.45}   // T4 tier (U3, entropy < 0.6)
    ]
  - aggregateTier = T1 (most severe of T1, T4)
  - tierModifier = 0.50

  meanEntropy = (0.72 + 0.45) / 2 = 0.585
  entropyBoost = clamp(0.585 * 0.5, 0 .. 0.5) = 0.2925

  riskScore = clamp(0.4 * (1 + 0.50 + 0.2925), 0 .. 1)
            = clamp(0.4 * 1.7925, 0 .. 1)
            = clamp(0.717, 0 .. 1)
            = 0.717
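
The worked example can be reproduced with a short function. This is a sketch under the documented defaults (k = 0.5, boostCeiling = 0.5); `risk_score_v1` and `TIER_MODIFIER` are hypothetical names, not the signature of ReachabilityScoringService.ComputeRiskScore().

```python
from typing import Sequence

# Tier-based risk increase from the formula above.
TIER_MODIFIER = {"T1": 0.50, "T2": 0.25, "T3": 0.10, "T4": 0.00}


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def risk_score_v1(
    base_score: float,
    entropies: Sequence[float],
    aggregate_tier: str,
    k: float = 0.5,
    boost_ceiling: float = 0.5,
) -> float:
    """Combine the tier modifier with the entropy boost from section 3."""
    mean_entropy = sum(entropies) / len(entropies) if entropies else 0.0
    entropy_boost = clamp(mean_entropy * k, 0.0, boost_ceiling)
    tier_modifier = TIER_MODIFIER[aggregate_tier]
    return clamp(base_score * (1.0 + tier_modifier + entropy_boost), 0.0, 1.0)


# Inputs from the example: baseScore 0.4, entropies 0.72 and 0.45, tier T1.
example_score = risk_score_v1(0.4, [0.72, 0.45], "T1")
```

Rounded to three decimals, `example_score` matches the 0.717 derived above.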

Tier Thresholds for Policy Gates

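| Tier | riskScore Range | VEX "not_affected" | VEX "affected" | Auto-triage |
|------|-----------------|--------------------|----------------|-------------|
| T1 | >= 0.6 | blocked | ⚠️ review | under_investigation |
| T2 | 0.4 - 0.59 | ⚠️ warning | allowed | Manual review |
| T3 | 0.2 - 0.39 | with note | allowed | Normal |
| T4 | < 0.2 | allowed | allowed | Normal |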
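
One way to read the threshold table is as a lookup from riskScore to a gate row. The sketch below is illustrative only: the `GateDecision` type, the `gate_for` function, and the outcome strings are hypothetical, and treating each range as half-open (for example, 0.4 <= score < 0.6 for T2) is an assumption that closes the gaps between the two-decimal ranges listed above.

```python
from typing import NamedTuple


class GateDecision(NamedTuple):
    tier: str
    not_affected: str  # handling of a VEX "not_affected" claim
    affected: str      # handling of a VEX "affected" claim
    auto_triage: str


# Rows ordered from the highest lower bound to the lowest.
_GATES = [
    (0.6, GateDecision("T1", "blocked", "review", "under_investigation")),
    (0.4, GateDecision("T2", "warning", "allowed", "manual_review")),
    (0.2, GateDecision("T3", "allowed_with_note", "allowed", "normal")),
    (0.0, GateDecision("T4", "allowed", "allowed", "normal")),
]


def gate_for(risk_score: float) -> GateDecision:
    """Return the first row whose lower bound the score meets."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError(f"riskScore must be within [0, 1], got {risk_score}")
    for lower_bound, decision in _GATES:
        if risk_score >= lower_bound:
            return decision
    raise AssertionError("unreachable: the last row has a lower bound of 0.0")
```

Under this reading, the example score of 0.717 lands in the T1 row, so a "not_affected" claim would be blocked.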

6. JSON Schema (v1)

Extended schema with tier information:

{
  "uncertainty": {
    "states": [
      {
        "code": "U1",
        "name": "MissingSymbolResolution",
        "entropy": 0.72,
        "tier": "T1",
        "timestamp": "2025-12-13T10:00:00Z",
        "evidence": [
          {
            "type": "UnknownsRegistry",
            "sourceId": "signals.unknowns",
            "detail": "unknownsCount=45;totalSymbols=125;unknownsPressure=0.36"
          }
        ]
      },
      {
        "code": "U4",
        "name": "Unknown",
        "entropy": 1.0,
        "tier": "T1",
        "timestamp": "2025-12-13T10:00:00Z",
        "evidence": [
          {
            "type": "NoAnalysis",
            "sourceId": "signals.bootstrap",
            "detail": "subject not yet analyzed"
          }
        ]
      }
    ],
    "aggregateTier": "T1",
    "riskScore": 0.717,
    "computedAt": "2025-12-13T10:00:00Z"
  }
}
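
A lightweight consistency check over this shape might verify that entropy and riskScore stay within [0, 1] and that aggregateTier equals the most severe per-state tier. This is a sketch, not a formal schema validator; `check_uncertainty` is a hypothetical helper and the trimmed sample omits fields that the check does not read.

```python
from typing import Any, Dict, List

# Lower rank means more severe; T1 is the most severe tier.
TIER_SEVERITY = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}


def _in_unit_range(value: Any) -> bool:
    """True when value is a number within [0, 1]."""
    return isinstance(value, (int, float)) and 0.0 <= value <= 1.0


def check_uncertainty(document: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; empty when consistent."""
    problems: List[str] = []
    uncertainty = document.get("uncertainty", {})
    states = uncertainty.get("states", [])

    for index, state in enumerate(states):
        if not _in_unit_range(state.get("entropy")):
            problems.append(f"states[{index}].entropy must be a number in [0, 1]")
        if state.get("tier") not in TIER_SEVERITY:
            problems.append(f"states[{index}].tier must be one of T1..T4")

    known = [s["tier"] for s in states if s.get("tier") in TIER_SEVERITY]
    if known:
        expected = min(known, key=lambda tier: TIER_SEVERITY[tier])
        if uncertainty.get("aggregateTier") != expected:
            problems.append(f"aggregateTier should be {expected}")

    if not _in_unit_range(uncertainty.get("riskScore")):
        problems.append("riskScore must be a number in [0, 1]")
    return problems


# Trimmed version of the document above (evidence and timestamps omitted).
SAMPLE = {
    "uncertainty": {
        "states": [
            {"code": "U1", "entropy": 0.72, "tier": "T1"},
            {"code": "U4", "entropy": 1.0, "tier": "T1"},
        ],
        "aggregateTier": "T1",
        "riskScore": 0.717,
    }
}
```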

7. Implementation Pointers

  • Tier calculation: UncertaintyTierCalculator in src/Signals/StellaOps.Signals/Services/
  • Risk score math: ReachabilityScoringService.ComputeRiskScore() (extend existing)
  • Policy integration: docs/reachability/policy-gate.md for gate rules
  • Lattice integration: docs/reachability/lattice.md §9 for v1 lattice states