up
@@ -1,28 +1,73 @@
|
||||
# Uncertainty States & Entropy Scoring
|
||||
|
||||
> **Status:** Draft – aligns with the November 2025 advisory on explicit uncertainty tracking.
|
||||
> **Owners:** Signals Guild · Concelier Guild · UI Guild.
|
||||
> **Status:** Implemented v0 for reachability facts (Signals).
|
||||
> **Owners:** Signals Guild · Policy Guild · UI Guild.
|
||||
|
||||
Stella Ops treats missing data and untrusted evidence as **first-class uncertainty states**, not silent false negatives. Each finding stores a list of `UncertaintyState` entries plus supporting evidence; the risk scorer uses their entropy to adjust final risk. Policy and UI surfaces reveal uncertainty to operators rather than hiding it.
|
||||
StellaOps treats missing data and untrusted evidence as **first-class uncertainty states**, not silent false negatives. Signals persists uncertainty state entries alongside reachability facts and derives a deterministic `riskScore` that increases when entropy is high.
|
||||
|
||||
---
|
||||
|
||||
## 1. Core states (extensible)
|
||||
|
||||
| Code | Name | Meaning |
|------|------|---------|
| `U1` | MissingSymbolResolution | Vulnerability → function mapping unresolved (no PDB/IL map, missing dSYMs). |
| `U2` | MissingPurl | Package identity/version ambiguous (lockfile absent, heuristics only). |
| `U3` | UntrustedAdvisory | Advisory source lacks DSSE/Sigstore provenance or corroboration. |
| `U4+`| (future) | e.g. partial SBOM coverage, missing container layers, unresolved transitives. |
|
||||
| Code | Name | Meaning |
|------|------|---------|
| `U1` | `MissingSymbolResolution` | Unresolved symbols/edges prevent a complete reachability proof. |
| `U2` | `MissingPurl` | Package identity/version is ambiguous (lockfile absent, heuristics only). |
| `U3` | `UntrustedAdvisory` | Advisory source lacks provenance/corroboration. |
| `U4` | `Unknown` | No analyzers have processed this subject; baseline uncertainty. |
|
||||
|
||||
Each state records `entropy` (0–1) and an evidence list pointing to analyzers, heuristics, or advisory sources that asserted the uncertainty.
|
||||
Each state records:

- `entropy` (0..1)
- `evidence[]` list pointing to analyzers/heuristics/sources
- optional `timestamp` (UTC)
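
As a rough illustration of the record described above, the following Python sketch models one state and rejects out-of-range entropy. The class names and the `type`/`source_id`/`detail` evidence fields mirror the JSON shape shown later in this document; the Python types themselves are hypothetical and are not the Signals implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UncertaintyEvidence:
    type: str        # e.g. "AnalyzerProbe"
    source_id: str   # e.g. "dotnet.symbolizer"
    detail: str      # free-form explanation


@dataclass(frozen=True)
class UncertaintyState:
    code: str                                   # "U1" .. "U4"
    name: str                                   # e.g. "MissingSymbolResolution"
    entropy: float                              # must lie in [0, 1]
    evidence: Tuple[UncertaintyEvidence, ...] = ()
    timestamp: Optional[str] = None             # optional UTC timestamp (ISO 8601 string)

    def __post_init__(self) -> None:
        # Reject values outside the documented 0..1 range.
        if not 0.0 <= self.entropy <= 1.0:
            raise ValueError(f"entropy out of range: {self.entropy}")


state = UncertaintyState(
    code="U1",
    name="MissingSymbolResolution",
    entropy=0.72,
    evidence=(UncertaintyEvidence("AnalyzerProbe", "dotnet.symbolizer",
                                  "No PDB/IL map for Foo.Bar::DoWork"),),
)
```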
|
||||
|
||||
---
|
||||
|
||||
## 2. Schema
|
||||
## 1.1 Uncertainty Tiers (v1 — Sprint 0401)
|
||||
|
||||
```jsonc
|
||||
Uncertainty states are grouped into **tiers** that determine policy thresholds and UI treatment.
|
||||
|
||||
### Tier Definitions
|
||||
|
||||
| Tier | Entropy Range | States | Risk Modifier | Policy Implication |
|------|---------------|--------|---------------|--------------------|
| **T1 (High)** | `0.7 - 1.0` | `U1` (high), `U4` | `+50%` | Block "not_affected", require human review |
| **T2 (Medium)** | `0.4 - 0.69` | `U1` (medium), `U2` | `+25%` | Warn on "not_affected", flag for review |
| **T3 (Low)** | `0.1 - 0.39` | `U2` (low), `U3` | `+10%` | Allow "not_affected" with advisory note |
| **T4 (Negligible)** | `0.0 - 0.09` | `U3` (low) | `+0%` | Normal processing, no special handling |
|
||||
|
||||
### Tier Assignment Rules
|
||||
|
||||
1. **U1 (MissingSymbolResolution):**
   - `entropy >= 0.7` → T1 (>30% unknowns in callgraph)
   - `entropy >= 0.4` → T2 (15-30% unknowns)
   - `entropy < 0.4` → T3 (<15% unknowns)

2. **U2 (MissingPurl):**
   - `entropy >= 0.5` → T2 (>50% packages unresolved)
   - `entropy < 0.5` → T3 (<50% packages unresolved)

3. **U3 (UntrustedAdvisory):**
   - `entropy >= 0.6` → T3 (no corroboration)
   - `entropy < 0.6` → T4 (partial corroboration)

4. **U4 (Unknown):**
   - Always T1 (no analysis performed = maximum uncertainty)
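
The assignment rules above can be sketched as a single lookup function. This is a minimal Python illustration of the listed thresholds only; the function name `assign_tier` is hypothetical and the real logic lives in the C# `UncertaintyTierCalculator`, which may differ in detail.

```python
def assign_tier(code: str, entropy: float) -> str:
    """Map one uncertainty state to a tier using the per-code thresholds above."""
    if not 0.0 <= entropy <= 1.0:
        raise ValueError(f"entropy out of range: {entropy}")
    if code == "U1":  # MissingSymbolResolution
        if entropy >= 0.7:
            return "T1"
        if entropy >= 0.4:
            return "T2"
        return "T3"
    if code == "U2":  # MissingPurl
        return "T2" if entropy >= 0.5 else "T3"
    if code == "U3":  # UntrustedAdvisory
        return "T3" if entropy >= 0.6 else "T4"
    if code == "U4":  # Unknown: always the most severe tier
        return "T1"
    raise ValueError(f"unknown uncertainty code: {code}")


print(assign_tier("U1", 0.72))  # T1
print(assign_tier("U3", 0.45))  # T4
```

Note that the tier depends on the state code as well as the entropy value, so the same entropy can land in different tiers for different codes.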
|
||||
|
||||
### Aggregate Tier Calculation
|
||||
|
||||
When multiple uncertainty states exist, the aggregate tier is the **most severe** tier present, where T1 is the most severe and T4 the least:

```
aggregateTier = max(tier(state) for state in uncertainty.states)   // max by severity: T1 > T2 > T3 > T4
```
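
Because tier labels sort the "wrong" way lexically (T1 is the most severe), an explicit severity ranking avoids ambiguity. The sketch below is one possible reading in Python; `aggregate_tier` and the `SEVERITY` table are hypothetical names, and returning T4 for an empty state list is an assumption, since this document does not specify that case.

```python
# Higher number = more severe. T1 is the most severe tier.
SEVERITY = {"T1": 4, "T2": 3, "T3": 2, "T4": 1}


def aggregate_tier(tiers) -> str:
    """Return the most severe tier among the given per-state tiers."""
    tiers = list(tiers)
    if not tiers:
        return "T4"  # assumption: no states means negligible uncertainty
    return max(tiers, key=SEVERITY.__getitem__)


print(aggregate_tier(["T3", "T1", "T4"]))  # T1
```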
|
||||
|
||||
---
|
||||
|
||||
## 2. JSON shape
|
||||
|
||||
```json
|
||||
{
|
||||
"uncertainty": {
|
||||
"states": [
|
||||
@@ -30,24 +75,12 @@ Each state records `entropy` (0–1) and an evidence list pointing to analyzers,
|
||||
"code": "U1",
|
||||
"name": "MissingSymbolResolution",
|
||||
"entropy": 0.72,
|
||||
"timestamp": "2025-11-12T14:12:00Z",
|
||||
"evidence": [
|
||||
{
|
||||
"type": "AnalyzerProbe",
|
||||
"sourceId": "dotnet.symbolizer",
|
||||
"detail": "No PDB/IL map for Foo.Bar::DoWork"
|
||||
}
|
||||
],
|
||||
"timestamp": "2025-11-12T14:12:00Z"
|
||||
},
|
||||
{
|
||||
"code": "U2",
|
||||
"name": "MissingPurl",
|
||||
"entropy": 0.55,
|
||||
"evidence": [
|
||||
{
|
||||
"type": "PackageHeuristic",
|
||||
"sourceId": "jar.manifest",
|
||||
"detail": "Guessed groupId=com.example, version ~= 1.9.x"
|
||||
"type": "UnknownsRegistry",
|
||||
"sourceId": "signals.unknowns",
|
||||
"detail": "unknownsCount=12;unknownsPressure=0.375"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -56,98 +89,140 @@ Each state records `entropy` (0–1) and an evidence list pointing to analyzers,
|
||||
}
|
||||
```
|
||||
|
||||
### C# models
|
||||
---
|
||||
|
||||
```csharp
|
||||
public sealed record UncertaintyEvidence(string Type, string SourceId, string Detail);
|
||||
## 3. Risk score math (Signals)
|
||||
|
||||
public sealed record UncertaintyState(
|
||||
string Code,
|
||||
string Name,
|
||||
double Entropy,
|
||||
IReadOnlyList<UncertaintyEvidence> Evidence);
|
||||
Signals computes a `riskScore` deterministically during reachability recompute:
|
||||
|
||||
```
|
||||
meanEntropy = avg(uncertainty.states[].entropy) // 0 when no states
|
||||
entropyBoost = clamp(meanEntropy * k, 0 .. boostCeiling)
|
||||
riskScore = clamp(baseScore * (1 + entropyBoost), 0 .. 1)
|
||||
```
|
||||
|
||||
Store them alongside `FindingDocument` in Signals and expose via APIs/CLI/GraphQL so downstream services can display them or enforce policies.
|
||||
Where:

- `baseScore` is the average of per-target reachability state scores (before the unknowns penalty is applied).
- `k` defaults to `0.5` (`SignalsOptions:Scoring:UncertaintyEntropyMultiplier`).
- `boostCeiling` defaults to `0.5` (`SignalsOptions:Scoring:UncertaintyBoostCeiling`).
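
For readers who prefer executable form, the formula above can be sketched in Python as follows. The function and parameter names are hypothetical; only the arithmetic and the two default values (`k = 0.5`, `boostCeiling = 0.5`) come from this document.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def risk_score_v0(base_score: float, entropies, k: float = 0.5,
                  boost_ceiling: float = 0.5) -> float:
    """Entropy-boosted risk score; with no states the boost is zero."""
    entropies = list(entropies)
    mean_entropy = sum(entropies) / len(entropies) if entropies else 0.0
    entropy_boost = clamp(mean_entropy * k, 0.0, boost_ceiling)
    return clamp(base_score * (1.0 + entropy_boost), 0.0, 1.0)


print(risk_score_v0(0.4, []))     # 0.4 (no uncertainty states, score unchanged)
print(risk_score_v0(0.9, [1.0]))  # 1.0 (0.9 * 1.5 = 1.35, clamped to 1)
```

Since both inputs to the boost are bounded, the multiplier never exceeds `1 + boostCeiling`, which keeps the score deterministic and bounded for a given set of states.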
|
||||
|
||||
---
|
||||
|
||||
## 3. Risk score math
|
||||
## 4. Policy guidance (high level)
|
||||
|
||||
```
|
||||
riskScore = baseScore
|
||||
× reachabilityFactor (0..1)
|
||||
× trustFactor (0..1)
|
||||
× (1 + entropyBoost)
|
||||
Uncertainty should bias decisions away from "not affected" when evidence is missing:
|
||||
|
||||
entropyBoost = clamp(avg(uncertainty[i].entropy) × k, 0 .. 0.5)
|
||||
```
|
||||
- High entropy (`U1` with high `entropy`) should lead to **under investigation** and drive remediation (upload symbols, run probes, close unknowns).
|
||||
- Low entropy should allow normal confidence-based gates.
|
||||
|
||||
* `k` defaults to `0.5`. With mean entropy = 0.8, boost = 0.4 → risk increases 40% to highlight unknowns.
|
||||
* If no uncertainty states exist, entropy boost = 0 and the previous scoring remains.
|
||||
|
||||
Persist both `uncertainty.states` and `riskScore` so policies, dashboards, and APIs stay deterministic.
|
||||
See `docs/reachability/lattice.md` for the current reachability score model and `docs/api/signals/reachability-contract.md` for the Signals contract.
|
||||
|
||||
---
|
||||
|
||||
## 4. Policy + actions
|
||||
## 5. Tier-Based Risk Score (v1 — Sprint 0401)
|
||||
|
||||
Use uncertainty in Concelier/Excitors policies:
|
||||
### Risk Score Formula
|
||||
|
||||
* **Block release** if critical CVE has `U1` with entropy ≥ 0.70 until symbols or runtime probes are provided.
|
||||
* **Warn** when only `U3` exists – allow deployment but require corroboration (OSV/GHSA, CSAF).
|
||||
* **Auto-create tasks** for `U2` to fix SBOM/purl data quality.
|
||||
Building on §3, the v1 risk score incorporates tier-based modifiers:
|
||||
|
||||
Recommended policy predicates:
|
||||
```
|
||||
tierModifier = {
  T1: 0.50,
  T2: 0.25,
  T3: 0.10,
  T4: 0.00
}[aggregateTier]
|
||||
|
||||
```yaml
|
||||
when:
|
||||
all:
|
||||
- uncertaintyCodesAny: ["U1"]
|
||||
- maxEntropyGte: 0.7
|
||||
riskScore = clamp(baseScore * (1 + tierModifier + entropyBoost), 0 .. 1)
|
||||
```
|
||||
|
||||
Excitors can suggest remediation actions (upload PDBs, add lockfiles, fetch signed CSAF) based on state codes.
|
||||
Where:

- `baseScore` is the average of per-target reachability state scores
- `tierModifier` is the tier-based risk increase
- `entropyBoost` is the existing entropy-based boost (§3)
|
||||
|
||||
### Example Calculation
|
||||
|
||||
```
Given:
- baseScore = 0.4 (moderate reachability)
- uncertainty.states = [
    {code: "U1", entropy: 0.72},  // T1 (U1 with entropy >= 0.7)
    {code: "U3", entropy: 0.45}   // T4 (U3 with entropy < 0.6)
  ]
- aggregateTier = T1 (most severe of T1, T4)
- tierModifier = 0.50

meanEntropy  = (0.72 + 0.45) / 2 = 0.585
entropyBoost = clamp(0.585 * 0.5, 0 .. 0.5) = 0.2925

riskScore = clamp(0.4 * (1 + 0.50 + 0.2925), 0 .. 1)
          = clamp(0.4 * 1.7925, 0 .. 1)
          = clamp(0.717, 0 .. 1)
          = 0.717
```
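
The worked example can be reproduced with a short Python sketch. The names `risk_score_v1` and `TIER_MODIFIER` are hypothetical; the modifier values, defaults, and formula are taken from this section, and the aggregate tier is passed in directly rather than derived.

```python
TIER_MODIFIER = {"T1": 0.50, "T2": 0.25, "T3": 0.10, "T4": 0.00}


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def risk_score_v1(base_score: float, entropies, aggregate_tier: str,
                  k: float = 0.5, boost_ceiling: float = 0.5) -> float:
    """v1 score: base score scaled by (1 + tier modifier + entropy boost)."""
    entropies = list(entropies)
    mean_entropy = sum(entropies) / len(entropies) if entropies else 0.0
    entropy_boost = clamp(mean_entropy * k, 0.0, boost_ceiling)
    multiplier = 1.0 + TIER_MODIFIER[aggregate_tier] + entropy_boost
    return clamp(base_score * multiplier, 0.0, 1.0)


score = risk_score_v1(0.4, [0.72, 0.45], "T1")
print(round(score, 3))  # 0.717
```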
|
||||
|
||||
### Tier Thresholds for Policy Gates
|
||||
|
||||
| Tier | `riskScore` Range | VEX "not_affected" | VEX "affected" | Auto-triage |
|------|-------------------|--------------------|----------------|-------------|
| T1 | `>= 0.6` | ❌ blocked | ⚠️ review | → `under_investigation` |
| T2 | `0.4 - 0.59` | ⚠️ warning | ✅ allowed | Manual review |
| T3 | `0.2 - 0.39` | ✅ with note | ✅ allowed | Normal |
| T4 | `< 0.2` | ✅ allowed | ✅ allowed | Normal |
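
One way to apply the table above is a threshold lookup, sketched here in Python. The `Gate` type, the `gate_for` function, and the lower-case outcome strings are hypothetical labels for the table cells. The sketch also assumes the ranges are half-open intervals (for example, T2 covers `0.4 <= riskScore < 0.6`), since the table leaves small gaps between rows.

```python
from typing import NamedTuple


class Gate(NamedTuple):
    tier: str
    not_affected: str  # handling of a VEX "not_affected" claim
    affected: str      # handling of a VEX "affected" claim
    auto_triage: str


# (lower bound, gate) pairs, checked from most to least severe.
_GATES = [
    (0.6, Gate("T1", "blocked", "review", "under_investigation")),
    (0.4, Gate("T2", "warning", "allowed", "manual_review")),
    (0.2, Gate("T3", "allowed_with_note", "allowed", "normal")),
    (0.0, Gate("T4", "allowed", "allowed", "normal")),
]


def gate_for(risk_score: float) -> Gate:
    """Return the policy gate row that applies to a risk score in [0, 1]."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError(f"riskScore out of range: {risk_score}")
    for lower_bound, gate in _GATES:
        if risk_score >= lower_bound:
            return gate
    raise AssertionError("unreachable: 0.0 bound matches every valid score")


print(gate_for(0.717).not_affected)  # blocked
```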
|
||||
|
||||
---
|
||||
|
||||
## 5. UI guidelines
|
||||
## 6. JSON Schema (v1)
|
||||
|
||||
* Display chips `U1`, `U2`, … on each finding. Tooltip: entropy level + evidence bullets (“AnalyzerProbe/dotnet.symbolizer: …”).
|
||||
* Provide “How to reduce entropy” hints: symbol uploads, EventPipe probes, purl overrides, advisory verification.
|
||||
* Show entropy in filters (e.g., “entropy ≥ 0.5”) so teams can prioritise closing uncertainty gaps.
|
||||
|
||||
See `components/UncertaintyChipStack` (planned) for a reference implementation.
|
||||
|
||||
---
|
||||
|
||||
## 6. Event sourcing / audit
|
||||
|
||||
Emit `FindingUncertaintyUpdated` events whenever the set changes:
|
||||
Extended schema with tier information:
|
||||
|
||||
```json
|
||||
{
|
||||
"type": "FindingUncertaintyUpdated",
|
||||
"findingId": "finding:service:prod:CVE-2023-12345",
|
||||
"updatedAt": "2025-11-12T14:21:33Z",
|
||||
"uncertainty": [ ...states... ]
  "uncertainty": {
    "states": [
      {
        "code": "U1",
        "name": "MissingSymbolResolution",
        "entropy": 0.72,
        "tier": "T1",
        "timestamp": "2025-12-13T10:00:00Z",
        "evidence": [
          {
            "type": "UnknownsRegistry",
            "sourceId": "signals.unknowns",
            "detail": "unknownsCount=45;totalSymbols=125;unknownsPressure=0.36"
          }
        ]
      },
      {
        "code": "U4",
        "name": "Unknown",
        "entropy": 1.0,
        "tier": "T1",
        "timestamp": "2025-12-13T10:00:00Z",
        "evidence": [
          {
            "type": "NoAnalysis",
            "sourceId": "signals.bootstrap",
            "detail": "subject not yet analyzed"
          }
        ]
      }
    ],
    "aggregateTier": "T1",
    "riskScore": 0.717,
    "computedAt": "2025-12-13T10:00:00Z"
  }
|
||||
}
|
||||
```
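
A consumer of this payload might sanity-check it before use. The Python sketch below verifies value ranges and that `aggregateTier` equals the most severe per-state `tier`. The field names come from the schema above; `check_uncertainty` and its message strings are hypothetical, and the sample payload is trimmed to the fields the check reads.

```python
import json

SEVERITY_ORDER = ["T4", "T3", "T2", "T1"]  # least to most severe


def check_uncertainty(doc: dict) -> list:
    """Return a list of consistency problems; an empty list means none found."""
    problems = []
    block = doc["uncertainty"]
    tiers = []
    for state in block["states"]:
        if not 0.0 <= state["entropy"] <= 1.0:
            problems.append(f'{state["code"]}: entropy out of range')
        tiers.append(state["tier"])
    if tiers:
        expected = max(tiers, key=SEVERITY_ORDER.index)
        if block["aggregateTier"] != expected:
            problems.append(f"aggregateTier should be {expected}")
    if not 0.0 <= block["riskScore"] <= 1.0:
        problems.append("riskScore out of range")
    return problems


payload = json.loads("""
{
  "uncertainty": {
    "states": [
      {"code": "U1", "entropy": 0.72, "tier": "T1"},
      {"code": "U4", "entropy": 1.0, "tier": "T1"}
    ],
    "aggregateTier": "T1",
    "riskScore": 0.717
  }
}
""")
print(check_uncertainty(payload))  # []
```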
|
||||
|
||||
Projections recompute `riskScore` deterministically, and the event log provides an audit trail showing when/why entropy changed.
|
||||
|
||||
---
|
||||
|
||||
## 7. Action hints (per state)
|
||||
## 7. Implementation Pointers
|
||||
|
||||
| Code | Suggested remediation |
|------|-----------------------|
| `U1` | Upload PDBs/dSYM files, enable symbolizer connectors, attach runtime probes (EventPipe/JFR). |
| `U2` | Provide package overrides, ingest lockfiles, fix SBOM generator metadata. |
| `U3` | Obtain signed CSAF/OSV evidence, verify via Excitors connectors, or mark trust overrides in policy. |
|
||||
|
||||
### 8. Unknowns registry tie-in
|
||||
|
||||
Unresolved identities and missing edges should be recorded as Unknowns (see `docs/signals/unknowns-registry.md`). Signals scoring may add an `unknowns_pressure` term when density of unresolved items is high near entrypoints; Policy and UI should surface these records so operators can close the gaps rather than hiding the uncertainty.
|
||||
|
||||
Keep this file updated as new states (U4+) or tooling hooks land. Link additional guides (symbol upload, purl overrides) once available.
|
||||
- **Tier calculation:** `UncertaintyTierCalculator` in `src/Signals/StellaOps.Signals/Services/`
- **Risk score math:** `ReachabilityScoringService.ComputeRiskScore()` (extend existing)
- **Policy integration:** `docs/reachability/policy-gate.md` for gate rules
- **Lattice integration:** `docs/reachability/lattice.md` §9 for v1 lattice states
|
||||
|
||||