Here’s a simple, *actionable* way to keep “unknowns” from piling up in Stella Ops: rank them by **how risky they might be** and **how widely they could spread**—then let Scheduler auto-recheck or escalate based on that score.

---

# Unknowns Triage: a lightweight, high-leverage scheme

**Goal:** decide which “Unknown” findings (no proof yet; inconclusive reachability; unparsed advisory; mismatched version; missing evidence) to re-scan first or route into VEX escalation—without waiting for perfect certainty.

## 1) Define the score

Score each Unknown `U` with a weighted sum (normalize each input to 0–1):

* **Component popularity (P):** how many distinct workloads/images depend on this package (direct + transitive). *Proxy:* in-degree or deployment count across environments.
* **CVSS uncertainty (C):** how fuzzy the risk is (e.g., missing vector, version ranges like `<=`, vendor ambiguity). *Proxy:* 1 − certainty; higher = less certain, more dangerous to ignore.
* **Graph centrality (G):** how “hub-like” the component is in your dependency graph. *Proxy:* normalized betweenness/degree centrality in your SBOM DAG.

**TriageScore(U) = wP·P + wC·C + wG·G**, with default weights: `wP=0.4, wC=0.35, wG=0.25`.

**Thresholds (tuneable):**

* `≥ 0.70` → **Hot**: immediate rescan + VEX escalation job
* `0.40–0.69` → **Warm**: schedule rescan within 24–48h
* `< 0.40` → **Cold**: batch into weekly sweep

## 2) Minimal schema (Postgres or Mongo) to support it

* `unknowns(id, pkg_id, version, source, first_seen, last_seen, certainty, evidence_hash, status)`
* `deploy_refs(pkg_id, image_id, env, first_seen, last_seen)` → compute **popularity P**
* `graph_metrics(pkg_id, degree_c, betweenness_c, last_calc_at)` → compute **centrality G**
* `advisory_gaps(pkg_id, missing_fields[], has_range_version, vendor_mismatch)` → compute **uncertainty C**

> Store `triage_score`, `triage_band` on write so Scheduler can act without recomputing everything.

## 3) Fast heuristics to fill inputs

* **P (popularity):** `P = min(1, log10(1 + deployments)/log10(1 + 100))`
* **C (uncertainty):** start at 0; +0.3 if version range, +0.2 if vendor mismatch, +0.2 if missing CVSS vector, +0.2 if evidence stale (>7d), cap at 1.0
* **G (centrality):** precompute on the SBOM DAG nightly; normalize to [0,1]

## 4) Scheduler rules (UnknownsRegistry → jobs)

* On `unknowns.upsert`:
  * compute (P,C,G) → `triage_score`
  * if **Hot** → enqueue:
    * **Deterministic rescan** (fresh feeds + strict lattice)
    * **VEX escalation** (Excititor) with context pack (SBOM slice, provenance, last evidence)
  * if **Warm** → enqueue rescan with jitter (spread load)
  * if **Cold** → tag for weekly batch
* Backoff: if the same Unknown stays **Hot** after N attempts, widen evidence (alternate feeds, secondary matcher, vendor OVAL, NVD mirror) and alert.

## 5) Operator-visible UX (DevOps-friendly)

* Unknowns list: columns = pkg@ver, deployments, centrality, uncertainty flags, last evidence age, **score badge** (Hot/Warm/Cold), **Next action** chip.
* Side panel: show *why* the score is high (P/C/G sub-scores) + scheduled jobs and last outcomes.
* Bulk actions: “Recompute scores”, “Force VEX escalation”, “De-dupe aliases”.

## 6) Guardrails to keep it deterministic

* Record the **inputs + weights + feed hashes** in the scan manifest (your “replay” object).
* Any change to weights or heuristics → new policy version in the manifest; old runs remain replayable.
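To make the §3 heuristics concrete, a minimal C# sketch; the class and method names are illustrative (not existing Stella Ops APIs) and the constants mirror the defaults above. Centrality (G) is assumed to come from the nightly precomputation and is simply read back already normalized.

```csharp
// Minimal sketch of the §3 input heuristics (illustrative names, default constants).
public static class TriageInputs
{
    // P: popularity from deployment count, saturating around 100 deployments.
    public static double Popularity(int deployments) =>
        Math.Min(1.0, Math.Log10(1 + deployments) / Math.Log10(1 + 100));

    // C: uncertainty accumulated from explicit flags, capped at 1.0.
    public static double Uncertainty(bool hasVersionRange, bool vendorMismatch,
                                     bool missingCvssVector, bool evidenceStale) =>
        Math.Min(1.0, (hasVersionRange ? 0.3 : 0.0)
                    + (vendorMismatch ? 0.2 : 0.0)
                    + (missingCvssVector ? 0.2 : 0.0)
                    + (evidenceStale ? 0.2 : 0.0));
}
```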
## 7) Reference snippets

**SQL (Postgres) — compute and persist score:**

```sql
-- Recompute the weighted triage score and band for all open Unknowns.
update unknowns u
set triage_score = least(1, 0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g),
    triage_band = case
        when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.70 then 'HOT'
        when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.40 then 'WARM'
        else 'COLD'
    end,
    last_scored_at = now()
where u.status = 'OPEN';
```

**C# (Common) — score helper:**

```csharp
// Weighted triage score, clamped to [0,1], mapped to Hot/Warm/Cold bands.
public static (double score, string band) Score(double p, double c, double g,
    double wP = 0.4, double wC = 0.35, double wG = 0.25)
{
    var s = Math.Min(1.0, wP*p + wC*c + wG*g);
    var band = s >= 0.70 ? "HOT" : s >= 0.40 ? "WARM" : "COLD";
    return (s, band);
}
```

## 8) Where this plugs into Stella Ops

* **Scanner.WebService**: writes Unknowns with raw flags (range-version, vector missing, vendor mismatch).
* **UnknownsRegistry**: computes P/C/G, persists triage fields, emits `Unknown.Triaged`.
* **Scheduler**: listens → enqueues **Rescan** / **VEX Escalation** with jitter/backoff.
* **Excititor (VEX)**: builds vendor-merge proof or raises “Unresolvable” with rationale.
* **Authority**: records policy version + weights in replay manifest.

---

If you want, I can drop in a ready-to-use `UnknownsRegistry` table DDL + EF Core 9 model and a tiny Scheduler job that implements these thresholds.

Below is a complete, production-grade **developer guideline for Ranking Unknowns in Reachability Graphs** inside **Stella Ops**. It fits the existing architectural rules (scanner = origin of truth, Concelier/Vexer = prune-preservers, Authority = replay manifest owner, Scheduler = executor).

These guidelines cover:

1. Definitions
2. Ranking dimensions
3. Deterministic scoring formula
4. Evidence capture
5. Scheduler policies
6. UX and API rules
7. Testing rules and golden fixtures

---

# Stella Ops Developer Guidelines
# Ranking Unknowns in Reachability Graphs

## 0. Purpose

An **Unknown** is any vulnerability-like record where **reachability**, **affectability**, or **evidence linkage** cannot yet be proved true or false.

We rank Unknowns to:

1. Prioritize rescans
2. Trigger VEX escalation
3. Guide operators in constrained time windows
4. Maintain deterministic behaviour under replay manifests
5. Avoid non-deterministic or “probabilistic” security decisions

Unknown ranking **never declares security state**. It determines **the order of proof acquisition**.

---

# 1. Formal Definition of “Unknown”

A record is classified as **Unknown** if one or more of the following is true:

1. **Dependency Reachability Unproven**
   * Graph traversal exists but is not validated by call-graph/rule-graph evidence.
   * A downstream node is reachable but no execution path has sufficient evidence.
2. **Version Semantics Uncertain**
   * Advisory reports `<=`, `<`, `>=`, version ranges, or ambiguous pseudo-versions.
   * Normalized version mapping disagrees between data sources.
3. **Component Provenance Uncertain**
   * Package cannot be deterministically linked to its SBOM node (name-alias confusion, epoch mismatch, distro backport case).
4. **Missing/Contradictory Evidence**
   * Feeds disagree; vendor VEX differs from NVD; the OSS index is missing a CVSS vector; environment evidence is incomplete.
5. **Weak Graph Anchoring**
   * Node exists but cannot be anchored to a layer digest or artifact hash (common in scratch/base images and badly packaged libs).

Unknowns **must be stored with explicit flags**—not as a collapsed bucket.
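As a minimal sketch of what “explicit flags” could look like as a .NET type: the enum and its members are illustrative, derived from the five categories above, and not the actual Stella Ops model.

```csharp
// Illustrative sketch: one bit per uncertainty category from §1, so Unknowns
// are never collapsed into a single opaque "unknown" bucket.
[Flags]
public enum UnknownFlags
{
    None                      = 0,
    ReachabilityUnproven      = 1 << 0, // graph path exists, call-graph evidence missing
    VersionSemanticsUncertain = 1 << 1, // ranges, pseudo-versions, source disagreement
    ProvenanceUncertain       = 1 << 2, // alias/epoch/backport linkage not deterministic
    ContradictoryEvidence     = 1 << 3, // feeds disagree or required fields missing
    WeakGraphAnchor           = 1 << 4  // no layer digest / artifact hash anchor
}
```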
---

# 2. Dimensions for Ranking Unknowns

Each Unknown is ranked along **five deterministic axes**:

### 2.1 Popularity Impact (P)

How broadly the component is used across workloads.

Evidence sources:

* SBOM deployment graph
* Workload registry
* Layer-to-package index

Compute: `P = normalized log(deployment_count)`.

### 2.2 Exploit Consequence Potential (E)

Not risk: the consequence if the Unknown turns out to be an actual vulnerability.

Compute from:

* Maximum CVSS across feeds
* CWE category weight
* Vendor “criticality marker” if present
* If CVSS missing → use CWE fallback → mark uncertainty penalty.

### 2.3 Uncertainty Density (U)

How much is missing or contradictory.

Flags (examples):

* version_range → +0.25
* missing_vector → +0.15
* conflicting_feeds → +0.20
* no provenance anchor → +0.30
* unreachable source advisory → +0.10

U ∈ [0, 1].

### 2.4 Graph Centrality (C)

Is this component a structural hub?

Use:

* In-degree
* Out-degree
* Betweenness centrality

Normalize per artifact type.

### 2.5 Evidence Staleness (S)

Age of the last successful evidence pull.

Decay function: `S = min(1, age_days / 14)`.

---

# 3. Deterministic Ranking Score

All Unknowns get a reproducible score under the replay manifest:

```
Score = clamp01( wP·P + wE·E + wU·U + wC·C + wS·S )
```

Default recommended weights:

```
wP = 0.25   (deployment impact)
wE = 0.25   (potential consequence)
wU = 0.25   (uncertainty density)
wC = 0.15   (graph centrality)
wS = 0.10   (evidence staleness)
```

The manifest must record:

* weights
* transform functions
* normalization rules
* feed hashes
* evidence hashes

Thus the ranking is replayable bit-for-bit.

---

# 4. Ranking Bands

After computing Score:

* **Hot (Score ≥ 0.70)**
  Immediate rescan, VEX escalation, widen evidence sources.
* **Warm (0.40 ≤ Score < 0.70)**
  Scheduled rescan, no escalation yet.
* **Cold (Score < 0.40)**
  Batch weekly; suppressed from UI noise except on request.

Band assignment must be stored explicitly.

---

# 5. Evidence Capture Requirements

Every Unknown must persist:

1. **UnknownFlags[]** – all uncertainty flags
2. **GraphSliceHash** – deterministic hash of dependents/ancestors
3. **EvidenceSetHash** – hashes of advisories, vendor VEXes, feed extracts
4. **NormalizationTrace** – version normalization decision path
5. **CallGraphAttemptHash** – even if incomplete
6. **PackageMatchTrace** – exact match reasoning (name, epoch, distro backport heuristics)

This allows Inspector/Authority to replay everything and prevents “ghost Unknowns” caused by environment drift.

---

# 6. Scheduler Policies

### 6.1 On Unknown Created

Scheduler receives event: `Unknown.Created`.

Decision matrix:

| Condition       | Action                                |
| --------------- | ------------------------------------- |
| Score ≥ 0.70    | Immediate Rescan + VEX Escalation job |
| Score 0.40–0.69 | Queue rescan within 12–72h (jitter)   |
| Score < 0.40    | Add to weekly batch                   |

### 6.2 On Unknown Unchanged after N rescans

If N = 3 consecutive runs yield the same UnknownFlags:

* Force alternate feeds (mirror, vendor direct)
* Run Excititor (VEX) with the full provenance pack
* If still unresolved → emit `Unknown.Unresolvable` event (not an error; a state)

### 6.3 Failure Recovery

If fetch/feed errors occur → the Unknown transitions to `Unknown.EvidenceFailed`.

This must raise S (staleness) on the next compute.

---

# 7. Scanner Implementation Guidelines (.NET 10)

### 7.1 Ranking Computation Location

Ranking is computed inside **scanner.webservice** immediately after Unknown classification.

Concelier/Vexer must **not** touch ranking logic.
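A minimal sketch of that computation as it might sit in scanner.webservice, combining the §2.5 staleness decay, the §3 weights, and the §4 bands; type and member names are illustrative, not the actual Stella Ops code.

```csharp
// Illustrative sketch of the §3 score and §4 band assignment. In the real
// pipeline the weights would come from the replay manifest, not constants.
public static class UnknownRanking
{
    // S: evidence staleness decay from §2.5.
    public static double Staleness(double ageDays) => Math.Min(1.0, ageDays / 14.0);

    public static (double Score, string Band) Rank(
        double p, double e, double u, double c, double s,
        double wP = 0.25, double wE = 0.25, double wU = 0.25,
        double wC = 0.15, double wS = 0.10)
    {
        var score = Math.Clamp(wP * p + wE * e + wU * u + wC * c + wS * s, 0.0, 1.0);
        var band = score >= 0.70 ? "HOT" : score >= 0.40 ? "WARM" : "COLD";
        return (score, band);
    }
}
```

Keeping the weights as explicit parameters makes it straightforward to feed manifest-declared values and to pin golden fixtures against frozen inputs.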
### 7.2 Graph Metrics Service

Maintain a cached daily calculation of centrality metrics so they are not recomputed on every scan.

### 7.3 Compute Path

```
1. Build evidence set
2. Classify UnknownFlags
3. Compute P, E, U, C, S
4. Compute Score
5. Assign Band
6. Persist UnknownRecord
7. Emit Unknown.Triaged event
```

### 7.4 Storage Schema (Postgres)

Fields required:

```
unknown_id              PK
pkg_id
pkg_version
digest_anchor
unknown_flags           jsonb
popularity_p            float
potential_e             float
uncertainty_u           float
centrality_c            float
staleness_s             float
score                   float
band                    enum
graph_slice_hash        bytea
evidence_set_hash       bytea
normalization_trace     jsonb
callgraph_attempt_hash  bytea
created_at, updated_at
```

---

# 8. API and UX Guidelines

### 8.1 Operator UI

For every Unknown, show:

* Score badge (Hot/Warm/Cold)
* Sub-component contributions (P/E/U/C/S)
* Flags list
* Evidence age
* Scheduled next action
* History graph of score evolution

### 8.2 Filters

Operators may filter by:

* High P (impactful components)
* High U (ambiguous advisories)
* High S (stale data)
* High C (graph hubs)

### 8.3 Reasoning Transparency

The UI must show *exactly why* the ranking is high. No hidden heuristics.

---

# 9. Unit Testing & Golden Fixtures

### 9.1 Golden Unknown Cases

Provide frozen fixtures for:

* Version range ambiguity
* Mismatched epoch/backport
* Missing vector
* Conflicting severity between vendor/NVD
* Unanchored filesystem library

Each fixture stores the expected:

* Flags
* P/E/U/C/S
* Score
* Band

### 9.2 Replay Manifest Tests

Given a manifest containing:

* feed hashes
* rules version
* normalization logic
* lattice rules (for the overall system)

ensure the ranking recomputes identically.

---

# 10. Developer Checklist (must be followed)

1. Did I persist all traces needed for deterministic replay?
2. Does ranking depend only on manifest-declared parameters (not environment)?
3. Are all uncertainty factors explicit flags, never inferred fuzzily?
4. Is the scoring reproducible under identical inputs?
5. Is the Scheduler decision table deterministic and exhaustively tested?
6. Does the API expose full reasoning without hiding rules?

---

If you want, I can now produce:

1. **A full Postgres DDL** for Unknowns.
2. **A .NET 10 service class** for ranking calculation.
3. **A golden test suite** with 20 fixtures.
4. **UI wireframe** for Unknown triage screen.

Which one should I generate?
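To illustrate the golden-fixture idea in §9.1, a minimal xUnit sketch; the fixture values are hypothetical and it reuses the illustrative `UnknownRanking` helper shown under §7.1, not the actual Stella Ops test suite.

```csharp
using Xunit;

// Hypothetical golden-fixture test: frozen inputs for a "version range ambiguity"
// case and the expected score/band under the default §3 weights.
public class UnknownRankingGoldenTests
{
    [Theory]
    // p, e, u, c, s, expected score, expected band
    [InlineData(0.80, 0.60, 0.50, 0.40, 0.20, 0.555, "WARM")]
    public void Score_and_band_match_frozen_fixture(
        double p, double e, double u, double c, double s,
        double expectedScore, string expectedBand)
    {
        var (score, band) = UnknownRanking.Rank(p, e, u, c, s);

        Assert.Equal(expectedScore, score, precision: 3);
        Assert.Equal(expectedBand, band);
    }
}
```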