Here’s a simple, actionable way to keep “unknowns” from piling up in Stella Ops: rank them by how risky they might be and how widely they could spread—then let Scheduler auto‑recheck or escalate based on that score.
Unknowns Triage: a lightweight, high‑leverage scheme
Goal: decide which “Unknown” findings (no proof yet; inconclusive reachability; unparsed advisory; mismatched version; missing evidence) to re‑scan first or route into VEX escalation—without waiting for perfect certainty.
1) Define the score
Score each Unknown U with a weighted sum (normalize each input to 0–1):
- Component popularity (P): how many distinct workloads/images depend on this package (direct + transitive). Proxy: in‑degree or deployment count across environments.
- CVSS uncertainty (C): how fuzzy the risk is (e.g., missing vector, version ranges like `<=`, vendor ambiguity). Proxy: 1 − certainty; higher = less certain, more dangerous to ignore.
- Graph centrality (G): how “hub‑like” the component is in your dependency graph. Proxy: normalized betweenness/degree centrality in your SBOM DAG.
TriageScore(U) = wP·P + wC·C + wG·G, with default weights: wP=0.4, wC=0.35, wG=0.25.
Thresholds (tuneable):
- ≥ 0.70 → Hot: immediate rescan + VEX escalation job
- 0.40–0.69 → Warm: schedule rescan within 24–48h
- < 0.40 → Cold: batch into weekly sweep
2) Minimal schema (Postgres or Mongo) to support it
- `unknowns(id, pkg_id, version, source, first_seen, last_seen, certainty, evidence_hash, status)`
- `deploy_refs(pkg_id, image_id, env, first_seen, last_seen)` → compute popularity P
- `graph_metrics(pkg_id, degree_c, betweenness_c, last_calc_at)` → compute centrality G
- `advisory_gaps(pkg_id, missing_fields[], has_range_version, vendor_mismatch)` → compute uncertainty C

Store `triage_score` and `triage_band` on write so Scheduler can act without recomputing everything.
3) Fast heuristics to fill inputs
- P (popularity): P = min(1, log10(1 + deployments)/log10(1 + 100))
- C (uncertainty): start at 0; +0.3 if version range, +0.2 if vendor mismatch, +0.2 if missing CVSS vector, +0.2 if evidence stale (>7d); cap at 1.0
- G (centrality): precompute on SBOM DAG nightly; normalize to [0,1]
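These heuristics can be sketched as a small helper. A minimal sketch, assuming the flag set above; the class and method names are illustrative, not an existing Stella Ops API:

```csharp
using System;

public static class UnknownHeuristics
{
    // P: log-scaled deployment count, saturating at 100 deployments.
    public static double Popularity(int deployments) =>
        Math.Min(1.0, Math.Log10(1 + deployments) / Math.Log10(1 + 100));

    // C: additive uncertainty flags from section 3, capped at 1.0.
    public static double Uncertainty(bool versionRange, bool vendorMismatch,
                                     bool missingVector, bool evidenceStale) =>
        Math.Min(1.0, (versionRange ? 0.3 : 0.0)
                    + (vendorMismatch ? 0.2 : 0.0)
                    + (missingVector ? 0.2 : 0.0)
                    + (evidenceStale ? 0.2 : 0.0));
}
```

Note the saturation points are policy choices (100 deployments, cap at 1.0) and belong in the replay manifest like the weights.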
4) Scheduler rules (UnknownsRegistry → jobs)
- On `unknowns.upsert`:
  - compute (P, C, G) → `triage_score`
  - if Hot → enqueue:
    - Deterministic rescan (fresh feeds + strict lattice)
    - VEX escalation (Excititor) with context pack (SBOM slice, provenance, last evidence)
  - if Warm → enqueue rescan with jitter (spread load)
  - if Cold → tag for weekly batch
Backoff: if the same Unknown stays Hot after N attempts, widen evidence (alternate feeds, secondary matcher, vendor OVAL, NVD mirror) and alert.
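The dispatch rule above can be sketched as a pure function from band to action. The job names are placeholders for illustration, not the real Scheduler queue names:

```csharp
using System;

public static class TriageDispatch
{
    // Maps a triage band to the scheduler action from section 4.
    // Action strings are illustrative placeholders.
    public static string NextAction(string band) => band switch
    {
        "HOT"  => "rescan+vex-escalation",   // immediate, fresh feeds + strict lattice
        "WARM" => "rescan(jitter:24-48h)",   // spread load across the window
        "COLD" => "weekly-batch",
        _      => throw new ArgumentException($"unknown band: {band}")
    };
}
```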
5) Operator‑visible UX (DevOps‑friendly)
- Unknowns list: columns = pkg@ver, deployments, centrality, uncertainty flags, last evidence age, score badge (Hot/Warm/Cold), Next action chip.
- Side panel: show why the score is high (P/C/G sub‑scores) + scheduled jobs and last outcomes.
- Bulk actions: “Recompute scores”, “Force VEX escalation”, “De‑dupe aliases”.
6) Guardrails to keep it deterministic
- Record the inputs + weights + feed hashes in the scan manifest (your “replay” object).
- Any change to weights or heuristics → new policy version in the manifest; old runs remain replayable.
7) Reference snippets
SQL (Postgres) — compute and persist score:
update unknowns u
set triage_score = least(1, 0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g),
triage_band = case
when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.70 then 'HOT'
when (0.4*u.popularity_p + 0.35*u.cvss_uncertainty_c + 0.25*u.graph_centrality_g) >= 0.40 then 'WARM'
else 'COLD'
end,
last_scored_at = now()
where u.status = 'OPEN';
C# (Common) — score helper:
public static (double score, string band) Score(double p, double c, double g,
double wP=0.4, double wC=0.35, double wG=0.25)
{
var s = Math.Min(1.0, wP*p + wC*c + wG*g);
var band = s >= 0.70 ? "HOT" : s >= 0.40 ? "WARM" : "COLD";
return (s, band);
}
8) Where this plugs into Stella Ops
- Scanner.WebService: writes Unknowns with raw flags (range‑version, vector missing, vendor mismatch).
- UnknownsRegistry: computes P/C/G, persists triage fields, emits `Unknown.Triaged`.
- Scheduler: listens → enqueues Rescan / VEX Escalation with jitter/backoff.
- Excititor (VEX): builds vendor‑merge proof or raises “Unresolvable” with rationale.
- Authority: records policy version + weights in replay manifest.
If you want, I can drop in a ready‑to‑use UnknownsRegistry table DDL + EF Core 9 model and a tiny Scheduler job that implements these thresholds.
Below is a complete, production-grade developer guideline for Ranking Unknowns in Reachability Graphs inside Stella Ops.
It fits the existing architectural rules (scanner = origin of truth, Concelier/Vexer = prune-preservers, Authority = replay manifest owner, Scheduler = executor).
These guidelines give:
- Definitions
- Ranking dimensions
- Deterministic scoring formula
- Evidence capture
- Scheduler policies
- UX and API rules
- Testing rules and golden fixtures
Stella Ops Developer Guidelines
Ranking Unknowns in Reachability Graphs
0. Purpose
An Unknown is any vulnerability-like record where reachability, affectability, or evidence linkage cannot yet be proved true or false. We rank Unknowns to:
- Prioritize rescans
- Trigger VEX escalation
- Guide operators in constrained time windows
- Maintain deterministic behaviour under replay manifests
- Avoid non-deterministic or “probabilistic” security decisions
Unknown ranking never declares security state. It determines the order of proof acquisition.
1. Formal Definition of “Unknown”
A record is classified as Unknown if one or more of the following is true:
- Dependency Reachability Unproven
  - Graph traversal exists but is not validated by call-graph/rule-graph evidence.
  - Downstream node is reachable but no execution path has sufficient evidence.
- Version Semantics Uncertain
  - Advisory reports `<=`, `<`, `>=`, version ranges, or ambiguous pseudo-versions.
  - Normalized version mapping disagrees between data sources.
- Component Provenance Uncertain
  - Package cannot be deterministically linked to its SBOM node (name-alias confusion, epoch mismatch, distro backport case).
- Missing/Contradictory Evidence
  - Feeds disagree; vendor VEX differs from NVD; OSS index has missing CVSS vector; environment evidence incomplete.
- Weak Graph Anchoring
  - Node exists but cannot be anchored to a layer digest or artifact hash (common in scratch/base images and badly packaged libs).
Unknowns must be stored with explicit flags—not as a collapsed bucket.
2. Dimensions for Ranking Unknowns
Each Unknown is ranked along five deterministic axes:
2.1 Popularity Impact (P)
How broadly the component is used across workloads.
Evidence sources:
- SBOM deployment graph
- Workload registry
- Layer-to-package index
Compute:
P = normalized log(deployment_count).
2.2 Exploit Consequence Potential (E)
Not risk. Consequence if the Unknown turns out to be an actual vulnerability.
Compute from:
- Maximum CVSS across feeds
- CWE category weight
- Vendor “criticality marker” if present
- If CVSS missing → use CWE fallback → mark uncertainty penalty.
2.3 Uncertainty Density (U)
How much is missing or contradictory.
Flags (examples):
- version_range → +0.25
- missing_vector → +0.15
- conflicting_feeds → +0.20
- no provenance anchor → +0.30
- unreachable source advisory → +0.10
U ∈ [0, 1].
2.4 Graph Centrality (C)
Is this component a structural hub?
Use:
- In-degree
- Out-degree
- Betweenness centrality
Normalize per artifact type.
2.5 Evidence Staleness (S)
Age of last successful evidence pull.
Decay function:
S = min(1, age_days / 14).
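The U and S dimensions can be sketched together as a small helper. The flag weights come from section 2.3; the flag strings and class shape are illustrative assumptions:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnknownDimensions
{
    // Flag weights from section 2.3 (flag names are illustrative).
    private static readonly Dictionary<string, double> FlagWeights = new()
    {
        ["version_range"] = 0.25,
        ["missing_vector"] = 0.15,
        ["conflicting_feeds"] = 0.20,
        ["no_provenance_anchor"] = 0.30,
        ["unreachable_source_advisory"] = 0.10,
    };

    // U: sum of the weights of all present flags, clamped to [0, 1].
    public static double Uncertainty(IEnumerable<string> flags) =>
        Math.Min(1.0, flags.Sum(f => FlagWeights.GetValueOrDefault(f, 0.0)));

    // S: linear staleness decay, saturating at 14 days.
    public static double Staleness(double ageDays) =>
        Math.Min(1.0, ageDays / 14.0);
}
```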
3. Deterministic Ranking Score
All Unknowns get a reproducible score under replay manifest:
Score = clamp01(wP·P + wE·E + wU·U + wC·C + wS·S)
Default recommended weights:
wP = 0.25 (deployment impact)
wE = 0.25 (potential consequence)
wU = 0.25 (uncertainty density)
wC = 0.15 (graph centrality)
wS = 0.10 (evidence staleness)
The manifest must record:
- weights
- transform functions
- normalization rules
- feed hashes
- evidence hashes
Thus the ranking is replayable bit-for-bit.
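Because every parameter is manifest-declared, the score reduces to a pure function of the weight set and the five inputs. A minimal sketch; the `Weights` record is a hypothetical shape, not an existing type:

```csharp
using System;

// Hypothetical manifest-declared weight set; defaults match the text above.
public record Weights(double P = 0.25, double E = 0.25, double U = 0.25,
                      double C = 0.15, double S = 0.10);

public static class UnknownScore
{
    // Deterministic: same weights + same inputs => bit-identical score.
    public static double Compute(Weights w, double p, double e, double u, double c, double s)
    {
        var score = w.P * p + w.E * e + w.U * u + w.C * c + w.S * s;
        return Math.Clamp(score, 0.0, 1.0); // clamp01
    }
}
```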
4. Ranking Bands
After computing Score:
- Hot (Score ≥ 0.70): immediate rescan, VEX escalation, widen evidence sources.
- Warm (0.40 ≤ Score < 0.70): scheduled rescan, no escalation yet.
- Cold (Score < 0.40): batch weekly; suppressed from UI noise except on request.
Band assignment must be stored explicitly.
5. Evidence Capture Requirements
Every Unknown must persist:
- UnknownFlags[] – all uncertainty flags
- GraphSliceHash – deterministic hash of dependents/ancestors
- EvidenceSetHash – hashes of advisories, vendor VEXes, feed extracts
- NormalizationTrace – version normalization decision path
- CallGraphAttemptHash – even if incomplete
- PackageMatchTrace – exact match reasoning (name, epoch, distro backport heuristics)
This allows Inspector/Authority to replay everything and prevents “ghost Unknowns” caused by environment drift.
6. Scheduler Policies
6.1 On Unknown Created
Scheduler receives event: Unknown.Created.
Decision matrix:
| Condition | Action |
|---|---|
| Score ≥ 0.70 | Immediate Rescan + VEX Escalation job |
| Score 0.40–0.69 | Queue rescan within 12–72h (jitter) |
| Score < 0.40 | Add to weekly batch |
6.2 On Unknown Unchanged after N rescans
If N = 3 consecutive runs with same UnknownFlags:
- Force alternate feeds (mirror, vendor direct)
- Run VEX escalation (Excititor) with full provenance pack
- If still unresolved → emit `Unknown.Unresolvable` event (not an error; a state)
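The N-rescan rule can be sketched as a pure decision helper. A sketch only; the action strings and class shape are placeholders, not the real Scheduler contract:

```csharp
public static class UnknownBackoff
{
    // N from section 6.2: consecutive rescans with identical UnknownFlags.
    public const int MaxUnchangedRescans = 3;

    // Returns the next action after a rescan that left UnknownFlags unchanged.
    public static string OnUnchanged(int unchangedRuns) =>
        unchangedRuns >= MaxUnchangedRescans
            ? "widen-evidence+vex-escalation"  // alternate feeds, full provenance pack
            : "rescan";
}
```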
6.3 Failure Recovery
If fetch/feed errors occur, the Unknown transitions to `Unknown.EvidenceFailed`.
This must raise S (staleness) on the next compute.
7. Scanner Implementation Guidelines (.NET 10)
7.1 Ranking Computation Location
Ranking is computed inside scanner.webservice immediately after Unknown classification. Concelier/Vexer must not touch ranking logic.
7.2 Graph Metrics Service
Maintain a cached daily calculation of centrality metrics to prevent per-scan recomputation cost explosion.
7.3 Compute Path
1. Build evidence set
2. Classify UnknownFlags
3. Compute P, E, U, C, S
4. Compute Score
5. Assign Band
6. Persist UnknownRecord
7. Emit Unknown.Triaged event
7.4 Storage Schema (Postgres)
Fields required:
unknown_id PK
pkg_id
pkg_version
digest_anchor
unknown_flags jsonb
popularity_p float
potential_e float
uncertainty_u float
centrality_c float
staleness_s float
score float
band enum
graph_slice_hash bytea
evidence_set_hash bytea
normalization_trace jsonb
callgraph_attempt_hash bytea
created_at, updated_at
8. API and UX Guidelines
8.1 Operator UI
For every Unknown:
- Score badge (Hot/Warm/Cold)
- Sub-component contributions (P/E/U/C/S)
- Flags list
- Evidence age
- Scheduled next action
- History graph of score evolution
8.2 Filters
Operators may filter by:
- High P (impactful components)
- High U (ambiguous advisories)
- High S (stale data)
- High C (graph hubs)
8.3 Reasoning Transparency
UI must show exactly why the ranking is high. No hidden heuristics.
9. Unit Testing & Golden Fixtures
9.1 Golden Unknown Cases
Provide frozen fixtures for:
- Version range ambiguity
- Mismatched epoch/backport
- Missing vector
- Conflicting severity between vendor/NVD
- Unanchored filesystem library
Each fixture stores expected:
- Flags
- P/E/U/C/S
- Score
- Band
9.2 Replay Manifest Tests
Given a manifest containing:
- feed hashes
- rules version
- normalization logic
- lattice rules (for overall system)
Ensure ranking recomputes identically.
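One way to assert this: rerun the ranking function repeatedly under fixed inputs and require exact (not approximate) equality. A sketch of the harness shape; the types are illustrative:

```csharp
using System;

// Determinism check for section 9.2: the same manifest-declared inputs
// must yield a bit-identical score on every replay.
public static class ReplayCheck
{
    public static bool IsReplayable(Func<double> rank)
    {
        var first = rank();
        for (int i = 0; i < 100; i++)
            if (rank() != first) return false; // exact comparison, no tolerance
        return true;
    }
}
```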
10. Developer Checklist (must be followed)
- Did I persist all traces needed for deterministic replay?
- Does ranking depend only on manifest-declared parameters (not environment)?
- Are all uncertainty factors explicit flags, never inferred fuzzily?
- Is the scoring reproducible under identical inputs?
- Is Scheduler decision table deterministic and exhaustively tested?
- Does API expose full reasoning without hiding rules?
If you want, I can now produce:
- A full Postgres DDL for Unknowns.
- A .NET 10 service class for ranking calculation.
- A golden test suite with 20 fixtures.
- UI wireframe for Unknown triage screen.
Which one should I generate?