Triage and Unknowns Technical Reference

Source Advisories:

  • 30-Nov-2025 - Unknowns Decay & Triage Heuristics
  • 14-Dec-2025 - Dissect triage and evidence workflows
  • 04-Dec-2025 - Ranking Unknowns in Reachability Graphs

Last Updated: 2025-12-14


1. EVIDENCE-FIRST PRINCIPLES

  1. Evidence before detail: Opening alert shows best available evidence immediately
  2. Fast first signal: UI renders credible "first signal" quickly
  3. Determinism reduces hesitation: Sorting, graphs, diffs stable across refreshes
  4. Offline by design: If evidence exists locally, render without network
  5. Audit-ready by default: Every decision reproducible, attributable, exportable

2. MINIMAL EVIDENCE BUNDLE (PER FINDING)

  1. Reachability proof: Function-level path or package-level import chain
  2. Call-stack snippet: 5–10 frames around sink/source with file:line anchors
  3. Provenance: Attestation/DSSE + build ancestry (image → layer → artifact → commit)
  4. VEX/CSAF status: affected/not-affected/under-investigation + reason
  5. Diff: SBOM or VEX delta since last scan (smart-diff)

3. KPIS

3.1 TTFS (Time-to-First-Signal)

Definition: p50/p95 elapsed time from alert creation to first rendered evidence

Target: p95 < 1.5s (with 100ms RTT, 1% loss)
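
As an illustration of how the p50/p95 figures could be derived from collected TTFS samples, here is a minimal sketch. It assumes the nearest-rank percentile method and samples expressed in milliseconds; the function names are hypothetical and not part of the advisory.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def ttfs_summary(samples_ms, p95_target_ms=1500):
    """Summarise TTFS samples and compare p95 against the 1.5s target."""
    p50 = percentile(samples_ms, 50)
    p95 = percentile(samples_ms, 95)
    return {"p50_ms": p50, "p95_ms": p95, "meets_target": p95 < p95_target_ms}
```

Other percentile definitions (for example, linear interpolation) give slightly different values on small sample sets, so the method should be fixed once and kept stable across reports.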

3.2 Clicks-to-Closure

Definition: Median interactions per decision type

Target: median < 6 clicks

3.3 Evidence Completeness Score

Definition: Score of 0–4, one point per evidence type present (reachability, call-stack, provenance, VEX)

Target: ≥90% of decisions include all evidence + reason + replay token

3.4 Offline Friendliness

Definition: % evidence resolvable with no network

Target: ≥95% with local bundle

3.5 Audit Log Completeness

Requirement: Every decision has evidence hash set, actor, policy context, replay token

4. KEYBOARD SHORTCUTS

  • J: Jump to first incomplete evidence pane
  • Y: Copy DSSE (attestation block or Rekor entry ref)
  • R: Toggle reachability view (path list ↔ compact graph ↔ textual proof)
  • /: Search within graph (node/func/package)
  • S: Deterministic sort (reachability→severity→age→component)
  • A, N, U: Quick VEX set (Affected / Not-affected / Under-investigation)
  • ?: Keyboard help overlay
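
The deterministic sort behind the S shortcut could be expressed as a single composite key. The sketch below is one possible reading: it assumes reachable findings sort first, higher severity first, older alerts first, and component names ascending, with the alert id as a final tie-breaker so that equal rows never swap places. The rank tables and the `Alert` type are hypothetical.

```python
from dataclasses import dataclass

# Assumed orderings; the advisory names the sort dimensions but not their values.
REACHABILITY_RANK = {"reachable": 0, "unknown": 1, "unreachable": 2}
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

@dataclass(frozen=True)
class Alert:
    alert_id: str
    reachability: str
    severity: str
    age_days: int
    component: str

def sort_key(alert):
    """reachability -> severity -> age (oldest first) -> component -> id."""
    return (
        REACHABILITY_RANK[alert.reachability],
        SEVERITY_RANK[alert.severity],
        -alert.age_days,
        alert.component,
        alert.alert_id,
    )

def deterministic_sort(alerts):
    return sorted(alerts, key=sort_key)
```

Because the key ends in a unique id, the result does not depend on the order in which alerts arrive from the backend.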

5. UX FLOW

5.1 Alert Row

  • TTFS timer
  • Reachability badge
  • Decision state
  • Diff-dot

5.2 Open Alert → Evidence Tab (Not Details)

Top strip: 3 proof pills (Reachability ✓ / Call-stack ✓ / Provenance ✓)

Click to expand inline

5.3 Decision Drawer (Pinned Right)

  • VEX/CSAF radio (A/N/U)
  • Reason presets → "Record decision"
  • Audit-ready summary (hashes, timestamps, policy)

5.4 Diff Tab

SBOM/VEX delta, grouped by "meaningful risk shift"

5.5 Activity Tab

Immutable audit log; export as signed bundle

6. GRAPH PERFORMANCE (LARGE CALL GRAPHS)

6.1 Minimal-Latency Snapshots

  • Pre-render static PNG/SVG thumbnails server-side

6.2 Progressive Neighborhood Expansion

  • Load 1-hop first, expand on demand
  • First signal (TTFS) < 500ms

6.3 Stable Node Ordering

  • Deterministic layout with consistent anchors

6.4 Chunked Graph Edges

  • Capped fan-out
  • Collapse identical library paths into reachability macro-edge

Targets:

  • Preview < 300ms
  • Interactive hydration < 2.0s for large graphs
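
To make the combination of 1-hop loading, stable ordering, and capped fan-out concrete, here is a small sketch. It assumes the call graph is available as an adjacency mapping and that lexical order is an acceptable stable order; the `one_hop` function and the `collapsed` counter are hypothetical.

```python
def one_hop(graph, node, max_fanout=3):
    """Return the node's direct callees in a stable order, capped at max_fanout.

    Neighbours beyond the cap are not dropped silently: their count is
    reported so the UI can render a single "N more" macro-edge.
    """
    neighbours = sorted(graph.get(node, ()))
    shown = neighbours[:max_fanout]
    return {"node": node, "edges": shown, "collapsed": len(neighbours) - len(shown)}
```

Calling `one_hop` again on any returned edge gives the next ring of the neighbourhood, which keeps each request small and the layout order repeatable.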

7. OFFLINE DESIGN

7.1 Local Evidence Cache

Store SBOM slices, path proofs, DSSE attestations, and compiled call-stacks in a signed bundle beside the SARIF/VEX files

7.2 Deferred Enrichment

Mark fields needing internet; queue background "enricher" when network returns

7.3 Predictable Fallbacks

Show embedded DSSE + "verification pending" if provenance server missing

8. AUDIT & REPLAY

8.1 Deterministic Replay Token

replay_token = hash(feed_manifests + rules + lattice_policy + inputs)
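
A minimal sketch of the formula above, assuming SHA-256 as the hash and canonical JSON (sorted keys, fixed separators) as the serialization. Neither choice is specified by the advisory, and the `replay_token` signature is hypothetical.

```python
import hashlib
import json

def replay_token(feed_manifests, rules, lattice_policy, inputs):
    """Hash the four input groups in a canonical form so equal inputs give equal tokens."""
    payload = {
        "feed_manifests": feed_manifests,
        "rules": rules,
        "lattice_policy": lattice_policy,
        "inputs": inputs,
    }
    # sort_keys removes dict-ordering differences; list order is still significant.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Labeling each group by name, rather than concatenating raw values, avoids collisions where content shifts from one group to another.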

8.2 One-Click "Reproduce"

CLI snippet pinned to exact versions and policies

8.3 Evidence Hash-Set

Content-address each proof artifact; audit entry stores hashes + signer
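
Content addressing could look like the following sketch, which assumes SHA-256 digests with a `sha256:` prefix, matching the prefix used in the evidence payload. The function names and the shape of the audit entry are illustrative only.

```python
import hashlib

def content_address(artifact_bytes):
    """Derive an artifact's address from its bytes alone."""
    return "sha256:" + hashlib.sha256(artifact_bytes).hexdigest()

def audit_entry(artifacts, signer):
    """Build the hash set for an audit entry.

    artifacts maps a name to raw bytes. Duplicates collapse to one hash and
    the result is sorted so the entry is identical across runs.
    """
    hashes = sorted({content_address(data) for data in artifacts.values()})
    return {"evidence_hashes": hashes, "signer": signer}
```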

9. TELEMETRY IMPLEMENTATION

ttfs.start  (alert creation)
ttfs.signal  (first evidence card paint)
close.clicks  (decision recorded)

Log evidence bitset (reach, stack, prov, vex) at decision time
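
One way to encode the evidence bitset is a flag enum, from which the Evidence Completeness Score of section 3.3 follows as the number of set bits. The bit assignments below are an assumption; the advisory only fixes the four members.

```python
from enum import IntFlag

class Evidence(IntFlag):
    # Bit positions are hypothetical; only the four names come from the text.
    REACH = 1
    STACK = 2
    PROV = 4
    VEX = 8

def evidence_bitset(reach, stack, prov, vex):
    """Pack four presence booleans into one integer for the telemetry event."""
    bits = Evidence(0)
    if reach:
        bits |= Evidence.REACH
    if stack:
        bits |= Evidence.STACK
    if prov:
        bits |= Evidence.PROV
    if vex:
        bits |= Evidence.VEX
    return bits

def completeness_score(bits):
    """Count of evidence types present, from 0 to 4."""
    return bin(int(bits)).count("1")
```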

10. API REQUIREMENTS

10.1 Endpoints

GET /alerts?filters… → list view
GET /alerts/{id}/evidence → evidence payload (reachability, call stack, provenance, hashes)
POST /alerts/{id}/decisions → record decision event (append-only)
GET /alerts/{id}/audit → audit timeline
GET /alerts/{id}/diff?baseline=… → SBOM/VEX diff
GET /bundles/{id}, POST /bundles/verify → offline bundle download/verify

10.2 Evidence Payload Schema

{
  "alert_id": "a123",
  "reachability": { "status": "available|loading|unavailable|error", "hash": "sha256:…", "proof": {...} },
  "callstack": { "status": "...", "hash": "...", "frames": [...] },
  "provenance": { "status": "...", "hash": "...", "dsse": {...} },
  "vex": { "status": "...", "current": {...}, "history": [...] },
  "hashes": ["sha256:…", ...]
}

Guidelines:

  • Deterministic ordering for arrays and nodes
  • Explicit status per evidence section
  • Include hash per artifact
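
The guidelines can be checked mechanically. The sketch below assumes every section uses the four status values spelled out for `reachability` (the schema elides them for the other sections) and that a hash is only required once a section is `available`. The validator itself is hypothetical.

```python
ALLOWED_STATUS = {"available", "loading", "unavailable", "error"}
HASHED_SECTIONS = ("reachability", "callstack", "provenance")  # vex has no hash field in the schema
ALL_SECTIONS = HASHED_SECTIONS + ("vex",)

def validate_evidence_payload(payload):
    """Return a list of problems; an empty list means the payload follows the guidelines."""
    problems = []
    for name in ALL_SECTIONS:
        section = payload.get(name)
        if not isinstance(section, dict):
            problems.append(f"{name}: section missing")
            continue
        status = section.get("status")
        if status not in ALLOWED_STATUS:
            problems.append(f"{name}: invalid status {status!r}")
        elif status == "available" and name in HASHED_SECTIONS:
            if not str(section.get("hash", "")).startswith("sha256:"):
                problems.append(f"{name}: available but hash missing")
    return problems
```

Sections are checked in a fixed order, so the problem list is itself deterministic.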

11. DECISION EVENT SCHEMA

Store per decision:

  • alert_id, artifact_id (image digest/commit hash)
  • actor_id, timestamp
  • decision_status (Affected/Not affected/Under investigation)
  • reason_code (preset) + reason_text
  • evidence_hashes[] (content-addressed)
  • policy_context (ruleset version, policy id)
  • replay_token (hash of inputs)
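
The fields above map naturally onto an immutable record plus an append-only log. This is a sketch under assumptions: the field types, the `DecisionLog` class, and representing `policy_context` as a (ruleset version, policy id) pair are illustrative choices, not a prescribed storage model.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class DecisionEvent:
    alert_id: str
    artifact_id: str                  # image digest or commit hash
    actor_id: str
    timestamp: str                    # ISO 8601, assumed
    decision_status: str              # "affected" | "not_affected" | "under_investigation"
    reason_code: str
    reason_text: str
    evidence_hashes: Tuple[str, ...]
    policy_context: Tuple[str, str]   # (ruleset version, policy id)
    replay_token: str

class DecisionLog:
    """Append-only: corrections are recorded as new events, never as edits."""

    def __init__(self):
        self._events = []

    def record(self, event):
        self._events.append(event)

    def history(self, alert_id):
        return tuple(e for e in self._events if e.alert_id == alert_id)

    def current(self, alert_id):
        history = self.history(alert_id)
        return history[-1] if history else None
```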

12. OFFLINE BUNDLE FORMAT

Single file (.stella.bundle.tgz) containing:

  • Alert metadata snapshot
  • Evidence artifacts (reachability proofs, call stacks, provenance attestations)
  • SBOM slice(s) for diffs
  • VEX decision history
  • Manifest with content hashes

The bundle must be signed and verifiable.

13. PERFORMANCE BUDGETS

  • TTFS: <200ms skeleton, <500ms first evidence pill, <1.5s p95 full evidence
  • Graph: Preview <300ms, interactive <2.0s
  • Interaction response: ≤100ms
  • Animation frame budget: 16ms avg / 50ms p95
  • Keyboard coverage: ≥90% of triage actions
  • Offline replay: 100% of decisions re-render from bundle

14. ERROR HANDLING

Never show empty states without explanation. Distinguish:

  • "not computed yet"
  • "not possible due to missing inputs"
  • "blocked by permissions"
  • "offline—enrichment pending"
  • "verification failed"

15. RBAC

Gate:

  • Viewing provenance attestations
  • Recording decisions
  • Exporting audit bundles

All decision events immutable; corrections are new events (append-only)

16. UNKNOWNS DECAY & TRIAGE HEURISTICS

16.1 Problem

Stale "unknown" findings create noise; need deterministic decay and triage rules

16.2 Requirements

  • Confidence decay card
  • Triage queue UI
  • Export artifacts for planning

16.3 Determinism

  • Decay windows and thresholds must be deterministic
  • Exports reproducible without live dependencies

16.4 Decay Logic

Decay Windows: Define time-based decay windows

Thresholds: Set confidence thresholds for promotion/demotion

UI/Export Snapshot Expectations: Deterministic decay logic description

17. UNKNOWNS RANKING ALGORITHM

17.1 Score Formula

Score = clamp01(
    wP·P  +  # Popularity impact
    wE·E  +  # Exploit consequence potential
    wU·U  +  # Uncertainty density
    wC·C  +  # Graph centrality
    wS·S     # Evidence staleness
)

17.2 Default Weights

wP = 0.25  (deployment impact)
wE = 0.25  (potential consequence)
wU = 0.25  (uncertainty density)
wC = 0.15  (graph centrality)
wS = 0.10  (evidence staleness)

17.3 Heuristics

P = min(1, log10(1 + deployments)/log10(1 + 100))
U = sum of flags, capped at 1.0:
    +0.30 if no provenance anchor
    +0.25 if version_range
    +0.20 if conflicting_feeds
    +0.15 if missing_vector
    +0.10 if unreachable source advisory
S = min(1, age_days / 14)

17.4 Band Assignment

Score ≥ 0.70 → HOT (immediate rescan + VEX escalation)
0.40 ≤ Score < 0.70 → WARM (scheduled rescan 12-72h)
Score < 0.40 → COLD (weekly batch)
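
Sections 17.1 to 17.4 can be combined into a short scoring sketch. The advisory gives no formula for E or C, so they are taken here as caller-supplied values in [0, 1]. The flag keys `no_provenance_anchor` and `unreachable_source_advisory` are spellings chosen for this example.

```python
import math

WEIGHTS = {"P": 0.25, "E": 0.25, "U": 0.25, "C": 0.15, "S": 0.10}
U_FLAGS = {
    "no_provenance_anchor": 0.30,
    "version_range": 0.25,
    "conflicting_feeds": 0.20,
    "missing_vector": 0.15,
    "unreachable_source_advisory": 0.10,
}

def clamp01(x):
    return max(0.0, min(1.0, x))

def popularity(deployments):
    """Log-scaled deployment count; saturates at 100 deployments."""
    return min(1.0, math.log10(1 + deployments) / math.log10(1 + 100))

def uncertainty(flags):
    return min(1.0, sum(U_FLAGS[f] for f in flags))

def staleness(age_days):
    return min(1.0, age_days / 14)

def score(p, e, u, c, s):
    w = WEIGHTS
    return clamp01(w["P"] * p + w["E"] * e + w["U"] * u + w["C"] * c + w["S"] * s)

def band(value):
    if value >= 0.70:
        return "HOT"
    if value >= 0.40:
        return "WARM"
    return "COLD"
```

For instance, a package deployed 100 times (P = 1.0) with E = 0.5, the `version_range` and `conflicting_feeds` flags (U = 0.45), C = 0.2, and evidence that is 7 days old (S = 0.5) scores about 0.57 and lands in the WARM band.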

18. UNKNOWNS DATABASE SCHEMA

CREATE TABLE unknowns (
    unknown_id uuid PRIMARY KEY,
    pkg_id text,
    pkg_version text,
    digest_anchor bytea,
    unknown_flags jsonb,
    popularity_p float,
    potential_e float,
    uncertainty_u float,
    centrality_c float,
    staleness_s float,
    score float,
    band text CHECK(band IN ('HOT','WARM','COLD')),
    graph_slice_hash bytea,
    evidence_set_hash bytea,
    normalization_trace jsonb,
    callgraph_attempt_hash bytea,
    created_at timestamptz,
    updated_at timestamptz
);

CREATE TABLE deploy_refs (
    pkg_id text,
    image_id text,
    env text,
    first_seen timestamptz,
    last_seen timestamptz
);

CREATE TABLE graph_metrics (
    pkg_id text PRIMARY KEY,
    degree_c float,
    betweenness_c float,
    last_calc_at timestamptz
);

19. TRIAGE QUEUE UI

19.1 Queue Views

  • HOT: Red badge, sort by score desc
  • WARM: Yellow badge, sort by score desc
  • COLD: Gray badge, sort by age asc

19.2 Bulk Actions

  • Mark as reviewed
  • Escalate to HOT
  • Suppress (with reason)
  • Export selected

19.3 Filters

  • By band
  • By package
  • By environment
  • By date range

20. DECISION WORKFLOW CHECKLIST

For any triage decision:

  • Evidence reviewed (reachability, call-stack, provenance, VEX)
  • Decision status selected (A/N/U)
  • Reason provided (preset or custom)
  • Replay token generated
  • Evidence hashes captured
  • Audit event recorded
  • Decision immutable (append-only)

Document Version: 1.0

Target Platform: .NET 10, PostgreSQL ≥16, Angular v17