Unknowns Registry (Signals) — November 2026

This document defines the Unknowns Registry that turns unresolved identities or edges into first-class signals. It replaces the temporary notes from late 2026 advisories.

1. Purpose

When scanners or runtime probes cannot decisively map artifacts, symbols, or package identities, the gap is recorded as an Unknown instead of being dropped. Policy and scoring can then incorporate “unknowns pressure” to avoid silent false negatives.

2. Data model (v0)

{
  "unknown_id": "unk:sha256:<type+scope+evidence>",
  "observed_at": "2025-11-20T00:00:00Z",
  "provenance": { "source": "Scanner|Signals|SbomService|Vexer", "host": "runner-42", "scan_id": "scan:..." },
  "scope": { "artifact": { "type": "oci.image", "ref": "registry/app@sha256:..." }, "subpath": "/app/bin/libssl.so.3", "phase": "scan|runtime|build" },
  "unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
  "evidence": { "raw": "dynsym missing for libssl.so.3", "signals": ["sym:memcpy", "import:SSL_free"] },
  "transitive": { "depth": 1, "parents": ["pkg:deb/openssl@3.0.2"], "children": [] },
  "confidence": { "p": 0.42, "method": "rule" },
  "exposure_hints": { "surface": ["startup"], "runtime_hits": 0 },
  "status": "open|triaged|suppressed|resolved",
  "labels": ["reachability:possible", "sbom:incomplete"]
}
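
The unknown_id above is described only as a SHA-256 over type, scope, and evidence. A minimal sketch of one way to derive it follows, assuming the three parts are serialized as compact JSON with sorted keys before hashing. The canonical form and the helper name derive_unknown_id are assumptions for illustration and are not defined by this document.

```python
import hashlib
import json


def derive_unknown_id(unknown_type: str, scope: dict, evidence: dict) -> str:
    """Hash type + scope + evidence into a stable identifier."""
    # Assumption: the canonical form is compact JSON with sorted keys, so
    # that key order in the producer's payload does not change the hash.
    canonical = json.dumps(
        {"unknown_type": unknown_type, "scope": scope, "evidence": evidence},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"unk:sha256:{digest}"


if __name__ == "__main__":
    scope = {
        "artifact": {"type": "oci.image", "ref": "registry/app@sha256:..."},
        "subpath": "/app/bin/libssl.so.3",
        "phase": "scan",
    }
    evidence = {"raw": "dynsym missing for libssl.so.3",
                "signals": ["sym:memcpy", "import:SSL_free"]}
    print(derive_unknown_id("identity_gap", scope, evidence))
```

Because the identifier depends only on content, two producers that observe the same gap produce the same unknown_id, which is what makes upsert-by-id workable.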

3. API (idempotent)

  • POST /unknowns/ingest — upsert by unknown_id; repeat payloads are no-ops.
  • GET /unknowns?artifact=...&status=open — list unknowns for a target.
  • POST /unknowns/{id}/triage — update status/labels, attach rationale.
  • GET /unknowns/metrics — density by artifact / unknown_type / depth.

All endpoints are additive: records are never hard-deleted. Every payload must include tenant bindings, and must also include CAS URIs when evidence is stored externally.
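
The idempotent upsert behind POST /unknowns/ingest can be sketched with an in-memory store. This is an illustration only: the class UnknownsStore, the "tenant" field name, and the returned status strings are hypothetical, and a real deployment would sit on the persistent store described in section 6.

```python
from dataclasses import dataclass, field


@dataclass
class UnknownsStore:
    """In-memory stand-in for the registry's backing store (hypothetical)."""

    records: dict = field(default_factory=dict)

    def ingest(self, payload: dict) -> str:
        """Upsert by unknown_id; returns 'created', 'updated' or 'noop'."""
        # Assumption: the tenant binding travels in a top-level "tenant" key.
        if "tenant" not in payload:
            raise ValueError("payload must carry a tenant binding")
        key = (payload["tenant"], payload["unknown_id"])
        existing = self.records.get(key)
        if existing == payload:
            return "noop"  # a repeated payload changes nothing
        self.records[key] = dict(payload)  # store a copy, not the caller's dict
        return "created" if existing is None else "updated"


if __name__ == "__main__":
    store = UnknownsStore()
    payload = {"tenant": "t1", "unknown_id": "unk:sha256:abc", "status": "open"}
    print(store.ingest(payload))  # first write
    print(store.ingest(payload))  # identical repeat
```

Keying on the tenant as well as the unknown_id keeps identical gaps seen by different tenants from colliding.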

4. Producers

  • Scanner: unresolved symbol → package mapping (stripped binaries), missing build-id, ambiguous purl; log with unknown_type=identity_gap or missing_edge.
  • Signals: runtime hits that cannot map to a graph node or purl; unresolved call edges.
  • SbomService: conflicting versions for same path; hash mismatch between SBOM and observed file.
  • Vexer/Policy: advisory without trustable provenance (unknown_type=policy_undecidable).

5. Consumers & scoring

  • Signals scoring adds unknowns_pressure = f(density(depth<=1), runtime_shadow, policy_undecidable) and feeds it into reachability/risk scores.
  • Policy can block not_affected claims when unknowns_pressure exceeds thresholds.
  • UI/CLI show unknown chips with reason and depth; operators can triage or suppress.
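
This document leaves the function f unspecified. The sketch below shows one possible shape, assuming a weighted sum of shallow-depth density and two presence flags. The weights, the total_components denominator, and the threshold value are placeholders chosen for illustration, not values taken from the Signals scorer.

```python
def unknowns_pressure(unknowns: list, total_components: int,
                      w_density: float = 0.5, w_shadow: float = 0.3,
                      w_policy: float = 0.2) -> float:
    """Combine density(depth<=1), runtime_shadow and policy_undecidable."""
    if total_components <= 0:
        return 0.0
    open_items = [u for u in unknowns if u["status"] == "open"]
    # Density counts only unknowns at depth 0 or 1, per the formula's input.
    near = [u for u in open_items if u["transitive"]["depth"] <= 1]
    density = min(1.0, len(near) / total_components)
    types = {u["unknown_type"] for u in open_items}
    shadow = 1.0 if "runtime_shadow" in types else 0.0
    policy = 1.0 if "policy_undecidable" in types else 0.0
    # Assumption: a plain weighted sum; the weights sum to 1 by default.
    return w_density * density + w_shadow * shadow + w_policy * policy


def blocks_not_affected(pressure: float, threshold: float = 0.3) -> bool:
    """Policy gate: refuse not_affected claims above a (hypothetical) threshold."""
    return pressure > threshold
```

With these placeholder weights, a single open runtime_shadow unknown is already enough to reach the example threshold, which shows how sensitive the gate is to the chosen weighting.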

5.1 Multi-Factor Ranking

Unknowns are ranked using a 5-factor scoring algorithm that computes a composite score from:

  • Popularity (P) - Deployment impact based on usage count
  • Exploit Potential (E) - CVE severity if known
  • Uncertainty (U) - Accumulated flag weights
  • Centrality (C) - Graph position importance (betweenness)
  • Staleness (S) - Evidence age since last analysis

Based on the composite score, unknowns are assigned to triage bands:

  • HOT (score >= 0.70): Immediate rescan, 15-minute scheduling
  • WARM (0.40 <= score < 0.70): Scheduled rescan within 12-72h
  • COLD (score < 0.40): Weekly batch processing

See Unknowns Ranking Algorithm for the complete formula reference.
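
The band thresholds above translate directly into code. In the sketch below, triage_band follows the stated cut-offs, while composite_score is only a placeholder: it assumes each factor is normalized to [0, 1] and uses equal weights, which stand in for the real formula in the Unknowns Ranking Algorithm reference.

```python
def composite_score(p: float, e: float, u: float, c: float, s: float,
                    weights: tuple = (0.2, 0.2, 0.2, 0.2, 0.2)) -> float:
    """Weighted mean of the five factors (placeholder weights, not the real ones)."""
    # Clamp each factor to [0, 1] so the result is also in [0, 1].
    factors = [min(1.0, max(0.0, x)) for x in (p, e, u, c, s)]
    total = sum(w * x for w, x in zip(weights, factors))
    return total / sum(weights)


def triage_band(score: float) -> str:
    """Map a composite score to the HOT / WARM / COLD bands listed above."""
    if score >= 0.70:
        return "HOT"
    if score >= 0.40:
        return "WARM"
    return "COLD"


if __name__ == "__main__":
    score = composite_score(p=0.9, e=0.8, u=0.5, c=0.7, s=0.6)
    print(round(score, 2), triage_band(score))
```

Checking the HOT bound first keeps the boundaries unambiguous: a score of exactly 0.70 is HOT and exactly 0.40 is WARM.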

6. Storage & CAS

  • Primary store: append-only KV/graph in Mongo (collections unknowns, unknown_metrics).
  • Evidence blobs: CAS under cas://unknowns/{sha256} for large payloads (runtime traces, partial SBOMs).
  • Include analyzer fingerprint + schema version in each record for replay.
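
As an illustration of the CAS addressing above, the sketch below derives a cas://unknowns/{sha256} URI from the blob's content and decides whether evidence stays inline. The 4096-byte inline limit and the "cas_uri" field name are assumptions, since the document says only that large payloads go to CAS.

```python
import hashlib


def cas_uri_for(blob: bytes) -> str:
    """Content-address a blob under the cas://unknowns/ prefix."""
    return f"cas://unknowns/{hashlib.sha256(blob).hexdigest()}"


def evidence_ref(raw: str, inline_limit: int = 4096) -> dict:
    """Keep small evidence inline; point to CAS for large payloads."""
    data = raw.encode("utf-8")
    if len(data) <= inline_limit:
        return {"raw": raw}
    # Assumption: the record references external evidence via a "cas_uri" key;
    # uploading the blob itself is out of scope for this sketch.
    return {"cas_uri": cas_uri_for(data)}


if __name__ == "__main__":
    print(evidence_ref("dynsym missing for libssl.so.3"))
    print(evidence_ref("x" * 10_000))
```

Content addressing means the same trace uploaded twice resolves to the same URI, which fits the append-only, no-hard-delete model.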

7. Integration checkpoints

  • Add writer hooks in Scanner/Signals once richgraph-v1 and runtime ingestion surface unmapped items.
  • Extend reachability lattice docs to note unknowns_pressure input.
  • Add Grafana panel for unknown density per artifact/namespace.

8. Acceptance criteria

  • APIs deployed with idempotent behavior and tenant guards.
  • At least two producer paths writing Unknowns (Scanner unresolved symbol; Signals runtime shadow).
  • Metrics endpoint shows density and trend; UI/CLI expose triage status.
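
The density breakdown expected from GET /unknowns/metrics can be approximated by counting records per artifact, unknown_type, and depth. This is a rough sketch over records shaped like the v0 data model; the function name unknown_density and the output keys are hypothetical, and trend tracking is not shown.

```python
from collections import Counter


def unknown_density(unknowns: list) -> dict:
    """Count unknowns per artifact ref, unknown_type and transitive depth."""
    return {
        "by_artifact": Counter(u["scope"]["artifact"]["ref"] for u in unknowns),
        "by_type": Counter(u["unknown_type"] for u in unknowns),
        "by_depth": Counter(u["transitive"]["depth"] for u in unknowns),
    }


if __name__ == "__main__":
    sample = [
        {"scope": {"artifact": {"ref": "registry/app@sha256:..."}},
         "unknown_type": "identity_gap", "transitive": {"depth": 1}},
        {"scope": {"artifact": {"ref": "registry/app@sha256:..."}},
         "unknown_type": "missing_edge", "transitive": {"depth": 2}},
    ]
    for name, counts in unknown_density(sample).items():
        print(name, dict(counts))
```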