git.stella-ops.org/docs/signals/unknowns-registry.md
2025-11-21 06:56:36 +00:00

Unknowns Registry (Signals) - November 2025

This document defines the Unknowns Registry, which turns unresolved identities or edges into first-class signals. It replaces the temporary notes from the late 2025 advisories.

1. Purpose

When scanners or runtime probes cannot decisively map artifacts, symbols, or package identities, the gap is recorded as an Unknown instead of being dropped. Policy and scoring can then incorporate “unknowns pressure” to avoid silent false negatives.

2. Data model (v0)

{
  "unknown_id": "unk:sha256:<type+scope+evidence>",
  "observed_at": "2025-11-20T00:00:00Z",
  "provenance": { "source": "Scanner|Signals|SbomService|Vexer", "host": "runner-42", "scan_id": "scan:..." },
  "scope": { "artifact": { "type": "oci.image", "ref": "registry/app@sha256:..." }, "subpath": "/app/bin/libssl.so.3", "phase": "scan|runtime|build" },
  "unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
  "evidence": { "raw": "dynsym missing for libssl.so.3", "signals": ["sym:memcpy", "import:SSL_free"] },
  "transitive": { "depth": 1, "parents": ["pkg:deb/openssl@3.0.2"], "children": [] },
  "confidence": { "p": 0.42, "method": "rule" },
  "exposure_hints": { "surface": ["startup"], "runtime_hits": 0 },
  "status": "open|triaged|suppressed|resolved",
  "labels": ["reachability:possible", "sbom:incomplete"]
}
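The unknown_id template above only says that the digest covers type, scope, and evidence; it does not fix a serialization. The sketch below shows one way a producer could derive a stable id, assuming canonical JSON (sorted keys, compact separators) as the hash input. The function name compute_unknown_id and the canonicalization recipe are illustrative assumptions, not part of the v0 schema.

```python
import hashlib
import json


def compute_unknown_id(unknown_type: str, scope: dict, evidence: dict) -> str:
    """Derive a deterministic id from type + scope + evidence (assumed recipe)."""
    # Canonical JSON: sorted keys and compact separators, so the key order
    # used by a producer does not change the resulting digest.
    canonical = json.dumps(
        {"unknown_type": unknown_type, "scope": scope, "evidence": evidence},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"unk:sha256:{digest}"


if __name__ == "__main__":
    scope = {
        "artifact": {"type": "oci.image", "ref": "registry/app@sha256:..."},
        "subpath": "/app/bin/libssl.so.3",
        "phase": "scan",
    }
    evidence = {"raw": "dynsym missing for libssl.so.3", "signals": ["sym:memcpy"]}
    print(compute_unknown_id("identity_gap", scope, evidence))
```

A deterministic id is what makes upsert-by-id meaningful: two producers that observe the same gap compute the same key, so the second report collapses into the first.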

3. API (idempotent)

  • POST /unknowns/ingest — upsert by unknown_id; repeat payloads are no-ops.
  • GET /unknowns?artifact=...&status=open — list unknowns for a target.
  • POST /unknowns/{id}/triage — update status/labels, attach rationale.
  • GET /unknowns/metrics — density by artifact / unknown_type / depth.

All endpoints are additive; there are no hard deletes. Every payload must include tenant bindings, plus CAS URIs when evidence is stored externally.
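To make the idempotency and no-delete rules concrete, here is a minimal in-memory sketch of the behavior behind the four endpoints. The class InMemoryUnknownsRegistry, its method names, and the choice to treat status, labels, and rationale as triage-owned fields that a re-ingest never resets are all assumptions for illustration. Tenant bindings and the HTTP layer are left out.

```python
from collections import Counter
from copy import deepcopy
from typing import Any, Optional

VALID_STATUSES = {"open", "triaged", "suppressed", "resolved"}
# Assumption: these fields belong to triage, so a producer re-ingest leaves them alone.
TRIAGE_FIELDS = {"status", "labels", "rationale"}


class InMemoryUnknownsRegistry:
    """Hypothetical stand-in for the service behind the /unknowns endpoints."""

    def __init__(self) -> None:
        self._records: dict[str, dict[str, Any]] = {}

    def ingest(self, payload: dict[str, Any]) -> bool:
        """Upsert by unknown_id; return True only if stored state changed."""
        uid = payload["unknown_id"]
        producer_view = {k: v for k, v in payload.items() if k not in TRIAGE_FIELDS}
        existing = self._records.get(uid)
        if existing is None:
            record = deepcopy(payload)
            record.setdefault("status", "open")
            record.setdefault("labels", [])
            self._records[uid] = record
            return True
        if all(existing.get(k) == v for k, v in producer_view.items()):
            return False  # repeat payload: no-op
        existing.update(deepcopy(producer_view))
        return True

    def find(self, artifact: str, status: Optional[str] = None) -> list[dict[str, Any]]:
        """List unknowns for an artifact ref, optionally filtered by status."""
        return [
            deepcopy(r)
            for r in self._records.values()
            if r["scope"]["artifact"]["ref"] == artifact
            and (status is None or r["status"] == status)
        ]

    def triage(self, uid: str, status: str, labels: list[str], rationale: str) -> None:
        """Update status/labels and attach a rationale; records are never deleted."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status}")
        self._records[uid].update(status=status, labels=list(labels), rationale=rationale)

    def metrics(self) -> Counter:
        """Count unknowns per (artifact ref, unknown_type, depth)."""
        return Counter(
            (r["scope"]["artifact"]["ref"], r["unknown_type"], r["transitive"]["depth"])
            for r in self._records.values()
        )
```

Because suppression and resolution are status changes rather than deletions, the same record stays available for metrics and for later replay.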

4. Producers

  • Scanner: unresolved symbol → package mapping (stripped binaries), missing build-id, ambiguous purl; log with unknown_type=identity_gap or missing_edge.
  • Signals: runtime hits that cannot map to a graph node or purl; unresolved call edges.
  • SbomService: conflicting versions for same path; hash mismatch between SBOM and observed file.
  • Vexer/Policy: advisory without trustable provenance (unknown_type=policy_undecidable).

5. Consumers & scoring

  • Signals scoring adds unknowns_pressure = f(density(depth<=1), runtime_shadow, policy_undecidable) and feeds it into reachability/risk scores.
  • Policy can block not_affected claims when unknowns_pressure exceeds thresholds.
  • UI/CLI show unknown chips with reason and depth; operators can triage or suppress.
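The document leaves f unspecified. The sketch below is one possible shape, assuming a weighted sum of the three inputs squashed into [0, 1) with 1 - exp(-x). The weights, the density denominator (component_count), the restriction to open unknowns, and the 0.5 threshold are placeholders chosen for illustration, not values from this design.

```python
import math


def unknowns_pressure(
    unknowns: list[dict],
    component_count: int,
    w_density: float = 1.0,
    w_shadow: float = 0.5,
    w_policy: float = 0.5,
) -> float:
    """Hypothetical f(density(depth<=1), runtime_shadow, policy_undecidable)."""
    # Assumption: only open unknowns contribute to pressure.
    open_items = [u for u in unknowns if u["status"] == "open"]
    near = sum(1 for u in open_items if u["transitive"]["depth"] <= 1)
    density = near / max(component_count, 1)
    shadows = sum(1 for u in open_items if u["unknown_type"] == "runtime_shadow")
    undecidable = sum(1 for u in open_items if u["unknown_type"] == "policy_undecidable")
    raw = w_density * density + w_shadow * shadows + w_policy * undecidable
    # Squash into [0, 1): zero unknowns gives 0.0, and pressure saturates toward 1.
    return 1.0 - math.exp(-raw)


def blocks_not_affected(pressure: float, threshold: float = 0.5) -> bool:
    """Policy gate: refuse a not_affected claim when pressure exceeds the threshold."""
    return pressure > threshold
```

A saturating function keeps the score bounded, so a single artifact with many unknowns cannot dominate a downstream risk score that expects inputs in a fixed range.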

6. Storage & CAS

  • Primary store: append-only KV/graph in Mongo (collections unknowns, unknown_metrics).
  • Evidence blobs: CAS under cas://unknowns/{sha256} for large payloads (runtime traces, partial SBOMs).
  • Include analyzer fingerprint + schema version in each record for replay.
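As a rough illustration of the storage rules above, the sketch below derives a cas://unknowns/{sha256} URI from a blob's content and stamps a record with replay metadata. The dict standing in for the CAS backend and the field names cas_uri, analyzer_fingerprint, and schema_version are assumptions, since the v0 schema does not name them.

```python
import hashlib


def cas_uri_for(blob: bytes) -> str:
    """Content-addressed URI: the key is the SHA-256 of the blob itself."""
    return "cas://unknowns/" + hashlib.sha256(blob).hexdigest()


def externalize_evidence(record: dict, blob: bytes, cas_store: dict) -> dict:
    """Store a large evidence blob out of line and reference it from the record."""
    uri = cas_uri_for(blob)
    cas_store.setdefault(uri, blob)  # same bytes always map to the same key
    updated = dict(record)
    # "cas_uri" is a hypothetical field name for the external reference.
    updated["evidence"] = {**record.get("evidence", {}), "cas_uri": uri}
    return updated


def with_replay_metadata(record: dict, analyzer_fingerprint: str, schema_version: str = "v0") -> dict:
    """Attach the analyzer fingerprint and schema version needed for replay."""
    return {
        **record,
        "analyzer_fingerprint": analyzer_fingerprint,
        "schema_version": schema_version,
    }
```

Both helpers return new dicts rather than mutating their input, which fits an append-only store where earlier versions of a record should stay intact.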

7. Integration checkpoints

  • Add writer hooks in Scanner/Signals once richgraph-v1 and runtime ingestion surface unmapped items.
  • Extend reachability lattice docs to note unknowns_pressure input.
  • Add Grafana panel for unknown density per artifact/namespace.

8. Acceptance criteria

  • APIs deployed with idempotent behavior and tenant guards.
  • At least two producer paths writing Unknowns (Scanner unresolved symbol; Signals runtime shadow).
  • Metrics endpoint shows density and trend; UI/CLI expose triage status.