# Unknowns Registry (Signals) - November 2026

This document defines the Unknowns Registry that turns unresolved identities or edges into first-class signals. It replaces the temporary notes from late 2026 advisories.

## 1. Purpose

When scanners or runtime probes cannot decisively map artifacts, symbols, or package identities, the gap is recorded as an **Unknown** instead of being dropped. Policy and scoring can then incorporate “unknowns pressure” to avoid silent false negatives.

## 2. Data model (v0)

```json
{
  "unknown_id": "unk:sha256:",
  "observed_at": "2025-11-20T00:00:00Z",
  "provenance": {
    "source": "Scanner|Signals|SbomService|Vexer",
    "host": "runner-42",
    "scan_id": "scan:..."
  },
  "scope": {
    "artifact": { "type": "oci.image", "ref": "registry/app@sha256:..." },
    "subpath": "/app/bin/libssl.so.3",
    "phase": "scan|runtime|build"
  },
  "unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
  "evidence": {
    "raw": "dynsym missing for libssl.so.3",
    "signals": ["sym:memcpy", "import:SSL_free"]
  },
  "transitive": {
    "depth": 1,
    "parents": ["pkg:deb/openssl@3.0.2"],
    "children": []
  },
  "confidence": { "p": 0.42, "method": "rule" },
  "exposure_hints": { "surface": ["startup"], "runtime_hits": 0 },
  "status": "open|triaged|suppressed|resolved",
  "labels": ["reachability:possible", "sbom:incomplete"]
}
```

## 3. API (idempotent)

- `POST /unknowns/ingest`: upsert by `unknown_id`; repeat payloads are no-ops.
- `GET /unknowns?artifact=...&status=open`: list unknowns for a target.
- `POST /unknowns/{id}/triage`: update `status`/`labels`, attach rationale.
- `GET /unknowns/metrics`: density by artifact / unknown_type / depth.

All endpoints are additive; no hard deletes. Payloads must include tenant bindings and CAS URIs when evidence is stored externally.

## 4. Producers

- **Scanner**: unresolved symbol → package mapping (stripped binaries), missing build-id, ambiguous purl; log with `unknown_type=identity_gap` or `missing_edge`.
- **Signals**: runtime hits that cannot be mapped to a graph node or purl; unresolved call edges.
- **SbomService**: conflicting versions for the same path; hash mismatch between the SBOM and the observed file.
- **Vexer/Policy**: advisory without trustworthy provenance (`unknown_type=policy_undecidable`).

## 5. Consumers & scoring

- Signals scoring adds `unknowns_pressure = f(density(depth<=1), runtime_shadow, policy_undecidable)` and feeds it into reachability/risk scores.
- Policy can block `not_affected` claims when `unknowns_pressure` exceeds configured thresholds.
- UI/CLI show unknown chips with reason and depth; operators can triage or suppress.

## 6. Storage & CAS

- Primary store: append-only KV/graph in Mongo (collections `unknowns`, `unknown_metrics`).
- Evidence blobs: CAS under `cas://unknowns/{sha256}` for large payloads (runtime traces, partial SBOMs).
- Include analyzer fingerprint + schema version in each record for replay.

## 7. Integration checkpoints

- Add writer hooks in Scanner/Signals once `richgraph-v1` and runtime ingestion surface unmapped items.
- Extend reachability lattice docs to note the `unknowns_pressure` input.
- Add a Grafana panel for unknown density per artifact/namespace.

## 8. Acceptance criteria

- APIs deployed with idempotent behavior and tenant guards.
- At least two producer paths writing Unknowns (Scanner unresolved symbol; Signals runtime shadow).
- Metrics endpoint shows density and trend; UI/CLI expose triage status.
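
The idempotency and density criteria above can be exercised against a small in-memory model before the real service exists. The following is a minimal sketch, not the actual implementation: the class `UnknownsRegistry`, its method names, and the placeholder identifiers are hypothetical. It also assumes that density counts only `open` unknowns, and that triage state is kept separately from the ingested payload so that a replayed payload cannot overwrite it.

```python
from collections import Counter
from copy import deepcopy

# Status values taken from the v0 data model.
STATUSES = {"open", "triaged", "suppressed", "resolved"}


class UnknownsRegistry:
    """Hypothetical in-memory stand-in for the append-only store."""

    def __init__(self):
        self._payloads = {}  # unknown_id -> last ingested payload
        self._status = {}    # unknown_id -> current triage status
        self._audit = []     # append-only log of triage changes

    def ingest(self, record):
        """Upsert by unknown_id. Returns False when the payload is a repeat."""
        uid = record["unknown_id"]
        if self._payloads.get(uid) == record:
            return False  # identical payload already stored: no-op
        self._payloads[uid] = deepcopy(record)
        # Keep any existing triage status; new records start from the payload.
        self._status.setdefault(uid, record.get("status", "open"))
        return True

    def triage(self, uid, status, rationale):
        """Change status and record the rationale; nothing is deleted."""
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        if uid not in self._payloads:
            raise KeyError(uid)
        self._audit.append((uid, self._status[uid], status, rationale))
        self._status[uid] = status

    def list_unknowns(self, artifact_ref, status="open"):
        """Return ids of unknowns for one artifact with the given status."""
        return [
            uid
            for uid, rec in self._payloads.items()
            if rec["scope"]["artifact"]["ref"] == artifact_ref
            and self._status[uid] == status
        ]

    def density(self):
        """Count open unknowns per (artifact ref, unknown_type, depth)."""
        counts = Counter()
        for uid, rec in self._payloads.items():
            if self._status[uid] != "open":  # assumption: only open ones count
                continue
            key = (
                rec["scope"]["artifact"]["ref"],
                rec["unknown_type"],
                rec["transitive"]["depth"],
            )
            counts[key] += 1
        return counts


registry = UnknownsRegistry()
sample = {
    "unknown_id": "unk:sha256:example",  # placeholder id, not a real digest
    "scope": {"artifact": {"type": "oci.image", "ref": "registry/app@sha256:example"}},
    "unknown_type": "identity_gap",
    "transitive": {"depth": 1},
    "status": "open",
}
first = registry.ingest(sample)   # stored
second = registry.ingest(sample)  # repeat payload, ignored
```

Keeping triage state outside the payload is one way to keep repeat ingests harmless after an operator has acted on a record. A real store would also need the tenant bindings and CAS URIs that the API section requires, which this sketch omits.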