# Unknowns Registry (Signals) — November 2026
This document defines the Unknowns Registry that turns unresolved identities or edges into first-class signals. It replaces the temporary notes from late 2026 advisories.
## 1. Purpose
When scanners or runtime probes cannot decisively map artifacts, symbols, or package identities, the gap is recorded as an **Unknown** instead of being dropped. Policy and scoring can then incorporate “unknowns pressure” to avoid silent false negatives.
## 2. Data model (v0)
```json
{
  "unknown_id": "unk:sha256:<type+scope+evidence>",
  "observed_at": "2025-11-20T00:00:00Z",
  "provenance": { "source": "Scanner|Signals|SbomService|Vexer", "host": "runner-42", "scan_id": "scan:..." },
  "scope": { "artifact": { "type": "oci.image", "ref": "registry/app@sha256:..." }, "subpath": "/app/bin/libssl.so.3", "phase": "scan|runtime|build" },
  "unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
  "evidence": { "raw": "dynsym missing for libssl.so.3", "signals": ["sym:memcpy", "import:SSL_free"] },
  "transitive": { "depth": 1, "parents": ["pkg:deb/openssl@3.0.2"], "children": [] },
  "confidence": { "p": 0.42, "method": "rule" },
  "exposure_hints": { "surface": ["startup"], "runtime_hits": 0 },
  "status": "open|triaged|suppressed|resolved",
  "labels": ["reachability:possible", "sbom:incomplete"]
}
```
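
The `unknown_id` template above suggests a content-derived identifier. The following is a minimal Python sketch of one way to compute it. It assumes, since this document does not specify it, that the hash input is a canonical JSON serialization (sorted keys, compact separators) of `unknown_type`, `scope`, and `evidence`. The function name `compute_unknown_id` is hypothetical.

```python
import hashlib
import json


def compute_unknown_id(unknown_type: str, scope: dict, evidence: dict) -> str:
    """Derive a deterministic ID from type + scope + evidence.

    Assumption: the canonical form is JSON with sorted keys and compact
    separators; the registry may define a different canonicalization.
    """
    canonical = json.dumps(
        {"unknown_type": unknown_type, "scope": scope, "evidence": evidence},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"unk:sha256:{digest}"


# Values taken from the sample record; elided parts are left elided.
scope = {
    "artifact": {"type": "oci.image", "ref": "registry/app@sha256:..."},
    "subpath": "/app/bin/libssl.so.3",
    "phase": "scan",
}
evidence = {
    "raw": "dynsym missing for libssl.so.3",
    "signals": ["sym:memcpy", "import:SSL_free"],
}
uid = compute_unknown_id("identity_gap", scope, evidence)
```

Because the ID depends only on content, two producers that observe the same gap would derive the same `unknown_id`, which is the property an upsert keyed on it relies on.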
## 3. API (idempotent)
- `POST /unknowns/ingest`: upserts by `unknown_id`; repeated payloads are no-ops.
- `GET /unknowns?artifact=...&status=open`: lists unknowns for a target artifact.
- `POST /unknowns/{id}/triage`: updates `status`/`labels` and attaches a rationale.
- `GET /unknowns/metrics`: reports density by artifact, `unknown_type`, and depth.

All endpoints are additive; there are no hard deletes. Payloads must include tenant bindings and, when evidence is stored externally, the corresponding CAS URIs.
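
The idempotent ingest and triage behavior can be sketched with an in-memory store. This is an illustration under stated assumptions, not the service implementation: the class `UnknownsStore`, the `tenant` and `rationale` field names, and the rule that a re-ingested payload keeps existing triage state are all hypothetical.

```python
from dataclasses import dataclass, field

ALLOWED_STATUS = {"open", "triaged", "suppressed", "resolved"}
TRIAGE_FIELDS = ("status", "labels", "rationale")


@dataclass
class UnknownsStore:
    """In-memory stand-in for the registry (hypothetical, illustration only)."""

    records: dict = field(default_factory=dict)

    def ingest(self, record: dict) -> bool:
        """Upsert by unknown_id; return True only if stored state changed."""
        if "tenant" not in record:  # assumed name for the tenant binding
            raise ValueError("payload must include a tenant binding")
        uid = record["unknown_id"]
        existing = self.records.get(uid)
        merged = dict(record)
        if existing is not None:
            # Assumption: triage decisions survive a repeated ingest.
            merged.update({k: existing[k] for k in TRIAGE_FIELDS if k in existing})
        if merged == existing:
            return False  # repeated payload: no-op
        self.records[uid] = merged
        return True

    def list_unknowns(self, artifact_ref: str, status: str = "open") -> list:
        """Return records for one artifact reference with the given status."""
        return [
            r for r in self.records.values()
            if r["scope"]["artifact"]["ref"] == artifact_ref and r["status"] == status
        ]

    def triage(self, uid: str, status: str, labels: list, rationale: str) -> dict:
        """Update status/labels and attach a rationale; nothing is deleted."""
        if status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {status}")
        updated = {**self.records[uid], "status": status,
                   "labels": list(labels), "rationale": rationale}
        self.records[uid] = updated
        return updated


store = UnknownsStore()
payload = {
    "unknown_id": "unk:sha256:<type+scope+evidence>",
    "tenant": "tenant-a",  # hypothetical tenant binding
    "scope": {"artifact": {"type": "oci.image", "ref": "registry/app@sha256:..."}},
    "unknown_type": "identity_gap",
    "status": "open",
    "labels": ["sbom:incomplete"],
}
first = store.ingest(payload)   # stores the record
second = store.ingest(payload)  # same payload again: no state change
```

Returning a changed/unchanged flag is one way to let a caller distinguish a first write from a replay without treating the replay as an error.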
## 4. Producers
- **Scanner**: unresolved symbol → package mappings (stripped binaries), missing build-ids, ambiguous purls; recorded with `unknown_type=identity_gap` or `missing_edge`.
- **Signals**: runtime hits that cannot be mapped to a graph node or purl; unresolved call edges.
- **SbomService**: conflicting versions for the same path; hash mismatches between the SBOM and the observed file.
- **Vexer/Policy**: advisories without trustworthy provenance (`unknown_type=policy_undecidable`).
## 5. Consumers & scoring
- Signals scoring computes `unknowns_pressure = f(density(depth<=1), runtime_shadow, policy_undecidable)` and feeds it into reachability/risk scores.
- Policy can block `not_affected` claims when `unknowns_pressure` exceeds thresholds.
- UI/CLI show unknown chips with reason and depth; operators can triage or suppress them.
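
This document leaves the function `f` open. As a rough sketch only, one possible shape is a weighted sum of three terms. The weights, the threshold, the use of a component count as the density denominator, and the choice to count only `open` records are all assumptions made for illustration.

```python
def unknowns_pressure(unknowns, component_count, weights=(0.5, 0.3, 0.2)):
    """Combine shallow-unknown density with two boolean signals.

    Assumptions: only records with status "open" count; density is the number
    of unknowns at depth <= 1 divided by the artifact's component count,
    capped at 1.0; the weights are placeholders, not tuned values.
    """
    open_items = [u for u in unknowns if u["status"] == "open"]
    shallow = sum(1 for u in open_items if u["transitive"]["depth"] <= 1)
    density = min(1.0, shallow / max(component_count, 1))
    has_shadow = any(u["unknown_type"] == "runtime_shadow" for u in open_items)
    has_undecidable = any(u["unknown_type"] == "policy_undecidable" for u in open_items)
    w_density, w_shadow, w_policy = weights
    return w_density * density + w_shadow * float(has_shadow) + w_policy * float(has_undecidable)


def allow_not_affected(pressure, threshold=0.35):
    """Policy gate: a not_affected claim passes only at or below the threshold."""
    return pressure <= threshold


sample = [
    {"status": "open", "unknown_type": "identity_gap", "transitive": {"depth": 1}},
    {"status": "open", "unknown_type": "runtime_shadow", "transitive": {"depth": 0}},
    {"status": "open", "unknown_type": "missing_edge", "transitive": {"depth": 3}},
    {"status": "resolved", "unknown_type": "policy_undecidable", "transitive": {"depth": 1}},
]
pressure = unknowns_pressure(sample, component_count=10)
claim_allowed = allow_not_affected(pressure)
```

In this sample the resolved `policy_undecidable` record does not contribute, while the open `runtime_shadow` record pushes the score over the placeholder threshold, so the claim would be blocked.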
## 6. Storage & CAS
- Primary store: append-only KV/graph in Mongo (collections `unknowns`, `unknown_metrics`).
- Evidence blobs: CAS under `cas://unknowns/{sha256}` for large payloads (runtime traces, partial SBOMs).
- Include analyzer fingerprint + schema version in each record for replay.
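
The CAS layout above can be illustrated with a short sketch that derives the URI from the blob's SHA-256 and stamps the record for replay. The inline size cutoff, the dict used as a blob store, and the field names `cas_uri`, `analyzer_fingerprint`, and `schema_version` are hypothetical.

```python
import hashlib

INLINE_LIMIT = 4096  # bytes; hypothetical cutoff for "large" payloads


def cas_uri_for(blob: bytes) -> str:
    """Build a content-addressed URI following cas://unknowns/{sha256}."""
    return f"cas://unknowns/{hashlib.sha256(blob).hexdigest()}"


def attach_evidence(record, blob, blob_store, analyzer_fingerprint, schema_version):
    """Return a copy of the record with evidence attached and replay metadata set.

    Large blobs go to the blob store and are referenced by URI; small ones
    are kept inline. The input record is not modified.
    """
    evidence = dict(record.get("evidence", {}))
    if len(blob) > INLINE_LIMIT:
        uri = cas_uri_for(blob)
        blob_store.setdefault(uri, blob)  # same bytes always map to the same key
        evidence["cas_uri"] = uri
    else:
        evidence["raw"] = blob.decode("utf-8", errors="replace")
    return {
        **record,
        "evidence": evidence,
        "analyzer_fingerprint": analyzer_fingerprint,
        "schema_version": schema_version,
    }


blob_store = {}
trace = b"x" * 5000  # stand-in for a runtime trace
stored = attach_evidence(
    {"unknown_id": "unk:sha256:<type+scope+evidence>"},
    trace,
    blob_store,
    analyzer_fingerprint="scanner@hypothetical",
    schema_version="v0",
)
```

Writing the same trace twice leaves a single entry in the blob store, which fits the append-only model of the primary store.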
## 7. Integration checkpoints
- Add writer hooks in Scanner/Signals once `richgraph-v1` and runtime ingestion surface unmapped items.
- Extend reachability lattice docs to note `unknowns_pressure` input.
- Add Grafana panel for unknown density per artifact/namespace.
## 8. Acceptance criteria
- APIs deployed with idempotent behavior and tenant guards.
- At least two producer paths writing Unknowns (Scanner unresolved symbol; Signals runtime shadow).
- Metrics endpoint shows density and trend; UI/CLI expose triage status.