feat: Add new provenance and crypto registry documentation

- Introduced attestation inventory and subject-rekor mapping files for tracking Docker packages.
- Added a comprehensive crypto registry decision document outlining defaults and required follow-ups.
- Created an offline feeds manifest for bundling air-gap resources.
- Implemented a script to generate and update binary manifests for curated binaries.
- Added a verification script to ensure binary artefacts are located in approved directories.
- Defined new schemas for AdvisoryEvidenceBundle, OrchestratorEnvelope, ScannerReportReadyPayload, and ScannerScanCompletedPayload.
- Established project files for StellaOps.Orchestrator.Schemas and StellaOps.PolicyAuthoritySignals.Contracts.
- Updated vendor manifest to track pinned binaries for integrity.
719
docs/product-advisories/18-Nov-2026 - 1 copy 5.md
Normal file
@@ -0,0 +1,719 @@
Here’s a crisp idea you can drop straight into Stella Ops: treat “unknowns” as first‑class data, not noise.

---

# Unknowns Registry: turning uncertainty into signals

**Why:** Scanners and VEX feeds miss things (ambiguous package IDs, unverifiable hashes, orphaned layers, missing SBOM edges, runtime-only artifacts). Today these get logged and forgotten. If we **structure** them, downstream agents can reason about risk and shrink blast radius proactively.

**What it is:** A small service + schema that records every uncertainty with enough context for later inference.

## Core model (v0)

In the example below, values written as `a|b|c` list the allowed alternatives for that field.

```json
{
  "unknown_id": "unk:sha256:…",
  "observed_at": "2025-11-18T12:00:00Z",
  "provenance": {
    "source": "Scanner.Analyzer.DotNet|Sbomer|Signals|Vexer",
    "host": "runner-42",
    "scan_id": "scan:…"
  },
  "scope": {
    "artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
    "subpath": "/app/bin/Contoso.dll",
    "phase": "build|scan|runtime"
  },
  "unknown_type": "identity_gap|version_conflict|hash_mismatch|missing_edge|runtime_shadow|policy_undecidable",
  "evidence": {
    "raw": "nuget id 'Serilog' but assembly name 'Serilog.Core'",
    "signals": ["sym:Serilog.Core.Logger", "procopen:/app/agent"]
  },
  "transitive": {
    "depth": 2,
    "parents": ["pkg:nuget/Serilog@?"],
    "children": []
  },
  "confidence": { "p": 0.42, "method": "bayes-merge|rule" },
  "exposure_hints": {
    "surface": ["logging pipeline", "startup path"],
    "runtime_hits": 3
  },
  "status": "open|triaged|suppressed|resolved",
  "labels": ["reachability:possible", "sbom:incomplete"]
}
```

## Categorize by three axes

* **Provenance** (where it came from): Scanner vs Sbomer vs Vexer vs Signals.
* **Scope** (what it touches): image/layer/file/symbol/runtime‑proc/policy.
* **Transitive depth** (how far from an entry point): 0 = direct, 1..N via deps.

## How agents use it

* **Cartographer**: includes unknown edges in the graph with special weight; lets Policy/Lattice down‑rank vulnerable nodes near high‑impact unknowns.
* **Remedy Assistant (Zastava)**: proposes micro‑probes (“add EventPipe/JFR tap for X symbol”) or build‑time assertions (“pin Serilog>=3.1, regenerate SBOM”).
* **Scheduler**: prioritizes scans where unknown density × asset criticality is highest.

## Minimal API (idempotent, additive)

* `POST /unknowns/ingest`: upsert by `unknown_id` (hash of type+scope+evidence).
* `GET /unknowns?artifact=…&status=open`: list for a target.
* `POST /unknowns/:id/triage`: set status/labels, attach rationale.
* `GET /metrics`: density by artifact/namespace/unknown_type.

*All additive; no versioning required. Repeat calls with the same payload are no‑ops.*

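
The idempotent upsert can be sketched as follows. This is a minimal illustration, not the service's actual implementation: the canonical-JSON hashing scheme, the `UnknownsStore` class, and its method names are all assumptions made for the example; only the "hash of type+scope+evidence" rule and the `unk:sha256:` prefix come from the text above.

```python
import hashlib
import json


def unknown_id(unknown_type: str, scope: dict, evidence: dict) -> str:
    """Derive a stable ID from type + scope + evidence.

    Assumption: the hash input is canonical JSON (sorted keys, no whitespace).
    """
    canonical = json.dumps(
        {"unknown_type": unknown_type, "scope": scope, "evidence": evidence},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return "unk:sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class UnknownsStore:
    """Hypothetical in-memory stand-in for the registry backend."""

    def __init__(self):
        self._records = {}

    def ingest(self, record: dict):
        """Upsert by unknown_id; returns (id, created). A repeat call is a no-op."""
        uid = unknown_id(record["unknown_type"], record["scope"], record["evidence"])
        if uid in self._records:
            return uid, False
        self._records[uid] = {**record, "unknown_id": uid, "status": "open"}
        return uid, True

    def list_open(self, artifact_ref: str):
        """Rough equivalent of GET /unknowns?artifact=...&status=open."""
        return [
            r for r in self._records.values()
            if r["scope"]["artifact"]["ref"] == artifact_ref and r["status"] == "open"
        ]
```

Because the ID depends only on content, two scanners reporting the same gap converge on one record regardless of key order or arrival order.
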
## Scoring hook (into your lattice)

* Add an **“Unknowns Pressure”** term, where `density_depth≤1` is the density of unknowns at transitive depth 0 or 1:
  `risk = base ⊕ (α * density_depth≤1) ⊕ (β * runtime_shadow) ⊕ (γ * policy_undecidable)`
* Gate “green” only if `density_depth≤1 == 0` **or** compensating controls are active.

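
A worked sketch of the pressure term may help when choosing defaults. Everything numeric here is an assumption: the text does not define `⊕`, so the example uses a bounded sum clamped to `[0, 1]`; the default α/β/γ values, the per-component density definition, and the boolean treatment of `runtime_shadow` and `policy_undecidable` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PressureWeights:
    alpha: float = 0.3  # weight for unknown density at depth <= 1 (assumed default)
    beta: float = 0.2   # weight for runtime_shadow unknowns (assumed default)
    gamma: float = 0.1  # weight for policy_undecidable unknowns (assumed default)


def join(a: float, b: float) -> float:
    """Stand-in for the lattice operator: bounded sum clamped to [0, 1]."""
    return max(0.0, min(1.0, a + b))


def density_depth_le1(unknowns, component_count: int) -> float:
    """Unknowns at transitive depth 0 or 1, per component in the artifact."""
    if component_count <= 0:
        return 0.0
    near = sum(1 for u in unknowns if u["transitive"]["depth"] <= 1)
    return near / component_count


def risk(base: float, unknowns, component_count: int, w=PressureWeights()) -> float:
    density = density_depth_le1(unknowns, component_count)
    shadow = 1.0 if any(u["unknown_type"] == "runtime_shadow" for u in unknowns) else 0.0
    undecidable = 1.0 if any(u["unknown_type"] == "policy_undecidable" for u in unknowns) else 0.0
    score = join(base, w.alpha * density)
    score = join(score, w.beta * shadow)
    return join(score, w.gamma * undecidable)


def gate_green(unknowns, component_count: int, compensating_controls: bool) -> bool:
    """Green only with zero near-entry unknowns or active compensating controls."""
    return density_depth_le1(unknowns, component_count) == 0 or compensating_controls
```

A probabilistic OR (`1 - (1 - a)(1 - b)`) would be an equally plausible reading of `⊕`; the bounded sum is used only because it keeps the arithmetic easy to follow.
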
## Storage & plumbing

* **Store:** append‑only KV (Badger/Rocks) + Graph overlay (SQLite/Neo4j, your call).
* **Emit:** DSSE‑signed “Unknowns Attestation” per scan for replayable audits.
* **UI:** heatmap per artifact (unknowns by type × depth), drill‑down to evidence.

## First 2‑day slice

1. Define `unknown_type` enum + hashable `unknown_id`.
2. Wire Scanner/Sbomer/Vexer to emit unknowns (start with: identity_gap, missing_edge).
3. Persist + expose `/metrics` (density, by depth and type).
4. In Policy Studio, add the Unknowns Pressure term with default α/β/γ.

If you want, I’ll draft the exact protobuf/JSON schema and drop in .NET 10 record types + an EF model, plus a tiny CLI to query and a Grafana panel JSON.

I will treat “it” as the whole vision behind **Pushing Binary Reachability Toward True Determinism** inside Stella Ops: function-/symbol-level reachability for binaries and higher-level languages, wired into Scanner, Cartographer, Signals, and VEX.

Below is an implementation-oriented architecture plan you can hand directly to agents.

---

## 1. Scope, goals, and non-negotiable invariants

### 1.1. Scope

Deliver a deterministic reachability pipeline for containers that:

1. Builds **call graphs** and **symbol usage maps** for:

   * Native binaries (ELF, PE, Mach-O), primary for this branch.
   * Scripted/VM languages later: JS, Python, PHP (as part of the same architecture).
2. Maps symbols and functions to:

   * Packages (purls).
   * Vulnerabilities (CVE → symbol/function list via Concelier/VEX data).
3. Computes **deterministic reachability states** for each `(vulnerability, artifact)` pair.
4. Emits:

   * Machine-readable JSON (with `purl`s).
   * Graph overlays for Cartographer.
   * Inputs for the lattice/trust engine and VEXer/Excitor.

### 1.2. Invariants

* **Deterministic replay**: Given the same:

  * Image digest(s),
  * Analyzer versions,
  * Config + policy,
  * Runtime trace inputs (if any),

  the same reachability outputs must be produced, bit-for-bit.
* **Idempotent, additive APIs**:

  * No versioning of endpoints, only additive/optional fields.
  * Same request = same response, no side effects besides storing/caching.
* **Lattice logic runs in `Scanner.WebService`**:

  * All “reachable/unreachable/unknown” and confidence merging lives in Scanner, not Concelier/Excitors.
* **Preserve prune source**:

  * Concelier and Excitors preserve provenance and do not “massage” reachability; they only consume it.
* **Offline, air-gap friendly**:

  * No mandatory external calls; dependency on local analyzers and local advisory/VEX cache.

---

## 2. High-level pipeline

From container image to reachability output:

1. **Image enumeration**
   `Scanner.WebService` receives an image ref or tarball and spawns an analysis run.
2. **Binary discovery & classification**
   Binary analyzers detect ELF/PE/Mach-O + main interpreters (python, node, php) and scripts.
3. **Symbolization & call graph building**

   * For each binary/module, we produce:

     * Symbol table (exported + imported).
     * Call graph edges (function-level where possible).
   * For dynamic languages, we later plug in appropriate analyzers.
4. **Symbol→package mapping**

   * Match symbols to packages and `purl`s using:

     * Known vendor symbol maps (from Concelier / Feedser).
     * Heuristics, path patterns, build IDs.
5. **Vulnerability→symbol mapping**

   * From Concelier/VEX/CSAF: map each CVE to the set of symbols/functions it affects.
6. **Reachability solving**

   * For each `(CVE, artifact)`:

     * Determine presence and reachability of affected symbols from known entrypoints.
     * Merge static call graph and runtime signals (if available) via deterministic lattice.
7. **Output & storage**

   * Reachability JSON with purls and confidence.
   * Graph overlay into Cartographer.
   * Signals/events for downstream scoring.
   * DSSE-signed reachability attestation for replay/audit.

---

## 3. Component architecture

### 3.1. New and extended services

1. **`StellaOps.Scanner.WebService` (extended)**

   * Orchestration of reachability analyses.
   * Lattice/merging engine.
   * Idempotent reachability APIs.

2. **`StellaOps.Scanner.Analyzers.Binary.*` (new)**

   * `…Binary.Discovery`: file type detection, ELF/PE/Mach-O parsing.
   * `…Binary.Symbolizer`: resolves symbols, imports/exports, relocations.
   * `…Binary.CallGraph.Native`: builds call graphs where possible (via disassembly/CFG).
   * `…Binary.CallGraph.DynamicStubs`: heuristics for indirect calls, PLT/GOT, vtables.

3. **`StellaOps.Scanner.Analyzers.Script.*` (future extension)**

   * `…Lang.JavaScript.CallGraph`
   * `…Lang.Python.CallGraph`
   * `…Lang.Php.CallGraph`
   * These emit the same generic call-graph IR.

4. **`StellaOps.Reachability.Engine` (within Scanner.WebService)**

   * Normalizes all call graphs into a common IR.
   * Merges static and dynamic evidence.
   * Computes reachability states and scores.

5. **`StellaOps.Cartographer.ReachabilityOverlay` (new overlay module)**

   * Stores per-artifact call graphs and reachability tags.
   * Provides graph queries for UI and policy tools.

6. **`StellaOps.Signals` (extended)**

   * Ingests runtime call traces (e.g., from EventPipe/JFR/eBPF in other branches).
   * Feeds function-hit events into the Reachability Engine.

7. **Unknowns Registry integration (optional but recommended)**

   * Stores unresolved symbol/package mappings and incomplete edges as `unknowns`.
   * Used to adjust risk scores (“Unknowns Pressure”) when binary analysis is incomplete.

---

## 4. Detailed design by layer

### 4.1. Static analysis layer (binaries)

#### 4.1.1. Binary discovery

Module: `StellaOps.Scanner.Analyzers.Binary.Discovery`

* Inputs:

  * Per-image file list (from existing Scanner).
  * Byte slices of candidate binaries.
* Logic:

  * Detect ELF/PE/Mach-O via magic bytes, not extensions.
  * Classify as:

    * Main executable
    * Shared library
    * Plugin/module
* Output:

  * `binary_manifest.json` per image:

    ```json
    {
      "image_ref": "registry/app@sha256:…",
      "binaries": [
        {
          "id": "bin:elf:/usr/local/bin/app",
          "path": "/usr/local/bin/app",
          "format": "elf",
          "arch": "x86_64",
          "role": "executable"
        }
      ]
    }
    ```

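
Detection by magic bytes can be sketched as below. This is an illustrative sketch, not the analyzer's actual code: the function names are hypothetical, only thin Mach-O files are covered (fat/universal binaries are deliberately left out), and the architecture table lists just a few common ELF `e_machine` values.

```python
import struct

# A few common ELF e_machine values; anything else is reported as unknown(N).
ELF_MACHINES = {3: "x86", 40: "arm", 62: "x86_64", 183: "aarch64"}

# Thin Mach-O magics (32/64-bit) as they appear on disk in either byte order.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
}


def detect_format(head: bytes):
    """Return "elf", "pe", "macho", or None from the leading bytes of a file."""
    if head[:4] == b"\x7fELF":
        return "elf"
    if head[:4] in MACHO_MAGICS:
        return "macho"
    if head[:2] == b"MZ" and len(head) >= 0x40:
        # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
        (pe_offset,) = struct.unpack_from("<I", head, 0x3C)
        if head[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "pe"
    return None


def elf_arch(head: bytes):
    """Read e_machine (offset 18), honoring the byte order flag in EI_DATA."""
    if len(head) < 20 or head[:4] != b"\x7fELF":
        return None
    endian = "<" if head[5] == 1 else ">"
    (machine,) = struct.unpack_from(endian + "H", head, 18)
    return ELF_MACHINES.get(machine, f"unknown({machine})")
```

Reading only the first few kilobytes of each candidate file is enough for this step, which keeps discovery cheap on large images. Files that match nothing here (for example shell scripts) fall through to the script and interpreter classification.
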
#### 4.1.2. Symbolization

Module: `StellaOps.Scanner.Analyzers.Binary.Symbolizer`

* Uses:

  * ELF/PE/Mach-O parsers (internal or third-party), no external calls.
* Output per binary:

  ```json
  {
    "binary_id": "bin:elf:/usr/local/bin/app",
    "build_id": "buildid:abcd…",
    "exports": ["pkg1::ClassA::method1", "..."],
    "imports": ["openssl::EVP_EncryptInit_ex", "..."],
    "sections": { "text": { "va": "0x...", "size": 12345 } }
  }
  ```
* Writes unresolved symbol sets to Unknowns Registry when:

  * Imports cannot be tied to known packages or symbols.

#### 4.1.3. Call graph construction

Module: `StellaOps.Scanner.Analyzers.Binary.CallGraph.Native`

* Core tasks:

  * Build control-flow graphs (CFG) for each function via:

    * Disassembly.
    * Basic block detection.
  * Identify direct calls (`call func`) and indirect calls (function pointers, vtables).
* IR model:

  ```json
  {
    "binary_id": "bin:elf:/usr/local/bin/app",
    "functions": [
      { "fid": "func:app::main", "va": "0x401000", "size": 128 },
      { "fid": "func:libssl::EVP_EncryptInit_ex", "external": true }
    ],
    "edges": [
      { "caller": "func:app::main", "callee": "func:app::init_config", "type": "direct" },
      { "caller": "func:app::main", "callee": "func:libssl::EVP_EncryptInit_ex", "type": "import" }
    ]
  }
  ```
* Edge confidence:

  * `type: direct|import|indirect|heuristic`
  * Used later by the lattice.

#### 4.1.4. Entry point inference

* Sources:

  * ELF header `e_entry` (with `PT_INTERP` identifying the dynamic loader for dynamically linked executables), PE `AddressOfEntryPoint`, Mach-O `LC_MAIN`.
  * Application-level hints (known frameworks, service main methods).
  * Container metadata (CMD, ENTRYPOINT).
* Output:

  ```json
  {
    "binary_id": "bin:elf:/usr/local/bin/app",
    "entrypoints": ["func:app::main"]
  }
  ```

> Note: For JS/Python/PHP, equivalent analyzers will later define module entrypoints (`index.js`, `wsgi_app`, `public/index.php`).

---

### 4.2. Symbol-to-package and CVE-to-symbol mapping

#### 4.2.1. Symbol→package mapping

Module: `StellaOps.Reachability.Mapping.SymbolToPurl`

* Inputs:

  * Binary symbolization outputs.
  * Local mapping DB in Concelier (vendor symbol maps, debug info, name patterns).
  * File path + container context (`/usr/lib/...`, `/site-packages/...`).
* Output:

  ```json
  {
    "symbol": "libssl::EVP_EncryptInit_ex",
    "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
    "confidence": 0.93,
    "method": "vendor_map+path_heuristic"
  }
  ```
* Unresolved / ambiguous symbols:

  * Stored as `unknowns` of type `identity_gap`.

#### 4.2.2. CVE→symbol mapping

Responsibility: Concelier + its advisory ingestion.

* For each vulnerability:

  ```json
  {
    "cve_id": "CVE-2025-12345",
    "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
    "affected_symbols": [
      "libssl::EVP_EncryptInit_ex",
      "libssl::EVP_EncryptUpdate"
    ],
    "source": "vendor_vex",
    "confidence": 1.0
  }
  ```
* Reachability Engine consumes this mapping read-only.

---

### 4.3. Reachability Engine

Module: `StellaOps.Reachability.Engine` (in Scanner.WebService)

#### 4.3.1. Core data model

Per `(artifact, cve, purl)`:

```json
{
  "artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
  "cve_id": "CVE-2025-12345",
  "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
  "symbols": [
    {
      "symbol": "libssl::EVP_EncryptInit_ex",
      "static_presence": "present|absent|unknown",
      "static_reachability": "reachable|unreachable|unknown",
      "runtime_hits": 3,
      "runtime_reachability": "observed|not_observed|unknown"
    }
  ],
  "reachability_state": "confirmed_reachable|statically_reachable|present_not_reachable|not_present|unknown",
  "confidence": {
    "p": 0.87,
    "evidence": ["static_callgraph", "runtime_trace", "symbol_map"],
    "unknowns_pressure": 0.12
  }
}
```

#### 4.3.2. Lattice / state machine

Define a deterministic lattice over states, ordered from lowest to highest:

* `NOT_PRESENT`
* `PRESENT_NOT_REACHABLE`
* `STATICALLY_REACHABLE`
* `RUNTIME_OBSERVED`

“Unknown” flags are overlaid on these states when evidence is missing.

Merging rules (simplified):

* If no affected symbol is present and there is no conflicting evidence → `NOT_PRESENT`.
* If at least one affected symbol is on a static path from any entrypoint → `STATICALLY_REACHABLE`.
* If symbol observed at runtime → `RUNTIME_OBSERVED` (top state).
* If symbol present in binary but not on any static path → `PRESENT_NOT_REACHABLE`, unless unknown edges exist near it (then keep the state but lower its confidence).
* Unknowns Registry entries near affected symbols increase `unknowns_pressure` and may push from `NOT_PRESENT` to `UNKNOWN`.

Implementation: pure functional merge functions inside Scanner.WebService:

```csharp
// Signatures only; both functions are pure (no I/O, clock, or randomness).
ReachabilityState Merge(ReachabilityState a, ReachabilityState b);

ReachabilityState FromEvidence(StaticEvidence s, RuntimeEvidence r, UnknownsPressure u);
```

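
The merge rules can be prototyped outside the service before committing to the C# types. The sketch below is written in Python for brevity and is only one possible reading of the rules: the `Verdict` type, the `unknown` overlay flag, the tri-state inputs, and the `pressure_threshold` default are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    NOT_PRESENT = 0
    PRESENT_NOT_REACHABLE = 1
    STATICALLY_REACHABLE = 2
    RUNTIME_OBSERVED = 3


@dataclass(frozen=True)
class Verdict:
    state: State
    unknown: bool = False  # overlay flag: evidence was missing or contested


def merge(a: Verdict, b: Verdict) -> Verdict:
    """Lattice join: the higher state wins and the unknown flag is sticky."""
    return Verdict(max(a.state, b.state), a.unknown or b.unknown)


def from_evidence(present, statically_reachable, runtime_hits,
                  unknowns_pressure, pressure_threshold=0.5) -> Verdict:
    """present / statically_reachable are True, False, or None (no evidence)."""
    if runtime_hits > 0:
        return Verdict(State.RUNTIME_OBSERVED)
    if statically_reachable:
        return Verdict(State.STATICALLY_REACHABLE)
    contested = unknowns_pressure >= pressure_threshold
    if present:
        return Verdict(State.PRESENT_NOT_REACHABLE,
                       unknown=contested or statically_reachable is None)
    return Verdict(State.NOT_PRESENT, unknown=contested or present is None)
```

Because `merge` is a maximum over a total order combined with a boolean OR, it is commutative, associative, and idempotent, so per-symbol verdicts can be folded in any order and still replay to the same result.
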
#### 4.3.3. Deterministic inputs

To guarantee replay:

* Build **Reachability Plan Manifest** per run:

  ```json
  {
    "plan_id": "reach:sha256:…",
    "scanner_version": "1.4.0",
    "analyzers": {
      "binary_discovery": "1.0.0",
      "binary_symbolizer": "1.1.0",
      "binary_callgraph": "1.2.0"
    },
    "inputs": {
      "image_digest": "sha256:…",
      "runtime_trace_files": ["signals:run:2025-11-18T12:00:00Z"],
      "config": {
        "assume_indirect_calls": "conservative",
        "max_call_depth": 10
      }
    }
  }
  ```
* DSSE-sign the plan + result.

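
One way to make `plan_id` content-derived is to hash a canonical serialization of the manifest. The sketch below assumes sorted-key, whitespace-free JSON as the canonical form and treats `runtime_trace_files` as an unordered set; neither choice is specified in the text, and DSSE signing is out of scope here.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def plan_id(manifest: dict) -> str:
    """Hash everything except plan_id itself, so the ID depends only on inputs."""
    body = {k: v for k, v in manifest.items() if k != "plan_id"}
    return "reach:sha256:" + hashlib.sha256(canonical_json(body)).hexdigest()


def seal(manifest: dict) -> dict:
    """Normalize order-insensitive inputs, then stamp the manifest with its ID."""
    inputs = dict(manifest["inputs"])
    inputs["runtime_trace_files"] = sorted(inputs.get("runtime_trace_files", []))
    sealed = {**manifest, "inputs": inputs}
    sealed["plan_id"] = plan_id(sealed)
    return sealed
```

With this shape, any change to an analyzer version or config value yields a new `plan_id`, while re-submitting the same inputs in a different key or list order yields the same one.
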

---

### 4.4. Storage and graph overlay

#### 4.4.1. Reachability store

Backend: re-use existing Scanner/Cartographer storage stack (e.g., Postgres or SQLite + blob store).

Tables/collections:

* `reachability_runs`

  * `plan_id`, `image_ref`, `created_at`, `scanner_version`.

* `reachability_results`

  * `plan_id`, `cve_id`, `purl`, `state`, `confidence_p`, `unknowns_pressure`, `payload_json`.

  * Indexes on `(image_ref, cve_id)`, `(image_ref, purl)`. Because `image_ref` is listed only on `reachability_runs`, these indexes imply either denormalizing `image_ref` into this table or joining through `plan_id`.

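
A minimal SQLite version of these two tables might look like the following. It is a sketch under assumptions: the column types, the composite primary key, and the denormalized `image_ref` column on `reachability_results` (added so the listed indexes can exist) are choices made for the example, not part of the design above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS reachability_runs (
    plan_id         TEXT PRIMARY KEY,
    image_ref       TEXT NOT NULL,
    created_at      TEXT NOT NULL,
    scanner_version TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS reachability_results (
    plan_id           TEXT NOT NULL REFERENCES reachability_runs(plan_id),
    image_ref         TEXT NOT NULL,  -- denormalized copy (assumption)
    cve_id            TEXT NOT NULL,
    purl              TEXT NOT NULL,
    state             TEXT NOT NULL,
    confidence_p      REAL NOT NULL,
    unknowns_pressure REAL NOT NULL,
    payload_json      TEXT NOT NULL,
    PRIMARY KEY (plan_id, cve_id, purl)
);
CREATE INDEX IF NOT EXISTS ix_results_image_cve
    ON reachability_results (image_ref, cve_id);
CREATE INDEX IF NOT EXISTS ix_results_image_purl
    ON reachability_results (image_ref, purl);
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_run(conn, run: dict) -> None:
    conn.execute(
        "INSERT OR IGNORE INTO reachability_runs VALUES "
        "(:plan_id, :image_ref, :created_at, :scanner_version)", run)


def save_result(conn, row: dict) -> None:
    """INSERT OR IGNORE makes a repeated write of the same result a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO reachability_results VALUES "
        "(:plan_id, :image_ref, :cve_id, :purl, :state, "
        ":confidence_p, :unknowns_pressure, :payload_json)", row)
```

Keying results on `(plan_id, cve_id, purl)` matches the per-`(artifact, cve, purl)` data model, since a `plan_id` already pins the artifact.
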
#### 4.4.2. Cartographer overlay

Edges:

* `IMAGE` → `BINARY` → `FUNCTION` → `PACKAGE` → `CVE`
* Extra property on `IMAGE -[AFFECTED_BY]-> CVE`:

  * `reachability_state`
  * `reachability_plan_id`

Enables queries:

* “Show me all CVEs with `STATICALLY_REACHABLE` in this namespace.”
* “Show me binaries with high density of reachable crypto CVEs.”

---

### 4.5. APIs (idempotent, additive)

#### 4.5.1. Trigger reachability

`POST /reachability/runs`

Request:

```json
{
  "artifact": { "type": "oci.image", "ref": "registry/app@sha256:…" },
  "config": {
    "include_languages": ["binary"],
    "max_call_depth": 10,
    "assume_indirect_calls": "conservative"
  }
}
```

Response:

```json
{ "plan_id": "reach:sha256:…" }
```

* Idempotency key: `(image_ref, config_hash)`. Subsequent calls with the same key return the same `plan_id`.

#### 4.5.2. Fetch results

`GET /reachability/runs/:plan_id`

```json
{
  "plan": { /* reachability plan manifest */ },
  "results": [
    {
      "cve_id": "CVE-2025-12345",
      "purl": "pkg:apk/alpine/openssl@3.1.5-r2",
      "reachability_state": "statically_reachable",
      "confidence": { "p": 0.84, "unknowns_pressure": 0.1 }
    }
  ]
}
```

#### 4.5.3. Per-CVE view for VEXer/Excitor

`GET /reachability/by-cve?artifact=…&cve_id=…`

* Returns filtered result for downstream VEX creation.

All APIs are **read-only** except for the side effect of storing/caching runs.

---

## 5. Interaction with other Stella Ops modules

### 5.1. Concelier

* Provides:

  * CVE→purl→symbol mapping.
  * Vendor VEX statements indicating affected functions.
* Consumes:

  * Nothing from reachability directly; `Scanner.WebService` passes the reachability summary to VEXer/Excitor, which merges it with vendor statements.

### 5.2. VEXer / Excitor

* Input:

  * For each `(artifact, cve)`:

    * Reachability state.
    * Confidence.
* Logic:

  * Translate states to VEX statements:

    * `NOT_PRESENT` → `not_affected`
    * `PRESENT_NOT_REACHABLE` → `not_affected` (with justification “code not reachable according to analysis”)
    * `STATICALLY_REACHABLE` → `affected`
    * `RUNTIME_OBSERVED` → `affected` (higher severity)
  * Attach determinism proof:

    * Plan ID + DSSE of reachability run.

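
The state translation is essentially a lookup table. The sketch below shows one hypothetical shape for it: the output dictionary is a simplified statement, not a complete OpenVEX or CSAF document, and the pairing of states with the standard `vulnerable_code_not_present` and `vulnerable_code_not_in_execute_path` justification labels, as well as the use of `under_investigation` for unknown verdicts, are assumptions.

```python
# state -> (VEX status, justification or None); the pairing is an assumption.
VEX_MAPPING = {
    "NOT_PRESENT": ("not_affected", "vulnerable_code_not_present"),
    "PRESENT_NOT_REACHABLE": ("not_affected", "vulnerable_code_not_in_execute_path"),
    "STATICALLY_REACHABLE": ("affected", None),
    "RUNTIME_OBSERVED": ("affected", None),
}


def to_vex_statement(cve_id, artifact_ref, state, plan_id, unknown=False):
    """Build a simplified VEX-like statement carrying the determinism proof."""
    if unknown:
        # Missing or contested evidence: do not assert not_affected.
        status, justification = "under_investigation", None
    else:
        status, justification = VEX_MAPPING[state]
    statement = {
        "vulnerability": cve_id,
        "product": artifact_ref,
        "status": status,
        "evidence": {
            "reachability_state": state,
            "reachability_plan_id": plan_id,
            "runtime_confirmed": state == "RUNTIME_OBSERVED",
        },
    }
    if justification is not None:
        statement["justification"] = justification
    return statement
```

Keeping `runtime_confirmed` as a separate evidence field is one way to express the "higher severity" distinction, since the VEX status itself is the same for both reachable states.
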
### 5.3. Signals

* Provides:

  * Function hit events: `(binary_id, function_id, timestamp)` aggregated per image.
* Reachability Engine:

  * Marks `runtime_hits` and state `RUNTIME_OBSERVED` for symbols with hits.
* Unknowns:

  * If runtime sees hits in functions with no static edges to entrypoints (or unmapped symbols), these produce Unknowns and increase `unknowns_pressure`.

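
The hand-off from runtime hits to Unknowns could be sketched as follows. The function name, the record fields, and the choice of `runtime_shadow` versus `identity_gap` for the two cases are assumptions made for illustration; only the `(binary_id, function_id, timestamp)` event shape comes from the text.

```python
def runtime_unknowns(hits, statically_reachable, symbol_to_purl):
    """Turn runtime hits that static analysis cannot explain into unknown records.

    hits: iterable of (binary_id, function_id, timestamp) events.
    statically_reachable: set of function IDs reachable from entrypoints.
    symbol_to_purl: mapping of function ID to package URL.
    """
    counts = {}
    for binary_id, function_id, _timestamp in hits:
        key = (binary_id, function_id)
        counts[key] = counts.get(key, 0) + 1

    unknowns = []
    for (binary_id, function_id), hit_count in sorted(counts.items()):
        if function_id not in symbol_to_purl:
            unknown_type, raw = "identity_gap", "runtime hit on unmapped symbol"
        elif function_id not in statically_reachable:
            unknown_type, raw = "runtime_shadow", "runtime hit with no static path from entrypoints"
        else:
            continue  # explained by the static call graph
        unknowns.append({
            "unknown_type": unknown_type,
            "scope": {"binary_id": binary_id, "symbol": function_id, "phase": "runtime"},
            "evidence": {"raw": raw},
            "exposure_hints": {"runtime_hits": hit_count},
        })
    return unknowns
```

Sorting the aggregated keys before emitting records keeps the output order independent of event arrival order, which matters for bit-for-bit replay.
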
### 5.4. Unknowns Registry

* From reachability pipeline, create Unknowns when:

  * Symbol→package mapping is ambiguous.
  * CVE→symbol mapping exists, but symbol cannot be found in binaries.
  * Call graph has indirect calls that cannot be resolved.
* The “Unknowns Pressure” term is fed into:

  * Reachability confidence.
  * Global risk scoring (Trust Algebra Studio).

---

## 6. Implementation phases and engineering plan

### Phase 0 – Scaffolding & manifests (1 sprint)

* Create:

  * `StellaOps.Reachability.Engine` skeleton.
  * Reachability Plan Manifest schema.
  * Reachability Run + Result persistence.
* Add `/reachability/runs` and `/reachability/runs/:plan_id` endpoints, returning mock data.
* Wire DSSE attestation generation for reachability results (even if payload is empty).

### Phase 1 – Binary discovery + symbolization (1–2 sprints)

* Implement `Binary.Discovery` and `Binary.Symbolizer`.
* Feed symbol tables into Reachability Engine as “presence-only evidence”:

  * States: `NOT_PRESENT` vs `PRESENT_NOT_REACHABLE` vs `UNKNOWN`.
* Integrate with Concelier’s CVE→purl mapping (no symbol-level yet):

  * For CVEs affecting a package present in the image, mark as `PRESENT_NOT_REACHABLE`.
* Emit Unknowns for unresolved binary roles and ambiguous package mapping.

Deliverable: package-level reachability with deterministic manifests.

### Phase 2 – Binary call graphs & entrypoints (2–3 sprints)

* Implement `Binary.CallGraph.Native`:

  * CFG + direct call edges.
* Implement entrypoint inference from binary + container ENTRYPOINT/CMD.
* Add static reachability algorithm:

  * DFS/BFS from entrypoints through call graph.
  * Mark affected symbols as reachable if found on paths.
* Extend Concelier to ingest symbol-aware vulnerability metadata (for pilots; can be partial).

Deliverable: function-level static reachability for native binaries where symbol maps exist.

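
The static reachability step is a plain breadth-first search over the call-graph IR from section 4.1.3. The sketch below makes a few assumptions worth flagging: only `direct` and `import` edges are followed by default, `max_call_depth` bounds the search as in the plan manifest config, and an affected symbol `X` is matched to the function ID `func:X`, which is inferred from the examples rather than specified.

```python
from collections import deque


def reachable_functions(callgraph, entrypoints, max_call_depth=10,
                        edge_types=("direct", "import")):
    """BFS from the entrypoints; returns {function_id: shortest call depth}."""
    adjacency = {}
    for edge in callgraph["edges"]:
        if edge["type"] in edge_types:
            adjacency.setdefault(edge["caller"], []).append(edge["callee"])
    for callees in adjacency.values():
        callees.sort()  # fixed visiting order keeps the result deterministic

    depth = {fid: 0 for fid in sorted(entrypoints)}
    queue = deque(sorted(entrypoints))
    while queue:
        current = queue.popleft()
        if depth[current] >= max_call_depth:
            continue  # do not expand beyond the configured depth
        for callee in adjacency.get(current, []):
            if callee not in depth:
                depth[callee] = depth[current] + 1
                queue.append(callee)
    return depth


def static_reachability(callgraph, entrypoints, affected_symbols):
    """Map each affected symbol to "reachable" or "unreachable"."""
    depth = reachable_functions(callgraph, entrypoints)
    return {
        symbol: "reachable" if "func:" + symbol in depth else "unreachable"
        for symbol in sorted(affected_symbols)
    }
```

When `indirect` or `heuristic` edges were skipped, "unreachable" here really means "no path over trusted edges"; a fuller version would report `unknown` in that case and record the unresolved edges in the Unknowns Registry.
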
### Phase 3 – Runtime integration (2 sprints, may be in parallel workstream)

* Integrate Signals runtime evidence:

  * Define schema for function hit events.
  * Add ingestion path into Reachability Engine.
* Update lattice:

  * Promote symbols to `RUNTIME_OBSERVED` when hits exist.
* Extend DSSE attestation to reference runtime evidence URIs (hashes of trace inputs).

Deliverable: static + runtime-confirmed reachability.

### Phase 4 – Unknowns & pressure (1 sprint)

* Wire Unknowns Registry:

  * Emit unknowns from Symbolizer and CallGraph (identity gaps, missing edges).
* Compute `unknowns_pressure` per `(artifact, cve)` as density of unknowns near affected symbols.
* Adjust confidence calculation in Reachability Engine.
* Expose unknowns metrics in API and Cartographer.

Deliverable: explicit modelling of uncertainty, feeding into trust/lattice.

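
"Density of unknowns near affected symbols" needs a concrete definition before it can be computed. The sketch below picks one: the fraction of call-graph nodes within `radius` hops of any affected symbol (ignoring edge direction) that carry at least one unknown. The definition, the default radius, and the rounding are all assumptions for illustration.

```python
from collections import deque


def neighborhood(edges, seeds, radius):
    """Nodes within `radius` hops of any seed, treating edges as undirected."""
    adjacency = {}
    for caller, callee in edges:
        adjacency.setdefault(caller, set()).add(callee)
        adjacency.setdefault(callee, set()).add(caller)

    distance = {seed: 0 for seed in seeds}
    queue = deque(sorted(seeds))
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def unknowns_pressure(edges, affected, unknown_nodes, radius=2):
    """Share of the affected symbols' neighborhood that carries an unknown."""
    near = neighborhood(edges, affected, radius)
    if not near:
        return 0.0
    return round(len(near & set(unknown_nodes)) / len(near), 4)
```

Rounding to a fixed number of decimals keeps the stored value stable across platforms, which helps the bit-for-bit replay invariant.
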
### Phase 5 – Language extensions (JS/Python/PHP) (ongoing)

* Implement per-language call-graph analyzers creating the same IR as binary.
* Extend symbol→purl mapping for these ecosystems (npm, PyPI, Packagist).
* Update reachability solver to include multi-language edges (e.g., Python calling into native modules).

---

## 7. Minimal contracts for agents

To hand off to agents, you can codify:

1. **IR schemas**

   * Call graph IR.
   * Reachability Result JSON.
   * Reachability Plan Manifest.
2. **API contracts**

   * `POST /reachability/runs`
   * `GET /reachability/runs/:plan_id`
   * `GET /reachability/by-cve`
3. **Module boundaries**

   * `Scanner.Analyzers.Binary.*` produce IR only; NO network calls.
   * `Reachability.Engine` is the only place where lattice logic lives.
   * `Concelier` is read-only for reachability; no custom logic there.
4. **Determinism practices**

   * All algorithmic randomness is banned; where unavoidable, seed with values derived from `plan_id`.
   * All external inputs must be listed in the Plan Manifest.

If you like, as a next step I can draft:

* Concrete C# record types for the IRs.
* A small pseudo-code implementation of the lattice functions and static reachability DFS.
* A proposed directory layout under `src/StellaOps.Scanner` and `src/StellaOps.Cartographer`.