Here’s Epic 5 in the same paste‑into‑repo, implementation‑ready format as the prior epics. It’s exhaustive, formal, and designed to slot into AOC, Policy Engine, Conseiller/Excitator, and the Console.

---

# Epic 5: SBOM Graph Explorer

> Short name: **Graph Explorer**
> Services touched: **SBOM Service**, **Graph Indexer** (new), **Graph API** (new), **Policy Engine**, **Conseiller (Feedser)**, **Excitator (Vexer)**, **Web API Gateway**, **Authority** (authN/Z), **Workers/Scheduler**, **Telemetry**
> Surfaces: **Console (Web UI)** graph module, **CLI**, **Exports**
> Deliverables: Interactive graph UI with semantic zoom, saved queries, policy/VEX/advisory overlays, diff views, impact analysis, exports

---
## 1) What it is

**SBOM Graph Explorer** is the interactive, tenant‑scoped view of all supply‑chain relationships the platform knows about, rendered as a navigable graph. It connects:

* **Artifacts** (applications, images, libs), **Packages/Versions**, **Files/Paths**, **Licenses**, **Advisories** (from Conseiller), **VEX statements** (from Excitator), **Provenance** (builds, sources), and **Policies** (overlays of determinations)
* **Edges** like `depends_on`, `contains`, `built_from`, `declared_in`, `affected_by`, `vex_exempts`, `governs_with`
* **Time/version** dimension: multiple SBOM snapshots with diffs

It’s built for investigation and review: find where a vulnerable package enters; see which apps are impacted; understand why a finding exists; simulate a policy version and see the delta. The explorer observes **AOC enforcement**: it never mutates facts; it aggregates and visualizes them. Only the Policy Engine may classify, and classification is displayed as overlays.

---
## 2) Why

* SBOMs are graphs. Tables flatten what matters and hide transitive risk.
* Engineers, security, and auditors need impact answers quickly: “What pulls in `log4j:2.17` and where is it at runtime?”
* Policy/VEX/advisory interactions are nuanced. A visual overlay makes precedence and outcomes obvious.
* Review is collaborative; you need saved queries, deep links, exports, and consistent evidence.

---
## 3) How it should work (maximum detail)

### 3.1 Domain model

**Nodes** (typed, versioned, tenant‑scoped):

* `Artifact`: application, service, container image, library, module
* `Package`: name + ecosystem (purl), `PackageVersion` with resolved version
* `File`: path within artifact or image layer
* `License`: SPDX id
* `Advisory`: normalized advisory id (GHSA, CVE, vendor), source = Conseiller
* `VEX`: statement with product context, status, justification, source = Excitator
* `SBOM`: ingestion unit; includes metadata (tool, sha, build info)
* `PolicyDetermination`: materialized view of Policy Engine results (read‑only overlay)
* `Build`: provenance, commit, workflow run
* `Source`: repo, tag, commit

**Edges** (directed):

* `declared_in` (PackageVersion → SBOM)
* `contains` (Artifact → PackageVersion | File)
* `depends_on` (PackageVersion → PackageVersion) with scope attr (prod|dev|test|optional)
* `built_from` (Artifact → Build), `provenance_of` (Build → Source)
* `affected_by` (PackageVersion → Advisory) with range semantics
* `vex_exempts` (Advisory ↔ VEX) scoped by product/component
* `licensed_under` (Artifact|PackageVersion → License)
* `governs_with` (Artifact|PackageVersion → PolicyDetermination)
* `derived_from` (SBOM → SBOM) for superseding snapshots

**Identity & versioning**

* Every node has a stable key: `{tenant}:{type}:{natural_id}` (e.g., purl for packages, digest for images).
* SBOM snapshots are immutable; edges carry `valid_from`/`valid_to` for time travel and diffing.
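As a rough illustration of the identity and versioning rules above, the following Python sketch builds a node key and filters edges by their validity window. The names `node_key`, `Edge`, and `edges_at`, the half-open interval convention, and the use of `None` for an open-ended `valid_to` are assumptions made for this example; the spec only fixes the key shape and the two field names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def node_key(tenant: str, node_type: str, natural_id: str) -> str:
    """Build the stable key `{tenant}:{type}:{natural_id}`."""
    if not tenant or not node_type or not natural_id:
        raise ValueError("tenant, node_type and natural_id are required")
    if ":" in tenant or ":" in node_type:
        # Assumption: only natural_id may contain ':' (purls and digests do),
        # so the first two segments stay unambiguous when parsing the key.
        raise ValueError("tenant and node_type must not contain ':'")
    return f"{tenant}:{node_type}:{natural_id}"


@dataclass(frozen=True)
class Edge:
    from_id: str
    to_id: str
    type: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None = still valid (assumption)

    def valid_at(self, instant: datetime) -> bool:
        """True if the edge existed at `instant` (valid_from <= t < valid_to)."""
        if instant < self.valid_from:
            return False
        return self.valid_to is None or instant < self.valid_to


def edges_at(edges: Iterable[Edge], instant: datetime) -> List[Edge]:
    """Time travel: keep only the edges that were valid at `instant`."""
    return [edge for edge in edges if edge.valid_at(instant)]


if __name__ == "__main__":
    print(node_key("acme", "package", "pkg:npm/lodash@4.17.21"))
    start = datetime(2025, 3, 15, tzinfo=timezone.utc)
    print(edges_at([Edge("a", "b", "depends_on", start)], start))
```

Keeping `natural_id` as the last segment means a purl such as `pkg:npm/lodash@4.17.21` can be embedded unescaped.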

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

### 3.2 User capabilities (end‑to‑end)

* **Search & Navigate**: global search (purls, CVEs, repos, licenses), keyboard nav, breadcrumbs, semantic zoom.
* **Lenses**: toggle views (Security, License, Provenance, Runtime vs Dev, Policy effect).
* **Overlays**:

  * **Advisory overlay**: show affected nodes/edges with source, severity, ranges.
  * **VEX overlay**: show suppressions/justifications; collapse exempted paths.
  * **Policy overlay**: choose a policy version; nodes/edges reflect determinations (severity, status) with explain sampling.
* **Impact analysis**: pick a vulnerable node; highlight upstream/downstream dependents, scope filters, shortest/all paths with constraints.
* **Diff view**: compare SBOM A vs B; show added/removed nodes/edges, changed versions, changed determinations.
* **Saved queries**: visual builder + JSON query; shareable permalinks scoped by tenant and environment.
* **Exports**: GraphML, CSV edge list, NDJSON of findings, PNG/SVG snapshot.
* **Evidence details**: side panel with raw facts, advisory links, VEX statements, policy explain trace, provenance.
* **Accessibility**: tab‑navigable, high‑contrast, screen‑reader labels for nodes and sidebars.
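The impact analysis capability listed above can be pictured as a bounded reverse traversal over `depends_on` and `contains` edges. The sketch below is one possible shape, not the planned implementation: the tuple layout `(from_id, to_id, edge_type, scope)`, the function name `impacted_by`, and the default of production scope only are all assumptions.

```python
from collections import defaultdict, deque
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

EdgeTuple = Tuple[str, str, str, Optional[str]]  # (from_id, to_id, type, scope)


def impacted_by(
    edges: Iterable[EdgeTuple],
    vulnerable_id: str,
    scopes: FrozenSet[str] = frozenset({"prod"}),
    max_depth: int = 6,
) -> Dict[str, int]:
    """Return every node that reaches `vulnerable_id`, with its hop distance."""
    # Index edges by target so the graph can be walked "upstream".
    reverse = defaultdict(list)
    for from_id, to_id, edge_type, scope in edges:
        if edge_type == "depends_on" and scope not in scopes:
            continue  # scope filter applies to dependency edges only
        if edge_type in ("depends_on", "contains"):
            reverse[to_id].append(from_id)

    distance = {vulnerable_id: 0}
    queue = deque([vulnerable_id])
    while queue:
        current = queue.popleft()
        if distance[current] == max_depth:
            continue  # depth budget reached on this branch
        for parent in reverse.get(current, ()):
            if parent not in distance:
                distance[parent] = distance[current] + 1
                queue.append(parent)

    del distance[vulnerable_id]  # report dependents only
    return distance


if __name__ == "__main__":
    sample = [
        ("artifact:svc-a", "pv:app-lib@1", "contains", None),
        ("pv:app-lib@1", "pv:log4j@2.14", "depends_on", "prod"),
        ("pv:test-util@1", "pv:log4j@2.14", "depends_on", "test"),
    ]
    print(impacted_by(sample, "pv:log4j@2.14"))
```

Breadth-first order yields the shortest hop count per dependent, which is the number a UI would most likely show next to each highlighted node.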

### 3.3 Query model

* **Visual builder** for common queries:

  * “Show all paths from Artifact X to Advisory Y up to depth 6.”
  * “All runtime dependencies with license = GPL‑3.0.”
  * “All artifacts affected by GHSA‑… with no applicable VEX.”
  * “Which SBOMs introduced/removed `openssl` between build 120 and 130?”
* **JSON query** (internal, POST body) with:

  * `start`: list of node selectors (type + id or attributes)
  * `expand`: edge types and depth, direction, scope filters
  * `where`: predicates on node/edge attributes
  * `overlay`: policy version id, advisory sources, VEX filters
  * `limit`: nodes, edges, timebox, cost budget
* **Cost control**: server estimates cost, denies or pages results; UI streams partial graph tiles.
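To make the JSON query shape above more concrete, here is a hypothetical query body together with a minimal validator. Only the five top-level keys (`start`, `expand`, `where`, `overlay`, `limit`) and the depth cap of 6 come from this document; every nested field name (`edge_types`, `direction`, `scopes`, `policy_version`, and so on) is an assumption chosen for illustration, and the real schema belongs in `/docs/sbom/graph-query-language.md`.

```python
from typing import List

ALLOWED_KEYS = {"start", "expand", "where", "overlay", "limit"}
MAX_DEPTH = 6  # interactive depth cap stated in this document

# Hypothetical query: nested field names are illustrative only.
EXAMPLE_QUERY = {
    "start": [{"type": "artifact", "id": "service-a"}],
    "expand": {
        "edge_types": ["contains", "depends_on"],
        "depth": 4,
        "direction": "out",
        "scopes": ["prod"],
    },
    "where": {"license": "GPL-3.0"},
    "overlay": {"policy_version": "1.3.0"},
    "limit": {"nodes": 5000, "edges": 20000},
}


def validate_query(query: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the query is accepted."""
    errors = []
    unknown = set(query) - ALLOWED_KEYS
    if unknown:
        errors.append(f"unknown keys: {sorted(unknown)}")

    start = query.get("start")
    if not isinstance(start, list) or not start:
        errors.append("start must be a non-empty list of node selectors")
    else:
        for index, selector in enumerate(start):
            if not isinstance(selector, dict) or "type" not in selector:
                errors.append(f"start[{index}] needs a 'type'")

    expand = query.get("expand", {})
    if not isinstance(expand, dict):
        errors.append("expand must be an object")
    else:
        depth = expand.get("depth", 1)
        is_int = isinstance(depth, int) and not isinstance(depth, bool)
        if not is_int or not 0 <= depth <= MAX_DEPTH:
            errors.append(f"expand.depth must be an integer in 0..{MAX_DEPTH}")
    return errors


if __name__ == "__main__":
    print(validate_query(EXAMPLE_QUERY))
```

A validator along these lines would plausibly back the CLI's "query validation error" exit code, rejecting a request before any cost estimation runs.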

### 3.4 UI architecture (Console)

* **Canvas**: WebGL renderer with level‑of‑detail, edge bundling, and label culling; deterministic layout when possible (seeded).
* **Semantic zoom**:

  * Far: clusters by artifact/repo/ecosystem, color by lens
  * Mid: package groups, advisory badges, license swatches
  * Near: concrete versions, direct edges, inline badges for policy determinations
* **Panels**:

  * Left: search, filters, lens selector, saved queries
  * Right: details, explain trace, evidence tabs (Advisory/VEX/Policy/Provenance)
  * Bottom: query expression, diagnostics, performance/stream status
* **Diff mode**: split or overlay, color legend (add/remove/changed), filter by node type.
* **Deep links**: URL encodes query + viewport; shareable respecting RBAC.
* **Keyboard**: space drag, +/- zoom, F to focus, G to expand neighbors, P to show paths.

### 3.5 Back‑end architecture

**Graph Indexer (new)**

* Consumes SBOM ingests, Conseiller advisories, Excitator VEX statements, Policy Engine determinations (read‑only).
* Projects facts into a **property graph** persisted in:

  * Primary: document store + adjacency sets (e.g., Mongo collections + compressed adjacency lists)
  * Optional driver for graph DB backends if needed (pluggable)
* Maintains materialized aggregates: degree, critical paths cache, affected artifact counts, license distribution.
* Emits **graph snapshots** per SBOM with lineage to original ingestion.

**Graph API (new)**

* Endpoints for search, neighbor expansion, path queries, diffs, overlays, exports.
* Streaming responses for large graphs (chunked NDJSON tiles).
* Cost accounting + quotas per tenant.

**Workers**

* **Centrality & clustering** precompute on idle: betweenness approximations, connected components, Louvain clusters.
* **Diff compute** on new SBOM ingestion pairs (previous vs current).
* **Overlay materialization** cache for popular policy versions.

**Policy Engine integration**

* Graph API requests can specify a policy version.
* For sampled nodes, the API fetches explain traces; for counts, uses precomputed overlay materializations where available.

**AOC enforcement**

* Graph Indexer never merges or edits advisories/VEX; it links them and exposes overlays that the Policy Engine evaluates.
* Conseiller and Excitator remain authoritative sources; severities come from Policy‑governed normalization.
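The diff worker mentioned in this section compares a previous and a current snapshot. A minimal sketch of that comparison follows, assuming each snapshot is available as a set of versioned purls plus a set of edge tuples, and that a package appears with at most one version per snapshot. The helper names (`split_purl`, `diff_packages`, `diff_edges`) are hypothetical, and the purl handling is deliberately simplified (no qualifiers or subpaths).

```python
from typing import Dict, Set, Tuple


def split_purl(purl: str) -> Tuple[str, str]:
    """Split 'pkg:type/name@version' into (name part, version); simplified."""
    name, _, version = purl.rpartition("@")
    return (name, version) if name else (purl, "")


def diff_packages(purls_a: Set[str], purls_b: Set[str]) -> Dict[str, object]:
    """Classify packages as added, removed, or changed between snapshots A and B."""
    a = dict(split_purl(p) for p in purls_a)
    b = dict(split_purl(p) for p in purls_b)
    return {
        "added": sorted(name for name in b if name not in a),
        "removed": sorted(name for name in a if name not in b),
        "changed": {
            name: (a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]
        },
    }


def diff_edges(edges_a: Set[tuple], edges_b: Set[tuple]) -> Dict[str, list]:
    """Edges are immutable facts, so an edge diff is a plain set difference."""
    return {
        "added": sorted(edges_b - edges_a),
        "removed": sorted(edges_a - edges_b),
    }


if __name__ == "__main__":
    before = {"pkg:npm/lodash@4.17.20", "pkg:generic/openssl@1.1.1"}
    after = {"pkg:npm/lodash@4.17.21", "pkg:npm/left-pad@1.3.0"}
    print(diff_packages(before, after))
```

Treating a version bump as "changed" rather than as one removal plus one addition matches the three-way legend (add/remove/changed) described for diff mode.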

### 3.6 APIs (representative)

* `GET /graph/search?q=...&type=package|artifact|advisory|license`
* `POST /graph/query` ⇒ stream tiles `{nodes[], edges[], stats, cursor}`
* `POST /graph/paths` body: `{from, to, depth<=6, constraints{scope, runtime_only}}`
* `POST /graph/diff` body: `{sbom_a, sbom_b, filters}`
* `GET /graph/snapshots/{sbom_id}` ⇒ graph metadata, counts, top advisories
* `POST /graph/export` body: `{format: graphml|csv|ndjson|png|svg, query|snapshot}`
* `GET /graph/saved` / `POST /graph/saved` list and save tenant queries
* `GET /graph/overlays/policy/{version_id}` ⇒ summary stats for caching

All endpoints are tenant‑scoped and RBAC‑checked. The server enforces timeouts and pagination. Errors return structured diagnostics.
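A client consuming the tile stream from `POST /graph/query` has to merge tiles as they arrive. The sketch below assumes one JSON tile per NDJSON line with the documented top-level fields `nodes`, `edges`, `stats`, and `cursor`; the per-node `id` field and the per-edge `from`/`to`/`type` fields are assumptions, as is the function name `merge_tiles`.

```python
import json
from typing import Dict, Iterable, Optional, Tuple


def merge_tiles(lines: Iterable[str]) -> Tuple[Dict[str, dict], Dict[tuple, dict], Optional[str]]:
    """Fold NDJSON tiles into de-duplicated node and edge maps plus the last cursor."""
    nodes: Dict[str, dict] = {}
    edges: Dict[tuple, dict] = {}
    cursor: Optional[str] = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate keep-alive blank lines
        tile = json.loads(raw)
        for node in tile.get("nodes", []):
            nodes[node["id"]] = node  # later tiles overwrite earlier copies
        for edge in tile.get("edges", []):
            edges[(edge["from"], edge["to"], edge["type"])] = edge
        cursor = tile.get("cursor", cursor)
    return nodes, edges, cursor


if __name__ == "__main__":
    stream = [
        '{"nodes": [{"id": "n1"}], "edges": [], "cursor": "c1"}',
        '{"nodes": [{"id": "n2"}], "edges": [{"from": "n1", "to": "n2", "type": "depends_on"}], "cursor": "c2"}',
    ]
    merged_nodes, merged_edges, last_cursor = merge_tiles(stream)
    print(len(merged_nodes), len(merged_edges), last_cursor)
```

Because `merge_tiles` accepts any iterable of lines, it can be fed a streamed HTTP response line by line, so rendering can start before the full graph has arrived.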

### 3.7 CLI

```
stella sbom graph search "purl:pkg:maven/org.apache.logging.log4j/log4j-core"
stella sbom graph query --file ./query.json --export graphml > graph.graphml
stella sbom graph impacted --advisory GHSA-xxxx --runtime-only --limit 100
stella sbom graph paths --from artifact:service-a --to advisory:GHSA-xxxx --depth 5 --policy 1.3.0
stella sbom graph diff --sbom-a 2025-03-15T10:00Z --sbom-b 2025-03-22T10:00Z --export csv > diff.csv
stella sbom graph save --name "openssl-runtime" --file ./query.json
```

Exit codes: 0 ok, 2 query validation error, 3 over‑budget, 4 not found, 5 RBAC denied.
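For CI pipelines that wrap the CLI, the exit codes above can be captured in a small enum. This is a sketch for consumers, not part of the CLI itself; the enum member names and the `describe` helper are assumptions, and codes the document does not list (including 1) are reported as unknown rather than guessed.

```python
from enum import IntEnum


class GraphExitCode(IntEnum):
    """Exit codes listed for `stella sbom graph *` in this document."""
    OK = 0
    QUERY_VALIDATION_ERROR = 2
    OVER_BUDGET = 3
    NOT_FOUND = 4
    RBAC_DENIED = 5


def describe(code: int) -> str:
    """Map a process exit status to a stable lowercase label."""
    try:
        return GraphExitCode(code).name.lower()
    except ValueError:
        return "unknown"  # not documented here, so do not interpret it


if __name__ == "__main__":
    for status in (0, 3, 1):
        print(status, describe(status))
```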

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

### 3.8 Performance & scale

* **Progressive loading**: server pages tiles by BFS frontier; client renders incrementally.
* **Viewport culling**: only visible nodes/edges in canvas; offscreen demoted to aggregates.
* **Level‑of‑detail**: simplified glyphs and collapsed clusters at distance.
* **Query budgets**: per‑tenant rate + node/edge caps; interactive paths limited to depth ≤ 6.
* **Caching**: hot queries memoized per tenant + overlay version; diffs precomputed for consecutive SBOMs.
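Progressive loading by BFS frontier, combined with depth and node budgets, could look roughly like the generator below. It is a sketch under assumptions: the adjacency map is an in-memory dict, the tile shape is reduced to `nodes` plus a `truncated` flag, and the parameter names (`tile_size`, `node_budget`) are invented for the example.

```python
from collections import deque
from typing import Dict, Iterator, List


def bfs_tiles(
    adjacency: Dict[str, List[str]],
    start: str,
    max_depth: int = 6,
    tile_size: int = 2,
    node_budget: int = 1000,
) -> Iterator[dict]:
    """Yield tiles of at most `tile_size` nodes in breadth-first order."""
    seen = {start}
    frontier = deque([(start, 0)])
    tile: List[str] = []
    emitted = 0
    while frontier and emitted < node_budget:
        node, depth = frontier.popleft()
        tile.append(node)
        emitted += 1
        if len(tile) == tile_size:
            yield {"nodes": tile, "truncated": False}
            tile = []
        if depth < max_depth:  # nodes beyond the depth cap are never queued
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
    # Anything still queued was cut off by the node budget.
    truncated = bool(frontier)
    if tile or truncated:
        yield {"nodes": tile, "truncated": truncated}


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
    for t in bfs_tiles(graph, "a", max_depth=2):
        print(t)
```

The final tile carries the truncation flag, which lets a client show an "over budget" hint instead of silently presenting a partial graph as complete.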

### 3.9 Security

* Multi‑tenant isolation at storage and API layers.
* RBAC roles:

  * **Viewer**: browse graphs, saved queries
  * **Investigator**: run queries, export data
  * **Operator**: configure budgets, purge caches
  * **Auditor**: download evidence bundles
* Input validation for query JSON; deny disallowed edge traversals; strict CSP in web app.
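The role list above maps naturally onto a permission matrix. The sketch below is illustrative only: the permission strings (`graph:browse`, `cache:purge`, and so on) are invented labels, and it assumes roles are not cumulative, so a user needs both Viewer and Investigator to browse and export. Whether roles inherit from each other is not stated in this document.

```python
from typing import Dict, Iterable, Set

# Hypothetical permission names derived from the role descriptions above.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "viewer": {"graph:browse", "saved_query:read"},
    "investigator": {"graph:query", "graph:export"},
    "operator": {"budget:configure", "cache:purge"},
    "auditor": {"evidence:download"},
}


def is_allowed(roles: Iterable[str], permission: str) -> bool:
    """True if any of the caller's roles grants `permission`; unknown roles grant nothing."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


if __name__ == "__main__":
    print(is_allowed(["viewer"], "graph:export"))
    print(is_allowed(["viewer", "investigator"], "graph:export"))
```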

### 3.10 Observability

* Metrics: tile latency, nodes/edges per tile, cache hit rate, query denials, memory pressure.
* Logs: structured, include query hash, cost, truncation flags.
* Traces: server spans per stage (parse, plan, fetch, overlay, stream).

### 3.11 Accessibility & UX guarantees

* Keyboard complete, ARIA roles for graph and panels, high‑contrast theme.
* Deterministic layout on reload for shareable investigations.
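One way to get a deterministic layout on reload is to derive the layout seed from the query itself and to place nodes in a fixed order. The sketch below shows the idea in Python for brevity; the Console renderer would need an equivalent in its own language. Deriving the seed from a SHA-256 of canonical JSON, and the names `layout_seed` and `initial_positions`, are assumptions rather than anything this document prescribes.

```python
import hashlib
import json
import random
from typing import Dict, Iterable, Tuple


def layout_seed(query: dict) -> int:
    """Derive a stable 64-bit seed from the canonical JSON form of a query."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def initial_positions(node_ids: Iterable[str], seed: int) -> Dict[str, Tuple[float, float]]:
    """Assign starting coordinates in [0, 1) that depend only on the seed and the id set."""
    rng = random.Random(seed)
    # Sorting makes the result independent of the order tiles arrived in.
    return {node_id: (rng.random(), rng.random()) for node_id in sorted(node_ids)}


if __name__ == "__main__":
    seed = layout_seed({"start": [{"type": "artifact", "id": "service-a"}]})
    print(initial_positions(["n2", "n1"], seed))
```

A force-directed layout started from these positions and run for a fixed number of iterations would then be reproducible, provided the same data is loaded.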

### 3.12 Data retention

* Graph nodes derived from SBOMs share retention with SBOM artifacts; overlays are ephemeral caches.
* Saved queries retained until deleted; references to missing objects show warnings.

---
## 4) Implementation plan

### 4.1 Services

* **Graph Indexer (new microservice)**

  * Subscribes to SBOM ingest events, Conseiller advisory updates, Excitator VEX updates, Policy overlay materializations.
  * Builds adjacency lists and node documents; computes aggregates and clusters.

* **Graph API (new microservice)**

  * Validates and executes queries; streams tiles; composes overlays; serves diffs and exports.
  * Integrates with Policy Engine for explain sample retrieval.

* **SBOM Service (existing)**

  * Emits ingestion events with SBOM ids and lineage; exposes SBOM metadata to Graph API.

* **Web API Gateway**

  * Routes `/graph/*`, injects tenant context, enforces RBAC.

### 4.2 Console (Web UI) feature module

* `packages/features/graph-explorer`

  * Canvas renderer (WebGL), panels, query builder, diff mode, overlays, exports.
  * Deep‑link router and viewport state serializer.

### 4.3 Workers

* Centrality/clustering worker, diff worker, overlay materialization worker.
* Schedules on low‑traffic windows; backpressure aware.

### 4.4 Data model (storage)

* Collections:

  * `graph_nodes`: `{_id, tenant, type, natural_id, attrs, degree, created_at, updated_at}`
  * `graph_edges`: `{_id, tenant, from_id, to_id, type, attrs, valid_from, valid_to}`
  * `graph_snapshots`: per‑SBOM node/edge references
  * `graph_saved_queries`: `{_id, tenant, name, query_json, created_by}`
  * `graph_overlays_cache`: keyed by `{tenant, policy_version, hash(query)}`
* Indexes: compound on `{tenant, type, natural_id}`, `{tenant, from_id}`, `{tenant, to_id}`, time bounds.
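The `{tenant, from_id}` and `{tenant, to_id}` indexes above exist so that neighbor lookups never cross tenants. An in-memory analogue, assuming edge documents shaped like the `graph_edges` collection, is sketched below; the function name `build_adjacency` and the choice to store edge `_id` values in the sets are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Key = Tuple[str, str]  # (tenant, node id)


def build_adjacency(edge_docs: Iterable[dict]) -> Tuple[Dict[Key, Set[str]], Dict[Key, Set[str]]]:
    """Build outgoing and incoming edge-id sets keyed by (tenant, node id)."""
    outgoing: Dict[Key, Set[str]] = defaultdict(set)
    incoming: Dict[Key, Set[str]] = defaultdict(set)
    for doc in edge_docs:
        # Tenant is part of every key, mirroring the compound indexes.
        outgoing[(doc["tenant"], doc["from_id"])].add(doc["_id"])
        incoming[(doc["tenant"], doc["to_id"])].add(doc["_id"])
    return dict(outgoing), dict(incoming)


if __name__ == "__main__":
    docs = [
        {"_id": "e1", "tenant": "t1", "from_id": "a", "to_id": "b", "type": "depends_on"},
        {"_id": "e2", "tenant": "t2", "from_id": "a", "to_id": "b", "type": "depends_on"},
    ]
    out_index, in_index = build_adjacency(docs)
    print(out_index[("t1", "a")], in_index[("t2", "b")])
```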

---

## 5) Documentation changes (create/update)

1. **`/docs/sbom/graph-explorer-overview.md`**

    * Concepts, node/edge taxonomy, lenses, overlays, roles, limitations.
2. **`/docs/sbom/graph-using-the-console.md`**

    * Walkthroughs: search, navigate, impact, diff, export; screenshots and keyboard cheatsheet.
3. **`/docs/sbom/graph-query-language.md`**

    * JSON schema, examples, constraints, cost/budget rules.
4. **`/docs/sbom/graph-api.md`**

    * REST endpoints, request/response examples, streaming and pagination.
5. **`/docs/sbom/graph-cli.md`**

    * CLI command reference and example pipelines.
6. **`/docs/policy/graph-overlays.md`**

    * How policy versions render in Graph; explain sampling; AOC guardrails.
7. **`/docs/vex/graph-integration.md`**

    * How VEX suppressions appear and how to validate product scoping.
8. **`/docs/advisories/graph-integration.md`**

    * Advisory linkage and severity normalization by policy.
9. **`/docs/architecture/graph-services.md`**

    * Graph Indexer, Graph API, storage choices, failure modes.
10. **`/docs/observability/graph-telemetry.md`**

    * Metrics, logs, tracing, dashboards.
11. **`/docs/runbooks/graph-incidents.md`**

    * Handling runaway queries, cache poisoning, degraded render.
12. **`/docs/security/graph-rbac.md`**

    * Permissions matrix, multi‑tenant boundaries.

Every doc should end with a “Compliance checklist.”
**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

---
## 6) Tasks

### 6.1 Backend: Graph Indexer

* [ ] Define node/edge schemas and attribute dictionaries for each type.
* [ ] Implement event consumers for SBOM ingests, Conseiller updates, Excitator updates.
* [ ] Build ingestion pipeline that populates nodes/edges and maintains `valid_from/valid_to`.
* [ ] Implement aggregate counters and degree metrics.
* [ ] Implement clustering job and persist cluster ids per node.
* [ ] Implement snapshot materialization per SBOM and lineage tracking.
* [ ] Unit tests for each node/edge builder; property‑based tests for identity stability.

### 6.2 Backend: Graph API

* [ ] Implement `/graph/search` with prefix and exact match across node types.
* [ ] Implement `/graph/query` with validation, planning, cost estimation, and streaming tile results.
* [ ] Implement `/graph/paths` with constraints and depth limits; shortest path heuristic.
* [ ] Implement `/graph/diff` computing adds/removes/changed versions; stream results.
* [ ] Implement overlays: advisory join, VEX join, policy materialization and explain sampling.
* [ ] Implement exports: GraphML, CSV edge list, NDJSON findings, PNG/SVG snapshots.
* [ ] RBAC middleware integration; multi‑tenant scoping.
* [ ] Load tests with synthetic large SBOMs; define default budgets.

### 6.3 Policy Engine integration

* [ ] Add endpoint to fetch explain traces for specific node ids in batch.
* [ ] Add materialization export that Graph API can cache per policy version.

### 6.4 Console (Web UI)

* [ ] Create `graph-explorer` module with routes `/graph`, `/graph/diff`, `/graph/q/:id`.
* [ ] Implement WebGL canvas with LOD, culling, edge bundling, deterministic layout seed.
* [ ] Build search, filter, lens, and overlay toolbars.
* [ ] Side panels: details, evidence tabs, explain viewer.
* [ ] Diff mode: split/overlay toggles and color legend.
* [ ] Saved queries: create, update, run; deep links.
* [ ] Export UI: formats, server round‑trip, progress indicators.
* [ ] a11y audit and keyboard‑only flow.

### 6.5 CLI

* [ ] Implement `stella sbom graph *` subcommands with JSON IO and piping support.
* [ ] Document examples and stable output schemas for CI consumption.

### 6.6 Observability & Ops

* [ ] Dashboards for tile latency, query denials, cache hit rate, memory.
* [ ] Alerting on query error spikes, OOM risk, cache churn.
* [ ] Runbooks in `/docs/runbooks/graph-incidents.md`.

### 6.7 Docs

* [ ] Author all docs in section 5, link from Console contextual help.
* [ ] Add end‑to‑end tutorial: “Investigate GHSA‑XXXX across prod artifacts.”

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

---
## 7) Acceptance criteria

* Console renders large SBOM graphs with semantic zoom, overlays, and responsive interactions.
* Users can run impact and path queries with bounded depth and get results within budget.
* VEX suppressions and advisory severities appear correctly and are consistent with policy.
* Diff view clearly shows added/removed/changed nodes/edges between two SBOMs.
* Saved queries and deep links reproduce the same view deterministically (given same data).
* Exports produce valid GraphML/CSV/NDJSON and image snapshots.
* CLI supports search, query, paths, impacted, diff, and export with stable schemas.
* AOC guardrails: explorer never mutates facts; overlays reflect Policy Engine decisions.
* RBAC enforced; all actions logged and observable.

---
## 8) Risks & mitigations

* **Graph explosion on large monorepos** → tiling, clustering, budgets, and strict depth limits.
* **Inconsistent identities across tools** → canonicalize purls/digests; property‑based tests for identity stability.
* **Policy overlay latency** → precompute materializations for hot policy versions; sample explains only on focus.
* **User confusion** → strong lens defaults, deterministic layouts, legends, in‑context help.

---
## 9) Test plan

* **Unit**: node/edge builders, identity normalization, cost estimator.
* **Integration**: ingest SBOM + advisories + VEX, verify overlays and counts.
* **E2E**: Playwright flows for search→impact→diff→export; deep link determinism.
* **Performance**: simulate 500k nodes/2M edges; measure tile latency and memory.
* **Security**: RBAC matrix; tenant isolation tests; query validation fuzzing.
* **Determinism**: snapshot round‑trip: same query and seed produce identical layout and stats.

---
## 10) Feature flags

* `graph.explorer` (UI feature module)
* `graph.paths` (advanced path queries)
* `graph.diff` (SBOM diff mode)
* `graph.overlays.policy` (policy overlay + explain sampling)
* `graph.export` (exports enabled)

Documented in `/docs/observability/graph-telemetry.md`.

---
## 11) Non‑goals (this epic)

* Real‑time process/runtime call graphs.
* Full substitution for text reports; Explorer complements Reports.
* Cross‑tenant graphs; all queries are tenant‑scoped.

---
## 12) Philosophy

* **See the system**: security and license risk are structural. If you cannot see structure, you will miss risk.
* **Evidence over assertion**: every colored node corresponds to raw facts and explainable determinations.
* **Bounded interactivity**: fast, partial answers beat slow “complete” ones.
* **Immutability**: graphs mirror SBOM snapshots and are never rewritten; we add context, not edits.

> Final reminder: **Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.**