Here’s Epic 5 in the same paste‑into‑repo, implementation‑ready format as the prior epics. It’s exhaustive, formal, and designed to slot into AOC, Policy Engine, Conseiller/Excitator, and the Console.

---

# Epic 5: SBOM Graph Explorer

> Short name: **Graph Explorer**
> Services touched: **SBOM Service**, **Graph Indexer** (new), **Graph API** (new), **Policy Engine**, **Conseiller (Feedser)**, **Excitator (Vexer)**, **Web API Gateway**, **Authority** (authN/Z), **Workers/Scheduler**, **Telemetry**
> Surfaces: **Console (Web UI)** graph module, **CLI**, **Exports**
> Deliverables: Interactive graph UI with semantic zoom, saved queries, policy/VEX/advisory overlays, diff views, impact analysis, exports

---

## 1) What it is

**SBOM Graph Explorer** is the interactive, tenant‑scoped view of all supply‑chain relationships the platform knows about, rendered as a navigable graph. It connects:

* **Artifacts** (applications, images, libs), **Packages/Versions**, **Files/Paths**, **Licenses**, **Advisories** (from Conseiller), **VEX statements** (from Excitator), **Provenance** (builds, sources), and **Policies** (overlays of determinations)
* **Edges** like `depends_on`, `contains`, `built_from`, `declared_in`, `affected_by`, `vex_exempts`, `governs_with`
* **Time/version** dimension: multiple SBOM snapshots with diffs

It’s built for investigation and review: find where a vulnerable package enters; see which apps are impacted; understand why a finding exists; simulate a policy version and see the delta.

The explorer observes **AOC enforcement**: it never mutates facts; it aggregates and visualizes them. Only the Policy Engine may classify, and classification is displayed as overlays.

---

## 2) Why

* SBOMs are graphs. Tables flatten what matters and hide transitive risk.
* Engineers, security, and auditors need impact answers quickly: “What pulls in `log4j:2.17` and where is it at runtime?”
* Policy/VEX/advisory interactions are nuanced. A visual overlay makes precedence and outcomes obvious.
* Review is collaborative; you need saved queries, deep links, exports, and consistent evidence.

---

## 3) How it should work (maximum detail)

### 3.1 Domain model

**Nodes** (typed, versioned, tenant‑scoped):

* `Artifact`: application, service, container image, library, module
* `Package`: name + ecosystem (purl), `PackageVersion` with resolved version
* `File`: path within artifact or image layer
* `License`: SPDX id
* `Advisory`: normalized advisory id (GHSA, CVE, vendor), source = Conseiller
* `VEX`: statement with product context, status, justification, source = Excitator
* `SBOM`: ingestion unit; includes metadata (tool, sha, build info)
* `PolicyDetermination`: materialized view of Policy Engine results (read‑only overlay)
* `Build`: provenance, commit, workflow run
* `Source`: repo, tag, commit

**Edges** (directed):

* `declared_in` (PackageVersion → SBOM)
* `contains` (Artifact → PackageVersion | File)
* `depends_on` (PackageVersion → PackageVersion) with scope attr (prod|dev|test|optional)
* `built_from` (Artifact → Build), `provenance_of` (Build → Source)
* `affected_by` (PackageVersion → Advisory) with range semantics
* `vex_exempts` (Advisory ↔ VEX) scoped by product/component
* `licensed_under` (Artifact|PackageVersion → License)
* `governs_with` (Artifact|PackageVersion → PolicyDetermination)
* `derived_from` (SBOM → SBOM) for superseding snapshots

**Identity & versioning**

* Every node has a stable key: `{tenant}:{type}:{natural_id}` (e.g., purl for packages, digest for images).
* SBOM snapshots are immutable; edges carry `valid_from`/`valid_to` for time travel and diffing.

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

### 3.2 User capabilities (end‑to‑end)

* **Search & Navigate**: global search (purls, CVEs, repos, licenses), keyboard nav, breadcrumbs, semantic zoom.
* **Lenses**: toggle views (Security, License, Provenance, Runtime vs Dev, Policy effect).
* **Overlays**:
  * **Advisory overlay**: show affected nodes/edges with source, severity, ranges.
  * **VEX overlay**: show suppressions/justifications; collapse exempted paths.
  * **Policy overlay**: choose a policy version; nodes/edges reflect determinations (severity, status) with explain sampling.
* **Impact analysis**: pick a vulnerable node; highlight upstream/downstream dependents, scope filters, shortest/all paths with constraints.
* **Diff view**: compare SBOM A vs B; show added/removed nodes/edges, changed versions, changed determinations.
* **Saved queries**: visual builder + JSON query; shareable permalinks scoped by tenant and environment.
* **Exports**: GraphML, CSV edge list, NDJSON of findings, PNG/SVG snapshot.
* **Evidence details**: side panel with raw facts, advisory links, VEX statements, policy explain trace, provenance.
* **Accessibility**: tab‑navigable, high‑contrast, screen‑reader labels for nodes and sidebars.

### 3.3 Query model

* **Visual builder** for common queries:
  * “Show all paths from Artifact X to Advisory Y up to depth 6.”
  * “All runtime dependencies with license = GPL‑3.0.”
  * “All artifacts affected by GHSA‑… with no applicable VEX.”
  * “Which SBOMs introduced/removed `openssl` between build 120 and 130?”
* **JSON query** (internal, POST body) with:
  * `start`: list of node selectors (type + id or attributes)
  * `expand`: edge types and depth, direction, scope filters
  * `where`: predicates on node/edge attributes
  * `overlay`: policy version id, advisory sources, VEX filters
  * `limit`: nodes, edges, timebox, cost budget
* **Cost control**: server estimates cost, denies or pages results; UI streams partial graph tiles.

### 3.4 UI architecture (Console)

* **Canvas**: WebGL renderer with level‑of‑detail, edge bundling, and label culling; deterministic layout when possible (seeded).
* **Semantic zoom**:
  * Far: clusters by artifact/repo/ecosystem, color by lens
  * Mid: package groups, advisory badges, license swatches
  * Near: concrete versions, direct edges, inline badges for policy determinations
* **Panels**:
  * Left: search, filters, lens selector, saved queries
  * Right: details, explain trace, evidence tabs (Advisory/VEX/Policy/Provenance)
  * Bottom: query expression, diagnostics, performance/stream status
* **Diff mode**: split or overlay, color legend (add/remove/changed), filter by node type.
* **Deep links**: URL encodes query + viewport; shareable respecting RBAC.
* **Keyboard**: space drag, +/- zoom, F to focus, G to expand neighbors, P to show paths.

### 3.5 Back‑end architecture

**Graph Indexer (new)**

* Consumes SBOM ingests, Conseiller advisories, Excitator VEX statements, Policy Engine determinations (read‑only).
* Projects facts into a **property graph** persisted in:
  * Primary: document store + adjacency sets (e.g., Mongo collections + compressed adjacency lists)
  * Optional driver for graph DB backends if needed (pluggable)
* Maintains materialized aggregates: degree, critical paths cache, affected artifact counts, license distribution.
* Emits **graph snapshots** per SBOM with lineage to original ingestion.

**Graph API (new)**

* Endpoints for search, neighbor expansion, path queries, diffs, overlays, exports.
* Streaming responses for large graphs (chunked NDJSON tiles).
* Cost accounting + quotas per tenant.

**Workers**

* **Centrality & clustering** precompute on idle: betweenness approximations, connected components, Louvain clusters.
* **Diff compute** on new SBOM ingestion pairs (previous vs current).
* **Overlay materialization** cache for popular policy versions.

**Policy Engine integration**

* Graph API requests can specify a policy version.
* For sampled nodes, the API fetches explain traces; for counts, uses precomputed overlay materializations where available.
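
The Graph API's tile streaming and cost accounting described above can be sketched roughly as follows. This is a minimal illustration, not the service's design: the in-memory adjacency dict, the `stream_tiles` name, the `tile_size`/`node_budget` parameters, and the simplified tile shape (`nodes`, `edges`, `truncated`, without `stats` or `cursor`) are all assumptions made for the example.

```python
import json
from collections import deque
from typing import Dict, Iterator, List, Set

# Hypothetical in-memory adjacency list; a real indexer would read from storage.
Adjacency = Dict[str, List[str]]


def stream_tiles(adj: Adjacency, start: str, tile_size: int, node_budget: int) -> Iterator[str]:
    """Yield NDJSON lines (one tile per line), expanding breadth-first from `start`.

    Expansion stops once `node_budget` nodes have been emitted. If unvisited
    nodes remain in the frontier, the final tile carries `truncated: true`.
    """
    seen: Set[str] = {start}
    queue = deque([start])
    emitted = 0
    nodes: List[str] = []
    edges: List[List[str]] = []

    while queue and emitted < node_budget:
        node = queue.popleft()
        nodes.append(node)
        emitted += 1
        for nxt in adj.get(node, []):
            # Edges may point at nodes that arrive in a later tile.
            edges.append([node, nxt])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
        if len(nodes) == tile_size:
            yield json.dumps({"nodes": nodes, "edges": edges, "truncated": False})
            nodes, edges = [], []

    truncated = bool(queue)  # budget exhausted with frontier left over
    if nodes or truncated:
        yield json.dumps({"nodes": nodes, "edges": edges, "truncated": truncated})


if __name__ == "__main__":
    demo = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    for line in stream_tiles(demo, "a", tile_size=2, node_budget=3):
        print(line)
```

Emitting tiles in BFS order lets a client render the neighborhood of the start node first, and the `truncated` flag gives the UI something concrete to surface when a budget cuts the expansion short.
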

**AOC enforcement**

* Graph Indexer never merges or edits advisories/VEX; it links them and exposes overlays that the Policy Engine evaluates.
* Conseiller and Excitator remain authoritative sources; severities come from Policy‑governed normalization.

### 3.6 APIs (representative)

* `GET /graph/search?q=...&type=package|artifact|advisory|license`
* `POST /graph/query` ⇒ stream tiles `{nodes[], edges[], stats, cursor}`
* `POST /graph/paths` body: `{from, to, depth<=6, constraints{scope, runtime_only}}`
* `POST /graph/diff` body: `{sbom_a, sbom_b, filters}`
* `GET /graph/snapshots/{sbom_id}` ⇒ graph metadata, counts, top advisories
* `POST /graph/export` body: `{format: graphml|csv|ndjson|png|svg, query|snapshot}`
* `GET /graph/saved` / `POST /graph/saved` save and list tenant queries
* `GET /graph/overlays/policy/{version_id}` ⇒ summary stats for caching

All endpoints tenant‑scoped, RBAC‑checked. Timeouts and pagination by server. Errors return structured diagnostics.

### 3.7 CLI

```
stella sbom graph search "purl:pkg:maven/org.apache.logging.log4j/log4j-core"
stella sbom graph query --file ./query.json --export graphml > graph.graphml
stella sbom graph impacted --advisory GHSA-xxxx --runtime-only --limit 100
stella sbom graph paths --from artifact:service-a --to advisory:GHSA-xxxx --depth 5 --policy 1.3.0
stella sbom graph diff --sbom-a 2025-03-15T10:00Z --sbom-b 2025-03-22T10:00Z --export csv > diff.csv
stella sbom graph save --name "openssl-runtime" --file ./query.json
```

Exit codes: 0 ok, 2 query validation error, 3 over‑budget, 4 not found, 5 RBAC denied.

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

### 3.8 Performance & scale

* **Progressive loading**: server pages tiles by BFS frontier; client renders incrementally.
* **Viewport culling**: only visible nodes/edges in canvas; offscreen demoted to aggregates.
* **Level‑of‑detail**: simplified glyphs and collapsed clusters at distance.
* **Query budgets**: per‑tenant rate + node/edge caps; interactive paths limited to depth ≤ 6.
* **Caching**: hot queries memoized per tenant + overlay version; diffs precomputed for consecutive SBOMs.

### 3.9 Security

* Multi‑tenant isolation at storage and API layers.
* RBAC roles:
  * **Viewer**: browse graphs, saved queries
  * **Investigator**: run queries, export data
  * **Operator**: configure budgets, purge caches
  * **Auditor**: download evidence bundles
* Input validation for query JSON; deny disallowed edge traversals; strict CSP in web app.

### 3.10 Observability

* Metrics: tile latency, nodes/edges per tile, cache hit rate, query denials, memory pressure.
* Logs: structured, include query hash, cost, truncation flags.
* Traces: server spans per stage (parse, plan, fetch, overlay, stream).

### 3.11 Accessibility & UX guarantees

* Keyboard complete, ARIA roles for graph and panels, high‑contrast theme.
* Deterministic layout on reload for shareable investigations.

### 3.12 Data retention

* Graph nodes derived from SBOMs share retention with SBOM artifacts; overlays are ephemeral caches.
* Saved queries retained until deleted; references to missing objects show warnings.

---

## 4) Implementation plan

### 4.1 Services

* **Graph Indexer (new microservice)**
  * Subscribes to SBOM ingest events, Conseiller advisory updates, Excitator VEX updates, Policy overlay materializations.
  * Builds adjacency lists and node documents; computes aggregates and clusters.
* **Graph API (new microservice)**
  * Validates and executes queries; streams tiles; composes overlays; serves diffs and exports.
  * Integrates with Policy Engine for explain sample retrieval.
* **SBOM Service (existing)**
  * Emits ingestion events with SBOM ids and lineage; exposes SBOM metadata to Graph API.
* **Web API Gateway**
  * Routes `/graph/*`, injects tenant context, enforces RBAC.
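
The Graph API's query validation step (reject malformed JSON queries, cap depth at 6, deny disallowed edge traversals) might look something like the sketch below. The edge type names and the depth cap come from this document; the field names `edge_types` and `depth` inside `expand`, the `validate_query` function, and the choice to return a list of problem strings are assumptions for illustration only.

```python
from typing import Any, Dict, List

# Edge types listed in section 3.1. Which of them a given tenant or role may
# traverse is not specified in this epic; allowing all of them is an assumption.
ALLOWED_EDGE_TYPES = {
    "declared_in", "contains", "depends_on", "built_from", "provenance_of",
    "affected_by", "vex_exempts", "licensed_under", "governs_with", "derived_from",
}
MAX_DEPTH = 6  # interactive depth cap stated in sections 3.6 and 3.8


def validate_query(query: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the query passed."""
    problems: List[str] = []

    start = query.get("start")
    if not isinstance(start, list) or not start:
        problems.append("start: must be a non-empty list of node selectors")

    expand = query.get("expand", {})
    if not isinstance(expand, dict):
        problems.append("expand: must be an object")
        expand = {}

    depth = expand.get("depth", 1)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(depth, bool) or not isinstance(depth, int) or not 1 <= depth <= MAX_DEPTH:
        problems.append(f"expand.depth: must be an integer between 1 and {MAX_DEPTH}")

    edge_types = expand.get("edge_types", [])
    if not isinstance(edge_types, list):
        problems.append("expand.edge_types: must be a list")
        edge_types = []
    disallowed = [e for e in edge_types
                  if not (isinstance(e, str) and e in ALLOWED_EDGE_TYPES)]
    if disallowed:
        problems.append(f"expand.edge_types: disallowed edge types {disallowed!r}")

    return problems


if __name__ == "__main__":
    ok = {"start": [{"type": "artifact", "id": "service-a"}],
          "expand": {"edge_types": ["depends_on"], "depth": 3}}
    print(validate_query(ok))
```

Collecting every problem instead of failing on the first one fits the "errors return structured diagnostics" requirement, and a non-empty result would map naturally onto the CLI's exit code 2 (query validation error).
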
### 4.2 Console (Web UI) feature module

* `packages/features/graph-explorer`
  * Canvas renderer (WebGL), panels, query builder, diff mode, overlays, exports.
  * Deep‑link router and viewport state serializer.

### 4.3 Workers

* Centrality/clustering worker, diff worker, overlay materialization worker.
* Schedules on low‑traffic windows; backpressure aware.

### 4.4 Data model (storage)

* Collections:
  * `graph_nodes`: `{_id, tenant, type, natural_id, attrs, degree, created_at, updated_at}`
  * `graph_edges`: `{_id, tenant, from_id, to_id, type, attrs, valid_from, valid_to}`
  * `graph_snapshots`: per‑SBOM node/edge references
  * `graph_saved_queries`: `{_id, tenant, name, query_json, created_by}`
  * `graph_overlays_cache`: keyed by `{tenant, policy_version, hash(query)}`
* Indexes: compound on `{tenant, type, natural_id}`, `{tenant, from_id}`, `{tenant, to_id}`, time bounds.

---

## 5) Documentation changes (create/update)

1. **`/docs/sbom/graph-explorer-overview.md`**
    * Concepts, node/edge taxonomy, lenses, overlays, roles, limitations.
2. **`/docs/sbom/graph-using-the-console.md`**
    * Walkthroughs: search, navigate, impact, diff, export; screenshots and keyboard cheatsheet.
3. **`/docs/sbom/graph-query-language.md`**
    * JSON schema, examples, constraints, cost/budget rules.
4. **`/docs/sbom/graph-api.md`**
    * REST endpoints, request/response examples, streaming and pagination.
5. **`/docs/sbom/graph-cli.md`**
    * CLI command reference and example pipelines.
6. **`/docs/policy/graph-overlays.md`**
    * How policy versions render in Graph; explain sampling; AOC guardrails.
7. **`/docs/vex/graph-integration.md`**
    * How VEX suppressions appear and how to validate product scoping.
8. **`/docs/advisories/graph-integration.md`**
    * Advisory linkage and severity normalization by policy.
9. **`/docs/architecture/graph-services.md`**
    * Graph Indexer, Graph API, storage choices, failure modes.
10. **`/docs/observability/graph-telemetry.md`**
    * Metrics, logs, tracing, dashboards.
11. **`/docs/runbooks/graph-incidents.md`**
    * Handling runaway queries, cache poisoning, degraded render.
12. **`/docs/security/graph-rbac.md`**
    * Permissions matrix, multi‑tenant boundaries.

Every doc should end with a “Compliance checklist.”

**Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

---

## 6) Tasks

### 6.1 Backend: Graph Indexer

* [ ] Define node/edge schemas and attribute dictionaries for each type.
* [ ] Implement event consumers for SBOM ingests, Conseiller updates, Excitator updates.
* [ ] Build ingestion pipeline that populates nodes/edges and maintains `valid_from/valid_to`.
* [ ] Implement aggregate counters and degree metrics.
* [ ] Implement clustering job and persist cluster ids per node.
* [ ] Implement snapshot materialization per SBOM and lineage tracking.
* [ ] Unit tests for each node/edge builder; property‑based tests for identity stability.

### 6.2 Backend: Graph API

* [ ] Implement `/graph/search` with prefix and exact match across node types.
* [ ] Implement `/graph/query` with validation, planning, cost estimation, and streaming tile results.
* [ ] Implement `/graph/paths` with constraints and depth limits; shortest path heuristic.
* [ ] Implement `/graph/diff` computing adds/removes/changed versions; stream results.
* [ ] Implement overlays: advisory join, VEX join, policy materialization and explain sampling.
* [ ] Implement exports: GraphML, CSV edge list, NDJSON findings, PNG/SVG snapshots.
* [ ] RBAC middleware integration; multi‑tenant scoping.
* [ ] Load tests with synthetic large SBOMs; define default budgets.

### 6.3 Policy Engine integration

* [ ] Add endpoint to fetch explain traces for specific node ids in batch.
* [ ] Add materialization export that Graph API can cache per policy version.

### 6.4 Console (Web UI)

* [ ] Create `graph-explorer` module with routes `/graph`, `/graph/diff`, `/graph/q/:id`.
* [ ] Implement WebGL canvas with LOD, culling, edge bundling, deterministic layout seed.
* [ ] Build search, filter, lens, and overlay toolbars.
* [ ] Side panels: details, evidence tabs, explain viewer.
* [ ] Diff mode: split/overlay toggles and color legend.
* [ ] Saved queries: create, update, run; deep links.
* [ ] Export UI: formats, server round‑trip, progress indicators.
* [ ] a11y audit and keyboard‑only flow.

### 6.5 CLI

* [ ] Implement `stella sbom graph *` subcommands with JSON IO and piping support.
* [ ] Document examples and stable output schemas for CI consumption.

### 6.6 Observability & Ops

* [ ] Dashboards for tile latency, query denials, cache hit rate, memory.
* [ ] Alerting on query error spikes, OOM risk, cache churn.
* [ ] Runbooks in `/docs/runbooks/graph-incidents.md`.

### 6.7 Docs

* [ ] Author all docs in section 5, link from Console contextual help.
* [ ] Add end‑to‑end tutorial: “Investigate GHSA‑XXXX across prod artifacts.”

> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

---

## 7) Acceptance criteria

* Console renders large SBOM graphs with semantic zoom, overlays, and responsive interactions.
* Users can run impact and path queries with bounded depth and get results within budget.
* VEX suppressions and advisory severities appear correctly and are consistent with policy.
* Diff view clearly shows added/removed/changed nodes/edges between two SBOMs.
* Saved queries and deep links reproduce the same view deterministically (given same data).
* Exports produce valid GraphML/CSV/NDJSON and image snapshots.
* CLI supports search, query, paths, impacted, diff, and export with stable schemas.
* AOC guardrails: explorer never mutates facts; overlays reflect Policy Engine decisions.
* RBAC enforced; all actions logged and observable.
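
To make the diff criterion above concrete (added/removed/changed between two SBOMs), here is a minimal sketch of what a check against it could compare. It assumes each snapshot has already been reduced to a mapping of package identifier to resolved version; the `SnapshotDiff` type and `diff_snapshots` function are hypothetical names, and the real diff worker would also cover edges and policy determinations.

```python
from typing import Dict, List, NamedTuple, Tuple


class SnapshotDiff(NamedTuple):
    added: List[str]                      # packages only in snapshot B
    removed: List[str]                    # packages only in snapshot A
    changed: List[Tuple[str, str, str]]   # (package, version in A, version in B)


def diff_snapshots(a: Dict[str, str], b: Dict[str, str]) -> SnapshotDiff:
    """Compare two {package: version} snapshots; output is sorted for stable diffs."""
    added = sorted(b.keys() - a.keys())
    removed = sorted(a.keys() - b.keys())
    changed = sorted(
        (name, a[name], b[name])
        for name in a.keys() & b.keys()
        if a[name] != b[name]
    )
    return SnapshotDiff(added, removed, changed)


if __name__ == "__main__":
    before = {"pkg:npm/left-pad": "1.0.0", "pkg:npm/lodash": "4.17.20"}
    after = {"pkg:npm/lodash": "4.17.21", "pkg:npm/express": "4.18.2"}
    print(diff_snapshots(before, after))
```

Sorting each bucket keeps the output deterministic, which matters for the stable CLI schemas and reproducible views called for in the criteria.
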

---

## 8) Risks & mitigations

* **Graph explosion on large monorepos** → tiling, clustering, budgets, and strict depth limits.
* **Inconsistent identities across tools** → canonicalize purls/digests; property‑based tests for identity stability.
* **Policy overlay latency** → precompute materializations for hot policy versions; sample explains only on focus.
* **User confusion** → strong lens defaults, deterministic layouts, legends, in‑context help.

---

## 9) Test plan

* **Unit**: node/edge builders, identity normalization, cost estimator.
* **Integration**: ingest SBOM + advisories + VEX, verify overlays and counts.
* **E2E**: Playwright flows for search→impact→diff→export; deep link determinism.
* **Performance**: simulate 500k nodes/2M edges; measure tile latency and memory.
* **Security**: RBAC matrix; tenant isolation tests; query validation fuzzing.
* **Determinism**: snapshot round‑trip: same query and seed produce identical layout and stats.

---

## 10) Feature flags

* `graph.explorer` (UI feature module)
* `graph.paths` (advanced path queries)
* `graph.diff` (SBOM diff mode)
* `graph.overlays.policy` (policy overlay + explain sampling)
* `graph.export` (exports enabled)

Documented in `/docs/observability/graph-telemetry.md`.

---

## 11) Non‑goals (this epic)

* Real‑time process/runtime call graphs.
* Full substitution for text reports; Explorer complements Reports.
* Cross‑tenant graphs; all queries are tenant‑scoped.

---

## 12) Philosophy

* **See the system**: security and license risk are structural. If you cannot see structure, you will miss risk.
* **Evidence over assertion**: every colored node corresponds to raw facts and explainable determinations.
* **Bounded interactivity**: fast, partial answers beat slow “complete” ones.
* **Immutability**: graphs mirror SBOM snapshots and are never rewritten; we add context, not edits.

> Final reminder: **Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.**