	Here’s Epic 5 in the same paste‑into‑repo, implementation‑ready format as the prior epics. It’s exhaustive, formal, and designed to slot into AOC, Policy Engine, Conseiller/Excitator, and the Console.
Epic 5: SBOM Graph Explorer
- Short name: Graph Explorer
- Services touched: SBOM Service, Graph Indexer (new), Graph API (new), Policy Engine, Conseiller (Feedser), Excitator (Vexer), Web API Gateway, Authority (authN/Z), Workers/Scheduler, Telemetry
- Surfaces: Console (Web UI) graph module, CLI, Exports
- Deliverables: Interactive graph UI with semantic zoom, saved queries, policy/VEX/advisory overlays, diff views, impact analysis, exports
1) What it is
SBOM Graph Explorer is the interactive, tenant‑scoped view of all supply‑chain relationships the platform knows about, rendered as a navigable graph. It connects:
- Artifacts (applications, images, libs), Packages/Versions, Files/Paths, Licenses, Advisories (from Conseiller), VEX statements (from Excitator), Provenance (builds, sources), and Policies (overlays of determinations)
- Edges like `depends_on`, `contains`, `built_from`, `declared_in`, `affected_by`, `vex_exempts`, `governs_with`
- Time/version dimension: multiple SBOM snapshots with diffs
 
It’s built for investigation and review: find where a vulnerable package enters; see which apps are impacted; understand why a finding exists; simulate a policy version and see the delta. The explorer observes AOC enforcement: it never mutates facts; it aggregates and visualizes them. Only the Policy Engine may classify, and classification is displayed as overlays.
2) Why
- SBOMs are graphs. Tables flatten what matters and hide transitive risk.
- Engineers, security, and auditors need impact answers quickly: “What pulls in `log4j:2.17` and where is it at runtime?”
- Policy/VEX/advisory interactions are nuanced. A visual overlay makes precedence and outcomes obvious.
 - Review is collaborative; you need saved queries, deep links, exports, and consistent evidence.
 
3) How it should work (maximum detail)
3.1 Domain model
Nodes (typed, versioned, tenant‑scoped):
- `Artifact`: application, service, container image, library, module
- `Package`: name + ecosystem (purl); `PackageVersion` with resolved version
- `File`: path within artifact or image layer
- `License`: SPDX id
- `Advisory`: normalized advisory id (GHSA, CVE, vendor), source = Conseiller
- `VEX`: statement with product context, status, justification, source = Excitator
- `SBOM`: ingestion unit; includes metadata (tool, sha, build info)
- `PolicyDetermination`: materialized view of Policy Engine results (read‑only overlay)
- `Build`: provenance, commit, workflow run
- `Source`: repo, tag, commit
Edges (directed):
- `declared_in` (PackageVersion → SBOM)
- `contains` (Artifact → PackageVersion | File)
- `depends_on` (PackageVersion → PackageVersion) with scope attr (prod|dev|test|optional)
- `built_from` (Artifact → Build), `provenance_of` (Build → Source)
- `affected_by` (PackageVersion → Advisory) with range semantics
- `vex_exempts` (Advisory ↔ VEX) scoped by product/component
- `licensed_under` (Artifact|PackageVersion → License)
- `governs_with` (Artifact|PackageVersion → PolicyDetermination)
- `derived_from` (SBOM → SBOM) for superseding snapshots
Identity & versioning
- Every node has a stable key: `{tenant}:{type}:{natural_id}` (e.g., purl for packages, digest for images).
- SBOM snapshots are immutable; edges carry `valid_from`/`valid_to` for time travel and diffing.
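
To make the identity and time-travel rules concrete, here is a minimal Python sketch. It assumes timestamps are epoch seconds and that validity is a half-open interval (`valid_from` inclusive, `valid_to` exclusive, `None` meaning still valid); neither choice is fixed by this epic. `node_key`, `Edge`, and `edges_at` are illustrative names, not existing APIs.

```python
from dataclasses import dataclass
from typing import Optional


def node_key(tenant: str, node_type: str, natural_id: str) -> str:
    """Build the stable key {tenant}:{type}:{natural_id}."""
    return f"{tenant}:{node_type}:{natural_id}"


@dataclass(frozen=True)
class Edge:
    from_id: str
    to_id: str
    type: str
    valid_from: int            # assumption: epoch seconds
    valid_to: Optional[int]    # assumption: None means "still valid"


def edges_at(edges, ts):
    """Return the edges whose [valid_from, valid_to) interval contains ts."""
    return [
        e for e in edges
        if e.valid_from <= ts and (e.valid_to is None or ts < e.valid_to)
    ]
```

Because purls themselves contain colons, a key built this way should be treated as opaque rather than split on `:`.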
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
3.2 User capabilities (end‑to‑end)
- Search & Navigate: global search (purls, CVEs, repos, licenses), keyboard nav, breadcrumbs, semantic zoom.
- Lenses: toggle views (Security, License, Provenance, Runtime vs Dev, Policy effect).
- Overlays:
  - Advisory overlay: show affected nodes/edges with source, severity, ranges.
  - VEX overlay: show suppressions/justifications; collapse exempted paths.
  - Policy overlay: choose a policy version; nodes/edges reflect determinations (severity, status) with explain sampling.
- Impact analysis: pick a vulnerable node; highlight upstream/downstream dependents, scope filters, shortest/all paths with constraints.
- Diff view: compare SBOM A vs B; show added/removed nodes/edges, changed versions, changed determinations.
- Saved queries: visual builder + JSON query; shareable permalinks scoped by tenant and environment.
- Exports: GraphML, CSV edge list, NDJSON of findings, PNG/SVG snapshot.
- Evidence details: side panel with raw facts, advisory links, VEX statements, policy explain trace, provenance.
- Accessibility: tab‑navigable, high‑contrast, screen‑reader labels for nodes and sidebars.
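
The impact analysis capability above amounts to a bounded reverse traversal from the vulnerable node. The sketch below is one possible shape, assuming edges are given as `(source, target, scope)` tuples; the function name `impacted` and the tuple layout are illustrative, not part of the Graph API.

```python
from collections import defaultdict, deque


def impacted(edges, target, allowed_scopes=("prod",), max_depth=6):
    """Walk edges backwards from target; return {dependent: distance}."""
    reverse = defaultdict(list)
    for src, dst, scope in edges:
        if scope in allowed_scopes:      # scope filter, e.g. runtime only
            reverse[dst].append(src)

    seen = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:      # respect the depth bound
            continue
        for parent in reverse[node]:
            if parent not in seen:       # BFS gives shortest distances
                seen[parent] = seen[node] + 1
                queue.append(parent)

    del seen[target]                     # report dependents only
    return seen
```

With `allowed_scopes=("prod",)` a dev-only tool that depends on the vulnerable package is excluded; widening the tuple brings it back into the result.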
 
3.3 Query model
- Visual builder for common queries:
  - “Show all paths from Artifact X to Advisory Y up to depth 6.”
  - “All runtime dependencies with license = GPL‑3.0.”
  - “All artifacts affected by GHSA‑… with no applicable VEX.”
  - “Which SBOMs introduced/removed `openssl` between build 120 and 130?”
- JSON query (internal, POST body) with:
  - `start`: list of node selectors (type + id or attributes)
  - `expand`: edge types and depth, direction, scope filters
  - `where`: predicates on node/edge attributes
  - `overlay`: policy version id, advisory sources, VEX filters
  - `limit`: nodes, edges, timebox, cost budget
- Cost control: server estimates cost, denies or pages results; UI streams partial graph tiles.
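
As an illustration of the JSON query and cost control described above, the following sketch validates a query and applies a naive cost estimate. The field shapes inside `start`, `expand`, and `limit`, the average fan-out of 8, and the default node cap are all assumptions made for the example; the real schema belongs in `/docs/sbom/graph-query-language.md`.

```python
ALLOWED_EDGES = {
    "declared_in", "contains", "depends_on", "built_from", "provenance_of",
    "affected_by", "vex_exempts", "licensed_under", "governs_with", "derived_from",
}
MAX_DEPTH = 6  # interactive depth cap stated in this epic


def validate_query(query):
    """Return a list of validation errors (empty when the query is valid)."""
    errors = []
    if not query.get("start"):
        errors.append("start: at least one node selector is required")
    expand = query.get("expand", {})
    unknown = set(expand.get("edges", [])) - ALLOWED_EDGES
    if unknown:
        errors.append(f"expand.edges: unknown edge types {sorted(unknown)}")
    depth = expand.get("depth", 1)
    if not isinstance(depth, int) or not 1 <= depth <= MAX_DEPTH:
        errors.append(f"expand.depth: must be an integer in 1..{MAX_DEPTH}")
    return errors


def estimate_nodes(query, avg_fanout=8):
    """Naive upper bound: each start node expands as a full tree of fan-out avg_fanout."""
    depth = query.get("expand", {}).get("depth", 1)
    per_start = sum(avg_fanout ** d for d in range(depth + 1))
    return len(query["start"]) * per_start


def plan(query, avg_fanout=8):
    """Reject invalid queries; otherwise run directly or fall back to paging."""
    errors = validate_query(query)
    if errors:
        return {"decision": "reject", "errors": errors}
    estimated = estimate_nodes(query, avg_fanout)
    cap = query.get("limit", {}).get("nodes", 5000)  # hypothetical default cap
    decision = "run" if estimated <= cap else "page"
    return {"decision": decision, "estimated_nodes": estimated}
```

A production estimator would more likely use the materialized degree aggregates than a fixed fan-out, and could deny outright when the estimate exceeds the tenant budget by a wide margin.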
 
3.4 UI architecture (Console)
- Canvas: WebGL renderer with level‑of‑detail, edge bundling, and label culling; deterministic layout when possible (seeded).
- Semantic zoom:
  - Far: clusters by artifact/repo/ecosystem, color by lens
  - Mid: package groups, advisory badges, license swatches
  - Near: concrete versions, direct edges, inline badges for policy determinations
- Panels:
  - Left: search, filters, lens selector, saved queries
  - Right: details, explain trace, evidence tabs (Advisory/VEX/Policy/Provenance)
  - Bottom: query expression, diagnostics, performance/stream status
- Diff mode: split or overlay, color legend (add/remove/changed), filter by node type.
- Deep links: URL encodes query + viewport; shareable respecting RBAC.
- Keyboard: space drag, +/- zoom, F to focus, G to expand neighbors, P to show paths.
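
The deep-link behaviour can be sketched as a reversible, deterministic encoding of query plus viewport. The example below shows the logic in Python for consistency with the other sketches in this document; the Console would implement the same idea in its own stack. The `q`/`v` payload keys and the function names are assumptions.

```python
import base64
import json


def encode_state(query, viewport):
    """Serialize query + viewport into a URL-safe token."""
    # Canonical JSON (sorted keys, no whitespace) keeps the token stable
    # regardless of the key order in the input dictionaries.
    payload = json.dumps({"q": query, "v": viewport},
                         sort_keys=True, separators=(",", ":"))
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return token.rstrip("=")


def decode_state(token):
    """Inverse of encode_state; restores the stripped base64 padding first."""
    padded = token + "=" * (-len(token) % 4)
    state = json.loads(base64.urlsafe_b64decode(padded))
    return state["q"], state["v"]
```

The token carries no authorization on its own: the server still applies tenant scoping and RBAC when the decoded query is executed.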
 
3.5 Back‑end architecture
Graph Indexer (new)
- Consumes SBOM ingests, Conseiller advisories, Excitator VEX statements, Policy Engine determinations (read‑only).
- Projects facts into a property graph persisted in:
  - Primary: document store + adjacency sets (e.g., Mongo collections + compressed adjacency lists)
  - Optional driver for graph DB backends if needed (pluggable)
- Maintains materialized aggregates: degree, critical paths cache, affected artifact counts, license distribution.
- Emits graph snapshots per SBOM with lineage to original ingestion.
 
Graph API (new)
- Endpoints for search, neighbor expansion, path queries, diffs, overlays, exports.
 - Streaming responses for large graphs (chunked NDJSON tiles).
 - Cost accounting + quotas per tenant.
 
Workers
- Centrality & clustering precompute on idle: betweenness approximations, connected components, Louvain clusters.
 - Diff compute on new SBOM ingestion pairs (previous vs current).
 - Overlay materialization cache for popular policy versions.
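
The diff worker's core computation can be illustrated with a small sketch. It assumes each snapshot has already been reduced to a mapping from package identity (without version) to resolved version, which is a simplification of the node/edge references a real snapshot holds; `diff_snapshots` is an illustrative name.

```python
def diff_snapshots(previous, current):
    """Compare two {package: version} maps from consecutive SBOM snapshots."""
    added = {name: current[name] for name in current.keys() - previous.keys()}
    removed = {name: previous[name] for name in previous.keys() - current.keys()}
    changed = {
        name: (previous[name], current[name])
        for name in previous.keys() & current.keys()
        if previous[name] != current[name]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

Edge-level diffs and changed determinations would follow the same set-difference pattern over edge keys and overlay results.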
 
Policy Engine integration
- Graph API requests can specify a policy version.
 - For sampled nodes, the API fetches explain traces; for counts, uses precomputed overlay materializations where available.
 
AOC enforcement
- Graph Indexer never merges or edits advisories/VEX; it links them and exposes overlays that the Policy Engine evaluates.
 - Conseiller and Excitator remain authoritative sources; severities come from Policy‑governed normalization.
 
3.6 APIs (representative)
- `GET /graph/search?q=...&type=package|artifact|advisory|license`
- `POST /graph/query` ⇒ stream tiles `{nodes[], edges[], stats, cursor}`
- `POST /graph/paths` body: `{from, to, depth<=6, constraints{scope, runtime_only}}`
- `POST /graph/diff` body: `{sbom_a, sbom_b, filters}`
- `GET /graph/snapshots/{sbom_id}` ⇒ graph metadata, counts, top advisories
- `POST /graph/export` body: `{format: graphml|csv|ndjson|png|svg, query|snapshot}`
- `GET /graph/saved` / `POST /graph/saved`: list and save tenant queries
- `GET /graph/overlays/policy/{version_id}` ⇒ summary stats for caching

All endpoints are tenant‑scoped and RBAC‑checked. The server enforces timeouts and pagination. Errors return structured diagnostics.
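
A client consuming the streamed tiles from `POST /graph/query` might merge them as in the sketch below. It assumes one JSON tile per line, node objects carrying an `id`, and edge objects carrying `from`, `to`, and `type`; those field names are illustrative, since the epic only fixes the top-level tile shape `{nodes[], edges[], stats, cursor}`.

```python
import json


def merge_tiles(lines):
    """Fold NDJSON tiles into one node map, one edge set, and the last cursor."""
    nodes = {}
    edges = set()
    cursor = None
    for line in lines:
        line = line.strip()
        if not line:                      # tolerate keep-alive blank lines
            continue
        tile = json.loads(line)
        for node in tile.get("nodes", []):
            nodes[node["id"]] = node      # later tiles overwrite duplicates
        for edge in tile.get("edges", []):
            edges.add((edge["from"], edge["to"], edge["type"]))
        cursor = tile.get("cursor")       # None signals no further pages
    return nodes, edges, cursor
```

Deduplicating by id matters because neighbouring tiles can legitimately repeat boundary nodes.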
3.7 CLI
stella sbom graph search "purl:pkg:maven/org.apache.logging.log4j/log4j-core"
stella sbom graph query --file ./query.json --export graphml > graph.graphml
stella sbom graph impacted --advisory GHSA-xxxx --runtime-only --limit 100
stella sbom graph paths --from artifact:service-a --to advisory:GHSA-xxxx --depth 5 --policy 1.3.0
stella sbom graph diff --sbom-a 2025-03-15T10:00Z --sbom-b 2025-03-22T10:00Z --export csv > diff.csv
stella sbom graph save --name "openssl-runtime" --file ./query.json
Exit codes: 0 ok, 2 query validation error, 3 over‑budget, 4 not found, 5 RBAC denied.
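
For CI pipelines that wrap the CLI, the exit codes above can be mapped to names as in this small sketch. `GraphExit` and `classify` are hypothetical helper names; exit code 1 is not defined by this epic, so it is reported as unknown here.

```python
from enum import IntEnum


class GraphExit(IntEnum):
    """Exit codes listed for the `stella sbom graph` subcommands."""
    OK = 0
    QUERY_VALIDATION_ERROR = 2
    OVER_BUDGET = 3
    NOT_FOUND = 4
    RBAC_DENIED = 5


def classify(code):
    """Return the symbolic name for an exit code, or UNKNOWN if unlisted."""
    try:
        return GraphExit(code).name
    except ValueError:
        return "UNKNOWN"
```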
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
3.8 Performance & scale
- Progressive loading: server pages tiles by BFS frontier; client renders incrementally.
 - Viewport culling: only visible nodes/edges in canvas; offscreen demoted to aggregates.
 - Level‑of‑detail: simplified glyphs and collapsed clusters at distance.
 - Query budgets: per‑tenant rate + node/edge caps; interactive paths limited to depth ≤ 6.
 - Caching: hot queries memoized per tenant + overlay version; diffs precomputed for consecutive SBOMs.
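
Progressive loading by BFS frontier, combined with a node cap, could look like the sketch below. It assumes an in-memory adjacency map for brevity; the tile fields `depth`, `nodes`, and `truncated` are illustrative and not the wire format.

```python
def bfs_tiles(adjacency, start, max_depth=6, node_cap=1000):
    """Yield one tile per BFS level; stop early once node_cap nodes were seen."""
    seen = {start}
    frontier = [start]
    depth = 0
    while frontier and depth <= max_depth:
        yield {"depth": depth, "nodes": frontier, "truncated": False}
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency.get(node, ()):
                if neighbor in seen:
                    continue
                if len(seen) >= node_cap:
                    # Budget exhausted: flush what was collected and flag it.
                    yield {"depth": depth + 1, "nodes": next_frontier, "truncated": True}
                    return
                seen.add(neighbor)
                next_frontier.append(neighbor)
        frontier = next_frontier
        depth += 1
```

Because it is a generator, the client can start rendering the first tile while deeper levels are still being computed, and the `truncated` flag corresponds to the truncation flags mentioned for structured logs.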
 
3.9 Security
- Multi‑tenant isolation at storage and API layers.
- RBAC roles:
  - Viewer: browse graphs, saved queries
  - Investigator: run queries, export data
  - Operator: configure budgets, purge caches
  - Auditor: download evidence bundles
- Input validation for query JSON; deny disallowed edge traversals; strict CSP in web app.
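
The role list above can be expressed as a permission matrix. In this sketch the action names (`graph.browse`, `query.run`, and so on) are invented for illustration, and roles are treated as non-hierarchical because the epic does not say whether, for example, Investigator implies Viewer; a user simply holds a set of roles.

```python
ROLE_PERMISSIONS = {
    "viewer": {"graph.browse", "saved_query.read"},
    "investigator": {"query.run", "export.create"},
    "operator": {"budget.configure", "cache.purge"},
    "auditor": {"evidence.download"},
}


def is_allowed(roles, action):
    """True if any of the caller's roles grants the action; unknown roles grant nothing."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

The authoritative matrix belongs in `/docs/security/graph-rbac.md`; this only shows the deny-by-default lookup.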
 
3.10 Observability
- Metrics: tile latency, nodes/edges per tile, cache hit rate, query denials, memory pressure.
 - Logs: structured, include query hash, cost, truncation flags.
 - Traces: server spans per stage (parse, plan, fetch, overlay, stream).
 
3.11 Accessibility & UX guarantees
- Keyboard complete, ARIA roles for graph and panels, high‑contrast theme.
 - Deterministic layout on reload for shareable investigations.
 
3.12 Data retention
- Graph nodes derived from SBOMs share retention with SBOM artifacts; overlays are ephemeral caches.
 - Saved queries retained until deleted; references to missing objects show warnings.
 
4) Implementation plan
4.1 Services
- Graph Indexer (new microservice)
  - Subscribes to SBOM ingest events, Conseiller advisory updates, Excitator VEX updates, Policy overlay materializations.
  - Builds adjacency lists and node documents; computes aggregates and clusters.
- Graph API (new microservice)
  - Validates and executes queries; streams tiles; composes overlays; serves diffs and exports.
  - Integrates with Policy Engine for explain sample retrieval.
- SBOM Service (existing)
  - Emits ingestion events with SBOM ids and lineage; exposes SBOM metadata to Graph API.
- Web API Gateway
  - Routes `/graph/*`, injects tenant context, enforces RBAC.
 
4.2 Console (Web UI) feature module
- `packages/features/graph-explorer`
  - Canvas renderer (WebGL), panels, query builder, diff mode, overlays, exports.
  - Deep‑link router and viewport state serializer.
 
 
4.3 Workers
- Centrality/clustering worker, diff worker, overlay materialization worker.
 - Schedules on low‑traffic windows; backpressure aware.
 
4.4 Data model (storage)
- Collections:
  - `graph_nodes`: `{_id, tenant, type, natural_id, attrs, degree, created_at, updated_at}`
  - `graph_edges`: `{_id, tenant, from_id, to_id, type, attrs, valid_from, valid_to}`
  - `graph_snapshots`: per‑SBOM node/edge references
  - `graph_saved_queries`: `{_id, tenant, name, query_json, created_by}`
  - `graph_overlays_cache`: keyed by `{tenant, policy_version, hash(query)}`
- Indexes: compound on `{tenant, type, natural_id}`, `{tenant, from_id}`, `{tenant, to_id}`, time bounds.
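
The `hash(query)` component of the overlay cache key only works if equivalent queries hash identically. One way to get that, shown here as a sketch, is to hash a canonical JSON serialization; the choice of SHA-256 and the colon-joined key format are assumptions, not decisions made in this epic.

```python
import hashlib
import json


def query_hash(query):
    """Hash a canonical (sorted-key, whitespace-free) JSON form of the query."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def overlay_cache_key(tenant, policy_version, query):
    """Compose the cache key from tenant, policy version, and query hash."""
    return f"{tenant}:{policy_version}:{query_hash(query)}"
```

Including the tenant in the key keeps cached overlays from leaking across tenants even when two tenants issue the same query against the same policy version number.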
5) Documentation changes (create/update)
- `/docs/sbom/graph-explorer-overview.md`
  - Concepts, node/edge taxonomy, lenses, overlays, roles, limitations.
- `/docs/sbom/graph-using-the-console.md`
  - Walkthroughs: search, navigate, impact, diff, export; screenshots and keyboard cheatsheet.
- `/docs/sbom/graph-query-language.md`
  - JSON schema, examples, constraints, cost/budget rules.
- `/docs/sbom/graph-api.md`
  - REST endpoints, request/response examples, streaming and pagination.
- `/docs/sbom/graph-cli.md`
  - CLI command reference and example pipelines.
- `/docs/policy/graph-overlays.md`
  - How policy versions render in Graph; explain sampling; AOC guardrails.
- `/docs/vex/graph-integration.md`
  - How VEX suppressions appear and how to validate product scoping.
- `/docs/advisories/graph-integration.md`
  - Advisory linkage and severity normalization by policy.
- `/docs/architecture/graph-services.md`
  - Graph Indexer, Graph API, storage choices, failure modes.
- `/docs/observability/graph-telemetry.md`
  - Metrics, logs, tracing, dashboards.
- `/docs/runbooks/graph-incidents.md`
  - Handling runaway queries, cache poisoning, degraded render.
- `/docs/security/graph-rbac.md`
  - Permissions matrix, multi‑tenant boundaries.
 
 
Every doc should end with a “Compliance checklist.” Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
6) Tasks
6.1 Backend: Graph Indexer
- Define node/edge schemas and attribute dictionaries for each type.
 - Implement event consumers for SBOM ingests, Conseiller updates, Excitator updates.
- Build ingestion pipeline that populates nodes/edges and maintains `valid_from`/`valid_to`.
- Implement aggregate counters and degree metrics.
 - Implement clustering job and persist cluster ids per node.
 - Implement snapshot materialization per SBOM and lineage tracking.
 - Unit tests for each node/edge builder; property‑based tests for identity stability.
 
6.2 Backend: Graph API
- Implement `/graph/search` with prefix and exact match across node types.
- Implement `/graph/query` with validation, planning, cost estimation, and streaming tile results.
- Implement `/graph/paths` with constraints and depth limits; shortest path heuristic.
- Implement `/graph/diff` computing adds/removes/changed versions; stream results.
- Implement overlays: advisory join, VEX join, policy materialization and explain sampling.
 - Implement exports: GraphML, CSV edge list, NDJSON findings, PNG/SVG snapshots.
 - RBAC middleware integration; multi‑tenant scoping.
 - Load tests with synthetic large SBOMs; define default budgets.
 
6.3 Policy Engine integration
- Add endpoint to fetch explain traces for specific node ids in batch.
 - Add materialization export that Graph API can cache per policy version.
 
6.4 Console (Web UI)
- Create `graph-explorer` module with routes `/graph`, `/graph/diff`, `/graph/q/:id`.
- Implement WebGL canvas with LOD, culling, edge bundling, deterministic layout seed.
 - Build search, filter, lens, and overlay toolbars.
 - Side panels: details, evidence tabs, explain viewer.
 - Diff mode: split/overlay toggles and color legend.
 - Saved queries: create, update, run; deep links.
 - Export UI: formats, server round‑trip, progress indicators.
 - a11y audit and keyboard‑only flow.
 
6.5 CLI
- Implement `stella sbom graph *` subcommands with JSON IO and piping support.
- Document examples and stable output schemas for CI consumption.
 
6.6 Observability & Ops
- Dashboards for tile latency, query denials, cache hit rate, memory.
 - Alerting on query error spikes, OOM risk, cache churn.
- Runbooks in `/docs/runbooks/graph-incidents.md`.
6.7 Docs
- Author all docs in section 5, link from Console contextual help.
 - Add end‑to‑end tutorial: “Investigate GHSA‑XXXX across prod artifacts.”
 
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Acceptance criteria
- Console renders large SBOM graphs with semantic zoom, overlays, and responsive interactions.
 - Users can run impact and path queries with bounded depth and get results within budget.
 - VEX suppressions and advisory severities appear correctly and are consistent with policy.
 - Diff view clearly shows added/removed/changed nodes/edges between two SBOMs.
 - Saved queries and deep links reproduce the same view deterministically (given same data).
 - Exports produce valid GraphML/CSV/NDJSON and image snapshots.
 - CLI supports search, query, paths, impacted, diff, and export with stable schemas.
 - AOC guardrails: explorer never mutates facts; overlays reflect Policy Engine decisions.
 - RBAC enforced; all actions logged and observable.
 
8) Risks & mitigations
- Graph explosion on large monorepos → tiling, clustering, budgets, and strict depth limits.
 - Inconsistent identities across tools → canonicalize purls/digests; property‑based tests for identity stability.
 - Policy overlay latency → precompute materializations for hot policy versions; sample explains only on focus.
 - User confusion → strong lens defaults, deterministic layouts, legends, in‑context help.
 
9) Test plan
- Unit: node/edge builders, identity normalization, cost estimator.
 - Integration: ingest SBOM + advisories + VEX, verify overlays and counts.
 - E2E: Playwright flows for search→impact→diff→export; deep link determinism.
 - Performance: simulate 500k nodes/2M edges; measure tile latency and memory.
 - Security: RBAC matrix; tenant isolation tests; query validation fuzzing.
 - Determinism: snapshot round‑trip: same query and seed produce identical layout and stats.
 
10) Feature flags
- `graph.explorer` (UI feature module)
- `graph.paths` (advanced path queries)
- `graph.diff` (SBOM diff mode)
- `graph.overlays.policy` (policy overlay + explain sampling)
- `graph.export` (exports enabled)
Documented in /docs/observability/graph-telemetry.md.
11) Non‑goals (this epic)
- Real‑time process/runtime call graphs.
 - Full substitution for text reports; Explorer complements Reports.
 - Cross‑tenant graphs; all queries are tenant‑scoped.
 
12) Philosophy
- See the system: security and license risk are structural. If you cannot see structure, you will miss risk.
 - Evidence over assertion: every colored node corresponds to raw facts and explainable determinations.
 - Bounded interactivity: fast, partial answers beat slow “complete” ones.
 - Immutability: graphs mirror SBOM snapshots and are never rewritten; we add context, not edits.
 
Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.