	Here’s Epic 5 in the same paste‑into‑repo, implementation‑ready format as the prior epics. It’s exhaustive, formal, and designed to slot into AOC, Policy Engine, Conseiller/Excitator, and the Console.
Epic 5: SBOM Graph Explorer
Short name: Graph Explorer
Services touched: SBOM Service, Graph Indexer (new), Graph API (new), Policy Engine, Conseiller (Feedser), Excitator (Vexer), Web API Gateway, Authority (authN/Z), Workers/Scheduler, Telemetry
Surfaces: Console (Web UI) graph module, CLI, Exports
Deliverables: Interactive graph UI with semantic zoom, saved queries, policy/VEX/advisory overlays, diff views, impact analysis, exports
1) What it is
SBOM Graph Explorer is the interactive, tenant‑scoped view of all supply‑chain relationships the platform knows about, rendered as a navigable graph. It connects:
- Artifacts (applications, images, libs), Packages/Versions, Files/Paths, Licenses, Advisories (from Conseiller), VEX statements (from Excitator), Provenance (builds, sources), and Policies (overlays of determinations)
- Edges like depends_on, contains, built_from, declared_in, affected_by, vex_exempts, governs_with
- Time/version dimension: multiple SBOM snapshots with diffs
It’s built for investigation and review: find where a vulnerable package enters; see which apps are impacted; understand why a finding exists; simulate a policy version and see the delta. The explorer observes AOC enforcement: it never mutates facts; it aggregates and visualizes them. Only the Policy Engine may classify, and classification is displayed as overlays.
2) Why
- SBOMs are graphs. Tables flatten what matters and hide transitive risk.
- Engineers, security, and auditors need impact answers quickly: “What pulls in log4j:2.17 and where is it at runtime?”
- Policy/VEX/advisory interactions are nuanced. A visual overlay makes precedence and outcomes obvious.
- Review is collaborative; you need saved queries, deep links, exports, and consistent evidence.
3) How it should work (maximum detail)
3.1 Domain model
Nodes (typed, versioned, tenant‑scoped):
- Artifact: application, service, container image, library, module
- Package: name + ecosystem (purl), PackageVersion with resolved version
- File: path within artifact or image layer
- License: SPDX id
- Advisory: normalized advisory id (GHSA, CVE, vendor), source = Conseiller
- VEX: statement with product context, status, justification, source = Excitator
- SBOM: ingestion unit; includes metadata (tool, sha, build info)
- PolicyDetermination: materialized view of Policy Engine results (read‑only overlay)
- Build: provenance, commit, workflow run
- Source: repo, tag, commit
Edges (directed):
- declared_in(PackageVersion → SBOM)
- contains(Artifact → PackageVersion | File)
- depends_on(PackageVersion → PackageVersion) with scope attr (prod|dev|test|optional)
- built_from(Artifact → Build), provenance_of(Build → Source)
- affected_by(PackageVersion → Advisory) with range semantics
- vex_exempts(Advisory ↔ VEX) scoped by product/component
- licensed_under(Artifact|PackageVersion → License)
- governs_with(Artifact|PackageVersion → PolicyDetermination)
- derived_from(SBOM → SBOM) for superseding snapshots
Identity & versioning
- Every node has a stable key: {tenant}:{type}:{natural_id} (e.g., purl for packages, digest for images).
- SBOM snapshots are immutable; edges carry valid_from/valid_to for time travel and diffing.
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
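A minimal sketch of the identity and versioning rules in 3.1, assuming a half‑open validity interval and that an open‑ended edge stores no valid_to. The names node_key, Edge, and edges_at are hypothetical and not part of any existing service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


def node_key(tenant: str, node_type: str, natural_id: str) -> str:
    """Build the stable key {tenant}:{type}:{natural_id}."""
    if not tenant or not node_type or not natural_id:
        raise ValueError("tenant, node_type and natural_id are required")
    # natural_id may contain ':' (purls do); tenant and type must not (assumption).
    if ":" in tenant or ":" in node_type:
        raise ValueError("tenant and node_type must not contain ':'")
    return f"{tenant}:{node_type}:{natural_id}"


@dataclass(frozen=True)
class Edge:
    from_id: str
    to_id: str
    type: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid" (assumption)

    def active_at(self, at: datetime) -> bool:
        """True if the edge existed at `at`, treating [valid_from, valid_to) as half-open."""
        return self.valid_from <= at and (self.valid_to is None or at < self.valid_to)


def edges_at(edges: List[Edge], at: datetime) -> List[Edge]:
    """Time travel: keep only the edges that were valid at the given instant."""
    return [edge for edge in edges if edge.active_at(at)]
```

Because edges are filtered by time rather than rewritten, the same edge set can answer both "what did the graph look like at build 120" and "what changed since", which is the property diffing relies on.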
3.2 User capabilities (end‑to‑end)
- Search & Navigate: global search (purls, CVEs, repos, licenses), keyboard nav, breadcrumbs, semantic zoom.
- Lenses: toggle views (Security, License, Provenance, Runtime vs Dev, Policy effect).
- Overlays:
  - Advisory overlay: show affected nodes/edges with source, severity, ranges.
  - VEX overlay: show suppressions/justifications; collapse exempted paths.
  - Policy overlay: choose a policy version; nodes/edges reflect determinations (severity, status) with explain sampling.
- Impact analysis: pick a vulnerable node; highlight upstream/downstream dependents, scope filters, shortest/all paths with constraints.
- Diff view: compare SBOM A vs B; show added/removed nodes/edges, changed versions, changed determinations.
- Saved queries: visual builder + JSON query; shareable permalinks scoped by tenant and environment.
- Exports: GraphML, CSV edge list, NDJSON of findings, PNG/SVG snapshot.
- Evidence details: side panel with raw facts, advisory links, VEX statements, policy explain trace, provenance.
- Accessibility: tab‑navigable, high‑contrast, screen‑reader labels for nodes and sidebars.
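The two text export formats named in the Exports capability can be sketched with the Python standard library. This is illustrative only: the column names, the GraphML key id d0, and the function names are assumptions, and a real export would also carry node attributes.

```python
import csv
import io
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_csv_edge_list(edges):
    """Render (from_id, to_id, type) triples as a CSV edge list with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["from_id", "to_id", "type"])
    for from_id, to_id, edge_type in sorted(edges):
        writer.writerow([from_id, to_id, edge_type])
    return buf.getvalue()


def to_graphml(nodes, edges):
    """Render node ids and typed, directed edges as a GraphML document string."""
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # Declare one edge attribute ("type") so each edge can carry its edge type.
    ET.SubElement(root, "key", {"id": "d0", "for": "edge",
                                "attr.name": "type", "attr.type": "string"})
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})
    for node_id in sorted(nodes):
        ET.SubElement(graph, "node", {"id": node_id})
    for from_id, to_id, edge_type in sorted(edges):
        edge = ET.SubElement(graph, "edge", {"source": from_id, "target": to_id})
        ET.SubElement(edge, "data", {"key": "d0"}).text = edge_type
    return ET.tostring(root, encoding="unicode")
```

Sorting nodes and edges before writing keeps the output byte‑stable for the same input, which helps when exports are compared in CI.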
3.3 Query model
- Visual builder for common queries:
  - “Show all paths from Artifact X to Advisory Y up to depth 6.”
  - “All runtime dependencies with license = GPL‑3.0.”
  - “All artifacts affected by GHSA‑… with no applicable VEX.”
  - “Which SBOMs introduced/removed openssl between build 120 and 130?”
- JSON query (internal, POST body) with:
  - start: list of node selectors (type + id or attributes)
  - expand: edge types and depth, direction, scope filters
  - where: predicates on node/edge attributes
  - overlay: policy version id, advisory sources, VEX filters
  - limit: nodes, edges, timebox, cost budget
- Cost control: server estimates cost, denies or pages results; UI streams partial graph tiles.
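As a rough illustration of the JSON query and cost control described above, the sketch below validates a query and estimates its size before admitting it. The sub‑field names inside expand (edge_types, depth), the average‑degree heuristic, and the function names are assumptions; the spec fixes only the top‑level keys and the depth limit of 6.

```python
from typing import List

MAX_INTERACTIVE_DEPTH = 6  # interactive depth cap stated in this epic
KNOWN_EDGE_TYPES = {
    "declared_in", "contains", "depends_on", "built_from", "provenance_of",
    "affected_by", "vex_exempts", "licensed_under", "governs_with", "derived_from",
}


def validate_query(query: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the query is valid."""
    errors = []
    if not query.get("start"):
        errors.append("start: at least one node selector is required")
    expand = query.get("expand") or {}
    depth = expand.get("depth", 1)
    if not isinstance(depth, int) or not 1 <= depth <= MAX_INTERACTIVE_DEPTH:
        errors.append(f"expand.depth: must be an integer in 1..{MAX_INTERACTIVE_DEPTH}")
    unknown = sorted(set(expand.get("edge_types", [])) - KNOWN_EDGE_TYPES)
    if unknown:
        errors.append(f"expand.edge_types: unknown types {unknown}")
    return errors


def estimate_nodes(query: dict, avg_degree: float = 8.0) -> int:
    """Crude upper bound: each start node fans out by avg_degree per level."""
    starts = len(query.get("start", []))
    depth = (query.get("expand") or {}).get("depth", 1)
    return int(starts * sum(avg_degree ** level for level in range(depth + 1)))


def admit(query: dict, node_budget: int) -> str:
    """Decide whether to run the query: 'invalid', 'over_budget', or 'ok'."""
    if validate_query(query):
        return "invalid"
    return "ok" if estimate_nodes(query) <= node_budget else "over_budget"
```

A production estimator would more likely use the precomputed degree aggregates than a fixed average, but the admit/deny decision has the same shape.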
3.4 UI architecture (Console)
- Canvas: WebGL renderer with level‑of‑detail, edge bundling, and label culling; deterministic layout when possible (seeded).
- Semantic zoom:
  - Far: clusters by artifact/repo/ecosystem, color by lens
  - Mid: package groups, advisory badges, license swatches
  - Near: concrete versions, direct edges, inline badges for policy determinations
- Panels:
  - Left: search, filters, lens selector, saved queries
  - Right: details, explain trace, evidence tabs (Advisory/VEX/Policy/Provenance)
  - Bottom: query expression, diagnostics, performance/stream status
- Diff mode: split or overlay, color legend (add/remove/changed), filter by node type.
- Deep links: URL encodes query + viewport; shareable respecting RBAC.
- Keyboard: space drag, +/- zoom, F to focus, G to expand neighbors, P to show paths.
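One possible way to encode a deep link, shown only to make the "URL encodes query + viewport" item concrete. The state parameter name, the base64url‑encoded JSON payload, and the viewport fields are assumptions, not a decided format; RBAC is still enforced server‑side when the link is opened.

```python
import base64
import json


def encode_deep_link(query: dict, viewport: dict, base: str = "/graph") -> str:
    """Serialize query + viewport canonically so equal states yield equal URLs."""
    payload = json.dumps({"q": query, "v": viewport},
                         sort_keys=True, separators=(",", ":"))
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return f"{base}?state={token.rstrip('=')}"


def decode_deep_link(url: str):
    """Recover (query, viewport) from a link produced by encode_deep_link."""
    token = url.split("state=", 1)[1]
    padded = token + "=" * (-len(token) % 4)  # restore stripped base64 padding
    state = json.loads(base64.urlsafe_b64decode(padded))
    return state["q"], state["v"]
```

Canonical serialization (sorted keys, fixed separators) is what makes two users with the same query and viewport end up with the same URL.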
3.5 Back‑end architecture
Graph Indexer (new)
- Consumes SBOM ingests, Conseiller advisories, Excitator VEX statements, Policy Engine determinations (read‑only).
- Projects facts into a property graph persisted in:
  - Primary: document store + adjacency sets (e.g., Mongo collections + compressed adjacency lists)
  - Optional driver for graph DB backends if needed (pluggable)
- Maintains materialized aggregates: degree, critical paths cache, affected artifact counts, license distribution.
- Emits graph snapshots per SBOM with lineage to original ingestion.
Graph API (new)
- Endpoints for search, neighbor expansion, path queries, diffs, overlays, exports.
- Streaming responses for large graphs (chunked NDJSON tiles).
- Cost accounting + quotas per tenant.
Workers
- Centrality & clustering precompute on idle: betweenness approximations, connected components, Louvain clusters.
- Diff compute on new SBOM ingestion pairs (previous vs current).
- Overlay materialization cache for popular policy versions.
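The diff worker's core computation can be sketched as set arithmetic over two snapshots. The input shapes below (a map from package identity to resolved version, and a set of edge triples) are assumptions chosen for illustration.

```python
def diff_packages(old: dict, new: dict) -> dict:
    """Compare {package_identity: version} maps from two SBOM snapshots."""
    added = {name: new[name] for name in new.keys() - old.keys()}
    removed = {name: old[name] for name in old.keys() - new.keys()}
    changed = {name: (old[name], new[name])
               for name in old.keys() & new.keys() if old[name] != new[name]}
    return {"added": added, "removed": removed, "changed": changed}


def diff_edges(old: set, new: set) -> dict:
    """Compare sets of (from_id, to_id, type) triples from two snapshots."""
    return {"added": sorted(new - old), "removed": sorted(old - new)}
```

For example, upgrading openssl between two builds shows up under changed as an (old version, new version) pair rather than as one removal plus one addition, which is what the diff view needs to render "changed versions".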
Policy Engine integration
- Graph API requests can specify a policy version.
- For sampled nodes, the API fetches explain traces; for counts, uses precomputed overlay materializations where available.
AOC enforcement
- Graph Indexer never merges or edits advisories/VEX; it links them and exposes overlays that the Policy Engine evaluates.
- Conseiller and Excitator remain authoritative sources; severities come from Policy‑governed normalization.
3.6 APIs (representative)
- GET /graph/search?q=...&type=package|artifact|advisory|license
- POST /graph/query ⇒ stream tiles {nodes[], edges[], stats, cursor}
- POST /graph/paths body: {from, to, depth<=6, constraints{scope, runtime_only}}
- POST /graph/diff body: {sbom_a, sbom_b, filters}
- GET /graph/snapshots/{sbom_id} ⇒ graph metadata, counts, top advisories
- POST /graph/export body: {format: graphml|csv|ndjson|png|svg, query|snapshot}
- GET /graph/saved, POST /graph/saved: list and save tenant queries
- GET /graph/overlays/policy/{version_id} ⇒ summary stats for caching
All endpoints are tenant‑scoped and RBAC‑checked. The server enforces timeouts and pagination. Errors return structured diagnostics.
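The behavior behind POST /graph/paths can be approximated by a depth‑bounded breadth‑first search. This sketch assumes an in‑memory adjacency map and assumes that runtime_only means "follow only edges whose scope is prod or unset"; neither is specified above, and shortest_path is a hypothetical name.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

MAX_DEPTH = 6  # depth<=6 as stated for /graph/paths

# adjacency maps a node id to (neighbor id, scope) pairs; scope is None for
# edge types that carry no scope attribute (e.g. contains, affected_by).
Adjacency = Dict[str, List[Tuple[str, Optional[str]]]]


def shortest_path(adjacency: Adjacency, start: str, goal: str,
                  max_depth: int = MAX_DEPTH,
                  runtime_only: bool = False) -> Optional[List[str]]:
    """Return the shortest path of at most max_depth edges, or None if none exists."""
    if not 1 <= max_depth <= MAX_DEPTH:
        raise ValueError(f"max_depth must be in 1..{MAX_DEPTH}")
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth budget exhausted on this branch
        for neighbor, scope in adjacency.get(node, []):
            if runtime_only and scope not in (None, "prod"):
                continue  # skip dev/test/optional dependency edges
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

Because BFS visits nodes in order of distance, marking a node as seen when it is first enqueued is safe even with the depth cap: no later visit could reach it by a shorter route.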
3.7 CLI
stella sbom graph search "purl:pkg:maven/org.apache.logging.log4j/log4j-core"
stella sbom graph query --file ./query.json --export graphml > graph.graphml
stella sbom graph impacted --advisory GHSA-xxxx --runtime-only --limit 100
stella sbom graph paths --from artifact:service-a --to advisory:GHSA-xxxx --depth 5 --policy 1.3.0
stella sbom graph diff --sbom-a 2025-03-15T10:00Z --sbom-b 2025-03-22T10:00Z --export csv > diff.csv
stella sbom graph save --name "openssl-runtime" --file ./query.json
Exit codes: 0 ok, 2 query validation error, 3 over‑budget, 4 not found, 5 RBAC denied.
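For CI pipelines that wrap the CLI, the exit codes above can be mirrored in a small helper. The enum and the interpret function are hypothetical conveniences; only the numeric values come from this spec, and any other return code is treated as unknown.

```python
from enum import IntEnum
from typing import Optional, Tuple


class GraphExitCode(IntEnum):
    OK = 0
    QUERY_VALIDATION_ERROR = 2
    OVER_BUDGET = 3
    NOT_FOUND = 4
    RBAC_DENIED = 5


def interpret(returncode: int) -> Tuple[Optional[GraphExitCode], bool]:
    """Return (known exit code or None, whether the CI step should fail)."""
    try:
        code = GraphExitCode(returncode)
    except ValueError:
        return None, True  # unknown code: fail closed
    return code, code is not GraphExitCode.OK
```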
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
3.8 Performance & scale
- Progressive loading: server pages tiles by BFS frontier; client renders incrementally.
- Viewport culling: only visible nodes/edges in canvas; offscreen demoted to aggregates.
- Level‑of‑detail: simplified glyphs and collapsed clusters at distance.
- Query budgets: per‑tenant rate + node/edge caps; interactive paths limited to depth ≤ 6.
- Caching: hot queries memoized per tenant + overlay version; diffs precomputed for consecutive SBOMs.
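Progressive loading by BFS frontier, combined with a node cap, might look like the generator below. The tile fields (depth, nodes, edges, truncated) and the in‑memory adjacency map are assumptions; the real Graph API tiles also carry stats and a cursor.

```python
import json
from typing import Dict, Iterator, List


def stream_tiles(adjacency: Dict[str, List[str]], start: str,
                 max_depth: int = 6, max_nodes: int = 1000) -> Iterator[dict]:
    """Yield one tile per BFS level until the depth or node budget is reached."""
    seen = {start}
    frontier = [start]
    yield {"depth": 0, "nodes": [start], "edges": [], "truncated": False}
    for depth in range(1, max_depth + 1):
        nodes, edges, truncated = [], [], False
        for src in frontier:
            for dst in adjacency.get(src, []):
                if dst not in seen:
                    if len(seen) >= max_nodes:
                        truncated = True  # node cap hit: drop node and its edge
                        continue
                    seen.add(dst)
                    nodes.append(dst)
                edges.append([src, dst])
        if nodes or edges or truncated:
            yield {"depth": depth, "nodes": nodes, "edges": edges,
                   "truncated": truncated}
        if truncated or not nodes:
            break
        frontier = nodes


def to_ndjson(tiles: Iterator[dict]) -> Iterator[str]:
    """Encode each tile as one NDJSON line, ready for a chunked response."""
    for tile in tiles:
        yield json.dumps(tile, separators=(",", ":")) + "\n"
```

The client can render each tile as it arrives, and the truncated flag lets the UI show that the budget was reached rather than implying the graph is complete.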
3.9 Security
- Multi‑tenant isolation at storage and API layers.
- RBAC roles:
  - Viewer: browse graphs, saved queries
  - Investigator: run queries, export data
  - Operator: configure budgets, purge caches
  - Auditor: download evidence bundles
- Input validation for query JSON; deny disallowed edge traversals; strict CSP in web app.
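A minimal sketch of the role check combined with tenant isolation. The permission strings are hypothetical, and the roles are treated as non‑cumulative (an Investigator who should also browse holds the Viewer role too), which is an assumption the spec does not settle.

```python
from typing import Iterable

# Hypothetical permission names derived from the role descriptions above.
ROLE_PERMISSIONS = {
    "viewer": {"graph:browse", "saved_query:read"},
    "investigator": {"graph:query", "graph:export"},
    "operator": {"budget:configure", "cache:purge"},
    "auditor": {"evidence:download"},
}


def is_allowed(roles: Iterable[str], permission: str,
               caller_tenant: str, resource_tenant: str) -> bool:
    """Deny across tenants first, then require a role that grants the permission."""
    if caller_tenant != resource_tenant:
        return False
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

Checking the tenant before the role keeps the cross‑tenant denial independent of how roles are configured.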
3.10 Observability
- Metrics: tile latency, nodes/edges per tile, cache hit rate, query denials, memory pressure.
- Logs: structured, include query hash, cost, truncation flags.
- Traces: server spans per stage (parse, plan, fetch, overlay, stream).
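The structured log line described above could be produced as follows. The field names and the choice of SHA‑256 over canonical JSON for the query hash are assumptions made for illustration.

```python
import hashlib
import json


def query_hash(query: dict) -> str:
    """Hash the canonical JSON form so equivalent queries share one hash."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_record(tenant: str, query: dict, cost: int,
               truncated: bool, stage_ms: dict) -> str:
    """Build one JSON log line with query hash, cost, and truncation flag."""
    return json.dumps({
        "event": "graph.query",
        "tenant": tenant,
        "query_hash": query_hash(query),
        "cost": cost,
        "truncated": truncated,
        "stage_ms": stage_ms,  # per-stage timings: parse, plan, fetch, overlay, stream
    }, sort_keys=True)
```

Logging the hash instead of the query body keeps log lines small and lets dashboards group repeated queries without storing selectors that may be sensitive.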
3.11 Accessibility & UX guarantees
- Keyboard complete, ARIA roles for graph and panels, high‑contrast theme.
- Deterministic layout on reload for shareable investigations.
3.12 Data retention
- Graph nodes derived from SBOMs share retention with SBOM artifacts; overlays are ephemeral caches.
- Saved queries retained until deleted; references to missing objects show warnings.
4) Implementation plan
4.1 Services
- Graph Indexer (new microservice)
  - Subscribes to SBOM ingest events, Conseiller advisory updates, Excitator VEX updates, Policy overlay materializations.
  - Builds adjacency lists and node documents; computes aggregates and clusters.
- Graph API (new microservice)
  - Validates and executes queries; streams tiles; composes overlays; serves diffs and exports.
  - Integrates with Policy Engine for explain sample retrieval.
- SBOM Service (existing)
  - Emits ingestion events with SBOM ids and lineage; exposes SBOM metadata to Graph API.
- Web API Gateway
  - Routes /graph/*, injects tenant context, enforces RBAC.
4.2 Console (Web UI) feature module
- packages/features/graph-explorer
  - Canvas renderer (WebGL), panels, query builder, diff mode, overlays, exports.
  - Deep‑link router and viewport state serializer.
4.3 Workers
- Centrality/clustering worker, diff worker, overlay materialization worker.
- Schedules on low‑traffic windows; backpressure aware.
4.4 Data model (storage)
- Collections:
  - graph_nodes: {_id, tenant, type, natural_id, attrs, degree, created_at, updated_at}
  - graph_edges: {_id, tenant, from_id, to_id, type, attrs, valid_from, valid_to}
  - graph_snapshots: per‑SBOM node/edge references
  - graph_saved_queries: {_id, tenant, name, query_json, created_by}
  - graph_overlays_cache: keyed by {tenant, policy_version, hash(query)}
- Indexes: compound on {tenant, type, natural_id}, {tenant, from_id}, {tenant, to_id}, time bounds.
5) Documentation changes (create/update)
- /docs/sbom/graph-explorer-overview.md: Concepts, node/edge taxonomy, lenses, overlays, roles, limitations.
- /docs/sbom/graph-using-the-console.md: Walkthroughs: search, navigate, impact, diff, export; screenshots and keyboard cheatsheet.
- /docs/sbom/graph-query-language.md: JSON schema, examples, constraints, cost/budget rules.
- /docs/sbom/graph-api.md: REST endpoints, request/response examples, streaming and pagination.
- /docs/sbom/graph-cli.md: CLI command reference and example pipelines.
- /docs/policy/graph-overlays.md: How policy versions render in Graph; explain sampling; AOC guardrails.
- /docs/vex/graph-integration.md: How VEX suppressions appear and how to validate product scoping.
- /docs/advisories/graph-integration.md: Advisory linkage and severity normalization by policy.
- /docs/architecture/graph-services.md: Graph Indexer, Graph API, storage choices, failure modes.
- /docs/observability/graph-telemetry.md: Metrics, logs, tracing, dashboards.
- /docs/runbooks/graph-incidents.md: Handling runaway queries, cache poisoning, degraded render.
- /docs/security/graph-rbac.md: Permissions matrix, multi‑tenant boundaries.
Every doc should end with a “Compliance checklist.” Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
6) Tasks
6.1 Backend: Graph Indexer
- Define node/edge schemas and attribute dictionaries for each type.
- Implement event consumers for SBOM ingests, Conseiller updates, Excitator updates.
- Build ingestion pipeline that populates nodes/edges and maintains valid_from/valid_to.
- Implement aggregate counters and degree metrics.
- Implement clustering job and persist cluster ids per node.
- Implement snapshot materialization per SBOM and lineage tracking.
- Unit tests for each node/edge builder; property‑based tests for identity stability.
6.2 Backend: Graph API
- Implement /graph/search with prefix and exact match across node types.
- Implement /graph/query with validation, planning, cost estimation, and streaming tile results.
- Implement /graph/paths with constraints and depth limits; shortest path heuristic.
- Implement /graph/diff computing adds/removes/changed versions; stream results.
- Implement overlays: advisory join, VEX join, policy materialization and explain sampling.
- Implement exports: GraphML, CSV edge list, NDJSON findings, PNG/SVG snapshots.
- RBAC middleware integration; multi‑tenant scoping.
- Load tests with synthetic large SBOMs; define default budgets.
6.3 Policy Engine integration
- Add endpoint to fetch explain traces for specific node ids in batch.
- Add materialization export that Graph API can cache per policy version.
6.4 Console (Web UI)
- Create graph-explorer module with routes /graph, /graph/diff, /graph/q/:id.
- Implement WebGL canvas with LOD, culling, edge bundling, deterministic layout seed.
- Build search, filter, lens, and overlay toolbars.
- Side panels: details, evidence tabs, explain viewer.
- Diff mode: split/overlay toggles and color legend.
- Saved queries: create, update, run; deep links.
- Export UI: formats, server round‑trip, progress indicators.
- a11y audit and keyboard‑only flow.
6.5 CLI
- Implement stella sbom graph * subcommands with JSON IO and piping support.
- Document examples and stable output schemas for CI consumption.
6.6 Observability & Ops
- Dashboards for tile latency, query denials, cache hit rate, memory.
- Alerting on query error spikes, OOM risk, cache churn.
- Runbooks in /docs/runbooks/graph-incidents.md.
6.7 Docs
- Author all docs in section 5, link from Console contextual help.
- Add end‑to‑end tutorial: “Investigate GHSA‑XXXX across prod artifacts.”
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Acceptance criteria
- Console renders large SBOM graphs with semantic zoom, overlays, and responsive interactions.
- Users can run impact and path queries with bounded depth and get results within budget.
- VEX suppressions and advisory severities appear correctly and are consistent with policy.
- Diff view clearly shows added/removed/changed nodes/edges between two SBOMs.
- Saved queries and deep links reproduce the same view deterministically (given same data).
- Exports produce valid GraphML/CSV/NDJSON and image snapshots.
- CLI supports search, query, paths, impacted, diff, and export with stable schemas.
- AOC guardrails: explorer never mutates facts; overlays reflect Policy Engine decisions.
- RBAC enforced; all actions logged and observable.
8) Risks & mitigations
- Graph explosion on large monorepos → tiling, clustering, budgets, and strict depth limits.
- Inconsistent identities across tools → canonicalize purls/digests; property‑based tests for identity stability.
- Policy overlay latency → precompute materializations for hot policy versions; sample explains only on focus.
- User confusion → strong lens defaults, deterministic layouts, legends, in‑context help.
9) Test plan
- Unit: node/edge builders, identity normalization, cost estimator.
- Integration: ingest SBOM + advisories + VEX, verify overlays and counts.
- E2E: Playwright flows for search→impact→diff→export; deep link determinism.
- Performance: simulate 500k nodes/2M edges; measure tile latency and memory.
- Security: RBAC matrix; tenant isolation tests; query validation fuzzing.
- Determinism: snapshot round‑trip; the same query and seed produce identical layout and stats.
10) Feature flags
- graph.explorer (UI feature module)
- graph.paths (advanced path queries)
- graph.diff (SBOM diff mode)
- graph.overlays.policy (policy overlay + explain sampling)
- graph.export (exports enabled)
Documented in /docs/observability/graph-telemetry.md.
11) Non‑goals (this epic)
- Real‑time process/runtime call graphs.
- Full substitution for text reports; Explorer complements Reports.
- Cross‑tenant graphs; all queries are tenant‑scoped.
12) Philosophy
- See the system: security and license risk are structural. If you cannot see structure, you will miss risk.
- Evidence over assertion: every colored node corresponds to raw facts and explainable determinations.
- Bounded interactivity: fast, partial answers beat slow “complete” ones.
- Immutability: graphs mirror SBOM snapshots and are never rewritten; we add context, not edits.
Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.