Here’s Epic 5 in the same paste‑into‑repo, implementation‑ready format as the prior epics. It’s exhaustive, formal, and designed to slot into AOC, Policy Engine, Conseiller/Excitator, and the Console.
Epic 5: SBOM Graph Explorer
Short name: Graph Explorer
Services touched: SBOM Service, Graph Indexer (new), Graph API (new), Policy Engine, Conseiller (Feedser), Excitator (Vexer), Web API Gateway, Authority (authN/Z), Workers/Scheduler, Telemetry
Surfaces: Console (Web UI) graph module, CLI, Exports
Deliverables: Interactive graph UI with semantic zoom, saved queries, policy/VEX/advisory overlays, diff views, impact analysis, exports
1) What it is
SBOM Graph Explorer is the interactive, tenant‑scoped view of all supply‑chain relationships the platform knows about, rendered as a navigable graph. It connects:
- Artifacts (applications, images, libs), Packages/Versions, Files/Paths, Licenses, Advisories (from Conseiller), VEX statements (from Excitator), Provenance (builds, sources), and Policies (overlays of determinations)
- Edges like depends_on, contains, built_from, declared_in, affected_by, vex_exempts, governs_with
- Time/version dimension: multiple SBOM snapshots with diffs
It’s built for investigation and review: find where a vulnerable package enters; see which apps are impacted; understand why a finding exists; simulate a policy version and see the delta. The explorer observes AOC enforcement: it never mutates facts; it aggregates and visualizes them. Only the Policy Engine may classify, and classification is displayed as overlays.
2) Why
- SBOMs are graphs. Tables flatten what matters and hide transitive risk.
- Engineers, security, and auditors need impact answers quickly: “What pulls in log4j:2.17 and where is it at runtime?”
- Policy/VEX/advisory interactions are nuanced. A visual overlay makes precedence and outcomes obvious.
- Review is collaborative; you need saved queries, deep links, exports, and consistent evidence.
3) How it should work (maximum detail)
3.1 Domain model
Nodes (typed, versioned, tenant‑scoped):
- Artifact: application, service, container image, library, module
- Package: name + ecosystem (purl), PackageVersion with resolved version
- File: path within artifact or image layer
- License: SPDX id
- Advisory: normalized advisory id (GHSA, CVE, vendor), source = Conseiller
- VEX: statement with product context, status, justification, source = Excitator
- SBOM: ingestion unit; includes metadata (tool, sha, build info)
- PolicyDetermination: materialized view of Policy Engine results (read‑only overlay)
- Build: provenance, commit, workflow run
- Source: repo, tag, commit
Edges (directed):
- declared_in (PackageVersion → SBOM)
- contains (Artifact → PackageVersion | File)
- depends_on (PackageVersion → PackageVersion) with scope attr (prod|dev|test|optional)
- built_from (Artifact → Build), provenance_of (Build → Source)
- affected_by (PackageVersion → Advisory) with range semantics
- vex_exempts (Advisory ↔ VEX) scoped by product/component
- licensed_under (Artifact|PackageVersion → License)
- governs_with (Artifact|PackageVersion → PolicyDetermination)
- derived_from (SBOM → SBOM) for superseding snapshots
Identity & versioning
- Every node has a stable key: {tenant}:{type}:{natural_id} (e.g., purl for packages, digest for images).
- SBOM snapshots are immutable; edges carry valid_from/valid_to for time travel and diffing.
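A minimal sketch of the two identity rules above: composing the stable key and checking whether an edge is valid at a point in time. The helper names, the treatment of a missing valid_to as "still valid", and the half-open interval are assumptions made for illustration, not confirmed behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


def node_key(tenant: str, node_type: str, natural_id: str) -> str:
    """Compose the stable key {tenant}:{type}:{natural_id}."""
    if not tenant or not node_type or not natural_id:
        raise ValueError("tenant, node_type and natural_id are all required")
    # natural_id may itself contain ':' (purls, digests); the prefix parts may not.
    if ":" in tenant or ":" in node_type:
        raise ValueError("tenant and node_type must not contain ':'")
    return f"{tenant}:{node_type}:{natural_id}"


@dataclass(frozen=True)
class Edge:
    from_id: str
    to_id: str
    type: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # assumption: None means "still valid"

    def valid_at(self, at: datetime) -> bool:
        """Assumed half-open interval: valid_from <= at < valid_to."""
        if at < self.valid_from:
            return False
        return self.valid_to is None or at < self.valid_to


t0 = datetime(2025, 3, 15, 10, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 3, 22, 10, 0, tzinfo=timezone.utc)
edge = Edge(
    from_id=node_key("acme", "artifact", "sha256:abc"),
    to_id=node_key("acme", "package_version", "pkg:npm/lodash@4.17.21"),
    type="contains",
    valid_from=t0,
    valid_to=t1,
)
```

Time travel then reduces to filtering edges with valid_at(timestamp) before expanding neighbors.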
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
3.2 User capabilities (end‑to‑end)
- Search & Navigate: global search (purls, CVEs, repos, licenses), keyboard nav, breadcrumbs, semantic zoom.
- Lenses: toggle views (Security, License, Provenance, Runtime vs Dev, Policy effect).
- Overlays:
  - Advisory overlay: show affected nodes/edges with source, severity, ranges.
  - VEX overlay: show suppressions/justifications; collapse exempted paths.
  - Policy overlay: choose a policy version; nodes/edges reflect determinations (severity, status) with explain sampling.
- Impact analysis: pick a vulnerable node; highlight upstream/downstream dependents, scope filters, shortest/all paths with constraints.
- Diff view: compare SBOM A vs B; show added/removed nodes/edges, changed versions, changed determinations.
- Saved queries: visual builder + JSON query; shareable permalinks scoped by tenant and environment.
- Exports: GraphML, CSV edge list, NDJSON of findings, PNG/SVG snapshot.
- Evidence details: side panel with raw facts, advisory links, VEX statements, policy explain trace, provenance.
- Accessibility: tab‑navigable, high‑contrast, screen‑reader labels for nodes and sidebars.
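The impact analysis capability listed above can be illustrated as a reverse breadth‑first walk over depends_on edges. This is a sketch under stated assumptions: edges are given as in‑memory (dependent, dependency, scope) triples, only depends_on is traversed, and the function name is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# (dependent, dependency, scope) triples standing in for depends_on edges
DependsOn = Tuple[str, str, str]


def impacted_dependents(edges: Iterable[DependsOn], start: str,
                        scopes: Set[str], max_depth: int = 6) -> Dict[str, int]:
    """Return {node: hop distance} for everything that transitively depends on start."""
    # Build a reverse index: dependency -> nodes that depend on it.
    reverse: Dict[str, List[str]] = {}
    for dependent, dependency, scope in edges:
        if scope in scopes:
            reverse.setdefault(dependency, []).append(dependent)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_depth:
            continue  # depth bound reached; do not expand further
        for parent in reverse.get(node, []):
            if parent not in distance:
                distance[parent] = distance[node] + 1
                queue.append(parent)
    del distance[start]
    return distance


edges = [
    ("app", "lib-a", "prod"),
    ("lib-a", "log4j", "prod"),
    ("tool", "log4j", "dev"),
    ("svc", "app", "prod"),
]
runtime_impact = impacted_dependents(edges, "log4j", {"prod"})
```

Restricting scopes to {"prod"} corresponds to a runtime‑only filter; adding "dev" would also surface the tool node.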
3.3 Query model
- Visual builder for common queries:
  - “Show all paths from Artifact X to Advisory Y up to depth 6.”
  - “All runtime dependencies with license = GPL‑3.0.”
  - “All artifacts affected by GHSA‑… with no applicable VEX.”
  - “Which SBOMs introduced/removed openssl between build 120 and 130?”
- JSON query (internal, POST body) with:
  - start: list of node selectors (type + id or attributes)
  - expand: edge types and depth, direction, scope filters
  - where: predicates on node/edge attributes
  - overlay: policy version id, advisory sources, VEX filters
  - limit: nodes, edges, timebox, cost budget
- Cost control: server estimates cost, denies or pages results; UI streams partial graph tiles.
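As an illustration of the JSON query and cost control described above, the sketch below validates a query and decides whether to allow, page, or deny it. Only the five top‑level keys come from this document; the nested field names, the geometric cost estimate, and the function names are assumptions chosen for the example.

```python
from typing import Any, Dict

ALLOWED_KEYS = {"start", "expand", "where", "overlay", "limit"}

# Nested field names below are illustrative assumptions.
EXAMPLE_QUERY: Dict[str, Any] = {
    "start": [{"type": "artifact", "id": "service-a"}],
    "expand": {"edges": ["contains", "depends_on"], "depth": 3,
               "direction": "out", "scope": ["prod"]},
    "where": {"node.type": "package_version"},
    "overlay": {"policy_version": "1.3.0"},
    "limit": {"nodes": 5000, "edges": 20000},
}


def validate_query(query: Dict[str, Any], max_depth: int = 6) -> None:
    """Reject unknown keys, empty start lists, and out-of-range depth."""
    unknown = set(query) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    if not query.get("start"):
        raise ValueError("start must list at least one node selector")
    depth = query.get("expand", {}).get("depth", 1)
    if not isinstance(depth, int) or not 1 <= depth <= max_depth:
        raise ValueError(f"expand.depth must be an integer in 1..{max_depth}")


def estimate_nodes(query: Dict[str, Any], avg_degree: float) -> int:
    """Crude upper bound: the frontier grows by avg_degree per hop."""
    depth = query.get("expand", {}).get("depth", 1)
    starts = len(query["start"])
    return int(starts * sum(avg_degree ** d for d in range(depth + 1)))


def admit(query: Dict[str, Any], avg_degree: float, hard_cap: int) -> str:
    """Return 'deny', 'page', or 'allow' based on the estimate."""
    validate_query(query)
    estimate = estimate_nodes(query, avg_degree)
    if estimate > hard_cap:
        return "deny"
    node_limit = query.get("limit", {}).get("nodes", hard_cap)
    return "page" if estimate > node_limit else "allow"
```

A real estimator would likely use per‑type degree statistics from the materialized aggregates rather than a single average.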
3.4 UI architecture (Console)
- Canvas: WebGL renderer with level‑of‑detail, edge bundling, and label culling; deterministic layout when possible (seeded).
- Semantic zoom:
  - Far: clusters by artifact/repo/ecosystem, color by lens
  - Mid: package groups, advisory badges, license swatches
  - Near: concrete versions, direct edges, inline badges for policy determinations
- Panels:
  - Left: search, filters, lens selector, saved queries
  - Right: details, explain trace, evidence tabs (Advisory/VEX/Policy/Provenance)
  - Bottom: query expression, diagnostics, performance/stream status
- Diff mode: split or overlay, color legend (add/remove/changed), filter by node type.
- Deep links: URL encodes query + viewport; shareable respecting RBAC.
- Keyboard: space drag, +/- zoom, F to focus, G to expand neighbors, P to show paths.
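One way the deep links above could carry query and viewport state is a canonical JSON payload encoded as URL‑safe base64. This is a sketch only: the state parameter name, the payload shape, and the example host are hypothetical, and RBAC checks would still happen server‑side when the link is opened.

```python
import base64
import json
from typing import Any, Dict
from urllib.parse import parse_qs, urlencode, urlsplit


def encode_state(query: Dict[str, Any], viewport: Dict[str, float]) -> str:
    """Serialize query + viewport deterministically, then base64url-encode."""
    payload = json.dumps({"q": query, "v": viewport},
                         sort_keys=True, separators=(",", ":"))
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return token.rstrip("=")  # padding is restored on decode


def decode_state(token: str) -> Dict[str, Any]:
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def deep_link(base_url: str, query: Dict[str, Any], viewport: Dict[str, float]) -> str:
    return f"{base_url}?{urlencode({'state': encode_state(query, viewport)})}"


def parse_deep_link(url: str) -> Dict[str, Any]:
    token = parse_qs(urlsplit(url).query)["state"][0]
    return decode_state(token)


viewport = {"x": 10.5, "y": -3.0, "zoom": 1.25}
query = {"start": [{"type": "artifact", "id": "service-a"}], "expand": {"depth": 2}}
link = deep_link("https://console.example/graph", query, viewport)
```

Sorting keys before encoding means two logically identical states always produce the same link, which helps with caching and with comparing shared investigations.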
3.5 Back‑end architecture
Graph Indexer (new)
- Consumes SBOM ingests, Conseiller advisories, Excitator VEX statements, Policy Engine determinations (read‑only).
- Projects facts into a property graph persisted in:
  - Primary: document store + adjacency sets (e.g., Mongo collections + compressed adjacency lists)
  - Optional driver for graph DB backends if needed (pluggable)
- Maintains materialized aggregates: degree, critical paths cache, affected artifact counts, license distribution.
- Emits graph snapshots per SBOM with lineage to original ingestion.
Graph API (new)
- Endpoints for search, neighbor expansion, path queries, diffs, overlays, exports.
- Streaming responses for large graphs (chunked NDJSON tiles).
- Cost accounting + quotas per tenant.
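The chunked NDJSON streaming mentioned above can be pictured as one JSON document per line, each carrying a tile. The sketch below assumes a tile shape of nodes, edges, stats, and cursor, assigns each edge to the tile that holds its source node, and uses an integer offset as the cursor; all of these are illustrative choices.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def tiles_to_ndjson(nodes: List[Dict[str, Any]], edges: List[Dict[str, Any]],
                    tile_size: int) -> Iterator[str]:
    """Yield newline-terminated JSON tiles of at most tile_size nodes."""
    if tile_size < 1:
        raise ValueError("tile_size must be >= 1")
    for offset in range(0, len(nodes), tile_size):
        chunk = nodes[offset:offset + tile_size]
        ids = {n["id"] for n in chunk}
        # Each edge travels with the tile that contains its source node.
        tile_edges = [e for e in edges if e["from"] in ids]
        next_offset = offset + tile_size
        tile = {
            "nodes": chunk,
            "edges": tile_edges,
            "stats": {"nodes": len(chunk), "edges": len(tile_edges)},
            "cursor": next_offset if next_offset < len(nodes) else None,
        }
        yield json.dumps(tile, separators=(",", ":")) + "\n"


def read_tiles(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Client side: parse tiles as they arrive, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


nodes = [{"id": f"n{i}"} for i in range(5)]
edges = [{"from": "n0", "to": "n1"}, {"from": "n3", "to": "n4"}]
tiles = list(read_tiles(tiles_to_ndjson(nodes, edges, tile_size=2)))
```

Because every line is a complete document, a client can render each tile as soon as it is read instead of waiting for the full response.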
Workers
- Centrality & clustering precompute on idle: betweenness approximations, connected components, Louvain clusters.
- Diff compute on new SBOM ingestion pairs (previous vs current).
- Overlay materialization cache for popular policy versions.
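The diff worker's core comparison might look like the following sketch. It assumes each snapshot is reduced to a package‑to‑version map plus a set of package‑level edges, which is a simplification of the node/edge model; the Snapshot and Diff types are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

EdgePair = Tuple[str, str]


@dataclass
class Snapshot:
    packages: Dict[str, str]                      # package name -> resolved version
    edges: Set[EdgePair] = field(default_factory=set)


@dataclass
class Diff:
    added: List[str]
    removed: List[str]
    changed: Dict[str, Tuple[str, str]]           # name -> (old version, new version)
    edges_added: List[EdgePair]
    edges_removed: List[EdgePair]


def diff_snapshots(a: Snapshot, b: Snapshot) -> Diff:
    """Compare snapshot a (previous) against snapshot b (current)."""
    names_a, names_b = set(a.packages), set(b.packages)
    changed = {
        name: (a.packages[name], b.packages[name])
        for name in sorted(names_a & names_b)
        if a.packages[name] != b.packages[name]
    }
    return Diff(
        added=sorted(names_b - names_a),
        removed=sorted(names_a - names_b),
        changed=changed,
        edges_added=sorted(b.edges - a.edges),
        edges_removed=sorted(a.edges - b.edges),
    )


previous = Snapshot({"openssl": "3.0.1", "zlib": "1.2.13", "left-pad": "1.0.0"},
                    {("app", "openssl"), ("app", "left-pad")})
current = Snapshot({"openssl": "3.0.2", "zlib": "1.2.13", "xz": "5.4.0"},
                   {("app", "openssl"), ("app", "xz")})
result = diff_snapshots(previous, current)
```

Sorting the outputs keeps the diff stable across runs, which matters when results are streamed or exported.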
Policy Engine integration
- Graph API requests can specify a policy version.
- For sampled nodes, the API fetches explain traces; for counts, uses precomputed overlay materializations where available.
AOC enforcement
- Graph Indexer never merges or edits advisories/VEX; it links them and exposes overlays that the Policy Engine evaluates.
- Conseiller and Excitator remain authoritative sources; severities come from Policy‑governed normalization.
3.6 APIs (representative)
- GET /graph/search?q=...&type=package|artifact|advisory|license
- POST /graph/query ⇒ stream tiles {nodes[], edges[], stats, cursor}
- POST /graph/paths body: {from, to, depth<=6, constraints{scope, runtime_only}}
- POST /graph/diff body: {sbom_a, sbom_b, filters}
- GET /graph/snapshots/{sbom_id} ⇒ graph metadata, counts, top advisories
- POST /graph/export body: {format: graphml|csv|ndjson|png|svg, query|snapshot}
- GET /graph/saved, POST /graph/saved: list and save tenant queries
- GET /graph/overlays/policy/{version_id} ⇒ summary stats for caching
All endpoints are tenant‑scoped and RBAC‑checked. The server enforces timeouts and pagination. Errors return structured diagnostics.
3.7 CLI
stella sbom graph search "purl:pkg:maven/org.apache.logging.log4j/log4j-core"
stella sbom graph query --file ./query.json --export graphml > graph.graphml
stella sbom graph impacted --advisory GHSA-xxxx --runtime-only --limit 100
stella sbom graph paths --from artifact:service-a --to advisory:GHSA-xxxx --depth 5 --policy 1.3.0
stella sbom graph diff --sbom-a 2025-03-15T10:00Z --sbom-b 2025-03-22T10:00Z --export csv > diff.csv
stella sbom graph save --name "openssl-runtime" --file ./query.json
Exit codes: 0 ok, 2 query validation error, 3 over‑budget, 4 not found, 5 RBAC denied.
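For CI consumers, the exit codes above could be mapped to named outcomes as in this sketch. The enum and helper names are hypothetical, and run_graph_command assumes a stella binary is available on PATH.

```python
import subprocess
from enum import IntEnum
from typing import List


class GraphExit(IntEnum):
    OK = 0
    VALIDATION_ERROR = 2
    OVER_BUDGET = 3
    NOT_FOUND = 4
    RBAC_DENIED = 5


def describe_exit(code: int) -> str:
    """Translate a numeric exit code into a stable label for CI logs."""
    try:
        return GraphExit(code).name.lower()
    except ValueError:
        return "unknown"  # any code not listed in the exit-code table


def run_graph_command(args: List[str]) -> str:
    """Run `stella sbom graph ...` and return the outcome label."""
    completed = subprocess.run(["stella", "sbom", "graph", *args])
    return describe_exit(completed.returncode)
```

A pipeline could, for example, treat over_budget as a signal to narrow the query and fail outright on rbac_denied.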
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
3.8 Performance & scale
- Progressive loading: server pages tiles by BFS frontier; client renders incrementally.
- Viewport culling: only visible nodes/edges in canvas; offscreen demoted to aggregates.
- Level‑of‑detail: simplified glyphs and collapsed clusters at distance.
- Query budgets: per‑tenant rate + node/edge caps; interactive paths limited to depth ≤ 6.
- Caching: hot queries memoized per tenant + overlay version; diffs precomputed for consecutive SBOMs.
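Progressive loading by BFS frontier, combined with depth and node caps, can be sketched as a generator that yields one tile per level. The in‑memory adjacency map and parameter names are assumptions; a real service would read neighbors from storage and flag truncation in the tile.

```python
from typing import Dict, Iterator, List, Set


def bfs_tiles(adjacency: Dict[str, List[str]], start: str,
              max_depth: int = 6, max_nodes: int = 1000) -> Iterator[List[str]]:
    """Yield one tile (list of node ids) per BFS level, within the budget."""
    seen: Set[str] = {start}
    frontier = [start]
    depth = 0
    while frontier:
        yield frontier
        if depth == max_depth:
            return  # depth budget exhausted
        next_frontier: List[str] = []
        for node in frontier:
            for neighbor in adjacency.get(node, []):
                # Stop admitting nodes once the node cap is reached.
                if neighbor not in seen and len(seen) < max_nodes:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        depth += 1


adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
levels = list(bfs_tiles(adjacency, "a"))
```

Since each level is yielded before the next is computed, the client can start rendering the nearest neighbors while deeper levels are still being fetched.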
3.9 Security
- Multi‑tenant isolation at storage and API layers.
- RBAC roles:
  - Viewer: browse graphs, saved queries
  - Investigator: run queries, export data
  - Operator: configure budgets, purge caches
  - Auditor: download evidence bundles
- Input validation for query JSON; deny disallowed edge traversals; strict CSP in web app.
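The role list above can be expressed as a permission matrix. This sketch maps each role to exactly the actions listed and assumes no inheritance between roles and that a user may hold several roles; both points are assumptions, as are the permission names.

```python
from enum import Enum
from typing import Dict, FrozenSet, Iterable


class Permission(Enum):
    BROWSE_GRAPHS = "browse_graphs"
    USE_SAVED_QUERIES = "use_saved_queries"
    RUN_QUERIES = "run_queries"
    EXPORT_DATA = "export_data"
    CONFIGURE_BUDGETS = "configure_budgets"
    PURGE_CACHES = "purge_caches"
    DOWNLOAD_EVIDENCE = "download_evidence"


# Assumption: roles are not cumulative; each grants only what is listed for it.
ROLE_PERMISSIONS: Dict[str, FrozenSet[Permission]] = {
    "viewer": frozenset({Permission.BROWSE_GRAPHS, Permission.USE_SAVED_QUERIES}),
    "investigator": frozenset({Permission.RUN_QUERIES, Permission.EXPORT_DATA}),
    "operator": frozenset({Permission.CONFIGURE_BUDGETS, Permission.PURGE_CACHES}),
    "auditor": frozenset({Permission.DOWNLOAD_EVIDENCE}),
}


def is_allowed(roles: Iterable[str], permission: Permission) -> bool:
    """True if any held role grants the permission; unknown roles grant nothing."""
    return any(permission in ROLE_PERMISSIONS.get(role, frozenset()) for role in roles)
```

If roles are meant to be cumulative (for example, Investigator including Viewer), the matrix would need to be extended accordingly.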
3.10 Observability
- Metrics: tile latency, nodes/edges per tile, cache hit rate, query denials, memory pressure.
- Logs: structured, include query hash, cost, truncation flags.
- Traces: server spans per stage (parse, plan, fetch, overlay, stream).
3.11 Accessibility & UX guarantees
- Keyboard complete, ARIA roles for graph and panels, high‑contrast theme.
- Deterministic layout on reload for shareable investigations.
3.12 Data retention
- Graph nodes derived from SBOMs share retention with SBOM artifacts; overlays are ephemeral caches.
- Saved queries retained until deleted; references to missing objects show warnings.
4) Implementation plan
4.1 Services
- Graph Indexer (new microservice)
  - Subscribes to SBOM ingest events, Conseiller advisory updates, Excitator VEX updates, Policy overlay materializations.
  - Builds adjacency lists and node documents; computes aggregates and clusters.
- Graph API (new microservice)
  - Validates and executes queries; streams tiles; composes overlays; serves diffs and exports.
  - Integrates with Policy Engine for explain sample retrieval.
- SBOM Service (existing)
  - Emits ingestion events with SBOM ids and lineage; exposes SBOM metadata to Graph API.
- Web API Gateway
  - Routes /graph/*, injects tenant context, enforces RBAC.
4.2 Console (Web UI) feature module
- packages/features/graph-explorer
  - Canvas renderer (WebGL), panels, query builder, diff mode, overlays, exports.
  - Deep‑link router and viewport state serializer.
4.3 Workers
- Centrality/clustering worker, diff worker, overlay materialization worker.
- Schedules on low‑traffic windows; backpressure aware.
4.4 Data model (storage)
- Collections:
  - graph_nodes: {_id, tenant, type, natural_id, attrs, degree, created_at, updated_at}
  - graph_edges: {_id, tenant, from_id, to_id, type, attrs, valid_from, valid_to}
  - graph_snapshots: per‑SBOM node/edge references
  - graph_saved_queries: {_id, tenant, name, query_json, created_by}
  - graph_overlays_cache: keyed by {tenant, policy_version, hash(query)}
- Indexes: compound on {tenant, type, natural_id}, {tenant, from_id}, {tenant, to_id}, time bounds.
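The graph_overlays_cache key of {tenant, policy_version, hash(query)} depends on hashing the query consistently. A minimal sketch, assuming SHA‑256 over canonical JSON and a colon‑joined key string (neither is specified in this document):

```python
import hashlib
import json
from typing import Any, Dict


def query_hash(query: Dict[str, Any]) -> str:
    """Hash a query independent of key order or whitespace."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def overlay_cache_key(tenant: str, policy_version: str, query: Dict[str, Any]) -> str:
    """Compose the cache key from tenant, policy version, and query hash."""
    return f"{tenant}:{policy_version}:{query_hash(query)}"


q1 = {"start": [{"type": "artifact", "id": "service-a"}], "expand": {"depth": 2}}
q2 = {"expand": {"depth": 2}, "start": [{"id": "service-a", "type": "artifact"}]}
key = overlay_cache_key("acme", "1.3.0", q1)
```

Canonicalizing before hashing means the same logical query from the visual builder and from a hand‑written JSON file lands on the same cache entry.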
5) Documentation changes (create/update)
- /docs/sbom/graph-explorer-overview.md
  - Concepts, node/edge taxonomy, lenses, overlays, roles, limitations.
- /docs/sbom/graph-using-the-console.md
  - Walkthroughs: search, navigate, impact, diff, export; screenshots and keyboard cheatsheet.
- /docs/sbom/graph-query-language.md
  - JSON schema, examples, constraints, cost/budget rules.
- /docs/sbom/graph-api.md
  - REST endpoints, request/response examples, streaming and pagination.
- /docs/sbom/graph-cli.md
  - CLI command reference and example pipelines.
- /docs/policy/graph-overlays.md
  - How policy versions render in Graph; explain sampling; AOC guardrails.
- /docs/vex/graph-integration.md
  - How VEX suppressions appear and how to validate product scoping.
- /docs/advisories/graph-integration.md
  - Advisory linkage and severity normalization by policy.
- /docs/architecture/graph-services.md
  - Graph Indexer, Graph API, storage choices, failure modes.
- /docs/observability/graph-telemetry.md
  - Metrics, logs, tracing, dashboards.
- /docs/runbooks/graph-incidents.md
  - Handling runaway queries, cache poisoning, degraded render.
- /docs/security/graph-rbac.md
  - Permissions matrix, multi‑tenant boundaries.
Every doc should end with a “Compliance checklist.” Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
6) Tasks
6.1 Backend: Graph Indexer
- Define node/edge schemas and attribute dictionaries for each type.
- Implement event consumers for SBOM ingests, Conseiller updates, Excitator updates.
- Build ingestion pipeline that populates nodes/edges and maintains valid_from/valid_to.
- Implement aggregate counters and degree metrics.
- Implement clustering job and persist cluster ids per node.
- Implement snapshot materialization per SBOM and lineage tracking.
- Unit tests for each node/edge builder; property‑based tests for identity stability.
6.2 Backend: Graph API
- Implement /graph/search with prefix and exact match across node types.
- Implement /graph/query with validation, planning, cost estimation, and streaming tile results.
- Implement /graph/paths with constraints and depth limits; shortest path heuristic.
- Implement /graph/diff computing adds/removes/changed versions; stream results.
- Implement overlays: advisory join, VEX join, policy materialization and explain sampling.
- Implement exports: GraphML, CSV edge list, NDJSON findings, PNG/SVG snapshots.
- RBAC middleware integration; multi‑tenant scoping.
- Load tests with synthetic large SBOMs; define default budgets.
6.3 Policy Engine integration
- Add endpoint to fetch explain traces for specific node ids in batch.
- Add materialization export that Graph API can cache per policy version.
6.4 Console (Web UI)
- Create graph-explorer module with routes /graph, /graph/diff, /graph/q/:id.
- Implement WebGL canvas with LOD, culling, edge bundling, deterministic layout seed.
- Build search, filter, lens, and overlay toolbars.
- Side panels: details, evidence tabs, explain viewer.
- Diff mode: split/overlay toggles and color legend.
- Saved queries: create, update, run; deep links.
- Export UI: formats, server round‑trip, progress indicators.
- a11y audit and keyboard‑only flow.
6.5 CLI
- Implement stella sbom graph * subcommands with JSON IO and piping support.
- Document examples and stable output schemas for CI consumption.
6.6 Observability & Ops
- Dashboards for tile latency, query denials, cache hit rate, memory.
- Alerting on query error spikes, OOM risk, cache churn.
- Runbooks in /docs/runbooks/graph-incidents.md.
6.7 Docs
- Author all docs in section 5, link from Console contextual help.
- Add end‑to‑end tutorial: “Investigate GHSA‑XXXX across prod artifacts.”
Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
7) Acceptance criteria
- Console renders large SBOM graphs with semantic zoom, overlays, and responsive interactions.
- Users can run impact and path queries with bounded depth and get results within budget.
- VEX suppressions and advisory severities appear correctly and are consistent with policy.
- Diff view clearly shows added/removed/changed nodes/edges between two SBOMs.
- Saved queries and deep links reproduce the same view deterministically (given same data).
- Exports produce valid GraphML/CSV/NDJSON and image snapshots.
- CLI supports search, query, paths, impacted, diff, and export with stable schemas.
- AOC guardrails: explorer never mutates facts; overlays reflect Policy Engine decisions.
- RBAC enforced; all actions logged and observable.
8) Risks & mitigations
- Graph explosion on large monorepos → tiling, clustering, budgets, and strict depth limits.
- Inconsistent identities across tools → canonicalize purls/digests; property‑based tests for identity stability.
- Policy overlay latency → precompute materializations for hot policy versions; sample explains only on focus.
- User confusion → strong lens defaults, deterministic layouts, legends, in‑context help.
9) Test plan
- Unit: node/edge builders, identity normalization, cost estimator.
- Integration: ingest SBOM + advisories + VEX, verify overlays and counts.
- E2E: Playwright flows for search→impact→diff→export; deep link determinism.
- Performance: simulate 500k nodes/2M edges; measure tile latency and memory.
- Security: RBAC matrix; tenant isolation tests; query validation fuzzing.
- Determinism: snapshot round‑trip: same query and seed produce identical layout and stats.
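The determinism check above relies on layout being a pure function of the node set and the seed. The sketch below shows only seeded initial placement, not a full layout algorithm, and the function name is hypothetical; it illustrates the property a round‑trip test would assert.

```python
import random
from typing import Dict, Iterable, Tuple


def seeded_layout(node_ids: Iterable[str], seed: int) -> Dict[str, Tuple[float, float]]:
    """Assign each node an initial (x, y) in [0, 1) from a seeded generator."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    # Sorting makes the result independent of the order nodes arrive in.
    return {node_id: (rng.random(), rng.random()) for node_id in sorted(node_ids)}


first = seeded_layout(["svc", "lib-a", "log4j"], seed=42)
second = seeded_layout(["log4j", "svc", "lib-a"], seed=42)
```

A determinism test would then compare first and second for equality and repeat the comparison after a save and reload of the query.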
10) Feature flags
- graph.explorer (UI feature module)
- graph.paths (advanced path queries)
- graph.diff (SBOM diff mode)
- graph.overlays.policy (policy overlay + explain sampling)
- graph.export (exports enabled)
Documented in /docs/observability/graph-telemetry.md.
11) Non‑goals (this epic)
- Real‑time process/runtime call graphs.
- Full substitution for text reports; Explorer complements Reports.
- Cross‑tenant graphs; all queries are tenant‑scoped.
12) Philosophy
- See the system: security and license risk are structural. If you cannot see structure, you will miss risk.
- Evidence over assertion: every colored node corresponds to raw facts and explainable determinations.
- Bounded interactivity: fast, partial answers beat slow “complete” ones.
- Immutability: graphs mirror SBOM snapshots and are never rewritten; we add context, not edits.
Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.