Graph architecture
Derived from Epic 5 – SBOM Graph Explorer; this section captures the core model, pipeline, and API expectations. Extend with diagrams as implementation matures.
1) Core model
- Nodes:
  - Artifact (application/image digest) with metadata (tenant, environment, labels).
  - Component (package/version, purl, ecosystem).
  - File/Path (source files, binary paths) with hash/time metadata.
  - License nodes linked to components and SBOM attestations.
  - Advisory and VEXStatement nodes linking to Concelier/Excititor records via digests.
  - PolicyVersion nodes representing signed policy packs.
- Edges: directed, timestamped relationships such as DEPENDS_ON, BUILT_FROM, DECLARED_IN, AFFECTED_BY, VEX_EXEMPTS, GOVERNS_WITH, OBSERVED_RUNTIME. Each edge carries provenance (SRM hash, SBOM digest, policy run ID).
- Overlays: computed index tables providing fast access to reachability, blast radius, and differential views (e.g., graph_overlay/vuln/{tenant}/{advisoryKey}).
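The node, edge, and overlay shapes above can be sketched as plain data types. This is a minimal illustration only: the class names, field names (`srm_hash`, `sbom_digest`, `policy_run_id`, `observed_at`), and the `vuln_overlay_key` helper are assumptions made for the example, not a confirmed schema. Only the edge kinds and the overlay key pattern come from this document.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, Optional


class EdgeKind(str, Enum):
    """Edge kinds listed in the core model."""
    DEPENDS_ON = "DEPENDS_ON"
    BUILT_FROM = "BUILT_FROM"
    DECLARED_IN = "DECLARED_IN"
    AFFECTED_BY = "AFFECTED_BY"
    VEX_EXEMPTS = "VEX_EXEMPTS"
    GOVERNS_WITH = "GOVERNS_WITH"
    OBSERVED_RUNTIME = "OBSERVED_RUNTIME"


@dataclass(frozen=True)
class Provenance:
    """Provenance carried by every edge (field names are hypothetical)."""
    srm_hash: str
    sbom_digest: str
    policy_run_id: Optional[str] = None


@dataclass(frozen=True)
class Node:
    """A graph node; `kind` is e.g. "Artifact", "Component", "Advisory"."""
    id: str
    kind: str
    tenant: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Edge:
    """A directed, timestamped relationship between two node IDs."""
    source: str
    target: str
    kind: EdgeKind
    observed_at: datetime
    provenance: Provenance


def vuln_overlay_key(tenant: str, advisory_key: str) -> str:
    """Build an overlay key following the pattern quoted in the text."""
    return f"graph_overlay/vuln/{tenant}/{advisory_key}"


artifact = Node(id="artifact:example", kind="Artifact", tenant="acme",
                metadata={"environment": "prod"})
component = Node(id="component:example", kind="Component", tenant="acme")
edge = Edge(
    source=artifact.id,
    target=component.id,
    kind=EdgeKind.DEPENDS_ON,
    observed_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    provenance=Provenance(srm_hash="srm:example", sbom_digest="sha256:example"),
)
```

Keeping provenance as a separate frozen value makes it straightforward to attach the same SBOM digest to every edge produced from one snapshot.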
2) Pipelines
- Ingestion: Cartographer/SBOM Service emit SBOM snapshots (sbom_snapshot events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates; policy runs attach overlay metadata.
- ETL: Normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible or document + adjacency lists in Mongo).
- Overlay computation: Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store them as immutable blobs for Offline Kit exports.
- Diffing: graph_diff jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
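The ETL and diffing steps above can be illustrated with a small sketch. It assumes a hypothetical canonical ID scheme (a SHA-256 over tenant, node kind, and a whitespace-trimmed natural key such as a purl) and represents edges as `(source, kind, target)` tuples. Signing of the diff manifest is left out, and none of the function names are taken from the actual implementation.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

# (source_id, edge_kind, target_id); an assumed in-memory representation.
EdgeKey = Tuple[str, str, str]


def canonical_id(tenant: str, kind: str, natural_key: str) -> str:
    """Derive a stable, tenant-scoped ID (hypothetical scheme)."""
    normalised = natural_key.strip()
    material = "\x00".join((tenant, kind, normalised)).encode("utf-8")
    return f"{kind.lower()}:{hashlib.sha256(material).hexdigest()[:16]}"


def dedupe_edges(edges: Iterable[EdgeKey]) -> List[EdgeKey]:
    """Drop duplicate edges and return them in a deterministic order."""
    return sorted(set(edges))


def diff_snapshots(before: Iterable[EdgeKey],
                   after: Iterable[EdgeKey]) -> Dict[str, List[EdgeKey]]:
    """Compare two snapshots and list added and removed edges."""
    old, new = set(before), set(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


web = canonical_id("acme", "Artifact", "sha256:example-image")
lib_v1 = canonical_id("acme", "Component", "pkg:npm/left-pad@1.2.0")
lib_v2 = canonical_id("acme", "Component", "pkg:npm/left-pad@1.3.0")

pre_deploy = dedupe_edges([(web, "DEPENDS_ON", lib_v1), (web, "DEPENDS_ON", lib_v1)])
post_deploy = dedupe_edges([(web, "DEPENDS_ON", lib_v2)])
manifest = diff_snapshots(pre_deploy, post_deploy)
```

Because both the IDs and the ordering are deterministic, two runs over the same snapshots produce identical diff content, which is a precondition for signing the manifest.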
3) APIs
- GET /graph/nodes/{id}: fetch a node with metadata and attached provenance.
- POST /graph/query/saved: execute a saved query (Cypher-like DSL) with tenant filtering; supports paging, citation metadata, and explain traces.
- GET /graph/impact/{advisoryKey}: returns impacted artifacts with path context and policy/VEX overlays.
- GET /graph/diff/{snapshotA}/{snapshotB}: streaming API returning a diff manifest including new/removed edges, risk summary, and export references.
- POST /graph/overlay/policy: create or retrieve an overlay for a policy version + advisory set, referencing effective_finding results.
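As an illustration of what the impact endpoint might compute, the sketch below walks AFFECTED_BY and DEPENDS_ON edges backwards from an advisory and records one shortest path per impacted artifact. The edge direction (dependent to dependency, component to advisory), the tuple representation, and the function name are assumptions. Tenant filtering and policy/VEX overlays are omitted.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

# (source_id, edge_kind, target_id); an assumed in-memory representation.
Edge = Tuple[str, str, str]


def impacted_artifacts(edges: Iterable[Edge],
                       advisory_id: str,
                       artifact_ids: Set[str]) -> Dict[str, List[str]]:
    """Return {artifact_id: path from artifact to advisory} via reverse BFS."""
    # Reverse adjacency: for each target, which sources point at it.
    reverse: Dict[str, List[str]] = {}
    for source, kind, target in edges:
        if kind in ("AFFECTED_BY", "DEPENDS_ON"):
            reverse.setdefault(target, []).append(source)

    paths: Dict[str, List[str]] = {advisory_id: [advisory_id]}
    queue = deque([advisory_id])
    while queue:
        current = queue.popleft()
        # Sorted iteration keeps the chosen path deterministic.
        for source in sorted(reverse.get(current, [])):
            if source not in paths:
                paths[source] = [source] + paths[current]
                queue.append(source)

    return {node: path for node, path in paths.items() if node in artifact_ids}


edges = [
    ("artifact:web", "DEPENDS_ON", "comp:express"),
    ("comp:express", "DEPENDS_ON", "comp:qs"),
    ("comp:qs", "AFFECTED_BY", "adv:EXAMPLE-1"),
    ("artifact:batch", "DEPENDS_ON", "comp:lodash"),
]
impact = impacted_artifacts(edges, "adv:EXAMPLE-1", {"artifact:web", "artifact:batch"})
```

Here only `artifact:web` is reported, with the path through `comp:express` and `comp:qs` serving as the path context.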
4) Storage considerations
- Backed by either:
  - Document + adjacency (Mongo collections graph_nodes, graph_edges, graph_overlays) with deterministic ordering and streaming exports.
  - Or Graph DB (e.g., Neo4j/Cosmos Gremlin) behind an abstraction layer; choice depends on deployment footprint.
- All storage options require tenant partitioning, append-only change logs, and export manifests for Offline Kits.
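One way to picture the abstraction layer is an interface with interchangeable backends. The sketch below uses an in-memory adjacency store as a stand-in for either option; the `GraphStore` interface, its method names, and the change-log record shape are hypothetical and only demonstrate tenant partitioning, deterministic ordering, and an append-only log.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set, Tuple


class GraphStore(ABC):
    """Hypothetical storage abstraction; real backends would implement this."""

    @abstractmethod
    def add_edge(self, tenant: str, source: str, kind: str, target: str) -> None:
        ...

    @abstractmethod
    def neighbours(self, tenant: str, source: str) -> List[Tuple[str, str]]:
        ...


class InMemoryAdjacencyStore(GraphStore):
    """Adjacency lists keyed by (tenant, source), plus an append-only log."""

    def __init__(self) -> None:
        self._adjacency: Dict[Tuple[str, str], Set[Tuple[str, str]]] = {}
        self.change_log: List[dict] = []

    def add_edge(self, tenant: str, source: str, kind: str, target: str) -> None:
        bucket = self._adjacency.setdefault((tenant, source), set())
        if (kind, target) in bucket:
            return  # Duplicate write: nothing changes, nothing is logged.
        bucket.add((kind, target))
        # Entries are only ever appended, never rewritten.
        self.change_log.append({
            "seq": len(self.change_log) + 1,
            "op": "add_edge",
            "tenant": tenant,
            "edge": (source, kind, target),
        })

    def neighbours(self, tenant: str, source: str) -> List[Tuple[str, str]]:
        # Sorting gives a deterministic order regardless of insertion order.
        return sorted(self._adjacency.get((tenant, source), set()))


store = InMemoryAdjacencyStore()
store.add_edge("acme", "artifact:web", "DEPENDS_ON", "comp:b")
store.add_edge("acme", "artifact:web", "DEPENDS_ON", "comp:a")
store.add_edge("globex", "artifact:web", "BUILT_FROM", "file:x")
```

Including the tenant in every key means a query for one tenant cannot return another tenant's edges, even when node IDs collide.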
 
5) Offline & export
- Each snapshot packages nodes.jsonl, edges.jsonl, overlays/ plus manifest with hash, counts, and provenance. Export Center consumes these artefacts for graph-specific bundles.
- Saved queries and overlays include deterministic IDs so Offline Kit consumers can import and replay results.
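A snapshot export along these lines could look like the following sketch. The file names nodes.jsonl, edges.jsonl, and overlays/ come from this section; the manifest file name (`manifest.json`), its field names, and the record keys (`id`, `source`, `kind`, `target`) are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Dict, List


def _write_jsonl(path: Path, records: List[dict], sort_key: Callable) -> Dict[str, object]:
    """Write records as sorted, compact JSON lines; return the manifest entry."""
    lines = [json.dumps(record, sort_keys=True, separators=(",", ":"))
             for record in sorted(records, key=sort_key)]
    payload = ("\n".join(lines) + "\n").encode("utf-8") if lines else b""
    path.write_bytes(payload)
    return {
        "file": path.name,
        "count": len(lines),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def export_snapshot(out_dir: Path, nodes: List[dict], edges: List[dict],
                    provenance: Dict[str, str]) -> dict:
    """Package a snapshot directory and return its manifest."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "overlays").mkdir(exist_ok=True)  # Overlay blobs would go here.
    manifest = {
        "files": [
            _write_jsonl(out_dir / "nodes.jsonl", nodes, lambda n: n["id"]),
            _write_jsonl(out_dir / "edges.jsonl", edges,
                         lambda e: (e["source"], e["kind"], e["target"])),
        ],
        "provenance": provenance,
    }
    # "manifest.json" is an assumed name; the document does not specify one.
    (out_dir / "manifest.json").write_text(
        json.dumps(manifest, sort_keys=True, indent=2), encoding="utf-8")
    return manifest
```

Sorting records and JSON keys before hashing means the same graph content always yields the same file hashes, so an Offline Kit consumer can verify a bundle against its manifest.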
 
6) Observability
- Metrics: ingestion lag (graph_ingest_lag_seconds), node/edge counts, query latency per saved query, overlay generation duration.
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.
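The lag metric and structured log events could be derived roughly as follows. This sketch uses only the standard library and assumes that lag is measured from snapshot emission to indexing; the function names and log field names are hypothetical, and wiring into an actual metrics or tracing backend is not shown.

```python
import json
from datetime import datetime, timezone


def ingest_lag_seconds(snapshot_emitted_at: datetime, indexed_at: datetime) -> float:
    """Value for graph_ingest_lag_seconds, clamped at zero for clock skew."""
    return max(0.0, (indexed_at - snapshot_emitted_at).total_seconds())


def etl_log_event(stage: str, tenant: str, trace_id: str, **fields: object) -> str:
    """Render one structured log line (JSON) for an ETL stage."""
    event = {"stage": stage, "tenant": tenant, "trace_id": trace_id}
    event.update(fields)
    return json.dumps(event, sort_keys=True)


emitted = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
indexed = datetime(2024, 1, 1, 12, 1, 30, tzinfo=timezone.utc)
lag = ingest_lag_seconds(emitted, indexed)
line = etl_log_event("normalise", "acme", "trace-123", nodes=42, lag_seconds=lag)
```

Carrying the same trace ID in log lines and spans lets an operator move from a slow saved query to the ETL run that produced the data it read.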
 
7) Rollout notes
- Phase 1: ingest SBOM + advisories, deliver impact queries.
 - Phase 2: add VEX overlays, policy overlays, diff tooling.
 - Phase 3: expose runtime/Zastava edges and AI-assisted recommendations (future).
 
Refer to the module README and implementation plan for immediate context, and update this document once component boundaries and data flows are finalised.