Graph architecture
Derived from Epic 5 – SBOM Graph Explorer; this section captures the core model, pipeline, and API expectations. Extend with diagrams as implementation matures.
1) Core model
- Nodes:
  - Artifact (application/image digest) with metadata (tenant, environment, labels).
  - Component (package/version, purl, ecosystem).
  - File/Path (source files, binary paths) with hash/time metadata.
  - License nodes linked to components and SBOM attestations.
  - Advisory and VEXStatement nodes linking to Concelier/Excititor records via digests.
  - PolicyVersion nodes representing signed policy packs.
- Edges: directed, timestamped relationships such as DEPENDS_ON, BUILT_FROM, DECLARED_IN, AFFECTED_BY, VEX_EXEMPTS, GOVERNS_WITH, OBSERVED_RUNTIME. Each edge carries provenance (SRM hash, SBOM digest, policy run ID).
- Overlays: computed index tables providing fast access to reachability, blast radius, and differential views (e.g., graph_overlay/vuln/{tenant}/{advisoryKey}).
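The model above can be sketched as a handful of immutable records. This is a minimal illustration, not the module's actual schema: the class names, field names, and the make_edge and vuln_overlay_key helpers are assumptions; only the edge kinds, the provenance fields, and the overlay key pattern come from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Edge kinds listed in the core model; the text says "such as", so this set may be incomplete.
EDGE_KINDS = frozenset({
    "DEPENDS_ON", "BUILT_FROM", "DECLARED_IN", "AFFECTED_BY",
    "VEX_EXEMPTS", "GOVERNS_WITH", "OBSERVED_RUNTIME",
})


@dataclass(frozen=True)
class Provenance:
    """Provenance carried by every edge (field names are hypothetical)."""
    srm_hash: str
    sbom_digest: str
    policy_run_id: Optional[str] = None


@dataclass(frozen=True)
class Edge:
    """Directed, timestamped relationship between two node IDs."""
    kind: str
    source: str
    target: str
    observed_at: datetime
    provenance: Provenance


def make_edge(kind: str, source: str, target: str,
              observed_at: datetime, provenance: Provenance) -> Edge:
    """Build an edge, rejecting kinds outside the known set."""
    if kind not in EDGE_KINDS:
        raise ValueError(f"unknown edge kind: {kind}")
    return Edge(kind, source, target, observed_at, provenance)


def vuln_overlay_key(tenant: str, advisory_key: str) -> str:
    """Overlay key following the graph_overlay/vuln/{tenant}/{advisoryKey} pattern."""
    return f"graph_overlay/vuln/{tenant}/{advisory_key}"


edge = make_edge(
    "DEPENDS_ON", "artifact:app", "component:libA",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    Provenance(srm_hash="srm-hash", sbom_digest="sbom-digest"),
)
```

Frozen dataclasses are used here so that records are hashable and cannot be mutated after ingestion, which fits the determinism and provenance goals; a real implementation may represent nodes and edges differently.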
2) Pipelines
- Ingestion: Cartographer/SBOM Service emit SBOM snapshots (sbom_snapshot events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates; policy runs attach overlay metadata.
- ETL: Normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible or document + adjacency lists in Mongo).
- Overlay computation: Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
- Diffing: graph_diff jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
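As a rough illustration of the ETL step, canonical IDs can be derived by hashing a tenant-scoped, key-sorted JSON form of each record, which makes deduplication a dictionary lookup. The ID scheme ("gn:" prefix, SHA-256) and the record shape are assumptions made for this sketch, not the indexer's actual format.

```python
import hashlib
import json


def canonical_id(tenant: str, kind: str, key: dict) -> str:
    """Derive a deterministic node ID from tenant, node kind, and identifying fields."""
    payload = json.dumps(
        {"tenant": tenant, "kind": kind, "key": key},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "gn:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def normalise(records):
    """Assign canonical IDs, drop duplicates, and return nodes in a stable order."""
    unique = {}
    for record in records:
        node_id = canonical_id(record["tenant"], record["kind"], record["key"])
        # First occurrence wins; later duplicates are ignored.
        unique.setdefault(node_id, {**record, "id": node_id})
    return [unique[node_id] for node_id in sorted(unique)]


nodes = normalise([
    {"tenant": "acme", "kind": "Component", "key": {"purl": "pkg:npm/left-pad@1.3.0"}},
    {"tenant": "acme", "kind": "Component", "key": {"purl": "pkg:npm/left-pad@1.3.0"}},
    {"tenant": "globex", "kind": "Component", "key": {"purl": "pkg:npm/left-pad@1.3.0"}},
])
```

Including the tenant in the hashed payload keeps identical components in different tenants from colliding, which is one simple way to respect tenant partitions.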
3) APIs
- GET /graph/nodes/{id}: fetch node with metadata and attached provenance.
- POST /graph/query/saved: execute saved query (Cypher-like DSL) with tenant filtering; supports paging, citation metadata, and explain traces.
- GET /graph/impact/{advisoryKey}: returns impacted artifacts with path context and policy/VEX overlays.
- GET /graph/diff/{snapshotA}/{snapshotB}: streaming API returning diff manifest including new/removed edges, risk summary, and export references.
- POST /graph/overlay/policy: create or retrieve overlay for policy version + advisory set, referencing effective_finding results.
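To make the diff endpoint more concrete, the new/removed edge portion of a diff manifest could be computed with plain set operations over two snapshots. This is a sketch under assumptions: edges are modelled as (kind, source, target) tuples, the manifest field names are hypothetical, and the risk summary, export references, and signing are omitted.

```python
import hashlib
import json


def diff_snapshots(edges_a, edges_b):
    """Return added/removed edges between two snapshots plus a content digest."""
    before = {tuple(edge) for edge in edges_a}
    after = {tuple(edge) for edge in edges_b}
    manifest = {
        "added_edges": [list(edge) for edge in sorted(after - before)],
        "removed_edges": [list(edge) for edge in sorted(before - after)],
    }
    # Digest over the canonical JSON form; a real manifest would be signed as well.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    manifest["digest"] = "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return manifest


pre_deploy = [
    ("DEPENDS_ON", "artifact:app", "component:libA"),
    ("DEPENDS_ON", "artifact:app", "component:libB"),
]
post_deploy = [
    ("DEPENDS_ON", "artifact:app", "component:libB"),
    ("AFFECTED_BY", "component:libB", "advisory:EXAMPLE-1"),
]
manifest = diff_snapshots(pre_deploy, post_deploy)
```

Sorting both lists before hashing means the same pair of snapshots always yields the same digest, regardless of the order in which edges were read.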
4) Storage considerations
- Backed by either:
  - Document + adjacency (Mongo collections graph_nodes, graph_edges, graph_overlays) with deterministic ordering and streaming exports.
  - Or a graph DB (e.g., Neo4j/Cosmos Gremlin) behind an abstraction layer; the choice depends on deployment footprint.
- All storage options require tenant partitioning, append-only change logs, and export manifests for Offline Kits.
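One way to picture the abstraction layer is a small interface with an in-memory document + adjacency implementation behind it. Everything named here (GraphStore, InMemoryAdjacencyStore, the change-log entry shape) is hypothetical; a Mongo or graph DB backend would implement the same interface differently.

```python
from collections import defaultdict
from typing import List, Protocol, Tuple


class GraphStore(Protocol):
    """Minimal storage interface that either backend could satisfy."""

    def add_edge(self, tenant: str, kind: str, source: str, target: str) -> None: ...

    def neighbours(self, tenant: str, source: str) -> List[Tuple[str, str]]: ...


class InMemoryAdjacencyStore:
    """Adjacency lists partitioned by tenant, with an append-only change log."""

    def __init__(self) -> None:
        # (tenant, source) -> set of (kind, target)
        self._adjacency = defaultdict(set)
        self.change_log = []

    def add_edge(self, tenant: str, kind: str, source: str, target: str) -> None:
        bucket = self._adjacency[(tenant, source)]
        if (kind, target) in bucket:
            return  # idempotent: re-ingesting the same edge is not a change
        bucket.add((kind, target))
        self.change_log.append({
            "seq": len(self.change_log) + 1,
            "op": "add_edge",
            "tenant": tenant,
            "edge": (kind, source, target),
        })

    def neighbours(self, tenant: str, source: str) -> List[Tuple[str, str]]:
        # Sorted output keeps reads deterministic.
        return sorted(self._adjacency.get((tenant, source), ()))


store: GraphStore = InMemoryAdjacencyStore()
store.add_edge("acme", "DEPENDS_ON", "artifact:app", "component:libB")
store.add_edge("acme", "DEPENDS_ON", "artifact:app", "component:libA")
store.add_edge("acme", "DEPENDS_ON", "artifact:app", "component:libA")
```

Keying every lookup by tenant means a query for one tenant cannot see another tenant's edges, even when node IDs coincide.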
5) Offline & export
- Each snapshot packages nodes.jsonl, edges.jsonl, and overlays/ plus a manifest with hash, counts, and provenance. Export Center consumes these artefacts for graph-specific bundles.
- Saved queries and overlays include deterministic IDs so Offline Kit consumers can import and replay results.
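The snapshot packaging described above might look like the following sketch, which serialises records to JSON Lines in a stable order and records a hash and count per file. The manifest layout and function names are assumptions, and the overlays/ directory is left out for brevity.

```python
import hashlib
import json


def to_jsonl(records) -> bytes:
    """Serialise records as JSON Lines with sorted keys and sorted lines."""
    lines = sorted(
        json.dumps(record, sort_keys=True, separators=(",", ":"))
        for record in records
    )
    return "".join(line + "\n" for line in lines).encode("utf-8")


def build_manifest(nodes, edges, provenance: dict) -> dict:
    """Build a manifest with a SHA-256 hash and record count for each export file."""
    files = {}
    for name, records in (("nodes.jsonl", nodes), ("edges.jsonl", edges)):
        records = list(records)
        data = to_jsonl(records)
        files[name] = {
            "sha256": hashlib.sha256(data).hexdigest(),
            "count": len(records),
        }
    return {"files": files, "provenance": provenance}


example_nodes = [{"id": "n2", "kind": "Component"}, {"id": "n1", "kind": "Artifact"}]
example_edges = [{"kind": "DEPENDS_ON", "source": "n1", "target": "n2"}]
manifest = build_manifest(example_nodes, example_edges, {"sbom_digest": "sha256:example"})
```

Because keys and lines are both sorted, two exports of the same snapshot produce byte-identical files and therefore identical hashes, which is what lets Offline Kit consumers verify and replay them.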
6) Observability
- Metrics: ingestion lag (graph_ingest_lag_seconds), node/edge counts, query latency per saved query, overlay generation duration.
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.
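For illustration, the ingestion lag metric and a structured ETL log event could be produced as below. The helper names and the event fields are assumptions; the sketch uses only the standard library and does not show how values would be exported to a metrics backend.

```python
import json
from datetime import datetime, timezone


def ingest_lag_seconds(emitted_at: datetime, indexed_at: datetime) -> float:
    """Seconds between a snapshot event being emitted and being indexed (never negative)."""
    return max(0.0, (indexed_at - emitted_at).total_seconds())


def etl_event(stage: str, tenant: str, trace_id: str, **fields) -> str:
    """Render one structured log line for an ETL stage, keyed by trace ID."""
    event = {"event": "graph.etl", "stage": stage, "tenant": tenant, "trace_id": trace_id}
    event.update(fields)
    return json.dumps(event, sort_keys=True)


lag = ingest_lag_seconds(
    datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 1, 30, tzinfo=timezone.utc),
)
line = etl_event("normalise", "acme", "trace-123", graph_ingest_lag_seconds=lag)
```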
7) Rollout notes
- Phase 1: ingest SBOM + advisories, deliver impact queries.
- Phase 2: add VEX overlays, policy overlays, diff tooling.
- Phase 3: expose runtime/Zastava edges and AI-assisted recommendations (future).
Refer to the module README and implementation plan for immediate context, and update this document once component boundaries and data flows are finalised.