Graph architecture
Derived from Epic 5 – SBOM Graph Explorer; this section captures the core model, pipeline, and API expectations. Extend with diagrams as implementation matures.
1) Core model
- Nodes (see the type sketch after this list):
  - `Artifact` (application/image digest) with metadata (tenant, environment, labels).
  - `Component` (package/version, purl, ecosystem).
  - `File`/`Path` (source files, binary paths) with hash/time metadata.
  - `License` nodes linked to components and SBOM attestations.
  - `Advisory` and `VEXStatement` nodes linking to Concelier/Excititor records via digests.
  - `PolicyVersion` nodes representing signed policy packs.
- Edges: directed, timestamped relationships such as `DEPENDS_ON`, `BUILT_FROM`, `DECLARED_IN`, `AFFECTED_BY`, `VEX_EXEMPTS`, `GOVERNS_WITH`, `OBSERVED_RUNTIME`. Each edge carries provenance (SRM hash, SBOM digest, policy run ID).
- Overlays: computed index tables providing fast access to reachability, blast radius, and differential views (e.g., `graph_overlay/vuln/{tenant}/{advisoryKey}`).
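As a concrete illustration of the model above, the sketch below shows one way node and edge records might be shaped as JSON documents. The field names (`kind`, `key`, `attributes`, `observedAt`, the `provenance` sub-fields) are illustrative assumptions, not the canonical schema.

```ts
// Illustrative shapes only; field names are assumptions, not the canonical schema.
type NodeKind =
  | "Artifact" | "Component" | "File" | "License"
  | "Advisory" | "VexStatement" | "PolicyVersion";

interface GraphNode {
  id: string;                           // deterministic, tenant-scoped identifier
  tenant: string;
  kind: NodeKind;
  key: string;                          // e.g. image digest, purl, or advisory key
  attributes: Record<string, string>;   // labels, environment, ecosystem, ...
  provenance: { sbomDigest?: string; sourceRecordDigest?: string };
}

type EdgeKind =
  | "DEPENDS_ON" | "BUILT_FROM" | "DECLARED_IN" | "AFFECTED_BY"
  | "VEX_EXEMPTS" | "GOVERNS_WITH" | "OBSERVED_RUNTIME";

interface GraphEdge {
  id: string;
  tenant: string;
  kind: EdgeKind;
  from: string;                         // source node id
  to: string;                           // target node id
  observedAt: string;                   // ISO-8601 timestamp
  provenance: { srmHash?: string; sbomDigest?: string; policyRunId?: string };
}
```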
2) Pipelines
- Ingestion: Cartographer/SBOM Service emit SBOM snapshots (`sbom_snapshot` events) captured by the Graph Indexer. Advisories/VEX from Concelier/Excititor generate edge updates; policy runs attach overlay metadata.
- ETL: normalises nodes/edges into canonical IDs, deduplicates, enforces tenant partitions, and writes to the graph store (planned: Neo4j-compatible, or document + adjacency lists in Mongo). See the ID sketch after this list.
- Overlay computation: Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
- Diffing: `graph_diff` jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
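The determinism requirement on canonical IDs is easiest to see with a small example. The sketch below is a hypothetical helper, not the actual indexer code: it derives a node ID by hashing tenant, node kind, and natural key, so re-ingesting the same SBOM snapshot upserts the same records instead of creating duplicates.

```ts
import { createHash } from "node:crypto";

// Hypothetical helper: derive a stable node id from (tenant, kind, natural key)
// so repeated ingestion of the same SBOM snapshot deduplicates by upsert.
function canonicalNodeId(tenant: string, kind: string, key: string): string {
  const material = [tenant, kind, key].join("\u0000"); // unambiguous separator
  return createHash("sha256").update(material).digest("hex");
}

// The same purl always maps to the same node id within a tenant.
const lodashId = canonicalNodeId("tenant-a", "Component", "pkg:npm/lodash@4.17.21");
```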
3) APIs
- `GET /graph/nodes/{id}` — fetch a node with metadata and attached provenance.
- `POST /graph/query/saved` — execute a saved query (Cypher-like DSL) with tenant filtering; supports paging, citation metadata, and `explain` traces.
- `GET /graph/impact/{advisoryKey}` — return impacted artifacts with path context and policy/VEX overlays (see the client sketch below).
- `GET /graph/diff/{snapshotA}/{snapshotB}` — streaming API returning a diff manifest including new/removed edges, risk summary, and export references.
- `POST /graph/overlay/policy` — create or retrieve an overlay for a policy version + advisory set, referencing `effective_finding` results.
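For orientation, here is a minimal client-side sketch of calling the impact endpoint. Only the route shape comes from the list above; the base URL, bearer-token auth, tenant header, and response handling are assumptions.

```ts
// Hypothetical client call; auth scheme, headers, and response fields are assumptions.
async function fetchImpact(baseUrl: string, token: string, tenant: string, advisoryKey: string) {
  const res = await fetch(`${baseUrl}/graph/impact/${encodeURIComponent(advisoryKey)}`, {
    headers: {
      Authorization: `Bearer ${token}`,
      "X-Tenant": tenant,              // assumed tenant-scoping header
      Accept: "application/json",
    },
  });
  if (!res.ok) throw new Error(`impact query failed with HTTP ${res.status}`);
  return res.json();                   // impacted artifacts with path context and overlays
}
```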
4) Storage considerations
- Backed by either:
  - Document + adjacency (Mongo collections `graph_nodes`, `graph_edges`, `graph_overlays`) with deterministic ordering and streaming exports (index sketch below).
  - Or a graph DB (e.g., Neo4j/Cosmos Gremlin) behind an abstraction layer; the choice depends on deployment footprint.
- All storage options require tenant partitioning, append-only change logs, and export manifests for Offline Kits.
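To make the document + adjacency option concrete, the sketch below registers tenant-scoped indexes with the official MongoDB Node.js driver. The index shapes are assumptions chosen to support forward and reverse adjacency lookups; they are not the finalised storage design.

```ts
import { MongoClient } from "mongodb";

// Assumed index shapes for the document + adjacency option; not the final design.
async function ensureGraphIndexes(uri: string): Promise<void> {
  const client = new MongoClient(uri);
  await client.connect();
  const db = client.db("graph");

  // Nodes: unique per tenant + deterministic node id.
  await db.collection("graph_nodes").createIndex({ tenant: 1, id: 1 }, { unique: true });

  // Edges: tenant-scoped forward and reverse adjacency.
  await db.collection("graph_edges").createIndex({ tenant: 1, from: 1, kind: 1 });
  await db.collection("graph_edges").createIndex({ tenant: 1, to: 1, kind: 1 });

  await client.close();
}
```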
5) Offline & export
- Each snapshot packages `nodes.jsonl`, `edges.jsonl`, and `overlays/`, plus a manifest with hashes, counts, and provenance. Export Center consumes these artefacts for graph-specific bundles (see the manifest sketch below).
- Saved queries and overlays include deterministic IDs so Offline Kit consumers can import and replay results.
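A snapshot manifest could be assembled along the lines of the sketch below, hashing each exported file and counting its JSONL records. The manifest field names are assumptions; only the file names and the hash/count/provenance requirement come from the bullet above.

```ts
import { createHash } from "node:crypto";
import { readFile } from "node:fs/promises";

// Hypothetical manifest builder; field names are assumptions.
async function buildManifest(snapshotId: string, files: string[]) {
  const entries = [];
  for (const path of files) {
    const data = await readFile(path);
    entries.push({
      path,
      sha256: createHash("sha256").update(data).digest("hex"),
      records: data.toString("utf8").split("\n").filter(Boolean).length, // JSONL line count
    });
  }
  return { snapshotId, files: entries };
}

// e.g. buildManifest("snap-2025-01-01", ["nodes.jsonl", "edges.jsonl"]);
```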
6) Observability
- Metrics: ingestion lag (`graph_ingest_lag_seconds`), node/edge counts, query latency per saved query, overlay generation duration (see the registration sketch below).
- Logs: structured events for ETL stages and query execution (with trace IDs).
- Traces: ETL pipeline spans, query engine spans.
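As one possible instrumentation approach, the sketch below registers the ingestion-lag gauge with `prom-client`. The metric name follows the list above; the tenant label and the helper function are assumptions.

```ts
import { Gauge } from "prom-client";

// Gauge for graph_ingest_lag_seconds; the "tenant" label is an assumption.
const ingestLag = new Gauge({
  name: "graph_ingest_lag_seconds",
  help: "Seconds between SBOM snapshot emission and graph indexing",
  labelNames: ["tenant"],
});

// Called after processing a snapshot event (emittedAtMs taken from the event envelope).
export function recordIngestLag(tenant: string, emittedAtMs: number): void {
  ingestLag.set({ tenant }, (Date.now() - emittedAtMs) / 1000);
}
```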
7) Rollout notes
- Phase 1: ingest SBOM + advisories, deliver impact queries.
- Phase 2: add VEX overlays, policy overlays, diff tooling.
- Phase 3: expose runtime/Zastava edges and AI-assisted recommendations (future).
Refer to the module README and implementation plan for immediate context, and update this document once component boundaries and data flows are finalised.