Graph architecture

Derived from Epic5 SBOM Graph Explorer; this section captures the core model, pipeline, and API expectations. Extend with diagrams as implementation matures.

1) Core model

  • Nodes:
    • Artifact (application/image digest) with metadata (tenant, environment, labels).
    • Component (package/version, purl, ecosystem).
    • File/Path (source files, binary paths) with hash/time metadata.
    • License nodes linked to components and SBOM attestations.
    • Advisory and VEXStatement nodes linking to Concelier/Excititor records via digests.
    • PolicyVersion nodes representing signed policy packs.
  • Edges: directed, timestamped relationships such as DEPENDS_ON, BUILT_FROM, DECLARED_IN, AFFECTED_BY, VEX_EXEMPTS, GOVERNS_WITH, OBSERVED_RUNTIME. Each edge carries provenance (SRM hash, SBOM digest, policy run ID).
  • Overlays: computed index tables providing fast access to reachability, blast radius, and differential views (e.g., graph_overlay/vuln/{tenant}/{advisoryKey}). Runtime endpoints emit overlays inline (policy.overlay.v1, openvex.v1) with deterministic overlay IDs (sha256(tenant|nodeId|overlayKind)) and sampled explain traces on policy overlays.
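
The deterministic overlay ID formula above, sha256(tenant|nodeId|overlayKind), can be sketched as follows. This is a minimal illustration, not the service's implementation: it assumes the three fields are joined with a literal "|" separator, encoded as UTF-8, and rendered as lowercase hex, none of which the document specifies. The function name and sample values are hypothetical.

```python
import hashlib


def overlay_id(tenant: str, node_id: str, overlay_kind: str) -> str:
    """Return a deterministic overlay ID for (tenant, node, overlay kind)."""
    # Assumption: fields are joined with a literal "|" and encoded as UTF-8.
    # The document only gives the formula sha256(tenant|nodeId|overlayKind).
    material = "|".join((tenant, node_id, overlay_kind)).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


# Hypothetical sample values, chosen only to show determinism.
first = overlay_id("tenant-a", "node-1", "policy.overlay.v1")
second = overlay_id("tenant-a", "node-1", "policy.overlay.v1")
other = overlay_id("tenant-a", "node-1", "openvex.v1")
```

The same inputs always yield the same ID, so an overlay emitted inline at runtime and one stored in an export can be matched without coordination.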

2) Pipelines

  1. Ingestion: Cartographer/SBOM Service emit SBOM snapshots (sbom_snapshot events), which the Graph Indexer captures. Advisories/VEX from Concelier/Excititor generate edge updates, and policy runs attach overlay metadata.
  2. ETL: Normalises nodes/edges into canonical IDs, deduplicates them, enforces tenant partitions, and writes to the graph store (planned: a Neo4j-compatible store, or documents + adjacency lists in Mongo).
  3. Overlay computation: Batch workers build materialised views for frequently used queries (impact lists, saved queries, policy overlays) and store as immutable blobs for Offline Kit exports.
  4. Diffing: graph_diff jobs compare two snapshots (e.g., pre/post deploy) and generate signed diff manifests for UI/CLI consumption.
  5. Analytics (Runtime & Signals 140.A): background workers run Louvain-style clustering and degree/betweenness approximations on ingested graphs, emit overlays per tenant/snapshot, and write cluster IDs back to nodes when enabled.
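
The diffing step can be illustrated with a small sketch that classifies node IDs as added, removed, or changed between two snapshots. It assumes each snapshot is already loaded as a mapping from canonical ID to attributes; the function name, the result shape, and the sample purls are hypothetical, and signing of the resulting manifest is not shown.

```python
from typing import Dict, List, TypedDict


class SnapshotDiff(TypedDict):
    added: List[str]
    removed: List[str]
    changed: List[str]


def diff_snapshots(before: Dict[str, dict], after: Dict[str, dict]) -> SnapshotDiff:
    """Compare two {id: attributes} maps and classify each ID."""
    before_ids, after_ids = set(before), set(after)
    return {
        # Sorting keeps the output deterministic across runs.
        "added": sorted(after_ids - before_ids),
        "removed": sorted(before_ids - after_ids),
        "changed": sorted(
            i for i in before_ids & after_ids if before[i] != after[i]
        ),
    }


# Hypothetical pre/post-deploy component sets.
pre_deploy = {
    "pkg:npm/a@1.0.0": {"license": "MIT"},
    "pkg:npm/b@2.0.0": {"license": "MIT"},
    "pkg:npm/c@3.0.0": {"license": "ISC"},
}
post_deploy = {
    "pkg:npm/a@1.0.0": {"license": "MIT"},
    "pkg:npm/b@2.0.0": {"license": "Apache-2.0"},
    "pkg:npm/d@4.0.0": {"license": "MIT"},
}
diff = diff_snapshots(pre_deploy, post_deploy)
```

The same classification applies to edges if each edge is keyed by a canonical ID.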

3) APIs

  • POST /graph/search — NDJSON node tiles with cursor paging, tenant + scope guards.
  • POST /graph/query — NDJSON nodes/edges/stats/cursor with budgets (tiles/nodes/edges) and optional inline overlays (includeOverlays=true) emitting policy.overlay.v1 and openvex.v1 payloads; overlay IDs are sha256(tenant|nodeId|overlayKind); policy overlay may include a sampled explainTrace.
  • POST /graph/paths — bounded BFS (depth ≤6) returning path nodes/edges/stats; honours budgets and overlays.
  • POST /graph/diff — compares snapshotA vs snapshotB, streaming node/edge added/removed/changed tiles plus stats; budget enforcement mirrors /graph/query.
  • POST /graph/export — async job producing deterministic manifests (sha256, size, format) for ndjson/csv/graphml/png/svg; download via /graph/export/{jobId}.
  • Legacy: GET /graph/nodes/{id}, POST /graph/query/saved, GET /graph/impact/{advisoryKey}, POST /graph/overlay/policy remain in spec but should align to the NDJSON surfaces above as they are brought forward.
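
As an illustration of the bounded BFS behind /graph/paths, the sketch below searches an in-memory adjacency map and stops at the depth limit of 6 stated above. The adjacency representation, the function name, and the node_budget parameter (standing in for the documented budgets) are assumptions made for the example, not the service's actual interface.

```python
from collections import deque
from typing import Dict, List, Optional

MAX_DEPTH = 6  # the document bounds /graph/paths at depth <= 6


def bounded_bfs_path(
    adjacency: Dict[str, List[str]],
    source: str,
    target: str,
    max_depth: int = MAX_DEPTH,
    node_budget: int = 1000,  # hypothetical stand-in for the node budget
) -> Optional[List[str]]:
    """Return a shortest path with at most max_depth edges, or None."""
    if source == target:
        return [source]
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # never expand nodes that sit at the depth bound
        for neighbour in adjacency.get(node, []):
            if neighbour in parents:
                continue  # already visited
            if len(parents) >= node_budget:
                return None  # budget exhausted before reaching the target
            parents[neighbour] = node
            if neighbour == target:
                # Walk parent links back to the source, then reverse.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append((neighbour, depth + 1))
    return None


# Hypothetical chain n0 -> n1 -> ... -> n7.
chain = {f"n{i}": [f"n{i + 1}"] for i in range(7)}
within_bound = bounded_bfs_path(chain, "n0", "n6")  # 6 edges
beyond_bound = bounded_bfs_path(chain, "n0", "n7")  # 7 edges
```

Here n6 is reachable in exactly six edges and is returned, while n7 needs seven and is rejected by the depth bound.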

4) Storage considerations

  • Backed by either:
    • Document + adjacency (Mongo collections graph_nodes, graph_edges, graph_overlays) with deterministic ordering and streaming exports.
    • Or Graph DB (e.g., Neo4j/Cosmos Gremlin) behind an abstraction layer; choice depends on deployment footprint.
  • All storage backends require tenant partitioning, append-only change logs, and export manifests for Offline Kits.

5) Offline & export

  • Each snapshot packages nodes.jsonl, edges.jsonl, and overlays/, plus a manifest with hash, counts, and provenance. Export Center consumes these artefacts for graph-specific bundles.
  • Saved queries and overlays include deterministic IDs so Offline Kit consumers can import and replay results.
  • Runtime hosts register the SBOM ingest pipeline via services.AddSbomIngestPipeline(...). Snapshot exports default to ./artifacts/graph-snapshots but can be redirected with STELLAOPS_GRAPH_SNAPSHOT_DIR or the SbomIngestOptions.SnapshotRootDirectory callback.
  • Analytics overlays are exported as NDJSON (overlays/clusters.ndjson, overlays/centrality.ndjson) ordered by node ID; overlays/manifest.json mirrors the snapshot ID and counts for offline parity.
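
To show how a snapshot manifest with hashes and counts could stay deterministic, the sketch below serialises records as JSON Lines ordered by ID and records a SHA-256 digest, byte size, and record count per file. The manifest field names ("file", "sha256", "size", "count"), the "id" record key, and the function names are assumptions for illustration. Provenance fields are omitted because the document does not describe their shape.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def to_jsonl(records: Iterable[dict]) -> bytes:
    """Serialise records as JSON Lines, ordered by "id" with sorted keys."""
    ordered = sorted(records, key=lambda r: r["id"])
    lines = [json.dumps(r, sort_keys=True, separators=(",", ":")) for r in ordered]
    return ("\n".join(lines) + "\n").encode("utf-8") if lines else b""


def build_manifest(files: Dict[str, List[dict]]) -> dict:
    """Build a manifest entry (hash, size, count) for each named file."""
    entries = []
    for name in sorted(files):  # stable file order
        payload = to_jsonl(files[name])
        entries.append({
            "file": name,
            "sha256": hashlib.sha256(payload).hexdigest(),
            "size": len(payload),
            "count": len(files[name]),
        })
    return {"files": entries}


# Hypothetical records; input order differs between the two calls.
nodes = [{"id": "b", "kind": "Component"}, {"id": "a", "kind": "Artifact"}]
manifest_one = build_manifest({"nodes.jsonl": nodes, "edges.jsonl": []})
manifest_two = build_manifest({"edges.jsonl": [], "nodes.jsonl": list(reversed(nodes))})
```

Because both the records and the file names are sorted before hashing, the two manifests are identical even though the inputs arrive in a different order.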

6) Observability

  • Metrics: ingestion lag (graph_ingest_lag_seconds), node/edge counts, query latency per saved query, overlay generation duration.
  • New analytics metrics: graph_analytics_runs_total, graph_analytics_failures_total, graph_analytics_clusters_total, graph_analytics_centrality_total, plus change-stream/backfill counters (graph_changes_total, graph_backfill_total, graph_change_failures_total, graph_change_lag_seconds).
  • Logs: structured events for ETL stages and query execution (with trace IDs).
  • Traces: ETL pipeline spans, query engine spans.

7) Rollout notes

  • Phase 1: ingest SBOM + advisories, deliver impact queries.
  • Phase 2: add VEX overlays, policy overlays, diff tooling.
  • Phase 3: expose runtime/Zastava edges and AI-assisted recommendations (future).

Local testing note

Set STELLAOPS_TEST_MONGO_URI to a reachable MongoDB instance before running tests/Graph/StellaOps.Graph.Indexer.Tests. The test harness falls back to mongodb://127.0.0.1:27017 and then to Mongo2Go. The CI workflow, however, requires the environment variable to be set so that upsert coverage runs against a managed database. Use STELLAOPS_GRAPH_SNAPSHOT_DIR (or the AddSbomIngestPipeline options callback) to control where graph snapshot artefacts land during local runs.
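
The connection-string rule described above can be sketched as a small resolver: use STELLAOPS_TEST_MONGO_URI when it is set, fail fast in CI when it is not, and otherwise fall back to the local default. This is only an illustration of the rule, not the harness's code. The function name and the ci flag are hypothetical, and the final Mongo2Go fallback is not modelled.

```python
from typing import Mapping

LOCAL_FALLBACK_URI = "mongodb://127.0.0.1:27017"  # fallback named in the document


def resolve_mongo_uri(env: Mapping[str, str], ci: bool = False) -> str:
    """Pick the MongoDB URI for a test run from an environment mapping."""
    uri = env.get("STELLAOPS_TEST_MONGO_URI", "").strip()
    if uri:
        return uri
    if ci:
        # Assumption: CI treats a missing variable as a hard error.
        raise RuntimeError("STELLAOPS_TEST_MONGO_URI must be set in CI")
    return LOCAL_FALLBACK_URI


# Hypothetical environments for a managed database and a bare local run.
managed = resolve_mongo_uri({"STELLAOPS_TEST_MONGO_URI": "mongodb://db.example:27017"})
local = resolve_mongo_uri({})
```

Passing the environment as a mapping (for example os.environ) keeps the resolver easy to exercise without mutating process state.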

Refer to the module README and implementation plan for immediate context, and update this document once component boundaries and data flows are finalised.