StellaOps Graph
Graph Indexer + Graph API build the tenant-scoped knowledge graph that powers blast-radius analysis, provenance timelines, and saved-query automation across StellaOps. Cartographer has been retired as of 2025-10-30 (see docs/updates/2025-10-30-devops-governance.md); this module now owns ingestion, storage, overlays, and query surfaces for graph data.
Scope & responsibilities
- Ingest SBOM snapshots, advisory/VEX events, policy overlays, and runtime signals to maintain a first-party graph representation with deterministic node/edge identities.
- Serve APIs and saved-query tooling for impact analysis, dependency traversal, diffing, and policy/VEX overlays with explainable provenance.
- Supply Graph Explorer UI/CLI experiences, plus Offline Kit exports (nodes.jsonl, edges.jsonl, overlays/) with DSSE manifests for air-gapped replay. Analytics overlays are emitted as NDJSON (overlays/clusters.ndjson, overlays/centrality.ndjson) with deterministic ordering; PostgreSQL-backed providers support production wiring.
- Maintain the Graph Index Canonical Schema and coordinate query/overlay lifecycle with Scheduler, Policy Engine, Vulnerability Explorer, and Export Center.
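One way to obtain the deterministic node/edge identities mentioned above is to hash a canonical, tenant-scoped tuple. The sketch below assumes SHA-256 over sorted-key JSON. The function names, the `gn`/`ge` prefixes, and the key fields are hypothetical and are not taken from the Graph Index Canonical Schema.

```python
import hashlib
import json


def canonical_id(prefix: str, tenant: str, kind: str, key: dict) -> str:
    """Derive a stable identifier from tenant, kind, and an already-normalised key.

    Sorted keys and compact separators make the serialised bytes independent
    of dict insertion order, so equal inputs always hash to the same value.
    """
    payload = json.dumps(
        {"tenant": tenant, "kind": kind, "key": key},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{prefix}:{tenant}:{kind}:{digest}"


def node_id(tenant: str, kind: str, key: dict) -> str:
    # "gn" is an illustrative prefix for graph nodes.
    return canonical_id("gn", tenant, kind, key)


def edge_id(tenant: str, kind: str, source_id: str, target_id: str) -> str:
    # "ge" is an illustrative prefix for graph edges; direction is part of the key.
    return canonical_id("ge", tenant, kind, {"source": source_id, "target": target_id})
```

Because the tenant is part of the hashed payload, the same component in two tenants yields two different IDs, which fits the tenant-scoped design described here.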
Architecture snapshot (Sprint 30 groundwork)
- Graph Indexer service: consumes SBOM (sbom_snapshot), advisory, and VEX events; normalises identifiers; persists into graph_nodes, graph_edges, graph_snapshots, and overlay caches with tenant partitions.
- Graph API service: exposes GET /graph/nodes, /graph/impact/{advisory}, /graph/query/saved, /graph/diff, and overlay endpoints with RBAC scopes defined in Authority (docs/updates/2025-10-26-authority-graph-scopes.md).
- Overlay & diff workers: materialise impact lists, saved-query caches, and signed diff manifests; feed Scheduler GraphBuildJob/GraphOverlayJob contracts (docs/updates/2025-10-26-scheduler-graph-jobs.md).
- Console & CLI integrations: planned modules deliver WebGL explorer, timeline viz, and CLI stella sbom graph ... commands aligned with implementation plan phases.
- Storage abstraction: supports document + adjacency (PostgreSQL) or pluggable graph engine; both paths enforce deterministic ordering and export manifests.
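To illustrate what a /graph/diff-style comparison involves, the following rough sketch compares two in-memory snapshots by node and edge identity and returns sorted lists so the output is stable across runs. The `Snapshot` shape and the result keys are hypothetical stand-ins, not the service's storage model or response format.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    # node_id -> attribute dict (hypothetical shape)
    nodes: dict = field(default_factory=dict)
    # set of (source_id, kind, target_id) tuples (hypothetical shape)
    edges: set = field(default_factory=set)


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict:
    """Return added/removed/changed nodes and added/removed edges, all sorted."""
    old_ids, new_ids = set(old.nodes), set(new.nodes)
    changed = [n for n in old_ids & new_ids if old.nodes[n] != new.nodes[n]]
    return {
        "nodes_added": sorted(new_ids - old_ids),
        "nodes_removed": sorted(old_ids - new_ids),
        "nodes_changed": sorted(changed),
        "edges_added": sorted(new.edges - old.edges),
        "edges_removed": sorted(old.edges - new.edges),
    }
```

Sorting every list before returning keeps the diff reproducible, which matters if the result is later hashed or signed.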
Current workstreams (Q4 2025)
- GRAPH-SVC-30-00x (see src/Graph/StellaOps.Graph.Indexer/TASKS.md): stand up Graph Indexer pipeline, identity registry, snapshot exports.
- Active sprint: docs/implplan/SPRINT_0141_0001_0001_graph_indexer.md (Runtime & Signals 140.A), covering clustering/centrality jobs, incremental/backfill pipeline, determinism tests, packaging.
- GRAPH-API-30-00x: draft API planner/cost guard, streaming responses, and Authority scope integration.
- DOCS-GRAPH-24-003 & related backlog: author overview/API/query language docs; update this README again once those deliverables land.
- Deployment/DevOps follow-ups (DEVOPS-VEX-30-001, DEPLOY-VEX-30-001) coordinate dashboards, load tests, and Helm/Compose overlays for the graph stack.
Integrations & dependencies
- SBOM Service (Scanner WebService + Worker) produces sbom_snapshot events consumed by Graph Indexer.
- Concelier/Excititor contribute advisory + VEX edges; VEX Lens consensus overlays attach to graph nodes as attributes.
- Policy Engine & Scheduler trigger recompute jobs and consume overlays for risk/impact automation.
- Vulnerability Explorer & Console surface graph queries, saved views, and diff visualisations.
- Authority defines scopes (graph.viewer, graph.operator) and client registrations; secrets managed via existing platform patterns.
Data, observability & offline
- Collections/tables: graph_nodes, graph_edges, graph_snapshots, graph_saved_queries, graph_overlays_cache, append-only change logs for replay.
- Metrics: graph_ingest_lag_seconds, graph_nodes_total, graph_query_latency_seconds{queryId}, overlay/diff duration counters.
- Logs/traces: structured ETL logs, query planner traces, WebGL interaction telemetry (once UI lands).
- Offline bundles: deterministic nodes.jsonl, edges.jsonl, overlay manifests + DSSE signatures, consumable by Export Center and CLI mirroring.
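A deterministic nodes.jsonl-style export with a content digest could look roughly like the sketch below. It assumes every record carries an `id` field to sort on. The manifest is a plain dictionary of SHA-256 digests, not a DSSE envelope, and signing is left out entirely. Function names are illustrative.

```python
import hashlib
import json


def to_ndjson(records: list) -> bytes:
    """Serialise records as NDJSON, ordered by id, with sorted keys per line."""
    lines = [
        json.dumps(r, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for r in sorted(records, key=lambda r: r["id"])
    ]
    # One record per line, newline-terminated; an empty export is zero bytes.
    return ("\n".join(lines) + "\n").encode("utf-8") if lines else b""


def build_manifest(files: dict) -> dict:
    """Map each file name to its SHA-256 digest and size, in name order."""
    return {
        name: {"sha256": hashlib.sha256(data).hexdigest(), "bytes": len(data)}
        for name, data in sorted(files.items())
    }
```

With this shape, two exports of the same logical snapshot produce byte-identical files and therefore identical digests, which is the property an air-gapped replay would rely on.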
Operations & runbook (Sprint 030)
- Dashboards: import Observability/graph-api-grafana.json (panels for latency, budget denials, overlay cache ratio, export latency). Apply tenant filter in every panel.
- Health checks: /healthz should be 200; search/query/paths/diff/export endpoints require tenant context, Authorization, and graph scopes (graph:read/query/export).
- Tenant context resolution:
  - Canonical header: X-StellaOps-Tenant.
  - Compatibility headers: X-Stella-Tenant, X-Tenant-Id (migration-only).
  - Conflicting tenant values across headers/claims are rejected deterministically with 400 GRAPH_VALIDATION_FAILED.
- Scope enforcement:
  - Graph endpoints authorize against claim-based policies (Graph.ReadOrQuery, Graph.Query, Graph.Export).
  - Header scope compatibility (X-StellaOps-Scopes, X-Stella-Scopes) is bridged once at authentication and then evaluated only through policies.
- Key metrics (new):
  - graph_tile_latency_seconds histogram (label route); alert when p95 > 1.5s for 5m.
  - graph_query_budget_denied_total counter (label reason); investigate spikes (>50 in 5m).
  - graph_overlay_cache_hits_total / graph_overlay_cache_misses_total; watch miss ratio > 0.4 for 10m.
  - graph_export_latency_seconds histogram (label format); alert when p95 > 2s for ndjson/graphml.
- Triage playbook:
  - Budget denials: lower default edges/nodes budget or guide callers to request smaller scopes; verify overlay includes are truly required.
  - Overlay cache misses: ensure cache TTL is ≥5m; check overlay service connectivity to Policy Engine; warm cache by replaying recent hot nodes.
  - Export slowness: reduce export Limit, offload PNG/SVG to worker, and confirm disk I/O headroom.
  - If alerts fire, capture tenant, route, cursor/budget values, and recent deploy SHA in incident note.
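The tenant context rules in this runbook (one canonical header, two migration-only compatibility headers, deterministic rejection of conflicts) can be approximated as follows. This is an illustrative sketch, not the service's code: the function name, the `TenantResolutionError` type, the `claim_tenant` parameter, and the exact-match comparison of tenant values are assumptions. The README does not say what happens when no tenant is supplied, so the sketch simply returns `None` and leaves that decision to the caller.

```python
from typing import Mapping, Optional

CANONICAL_HEADER = "X-StellaOps-Tenant"
COMPAT_HEADERS = ("X-Stella-Tenant", "X-Tenant-Id")  # migration-only


class TenantResolutionError(Exception):
    """Raised on conflicting tenant values; maps to 400 GRAPH_VALIDATION_FAILED."""

    status = 400
    code = "GRAPH_VALIDATION_FAILED"


def resolve_tenant(
    headers: Mapping[str, str], claim_tenant: Optional[str] = None
) -> Optional[str]:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v.strip() for k, v in headers.items()}
    candidates = []
    for name in (CANONICAL_HEADER, *COMPAT_HEADERS):
        value = lowered.get(name.lower())
        if value:
            candidates.append(value)
    if claim_tenant and claim_tenant.strip():
        candidates.append(claim_tenant.strip())
    # Sorting makes the error message identical regardless of header order.
    distinct = sorted(set(candidates))
    if len(distinct) > 1:
        raise TenantResolutionError(f"conflicting tenant values: {distinct}")
    return distinct[0] if distinct else None
```

Collecting every candidate before deciding, rather than returning on the first match, is what lets a mismatch between a compatibility header and a token claim be detected at all.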
Key docs & updates
- architecture.md: inputs, pipelines, APIs, storage choices, observability, offline handling.
- implementation_plan.md: phased delivery roadmap, work breakdown, risks, test strategy.
- schema.md: canonical node/edge schema and attribute dictionary (keep in sync with indexer code).
- API surface: docs/api/graph-gateway-spec-draft.yaml (NDJSON tiles for /graph/search|query|paths|diff|export, budgets, overlays).
- Updates: docs/updates/2025-10-26-scheduler-graph-jobs.md, docs/updates/2025-10-26-authority-graph-scopes.md, docs/updates/2025-10-30-devops-governance.md for the latest decisions/dependencies.
- Index: see architecture-index.md for data model, ingestion pipeline, overlays/caches, events, and API/observability pointers.
Epic alignment
- Epic 5 – SBOM Graph Explorer: Graph Indexer, Graph API, saved queries, overlays, Console/CLI experiences, Offline Kit parity.
- Cross-epic ties: Policy reasoning (explain overlays), Scheduler recompute, Notify/Task Runner integration for graph incidents.
Implementation Status
Delivery Phases
- Phase 1 – Graph Indexer foundations: Stand up Graph Indexer service, node/edge schemas, ingestion from SBOM/Concelier/Excititor events, identity stability, snapshot materialisation
- Phase 2 – Graph API service: Expose search, query, path, impact, diff, and overlay endpoints with RBAC, cost controls, streaming responses
- Phase 3 – Console & CLI experiences: Ship Graph Explorer UI (WebGL canvas, filters, diff mode, overlays) and CLI for automation pipelines
- Phase 4 – Advanced analytics: Implement clustering, centrality, saved queries, overlay caching, Policy Engine explain integration
- Phase 5 – Exports & offline: Deliver GraphML/CSV/NDJSON exports, Offline Kit bundles with deterministic manifests
- Phase 6 – Observability & hardening: Complete dashboards, alerts, runbooks, load/perf testing, a11y review
Acceptance Criteria
- Graph Indexer ingests SBOM/advisory/VEX events deterministically with tenant isolation and append-only provenance
- Graph API serves endpoints within budgeted latency and enforces cost limits + RBAC
- Console explorer visualises topology, overlays, diffs; CLI commands mirror functionality for automation
- Exports and Offline Kit bundles reproduce snapshots and overlays with signed manifests
- Observability dashboards/alerts detect ingest lag, query failures, cache churn, memory pressure; runbooks guide remediation
- Policy/VEX overlays align with Policy Engine explain traces and VEX suppressions
Key Risks & Mitigations
- Graph scale/complexity: Adopt adjacency compression, cached overlays, streaming pagination, enforced query budgets
- Tenant bleed: Strict tenant filters, fuzz tests, data masking, compliance reviews
- Runaway queries/visualization: Cost planner, query timeout, UI hints, safe mode renders
- Cache poisoning: Input validation, schema versioning, eviction policies
- Offline parity gaps: Deterministic export pipeline, integration tests for Offline Kit import
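As an illustration of the "enforced query budgets" mitigation, a traversal can count the nodes and edges it touches and stop as soon as either budget is exhausted. The sketch below is a generic breadth-first walk over an in-memory adjacency map. The `BudgetExceeded` exception, the default limits, and the adjacency shape are hypothetical; the actual cost planner is not described in this README.

```python
from collections import deque


class BudgetExceeded(Exception):
    """Signals that a traversal hit its node or edge budget."""

    def __init__(self, reason: str):
        super().__init__(f"query budget exceeded: {reason}")
        self.reason = reason  # "nodes" or "edges"


def impacted_nodes(adjacency: dict, start: str,
                   max_nodes: int = 1000, max_edges: int = 5000) -> list:
    """Breadth-first reachability from start, bounded by node and edge budgets."""
    visited = {start}
    queue = deque([start])
    edges_seen = 0
    while queue:
        current = queue.popleft()
        # Sorted neighbours keep visit order, and thus the failure point, stable.
        for neighbour in sorted(adjacency.get(current, ())):
            edges_seen += 1
            if edges_seen > max_edges:
                raise BudgetExceeded("edges")
            if neighbour not in visited:
                if len(visited) >= max_nodes:
                    raise BudgetExceeded("nodes")
                visited.add(neighbour)
                queue.append(neighbour)
    return sorted(visited)
```

A caller could catch `BudgetExceeded` and record its `reason`, in the spirit of a counter such as graph_query_budget_denied_total with a reason label, though that wiring is an assumption here.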
Current Active Sprint
- Runtime & Signals 140.A: Clustering/centrality jobs, incremental/backfill pipeline, determinism tests, packaging