# Graph Analytics and Dependency Insights

**Version:** 1.0
**Date:** 2025-11-29
**Status:** Canonical

This advisory defines the product rationale, graph model, and implementation strategy for the Graph module, covering dependency analysis, impact visualization, and offline exports.

---

## 1. Executive Summary

The Graph module provides **dependency analysis and impact visualization** across the vulnerability landscape. Key capabilities:

- **Unified Graph Model** - Artifacts, components, advisories, policies linked
- **Impact Analysis** - Blast radius, affected paths, transitive dependencies
- **Policy Overlays** - VEX and policy decisions visualized on graph
- **Analytics** - Clustering, centrality, community detection
- **Offline Export** - Deterministic graph snapshots for air-gap

---

## 2. Market Drivers

### 2.1 Target Segments

| Segment | Graph Requirements | Use Case |
|---------|-------------------|----------|
| **Security Teams** | Impact analysis | Vulnerability prioritization |
| **Developers** | Dependency visualization | Upgrade planning |
| **Compliance** | Audit trails | Relationship documentation |
| **Management** | Risk dashboards | Portfolio risk view |

### 2.2 Competitive Positioning

Most vulnerability tools show flat lists. Stella Ops differentiates with:

- **Graph-native architecture** linking all entities
- **Impact visualization** showing blast radius
- **Policy overlays** embedding decisions in graph
- **Offline-compatible** exports for air-gap analysis
- **Analytics** for community detection and centrality

---

## 3. Graph Model

### 3.1 Node Types

| Node | Description | Key Properties |
|------|-------------|----------------|
| **Artifact** | Image/application digest | tenant, environment, labels |
| **Component** | Package version | purl, ecosystem, version |
| **File** | Source/binary path | hash, mtime |
| **License** | License identifier | spdx-id, restrictions |
| **Advisory** | Vulnerability record | cve-id, severity, sources |
| **VEXStatement** | VEX decision | status, justification |
| **PolicyVersion** | Signed policy pack | version, digest |

### 3.2 Edge Types

| Edge | From | To | Properties |
|------|------|-----|------------|
| `DEPENDS_ON` | Component | Component | scope, optional |
| `BUILT_FROM` | Artifact | Component | layer, path |
| `DECLARED_IN` | Component | File | sbom-id |
| `AFFECTED_BY` | Component | Advisory | version-range |
| `VEX_EXEMPTS` | VEXStatement | Advisory | justification |
| `GOVERNS_WITH` | PolicyVersion | Artifact | run-id |
| `OBSERVED_RUNTIME` | Artifact | Component | zastava-event-id |

### 3.3 Provenance

Every edge carries:

- `createdAt` - UTC timestamp
- `sourceDigest` - SRM/SBOM hash
- `provenanceRef` - Link to source document

---

## 4. Overlay System

### 4.1 Overlay Types

| Overlay | Purpose | Content |
|---------|---------|---------|
| `policy.overlay.v1` | Policy decisions | verdict, severity, rules |
| `openvex.v1` | VEX status | status, justification |
| `reachability.v1` | Runtime reachability | state, confidence |
| `clustering.v1` | Community detection | cluster-id, modularity |
| `centrality.v1` | Node importance | degree, betweenness |

### 4.2 Overlay Structure

```json
{
  "overlayId": "sha256(tenant|nodeId|overlayKind)",
  "overlayKind": "policy.overlay.v1",
  "nodeId": "component:pkg:npm/lodash@4.17.21",
  "tenant": "acme-corp",
  "generatedAt": "2025-11-29T12:00:00Z",
  "content": {
    "verdict": "blocked",
    "severity": "critical",
    "rulesMatched": ["rule-001", "rule-002"],
    "explainTrace": "sampled trace data..."
  }
}
```

---

## 5. Query Capabilities

### 5.1 Search API

```bash
POST /graph/search
{
  "tenant": "acme-corp",
  "query": "severity:critical AND ecosystem:npm",
  "nodeTypes": ["Component", "Advisory"],
  "limit": 100
}
```

### 5.2 Path Query

```bash
POST /graph/paths
{
  "source": "artifact:sha256:abc123...",
  "target": "advisory:CVE-2025-12345",
  "maxDepth": 6,
  "includeOverlays": true
}
```

**Response:**

```json
{
  "paths": [
    {
      "nodes": ["artifact:sha256:...", "component:pkg:npm/...", "advisory:CVE-..."],
      "edges": [{"type": "BUILT_FROM"}, {"type": "AFFECTED_BY"}],
      "length": 2
    }
  ],
  "overlays": [
    {"nodeId": "component:...", "overlayKind": "policy.overlay.v1", "content": {...}}
  ]
}
```

### 5.3 Diff Query

```bash
POST /graph/diff
{
  "snapshotA": "snapshot-2025-11-28",
  "snapshotB": "snapshot-2025-11-29",
  "includeOverlays": true
}
```

---

## 6. Analytics Pipeline

### 6.1 Clustering

- **Algorithm:** Louvain community detection
- **Output:** Cluster IDs per node, modularity score
- **Use Case:** Identify tightly coupled component groups

### 6.2 Centrality

- **Degree centrality:** Most connected nodes
- **Betweenness centrality:** Critical path nodes
- **Use Case:** Identify high-impact components

### 6.3 Background Processing

```yaml
analytics:
  enabled: true
  schedule: "0 */6 * * *"  # Every 6 hours
  algorithms:
    - clustering
    - centrality
  snapshotRetention: 30
```

---

## 7. Implementation Strategy

### 7.1 Phase 1: Core Model (Complete)

- [x] Node/edge schema
- [x] SBOM ingestion pipeline
- [x] Advisory/VEX linking
- [x] Basic search API

### 7.2 Phase 2: Overlays (In Progress)

- [x] Policy overlay generation
- [x] VEX overlay generation
- [ ] Reachability overlay (GRAPH-REACH-50-001)
- [ ] Inline overlay in query responses (GRAPH-QUERY-51-001)

### 7.3 Phase 3: Analytics (Planned)

- [ ] Clustering algorithm
- [ ] Centrality calculations
- [ ] Background worker
- [ ] Analytics overlays export

### 7.4 Phase 4: Visualization (Planned)

- [ ] Console graph viewer
- [ ] Impact tree visualization
- [ ] Diff visualization

---

## 8. API Surface

### 8.1 Core APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| `/graph/search` | POST | `graph:read` | Search nodes |
| `/graph/query` | POST | `graph:read` | Complex queries |
| `/graph/paths` | POST | `graph:read` | Path finding |
| `/graph/diff` | POST | `graph:read` | Snapshot diff |
| `/graph/nodes/{id}` | GET | `graph:read` | Node detail |

### 8.2 Export APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| `/graph/export` | POST | `graph:export` | Start export job |
| `/graph/export/{jobId}` | GET | `graph:read` | Job status |
| `/graph/export/{jobId}/download` | GET | `graph:export` | Download bundle |

---

## 9. Storage Model

### 9.1 Collections

| Collection | Purpose | Key Indexes |
|------------|---------|-------------|
| `graph_nodes` | Node records | `{tenant, nodeType, nodeId}` |
| `graph_edges` | Edge records | `{tenant, fromId, toId, edgeType}` |
| `graph_overlays` | Overlay data | `{tenant, nodeId, overlayKind}` |
| `graph_snapshots` | Point-in-time snapshots | `{tenant, snapshotId}` |

### 9.2 Export Format

```
graph-export/
├── nodes.jsonl          # Sorted by nodeId
├── edges.jsonl          # Sorted by (from, to, type)
├── overlays/
│   ├── policy.jsonl
│   ├── openvex.jsonl
│   └── manifest.json
└── manifest.json
```

---

## 10. Observability

### 10.1 Metrics

- `graph_ingest_lag_seconds`
- `graph_nodes_total{nodeType}`
- `graph_edges_total{edgeType}`
- `graph_query_latency_seconds{queryType}`
- `graph_analytics_runs_total`
- `graph_analytics_clusters_total`

### 10.2 Offline Support

- Graph snapshots packaged for Offline Kit
- Deterministic NDJSON exports
- Overlay manifests with digests

---

## 11. Related Documentation

| Resource | Location |
|----------|----------|
| Graph architecture | `docs/modules/graph/architecture.md` |
| Query language | `docs/modules/graph/query-language.md` |
| Overlay specification | `docs/modules/graph/overlays.md` |

---

## 12. Sprint Mapping

- **Primary Sprint:** SPRINT_0141_0001_0001_graph_indexer.md
- **Related Sprints:**
  - SPRINT_0401_0001_0001_reachability_evidence_chain.md
  - SPRINT_0140_0001_0001_runtime_signals.md

**Key Task IDs:**

- `GRAPH-CORE-40-001` - Core model (DONE)
- `GRAPH-INGEST-41-001` - SBOM ingestion (DONE)
- `GRAPH-REACH-50-001` - Reachability overlay (IN PROGRESS)
- `GRAPH-ANALYTICS-55-001` - Clustering (TODO)
- `GRAPH-VIZ-60-001` - Visualization (FUTURE)

---

## 13. Success Metrics

| Metric | Target |
|--------|--------|
| Query latency | < 500ms p95 |
| Ingestion lag | < 5 minutes |
| Path query depth | Up to 6 hops |
| Export reproducibility | 100% deterministic |
| Analytics freshness | < 6 hours |

---

*Last updated: 2025-11-29*
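
As a hedged illustration of the deterministic export described in Sections 9.2 and 10.2, the sketch below sorts nodes by `nodeId` and edges by `(fromId, toId, edgeType)`, serializes each record with a fixed key order, and records a SHA-256 digest per file. The function names `to_jsonl` and `build_export`, the manifest layout, and the `sha256:` digest prefix are assumptions made for illustration; the field names are taken from the index definitions in Section 9.1, and overlays are omitted for brevity.

```python
import hashlib
import json


def to_jsonl(records, sort_key):
    """Serialize records as JSONL with a stable record order and key order."""
    lines = [
        json.dumps(rec, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for rec in sorted(records, key=sort_key)
    ]
    return "".join(line + "\n" for line in lines).encode("utf-8")


def build_export(nodes, edges):
    """Return {filename: bytes} for a bundle (hypothetical manifest layout)."""
    files = {
        "nodes.jsonl": to_jsonl(nodes, lambda n: n["nodeId"]),
        "edges.jsonl": to_jsonl(
            edges, lambda e: (e["fromId"], e["toId"], e["edgeType"])
        ),
    }
    # Assumed manifest shape: file name mapped to a content digest.
    manifest = {
        name: "sha256:" + hashlib.sha256(data).hexdigest()
        for name, data in files.items()
    }
    files["manifest.json"] = json.dumps(
        manifest, sort_keys=True, indent=2
    ).encode("utf-8")
    return files


# Usage: the same graph supplied in a different order yields identical bytes.
nodes = [
    {"nodeId": "component:pkg:npm/lodash@4.17.21", "tenant": "acme-corp"},
    {"nodeId": "advisory:CVE-2025-12345", "tenant": "acme-corp"},
]
edges = [
    {
        "fromId": "component:pkg:npm/lodash@4.17.21",
        "toId": "advisory:CVE-2025-12345",
        "edgeType": "AFFECTED_BY",
    }
]
bundle = build_export(nodes, edges)
print(bundle == build_export(nodes[::-1], edges[::-1]))  # prints True
```

Fixing both the record order and the key order means the output bytes depend only on the graph content, so two exports of the same snapshot can be compared by digest alone.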