# StellaOps.ReachGraph Module

## Module Charter

The **ReachGraph** module provides a unified store for reachability subgraphs, enabling fast, deterministic, audit-ready answers to "*exactly why* a dependency is reachable."

### Mission

Consolidate reachability data from Scanner, Signals, and Attestor into a single, content-addressed store with:

- **Edge explainability**: Every edge carries "why" metadata (import, dynamic load, guards)
- **Deterministic replay**: Same inputs produce identical digests
- **Slice queries**: Fast queries by package, CVE, entrypoint, or file
- **Audit-ready proofs**: DSSE-signed artifacts verifiable offline

### Scope

| In Scope | Out of Scope |
|----------|--------------|
| ReachGraph schema and data model | Call graph extraction (handled by Scanner) |
| Content-addressed storage | Runtime signal collection (handled by Signals) |
| Slice query APIs | DSSE signing internals (handled by Attestor) |
| Deterministic serialization | VEX document ingestion (handled by Excititor) |
| Valkey caching | Policy evaluation (handled by Policy module) |
| Replay verification | UI components (handled by Web module) |

---

## Architecture

### Component Diagram

```
┌─────────────────────────────────────────────────────────────────┐
│                        ReachGraph Module                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  Schema Layer   │  │  Serialization  │  │  Signing Layer  │  │
│  │                 │  │                 │  │                 │  │
│  │  ReachGraphMin  │  │  Canonical JSON │  │  DSSE Wrapper   │  │
│  │  EdgeExplanation│  │  BLAKE3 Digest  │  │  Verification   │  │
│  │  Provenance     │  │  Compression    │  │                 │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
│           │                    │                    │           │
│  ┌────────┴────────────────────┴────────────────────┴────────┐  │
│  │                        Store Layer                        │  │
│  │                                                           │  │
│  │   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │  │
│  │   │  Repository  │   │ Slice Engine │   │ Replay Driver│  │  │
│  │   └──────────────┘   └──────────────┘   └──────────────┘  │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────┴─────────────────────────────┐  │
│  │                     Persistence Layer                     │  │
│  │                                                           │  │
│  │   ┌──────────────┐   ┌──────────────┐                     │  │
│  │   │  PostgreSQL  │   │    Valkey    │                     │  │
│  │   │  (primary)   │   │   (cache)    │                     │  │
│  │   └──────────────┘   └──────────────┘                     │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

### Project Structure

```
src/__Libraries/StellaOps.ReachGraph/
├── Schema/
│   ├── ReachGraphMinimal.cs          # Top-level graph structure
│   ├── ReachGraphNode.cs             # Node with metadata
│   ├── ReachGraphEdge.cs             # Edge with explanation
│   ├── EdgeExplanation.cs            # Why the edge exists
│   └── ReachGraphProvenance.cs       # Input tracking
├── Serialization/
│   ├── CanonicalReachGraphSerializer.cs
│   ├── SortedKeysJsonConverter.cs
│   └── DeterministicArraySortConverter.cs
├── Hashing/
│   ├── ReachGraphDigestComputer.cs
│   └── Blake3HashProvider.cs
├── Signing/
│   ├── IReachGraphSignerService.cs
│   └── ReachGraphSignerService.cs
├── Store/
│   ├── IReachGraphRepository.cs
│   ├── PostgresReachGraphRepository.cs
│   └── SliceQueryEngine.cs
├── Cache/
│   ├── IReachGraphCache.cs
│   └── ValkeyReachGraphCache.cs
├── Replay/
│   ├── IReplayDriver.cs
│   └── DeterministicReplayDriver.cs
└── StellaOps.ReachGraph.csproj

src/__Libraries/StellaOps.ReachGraph.Persistence/
├── Migrations/
│   └── 001_reachgraph_store.sql
├── Models/
│   └── SubgraphEntity.cs
└── StellaOps.ReachGraph.Persistence.csproj

src/ReachGraph/
├── StellaOps.ReachGraph.WebService/
│   ├── Endpoints/
│   │   ├── ReachGraphEndpoints.cs
│   │   └── SliceQueryEndpoints.cs
│   ├── Contracts/
│   │   ├── UpsertRequest.cs
│   │   ├── SliceQueryRequest.cs
│   │   └── ReplayRequest.cs
│   ├── Program.cs
│   └── openapi.yaml
└── __Tests/
    └── StellaOps.ReachGraph.WebService.Tests/
```

---

## Data Model

### ReachGraphMinimal Schema (v1)

```json
{
  "schemaVersion": "reachgraph.min@v1",
  "artifact": {
    "name": "svc.payments",
    "digest": "sha256:abc123...",
    "env": ["linux/amd64"]
  },
  "scope": {
    "entrypoints": ["/app/bin/svc"],
    "selectors": ["prod"],
    "cves": ["CVE-2024-1234"]
  },
  "nodes": [
    {
      "id": "sha256:nodeHash1",
      "kind": "function",
      "ref": "main()",
      "file": "src/index.ts",
      "line": 1,
      "isEntrypoint": true
    }
  ],
  "edges": [
    {
      "from": "sha256:nodeHash1",
      "to": "sha256:nodeHash2",
      "why": {
        "type": "Import",
        "loc": "src/index.ts:3",
        "confidence": 1.0
      }
    }
  ],
  "provenance": {
    "intoto": ["attestation-1.link"],
    "inputs": {
      "sbom": "sha256:sbomDigest",
      "vex": "sha256:vexDigest",
      "callgraph": "sha256:cgDigest"
    },
    "computedAt": "2025-12-27T10:00:00Z",
    "analyzer": {
      "name": "stellaops-scanner",
      "version": "1.0.0",
      "toolchainDigest": "sha256:..."
    }
  },
  "signatures": [
    {"keyId": "scanner-signing-2025", "sig": "base64..."}
  ]
}
```

### Edge Explanation Types

| Type | Description | Example Guard |
|------|-------------|---------------|
| `Import` | Static import statement | - |
| `DynamicLoad` | Runtime require/import | - |
| `Reflection` | Reflective invocation | - |
| `Ffi` | Foreign function call | - |
| `EnvGuard` | Environment variable check | `DEBUG=true` |
| `FeatureFlag` | Feature flag condition | `FEATURE_X=enabled` |
| `PlatformArch` | Platform/arch guard | `os=linux` |
| `TaintGate` | Sanitization/validation | - |
| `LoaderRule` | PLT/IAT/GOT entry | `RTLD_LAZY` |
| `DirectCall` | Direct function call | - |
| `Unknown` | Cannot determine | - |

---

## API Contracts

### Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | `/v1/reachgraphs` | Upsert subgraph |
| GET | `/v1/reachgraphs/{digest}` | Get full subgraph |
| GET | `/v1/reachgraphs/{digest}/slice` | Query slice |
| POST | `/v1/reachgraphs/replay` | Verify determinism |
| GET | `/v1/reachgraphs/by-artifact/{digest}` | List by artifact |

### Slice Query Parameters

| Parameter | Description |
|-----------|-------------|
| `q` | PURL pattern for package slice |
| `cve` | CVE ID for vulnerability slice |
| `entrypoint` | Entrypoint path/symbol |
| `file` | File path pattern (glob) |
| `depth` | Max traversal depth |
| `direction` | `upstream`, `downstream`, `both` |

---

## Coding Guidelines

### Determinism Rules

1. **All JSON serialization must use canonical format**
   - Sorted object keys (lexicographic)
   - Sorted arrays by deterministic field
   - UTC ISO-8601 timestamps
   - No null fields (omit when null)

2. **Hash computation excludes signatures**
   - Remove `signatures` field before hashing
   - Use BLAKE3-256 for all digests

3. **Tests must verify determinism**
   - Same input must produce same digest
   - Golden samples for regression testing

### Error Handling

- Return structured errors with codes
- Log correlation IDs for tracing
- Never expose internal details in errors

### Performance

- Cache hot slices in Valkey (30min TTL)
- Compress stored blobs with gzip
- Paginate large results (50 nodes per page)
- Timeout long queries (30s max)

---

## Integration Points

### Upstream (Data Producers)

| Module | Data | Integration |
|--------|------|-------------|
| Scanner.CallGraph | Call graph nodes/edges | `ICallGraphExtractor` produces input |
| Signals | Runtime facts | Correlates static + dynamic paths |
| Attestor | DSSE signing | `IReachGraphSignerService` delegates |

### Downstream (Data Consumers)

| Module | Usage | Integration |
|--------|-------|-------------|
| Policy | VEX decisions | `ReachabilityRequirementGate` queries slices |
| Web | UI panel | REST API for "Why Reachable?" |
| CLI | Proof export | `stella reachgraph` commands |
| ExportCenter | Batch reports | Includes subgraphs in evidence bundles |

---

## Testing Requirements

### Unit Tests

- `CanonicalSerializerTests.cs` - Deterministic serialization
- `DigestComputerTests.cs` - BLAKE3 hashing
- `EdgeExplanationTests.cs` - Type coverage
- `SliceEngineTests.cs` - Query correctness

### Integration Tests

- PostgreSQL with Testcontainers
- Valkey cache behavior
- Tenant isolation (RLS)
- Rate limiting enforcement

### Golden Samples

Located in `tests/ReachGraph/Fixtures/`:

- `simple-single-path.reachgraph.min.json`
- `multi-edge-java.reachgraph.min.json`
- `feature-flag-guards.reachgraph.min.json`
- `large-graph-50-nodes.reachgraph.min.json`

---

## Configuration

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `REACHGRAPH_POSTGRES_CONNECTION` | PostgreSQL connection string | - |
| `REACHGRAPH_VALKEY_CONNECTION` | Valkey connection string | - |
| `REACHGRAPH_CACHE_TTL_MINUTES` | Cache TTL for full graphs | 1440 |
| `REACHGRAPH_SLICE_CACHE_TTL_MINUTES` | Cache TTL for slices | 30 |
| `REACHGRAPH_MAX_GRAPH_SIZE_MB` | Max graph size in cache | 10 |

### YAML Configuration

```yaml
# etc/reachgraph.yaml
reachgraph:
  store:
    maxDepth: 10
    maxPaths: 5
    compressionLevel: 6
  cache:
    enabled: true
    ttlMinutes: 30
  replay:
    enabled: true
    logResults: true
```

---

## Observability

### Metrics

- `reachgraph_upsert_total` - Upsert count by result
- `reachgraph_query_duration_seconds` - Query latency histogram
- `reachgraph_cache_hit_ratio` - Cache hit rate
- `reachgraph_replay_match_total` - Replay verification results
- `reachgraph_slice_size_bytes` - Slice response sizes

### Logging

- Structured JSON logs
- Correlation ID in all entries
- Tenant context
- Query parameters (sanitized)

### Tracing

- OpenTelemetry spans for:
  - Upsert operations
  - Slice queries
  - Cache lookups
  - Replay verification

---

## Related Documentation

- `docs-archived/implplan/SPRINT_1227_0012_0001_LB_reachgraph_core.md`
- `docs-archived/implplan/SPRINT_1227_0012_0002_BE_reachgraph_store.md`
- `docs-archived/implplan/SPRINT_1227_0012_0003_FE_reachgraph_integration.md`
- `src/Attestor/POE_PREDICATE_SPEC.md` (predecessor schema)
- `docs/modules/scanner/architecture.md`
- `docs/modules/signals/architecture.md`

---

_Module created: 2025-12-27. Owner: ReachGraph Guild._
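
The Determinism Rules under Coding Guidelines (sorted keys, sorted arrays, omitted nulls, `signatures` excluded from the hash) can be illustrated with a small, language-neutral sketch. This is not the module's C# implementation. The function names `strip_nulls`, `canonical_bytes`, and `compute_digest` are hypothetical, the array sort keys (node `id`, edge `from`/`to`) are an assumption about what "deterministic field" means, and SHA-256 stands in for BLAKE3-256 only because BLAKE3 is not in Python's standard library.

```python
import hashlib
import json


def strip_nulls(value):
    """Recursively drop object fields whose value is None (rule: omit when null)."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]
    return value


def canonical_bytes(graph):
    """Serialize a graph dict canonically, leaving out the signatures field."""
    # Rule 2: the signatures field never takes part in the hash.
    unsigned = {k: v for k, v in graph.items() if k != "signatures"}
    cleaned = strip_nulls(unsigned)
    # Rule 1: arrays are sorted by a deterministic field.
    # Assumption: nodes sort by "id", edges by ("from", "to").
    if "nodes" in cleaned:
        cleaned["nodes"] = sorted(cleaned["nodes"], key=lambda n: n["id"])
    if "edges" in cleaned:
        cleaned["edges"] = sorted(
            cleaned["edges"], key=lambda e: (e["from"], e["to"])
        )
    # Rule 1: object keys sorted lexicographically, no insignificant whitespace.
    text = json.dumps(
        cleaned, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def compute_digest(graph):
    """Hash the canonical form. SHA-256 is a stand-in for BLAKE3-256 here."""
    return "sha256:" + hashlib.sha256(canonical_bytes(graph)).hexdigest()


if __name__ == "__main__":
    signed = {
        "schemaVersion": "reachgraph.min@v1",
        "nodes": [
            {"id": "n2", "kind": "function"},
            {"id": "n1", "kind": "function", "file": None},
        ],
        "edges": [{"from": "n1", "to": "n2", "why": {"type": "Import"}}],
        "signatures": [{"keyId": "example-key", "sig": "base64..."}],
    }
    reordered = {
        "edges": [{"why": {"type": "Import"}, "to": "n2", "from": "n1"}],
        "nodes": [
            {"kind": "function", "id": "n1"},
            {"id": "n2", "kind": "function"},
        ],
        "schemaVersion": "reachgraph.min@v1",
    }
    # Key order, array order, null fields, and signatures do not affect the digest.
    print(compute_digest(signed) == compute_digest(reordered))  # prints True
```

A golden-sample regression test in the spirit of rule 3 would load a fixture, recompute its digest with a function like `compute_digest`, and compare the result against a stored value. A production serializer would also need to pin down number formatting and timestamp normalization, which this sketch leaves untouched.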