StellaOps.ReachGraph Module
Module Charter
The ReachGraph module provides a unified store for reachability subgraphs, enabling fast, deterministic, audit-ready answers to "exactly why a dependency is reachable."
Mission
Consolidate reachability data from Scanner, Signals, and Attestor into a single, content-addressed store with:
- Edge explainability: Every edge carries "why" metadata (import, dynamic load, guards)
- Deterministic replay: Same inputs produce identical digests
- Slice queries: Fast queries by package, CVE, entrypoint, or file
- Audit-ready proofs: DSSE-signed artifacts verifiable offline
Scope
| In Scope | Out of Scope |
|---|---|
| ReachGraph schema and data model | Call graph extraction (handled by Scanner) |
| Content-addressed storage | Runtime signal collection (handled by Signals) |
| Slice query APIs | DSSE signing internals (handled by Attestor) |
| Deterministic serialization | VEX document ingestion (handled by Excititor) |
| Valkey caching | Policy evaluation (handled by Policy module) |
| Replay verification | UI components (handled by Web module) |
Architecture
Component Diagram
┌─────────────────────────────────────────────────────────────────┐
│                        ReachGraph Module                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ Schema Layer    │  │ Serialization   │  │ Signing Layer   │  │
│  │                 │  │                 │  │                 │  │
│  │ ReachGraphMin   │  │ Canonical JSON  │  │ DSSE Wrapper    │  │
│  │ EdgeExplanation │  │ BLAKE3 Digest   │  │ Verification    │  │
│  │ Provenance      │  │ Compression     │  │                 │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
│           │                    │                    │           │
│  ┌────────▼────────────────────▼────────────────────▼────────┐  │
│  │                        Store Layer                        │  │
│  │                                                           │  │
│  │   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │  │
│  │   │ Repository   │   │ Slice Engine │   │ Replay Driver│  │  │
│  │   └──────────────┘   └──────────────┘   └──────────────┘  │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────▼─────────────────────────────┐  │
│  │                     Persistence Layer                     │  │
│  │                                                           │  │
│  │   ┌──────────────┐   ┌──────────────┐                     │  │
│  │   │ PostgreSQL   │   │ Valkey       │                     │  │
│  │   │  (primary)   │   │  (cache)     │                     │  │
│  │   └──────────────┘   └──────────────┘                     │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
Project Structure
src/__Libraries/StellaOps.ReachGraph/
├── Schema/
│   ├── ReachGraphMinimal.cs        # Top-level graph structure
│   ├── ReachGraphNode.cs           # Node with metadata
│   ├── ReachGraphEdge.cs           # Edge with explanation
│   ├── EdgeExplanation.cs          # Why the edge exists
│   └── ReachGraphProvenance.cs     # Input tracking
├── Serialization/
│   ├── CanonicalReachGraphSerializer.cs
│   ├── SortedKeysJsonConverter.cs
│   └── DeterministicArraySortConverter.cs
├── Hashing/
│   ├── ReachGraphDigestComputer.cs
│   └── Blake3HashProvider.cs
├── Signing/
│   ├── IReachGraphSignerService.cs
│   └── ReachGraphSignerService.cs
├── Store/
│   ├── IReachGraphRepository.cs
│   ├── PostgresReachGraphRepository.cs
│   └── SliceQueryEngine.cs
├── Cache/
│   ├── IReachGraphCache.cs
│   └── ValkeyReachGraphCache.cs
├── Replay/
│   ├── IReplayDriver.cs
│   └── DeterministicReplayDriver.cs
└── StellaOps.ReachGraph.csproj
src/__Libraries/StellaOps.ReachGraph.Persistence/
├── Migrations/
│   └── 001_reachgraph_store.sql
├── Models/
│   └── SubgraphEntity.cs
└── StellaOps.ReachGraph.Persistence.csproj
src/ReachGraph/
├── StellaOps.ReachGraph.WebService/
│   ├── Endpoints/
│   │   ├── ReachGraphEndpoints.cs
│   │   └── SliceQueryEndpoints.cs
│   ├── Contracts/
│   │   ├── UpsertRequest.cs
│   │   ├── SliceQueryRequest.cs
│   │   └── ReplayRequest.cs
│   ├── Program.cs
│   └── openapi.yaml
└── __Tests/
    └── StellaOps.ReachGraph.WebService.Tests/
Data Model
ReachGraphMinimal Schema (v1)
{
  "schemaVersion": "reachgraph.min@v1",
  "artifact": {
    "name": "svc.payments",
    "digest": "sha256:abc123...",
    "env": ["linux/amd64"]
  },
  "scope": {
    "entrypoints": ["/app/bin/svc"],
    "selectors": ["prod"],
    "cves": ["CVE-2024-1234"]
  },
  "nodes": [
    {
      "id": "sha256:nodeHash1",
      "kind": "function",
      "ref": "main()",
      "file": "src/index.ts",
      "line": 1,
      "isEntrypoint": true
    }
  ],
  "edges": [
    {
      "from": "sha256:nodeHash1",
      "to": "sha256:nodeHash2",
      "why": {
        "type": "Import",
        "loc": "src/index.ts:3",
        "confidence": 1.0
      }
    }
  ],
  "provenance": {
    "intoto": ["attestation-1.link"],
    "inputs": {
      "sbom": "sha256:sbomDigest",
      "vex": "sha256:vexDigest",
      "callgraph": "sha256:cgDigest"
    },
    "computedAt": "2025-12-27T10:00:00Z",
    "analyzer": {
      "name": "stellaops-scanner",
      "version": "1.0.0",
      "toolchainDigest": "sha256:..."
    }
  },
  "signatures": [
    {"keyId": "scanner-signing-2025", "sig": "base64..."}
  ]
}
Edge Explanation Types
| Type | Description | Example Guard |
|---|---|---|
| Import | Static import statement | - |
| DynamicLoad | Runtime require/import | - |
| Reflection | Reflective invocation | - |
| Ffi | Foreign function call | - |
| EnvGuard | Environment variable check | DEBUG=true |
| FeatureFlag | Feature flag condition | FEATURE_X=enabled |
| PlatformArch | Platform/arch guard | os=linux |
| TaintGate | Sanitization/validation | - |
| LoaderRule | PLT/IAT/GOT entry | RTLD_LAZY |
| DirectCall | Direct function call | - |
| Unknown | Cannot determine | - |
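The explanation types above could be modelled roughly as follows. This is a minimal Python sketch, not the module's C# implementation: the names EdgeExplanationType, EdgeExplanation and parse_explanation, the guard field name, the default confidence of 1.0, and the fallback of unrecognised types to Unknown are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EdgeExplanationType(str, Enum):
    """Explanation types from the table above, keyed by their wire value."""
    IMPORT = "Import"
    DYNAMIC_LOAD = "DynamicLoad"
    REFLECTION = "Reflection"
    FFI = "Ffi"
    ENV_GUARD = "EnvGuard"
    FEATURE_FLAG = "FeatureFlag"
    PLATFORM_ARCH = "PlatformArch"
    TAINT_GATE = "TaintGate"
    LOADER_RULE = "LoaderRule"
    DIRECT_CALL = "DirectCall"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class EdgeExplanation:
    type: EdgeExplanationType
    loc: Optional[str] = None
    guard: Optional[str] = None  # hypothetical field for the "Example Guard" column
    confidence: float = 1.0      # assumed default

    def to_json(self) -> dict:
        """Serialize to a `why` object, omitting unset fields."""
        out = {"type": self.type.value, "confidence": self.confidence}
        if self.loc is not None:
            out["loc"] = self.loc
        if self.guard is not None:
            out["guard"] = self.guard
        return out


def parse_explanation(raw: dict) -> EdgeExplanation:
    """Parse a `why` object; unrecognised types map to Unknown (an assumption)."""
    try:
        kind = EdgeExplanationType(raw.get("type"))
    except ValueError:
        kind = EdgeExplanationType.UNKNOWN
    return EdgeExplanation(
        type=kind,
        loc=raw.get("loc"),
        guard=raw.get("guard"),
        confidence=float(raw.get("confidence", 1.0)),
    )
```

Omitting unset fields in to_json keeps the sketch consistent with the "no null fields" determinism rule.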
API Contracts
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/reachgraphs | Upsert subgraph |
| GET | /v1/reachgraphs/{digest} | Get full subgraph |
| GET | /v1/reachgraphs/{digest}/slice | Query slice |
| POST | /v1/reachgraphs/replay | Verify determinism |
| GET | /v1/reachgraphs/by-artifact/{digest} | List by artifact |
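As an illustration of calling the slice endpoint, the sketch below builds the request URL with Python's standard library. The helper name slice_url, the host name, and the blake3:... digest format are hypothetical; the path and the accepted query parameter names come from this document.

```python
from urllib.parse import quote, urlencode

# Query parameter names accepted by GET /v1/reachgraphs/{digest}/slice.
ALLOWED_SLICE_PARAMS = {"q", "cve", "entrypoint", "file", "depth", "direction"}


def slice_url(base_url: str, digest: str, **params) -> str:
    """Build a slice query URL for the given subgraph digest."""
    unknown = set(params) - ALLOWED_SLICE_PARAMS
    if unknown:
        raise ValueError(f"unsupported slice parameters: {sorted(unknown)}")
    # Drop unset parameters and sort the rest so equal queries yield equal URLs.
    pairs = sorted((k, v) for k, v in params.items() if v is not None)
    path = f"{base_url.rstrip('/')}/v1/reachgraphs/{quote(digest, safe=':')}/slice"
    return f"{path}?{urlencode(pairs)}" if pairs else path
```

Sorting the parameters is a design choice of the sketch: it makes the URL usable as a stable cache key.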
Slice Query Parameters
| Parameter | Description |
|---|---|
| q | PURL pattern for package slice |
| cve | CVE ID for vulnerability slice |
| entrypoint | Entrypoint path/symbol |
| file | File path pattern (glob) |
| depth | Max traversal depth |
| direction | upstream, downstream, both |
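To make the depth and direction parameters concrete, here is a minimal breadth-first sketch over edges shaped like the schema's from/to pairs. It assumes that downstream follows edges from their from node to their to node and that upstream follows them in reverse; the document does not define these terms, and slice_nodes is a hypothetical name, not the SliceQueryEngine API.

```python
from collections import deque


def slice_nodes(edges, start, depth, direction="downstream"):
    """Return the sorted node ids within `depth` hops of `start`."""
    if direction not in ("upstream", "downstream", "both"):
        raise ValueError(f"unknown direction: {direction}")

    # Index the edges both ways so either direction is a dictionary lookup.
    forward, backward = {}, {}
    for edge in edges:
        forward.setdefault(edge["from"], []).append(edge["to"])
        backward.setdefault(edge["to"], []).append(edge["from"])

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # depth limit reached; do not expand further
        neighbours = []
        if direction in ("downstream", "both"):
            neighbours += forward.get(node, [])
        if direction in ("upstream", "both"):
            neighbours += backward.get(node, [])
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    # Sorted output keeps the result deterministic regardless of edge order.
    return sorted(seen)
```

For a chain a → b → c → d, a depth-1 downstream slice from b yields b and c, while direction both with depth 2 yields all four nodes.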
Coding Guidelines
Determinism Rules
- All JSON serialization must use canonical format
  - Sorted object keys (lexicographic)
  - Sorted arrays by deterministic field
  - UTC ISO-8601 timestamps
  - No null fields (omit when null)
- Hash computation excludes signatures
  - Remove signatures field before hashing
  - Use BLAKE3-256 for all digests
- Tests must verify determinism
  - Same input must produce same digest
  - Golden samples for regression testing
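The rules above can be sketched in a few lines of Python. Two caveats apply. First, the Python standard library has no BLAKE3, so this sketch substitutes BLAKE2b with a 256-bit output purely as a stand-in; the module itself specifies BLAKE3-256. Second, the array sort keys (nodes by id, edges by from then to) and the digest prefix are assumptions, since the document only says arrays are sorted "by deterministic field".

```python
import hashlib
import json


def _drop_nulls(value):
    """Recursively omit object fields whose value is null."""
    if isinstance(value, dict):
        return {k: _drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [_drop_nulls(v) for v in value]
    return value


def canonical_bytes(graph: dict) -> bytes:
    """Canonical JSON: no signatures, no nulls, sorted keys, sorted arrays."""
    doc = _drop_nulls({k: v for k, v in graph.items() if k != "signatures"})
    if "nodes" in doc:
        doc["nodes"] = sorted(doc["nodes"], key=lambda n: n["id"])  # assumed sort key
    if "edges" in doc:
        doc["edges"] = sorted(doc["edges"], key=lambda e: (e["from"], e["to"]))  # assumed
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def compute_digest(graph: dict) -> str:
    """Digest of the canonical form. BLAKE2b-256 is a stand-in, NOT BLAKE3."""
    digest = hashlib.blake2b(canonical_bytes(graph), digest_size=32).hexdigest()
    return "blake2b-256:" + digest
```

Because signatures are stripped before hashing, signing a graph does not change its digest, which is what allows the digest to serve as the content address.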
Error Handling
- Return structured errors with codes
- Log correlation IDs for tracing
- Never expose internal details in errors
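One way to follow these three rules is sketched below. The response shape, the field names (code, message, correlationId) and the error code used in the usage note are hypothetical; the point is only that the exception is logged server-side while the response body carries a stable code and the correlation ID.

```python
import logging
import uuid
from typing import Optional

logger = logging.getLogger("reachgraph")


def error_response(code: str, message: str, status: int,
                   exc: Optional[BaseException] = None,
                   correlation_id: Optional[str] = None):
    """Return (status, body) for a failed request.

    The exception is written to the log only; the body contains the
    stable error code, a safe message, and the correlation ID.
    """
    correlation_id = correlation_id or uuid.uuid4().hex
    logger.error(
        "request failed",
        extra={"error_code": code, "correlation_id": correlation_id},
        exc_info=exc,
    )
    body = {"error": {"code": code, "message": message, "correlationId": correlation_id}}
    return status, body
```

A caller might return error_response("REACHGRAPH_NOT_FOUND", "Subgraph not found", 404, exc=err), leaving the stack trace in the logs, where it can be found through the correlation ID.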
Performance
- Cache hot slices in Valkey (30-minute TTL)
- Compress stored blobs with gzip
- Paginate large results (50 nodes per page)
- Time out long queries (30 s max)
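The 50-nodes-per-page rule could look like the following sketch. The envelope field names (items, page, totalPages, hasMore), the 1-based page numbering, and ordering nodes by id before slicing are assumptions; ordering first means the same page always contains the same nodes.

```python
def paginate(nodes, page, page_size=50):
    """Return one page of nodes, ordered by id so pages are stable."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    ordered = sorted(nodes, key=lambda n: n["id"])
    start = (page - 1) * page_size
    total_pages = -(-len(ordered) // page_size)  # ceiling division
    return {
        "items": ordered[start:start + page_size],
        "page": page,
        "totalPages": total_pages,
        "hasMore": page < total_pages,
    }
```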
Integration Points
Upstream (Data Producers)
| Module | Data | Integration |
|---|---|---|
| Scanner.CallGraph | Call graph nodes/edges | ICallGraphExtractor produces input |
| Signals | Runtime facts | Correlates static + dynamic paths |
| Attestor | DSSE signing | IReachGraphSignerService delegates |
Downstream (Data Consumers)
| Module | Usage | Integration |
|---|---|---|
| Policy | VEX decisions | ReachabilityRequirementGate queries slices |
| Web | UI panel | REST API for "Why Reachable?" |
| CLI | Proof export | stella reachgraph commands |
| ExportCenter | Batch reports | Includes subgraphs in evidence bundles |
Testing Requirements
Unit Tests
- CanonicalSerializerTests.cs - Deterministic serialization
- DigestComputerTests.cs - BLAKE3 hashing
- EdgeExplanationTests.cs - Type coverage
- SliceEngineTests.cs - Query correctness
Integration Tests
- PostgreSQL with Testcontainers
- Valkey cache behavior
- Tenant isolation (RLS)
- Rate limiting enforcement
Golden Samples
Located in tests/ReachGraph/Fixtures/:
- simple-single-path.reachgraph.min.json
- multi-edge-java.reachgraph.min.json
- feature-flag-guards.reachgraph.min.json
- large-graph-50-nodes.reachgraph.min.json
Configuration
Environment Variables
| Variable | Description | Default |
|---|---|---|
| REACHGRAPH_POSTGRES_CONNECTION | PostgreSQL connection string | - |
| REACHGRAPH_VALKEY_CONNECTION | Valkey connection string | - |
| REACHGRAPH_CACHE_TTL_MINUTES | Cache TTL for full graphs | 1440 |
| REACHGRAPH_SLICE_CACHE_TTL_MINUTES | Cache TTL for slices | 30 |
| REACHGRAPH_MAX_GRAPH_SIZE_MB | Max graph size in cache | 10 |
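A small loader for these variables might look like the sketch below. The variable names and defaults come from the table; treating the two connection strings as required (their default is listed as "-") and rejecting non-positive numbers are assumptions, and ReachGraphSettings and load_settings are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReachGraphSettings:
    postgres_connection: str
    valkey_connection: str
    cache_ttl_minutes: int = 1440
    slice_cache_ttl_minutes: int = 30
    max_graph_size_mb: int = 10


def load_settings(env: Optional[Mapping[str, str]] = None) -> ReachGraphSettings:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    required = ("REACHGRAPH_POSTGRES_CONNECTION", "REACHGRAPH_VALKEY_CONNECTION")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    def int_setting(name: str, default: int) -> int:
        value = int(env.get(name, default))
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer")
        return value

    return ReachGraphSettings(
        postgres_connection=env["REACHGRAPH_POSTGRES_CONNECTION"],
        valkey_connection=env["REACHGRAPH_VALKEY_CONNECTION"],
        cache_ttl_minutes=int_setting("REACHGRAPH_CACHE_TTL_MINUTES", 1440),
        slice_cache_ttl_minutes=int_setting("REACHGRAPH_SLICE_CACHE_TTL_MINUTES", 30),
        max_graph_size_mb=int_setting("REACHGRAPH_MAX_GRAPH_SIZE_MB", 10),
    )
```

Passing a plain dictionary instead of the process environment keeps the loader easy to test.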
YAML Configuration
# etc/reachgraph.yaml
reachgraph:
  store:
    maxDepth: 10
    maxPaths: 5
    compressionLevel: 6
  cache:
    enabled: true
    ttlMinutes: 30
  replay:
    enabled: true
    logResults: true
Observability
Metrics
- reachgraph_upsert_total - Upsert count by result
- reachgraph_query_duration_seconds - Query latency histogram
- reachgraph_cache_hit_ratio - Cache hit rate
- reachgraph_replay_match_total - Replay verification results
- reachgraph_slice_size_bytes - Slice response sizes
Logging
- Structured JSON logs
- Correlation ID in all entries
- Tenant context
- Query parameters (sanitized)
Tracing
- OpenTelemetry spans for:
  - Upsert operations
  - Slice queries
  - Cache lookups
  - Replay verification
Related Documentation
- docs/implplan/SPRINT_1227_0012_0001_LB_reachgraph_core.md
- docs/implplan/SPRINT_1227_0012_0002_BE_reachgraph_store.md
- docs/implplan/SPRINT_1227_0012_0003_FE_reachgraph_integration.md
- src/Attestor/POE_PREDICATE_SPEC.md (predecessor schema)
- docs/modules/scanner/architecture.md
- docs/modules/signals/architecture.md
Module created: 2025-12-27. Owner: ReachGraph Guild.