fixes save

StellaOps Bot
2025-12-26 22:03:32 +02:00
parent 9a4cd2e0f7
commit 3bfbbae115
2076 changed files with 47168 additions and 32914 deletions

@@ -0,0 +1,369 @@
# StellaOps.ReachGraph Module
## Module Charter
The **ReachGraph** module provides a unified store for reachability subgraphs, enabling fast, deterministic, audit-ready answers to "*exactly why* a dependency is reachable."
### Mission
Consolidate reachability data from Scanner, Signals, and Attestor into a single, content-addressed store with:
- **Edge explainability**: Every edge carries "why" metadata (import, dynamic load, guards)
- **Deterministic replay**: Same inputs produce identical digests
- **Slice queries**: Fast queries by package, CVE, entrypoint, or file
- **Audit-ready proofs**: DSSE-signed artifacts verifiable offline
### Scope
| In Scope | Out of Scope |
|----------|--------------|
| ReachGraph schema and data model | Call graph extraction (handled by Scanner) |
| Content-addressed storage | Runtime signal collection (handled by Signals) |
| Slice query APIs | DSSE signing internals (handled by Attestor) |
| Deterministic serialization | VEX document ingestion (handled by Excititor) |
| Valkey caching | Policy evaluation (handled by Policy module) |
| Replay verification | UI components (handled by Web module) |
---
## Architecture
### Component Diagram
```
┌─────────────────────────────────────────────────────────────────┐
│                        ReachGraph Module                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  Schema Layer   │  │  Serialization  │  │  Signing Layer  │  │
│  │                 │  │                 │  │                 │  │
│  │ ReachGraphMin   │  │ Canonical JSON  │  │ DSSE Wrapper    │  │
│  │ EdgeExplanation │  │ BLAKE3 Digest   │  │ Verification    │  │
│  │ Provenance      │  │ Compression     │  │                 │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
│           │                    │                    │           │
│  ┌────────▼────────────────────▼────────────────────▼────────┐  │
│  │                        Store Layer                        │  │
│  │                                                           │  │
│  │  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐   │  │
│  │  │  Repository  │   │ Slice Engine │   │ Replay Driver│   │  │
│  │  └──────────────┘   └──────────────┘   └──────────────┘   │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────▼─────────────────────────────┐  │
│  │                     Persistence Layer                     │  │
│  │                                                           │  │
│  │  ┌──────────────┐   ┌──────────────┐                      │  │
│  │  │  PostgreSQL  │   │    Valkey    │                      │  │
│  │  │  (primary)   │   │   (cache)    │                      │  │
│  │  └──────────────┘   └──────────────┘                      │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
### Project Structure
```
src/__Libraries/StellaOps.ReachGraph/
├── Schema/
│   ├── ReachGraphMinimal.cs          # Top-level graph structure
│   ├── ReachGraphNode.cs             # Node with metadata
│   ├── ReachGraphEdge.cs             # Edge with explanation
│   ├── EdgeExplanation.cs            # Why the edge exists
│   └── ReachGraphProvenance.cs       # Input tracking
├── Serialization/
│   ├── CanonicalReachGraphSerializer.cs
│   ├── SortedKeysJsonConverter.cs
│   └── DeterministicArraySortConverter.cs
├── Hashing/
│   ├── ReachGraphDigestComputer.cs
│   └── Blake3HashProvider.cs
├── Signing/
│   ├── IReachGraphSignerService.cs
│   └── ReachGraphSignerService.cs
├── Store/
│   ├── IReachGraphRepository.cs
│   ├── PostgresReachGraphRepository.cs
│   └── SliceQueryEngine.cs
├── Cache/
│   ├── IReachGraphCache.cs
│   └── ValkeyReachGraphCache.cs
├── Replay/
│   ├── IReplayDriver.cs
│   └── DeterministicReplayDriver.cs
└── StellaOps.ReachGraph.csproj

src/__Libraries/StellaOps.ReachGraph.Persistence/
├── Migrations/
│   └── 001_reachgraph_store.sql
├── Models/
│   └── SubgraphEntity.cs
└── StellaOps.ReachGraph.Persistence.csproj

src/ReachGraph/
├── StellaOps.ReachGraph.WebService/
│   ├── Endpoints/
│   │   ├── ReachGraphEndpoints.cs
│   │   └── SliceQueryEndpoints.cs
│   ├── Contracts/
│   │   ├── UpsertRequest.cs
│   │   ├── SliceQueryRequest.cs
│   │   └── ReplayRequest.cs
│   ├── Program.cs
│   └── openapi.yaml
└── __Tests/
    └── StellaOps.ReachGraph.WebService.Tests/
```
---
## Data Model
### ReachGraphMinimal Schema (v1)
```json
{
  "schemaVersion": "reachgraph.min@v1",
  "artifact": {
    "name": "svc.payments",
    "digest": "sha256:abc123...",
    "env": ["linux/amd64"]
  },
  "scope": {
    "entrypoints": ["/app/bin/svc"],
    "selectors": ["prod"],
    "cves": ["CVE-2024-1234"]
  },
  "nodes": [
    {
      "id": "sha256:nodeHash1",
      "kind": "function",
      "ref": "main()",
      "file": "src/index.ts",
      "line": 1,
      "isEntrypoint": true
    }
  ],
  "edges": [
    {
      "from": "sha256:nodeHash1",
      "to": "sha256:nodeHash2",
      "why": {
        "type": "Import",
        "loc": "src/index.ts:3",
        "confidence": 1.0
      }
    }
  ],
  "provenance": {
    "intoto": ["attestation-1.link"],
    "inputs": {
      "sbom": "sha256:sbomDigest",
      "vex": "sha256:vexDigest",
      "callgraph": "sha256:cgDigest"
    },
    "computedAt": "2025-12-27T10:00:00Z",
    "analyzer": {
      "name": "stellaops-scanner",
      "version": "1.0.0",
      "toolchainDigest": "sha256:..."
    }
  },
  "signatures": [
    {"keyId": "scanner-signing-2025", "sig": "base64..."}
  ]
}
```
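
As a rough illustration of how a consumer might sanity-check a `reachgraph.min@v1` document, the Python sketch below lists edges whose endpoints are not present in `nodes`. The helper name `find_dangling_edges` is hypothetical, and whether the schema requires every endpoint to appear in `nodes` is an assumption; the abbreviated sample above, for instance, references `sha256:nodeHash2` without listing it.

```python
from typing import Any


def find_dangling_edges(graph: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (from, to) pairs with at least one endpoint missing from `nodes`."""
    known = {node["id"] for node in graph.get("nodes", [])}
    return [
        (edge["from"], edge["to"])
        for edge in graph.get("edges", [])
        if edge["from"] not in known or edge["to"] not in known
    ]


# Trimmed copy of the sample document (only the fields the check reads).
sample = {
    "schemaVersion": "reachgraph.min@v1",
    "nodes": [{"id": "sha256:nodeHash1", "kind": "function", "ref": "main()"}],
    "edges": [
        {
            "from": "sha256:nodeHash1",
            "to": "sha256:nodeHash2",
            "why": {"type": "Import", "loc": "src/index.ts:3", "confidence": 1.0},
        }
    ],
}

dangling = find_dangling_edges(sample)
print(dangling)
```

A check like this could run before upsert so that incomplete subgraphs are rejected early, if the store is meant to hold only self-contained graphs.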
### Edge Explanation Types
| Type | Description | Example Guard |
|------|-------------|---------------|
| `Import` | Static import statement | - |
| `DynamicLoad` | Runtime require/import | - |
| `Reflection` | Reflective invocation | - |
| `Ffi` | Foreign function call | - |
| `EnvGuard` | Environment variable check | `DEBUG=true` |
| `FeatureFlag` | Feature flag condition | `FEATURE_X=enabled` |
| `PlatformArch` | Platform/arch guard | `os=linux` |
| `TaintGate` | Sanitization/validation | - |
| `LoaderRule` | PLT/IAT/GOT entry | `RTLD_LAZY` |
| `DirectCall` | Direct function call | - |
| `Unknown` | Cannot determine | - |
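
The table above maps naturally onto an enumeration plus a small value type. The Python sketch below is illustrative only: the member values mirror the `type` strings in the table, while the `guard` field name, the fallback to `Unknown` for unrecognized strings, and the default confidence of `1.0` are assumptions, not part of the documented schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class EdgeExplanationType(Enum):
    IMPORT = "Import"
    DYNAMIC_LOAD = "DynamicLoad"
    REFLECTION = "Reflection"
    FFI = "Ffi"
    ENV_GUARD = "EnvGuard"
    FEATURE_FLAG = "FeatureFlag"
    PLATFORM_ARCH = "PlatformArch"
    TAINT_GATE = "TaintGate"
    LOADER_RULE = "LoaderRule"
    DIRECT_CALL = "DirectCall"
    UNKNOWN = "Unknown"


@dataclass(frozen=True)
class EdgeExplanation:
    type: EdgeExplanationType
    loc: Optional[str] = None
    confidence: float = 1.0       # assumed default
    guard: Optional[str] = None   # hypothetical field, e.g. "DEBUG=true"


def parse_why(why: dict[str, Any]) -> EdgeExplanation:
    """Build an EdgeExplanation from an edge's `why` object."""
    try:
        kind = EdgeExplanationType(why.get("type"))
    except ValueError:
        # Assumption: unrecognized type strings degrade to Unknown.
        kind = EdgeExplanationType.UNKNOWN
    return EdgeExplanation(
        type=kind,
        loc=why.get("loc"),
        confidence=float(why.get("confidence", 1.0)),
        guard=why.get("guard"),
    )


explanation = parse_why({"type": "Import", "loc": "src/index.ts:3", "confidence": 1.0})
print(explanation.type.value, explanation.loc)
```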
---
## API Contracts
### Endpoints
| Method | Path | Description |
|--------|------|-------------|
| POST | `/v1/reachgraphs` | Upsert subgraph |
| GET | `/v1/reachgraphs/{digest}` | Get full subgraph |
| GET | `/v1/reachgraphs/{digest}/slice` | Query slice |
| POST | `/v1/reachgraphs/replay` | Verify determinism |
| GET | `/v1/reachgraphs/by-artifact/{digest}` | List by artifact |
### Slice Query Parameters
| Parameter | Description |
|-----------|-------------|
| `q` | PURL pattern for package slice |
| `cve` | CVE ID for vulnerability slice |
| `entrypoint` | Entrypoint path/symbol |
| `file` | File path pattern (glob) |
| `depth` | Max traversal depth |
| `direction` | `upstream`, `downstream`, `both` |
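
A client request to the slice endpoint could be assembled roughly as follows. This is a sketch under stated assumptions: the helper `build_slice_url` and the host name are hypothetical, the digest is placed in the path unescaped, and whether the parameters may be freely combined is not specified in this document. Only the path and parameter names come from the tables above.

```python
from typing import Optional
from urllib.parse import urlencode

ALLOWED_DIRECTIONS = {"upstream", "downstream", "both"}


def build_slice_url(
    base_url: str,
    digest: str,
    *,
    q: Optional[str] = None,
    cve: Optional[str] = None,
    entrypoint: Optional[str] = None,
    file: Optional[str] = None,
    depth: Optional[int] = None,
    direction: Optional[str] = None,
) -> str:
    """Build a GET URL for /v1/reachgraphs/{digest}/slice."""
    if direction is not None and direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}")
    if depth is not None and depth < 0:
        raise ValueError("depth must be non-negative")
    params = {
        "q": q,
        "cve": cve,
        "entrypoint": entrypoint,
        "file": file,
        "depth": depth,
        "direction": direction,
    }
    # Drop unset parameters; urlencode percent-encodes the remaining values.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    path = f"{base_url.rstrip('/')}/v1/reachgraphs/{digest}/slice"
    return f"{path}?{query}" if query else path


url = build_slice_url(
    "https://reachgraph.example.internal",  # hypothetical host
    "sha256:abc123",
    cve="CVE-2024-1234",
    depth=3,
    direction="upstream",
)
print(url)
```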
---
## Coding Guidelines
### Determinism Rules
1. **All JSON serialization must use canonical format**
   - Sorted object keys (lexicographic)
   - Sorted arrays by deterministic field
   - UTC ISO-8601 timestamps
   - No null fields (omit when null)
2. **Hash computation excludes signatures**
   - Remove `signatures` field before hashing
   - Use BLAKE3-256 for all digests
3. **Tests must verify determinism**
   - Same input must produce same digest
   - Golden samples for regression testing
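
The rules above can be sketched in a few lines of Python. Several points here are assumptions rather than the module's actual behavior: the function names are hypothetical, arrays are sorted by `id` (nodes) and `(from, to)` (edges), and timestamps are taken to be already normalized UTC strings. Python's standard library has no BLAKE3, so the sketch substitutes BLAKE2b with a 256-bit digest purely as a stand-in; a real implementation would use a BLAKE3 library.

```python
import hashlib
import json
from typing import Any


def strip_nulls(value: Any) -> Any:
    """Recursively drop object fields whose value is None."""
    if isinstance(value, dict):
        return {k: strip_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [strip_nulls(v) for v in value]
    return value


def canonical_bytes(graph: dict[str, Any]) -> bytes:
    """Serialize canonically, excluding the `signatures` field."""
    body = strip_nulls({k: v for k, v in graph.items() if k != "signatures"})
    # Assumed sort fields: nodes by id, edges by (from, to).
    if "nodes" in body:
        body["nodes"] = sorted(body["nodes"], key=lambda n: n["id"])
    if "edges" in body:
        body["edges"] = sorted(body["edges"], key=lambda e: (e["from"], e["to"]))
    text = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def compute_digest(graph: dict[str, Any]) -> str:
    # Stand-in hash: BLAKE2b-256 from hashlib, NOT BLAKE3-256.
    return hashlib.blake2b(canonical_bytes(graph), digest_size=32).hexdigest()


# Same content, different key order, array order, null field, and signatures.
a = {
    "schemaVersion": "reachgraph.min@v1",
    "nodes": [{"id": "n2", "kind": "function"}, {"id": "n1", "kind": "function", "line": None}],
    "edges": [{"from": "n1", "to": "n2", "why": {"type": "Import"}}],
    "signatures": [{"keyId": "k1", "sig": "AAAA"}],
}
b = {
    "edges": [{"why": {"type": "Import"}, "to": "n2", "from": "n1"}],
    "nodes": [{"kind": "function", "id": "n1"}, {"kind": "function", "id": "n2"}],
    "schemaVersion": "reachgraph.min@v1",
}
print(compute_digest(a) == compute_digest(b))
```

Because signatures are removed before hashing, a graph can be signed or re-signed without changing its content address, which is presumably what makes replay comparison by digest workable.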
### Error Handling
- Return structured errors with codes
- Log correlation IDs for tracing
- Never expose internal details in errors
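
One way to satisfy all three points is to translate exceptions into a fixed error shape at the API boundary. In the Python sketch below, the error codes, the `ApiError` type, and the exception-to-code mapping are all hypothetical; it only illustrates keeping full detail in the log while returning a generic message with a correlation ID.

```python
import logging
import uuid
from dataclasses import asdict, dataclass

logger = logging.getLogger("reachgraph")


@dataclass(frozen=True)
class ApiError:
    code: str
    message: str
    correlation_id: str


def to_api_error(exc: Exception, correlation_id: str) -> ApiError:
    """Map an exception to a structured error without leaking internals."""
    # Full detail goes to the log only, keyed by the correlation ID.
    logger.error("request failed", extra={"correlation_id": correlation_id}, exc_info=exc)
    if isinstance(exc, LookupError):
        return ApiError("REACHGRAPH_NOT_FOUND", "Subgraph not found.", correlation_id)
    if isinstance(exc, ValueError):
        return ApiError("REACHGRAPH_INVALID_REQUEST", "Request is invalid.", correlation_id)
    return ApiError("REACHGRAPH_INTERNAL_ERROR", "An internal error occurred.", correlation_id)


cid = str(uuid.uuid4())
err = to_api_error(RuntimeError("connection to 10.0.0.5:5432 refused"), cid)
print(asdict(err))
```

The caller sees only the code, the generic message, and the correlation ID; an operator can use that ID to find the logged exception.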
### Performance
- Cache hot slices in Valkey (30-minute TTL; see `REACHGRAPH_SLICE_CACHE_TTL_MINUTES`)
- Compress stored blobs with gzip
- Paginate large results (50 nodes per page)
- Time out long-running queries (30 s maximum)
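
Two of these guidelines, pagination and gzip compression, are simple enough to sketch. The Python below assumes 1-based page numbers and a hypothetical helper API; the compression level of 6 matches the `compressionLevel` value in the YAML configuration, and `mtime=0` (Python 3.8+) keeps the gzip header free of wall-clock time so repeated compression of the same bytes yields the same blob.

```python
import gzip
from typing import Sequence, TypeVar

T = TypeVar("T")

PAGE_SIZE = 50  # nodes per page, per the guideline above


def paginate(items: Sequence[T], page: int, page_size: int = PAGE_SIZE) -> Sequence[T]:
    """Return the 1-based `page` of `items`; empty if past the end."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return items[start:start + page_size]


def compress_blob(canonical_json: bytes, level: int = 6) -> bytes:
    """Gzip a serialized graph with a fixed header timestamp."""
    return gzip.compress(canonical_json, compresslevel=level, mtime=0)


nodes = [f"node-{i}" for i in range(120)]
print(len(paginate(nodes, 1)), len(paginate(nodes, 3)))

blob = compress_blob(b'{"schemaVersion":"reachgraph.min@v1"}')
print(gzip.decompress(blob))
```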
---
## Integration Points
### Upstream (Data Producers)
| Module | Data | Integration |
|--------|------|-------------|
| Scanner.CallGraph | Call graph nodes/edges | `ICallGraphExtractor` produces input |
| Signals | Runtime facts | Correlates static + dynamic paths |
| Attestor | DSSE signing | `IReachGraphSignerService` delegates |
### Downstream (Data Consumers)
| Module | Usage | Integration |
|--------|-------|-------------|
| Policy | VEX decisions | `ReachabilityRequirementGate` queries slices |
| Web | UI panel | REST API for "Why Reachable?" |
| CLI | Proof export | `stella reachgraph` commands |
| ExportCenter | Batch reports | Includes subgraphs in evidence bundles |
---
## Testing Requirements
### Unit Tests
- `CanonicalSerializerTests.cs` - Deterministic serialization
- `DigestComputerTests.cs` - BLAKE3 hashing
- `EdgeExplanationTests.cs` - Type coverage
- `SliceEngineTests.cs` - Query correctness
### Integration Tests
- PostgreSQL with Testcontainers
- Valkey cache behavior
- Tenant isolation (RLS)
- Rate limiting enforcement
### Golden Samples
Located in `tests/ReachGraph/Fixtures/`:
- `simple-single-path.reachgraph.min.json`
- `multi-edge-java.reachgraph.min.json`
- `feature-flag-guards.reachgraph.min.json`
- `large-graph-50-nodes.reachgraph.min.json`
---
## Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `REACHGRAPH_POSTGRES_CONNECTION` | PostgreSQL connection string | - |
| `REACHGRAPH_VALKEY_CONNECTION` | Valkey connection string | - |
| `REACHGRAPH_CACHE_TTL_MINUTES` | Cache TTL for full graphs | 1440 |
| `REACHGRAPH_SLICE_CACHE_TTL_MINUTES` | Cache TTL for slices | 30 |
| `REACHGRAPH_MAX_GRAPH_SIZE_MB` | Max graph size in cache | 10 |
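
For illustration, the variables and defaults in the table could be loaded as shown in this Python sketch. The `ReachGraphSettings` type and `load_settings` function are hypothetical, and treating a missing connection string as `None` instead of failing at startup is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReachGraphSettings:
    postgres_connection: Optional[str]
    valkey_connection: Optional[str]
    cache_ttl_minutes: int
    slice_cache_ttl_minutes: int
    max_graph_size_mb: int


def load_settings(env: Mapping[str, str] = os.environ) -> ReachGraphSettings:
    """Read ReachGraph settings, applying the documented defaults."""
    return ReachGraphSettings(
        postgres_connection=env.get("REACHGRAPH_POSTGRES_CONNECTION"),
        valkey_connection=env.get("REACHGRAPH_VALKEY_CONNECTION"),
        cache_ttl_minutes=int(env.get("REACHGRAPH_CACHE_TTL_MINUTES", "1440")),
        slice_cache_ttl_minutes=int(env.get("REACHGRAPH_SLICE_CACHE_TTL_MINUTES", "30")),
        max_graph_size_mb=int(env.get("REACHGRAPH_MAX_GRAPH_SIZE_MB", "10")),
    )


defaults = load_settings({})
custom = load_settings({"REACHGRAPH_SLICE_CACHE_TTL_MINUTES": "5"})
print(defaults.cache_ttl_minutes, custom.slice_cache_ttl_minutes)
```

Passing the environment as a mapping keeps the loader easy to test without mutating `os.environ`.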
### YAML Configuration
```yaml
# etc/reachgraph.yaml
reachgraph:
  store:
    maxDepth: 10
    maxPaths: 5
    compressionLevel: 6
  cache:
    enabled: true
    ttlMinutes: 30
  replay:
    enabled: true
    logResults: true
```
---
## Observability
### Metrics
- `reachgraph_upsert_total` - Upsert count by result
- `reachgraph_query_duration_seconds` - Query latency histogram
- `reachgraph_cache_hit_ratio` - Cache hit rate
- `reachgraph_replay_match_total` - Replay verification results
- `reachgraph_slice_size_bytes` - Slice response sizes
### Logging
- Structured JSON logs
- Correlation ID in all entries
- Tenant context
- Query parameters (sanitized)
### Tracing
- OpenTelemetry spans for:
  - Upsert operations
  - Slice queries
  - Cache lookups
  - Replay verification
---
## Related Documentation
- `docs/implplan/SPRINT_1227_0012_0001_LB_reachgraph_core.md`
- `docs/implplan/SPRINT_1227_0012_0002_BE_reachgraph_store.md`
- `docs/implplan/SPRINT_1227_0012_0003_FE_reachgraph_integration.md`
- `src/Attestor/POE_PREDICATE_SPEC.md` (predecessor schema)
- `docs/modules/scanner/architecture.md`
- `docs/modules/signals/architecture.md`
---
_Module created: 2025-12-27. Owner: ReachGraph Guild._