# Graph Revision Schema
_Last updated: 2025-12-13. Owner: Platform Guild._
This document defines the graph revision schema addressing gaps GR1-GR10 from the November 2025 product findings. It specifies manifest structure, hash algorithms, storage layout, lineage tracking, and governance rules for deterministic, auditable reachability graphs.
---
## 1. Overview
Graph revisions provide content-addressable, append-only versioning for `richgraph-v1` documents. Every graph mutation produces a new immutable revision with:
- **Deterministic hash:** BLAKE3-256 of canonical JSON
- **Lineage metadata:** Parent revision + diff summary
- **Cross-artifact digests:** Links to SBOM, VEX, policy, and tool versions
- **Audit trail:** Timestamp, actor, tenant, and operation type
---
## 2. Gap Resolutions
### GR1: Manifest Schema + Canonical Hash Rules
**Manifest schema:**
```json
{
  "schema": "stellaops.graph.revision@v1",
  "revision_id": "rev:blake3:a1b2c3d4e5f6...",
  "graph_hash": "blake3:a1b2c3d4e5f6...",
  "parent_revision_id": "rev:blake3:9f8e7d6c5b4a...",
  "created_at": "2025-12-13T10:00:00Z",
  "created_by": "service:scanner",
  "tenant_id": "tenant:acme",
  "shard_id": "shard:01",
  "operation": "create",
  "lineage": {
    "depth": 3,
    "root_revision_id": "rev:blake3:1a2b3c4d5e6f..."
  },
  "cross_artifacts": {
    "sbom_digest": "sha256:...",
    "vex_digest": "sha256:...",
    "policy_digest": "sha256:...",
    "analyzer_digest": "sha256:..."
  },
  "diff_summary": {
    "nodes_added": 12,
    "nodes_removed": 3,
    "edges_added": 24,
    "edges_removed": 8,
    "roots_changed": false
  }
}
```
**Canonical hash rules:**
1. JSON keys sorted alphabetically at all nesting levels
2. No whitespace/indentation (compact JSON)
3. UTF-8 encoding, no BOM
4. Arrays sorted by deterministic key (nodes by `id`, edges by `from,to,kind`)
5. Null/empty values omitted
6. Numeric values without trailing zeros
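
The rules above can be sketched in Python as a normalization pass followed by compact serialization. This is a minimal illustration, not the reference implementation: the top-level `nodes` and `edges` keys, the reading of "empty" as `null`, `""`, `[]`, and `{}` (with `false` and `0` kept), and the helper names are all assumptions. Python's `sort_keys` orders keys by Unicode code point, which matches alphabetical order for ASCII keys.

```python
import json


def _normalize(value):
    """Recursively drop null/empty values and strip trailing zeros from numbers."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            norm = _normalize(item)
            # Assumption: "empty" means None, "", [] or {}; False and 0 are kept.
            if norm is None or norm == "" or norm == [] or norm == {}:
                continue
            out[key] = norm
        return out
    if isinstance(value, list):
        return [_normalize(item) for item in value]
    if isinstance(value, float) and value.is_integer():
        return int(value)  # 2.0 serializes as 2
    return value


def canonical_json(graph: dict) -> bytes:
    """Return canonical UTF-8 bytes (no BOM) for a graph document."""
    doc = _normalize(graph)
    # Assumption: nodes and edges live under top-level "nodes" / "edges" keys.
    if "nodes" in doc:
        doc["nodes"] = sorted(doc["nodes"], key=lambda n: n["id"])
    if "edges" in doc:
        doc["edges"] = sorted(
            doc["edges"],
            key=lambda e: (e["from"], e["to"], e.get("kind", "")),
        )
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```

The returned bytes would then be passed to a BLAKE3 implementation to produce `graph_hash`. BLAKE3 is not part of the Python standard library, so that step is left out of the sketch.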
### GR2: Mandated BLAKE3-256 Encoding
All graph-level hashes use BLAKE3-256 with the following format:
```
blake3:{64_hex_chars}
```
Example:
```
blake3:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2
```
**Rationale:**
- BLAKE3 is 3x+ faster than SHA-256 on modern CPUs
- Parallelizable for large graphs (>100K nodes)
- Cryptographically secure (256-bit digest; BLAKE3 targets a 128-bit security level)
- Algorithm prefix enables future migration
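
A consumer might validate the format and read the algorithm prefix roughly as follows. The sketch assumes lowercase hex only, as in the example above; the function names are illustrative.

```python
import re

# Assumption: lowercase hex only, matching the example in this section.
_GRAPH_HASH = re.compile(r"blake3:[0-9a-f]{64}")


def is_valid_graph_hash(value: str) -> bool:
    """Check that a value has the exact form blake3:{64_hex_chars}."""
    return _GRAPH_HASH.fullmatch(value) is not None


def split_digest(value: str) -> tuple[str, str]:
    """Split 'algorithm:hex' so callers can dispatch on the algorithm prefix."""
    algorithm, _, hex_digest = value.partition(":")
    return algorithm, hex_digest
```

Dispatching on the prefix returned by `split_digest` is one way a verifier could keep accepting old hashes if a new algorithm is introduced later.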
### GR3: Append-Only Storage
Graph revisions are immutable. Operations:
| Operation | Creates New Revision | Modifies Existing |
|-----------|---------------------|-------------------|
| `create` | Yes | No |
| `update` | Yes | No |
| `merge` | Yes | No |
| `tombstone` | Yes | No |
| `read` | No | No |
**Storage layout:**
```
cas://reachability/
  revisions/
    {blake3}/                     # Revision manifest
    {blake3}.graph                # Graph body
    {blake3}.dsse                 # DSSE envelope
  indices/
    by-tenant/{tenant_id}/        # Tenant index
    by-sbom/{sbom_digest}/        # SBOM correlation
    by-root/{root_revision_id}/   # Lineage tree
```
```
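
The append-only guarantee can be illustrated with a small in-memory stand-in for the CAS. The class and method names are hypothetical, and treating a repeated write of identical bytes as a no-op is an assumption rather than documented behavior.

```python
class AppendOnlyRevisionStore:
    """In-memory stand-in for the CAS; keys mirror the layout above."""

    def __init__(self):
        self._objects = {}

    @staticmethod
    def manifest_key(digest_hex: str) -> str:
        """Build the manifest key for a revision digest."""
        return f"cas://reachability/revisions/{digest_hex}/"

    def put(self, key: str, payload: bytes) -> None:
        """Store a new object; never overwrite existing content."""
        if key in self._objects:
            if self._objects[key] != payload:
                raise ValueError(f"refusing to modify existing object: {key}")
            return  # Assumption: re-writing identical bytes is a harmless no-op.
        self._objects[key] = payload

    def get(self, key: str) -> bytes:
        """Read an object; reads never create or change revisions."""
        return self._objects[key]
```

Under this model, `update`, `merge`, and `tombstone` each call `put` with a new key, which is why the table lists them as creating a revision without modifying an existing one.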
### GR4: Lineage/Diff Metadata
Every revision tracks its lineage:
```json
{
  "lineage": {
    "depth": 5,
    "root_revision_id": "rev:blake3:...",
    "parent_revision_id": "rev:blake3:...",
    "merge_parents": []
  },
  "diff_summary": {
    "nodes_added": 12,
    "nodes_removed": 3,
    "nodes_modified": 0,
    "edges_added": 24,
    "edges_removed": 8,
    "edges_modified": 0,
    "roots_added": 0,
    "roots_removed": 0
  },
  "diff_detail_uri": "cas://reachability/diffs/{parent_hash}_{child_hash}.ndjson"
}
```
**Diff detail format (NDJSON):**
```ndjson
{"op":"add","path":"nodes","value":{"id":"sym:java:...","display":"..."}}
{"op":"remove","path":"edges","from":"sym:java:a","to":"sym:java:b"}
```
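
As a rough sketch, the add/remove counters in `diff_summary` could be derived from the NDJSON detail by counting records per `path` and `op`. Only the two operations shown above are handled; how modified nodes, modified edges, or root changes are encoded is not specified here, so those counters are omitted. The function name is illustrative.

```python
import json
from collections import Counter


def summarize_diff(ndjson_text: str) -> dict:
    """Count add/remove records per path in an NDJSON diff detail."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        counts[(record["path"], record["op"])] += 1
    return {
        "nodes_added": counts[("nodes", "add")],
        "nodes_removed": counts[("nodes", "remove")],
        "edges_added": counts[("edges", "add")],
        "edges_removed": counts[("edges", "remove")],
    }
```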
### GR5: Cross-Artifact Digests (SBOM/VEX/Policy/Tool)
Every revision links to related artifacts:
```json
{
  "cross_artifacts": {
    "sbom_digest": "sha256:...",
    "sbom_uri": "cas://scanner-artifacts/sbom.cdx.json",
    "sbom_format": "cyclonedx-1.6",
    "vex_digest": "sha256:...",
    "vex_uri": "cas://excititor/vex/openvex.json",
    "policy_digest": "sha256:...",
    "policy_version": "P-7:v4",
    "analyzer_digest": "sha256:...",
    "analyzer_name": "scanner.java",
    "analyzer_version": "1.2.0"
  }
}
```
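
The `sha256:` digests use the same `algorithm:hex` shape as the graph hash. A minimal sketch of producing one, assuming the digest is taken over the artifact's raw bytes as stored (this document does not state the exact input):

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a prefixed SHA-256 digest for an artifact's raw bytes."""
    # Assumption: the digest covers the stored bytes, with no re-canonicalization.
    return "sha256:" + hashlib.sha256(data).hexdigest()
```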
### GR6: UI/CLI Surfacing of Full/Short IDs
**Full ID format:**
```
rev:blake3:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2
```
**Short ID format (for display):**
```
rev:a1b2c3d4
```
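
Judging by the two examples, the short form drops the algorithm segment and keeps the first eight hex characters. A sketch under that assumption (the function name and the `length` parameter are illustrative):

```python
def short_revision_id(full_id: str, length: int = 8) -> str:
    """Derive the display form 'rev:<hex prefix>' from a full revision ID."""
    parts = full_id.split(":", 2)
    if len(parts) != 3 or parts[0] != "rev" or not parts[2]:
        raise ValueError(f"not a full revision ID: {full_id!r}")
    # parts[1] is the algorithm (e.g. "blake3"); it is omitted from the short form.
    return f"rev:{parts[2][:length]}"
```

An eight-character prefix can collide across a large revision set, so the short ID is best treated as a display aid and the full ID as the identifier of record.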
**CLI commands:**
```bash
# List revisions
stella graph revisions --scan-id scan-123

# Show full ID
stella graph revisions --scan-id scan-123 --full

# Output:
# REVISION      CREATED              NODES  EDGES  PARENT
# rev:a1b2c3d4  2025-12-13T10:00:00  1247   3891   rev:9f8e7d6c
# rev:9f8e7d6c  2025-12-12T15:30:00  1235   3867   rev:1a2b3c4d
```
**UI display:**
- Revision chips show short ID with copy-to-clipboard for full ID
- Hover tooltip shows full ID and creation timestamp
- Lineage tree visualization available in "Revision History" drawer
### GR7: Shard/Tenant Context
Every revision includes partition context:
```json
{
  "tenant_id": "tenant:acme",
  "shard_id": "shard:01",
  "namespace": "prod",
  "workspace_id": "ws:default"
}
```
**Tenant isolation:**
- Revisions are tenant-scoped; cross-tenant access requires explicit grants
- Shard ID enables horizontal scaling and data locality
- Namespace supports multi-environment deployments
### GR8: Pin/Audit Governance
**Pinned revisions:**
Revisions can be pinned to prevent automatic retention cleanup:
```json
{
  "pinned": true,
  "pinned_at": "2025-12-13T10:00:00Z",
  "pinned_by": "user:alice",
  "pin_reason": "Audit retention for CVE-2021-44228 investigation",
  "pin_expires_at": "2026-12-13T10:00:00Z"
}
```
**Audit events:**
All revision operations emit audit events:
```json
{
  "event_type": "graph.revision.created",
  "revision_id": "rev:blake3:...",
  "actor": "service:scanner",
  "tenant_id": "tenant:acme",
  "timestamp": "2025-12-13T10:00:00Z",
  "metadata": {
    "operation": "create",
    "parent_revision_id": "rev:blake3:...",
    "graph_hash": "blake3:..."
  }
}
```
### GR9: Retention/Tombstones
**Retention policy:**
| Category | Default Retention | Configurable |
|----------|-------------------|--------------|
| Latest revision | Forever | No |
| Intermediate revisions | 90 days | Yes |
| Tombstoned revisions | 30 days | Yes |
| Pinned revisions | Until unpin + 7 days | No |
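
One way to read the table as a cleanup decision is sketched below, using the default values. The `is_latest` and `unpinned_at` fields are hypothetical, timestamps are assumed to be parsed into timezone-aware `datetime` objects, and "Until unpin + 7 days" is interpreted as a seven-day grace period after which the normal category rule applies.

```python
from datetime import datetime, timedelta, timezone

INTERMEDIATE_RETENTION = timedelta(days=90)  # default, configurable
TOMBSTONE_RETENTION = timedelta(days=30)     # default, configurable
UNPIN_GRACE = timedelta(days=7)              # fixed


def eligible_for_cleanup(rev: dict, now: datetime) -> bool:
    """Decide whether the retention worker may remove a revision."""
    if rev.get("is_latest"):  # hypothetical flag: latest revision is kept forever
        return False
    if rev.get("pinned"):
        return False
    unpinned_at = rev.get("unpinned_at")  # hypothetical field set on unpin
    if unpinned_at is not None and now < unpinned_at + UNPIN_GRACE:
        return False
    if rev.get("tombstone"):
        return now >= rev["tombstoned_at"] + TOMBSTONE_RETENTION
    return now >= rev["created_at"] + INTERMEDIATE_RETENTION
```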
**Tombstone format:**
```json
{
  "schema": "stellaops.graph.revision@v1",
  "revision_id": "rev:blake3:...",
  "tombstone": true,
  "tombstoned_at": "2025-12-13T10:00:00Z",
  "tombstoned_by": "service:retention-worker",
  "tombstone_reason": "retention_policy",
  "successor_revision_id": "rev:blake3:..."
}
```
### GR10: Inclusion in Offline Kits
Offline kits include graph revisions for air-gapped deployments:
**Offline bundle manifest:**
```json
{
  "schema": "stellaops.offline.bundle@v1",
  "bundle_id": "bundle:2025-12-13",
  "graph_revisions": [
    {
      "revision_id": "rev:blake3:...",
      "graph_hash": "blake3:...",
      "included_artifacts": ["graph", "dsse", "diff"]
    }
  ],
  "rekor_checkpoints": [
    {
      "log_id": "rekor.sigstore.dev",
      "checkpoint": "...",
      "verified_at": "2025-12-13T10:00:00Z"
    }
  ],
  "signature": {
    "algorithm": "ecdsa-p256",
    "value": "base64:...",
    "public_key_id": "key:offline-signing-2025"
  }
}
```
**Import verification:**
```bash
stella offline import --bundle ./offline-bundle.tgz --verify

# Output:
# Bundle: bundle:2025-12-13
# Graph Revisions: 5
# Rekor Checkpoints: 2
# Verifying signatures...
# Bundle signature: VALID
# DSSE envelopes: 5/5 VALID
# Rekor checkpoints: 2/2 VERIFIED
# Import complete.
```
---
## 3. API Reference
### 3.1 Create Revision
```http
POST /api/graph/revisions
Authorization: Bearer <token>
Content-Type: application/json

{
  "graph": { ... richgraph-v1 ... },
  "parent_revision_id": "rev:blake3:...",
  "cross_artifacts": { ... }
}
```
### 3.2 Get Revision
```http
GET /api/graph/revisions/{revision_id}
Authorization: Bearer <token>
```
### 3.3 List Revisions
```http
GET /api/graph/revisions?tenant_id=acme&sbom_digest=sha256:...&limit=20
Authorization: Bearer <token>
```
### 3.4 Diff Revisions
```http
GET /api/graph/revisions/diff?from={rev_a}&to={rev_b}
Authorization: Bearer <token>
```
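
A client could assemble the diff request with the Python standard library as sketched below. The base URL and function name are placeholders; only the path, query parameters, and `Authorization` header come from the reference above. Revision IDs contain colons, so they are percent-encoded in the query string.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_diff_request(base_url: str, token: str, from_rev: str, to_rev: str) -> Request:
    """Build (but do not send) a GET request for the revision diff endpoint."""
    query = urlencode({"from": from_rev, "to": to_rev})
    return Request(
        f"{base_url}/api/graph/revisions/diff?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The response format is not specified in this section, so parsing is left out.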
---
## 4. Related Documentation
- [richgraph-v1 Contract](../contracts/richgraph-v1.md) - Graph schema specification
- [Function-Level Evidence](./function-level-evidence.md) - Evidence chain guide
- [CAS Infrastructure](../contracts/cas-infrastructure.md) - Content-addressable storage
- [Offline Kit](../24_OFFLINE_KIT.md) - Air-gap deployment
---
_Last updated: 2025-12-13. See Sprint 0401 GRAPHREV-GAPS-401-063 for change history._