git.stella-ops.org/samples/graph/fixtures-plan.md
StellaOps Bot 37cba83708
2025-12-03 00:10:19 +02:00


# Graph Fixtures Plan (SAMPLES-GRAPH-24-003)
## Goals
- Produce a deterministic large-scale SBOM graph fixture (~40k nodes) with policy overlay snapshot for perf/regression suites (UI/CLI/Graph API).
- Align with current graph node/edge schema and overlay format used by `StellaOps.Graph.Indexer` and Vulnerability Explorer.
- Ensure offline parity: fixtures packaged for Offline Kit consumption (NDJSON + manifest hashes).
## Assumptions / Pending confirmations
- Overlay format (resolved): `policy.overlay.v1`. Each record carries `overlay_id = sha256(tenant|nodeId|overlayKind)`, a verdict and severity, and an optional edge to a policy-rule node for bench compatibility.
- SBOM bundle source: scanner surface mock bundle v1; swap in the real cache once approved, keeping the schema unchanged.
- Tenant: `demo-tenant`; timestamps frozen to `2025-11-22T00:00:00Z`.
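
The `overlay_id` formula above can be sketched as follows. This is a minimal illustration, not the indexer's code: it assumes the three parts are joined with a literal `|`, encoded as UTF-8, and that the digest is rendered as lowercase hex. The record field names, the node ID, and the overlay kind value are hypothetical.

```python
import hashlib
import json


def overlay_id(tenant: str, node_id: str, overlay_kind: str) -> str:
    """Hex SHA-256 over 'tenant|nodeId|overlayKind'.

    Assumption: literal '|' separator, UTF-8 bytes, lowercase hex output.
    """
    material = "|".join((tenant, node_id, overlay_kind))
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def overlay_record(tenant: str, node_id: str, overlay_kind: str,
                   verdict: str, severity: str) -> dict:
    """Build one overlay record; every field name here is hypothetical."""
    return {
        "schema": "policy.overlay.v1",
        "overlay_id": overlay_id(tenant, node_id, overlay_kind),
        "tenant": tenant,
        "node_id": node_id,
        "overlay_kind": overlay_kind,
        "verdict": verdict,
        "severity": severity,
    }


if __name__ == "__main__":
    record = overlay_record(
        "demo-tenant",
        "pkg:pypi/demo-1@1.0.0",   # hypothetical node ID
        "policy.evaluation",       # hypothetical overlay kind
        verdict="blocked",
        severity="high",
    )
    # Sorted keys and compact separators keep the NDJSON line byte-stable.
    print(json.dumps(record, sort_keys=True, separators=(",", ":")))
```

Because the ID depends only on the tenant, node, and overlay kind, regenerating the fixture with the same inputs yields the same IDs regardless of verdict or severity.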
## Canonical fixture (delivered 2025-12-02)
- Location: `samples/graph/graph-40k/`
  - `nodes.ndjson`: 40,000 component nodes (`pkg:pypi/demo-*`)
  - `edges.ndjson`: 100,071 `DEPENDS_ON` edges (fan-out ≤4, DAG order)
  - `overlay.ndjson`: 100 `policy.overlay.v1` records (verdict/severity + optional policy-rule edge)
  - `manifest.json`: hashes (SHA-256) and counts (nodes `d14e8c64…`, edges `143a2944…`, overlay `627a0d8c…`)
  - `README.md` and `verify.py`: usage, hashes, offline verification
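
For readers without the repository at hand, offline verification against `manifest.json` might look roughly like the sketch below. It is not the shipped `verify.py`: the manifest layout (`files` mapping each file name to `sha256` and `count`) and the function names are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_and_count(path: Path) -> tuple[str, int]:
    """Return (hex SHA-256 of the raw bytes, number of non-empty lines)."""
    digest = hashlib.sha256()
    count = 0
    with path.open("rb") as handle:
        for line in handle:          # binary iteration keeps the bytes exact
            digest.update(line)
            if line.strip():
                count += 1
    return digest.hexdigest(), count


def verify_fixture(fixture_dir: Path) -> list[str]:
    """Compare each listed file with the manifest; return mismatch messages."""
    manifest = json.loads((fixture_dir / "manifest.json").read_text(encoding="utf-8"))
    problems = []
    # Hypothetical layout:
    # {"files": {"nodes.ndjson": {"sha256": "...", "count": 40000}, ...}}
    for name, expected in sorted(manifest["files"].items()):
        actual_hash, actual_count = sha256_and_count(fixture_dir / name)
        if actual_hash != expected["sha256"]:
            problems.append(f"{name}: sha256 mismatch")
        if actual_count != expected["count"]:
            problems.append(
                f"{name}: expected {expected['count']} records, found {actual_count}"
            )
    return problems


if __name__ == "__main__":
    issues = verify_fixture(Path("samples/graph/graph-40k"))
    print("OK" if not issues else "\n".join(issues))
```

Hashing raw bytes rather than parsed JSON means any change in key order or whitespace is detected, which is the property a byte-stable fixture needs.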
## Generation sketch (implemented)
1) Deterministic generator `samples/graph/scripts/generate_canonical.py` (seed `424242`, snapshot `graph-40k-policy-overlay-20251122`).
2) Writes nodes/edges/overlay with sorted keys, then manifest with hashes/counts.
3) `verify.py` recomputes hashes/counts to confirm reproducibility.
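
The three steps above can be illustrated with a small, self-contained sketch. It is not `generate_canonical.py` itself: the record fields, the edge-selection rule, and the manifest shape are assumptions chosen to show the technique (a seeded local RNG, edges that only point to lower-indexed nodes so the graph stays acyclic, and sorted-key serialization).

```python
import hashlib
import json
import random


def generate(node_count: int, seed: int = 424242,
             tenant: str = "demo-tenant", max_fan_out: int = 4):
    """Build nodes and DEPENDS_ON edges deterministically from a seed."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    nodes = [
        {"id": f"pkg:pypi/demo-{i}@1.0.0", "kind": "component",
         "name": f"demo-{i}", "version": "1.0.0", "tenant": tenant}
        for i in range(node_count)
    ]
    edges = []
    for i in range(1, node_count):
        fan_out = rng.randint(0, min(max_fan_out, i))
        # Targets always have a lower index than the source, so no cycles.
        for j in sorted(rng.sample(range(i), fan_out)):
            edges.append({"kind": "DEPENDS_ON", "source": nodes[i]["id"],
                          "target": nodes[j]["id"], "tenant": tenant})
    return nodes, edges


def to_ndjson(records) -> bytes:
    """One compact JSON object per line, keys sorted for byte stability."""
    return "".join(
        json.dumps(r, sort_keys=True, separators=(",", ":")) + "\n"
        for r in records
    ).encode("utf-8")


def build_manifest(nodes, edges) -> dict:
    """Hashes and counts per file; this manifest shape is hypothetical."""
    return {
        name: {"sha256": hashlib.sha256(to_ndjson(recs)).hexdigest(),
               "count": len(recs)}
        for name, recs in (("nodes", nodes), ("edges", edges))
    }


if __name__ == "__main__":
    nodes, edges = generate(1000)
    print(json.dumps(build_manifest(nodes, edges), indent=2, sort_keys=True))
```

Running the generator twice with the same seed and comparing the two manifests is a quick reproducibility check before committing regenerated fixtures.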
## Interim fixtures (still available, delivered 2025-12-01)
- Synthetic deterministic graphs under `samples/graph/interim/`:
  - `graph-50k` (50k nodes, ~200k edges)
  - `graph-100k` (100k nodes, ~400k edges)
- Minimal schema (`id, kind, name, version, tenant`), seeded RNG, stable ordering, manifests with hashes.
- Purpose: throughput/latency benches; overlay-free.
## Open items
- Regenerate if Graph overlay schema changes; update manifest/hashes and downstream references.
- Consider adding advisory/VEX nodes once Graph/Concelier schema freeze lands; currently component-focused.
## Next steps
- Wire `graph-40k` into BENCH-GRAPH-21-001/002 results and UI fixtures (SAMPLES-GRAPH-24-004).
- Add CAS/DSSE manifest once Offline Kit package format is finalized.