Files
git.stella-ops.org/docs/modules/graph/analytics/GA1-GA10-analytics-plan.md
StellaOps Bot 37cba83708
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Concelier Attestation Tests / attestation-tests (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Export Center CI / export-ci (push) Has been cancelled
devportal-offline / build-offline (push) Has been cancelled
up
2025-12-03 00:10:19 +02:00

40 lines
3.3 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Graph Analytics Gaps (GA1GA10) Remediation Plan
**Sprint:** 0207-0001-0001 (Experience & SDKs 180.C)
**Artifacts produced:** schemas + samples for analytics results/bundles; governance rules; test/fixture expectations.
## Objectives (mapped to GA1GA10)
- **GA1 — Versioned analytics schemas:** `analytics-result.schema.json` defines versioned result payloads with `schemaVersion` + `algorithmVersion`.
- **GA2 — Deterministic seeds/rerun-hash CI:** every job records `seed`, `rerunHash = sha256(inputs+seed+algorithmVersion)`, and must replay to identical outputs.
- **GA3 — Privacy/tenant redaction:** results require `tenant` field; redaction rules apply before export (`redactions[]` logged).
- **GA4 — Baseline datasets/fixtures:** ship minimal deterministic fixture set under `src/Graph/__Tests/Fixtures/analytics-baseline/` (TODO when code added) and sample bundle here.
- **GA5 — Performance budgets/quotas:** default budgets captured in schema (`budgetSeconds`, `maxNodes`, `maxEdges`); jobs failing budgets emit `status=budget_exceeded`.
- **GA6 — Explainability metadata:** include `inputs`, `seed`, `algorithmVersion`, `parameters`, `provenance` (source hashes) for replay.
- **GA7 — Checksums + DSSE for exports:** bundle schema carries per-file SHA-256 plus optional DSSE signature envelope reference.
- **GA8 — Algorithm versioning:** `algorithmVersion` semver and `changeLogUrl` required; breaking changes bump MAJOR.
- **GA9 — Offline analytics bundle schema:** `analytics-bundle.schema.json` documents offline package with manifest, dataset hashes, redactions, and optional signatures.
- **GA10 — SemVer/change-log governance:** bundles must cite `changeLogUrl`; release notes must link to signed manifests; exports failing SemVer gating are rejected.
## Schemas & Samples
- `docs/modules/graph/analytics/analytics-result.schema.json`
- `docs/modules/graph/analytics/analytics-bundle.schema.json`
- Sample bundle: `docs/modules/graph/analytics/samples/analytics-bundle.sample.json`
## Rules of Engagement
1. **Determinism:** fixed `seed`; stable ordering of nodes/edges; `rerunHash` must match across runs given same inputs/seed.
2. **Redaction before export:** `redactions[]` enumerates removed fields per tenant policy; exports lacking redaction entries are invalid for multi-tenant bundles.
3. **Signatures (optional but encouraged):** DSSE/JWS envelopes over `bundle.manifest` and `resultHash` using offline keys; record under `signatures[]`.
4. **Offline readiness:** no network fetch during analysis or validation; datasets referenced by hash + relative path.
5. **Performance budgets:** defaults—`budgetSeconds: 30`, `maxNodes: 50000`, `maxEdges: 200000`; overridable per job but must be logged.
## Implementation Hooks
- API/Indexer must emit analytics results conforming to `analytics-result.schema.json`.
- Export jobs must validate bundles against `analytics-bundle.schema.json` and attach DSSE refs when available.
- CI: add rerun-hash check in analytics test pipeline using fixture bundle; fail on drift.
## Open Follow-ups
- Add real fixtures under `src/Graph/__Tests/Fixtures/analytics-baseline/` mirrored in Offline Kit.
- Wire DSSE signing in release pipeline once signing keys for Graph are provisioned.
## Evidence
- Schemas + sample committed in this sprint. Link in sprint Decisions & Risks. Tests to follow in analytics pipeline PR.***