# Graph Analytics Gaps (GA1–GA10) Remediation Plan **Sprint:** 0207-0001-0001 (Experience & SDKs 180.C) **Artifacts produced:** schemas + samples for analytics results/bundles; governance rules; test/fixture expectations. ## Objectives (mapped to GA1–GA10) - **GA1 — Versioned analytics schemas:** `analytics-result.schema.json` defines versioned result payloads with `schemaVersion` + `algorithmVersion`. - **GA2 — Deterministic seeds/rerun-hash CI:** every job records `seed`, `rerunHash = sha256(inputs+seed+algorithmVersion)`, and must replay to identical outputs. - **GA3 — Privacy/tenant redaction:** results require `tenant` field; redaction rules apply before export (`redactions[]` logged). - **GA4 — Baseline datasets/fixtures:** ship minimal deterministic fixture set under `src/Graph/__Tests/Fixtures/analytics-baseline/` (TODO when code added) and sample bundle here. - **GA5 — Performance budgets/quotas:** default budgets captured in schema (`budgetSeconds`, `maxNodes`, `maxEdges`); jobs failing budgets emit `status=budget_exceeded`. - **GA6 — Explainability metadata:** include `inputs`, `seed`, `algorithmVersion`, `parameters`, `provenance` (source hashes) for replay. - **GA7 — Checksums + DSSE for exports:** bundle schema carries per-file SHA-256 plus optional DSSE signature envelope reference. - **GA8 — Algorithm versioning:** `algorithmVersion` semver and `changeLogUrl` required; breaking changes bump MAJOR. - **GA9 — Offline analytics bundle schema:** `analytics-bundle.schema.json` documents offline package with manifest, dataset hashes, redactions, and optional signatures. - **GA10 — SemVer/change-log governance:** bundles must cite `changeLogUrl`; release notes must link to signed manifests; exports failing SemVer gating are rejected. ## Schemas & Samples - `docs/modules/graph/analytics/analytics-result.schema.json` - `docs/modules/graph/analytics/analytics-bundle.schema.json` - Sample bundle: `docs/modules/graph/analytics/samples/analytics-bundle.sample.json` ## Rules of Engagement 1. **Determinism:** fixed `seed`; stable ordering of nodes/edges; `rerunHash` must match across runs given same inputs/seed. 2. **Redaction before export:** `redactions[]` enumerates removed fields per tenant policy; exports lacking redaction entries are invalid for multi-tenant bundles. 3. **Signatures (optional but encouraged):** DSSE/JWS envelopes over `bundle.manifest` and `resultHash` using offline keys; record under `signatures[]`. 4. **Offline readiness:** no network fetch during analysis or validation; datasets referenced by hash + relative path. 5. **Performance budgets:** defaults—`budgetSeconds: 30`, `maxNodes: 50000`, `maxEdges: 200000`; overridable per job but must be logged. ## Implementation Hooks - API/Indexer must emit analytics results conforming to `analytics-result.schema.json`. - Export jobs must validate bundles against `analytics-bundle.schema.json` and attach DSSE refs when available. - CI: add rerun-hash check in analytics test pipeline using fixture bundle; fail on drift. ## Open Follow-ups - Add real fixtures under `src/Graph/__Tests/Fixtures/analytics-baseline/` mirrored in Offline Kit. - Wire DSSE signing in release pipeline once signing keys for Graph are provisioned. ## Evidence - Schemas + sample committed in this sprint. Link in sprint Decisions & Risks. Tests to follow in analytics pipeline PR.***