Graph Analytics Gaps (GA1–GA10) Remediation Plan

Sprint: 0207-0001-0001 (Experience & SDKs 180.C)
Artifacts produced: schemas + samples for analytics results/bundles; governance rules; test/fixture expectations.

Objectives (mapped to GA1–GA10)

  • GA1 — Versioned analytics schemas: analytics-result.schema.json defines versioned result payloads with schemaVersion + algorithmVersion.
  • GA2 — Deterministic seeds/rerun-hash CI: every job records seed, rerunHash = sha256(inputs+seed+algorithmVersion), and must replay to identical outputs.
  • GA3 — Privacy/tenant redaction: results require tenant field; redaction rules apply before export (redactions[] logged).
  • GA4 — Baseline datasets/fixtures: ship minimal deterministic fixture set under src/Graph/__Tests/Fixtures/analytics-baseline/ (TODO when code added) and sample bundle here.
  • GA5 — Performance budgets/quotas: default budgets captured in schema (budgetSeconds, maxNodes, maxEdges); jobs failing budgets emit status=budget_exceeded.
  • GA6 — Explainability metadata: include inputs, seed, algorithmVersion, parameters, provenance (source hashes) for replay.
  • GA7 — Checksums + DSSE for exports: bundle schema carries per-file SHA-256 plus optional DSSE signature envelope reference.
  • GA8 — Algorithm versioning: algorithmVersion semver and changeLogUrl required; breaking changes bump MAJOR.
  • GA9 — Offline analytics bundle schema: analytics-bundle.schema.json documents offline package with manifest, dataset hashes, redactions, and optional signatures.
  • GA10 — SemVer/change-log governance: bundles must cite changeLogUrl; release notes must link to signed manifests; exports failing SemVer gating are rejected.
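The GA2 formula, rerunHash = sha256(inputs+seed+algorithmVersion), leaves the byte-level encoding open. A minimal sketch, assuming the three parts are combined as canonical JSON (sorted keys, fixed separators) before hashing; the function name `rerun_hash` and the encoding are illustrative assumptions, not part of the schemas:

```python
import hashlib
import json


def rerun_hash(inputs: dict, seed: int, algorithm_version: str) -> str:
    """Return a hex SHA-256 over inputs, seed, and algorithmVersion.

    Assumption: the parts are serialized as canonical JSON. The plan only
    states sha256(inputs+seed+algorithmVersion) and does not fix an encoding.
    """
    payload = json.dumps(
        {"inputs": inputs, "seed": seed, "algorithmVersion": algorithm_version},
        sort_keys=True,          # key order must not influence the hash
        separators=(",", ":"),   # no whitespace variation between writers
        ensure_ascii=True,       # stable bytes regardless of locale
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(rerun_hash({"graphSnapshot": "sha256:abc", "topK": 10}, 42, "1.2.0"))
```

Whatever encoding is chosen, it has to be pinned alongside the schema version; otherwise two implementations can hash the same job to different values.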

Schemas & Samples

  • docs/modules/graph/analytics/analytics-result.schema.json
  • docs/modules/graph/analytics/analytics-bundle.schema.json
  • Sample bundle: docs/modules/graph/analytics/samples/analytics-bundle.sample.json
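The bundle schema is described as carrying per-file SHA-256 checksums, with datasets referenced by hash and relative path. The sketch below shows how an offline consumer might verify such entries. The entry shape (`path`, `sha256`) and the function names are assumptions made for illustration; the actual field names are defined by analytics-bundle.schema.json.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle_files(bundle_root: Path, files: list[dict]) -> list[str]:
    """Return the relative paths that are missing, escape the bundle, or mismatch.

    Assumption: each entry looks like {"path": "<relative path>", "sha256": "<hex>"}.
    No network access is needed; everything is resolved under bundle_root.
    """
    root = bundle_root.resolve()
    failures = []
    for entry in files:
        target = (root / entry["path"]).resolve()
        inside_bundle = root in target.parents  # reject "../" escapes
        if (
            not inside_bundle
            or not target.is_file()
            or sha256_file(target) != entry["sha256"]
        ):
            failures.append(entry["path"])
    return failures
```

An empty return value means every listed file is present and matches its recorded hash. A DSSE signature over the manifest would be checked separately and is not covered here.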

Rules of Engagement

  1. Determinism: fixed seed; stable ordering of nodes/edges; rerunHash must match across runs given same inputs/seed.
  2. Redaction before export: redactions[] enumerates removed fields per tenant policy; exports lacking redaction entries are invalid for multi-tenant bundles.
  3. Signatures (optional but encouraged): DSSE/JWS envelopes over bundle.manifest and resultHash using offline keys; record under signatures[].
  4. Offline readiness: no network fetch during analysis or validation; datasets referenced by hash + relative path.
  5. Performance budgets: defaults—budgetSeconds: 30, maxNodes: 50000, maxEdges: 200000; overridable per job but must be logged.
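As a rough illustration of rule 5, the default budgets can be modelled as an immutable value with per-job overrides. The class and function names, and the `"ok"` status string, are assumptions of this sketch; only `budget_exceeded` and the three default values come from the plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    # Defaults taken from rule 5; a job may override them, and the override
    # must be logged (logging is left out of this sketch).
    budget_seconds: float = 30.0
    max_nodes: int = 50_000
    max_edges: int = 200_000


def job_status(budget: Budget, elapsed_seconds: float,
               node_count: int, edge_count: int) -> str:
    """Return "budget_exceeded" if any limit is strictly exceeded, else "ok".

    Assumption: limits are inclusive, so a job exactly at a limit still passes.
    """
    exceeded = (
        elapsed_seconds > budget.budget_seconds
        or node_count > budget.max_nodes
        or edge_count > budget.max_edges
    )
    return "budget_exceeded" if exceeded else "ok"


if __name__ == "__main__":
    print(job_status(Budget(), 12.5, 40_000, 150_000))        # within defaults
    print(job_status(Budget(max_nodes=1_000), 1.0, 1_500, 10))  # override exceeded
```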

Implementation Hooks

  • API/Indexer must emit analytics results conforming to analytics-result.schema.json.
  • Export jobs must validate bundles against analytics-bundle.schema.json and attach DSSE refs when available.
  • CI: add rerun-hash check in analytics test pipeline using fixture bundle; fail on drift.
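The CI drift check could look roughly like the following. `run_job` is a toy stand-in for a real analytics job (a degree ranking with a seeded tie-break), and all names here are hypothetical. The sketch shows the pattern only: a job-local seeded RNG, stable ordering of nodes and edges, a canonical hash of the result, and a hard failure when the hash differs from the recorded one.

```python
import hashlib
import json
import random


def run_job(edges, seed):
    """Toy analytics job: rank nodes by degree, breaking ties with a seeded RNG."""
    rng = random.Random(seed)          # job-local RNG; never the global one
    degree = {}
    for src, dst in sorted(edges):     # stable edge ordering
        degree[src] = degree.get(src, 0) + 1
        degree[dst] = degree.get(dst, 0) + 1
    nodes = sorted(degree)             # stable node ordering
    tiebreak = {node: rng.random() for node in nodes}
    ranking = sorted(nodes, key=lambda n: (-degree[n], tiebreak[n]))
    return {"seed": seed, "ranking": ranking}


def result_hash(result) -> str:
    """Hash a canonical JSON encoding of the result (encoding is an assumption)."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_no_drift(edges, seed, expected_hash) -> None:
    """Replay the job and raise if the output hash differs from the recorded one."""
    actual = result_hash(run_job(edges, seed))
    if actual != expected_hash:
        raise AssertionError(
            f"rerun drift: expected {expected_hash}, got {actual}"
        )


if __name__ == "__main__":
    fixture = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
    recorded = result_hash(run_job(fixture, seed=7))
    # Input order must not matter once edges are sorted inside the job.
    check_no_drift(list(reversed(fixture)), seed=7, expected_hash=recorded)
    print("no drift")
```

In a real pipeline the recorded hash would come from the fixture bundle rather than being computed in the same process.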

Open Follow-ups

  • Add real fixtures under src/Graph/__Tests/Fixtures/analytics-baseline/ mirrored in Offline Kit.
  • Wire DSSE signing in release pipeline once signing keys for Graph are provisioned.

Evidence

  • Schemas + sample committed in this sprint. Link in sprint Decisions & Risks. Tests to follow in analytics pipeline PR.