git.stella-ops.org/docs/_archive/orchestrator-legacy/run-ledger.md
2025-12-24 21:45:46 +02:00

Orchestrator Run Ledger (DOCS-ORCH-34-001)

Imposed rule: work or tasks of this type on this component must also be applied everywhere else they apply.

Last updated: 2025-11-25

Purpose

Immutable record of every DAG run and step execution for audit, replay, and offline export.

Record schema (conceptual)

  • tenant, runId, dagId, dagVersion, runToken, traceId
  • status (running|completed|failed|cancelled)
  • inputsHash, outputsHash (overall)
  • startedUtc, endedUtc, durationMs
  • steps[]:
    • stepId, status, attempt, startedUtc, endedUtc, durationMs
    • inputsHash, outputsHash, logsRef, metricsRef, errorCode, retryable
  • events[] (optional): ordered list of significant events with timestamp, type, message, actor
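
The conceptual schema above could be modelled roughly as follows. This is a minimal sketch, not the orchestrator's actual types: the field names come from the list above, while the Python types, the ISO 8601 string timestamps, the choice of which fields are optional, and the class names `RunRecord`, `StepRecord`, and `LedgerEvent` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, Optional, Tuple

# Run status values as listed in the schema.
RunStatus = Literal["running", "completed", "failed", "cancelled"]


@dataclass(frozen=True)
class LedgerEvent:
    timestamp: str  # assumption: ISO 8601 UTC string
    type: str
    message: str
    actor: str


@dataclass(frozen=True)
class StepRecord:
    stepId: str
    status: str  # step status values are not enumerated in the schema
    attempt: int
    startedUtc: str
    endedUtc: Optional[str] = None  # assumption: unset while the step runs
    durationMs: Optional[int] = None
    inputsHash: Optional[str] = None
    outputsHash: Optional[str] = None
    logsRef: Optional[str] = None
    metricsRef: Optional[str] = None
    errorCode: Optional[str] = None
    retryable: Optional[bool] = None


@dataclass(frozen=True)
class RunRecord:
    tenant: str
    runId: str
    dagId: str
    dagVersion: str
    runToken: str
    traceId: str
    status: RunStatus
    inputsHash: str
    startedUtc: str
    outputsHash: Optional[str] = None  # assumption: unset until the run ends
    endedUtc: Optional[str] = None
    durationMs: Optional[int] = None
    steps: Tuple[StepRecord, ...] = ()
    events: Tuple[LedgerEvent, ...] = ()  # optional per the schema
```

Frozen dataclasses and tuples are used here only to mirror the ledger's immutability in memory; the persisted representation may differ.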

Storage

  • PostgreSQL table partitioned by tenant; indexes on (tenant, dagId, runId), (tenant, status, startedUtc).
  • Artifacts/logs referenced by content hash; stored separately (object storage/RustFS).
  • Append-only updates; run status transitions are monotonic.
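
One way to read "append-only" and "monotonic" together is that a run starts in `running` and may move once to a terminal status, never backwards. The sketch below encodes that reading. The transition rule and the helper names `is_allowed_transition` and `append_status` are assumptions for illustration; the document does not spell out the exact transition table.

```python
from __future__ import annotations

from typing import List

# Assumption: these three statuses are terminal and have no outgoing transitions.
TERMINAL = frozenset({"completed", "failed", "cancelled"})


def is_allowed_transition(current: str, new: str) -> bool:
    """Return True only for running -> terminal; everything else is rejected."""
    return current == "running" and new in TERMINAL


def append_status(history: List[str], new: str) -> List[str]:
    """Return a new history with `new` appended; the input list is not modified."""
    if not history:
        if new != "running":
            raise ValueError("a run must start in 'running'")
    elif not is_allowed_transition(history[-1], new):
        raise ValueError(f"illegal transition {history[-1]!r} -> {new!r}")
    return history + [new]
```

Returning a new list rather than mutating the old one mirrors the append-only rule: earlier entries are never rewritten.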

Exports

  • NDJSON export sorted by startedUtc, then runId; includes steps/events inline.
  • Each export includes a manifest with a content hash and a record count, so consumers can verify that the export is complete and deterministic.
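
A deterministic export along these lines might look like the sketch below. The sort order (startedUtc, then runId) is taken from the text above. SHA-256, the canonical JSON settings, the manifest field names `count` and `sha256`, and the function name `export_ndjson` are assumptions. It also assumes timestamps are ISO 8601 UTC strings in a uniform format, so that string ordering matches chronological ordering.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, Tuple


def export_ndjson(runs: Iterable[Dict[str, Any]]) -> Tuple[str, Dict[str, Any]]:
    """Serialise runs to NDJSON and build a manifest describing the output."""
    # Sort order from the document: startedUtc first, runId as tie-breaker.
    ordered = sorted(runs, key=lambda r: (r["startedUtc"], r["runId"]))
    # Sorted keys and fixed separators keep the bytes stable across exports.
    lines = [json.dumps(r, sort_keys=True, separators=(",", ":")) for r in ordered]
    body = "".join(line + "\n" for line in lines)
    manifest = {
        "count": len(lines),
        "sha256": hashlib.sha256(body.encode("utf-8")).hexdigest(),
    }
    return body, manifest
```

Under these assumptions, exporting the same set of runs in any input order yields the same bytes and therefore the same manifest hash.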

Observability

  • Metrics derived from ledger: run counts, durations, failure rates, retry counts.
  • Trace links preserved via stored traceId.
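
As an illustration of how such metrics could be computed from ledger records, here is a small sketch over plain dictionaries. Several definitions are assumptions not stated in the document: `attempt` is treated as 1-based with one record per step holding its latest attempt number, the failure rate is taken over finished runs only, and the function name `summarize` and the output keys are hypothetical.

```python
from typing import Any, Dict, List


def summarize(runs: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Derive simple aggregate metrics from a list of run records."""
    finished = [r for r in runs if r["status"] != "running"]
    failed = sum(1 for r in finished if r["status"] == "failed")
    durations = [r["durationMs"] for r in finished if r.get("durationMs") is not None]
    # Assumption: attempt is 1-based, so retries for a step = attempt - 1.
    retries = sum(
        max(step["attempt"] - 1, 0)
        for r in runs
        for step in r.get("steps", [])
    )
    return {
        "runCount": len(runs),
        "finishedCount": len(finished),
        "failureRate": failed / len(finished) if finished else 0.0,
        "meanDurationMs": sum(durations) / len(durations) if durations else None,
        "retryCount": retries,
    }
```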

Governance

  • Runs are never mutated or deleted; cancellation is recorded as an event.
  • Access is tenant-scoped; admin queries require orchestrator:admin.
  • Replay tokens can be derived from inputsHash + dagVersion; consumers must log a rationale when replaying.
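
The replay rule could be sketched as follows. The document says only that a token can be derived from inputsHash and dagVersion; the use of SHA-256, the newline separator, the logger name, and the functions `derive_replay_token` and `request_replay` are assumptions chosen for illustration.

```python
import hashlib
import logging

logger = logging.getLogger("ledger.replay")  # hypothetical logger name


def derive_replay_token(inputs_hash: str, dag_version: str) -> str:
    """Derive a deterministic token from inputsHash and dagVersion."""
    # Assumption: SHA-256 over "dagVersion\ninputsHash"; the real scheme may differ.
    material = f"{dag_version}\n{inputs_hash}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def request_replay(inputs_hash: str, dag_version: str, actor: str, rationale: str) -> str:
    """Refuse to produce a replay token unless a rationale is given, then log it."""
    if not rationale.strip():
        raise ValueError("a rationale is required when replaying a run")
    token = derive_replay_token(inputs_hash, dag_version)
    logger.info("replay requested token=%s actor=%s rationale=%s", token, actor, rationale)
    return token
```

Because the token depends only on the two inputs, the same inputs and DAG version always map to the same token, while a new DAG version yields a different one.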