# Orchestrator architecture

## Runtime components
- WebService: REST and WebSocket API for DAG definitions, runs, and admin actions.
- Scheduler: cron and timer triggers that enqueue run intents.
- Worker: executes DAG steps, enforces resource limits, and reports telemetry.
- Plugin host: loads task plugins from signed offline bundles.

## Data model
- DAG: directed acyclic graph with deterministic topological ordering.
- Run: immutable record with runId, dagVersion, tenant, inputsHash, status, traceId, startedUtc, endedUtc.
- Step execution: stepId, inputsHash, outputsHash, status, attempt, durationMs, logsRef, metricsRef.

## Execution flow
- Run creation is idempotent on runToken, dagId, and inputsHash.
- Scheduler enqueues run intent to a tenant queue.
- Worker reconstructs DAG order, executes steps, applies retries and backoff.
- WebSocket streams run and step status updates.

## Storage and queues
- PostgreSQL stores DAG specs, versions, and run history.
- Queues are per-tenant FIFO in PostgreSQL or Valkey-backed lists.
- Artifacts are content-addressed and stored in object storage or large objects.

## Security and AOC alignment
- Tenant header required on every request; cross-tenant DAGs are forbidden.
- Scopes: orchestrator:read, orchestrator:write, orchestrator:admin.
- AOC alignment: orchestrator schedules and records only; no policy decisions.
- Step sandboxing enforces CPU and memory limits; network egress deny by default.

## Determinism
- Step ordering uses topological order with lexical tie-breaks.
- Retries preserve traceId and reuse the same runToken.
- Timestamps UTC; hashes lower-case hex.

## Offline posture
- DAG specs and plugins are loaded from offline bundles with signatures.
- Exports of runs, steps, and logs are available as NDJSON.

## Observability
- Traces: orchestrator.run and orchestrator.step with tenant, dagId, runId, stepId.
- Metrics: orchestrator_runs_total, orchestrator_run_duration_seconds, orchestrator_queue_depth.
- Logs: structured JSON with trace_id, tenant, dagId, runId, stepId.
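
The step ordering rule stated under Determinism (topological order with lexical tie-breaks) can be illustrated with a short sketch. This is not the orchestrator's actual code: the function name `deterministic_order`, the `deps` mapping shape, and the use of Python string comparison (code-point order) as the "lexical" tie-break are all assumptions made for illustration. It uses Kahn's algorithm with a min-heap so that, whenever several steps are ready at once, the smallest stepId is always emitted first.

```python
import heapq
from typing import Dict, Iterable, List


def deterministic_order(deps: Dict[str, Iterable[str]]) -> List[str]:
    """Return step IDs in topological order, breaking ties lexically.

    `deps` maps each step ID to the step IDs it depends on.
    This input shape is hypothetical, not the orchestrator's DAG spec.
    """
    # Collect every step, including ones that appear only as dependencies.
    steps = set(deps)
    for parents in deps.values():
        steps.update(parents)

    indegree = {step: 0 for step in steps}
    children: Dict[str, List[str]] = {step: [] for step in steps}
    for step, parents in deps.items():
        for parent in set(parents):  # ignore duplicate edges
            indegree[step] += 1
            children[parent].append(step)

    # A min-heap of ready steps always yields the smallest step ID first,
    # which makes the result independent of dict or set iteration order.
    ready = [step for step, count in indegree.items() if count == 0]
    heapq.heapify(ready)

    order: List[str] = []
    while ready:
        step = heapq.heappop(ready)
        order.append(step)
        for child in children[step]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, child)

    # Any step left unprocessed sits on a cycle, so the input is not a DAG.
    if len(order) != len(steps):
        raise ValueError("cycle detected: input is not a DAG")
    return order
```

For example, with `deps = {"fetch": [], "load": ["fetch"], "validate": ["fetch"], "report": ["load", "validate"]}` the function returns `["fetch", "load", "validate", "report"]`: `load` and `validate` become ready together and `load` sorts first. Because the result depends only on the graph and the step IDs, a worker that reconstructs the order on a retry would, under these assumptions, arrive at the same sequence.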