# Task Runner Collections: Initial Migration

Last updated: 2025-11-06

This migration seeds the MongoDB collections that back the Task Runner service. It is implemented as `20251106-task-runner-baseline.mongosh` under the platform migration runner and must be applied **before** enabling the Task Runner service in any environment.

## Collections

### `pack_runs`

| Field           | Type     | Notes                                                     |
|-----------------|----------|-----------------------------------------------------------|
| `_id`           | `string` | Run identifier (same as `runId`).                         |
| `planHash`      | `string` | Deterministic hash produced by the planner.               |
| `plan`          | `object` | Full `TaskPackPlan` payload used to execute the run.      |
| `failurePolicy` | `object` | Retry/backoff directives resolved at plan time.           |
| `requestedAt`   | `date`   | Timestamp when the client requested the run.              |
| `createdAt`     | `date`   | Timestamp when the run was persisted.                     |
| `updatedAt`     | `date`   | Timestamp of the last mutation.                           |
| `steps`         | `array`  | Flattened step records (`stepId`, `status`, attempts…).   |
| `tenantId`      | `string` | Optional multi-tenant scope (reserved for future phases). |

**Indexes**

1. `{ _id: 1 }`: implicit primary key and uniqueness guarantee.
2. `{ updatedAt: -1 }`: serves `GET /runs` listings and staleness checks.
3. `{ tenantId: 1, updatedAt: -1 }`: activated once tenancy is enforced; remains sparse until then.

### `pack_run_logs`

| Field       | Type       | Notes                                                    |
|-------------|------------|----------------------------------------------------------|
| `_id`       | `ObjectId` | Generated per log entry.                                 |
| `runId`     | `string`   | Foreign key to `pack_runs._id`.                          |
| `sequence`  | `long`     | Monotonic counter assigned by the writer.                |
| `timestamp` | `date`     | UTC timestamp of the log event.                          |
| `level`     | `string`   | `trace`, `debug`, `info`, `warn`, `error`.               |
| `eventType` | `string`   | Machine-friendly event identifier (e.g. `step.started`). |
| `message`   | `string`   | Human-readable summary.                                  |
| `stepId`    | `string`   | Optional step identifier.                                |
| `metadata`  | `object`   | Deterministic key/value payload (string-only values).    |

**Indexes**

1. `{ runId: 1, sequence: 1 }` (unique): guarantees ordered retrieval and makes log writes idempotent.
2. `{ runId: 1, timestamp: 1 }`: accelerates replay and time-window queries.
3. `{ timestamp: 1 }`: optional TTL (disabled by default) for retention policies.

### `pack_artifacts`

| Field        | Type       | Notes                                                  |
|--------------|------------|--------------------------------------------------------|
| `_id`        | `ObjectId` | Generated per artifact record.                         |
| `runId`      | `string`   | Foreign key to `pack_runs._id`.                        |
| `name`       | `string`   | Output name from the Task Pack manifest.               |
| `type`       | `string`   | `file`, `object`, or other future evidence categories. |
| `sourcePath` | `string`   | Local path captured during execution (nullable).       |
| `storedPath` | `string`   | Object store path or bundle-relative URI (nullable).   |
| `status`     | `string`   | `pending`, `copied`, `materialized`, `skipped`.        |
| `notes`      | `string`   | Free-form notes (deterministic messages only).         |
| `capturedAt` | `date`     | UTC timestamp recorded by the worker.                  |

**Indexes**

1. `{ runId: 1, name: 1 }` (unique): ensures a run emits at most one record per output.
2. `{ runId: 1 }`: supports artifact listing alongside run inspection.

## Execution Order

1. Create the collections. Where MongoDB schema validation is enabled in the environment, attach `validator` envelopes that mirror the field tables above.
2. Apply the indexes in the order listed, unique indexes first, so that data issues surface early.
3. Backfill existing filesystem-backed runs by importing the serialized state, log, and artifact manifests into the new collections. A dedicated importer script (`tools/taskrunner/import-filesystem-state.ps1`) accompanies the migration.
4. Switch the Task Runner service configuration to the Mongo-backed stores (`TaskRunner:Storage:Mode = "mongo"`), then redeploy the workers and the web service.

## Rollback

To revert, switch the Task Runner configuration back to the filesystem provider and stop the Mongo migration runner. The collections can remain in place; they are append-only and harmless when unused.

## Configuration Reference

Enable the Mongo-backed stores by updating the worker and web service configuration (Compose/Helm values or `appsettings*.json`):

```json
{
  "TaskRunner": {
    "Storage": {
      "Mode": "mongo",
      "Mongo": {
        "ConnectionString": "mongodb://127.0.0.1:27017/taskrunner",
        "Database": "taskrunner",
        "RunsCollection": "pack_runs",
        "LogsCollection": "pack_run_logs",
        "ArtifactsCollection": "pack_artifacts",
        "ApprovalsCollection": "pack_run_approvals"
      }
    }
  }
}
```

The worker uses the same structure under the `Worker` section. Omit the `Database` property to fall back to the database name embedded in the connection string.
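
For orientation, the index definitions listed under Collections could be applied from mongosh roughly as follows. This is a minimal sketch and not the contents of `20251106-task-runner-baseline.mongosh`: the names `INDEX_PLAN` and `applyIndexes` are hypothetical, the `sparse` option on the tenant index is an assumption based on the "remains sparse" note, and the `{ timestamp: 1 }` index is created without `expireAfterSeconds` because the TTL is disabled by default.

```javascript
// Hypothetical index plan mirroring the "Indexes" lists, in document order.
// The { _id: 1 } index is implicit in MongoDB and is therefore omitted.
const INDEX_PLAN = [
  { collection: "pack_runs", keys: { updatedAt: -1 }, options: {} },
  // Assumption: sparse until tenancy is enforced.
  { collection: "pack_runs", keys: { tenantId: 1, updatedAt: -1 }, options: { sparse: true } },
  { collection: "pack_run_logs", keys: { runId: 1, sequence: 1 }, options: { unique: true } },
  { collection: "pack_run_logs", keys: { runId: 1, timestamp: 1 }, options: {} },
  // Plain index: no expireAfterSeconds, so no TTL behaviour.
  { collection: "pack_run_logs", keys: { timestamp: 1 }, options: {} },
  { collection: "pack_artifacts", keys: { runId: 1, name: 1 }, options: { unique: true } },
  { collection: "pack_artifacts", keys: { runId: 1 }, options: {} },
];

// Creates each index in plan order and returns a label for every index requested.
// `database` is any object exposing getCollection(name).createIndex(keys, options),
// which is the shape of the mongosh `db` object.
function applyIndexes(database, plan = INDEX_PLAN) {
  const applied = [];
  for (const { collection, keys, options } of plan) {
    database.getCollection(collection).createIndex(keys, options);
    applied.push(`${collection} ${JSON.stringify(keys)}`);
  }
  return applied;
}

// Inside mongosh the global `db` exists; elsewhere this guard skips the call.
if (typeof db !== "undefined") {
  applyIndexes(db).forEach((label) => print(`index ensured: ${label}`));
}
```

Keeping the plan as plain data makes it easy to compare against the tables in this document. MongoDB treats a `createIndex` call for an index that already exists with the same definition as a no-op, so a script of this shape can be re-run safely.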