
# Task Runner Collections — Initial Migration

Last updated: 2025-11-06

This migration seeds the MongoDB collections that back the Task Runner service. It is implemented as `20251106-task-runner-baseline.mongosh` under the platform migration runner and must be applied **before** the Task Runner service is enabled in any environment.

## Collections

### `pack_runs`

| Field            | Type            | Notes                                                     |
|------------------|-----------------|-----------------------------------------------------------|
| `_id`            | `string`        | Run identifier (same as `runId`).                         |
| `planHash`       | `string`        | Deterministic hash produced by the planner.               |
| `plan`           | `object`        | Full `TaskPackPlan` payload used to execute the run.      |
| `failurePolicy`  | `object`        | Retry/backoff directives resolved at plan time.           |
| `requestedAt`    | `date`          | Timestamp when the client requested the run.              |
| `createdAt`      | `date`          | Timestamp when the run was persisted.                     |
| `updatedAt`      | `date`          | Timestamp of the last mutation.                           |
| `steps`          | `array<object>` | Flattened step records (`stepId`, `status`, attempts…).   |
| `tenantId`       | `string`        | Optional multi-tenant scope (reserved for future phases). |

**Indexes**

1. `{ _id: 1 }` — implicit primary key / uniqueness guarantee.
2. `{ updatedAt: -1 }` — serves `GET /runs` listings and staleness checks.
3. `{ tenantId: 1, updatedAt: -1 }` — activated once tenancy is enforced; remains sparse until then.
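
As a rough illustration, the two explicit indexes above could be created from mongosh along the following lines. This is a sketch, not the contents of the baseline script: the helper name `createPackRunsIndexes` is hypothetical, and because the text does not say whether "sparse" refers to the `sparse` index option, that option is left out here.

```javascript
// Hypothetical helper; the real baseline script may be organized differently.
function createPackRunsIndexes(database) {
  const runs = database.getCollection("pack_runs");
  // { _id: 1 } is built implicitly by MongoDB, so only the explicit indexes are created.
  return [
    runs.createIndex({ updatedAt: -1 }),
    runs.createIndex({ tenantId: 1, updatedAt: -1 }),
  ];
}

// In mongosh the global `db` handle exists; under plain Node.js this guard skips the call.
if (typeof db !== "undefined") {
  createPackRunsIndexes(db);
}
```

`createIndex` is a no-op when an identical index already exists, so a script shaped like this can be re-run safely.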

### `pack_run_logs`

| Field         | Type            | Notes                                                  |
|---------------|-----------------|--------------------------------------------------------|
| `_id`         | `ObjectId`      | Generated per log entry.                               |
| `runId`       | `string`        | Foreign key to `pack_runs._id`.                        |
| `sequence`    | `long`          | Monotonic counter assigned by the writer.              |
| `timestamp`   | `date`          | UTC timestamp of the log event.                        |
| `level`       | `string`        | `trace`, `debug`, `info`, `warn`, `error`.             |
| `eventType`   | `string`        | Machine-friendly event identifier (e.g. `step.started`). |
| `message`     | `string`        | Human-readable summary.                                |
| `stepId`      | `string`        | Optional step identifier.                              |
| `metadata`    | `object`        | Deterministic key/value payload (string-only values).  |

**Indexes**

1. `{ runId: 1, sequence: 1 }` (unique) — guarantees ordered retrieval and enforces idempotence.
2. `{ runId: 1, timestamp: 1 }` — accelerates replay and time-window queries.
3. `{ timestamp: 1 }` — optional TTL (disabled by default) for retention policies.
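
The sketch below shows one way to express these three indexes so that the TTL behaviour stays off unless a retention period is supplied. The helper `packRunLogIndexSpecs` and its `retentionDays` parameter are assumptions made for illustration; the source does not describe how retention is configured.

```javascript
// Hypothetical helper: returns index specs for pack_run_logs.
// Leave retentionDays undefined to keep TTL disabled (the documented default).
function packRunLogIndexSpecs(retentionDays) {
  const timestampOptions = {};
  if (Number.isFinite(retentionDays) && retentionDays > 0) {
    // expireAfterSeconds turns the single-field date index into a TTL index.
    timestampOptions.expireAfterSeconds = Math.round(retentionDays * 24 * 60 * 60);
  }
  return [
    { keys: { runId: 1, sequence: 1 }, options: { unique: true } },
    { keys: { runId: 1, timestamp: 1 }, options: {} },
    { keys: { timestamp: 1 }, options: timestampOptions },
  ];
}

// Runs only inside mongosh, where the global `db` handle is defined.
if (typeof db !== "undefined") {
  const logs = db.getCollection("pack_run_logs");
  for (const spec of packRunLogIndexSpecs()) {
    logs.createIndex(spec.keys, spec.options);
  }
}
```

Calling `packRunLogIndexSpecs(30)` would request expiry after 30 days. If the plain `{ timestamp: 1 }` index already exists, enabling retention later generally means modifying that index (for example with `collMod`) rather than calling `createIndex` again with different options.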

### `pack_artifacts`

| Field        | Type       | Notes                                                       |
|--------------|------------|-------------------------------------------------------------|
| `_id`        | `ObjectId` | Generated per artifact record.                              |
| `runId`      | `string`   | Foreign key to `pack_runs._id`.                             |
| `name`       | `string`   | Output name from the Task Pack manifest.                    |
| `type`       | `string`   | `file`, `object`, or other future evidence categories.      |
| `sourcePath` | `string`   | Local path captured during execution (nullable).            |
| `storedPath` | `string`   | Object store path or bundle-relative URI (nullable).        |
| `status`     | `string`   | `pending`, `copied`, `materialized`, `skipped`.             |
| `notes`      | `string`   | Free-form notes (deterministic messages only).              |
| `capturedAt` | `date`     | UTC timestamp recorded by the worker.                       |

**Indexes**

1. `{ runId: 1, name: 1 }` (unique) — ensures a run emits at most one record per output.
2. `{ runId: 1 }` — supports artifact listing alongside run inspection.
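
Because the first index is unique, building it fails if backfilled data already contains two records for the same run and output name. A minimal sketch of a pre-check follows; the variable name `duplicateArtifactPipeline` is hypothetical, and the check itself is a suggestion rather than a documented part of the migration.

```javascript
// Groups artifact records by (runId, name) and keeps only groups with more than one member.
const duplicateArtifactPipeline = [
  { $group: { _id: { runId: "$runId", name: "$name" }, count: { $sum: 1 } } },
  { $match: { count: { $gt: 1 } } },
];

// Runs only inside mongosh, where the global `db` handle is defined.
if (typeof db !== "undefined") {
  db.getCollection("pack_artifacts")
    .aggregate(duplicateArtifactPipeline)
    .forEach((group) => printjson(group));
}
```

An empty result suggests the unique index can be created without conflicts.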

## Execution Order

1. Create the collections. Where MongoDB schema validation is enabled in the environment, attach `validator` documents that mirror the field tables above.
2. Apply the indexes in the order listed — unique indexes first to surface data issues early.
3. Backfill existing filesystem-backed runs by importing the serialized state/log/artifact manifests into the new collections. A dedicated importer script (`tools/taskrunner/import-filesystem-state.ps1`) accompanies the migration.
4. Switch the Task Runner service configuration to point at the Mongo-backed stores (`TaskRunner:Storage:Mode = "mongo"`), then redeploy the workers and the web service.
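
For step 1, a validator for `pack_run_logs` might look roughly like the sketch below. It is derived from the field table and is not taken from the baseline script. The `required` list and the decision to tolerate `null` for `stepId` are assumptions, since the source only states that `stepId` is optional.

```javascript
// Sketch of a $jsonSchema validator for pack_run_logs.
const packRunLogsValidator = {
  $jsonSchema: {
    bsonType: "object",
    // Assumed: every field except stepId and metadata must be present.
    required: ["runId", "sequence", "timestamp", "level", "eventType", "message"],
    properties: {
      _id: { bsonType: "objectId" },
      runId: { bsonType: "string" },
      sequence: { bsonType: "long" },
      timestamp: { bsonType: "date" },
      level: { enum: ["trace", "debug", "info", "warn", "error"] },
      eventType: { bsonType: "string" },
      message: { bsonType: "string" },
      // Assumed: a missing step may be stored as null rather than omitted.
      stepId: { bsonType: ["string", "null"] },
      // String-only values, per the field table.
      metadata: { bsonType: "object", additionalProperties: { bsonType: "string" } },
    },
  },
};

// Runs only inside mongosh, where the global `db` handle is defined.
if (typeof db !== "undefined") {
  db.createCollection("pack_run_logs", { validator: packRunLogsValidator });
}
```

`createCollection` raises an error if the collection already exists, which is acceptable for an initial migration but would need handling in a re-runnable script.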

## Rollback

To revert, switch the Task Runner configuration back to the filesystem provider and stop the Mongo migration runner. Collections can remain in place; they are append-only and harmless when unused.

## Configuration Reference

Enable the Mongo-backed stores by updating the worker and web service configuration (Compose/Helm values or `appsettings*.json`):

```json
{
  "TaskRunner": {
    "Storage": {
      "Mode": "mongo",
      "Mongo": {
        "ConnectionString": "mongodb://127.0.0.1:27017/taskrunner",
        "Database": "taskrunner",
        "RunsCollection": "pack_runs",
        "LogsCollection": "pack_run_logs",
        "ArtifactsCollection": "pack_artifacts",
        "ApprovalsCollection": "pack_run_approvals"
      }
    }
  }
}
```

The worker reads the same structure under the `Worker` section. Omit the `Database` property to fall back to the database name embedded in the connection string.
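
The fallback rule can be illustrated with a small Node.js sketch. It only mirrors the behaviour described above and is not the service's code. The function name `resolveDatabaseName` is hypothetical, and the parsing assumes a single-host connection string.

```javascript
// Illustrative only: an explicit Database wins, otherwise use the path of the connection string.
function resolveDatabaseName(mongoSettings) {
  if (mongoSettings.Database) {
    return mongoSettings.Database;
  }
  // For a single-host URI such as mongodb://host:27017/name, the path segment is the database.
  const path = new URL(mongoSettings.ConnectionString).pathname;
  const name = decodeURIComponent(path.replace(/^\//, ""));
  if (!name) {
    throw new Error("No Database set and none embedded in the connection string");
  }
  return name;
}
```

With the sample configuration, removing `Database` would still resolve to `taskrunner`, because that name is part of the connection string.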