# Task Runner Collections — Initial Migration

Last updated: 2025-11-06

This migration seeds the MongoDB collections that back the Task Runner service. It is implemented as `20251106-task-runner-baseline.mongosh` under the platform migration runner and must be applied **before** enabling the Task Runner service in any environment.

## Collections
### `pack_runs`

| Field           | Type            | Notes                                                     |
|-----------------|-----------------|-----------------------------------------------------------|
| `_id`           | `string`        | Run identifier (same as `runId`).                         |
| `planHash`      | `string`        | Deterministic hash produced by the planner.               |
| `plan`          | `object`        | Full `TaskPackPlan` payload used to execute the run.      |
| `failurePolicy` | `object`        | Retry/backoff directives resolved at plan time.           |
| `requestedAt`   | `date`          | Timestamp when the client requested the run.              |
| `createdAt`     | `date`          | Timestamp when the run was persisted.                     |
| `updatedAt`     | `date`          | Timestamp of the last mutation.                           |
| `steps`         | `array<object>` | Flattened step records (`stepId`, `status`, attempts…).   |
| `tenantId`      | `string`        | Optional multi-tenant scope (reserved for future phases). |

**Indexes**

1. `{ _id: 1 }`: implicit primary key / uniqueness guarantee.
2. `{ updatedAt: -1 }`: serves `GET /runs` listings and staleness checks.
3. `{ tenantId: 1, updatedAt: -1 }`: activated once tenancy is enforced; remains sparse until then.

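
As a rough illustration, the `pack_runs` indexes listed above could be declared in mongosh along the following lines. This is a sketch, not the contents of the migration script: the index names and the `applyIndexes` helper are hypothetical, and the implicit `{ _id: 1 }` index is omitted because MongoDB creates it automatically.

```javascript
// Hypothetical index definitions for pack_runs; the names are illustrative.
// { _id: 1 } is not listed because MongoDB creates it for every collection.
const packRunsIndexes = [
  { key: { updatedAt: -1 }, options: { name: "pack_runs_updatedAt_desc" } },
  {
    key: { tenantId: 1, updatedAt: -1 },
    options: { name: "pack_runs_tenant_updatedAt_desc" },
  },
];

// Applies each definition through createIndex and returns whatever
// createIndex returns (the index name in mongosh). Re-running with identical
// definitions leaves existing indexes untouched.
function applyIndexes(db, collectionName, indexes) {
  const collection = db.getCollection(collectionName);
  return indexes.map(({ key, options }) => collection.createIndex(key, options));
}

// In mongosh: applyIndexes(db, "pack_runs", packRunsIndexes);
```

Keeping the definitions as plain data makes it straightforward to review them against the list above before the script runs.
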
### `pack_run_logs`

| Field       | Type       | Notes                                                    |
|-------------|------------|----------------------------------------------------------|
| `_id`       | `ObjectId` | Generated per log entry.                                 |
| `runId`     | `string`   | Foreign key to `pack_runs._id`.                          |
| `sequence`  | `long`     | Monotonic counter assigned by the writer.                |
| `timestamp` | `date`     | UTC timestamp of the log event.                          |
| `level`     | `string`   | `trace`, `debug`, `info`, `warn`, `error`.               |
| `eventType` | `string`   | Machine-friendly event identifier (e.g. `step.started`). |
| `message`   | `string`   | Human-readable summary.                                  |
| `stepId`    | `string`   | Optional step identifier.                                |
| `metadata`  | `object`   | Deterministic key/value payload (string-only values).    |

**Indexes**

1. `{ runId: 1, sequence: 1 }` (unique): guarantees ordered retrieval and enforces idempotence.
2. `{ runId: 1, timestamp: 1 }`: accelerates replay and time-window queries.
3. `{ timestamp: 1 }`: optional TTL (disabled by default) for retention policies.

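
To make the idempotence guarantee concrete: with the unique `{ runId: 1, sequence: 1 }` index in place, re-sending an entry fails with MongoDB's duplicate-key error (code 11000), which a writer can treat as "already stored". The sketch below assumes a mongosh-style collection handle; `appendLogEntry` is a hypothetical helper, not the service's actual writer.

```javascript
const DUPLICATE_KEY = 11000; // MongoDB duplicate-key error code

// Hypothetical helper: inserts a log entry and reports whether it was new.
// A duplicate (runId, sequence) pair means the entry was already written.
function appendLogEntry(collection, entry) {
  try {
    collection.insertOne(entry);
    return true;
  } catch (err) {
    if (err && err.code === DUPLICATE_KEY) {
      return false; // replayed write; safe to ignore
    }
    throw err; // anything else is a real failure
  }
}

// In mongosh:
// appendLogEntry(db.pack_run_logs, { runId, sequence, timestamp, level, eventType, message });
```
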
### `pack_artifacts`

| Field        | Type       | Notes                                                  |
|--------------|------------|--------------------------------------------------------|
| `_id`        | `ObjectId` | Generated per artifact record.                         |
| `runId`      | `string`   | Foreign key to `pack_runs._id`.                        |
| `name`       | `string`   | Output name from the Task Pack manifest.               |
| `type`       | `string`   | `file`, `object`, or other future evidence categories. |
| `sourcePath` | `string`   | Local path captured during execution (nullable).       |
| `storedPath` | `string`   | Object store path or bundle-relative URI (nullable).   |
| `status`     | `string`   | `pending`, `copied`, `materialized`, `skipped`.        |
| `notes`      | `string`   | Free-form notes (deterministic messages only).         |
| `capturedAt` | `date`     | UTC timestamp recorded by the worker.                  |

**Indexes**

1. `{ runId: 1, name: 1 }` (unique): ensures a run emits at most one record per output.
2. `{ runId: 1 }`: supports artifact listing alongside run inspection.

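
One way a writer could work with the unique `{ runId: 1, name: 1 }` index is to upsert on that pair, so that re-recording an output updates the existing record rather than failing. Whether the real uploader upserts or inserts is not stated here, so treat this as an assumption; `buildArtifactUpsert` is a hypothetical helper.

```javascript
// Hypothetical helper: builds updateOne arguments keyed on (runId, name),
// the same pair covered by the unique index.
function buildArtifactUpsert(artifact) {
  const { runId, name, ...fields } = artifact;
  return {
    filter: { runId, name },
    update: { $set: fields },
    options: { upsert: true },
  };
}

// In mongosh:
// const { filter, update, options } = buildArtifactUpsert(record);
// db.pack_artifacts.updateOne(filter, update, options);
```
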
## Execution Order

1. Create collections with `validator` envelopes mirroring the field expectations above (if MongoDB schema validation is enabled in the environment).
2. Apply the indexes in the order listed; unique indexes come first so that data issues surface early.
3. Backfill existing filesystem-backed runs by importing the serialized state/log/artifact manifests into the new collections. A dedicated importer script (`tools/taskrunner/import-filesystem-state.ps1`) accompanies the migration.
4. Switch the Task Runner service configuration to point at the Mongo-backed stores (`TaskRunner:Storage:Mode = "mongo"`), then redeploy the workers and the web service.

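
For step 1, a `validator` envelope for `pack_runs` might look like the sketch below, which maps the field table onto a `$jsonSchema` document. The `required` list is an assumption (every field except the optional `tenantId`), and `buildPackRunsValidator` is a hypothetical name; the actual migration may be stricter or looser.

```javascript
// Hypothetical $jsonSchema validator derived from the pack_runs field table.
function buildPackRunsValidator() {
  return {
    $jsonSchema: {
      bsonType: "object",
      // Assumption: all documented fields except tenantId are mandatory.
      required: [
        "_id", "planHash", "plan", "failurePolicy",
        "requestedAt", "createdAt", "updatedAt", "steps",
      ],
      properties: {
        _id: { bsonType: "string" },
        planHash: { bsonType: "string" },
        plan: { bsonType: "object" },
        failurePolicy: { bsonType: "object" },
        requestedAt: { bsonType: "date" },
        createdAt: { bsonType: "date" },
        updatedAt: { bsonType: "date" },
        steps: { bsonType: "array", items: { bsonType: "object" } },
        tenantId: { bsonType: "string" },
      },
    },
  };
}

// In mongosh:
// db.createCollection("pack_runs", { validator: buildPackRunsValidator() });
```
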
## Rollback

To revert, switch the Task Runner configuration back to the filesystem provider and stop the Mongo migration runner. Collections can remain in place; they are append-only and harmless when unused.

## Configuration Reference

Enable the Mongo-backed stores by updating the worker and web service configuration (Compose/Helm values or `appsettings*.json`):

```json
{
  "TaskRunner": {
    "Storage": {
      "Mode": "mongo",
      "Mongo": {
        "ConnectionString": "mongodb://127.0.0.1:27017/taskrunner",
        "Database": "taskrunner",
        "RunsCollection": "pack_runs",
        "LogsCollection": "pack_run_logs",
        "ArtifactsCollection": "pack_artifacts",
        "ApprovalsCollection": "pack_run_approvals"
      }
    }
  }
}
```

The worker uses the same structure, mirrored under the `Worker` section. Omit the `Database` property to fall back to the database name embedded in the connection string.
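
The `Database` fallback described above can be pictured with the following sketch. It is only an illustration of the documented behaviour in the migration's scripting language; the service's own resolution logic is not shown in this document, and `resolveDatabaseName` is a hypothetical helper.

```javascript
// Hypothetical helper mirroring the documented fallback: prefer the explicit
// Database setting, otherwise use the name embedded in the connection string.
function resolveDatabaseName(mongoSettings) {
  if (mongoSettings.Database) {
    return mongoSettings.Database;
  }
  // Capture the path segment after the host list, before any "?options".
  const match = /^mongodb(?:\+srv)?:\/\/[^/]+\/([^/?]+)/.exec(
    mongoSettings.ConnectionString
  );
  return match ? match[1] : null;
}
```

With the sample configuration, removing `Database` would resolve to `taskrunner`, taken from `mongodb://127.0.0.1:27017/taskrunner`.
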