# Task Pack Orchestration and Automation

**Version:** 1.0
**Date:** 2025-11-29
**Status:** Canonical

This advisory sets out the product rationale, DSL semantics, and implementation strategy for the TaskRunner module. It covers pack manifest structure, execution semantics, approval workflows, and evidence capture.

---

## 1. Executive Summary

The TaskRunner provides **deterministic, auditable automation** for security workflows. Key capabilities:

- **Task Pack DSL** - Declarative YAML manifests for multi-step workflows
- **Approval Gates** - Human-in-the-loop checkpoints with Authority integration
- **Deterministic Execution** - Plan hash verification prevents runtime divergence
- **Evidence Capture** - DSSE attestations for provenance and audit
- **Air-Gap Support** - Sealed-mode validation for offline installations

---

## 2. Market Drivers

### 2.1 Target Segments

| Segment | Automation Requirements | Use Case |
|---------|------------------------|----------|
| **Enterprise Security** | Approval workflows for vulnerability remediation | Change advisory board gates |
| **DevSecOps** | CI/CD pipeline integration | Automated policy enforcement |
| **Compliance Teams** | Auditable execution with evidence | SOC 2, FedRAMP documentation |
| **MSP/MSSP** | Multi-tenant orchestration | Managed security services |

### 2.2 Competitive Positioning

Most vulnerability scanning tools lack built-in orchestration. Stella Ops differentiates with:
- **Declarative task packs** with schema validation
- **Cryptographic plan verification** (plan hash binding)
- **Native approval gates** with Authority token integration
- **Evidence attestations** for audit trails
- **Sealed-mode enforcement** for air-gapped environments

---

## 3. Technical Architecture

### 3.1 Pack Manifest Structure (v1)

```yaml
apiVersion: stellaops.io/pack.v1
kind: TaskPack
metadata:
  name: vulnerability-scan-and-report
  version: 1.2.0
  description: Scan container, evaluate policy, generate report
  tags: [security, compliance, scanning]
  tenantVisibility: private
  maintainers:
    - name: Security Team
      email: security@example.com
  license: MIT

spec:
  inputs:
    - name: imageRef
      type: string
      required: true
      schema:
        pattern: "^[a-z0-9./-]+:[a-z0-9.-]+$"
    - name: policyPack
      type: string
      default: "default-policy-v1"

  secrets:
    - name: registryCredentials
      scope: scanner.read
      description: Registry pull credentials

  approvals:
    - name: security-review
      grants: ["role/security-lead", "role/ciso"]
      ttlHours: 72
      message: "Approve scan results before policy evaluation"

  steps:
    - id: scan
      type: run
      module: scanner/sbom-vuln
      inputs:
        image: "{{ inputs.imageRef }}"
      outputs:
        sbom: sbom.json
        vulns: vulnerabilities.json

    - id: review-gate
      type: gate.approval
      approval: security-review
      dependsOn: [scan]

    - id: policy-eval
      type: run
      module: policy/evaluate
      inputs:
        sbom: "{{ steps.scan.outputs.sbom }}"
        vulns: "{{ steps.scan.outputs.vulns }}"
        pack: "{{ inputs.policyPack }}"
      dependsOn: [review-gate]

    - id: generate-report
      type: parallel
      maxParallel: 2
      steps:
        - id: pdf-report
          type: run
          module: export/pdf
          inputs:
            data: "{{ steps.policy-eval.outputs.results }}"
        - id: json-report
          type: run
          module: export/json
          inputs:
            data: "{{ steps.policy-eval.outputs.results }}"
      dependsOn: [policy-eval]

  outputs:
    - name: scanReport
      type: file
      path: "{{ steps.generate-report.steps.pdf-report.outputs.file }}"
    - name: machineReadable
      type: object
      value: "{{ steps.generate-report.steps.json-report.outputs.data }}"

  success:
    message: "Scan completed successfully"
  failure:
    message: "Scan failed - review logs"
    retryPolicy:
      maxRetries: 2
      backoffSeconds: 60
```
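
The `{{ ... }}` references in the manifest can be pictured as dotted-path lookups into a run context. The sketch below is a minimal illustration, assuming a plain nested dictionary as the context and simple string substitution; the `resolve` function and the context layout are hypothetical and not taken from the TaskRunner implementation.

```python
import re

# Matches "{{ dotted.path }}"; hyphens are allowed because step ids
# such as "policy-eval" contain them.
_EXPR = re.compile(r"\{\{\s*([A-Za-z0-9_.\-]+)\s*\}\}")


def resolve(template: str, context: dict) -> str:
    """Replace each {{ path }} in `template` with its value from `context`."""

    def lookup(match: re.Match) -> str:
        node = context
        for key in match.group(1).split("."):
            if not isinstance(node, dict) or key not in node:
                raise KeyError(f"unresolved expression: {match.group(1)}")
            node = node[key]
        return str(node)

    return _EXPR.sub(lookup, template)


# Hypothetical run context mirroring the manifest's inputs and step outputs.
context = {
    "inputs": {"imageRef": "nginx:latest", "policyPack": "default-policy-v1"},
    "steps": {"scan": {"outputs": {"sbom": "sbom.json"}}},
}
print(resolve("{{ inputs.imageRef }}", context))          # nginx:latest
print(resolve("{{ steps.scan.outputs.sbom }}", context))  # sbom.json
```

Failing on an unresolved path, rather than leaving the placeholder in place, keeps a typo in a step id from silently reaching a module as a literal string.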

### 3.2 Step Types

| Type | Purpose | Key Properties |
|------|---------|----------------|
| `run` | Execute module | `module`, `inputs`, `outputs` |
| `parallel` | Concurrent execution | `steps[]`, `maxParallel` |
| `map` | Iterate over list | `items`, `step`, `maxParallel` |
| `gate.approval` | Human approval checkpoint | `approval`, `timeout` |
| `gate.policy` | Policy Engine validation | `policy`, `failAction` |

### 3.3 Execution Semantics

**Plan Phase:**
1. Parse manifest and validate schema
2. Resolve input expressions
3. Build execution graph
4. Compute **canonical plan hash** (SHA-256 of normalized graph)
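
Step 4 can be sketched as follows. The normalization rule used here (compact JSON with sorted keys) is an assumption for illustration; this advisory does not define the exact canonical form, and `canonical_plan_hash` is a hypothetical name.

```python
import hashlib
import json


def canonical_plan_hash(graph: dict) -> str:
    """Hash a plan graph so that key order and whitespace do not matter."""
    # Assumed normalization: sorted keys, no insignificant whitespace, UTF-8.
    normalized = json.dumps(
        graph, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


# Two serializations of the same plan that differ only in key order.
plan_a = {"steps": [{"id": "scan", "dependsOn": []}], "inputs": {"imageRef": "nginx:latest"}}
plan_b = {"inputs": {"imageRef": "nginx:latest"}, "steps": [{"dependsOn": [], "id": "scan"}]}

print(canonical_plan_hash(plan_a) == canonical_plan_hash(plan_b))  # True
```

Note that list order is preserved by this normalization, so reordering `steps` changes the hash; whether that is desirable depends on how the real graph is normalized.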

**Simulation Phase (Optional):**
1. Execute all steps in dry-run mode
2. Capture expected outputs
3. Store simulation results with plan hash

**Execution Phase:**
1. Verify runtime graph matches plan hash
2. Execute steps in dependency order
3. Emit progress events to Timeline
4. Capture output artifacts
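
"Dependency order" in step 2 amounts to a topological sort over the `dependsOn` edges. A minimal sketch using Python's standard `graphlib`, with the edges copied from the manifest in section 3.1; the `execution_order` helper is hypothetical, and a real scheduler would also handle parallel groups and gates.

```python
from graphlib import TopologicalSorter

# dependsOn edges from the manifest in section 3.1.
depends_on = {
    "scan": [],
    "review-gate": ["scan"],
    "policy-eval": ["review-gate"],
    "generate-report": ["policy-eval"],
}


def execution_order(graph: dict[str, list[str]]) -> list[str]:
    """Return step ids so that every step follows its dependencies."""
    sorter = TopologicalSorter()
    # Insert in sorted order so the result does not depend on input dict order.
    for step in sorted(graph):
        sorter.add(step, *sorted(graph[step]))
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(sorter.static_order())


print(execution_order(depends_on))
# ['scan', 'review-gate', 'policy-eval', 'generate-report']
```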

**Evidence Phase:**
1. Generate DSSE attestation with plan hash
2. Include input digests and output manifests
3. Store in Evidence Locker
4. Optionally anchor to Rekor

---

## 4. Approval Workflow

### 4.1 Gate Definition

```yaml
approvals:
  - name: security-review
    grants: ["role/security-lead", "role/ciso"]
    ttlHours: 72
    message: "Review vulnerability findings before proceeding"
    requiredCount: 1  # Number of approvals needed
```
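
One way to read `grants`, `ttlHours`, and `requiredCount` together is sketched below. The approval record shape (`approver`, `grant`, `approved_at`) and the rule that approvals count once per distinct approver are assumptions made for illustration; denials are left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ApprovalGate:
    name: str
    grants: tuple[str, ...]
    ttl_hours: int
    required_count: int = 1


def gate_status(gate, requested_at, approvals, now):
    """Return 'granted', 'expired', or 'pending' for a gate.

    `approvals` is a list of (approver, grant, approved_at) tuples (hypothetical shape).
    """
    deadline = requested_at + timedelta(hours=gate.ttl_hours)
    # Count distinct approvers who hold an accepted grant and approved in time.
    valid = {
        who for who, grant, at in approvals
        if grant in gate.grants and at <= deadline
    }
    if len(valid) >= gate.required_count:
        return "granted"
    return "expired" if now > deadline else "pending"


gate = ApprovalGate("security-review", ("role/security-lead", "role/ciso"), ttl_hours=72)
t0 = datetime(2025, 11, 29, 12, 0, tzinfo=timezone.utc)
approvals = [("alice", "role/ciso", t0 + timedelta(hours=1))]
print(gate_status(gate, t0, approvals, now=t0 + timedelta(hours=2)))  # granted
```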

### 4.2 Authority Token Contract

Approval tokens must include:

| Claim | Description |
|-------|-------------|
| `pack_run_id` | Run identifier (UUID) |
| `pack_gate_id` | Gate name from manifest |
| `pack_plan_hash` | Canonical plan hash |
| `auth_time` | Must be within 5 minutes of request |
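
The claim checks in the table could look like the sketch below. It assumes the token signature has already been verified elsewhere and that `auth_time` is expressed in seconds since the Unix epoch (as in OpenID Connect); `validate_approval_claims` is a hypothetical helper, not an Authority API.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_CLAIMS = ("pack_run_id", "pack_gate_id", "pack_plan_hash", "auth_time")
MAX_AUTH_AGE = timedelta(minutes=5)


def validate_approval_claims(claims, run_id, gate_id, plan_hash, now):
    """Return a list of problems; an empty list means the claims are acceptable."""
    problems = [f"missing claim: {c}" for c in REQUIRED_CLAIMS if c not in claims]
    if problems:
        return problems
    if claims["pack_run_id"] != run_id:
        problems.append("pack_run_id does not match the run")
    if claims["pack_gate_id"] != gate_id:
        problems.append("pack_gate_id does not match the gate")
    if claims["pack_plan_hash"] != plan_hash:
        problems.append("pack_plan_hash does not match the plan")
    # Assumption: auth_time is a Unix timestamp in seconds.
    auth_time = datetime.fromtimestamp(claims["auth_time"], tz=timezone.utc)
    if abs(now - auth_time) > MAX_AUTH_AGE:
        problems.append("auth_time is outside the 5-minute window")
    return problems


now = datetime(2025, 11, 29, 12, 0, tzinfo=timezone.utc)
claims = {
    "pack_run_id": "run-1",
    "pack_gate_id": "security-review",
    "pack_plan_hash": "sha256:aa",
    "auth_time": int(now.timestamp()) - 60,
}
print(validate_approval_claims(claims, "run-1", "security-review", "sha256:aa", now))  # []
```

Binding the token to the plan hash means an approval issued for one plan cannot be replayed against a modified plan for the same run.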

### 4.3 CLI Approval Command

```bash
stella pack approve \
  --run "run:tenant-default:20251129T120000Z" \
  --gate security-review \
  --pack-run-id "abc123..." \
  --pack-gate-id "security-review" \
  --pack-plan-hash "sha256:def456..." \
  --comment "Reviewed findings, no critical issues"
```

### 4.4 Approval Events

| Event | Trigger |
|-------|---------|
| `pack.approval.requested` | Gate reached, awaiting approval |
| `pack.approval.granted` | Approval recorded |
| `pack.approval.denied` | Approval rejected |
| `pack.approval.expired` | TTL exceeded without approval |

---

## 5. Implementation Strategy

### 5.1 Phase 1: Core Execution (In Progress)

- [x] Telemetry core adoption (TASKRUN-OBS-50-001)
- [x] Metrics implementation (TASKRUN-OBS-51-001)
- [ ] Architecture/API contracts (TASKRUN-41-001) - BLOCKED
- [ ] Execution engine enhancements (TASKRUN-42-001) - BLOCKED

### 5.2 Phase 2: Approvals & Evidence (Planned)

- [ ] Timeline event emission (TASKRUN-OBS-52-001) - BLOCKED
- [ ] Evidence locker snapshots (TASKRUN-OBS-53-001)
- [ ] DSSE attestations (TASKRUN-OBS-54-001)
- [ ] Incident mode escalations (TASKRUN-OBS-55-001)

### 5.3 Phase 3: Multi-Tenancy & Air-Gap (Planned)

- [ ] Tenant scoping and egress control (TASKRUN-TEN-48-001)
- [ ] Sealed-mode validation (TASKRUN-AIRGAP-56-001)
- [ ] Bundle ingestion for offline (TASKRUN-AIRGAP-56-002)
- [ ] Evidence capture in sealed mode (TASKRUN-AIRGAP-58-001)

---

## 6. API Surface

### 6.1 TaskRunner APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| `/api/runs` | POST | `packs.run` | Submit pack run |
| `/api/runs/{runId}` | GET | `packs.read` | Get run status |
| `/api/runs/{runId}/logs` | GET | `packs.read` | Stream logs (SSE) |
| `/api/runs/{runId}/artifacts` | GET | `packs.read` | List artifacts |
| `/api/runs/{runId}/approve` | POST | `packs.approve` | Record approval |
| `/api/runs/{runId}/cancel` | POST | `packs.run` | Cancel run |
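
The endpoint-to-scope mapping above lends itself to a table-driven check. The following sketch copies the routes from the table; the matching logic and the `required_scope` / `is_allowed` helpers are illustrative assumptions, not the service's actual authorization middleware.

```python
import re

# (method, path template) -> required scope, copied from the table above.
ROUTE_SCOPES = {
    ("POST", "/api/runs"): "packs.run",
    ("GET", "/api/runs/{runId}"): "packs.read",
    ("GET", "/api/runs/{runId}/logs"): "packs.read",
    ("GET", "/api/runs/{runId}/artifacts"): "packs.read",
    ("POST", "/api/runs/{runId}/approve"): "packs.approve",
    ("POST", "/api/runs/{runId}/cancel"): "packs.run",
}


def _to_regex(template: str) -> re.Pattern:
    # Turn "{runId}"-style placeholders into single-segment wildcards.
    parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


_COMPILED = [(m, _to_regex(t), s) for (m, t), s in ROUTE_SCOPES.items()]


def required_scope(method: str, path: str):
    """Return the scope needed for a request, or None for an unknown route."""
    for route_method, pattern, scope in _COMPILED:
        if method.upper() == route_method and pattern.match(path):
            return scope
    return None


def is_allowed(method: str, path: str, granted_scopes) -> bool:
    """Deny by default: unknown routes and missing scopes are both rejected."""
    scope = required_scope(method, path)
    return scope is not None and scope in granted_scopes


print(required_scope("POST", "/api/runs/abc/approve"))        # packs.approve
print(is_allowed("POST", "/api/runs", {"packs.read"}))        # False
```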

### 6.2 Packs Registry APIs

| Endpoint | Method | Scope | Description |
|----------|--------|-------|-------------|
| `/api/packs` | GET | `packs.read` | List packs |
| `/api/packs/{packId}/versions` | GET | `packs.read` | List versions |
| `/api/packs/{packId}/versions/{version}` | GET | `packs.read` | Get manifest |
| `/api/packs/{packId}/versions` | POST | `packs.write` | Publish pack |
| `/api/packs/{packId}/promote` | POST | `packs.write` | Promote channel |

### 6.3 CLI Commands

```bash
# Initialize pack scaffold
stella pack init --name my-workflow

# Validate manifest
stella pack validate pack.yaml

# Dry-run simulation
stella pack plan pack.yaml --inputs image=nginx:latest

# Execute pack
stella pack run pack.yaml --inputs image=nginx:latest

# Build distributable bundle
stella pack build pack.yaml --output my-workflow-1.0.0.tar.gz

# Sign bundle
cosign sign-blob my-workflow-1.0.0.tar.gz

# Publish to registry
stella pack push my-workflow-1.0.0.tar.gz --registry packs.example.com

# Export for offline distribution
stella pack bundle export --pack my-workflow --version 1.0.0
```

---

## 7. Storage Model

### 7.1 MongoDB Collections

**pack_runs:**

| Field | Type | Description |
|-------|------|-------------|
| `_id` | string | Run identifier |
| `planHash` | string | Canonical plan hash |
| `plan` | object | Full TaskPackPlan |
| `failurePolicy` | object | Retry/backoff config |
| `requestedAt` | date | Client request time |
| `tenantId` | string | Tenant scope |
| `steps` | array | Step execution records |

**pack_run_logs:**

| Field | Type | Description |
|-------|------|-------------|
| `runId` | string | FK to pack_runs |
| `sequence` | long | Monotonic counter |
| `timestamp` | date | Event time (UTC) |
| `level` | string | trace/debug/info/warn/error |
| `eventType` | string | Machine identifier |
| `stepId` | string | Optional step reference |
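
The `sequence` field gives log entries a stable order even when timestamps collide. A minimal in-memory sketch of that idea follows; it assumes the counter is scoped per run and starts at 0, which this advisory does not specify, and `RunLogBuffer` and the `step.started` event name are hypothetical.

```python
import itertools
from collections import defaultdict
from datetime import datetime, timezone

LEVELS = ("trace", "debug", "info", "warn", "error")


class RunLogBuffer:
    """In-memory stand-in for pack_run_logs with a per-run monotonic sequence."""

    def __init__(self):
        # One independent counter per runId (assumed scoping).
        self._counters = defaultdict(itertools.count)
        self.entries = []

    def append(self, run_id, level, event_type, step_id=None):
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        entry = {
            "runId": run_id,
            "sequence": next(self._counters[run_id]),
            "timestamp": datetime.now(timezone.utc),
            "level": level,
            "eventType": event_type,
            "stepId": step_id,
        }
        self.entries.append(entry)
        return entry


buf = RunLogBuffer()
first = buf.append("run-1", "info", "step.started", step_id="scan")
second = buf.append("run-1", "info", "step.completed", step_id="scan")
print(first["sequence"], second["sequence"])  # 0 1
```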

**pack_artifacts:**

| Field | Type | Description |
|-------|------|-------------|
| `runId` | string | FK to pack_runs |
| `name` | string | Output name |
| `type` | string | file/object/url |
| `storedPath` | string | Object store URI |
| `status` | string | pending/copied/materialized |

---

## 8. Evidence & Attestation

### 8.1 DSSE Attestation Structure

The `payload` below is shown decoded for readability; in a DSSE envelope it is carried as a base64-encoded string.

```json
{
  "payloadType": "application/vnd.stellaops.pack-run+json",
  "payload": {
    "runId": "abc123...",
    "packName": "vulnerability-scan-and-report",
    "packVersion": "1.2.0",
    "planHash": "sha256:def456...",
    "inputs": {
      "imageRef": { "value": "nginx:latest", "digest": "sha256:..." }
    },
    "outputs": [
      { "name": "scanReport", "digest": "sha256:..." }
    ],
    "steps": [
      { "id": "scan", "status": "completed", "duration": 45.2 }
    ],
    "completedAt": "2025-11-29T12:30:00Z"
  },
  "signatures": [...]
}
```
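
For reference, DSSE signs a Pre-Authentication Encoding (PAE) of the payload type and payload bytes, and carries the payload base64-encoded. The sketch below follows that layout; the compact sorted-key JSON serialization and the caller-supplied `sign` callback are assumptions, and no real key handling is shown.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE Pre-Authentication Encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"), type_bytes,
        str(len(payload)).encode("ascii"), payload,
    ])


def build_envelope(payload_type: str, statement: dict, sign) -> dict:
    """Wrap `statement` in a DSSE envelope.

    `sign` is a caller-supplied callable: bytes -> (keyid, signature bytes).
    """
    # Assumed serialization: compact JSON with sorted keys.
    body = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    keyid, sig = sign(pae(payload_type, body))
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(body).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


print(pae("http://example.com/HelloWorld", b"hello world"))
# b'DSSEv1 29 http://example.com/HelloWorld 11 hello world'
```

Because the payload type is part of the signed bytes, a signature made for one payload type cannot be reinterpreted under another.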

### 8.2 Evidence Bundle

Task pack runs produce evidence bundles containing:
- Pack manifest (signed)
- Input values (secrets redacted)
- Output artifacts
- Step transcripts
- DSSE attestation
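
Redacting secrets before inputs and transcripts enter a bundle can be sketched as a recursive substitution over the data. This is an illustration only: the `redact` helper, the placeholder text, and exact-substring matching are assumptions, and a production masker would also need to consider encoded or transformed forms of a secret.

```python
REDACTED = "***REDACTED***"


def redact(value, secrets):
    """Return a copy of `value` with every known secret string masked."""
    if isinstance(value, str):
        # Longest secrets first, so a secret containing another is fully masked.
        for secret in sorted(secrets, key=lambda s: (-len(s), s)):
            if secret:
                value = value.replace(secret, REDACTED)
        return value
    if isinstance(value, dict):
        return {key: redact(item, secrets) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [redact(item, secrets) for item in value]
    return value  # numbers, booleans, None pass through unchanged


inputs = {"imageRef": "nginx:latest", "auth": "Bearer s3cr3t-token"}
print(redact(inputs, {"s3cr3t-token"}))
# {'imageRef': 'nginx:latest', 'auth': 'Bearer ***REDACTED***'}
```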

---

## 9. Determinism Requirements

All TaskRunner operations must maintain determinism:

1. **Plan hash binding** - Runtime graph must match computed plan hash
2. **Stable step ordering** - Dependencies resolve deterministically
3. **Expression evaluation** - Same inputs produce same resolved values
4. **Timestamps in UTC** - All logs and events use ISO-8601 UTC
5. **Secret masking** - Secrets never appear in logs or evidence

---

## 10. RBAC & Scopes

| Scope | Purpose |
|-------|---------|
| `packs.read` | Discover/download packs |
| `packs.write` | Publish/update packs (requires signature) |
| `packs.run` | Execute packs via CLI/TaskRunner |
| `packs.approve` | Fulfill approval gates |

**Approval Token Requirements:**
- `pack_run_id`, `pack_gate_id`, `pack_plan_hash` are mandatory
- Token must be fresh (within 5-minute auth window)

---

## 11. Related Documentation

| Resource | Location |
|----------|----------|
| Task Pack specification | `docs/task-packs/spec.md` |
| Authoring guide | `docs/task-packs/authoring-guide.md` |
| Operations runbook | `docs/task-packs/runbook.md` |
| Registry architecture | `docs/task-packs/registry.md` |
| MongoDB migrations | `docs/modules/taskrunner/migrations/pack-run-collections.md` |

---

## 12. Sprint Mapping

- **Primary Sprint:** SPRINT_0157_0001_0001_taskrunner_i.md
- **Phase II:** SPRINT_0158_0001_0002_taskrunner_ii.md
- **Blockers:** SPRINT_0157_0001_0002_taskrunner_blockers.md

**Key Task IDs:**
- `TASKRUN-41-001` - Architecture/API contracts (BLOCKED)
- `TASKRUN-42-001` - Execution engine enhancements (BLOCKED)
- `TASKRUN-OBS-50-001` - Telemetry core adoption (DONE)
- `TASKRUN-OBS-51-001` - Metrics implementation (DONE)
- `TASKRUN-OBS-52-001` - Timeline events (BLOCKED)

---

## 13. Success Metrics

| Metric | Target |
|--------|--------|
| Plan hash verification | 100% match or abort |
| Approval gate response | < 5 min for high-priority |
| Evidence attestation rate | 100% of completed runs |
| Offline execution success | Works in sealed mode |
| Step execution latency | < 2s overhead per step |

---

*Last updated: 2025-11-29*