up

2025-11-29 02:19:50 +02:00
parent 2548abc56f
commit b34f13dc03
86 changed files with 9625 additions and 640 deletions
--- a/docs/product-advisories/29-Nov-2025
+++ b/docs/product-advisories/29-Nov-2025
@@ -0,0 +1,447 @@
+# Task Pack Orchestration and Automation
+
+**Version:** 1.0
+**Date:** 2025-11-29
+**Status:** Canonical
+
+This advisory defines the product rationale, DSL semantics, and implementation strategy for the TaskRunner module, covering pack manifest structure, execution semantics, approval workflows, and evidence capture.
+
+---
+
+## 1. Executive Summary
+
+The TaskRunner provides **deterministic, auditable automation** for security workflows. Key capabilities:
+
+- **Task Pack DSL** - Declarative YAML manifests for multi-step workflows
+- **Approval Gates** - Human-in-the-loop checkpoints with Authority integration
+- **Deterministic Execution** - Plan hash verification prevents runtime divergence
+- **Evidence Capture** - DSSE (Dead Simple Signing Envelope) attestations for provenance and audit
+- **Air-Gap Support** - Sealed-mode validation for offline installations
+
+---
+
+## 2. Market Drivers
+
+### 2.1 Target Segments
+
+| Segment | Automation Requirements | Use Case |
+|---------|------------------------|----------|
+| **Enterprise Security** | Approval workflows for vulnerability remediation | Change advisory board gates |
+| **DevSecOps** | CI/CD pipeline integration | Automated policy enforcement |
+| **Compliance Teams** | Auditable execution with evidence | SOC 2, FedRAMP documentation |
+| **MSP/MSSP** | Multi-tenant orchestration | Managed security services |
+
+### 2.2 Competitive Positioning
+
+Most vulnerability scanning tools lack built-in orchestration. Stella Ops differentiates with:
+- **Declarative task packs** with schema validation
+- **Cryptographic plan verification** (plan hash binding)
+- **Native approval gates** with Authority token integration
+- **Evidence attestations** for audit trails
+- **Sealed-mode enforcement** for air-gapped environments
+
+---
+
+## 3. Technical Architecture
+
+### 3.1 Pack Manifest Structure (v1)
+
+```yaml
+apiVersion: stellaops.io/pack.v1
+kind: TaskPack
+metadata:
+  name: vulnerability-scan-and-report
+  version: 1.2.0
+  description: Scan container, evaluate policy, generate report
+  tags: [security, compliance, scanning]
+  tenantVisibility: private
+  maintainers:
+    - name: Security Team
+      email: security@example.com
+  license: MIT
+
+spec:
+  inputs:
+    - name: imageRef
+      type: string
+      required: true
+      schema:
+        pattern: "^[a-z0-9./-]+:[a-z0-9.-]+$"
+    - name: policyPack
+      type: string
+      default: "default-policy-v1"
+
+  secrets:
+    - name: registryCredentials
+      scope: scanner.read
+      description: Registry pull credentials
+
+  approvals:
+    - name: security-review
+      grants: ["security-lead", "ciso"]
+      ttlHours: 72
+      message: "Approve scan results before policy evaluation"
+
+  steps:
+    - id: scan
+      type: run
+      module: scanner/sbom-vuln
+      inputs:
+        image: "{{ inputs.imageRef }}"
+      outputs:
+        sbom: sbom.json
+        vulns: vulnerabilities.json
+
+    - id: review-gate
+      type: gate.approval
+      approval: security-review
+      dependsOn: [scan]
+
+    - id: policy-eval
+      type: run
+      module: policy/evaluate
+      inputs:
+        sbom: "{{ steps.scan.outputs.sbom }}"
+        vulns: "{{ steps.scan.outputs.vulns }}"
+        pack: "{{ inputs.policyPack }}"
+      dependsOn: [review-gate]
+
+    - id: generate-report
+      type: parallel
+      maxParallel: 2
+      steps:
+        - id: pdf-report
+          type: run
+          module: export/pdf
+          inputs:
+            data: "{{ steps.policy-eval.outputs.results }}"
+        - id: json-report
+          type: run
+          module: export/json
+          inputs:
+            data: "{{ steps.policy-eval.outputs.results }}"
+      dependsOn: [policy-eval]
+
+  outputs:
+    - name: scanReport
+      type: file
+      path: "{{ steps.generate-report.steps.pdf-report.outputs.file }}"
+    - name: machineReadable
+      type: object
+      value: "{{ steps.generate-report.steps.json-report.outputs.data }}"
+
+  success:
+    message: "Scan completed successfully"
+  failure:
+    message: "Scan failed - review logs"
+    retryPolicy:
+      maxRetries: 2
+      backoffSeconds: 60
+```
+
+### 3.2 Step Types
+
+| Type | Purpose | Key Properties |
+|------|---------|----------------|
+| `run` | Execute module | `module`, `inputs`, `outputs` |
+| `parallel` | Concurrent execution | `steps[]`, `maxParallel` |
+| `map` | Iterate over list | `items`, `step`, `maxParallel` |
+| `gate.approval` | Human approval checkpoint | `approval`, `timeout` |
+| `gate.policy` | Policy Engine validation | `policy`, `failAction` |
+
+### 3.3 Execution Semantics
+
+**Plan Phase:**
+1. Parse manifest and validate schema
+2. Resolve input expressions
+3. Build execution graph
+4. Compute **canonical plan hash** (SHA-256 of normalized graph)
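+
+The hashing step can be sketched as follows. This is a minimal illustration, not the TaskRunner implementation: the graph shape (step id mapped to a small dict) and the choice of sorted-key compact JSON as the normalization are assumptions, since the advisory only states that the hash is SHA-256 over a normalized graph.
+
+```python
+import hashlib
+import json
+
+
+def canonical_plan_hash(graph: dict) -> str:
+    """Return 'sha256:<hex>' over a normalized JSON encoding of the graph."""
+    # Sorted keys and fixed separators remove formatting and key-order
+    # differences, so equal graphs serialize to identical bytes.
+    normalized = json.dumps(
+        graph, sort_keys=True, separators=(",", ":"), ensure_ascii=False
+    )
+    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()
+
+
+# Hypothetical graph shape: step id -> step properties.
+plan_a = {
+    "scan": {"module": "scanner/sbom-vuln", "dependsOn": []},
+    "review-gate": {"type": "gate.approval", "dependsOn": ["scan"]},
+}
+# Same content with a different key insertion order.
+plan_b = {
+    "review-gate": {"dependsOn": ["scan"], "type": "gate.approval"},
+    "scan": {"dependsOn": [], "module": "scanner/sbom-vuln"},
+}
+
+# Both encodings normalize to the same bytes, so the hashes agree.
+assert canonical_plan_hash(plan_a) == canonical_plan_hash(plan_b)
+```
+
+Any real normalization would also need a rule for list ordering (for example, sorting `dependsOn`), which this sketch leaves to the caller.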
+
+**Simulation Phase (Optional):**
+1. Execute all steps in dry-run mode
+2. Capture expected outputs
+3. Store simulation results with plan hash
+
+**Execution Phase:**
+1. Recompute the hash of the runtime graph and abort if it does not match the plan hash
+2. Execute steps in dependency order
+3. Emit progress events to Timeline
+4. Capture output artifacts
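+
+Dependency-ordered execution can be illustrated with Python's standard `graphlib` module (Python 3.9+). The `dependsOn` edges below are copied from the section 3.1 manifest; the `execution_order` helper is hypothetical and says nothing about how the TaskRunner schedules steps internally.
+
+```python
+from graphlib import TopologicalSorter
+
+# Step id -> prerequisite step ids, taken from the example manifest.
+depends_on = {
+    "scan": [],
+    "review-gate": ["scan"],
+    "policy-eval": ["review-gate"],
+    "generate-report": ["policy-eval"],
+}
+
+
+def execution_order(deps: dict[str, list[str]]) -> list[str]:
+    """Return step ids so that every step follows its prerequisites."""
+    # Inputs are sorted so the sorter always sees nodes in the same order.
+    sorter = TopologicalSorter(
+        {step: sorted(pre) for step, pre in sorted(deps.items())}
+    )
+    # static_order() raises graphlib.CycleError if dependsOn has a cycle.
+    return list(sorter.static_order())
+
+
+print(execution_order(depends_on))
+# ['scan', 'review-gate', 'policy-eval', 'generate-report']
+```
+
+A cyclic `dependsOn` declaration fails at ordering time rather than at run time, which fits the idea of validating the graph before anything executes.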
+
+**Evidence Phase:**
+1. Generate DSSE attestation with plan hash
+2. Include input digests and output manifests
+3. Store in Evidence Locker
+4. Optionally anchor the attestation in a Rekor transparency log
+
+---
+
+## 4. Approval Workflow
+
+### 4.1 Gate Definition
+
+```yaml
+approvals:
+  - name: security-review
+    grants: ["role/security-lead", "role/ciso"]
+    ttlHours: 72
+    message: "Review vulnerability findings before proceeding"
+    requiredCount: 1  # Number of approvals needed
+```
+
+### 4.2 Authority Token Contract
+
+Approval tokens must include:
+
+| Claim | Description |
+|-------|-------------|
+| `pack_run_id` | Run identifier (UUID) |
+| `pack_gate_id` | Gate name from manifest |
+| `pack_plan_hash` | Canonical plan hash |
+| `auth_time` | Authentication time; must be within 5 minutes of the approval request |
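+
+A check of these claims might look like the sketch below. It assumes the token signature has already been verified elsewhere and that `auth_time` is expressed in seconds since the Unix epoch, as in OpenID Connect; `validate_approval_claims` and its parameters are hypothetical names, not part of the Authority API.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+REQUIRED_CLAIMS = ("pack_run_id", "pack_gate_id", "pack_plan_hash", "auth_time")
+MAX_AUTH_AGE = timedelta(minutes=5)
+
+
+def validate_approval_claims(claims, run_id, gate_id, plan_hash, now):
+    """Return a list of problems; an empty list means the claims pass."""
+    problems = [f"missing claim: {c}" for c in REQUIRED_CLAIMS if c not in claims]
+    if problems:
+        return problems
+    if claims["pack_run_id"] != run_id:
+        problems.append("pack_run_id does not match the run")
+    if claims["pack_gate_id"] != gate_id:
+        problems.append("pack_gate_id does not match the gate")
+    if claims["pack_plan_hash"] != plan_hash:
+        problems.append("pack_plan_hash does not match the plan")
+    # Assumption: auth_time is epoch seconds (UTC).
+    auth_time = datetime.fromtimestamp(claims["auth_time"], tz=timezone.utc)
+    if abs(now - auth_time) > MAX_AUTH_AGE:
+        problems.append("auth_time is outside the 5-minute window")
+    return problems
+
+
+now = datetime(2025, 11, 29, 12, 0, tzinfo=timezone.utc)
+claims = {
+    "pack_run_id": "abc123",
+    "pack_gate_id": "security-review",
+    "pack_plan_hash": "sha256:def456",
+    "auth_time": int((now - timedelta(minutes=2)).timestamp()),
+}
+print(validate_approval_claims(claims, "abc123", "security-review", "sha256:def456", now))
+# []
+```
+
+Binding the approval to `pack_plan_hash` means an approval issued for one plan cannot be replayed against a run whose graph has changed.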
+
+### 4.3 CLI Approval Command
+
+```bash
+stella pack approve \
+  --run "run:tenant-default:20251129T120000Z" \
+  --gate security-review \
+  --pack-run-id "abc123..." \
+  --pack-gate-id "security-review" \
+  --pack-plan-hash "sha256:def456..." \
+  --comment "Reviewed findings, no critical issues"
+```
+
+### 4.4 Approval Events
+
+| Event | Trigger |
+|-------|---------|
+| `pack.approval.requested` | Gate reached, awaiting approval |
+| `pack.approval.granted` | Approval recorded |
+| `pack.approval.denied` | Approval rejected |
+| `pack.approval.expired` | TTL exceeded without approval |
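+
+How a gate's state maps onto these events can be sketched as a small pure function. The rules used here are assumptions for illustration only: a single denial rejects the gate, `requiredCount` grants satisfy it, and the TTL is measured from the moment the gate was reached. The advisory does not specify these details.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def gate_event(requested_at, now, ttl_hours, required_count, decisions):
+    """Map a gate's state to one of the approval event names."""
+    # Assumption: any denial rejects the gate outright.
+    if "denied" in decisions:
+        return "pack.approval.denied"
+    # Enough recorded grants satisfy the gate.
+    if decisions.count("granted") >= required_count:
+        return "pack.approval.granted"
+    # TTL exceeded without enough approvals.
+    if now - requested_at > timedelta(hours=ttl_hours):
+        return "pack.approval.expired"
+    return "pack.approval.requested"
+
+
+reached = datetime(2025, 11, 29, 12, 0, tzinfo=timezone.utc)
+later = reached + timedelta(hours=73)
+
+print(gate_event(reached, reached, 72, 1, []))           # pack.approval.requested
+print(gate_event(reached, reached, 72, 1, ["granted"]))  # pack.approval.granted
+print(gate_event(reached, later, 72, 1, []))             # pack.approval.expired
+```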
+
+---
+
+## 5. Implementation Strategy
+
+### 5.1 Phase 1: Core Execution (In Progress)
+
+- [x] Telemetry core adoption (TASKRUN-OBS-50-001)
+- [x] Metrics implementation (TASKRUN-OBS-51-001)
+- [ ] Architecture/API contracts (TASKRUN-41-001) - BLOCKED
+- [ ] Execution engine enhancements (TASKRUN-42-001) - BLOCKED
+
+### 5.2 Phase 2: Approvals & Evidence (Planned)
+
+- [ ] Timeline event emission (TASKRUN-OBS-52-001)
+- [ ] Evidence locker snapshots (TASKRUN-OBS-53-001)
+- [ ] DSSE attestations (TASKRUN-OBS-54-001)
+- [ ] Incident mode escalations (TASKRUN-OBS-55-001)
+
+### 5.3 Phase 3: Multi-Tenancy & Air-Gap (Planned)
+
+- [ ] Tenant scoping and egress control (TASKRUN-TEN-48-001)
+- [ ] Sealed-mode validation (TASKRUN-AIRGAP-56-001)
+- [ ] Bundle ingestion for offline (TASKRUN-AIRGAP-56-002)
+- [ ] Evidence capture in sealed mode (TASKRUN-AIRGAP-58-001)
+
+---
+
+## 6. API Surface
+
+### 6.1 TaskRunner APIs
+
+| Endpoint | Method | Scope | Description |
+|----------|--------|-------|-------------|
+| `/api/runs` | POST | `packs.run` | Submit pack run |
+| `/api/runs/{runId}` | GET | `packs.read` | Get run status |
+| `/api/runs/{runId}/logs` | GET | `packs.read` | Stream logs (Server-Sent Events) |
+| `/api/runs/{runId}/artifacts` | GET | `packs.read` | List artifacts |
+| `/api/runs/{runId}/approve` | POST | `packs.approve` | Record approval |
+| `/api/runs/{runId}/cancel` | POST | `packs.run` | Cancel run |
+
+### 6.2 Packs Registry APIs
+
+| Endpoint | Method | Scope | Description |
+|----------|--------|-------|-------------|
+| `/api/packs` | GET | `packs.read` | List packs |
+| `/api/packs/{packId}/versions` | GET | `packs.read` | List versions |
+| `/api/packs/{packId}/versions/{version}` | GET | `packs.read` | Get manifest |
+| `/api/packs/{packId}/versions` | POST | `packs.write` | Publish pack |
+| `/api/packs/{packId}/promote` | POST | `packs.write` | Promote channel |
+
+### 6.3 CLI Commands
+
+```bash
+# Initialize pack scaffold
+stella pack init --name my-workflow
+
+# Validate manifest
+stella pack validate pack.yaml
+
+# Dry-run simulation
+stella pack plan pack.yaml --inputs image=nginx:latest
+
+# Execute pack
+stella pack run pack.yaml --inputs image=nginx:latest
+
+# Build distributable bundle
+stella pack build pack.yaml --output my-workflow-1.0.0.tar.gz
+
+# Sign bundle
+cosign sign-blob my-workflow-1.0.0.tar.gz
+
+# Publish to registry
+stella pack push my-workflow-1.0.0.tar.gz --registry packs.example.com
+
+# Export for offline distribution
+stella pack bundle export --pack my-workflow --version 1.0.0
+```
+
+---
+
+## 7. Storage Model
+
+### 7.1 MongoDB Collections
+
+**pack_runs:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `_id` | string | Run identifier |
+| `planHash` | string | Canonical plan hash |
+| `plan` | object | Full TaskPackPlan |
+| `failurePolicy` | object | Retry/backoff config |
+| `requestedAt` | date | Client request time |
+| `tenantId` | string | Tenant scope |
+| `steps` | array | Step execution records |
+
+**pack_run_logs:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `runId` | string | Reference to `pack_runs._id` |
+| `sequence` | long | Monotonic counter |
+| `timestamp` | date | Event time (UTC) |
+| `level` | string | trace/debug/info/warn/error |
+| `eventType` | string | Machine identifier |
+| `stepId` | string | Optional step reference |
+
+**pack_artifacts:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `runId` | string | Reference to `pack_runs._id` |
+| `name` | string | Output name |
+| `type` | string | file/object/url |
+| `storedPath` | string | Object store URI |
+| `status` | string | pending/copied/materialized |
+
+---
+
+## 8. Evidence & Attestation
+
+### 8.1 DSSE Attestation Structure
+
+```json
+{
+  "payloadType": "application/vnd.stellaops.pack-run+json",
+  "payload": {
+    "runId": "abc123...",
+    "packName": "vulnerability-scan-and-report",
+    "packVersion": "1.2.0",
+    "planHash": "sha256:def456...",
+    "inputs": {
+      "imageRef": { "value": "nginx:latest", "digest": "sha256:..." }
+    },
+    "outputs": [
+      { "name": "scanReport", "digest": "sha256:..." }
+    ],
+    "steps": [
+      { "id": "scan", "status": "completed", "duration": 45.2 }
+    ],
+    "completedAt": "2025-11-29T12:30:00Z"
+  },
+  "signatures": [...]
+}
+```
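+
+The structure above appears to show the payload decoded for readability. In the DSSE specification the `payload` field carries base64-encoded bytes, each `signatures` entry holds a `keyid` and a `sig`, and the signature is computed over a pre-authentication encoding (PAE) of the payload type and payload. The sketch below follows that encoding; the HMAC signer, the key, the `keyid` value, and the compact sorted-key JSON serialization are stand-ins chosen for the example, not what the TaskRunner uses.
+
+```python
+import base64
+import hashlib
+import hmac
+import json
+
+
+def pae(payload_type: str, payload: bytes) -> bytes:
+    """DSSE pre-authentication encoding: the exact bytes that get signed."""
+    type_bytes = payload_type.encode("utf-8")
+    return b"DSSEv1 %d %b %d %b" % (
+        len(type_bytes), type_bytes, len(payload), payload
+    )
+
+
+def build_envelope(payload_type: str, statement: dict, sign, keyid: str) -> dict:
+    """Serialize the statement, sign its PAE, and wrap it in an envelope."""
+    payload = json.dumps(
+        statement, sort_keys=True, separators=(",", ":")
+    ).encode("utf-8")
+    signature = sign(pae(payload_type, payload))
+    return {
+        "payloadType": payload_type,
+        "payload": base64.b64encode(payload).decode("ascii"),
+        "signatures": [
+            {"keyid": keyid, "sig": base64.b64encode(signature).decode("ascii")}
+        ],
+    }
+
+
+def demo_sign(message: bytes) -> bytes:
+    # Stand-in signer for the example; a real deployment would use an
+    # asymmetric key managed by its signing service.
+    return hmac.new(b"demo-key-not-for-production", message, hashlib.sha256).digest()
+
+
+envelope = build_envelope(
+    "application/vnd.stellaops.pack-run+json",
+    {"packName": "vulnerability-scan-and-report", "packVersion": "1.2.0"},
+    demo_sign,
+    "demo-key",
+)
+```
+
+Because the signature covers the payload type as well as the payload, a verifier cannot be tricked into interpreting signed bytes under a different schema.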
+
+### 8.2 Evidence Bundle
+
+Task pack runs produce evidence bundles containing:
+- Pack manifest (signed)
+- Input values (redacted secrets)
+- Output artifacts
+- Step transcripts
+- DSSE attestation
+
+---
+
+## 9. Determinism Requirements
+
+All TaskRunner operations must maintain determinism:
+
+1. **Plan hash binding** - Runtime graph must match computed plan hash
+2. **Stable step ordering** - Dependencies resolve deterministically
+3. **Expression evaluation** - Same inputs produce same resolved values
+4. **Timestamps in UTC** - All logs and events use ISO-8601 UTC
+5. **Secret masking** - Secrets never appear in logs or evidence
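+
+Secret masking (item 5) can be sketched as a redaction pass applied before any text reaches logs or evidence. This is a simplified illustration: `mask_secrets` and the `***` placeholder are hypothetical, and exact substring replacement will not catch secrets that appear encoded or split across lines.
+
+```python
+def mask_secrets(text: str, secrets: list[str], placeholder: str = "***") -> str:
+    """Replace every occurrence of each known secret value in text."""
+    # Longest values first, so a secret that contains another secret
+    # is replaced as a whole rather than partially.
+    for value in sorted((s for s in secrets if s), key=len, reverse=True):
+        text = text.replace(value, placeholder)
+    return text
+
+
+line = "pulling nginx:latest with token hunter2-long and user hunter2"
+print(mask_secrets(line, ["hunter2", "hunter2-long"]))
+# pulling nginx:latest with token *** and user ***
+```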
+
+---
+
+## 10. RBAC & Scopes
+
+| Scope | Purpose |
+|-------|---------|
+| `packs.read` | Discover/download packs |
+| `packs.write` | Publish/update packs (requires signature) |
+| `packs.run` | Execute packs via CLI/TaskRunner |
+| `packs.approve` | Fulfill approval gates |
+
+**Approval Token Requirements:**
+- `pack_run_id`, `pack_gate_id`, `pack_plan_hash` are mandatory
+- Token must be fresh (`auth_time` within 5 minutes of the approval request)
+
+---
+
+## 11. Related Documentation
+
+| Resource | Location |
+|----------|----------|
+| Task Pack specification | `docs/task-packs/spec.md` |
+| Authoring guide | `docs/task-packs/authoring-guide.md` |
+| Operations runbook | `docs/task-packs/runbook.md` |
+| Registry architecture | `docs/task-packs/registry.md` |
+| MongoDB migrations | `docs/modules/taskrunner/migrations/pack-run-collections.md` |
+
+---
+
+## 12. Sprint Mapping
+
+- **Primary Sprint:** SPRINT_0157_0001_0001_taskrunner_i.md
+- **Phase II:** SPRINT_0158_0001_0002_taskrunner_ii.md
+- **Blockers:** SPRINT_0157_0001_0002_taskrunner_blockers.md
+
+**Key Task IDs:**
+- `TASKRUN-41-001` - Architecture/API contracts (BLOCKED)
+- `TASKRUN-42-001` - Execution engine enhancements (BLOCKED)
+- `TASKRUN-OBS-50-001` - Telemetry core adoption (DONE)
+- `TASKRUN-OBS-51-001` - Metrics implementation (DONE)
+- `TASKRUN-OBS-52-001` - Timeline events (BLOCKED)
+
+---
+
+## 13. Success Metrics
+
+| Metric | Target |
+|--------|--------|
+| Plan hash verification | 100% match or abort |
+| Approval gate response time | < 5 min for high-priority |
+| Evidence attestation rate | 100% of completed runs |
+| Offline execution success | Works in sealed mode |
+| Step execution latency | < 2s overhead per step |
+
+---
+
+*Last updated: 2025-11-29*