feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules

- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.

docs/modules/orchestrator/implementation_plan.md (normal file, 62 lines added)

@@ -0,0 +1,62 @@
# Implementation plan — Source & Job Orchestrator

## Delivery phases
- **Phase 1 – Core service & job ledger**  
  Implement source registry, run/job tables, queue abstraction, lease management, token-bucket rate limiting, watchdogs, and API primitives (`/sources`, `/runs`, `/jobs`).
- **Phase 2 – Worker SDK & artifact registry**  
  Embed worker SDK in Conseiller, Excititor, SBOM, Policy Engine; capture artifact metadata + hashes, enforce idempotency, publish progress/metrics.
- **Phase 3 – Observability & dashboard**  
  Ship metrics, traces, incident logging, SSE/WebSocket feeds, and Console dashboard (DAG/timeline, heatmaps, error clustering, SLO burn rate).
- **Phase 4 – Controls & resilience**  
  Deliver pause/resume/throttle/retry/backfill tooling, dead-letter review, circuit breakers, blackouts, backpressure handling, and automation hooks.
- **Phase 5 – Offline & compliance**  
  Generate deterministic audit bundles (`jobs.jsonl`, `history.jsonl`, `throttles.jsonl`), provenance manifests, and offline replay scripts.
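
To make the Phase 1 lease management concrete, the following is a minimal in-memory sketch of a claim/heartbeat cycle with expiry. The plan does not specify the lease contract, so the class name `LeaseManager`, its methods, the TTL parameter, and the caller-supplied `now` clock are all assumptions made for illustration; a real service would persist leases in the job ledger rather than in a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    worker_id: str
    expires_at: float  # seconds on the caller-supplied clock


class LeaseManager:
    """Hypothetical in-memory stand-in for a persisted lease table."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._leases: Dict[str, Lease] = {}

    def claim(self, job_id: str, worker_id: str, now: float) -> bool:
        """Grant the lease if the job is unclaimed or its lease has expired."""
        current = self._leases.get(job_id)
        if current is not None and current.expires_at > now:
            return False
        self._leases[job_id] = Lease(worker_id, now + self.ttl)
        return True

    def heartbeat(self, job_id: str, worker_id: str, now: float) -> bool:
        """Extend the lease, but only for the current, unexpired holder."""
        current = self._leases.get(job_id)
        if (
            current is None
            or current.worker_id != worker_id
            or current.expires_at <= now
        ):
            return False
        current.expires_at = now + self.ttl
        return True

    def holder(self, job_id: str, now: float) -> Optional[str]:
        """Return the worker holding a live lease, or None if it lapsed."""
        current = self._leases.get(job_id)
        if current is None or current.expires_at <= now:
            return None
        return current.worker_id
```

Passing the clock in explicitly keeps the expiry logic deterministic, which also makes the lease-renewal unit tests and clock-skew chaos tests listed later in this plan easier to write.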

## Work breakdown
- **Service & persistence**
  - Postgres schema (`sources`, `runs`, `jobs`, `artifacts`, `dag_edges`, `quotas`, `schedules`, `incidents`).
  - Lease manager with heartbeats, retries, and dead-letter queues.
  - Token-bucket rate limiter per `{tenant, source.host}`; adaptive refill on upstream throttles.
  - Watermark/backfill orchestration for event-time windows.
- **Worker SDK**
  - Claim/heartbeat/report contract, deterministic artifact hashing, idempotency enforcement.
  - Library release for .NET workers plus language bindings for Rust/Go ingestion agents.
- **Control plane APIs**
  - CRUD for sources, runs, jobs, quotas, schedules; control actions (retry, cancel, prioritize, pause/resume, backfill).
  - SSE/WebSocket stream for Console updates.
- **Observability**
  - Metrics: queue depth, job latency, failure classes, rate-limit hits, burn rate.
  - Error clustering (HTTP 429/5xx, schema mismatch, parse errors), incident logging with reason codes.
  - Gantt timeline and DAG JSON for Console visualisation.
- **Console & CLI**
  - Console app sections: overview, sources, runs, job detail, incidents, throttles.
  - CLI commands: `stella orchestrator sources|runs|jobs|throttle|backfill`.
- **Compliance & offline**
  - Immutable audit bundles with signatures; exports via Export Center; Offline Kit instructions.
  - Tenant isolation validation and secret redaction for logs/UI.
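
The token-bucket limiter keyed per `{tenant, source.host}` can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: the class and method names are hypothetical, and the "adaptive refill" policy shown here (multiplying the refill rate by a fixed factor when the upstream throttles) is one possible reading of the work item, since the plan does not define the adjustment rule.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Bucket:
    capacity: float
    refill_per_sec: float
    tokens: float
    updated_at: float


class RateLimiter:
    """Hypothetical limiter with one bucket per (tenant, source host)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._buckets: Dict[Tuple[str, str], Bucket] = {}

    def _bucket(self, tenant: str, host: str, now: float) -> Bucket:
        key = (tenant, host)
        bucket = self._buckets.get(key)
        if bucket is None:
            # New buckets start full.
            bucket = Bucket(self.capacity, self.refill_per_sec, self.capacity, now)
            self._buckets[key] = bucket
        # Refill for the time elapsed since the last touch, capped at capacity.
        elapsed = max(0.0, now - bucket.updated_at)
        bucket.tokens = min(
            bucket.capacity, bucket.tokens + elapsed * bucket.refill_per_sec
        )
        bucket.updated_at = now
        return bucket

    def try_acquire(self, tenant: str, host: str, now: float, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; otherwise refuse without blocking."""
        bucket = self._bucket(tenant, host, now)
        if bucket.tokens < cost:
            return False
        bucket.tokens -= cost
        return True

    def on_upstream_throttle(
        self, tenant: str, host: str, now: float, factor: float = 0.5
    ) -> None:
        """Assumed policy: slow this bucket's refill after e.g. an HTTP 429."""
        bucket = self._bucket(tenant, host, now)
        bucket.refill_per_sec *= factor
```

Because each key has its own bucket, a throttle against one tenant's source does not slow other tenants hitting the same host. A fuller version would presumably also restore the refill rate once the upstream recovers.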

## Acceptance criteria
- Orchestrator schedules all advisory/VEX/SBOM/policy jobs with quotas, rate limits, and idempotency; retries and replay preserve provenance.
- Console dashboard reflects real-time DAG status, queue depth, SLO burn rate, and allows pause/resume/throttle/backfill with audit trail.
- Worker SDK integrated across producer services, emitting progress and artifact metadata.
- Observability stack exposes metrics, logs, traces, incidents, and alerts for stuck jobs, throttling, and failure spikes.
- Offline audit bundles reproduce job history deterministically and verify signatures.
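
One way to satisfy the deterministic-bundle criterion is to canonicalise each JSONL file before hashing it, so the same job history always yields the same bytes. The sketch below assumes record shapes, a string sort key, and SHA-256 digests, none of which the plan specifies; the function names are hypothetical, and signing and signature verification are deliberately left out.

```python
import hashlib
import json
from typing import Dict, Iterable


def render_jsonl(records: Iterable[dict], sort_key: str) -> bytes:
    """Serialise records as JSONL with a stable record order and key order."""
    ordered = sorted(records, key=lambda record: record[sort_key])
    lines = [
        json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for record in ordered
    ]
    if not lines:
        return b""
    return ("\n".join(lines) + "\n").encode("utf-8")


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each bundle file name to the SHA-256 hex digest of its bytes."""
    return {
        name: hashlib.sha256(data).hexdigest()
        for name, data in sorted(files.items())
    }


# Usage: the same jobs in a different order produce an identical manifest.
jobs_a = [{"id": "job-2", "status": "ok"}, {"status": "failed", "id": "job-1"}]
jobs_b = list(reversed(jobs_a))
manifest_a = build_manifest({"jobs.jsonl": render_jsonl(jobs_a, "id")})
manifest_b = build_manifest({"jobs.jsonl": render_jsonl(jobs_b, "id")})
```

An offline replay could then recompute the manifest from the exported files and compare digests before checking the bundle signature.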

## Risks & mitigations
- **Backpressure/queue overload:** adaptive token buckets, circuit breakers, dynamic concurrency; degrade gracefully.
- **Upstream vendor throttles:** throttle management with user-visible state, automatic jitter and retry.
- **Tenant leakage:** enforce tenant filters at API/queue/storage, fuzz tests, redaction.
- **Complex DAG errors:** built-in diagnostics, error clustering, partial replay tooling.
- **Operator error:** confirmation prompts, RBAC, runbook guidance, reason codes logged.
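
The "automatic jitter and retry" mitigation is not detailed in this plan. A common approach is capped exponential backoff with full jitter, sketched below as an assumption rather than the chosen design; the function name, the default base and cap, and the handling of an upstream `Retry-After` hint are all illustrative.

```python
import random
from typing import Optional


def retry_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    retry_after: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Return seconds to wait before retry number `attempt` (0-based)."""
    generator = rng if rng is not None else random.Random()
    # Exponential ceiling, capped so late attempts do not wait unboundedly.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] to spread retries out.
    delay = generator.uniform(0.0, ceiling)
    # Never retry sooner than the upstream asked, when it gave a hint.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

Injecting the random generator lets tests seed it, which keeps retry scheduling reproducible under test while staying randomised in production.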

## Test strategy
- **Unit:** scheduling, quota enforcement, lease renewals, token bucket, watermark arithmetic.
- **Integration:** worker SDK with Conseiller/Excititor/SBOM pipelines, pause/resume/backfill flows, failure recovery.
- **Performance:** high-volume job workloads, queue backpressure, concurrency caps, dashboard SSE load tests.
- **Chaos:** simulate upstream outages, stuck workers, clock skew, Postgres failover.
- **Compliance:** audit bundle generation, signature verification, offline replay.
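
As an example of what the watermark arithmetic unit tests might target, the sketch below splits an event-time range into fixed-size backfill windows and advances a watermark only through contiguous completed windows. The half-open window convention, the helper names, and the rule that gaps block the watermark are assumptions; the plan itself does not define these semantics.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Set, Tuple

Window = Tuple[datetime, datetime]  # half-open interval [lower, upper)


def backfill_windows(start: datetime, end: datetime, size: timedelta) -> List[Window]:
    """Split [start, end) into consecutive windows; the last may be shorter."""
    if size <= timedelta(0):
        raise ValueError("window size must be positive")
    windows: List[Window] = []
    cursor = start
    while cursor < end:
        upper = min(cursor + size, end)
        windows.append((cursor, upper))
        cursor = upper
    return windows


def advance_watermark(
    watermark: datetime, completed: Set[Window], windows: List[Window]
) -> datetime:
    """Move the watermark forward across contiguous completed windows only."""
    for lower, upper in windows:
        if upper <= watermark:
            continue  # already behind the watermark
        if lower != watermark or (lower, upper) not in completed:
            break  # a gap or unfinished window blocks further progress
        watermark = upper
    return watermark


# Usage: a 5-hour range in 2-hour windows; the middle window is still pending.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
windows = backfill_windows(t0, t0 + timedelta(hours=5), timedelta(hours=2))
partial = advance_watermark(t0, {windows[0], windows[2]}, windows)
```

Here `partial` stops at the end of the first window, because the unfinished second window blocks progress even though the third has completed.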

## Definition of done
- All phases delivered with telemetry, dashboards, and runbooks published.
- Console + CLI parity validated; Offline Kit instructions complete.
- `./TASKS.md` and `../../TASKS.md` updated with status; documentation (README/architecture/this plan) reflects latest behaviour.