git.stella-ops.org/SPRINTS_IMPLEMENTION_PLAN.md
master 7e2fa0a42a Refactor and enhance scanner worker functionality
- Cleaned up code formatting and organization across multiple files for improved readability.
- Introduced `OsScanAnalyzerDispatcher` to handle OS analyzer execution and plugin loading.
- Updated `ScanJobContext` to include an `Analysis` property for storing scan results.
- Enhanced `ScanJobProcessor` to utilize the new `OsScanAnalyzerDispatcher`.
- Improved logging and error handling in `ScanProgressReporter` for better traceability.
- Updated project dependencies and added references to new analyzer plugins.
- Revised task documentation to reflect current status and dependencies.
2025-10-19 18:34:15 +03:00

# StellaOps Multi-Sprint Implementation Plan (Agile Track)
This plan translates the current `SPRINTS.md` backlog (read that file first if you have not already) into parallel-friendly execution clusters. Each sprint is decomposed into **groups** that can run concurrently without touching the same directories. For every group we capture:
- **Tasks** (ID · est. effort · path)
- **Acceptance metrics** (quantitative targets to reduce rework)
- **Gate** artifacts required before dependent groups can start
Durations are estimated work sizes (1d ≈ one focused engineer day). Milestones are gated by artifacts—not calendar dates—to keep us agile and adaptable to competitor pressure.
---
## Sprint 9 – Scanner Core Foundations (ID: SP9, ~3w)
### Group SP9-G1 — Core Contracts & Observability (src/StellaOps.Scanner.Core) ~1w
- Tasks:
  - SCANNER-CORE-09-501 · 3d · `/src/StellaOps.Scanner.Core/TASKS.md`
  - SCANNER-CORE-09-502 · 2d · same path
  - SCANNER-CORE-09-503 · 2d · same path
- Acceptance metrics: DTO round-trip tests stable; middleware adds ≤5µs per call.
- Gate SP9-G1 → WebService: `scanner-core-contracts.md` snippet plus `ScannerCoreContractsTests` green.
### Group SP9-G2 — Queue Backbone (src/StellaOps.Scanner.Queue) ~1w
- Tasks: SCANNER-QUEUE-09-401 (3d), -402 (2d), -403 (2d) · `/src/StellaOps.Scanner.Queue/TASKS.md`
- Acceptance: dequeue latency p95 ≤20ms at 40rps; chaos test retains leases.
- Gate: Redis/NATS adapters docs + `QueueLeaseIntegrationTests` passing.
- Status: **DONE (2025-10-19)** Gate satisfied via Redis/NATS adapter docs and `QueueLeaseIntegrationTests` run under fake clock.
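
The p95 dequeue target above can be checked from raw latency samples. The following is a minimal sketch, assuming a nearest-rank percentile and a plain list of per-dequeue latencies in milliseconds; the function names are illustrative and not taken from the StellaOps test suite.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_p95_budget(latencies_ms, budget_ms=20.0):
    """True when the 95th percentile latency is within the budget."""
    return percentile(latencies_ms, 95) <= budget_ms
```

With 20 samples the nearest rank is 19, so a single outlier still passes the budget while two outliers fail it.
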
### Group SP9-G3 — Storage Backbone (src/StellaOps.Scanner.Storage) ~1w
- Tasks: SCANNER-STORAGE-09-301 (3d), -302 (2d), -303 (2d)
- Acceptance: majority write/read ≤50ms; TTL verified.
- Gate: migrations checked in; `StorageDualWriteFixture` passes.
- Status: **DONE (2025-10-19)** Mongo bootstrapper + migrations committed; MinIO dual-write service wired; `StorageDualWriteFixture` green on Mongo2Go.
### Group SP9-G4 — WebService Host & Policy Surfacing (src/StellaOps.Scanner.WebService) ~1.2w
- Tasks: SCANNER-WEB-09-101 (2d), -102 (3d), -103 (2d), -104 (2d), SCANNER-POLICY-09-105 (3d), SCANNER-POLICY-09-106 (4d)
- Acceptance: `/api/v1/scans` enqueue p95 ≤50ms under synthetic load; policy validation errors actionable; `/reports` response signed.
- Gate SP9-G4 → SP10/SP11: `/reports` OpenAPI frozen; sample signed envelope committed in `samples/api/reports/`.
- Status: **IN PROGRESS (2025-10-19)** Minimal host and `/api/v1/scans` endpoints delivered (SCANNER-WEB-09-101/102 done); progress streaming and policy/report surfaces remain.
### Group SP9-G5 — Worker Host (src/StellaOps.Scanner.Worker) ~1w
- Tasks: SCANNER-WORKER-09-201 (3d), -202 (3d), -203 (2d), -204 (2d), -205 (1d)
- Acceptance: job lease never drops below 3× the heartbeat interval; progress events deterministic.
- Gate: `WorkerBasicScanScenario` integration recorded + optional live queue smoke validation.
- Status: **DONE (2025-10-19)** Host bootstrap, heartbeat jitter clamp, deterministic stage pipeline, metrics, and Redis-backed smoke harness landed; `WorkerBasicScanScenarioTests` and `RedisWorkerSmokeTests` (flagged) green.
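
One reading of the lease metric and the "heartbeat jitter clamp" mentioned above is that the delay between heartbeats, after jitter, must never exceed one third of the lease duration. The sketch below illustrates that interpretation only; the parameter names and the 20% default jitter are assumptions, not the worker's actual settings.

```python
import random


def next_heartbeat_delay(lease_s, base_s, jitter_fraction=0.2, rng=random):
    """Return a jittered heartbeat delay clamped to at most lease_s / 3."""
    spread = base_s * jitter_fraction
    delay = base_s + rng.uniform(-spread, spread)
    ceiling = lease_s / 3.0  # keeps the lease at least 3x the heartbeat delay
    return min(max(delay, 0.0), ceiling)
```

Passing a seeded `random.Random` instance as `rng` keeps the delays reproducible in tests, which fits the fake-clock style used elsewhere in this sprint.
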
### Group SP9-G6 — Buildx Plug-in (src/StellaOps.Scanner.Sbomer.BuildXPlugin) ~0.8w
- Tasks: SP9-BLDX-09-001 (3d), SP9-BLDX-09-002 (2d), SP9-BLDX-09-003 (2d), SP9-BLDX-09-004 (2d), SP9-BLDX-09-005 (1d)
- Acceptance: build-time overhead ≤300ms/layer on 4vCPU; CAS handshake reliable in CI sample.
- Gate: buildx demo workflow artifact + quickstart doc + determinism regression guard in CI.
- Status: **DONE (2025-10-19)** manifest+CAS scaffold, descriptor/Attestor hand-off, GitHub/Gitea determinism workflows, quickstart update, and golden tests committed.
### Group SP9-G7 — Policy Engine Core (src/StellaOps.Policy) ~1w
- Tasks: POLICY-CORE-09-001 (2d) ✅, -002 (3d) ✅, -003 (3d) ✅, -004 (3d), -005 (4d), -006 (2d)
- Acceptance: policy parsing ≥200 files/s; preview diff response <200ms for 500-component SBOM; quieting logic audited.
- Gate: `policy-schema@1` published; revision digests stored; preview API doc updated.
### Group SP9-G8 — DevOps Early Guardrails (ops/devops) ~0.4w
- Tasks: DEVOPS-HELM-09-001 (3d) **DONE (2025-10-19)**
- Acceptance: helm/compose profiles for dev/stage/airgap lint + dry-run clean; manifests pinned to digest.
- Gate: profiles merged under `deploy/`; install guide cross-link satisfied via `deploy/compose/` bundles and `docs/21_INSTALL_GUIDE.md`.
### Group SP9-G9 — Documentation & Events (docs/) ~0.4w
- Tasks: DOCS-ADR-09-001 (2d), DOCS-EVENTS-09-002 (2d)
- Acceptance: ADR process broadcast; event schemas validated via CI.
- Gate: `docs/adr/index.md` linking template; `docs/events/README.md` referencing schemas.
- Status: **DONE (2025-10-19)** ADR contribution guide + template updates merged, Docs CI Ajv validation wired, events catalog documented, guild announcement recorded.
---
## Sprint 10 – Scanner Analyzers & SBOM (ID: SP10, ~4w)
### Group SP10-G1 — OS Analyzer Plug-ins (src/StellaOps.Scanner.Analyzers.OS) ~1w
- Tasks: SCANNER-ANALYZERS-OS-10-201..207 (durations 2–3d each)
- Acceptance: analyzer runtime <1.5s/image; memory <250MB.
- Gate: plug-ins packaged under `plugins/scanner/analyzers/os/`; determinism CI job green.
### Group SP10-G2 — Language Analyzer Plug-ins (src/StellaOps.Scanner.Analyzers.Lang) ~1.5w
- Tasks: SCANNER-ANALYZERS-LANG-10-301..309
- Acceptance: Node analyzer handles 10k modules <2s; Python memory <200MB.
- Gate: golden outputs stored; plugin manifests present.
### Group SP10-G3 — EntryTrace Plug-ins (src/StellaOps.Scanner.EntryTrace) ~0.8w
- Tasks: SCANNER-ENTRYTRACE-10-401..407
- Acceptance: ≥95% launcher resolution success on samples; unknown reasons enumerated.
- Gate: entrytrace plug-ins packaged; explainability doc updated.
### Group SP10-G4 — SBOM Composition & BOM Index (src/StellaOps.Scanner.Diff + Emit) ~1w
- Tasks: SCANNER-DIFF-10-501..503, SCANNER-EMIT-10-601..606
- Acceptance: BOM-Index emission <500ms/image; diff output deterministic across runs.
- Gate SP10-G4 → SP16: `docs/artifacts/bom-index/` schema + fixtures; tests `BOMIndexGoldenIsStable` & `UsageFlagsAreAccurate` green.
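
The "deterministic across runs" criterion above is usually enforced by emitting output in a canonical order and comparing digests between runs. The following is a minimal sketch of that idea, not the Scanner's actual emitter; the `purl` and `layer` field names are hypothetical.

```python
import hashlib
import json


def emit_canonical(components):
    """Serialize components with a stable item order and stable key order."""
    ordered = sorted(components, key=lambda c: (c["purl"], c.get("layer", "")))
    return json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(payload):
    """SHA-256 hex digest of the emitted bytes."""
    return hashlib.sha256(payload).hexdigest()


def same_across_runs(emit, inputs_per_run):
    """True when every run's emitted bytes hash to the same digest."""
    return len({digest(emit(run)) for run in inputs_per_run}) == 1
```

A golden test can then feed the same components in shuffled order and assert that the digest does not change.
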
### Group SP10-G5 — Cache Subsystem (src/StellaOps.Scanner.Cache) ~0.6w
- Tasks: SCANNER-CACHE-10-101..104
- Acceptance: cache hit instrumentation validated; eviction keeps footprint <5GB.
- Gate: cache configuration doc; integration test `LayerCacheRoundTrip` green.
### Group SP10-G6 — Benchmarks & Samples (bench/, samples/, ops/devops) ~0.6w
- Tasks: BENCH-SCANNER-10-001 (2d), SAMPLES-10-001 (finish 3d), DEVOPS-PERF-10-001 (2d)
- Acceptance: analyzer benchmark CSV published; perf CI guard ensures SBOM compose <5s; sample SBOM/BOM-Index committed.
- Gate: bench results stored under `bench/`; `samples/` populated; CI job added.
---
## Sprint 11 – Signing Chain Bring-up (ID: SP11, ~3w)
### Group SP11-G1 — Authority Sender Constraints (src/StellaOps.Authority) ~0.8w
- Tasks: AUTH-DPOP-11-001 (3d), AUTH-MTLS-11-002 (2d)
- Acceptance: DPoP nonce dance validated; mTLS tokens issued in ≤40ms.
- Gate: updated Authority OpenAPI; QA scripts verifying DPoP/mTLS.
### Group SP11-G2 — Signer Service (src/StellaOps.Signer) ~1.2w
- Tasks: SIGNER-API-11-101 (4d), SIGNER-REF-11-102 (2d), SIGNER-QUOTA-11-103 (2d)
- Acceptance: signing throughput ≥30 req/min; p95 latency ≤200ms.
- Gate SP11-G2 → Attestor/UI: `/sign/dsse` OpenAPI frozen; signed DSSE bundle in repo; Rekor interop test passing.
### Group SP11-G3 — Attestor Service (src/StellaOps.Attestor) ~1w
- Tasks: ATTESTOR-API-11-201 (3d), ATTESTOR-VERIFY-11-202 (2d), ATTESTOR-OBS-11-203 (2d)
- Acceptance: inclusion proof retrieval <500ms; audit log coverage 100%.
- Gate: Attestor API doc + verification script.
### Group SP11-G4 — UI Attestation Hooks (src/StellaOps.UI) ~0.4w
- Tasks: UI-ATTEST-11-005 (3d)
- Acceptance: attestation panel renders within 200ms; Rekor link verified.
- Gate SP11-G4 → SP13-G1: recorded UX walkthrough.
---
## Sprint 12 – Runtime Guardrails (ID: SP12, ~3w)
### Group SP12-G1 — Zastava Core (src/StellaOps.Zastava.Core) ~0.8w
- Tasks: ZASTAVA-CORE-12-201..204
- Acceptance: DTO tests stable; configuration docs produced.
- Gate: schema doc + logging helpers integrated.
### Group SP12-G2 — Zastava Observer (src/StellaOps.Zastava.Observer) ~0.8w
- Tasks: ZASTAVA-OBS-12-001..004
- Acceptance: observer memory <200MB; event flush ≤2s.
- Gate: sample runtime events stored; offline buffer test passes.
### Group SP12-G3 — Zastava Webhook (src/StellaOps.Zastava.Webhook) ~0.6w
- Tasks: ZASTAVA-WEBHOOK-12-101..103
- Acceptance: admission latency p95 ≤45ms; cache TTL adhered to.
- Gate: TLS rotation procedure documented; readiness probe script.
### Group SP12-G4 — Scanner Runtime APIs (src/StellaOps.Scanner.WebService) ~0.8w
- Tasks: SCANNER-RUNTIME-12-301 (2d), SCANNER-RUNTIME-12-302 (3d)
- Acceptance: `/runtime/events` handles 500 events/sec; `/policy/runtime` output matches webhook decisions.
- Gate SP12-G4 → SP13/SP15: API documented, fixtures updated.
---
## Sprint 13 – UX & CLI Experience (ID: SP13, ~2w)
### Group SP13-G1 — UI Shell & Panels (src/StellaOps.UI) ~1.6w
- Tasks: UI-AUTH-13-001 (3d), UI-SCANS-13-002 (4d), UI-VEX-13-003 (3d), UI-ADMIN-13-004 (2d), UI-SCHED-13-005 (3d), UI-NOTIFY-13-006 (3d)
- Acceptance: Lighthouse ≥85; Scheduler/Notify panels function against mocked APIs.
- Gate: UI dev server fixtures committed; QA sign-off captured.
### Group SP13-G2 — CLI Enhancements (src/StellaOps.Cli) ~0.8w
- Tasks: CLI-RUNTIME-13-005 (3d), CLI-OFFLINE-13-006 (3d), CLI-PLUGIN-13-007 (2d)
- Acceptance: runtime policy CLI completes <1s for 10 images; offline kit commands resume downloads.
- Gate: CLI plugin manifest doc; smoke tests covering new verbs.
---
## Sprint 14 – Release & Offline Ops (ID: SP14, ~2w)
### Group SP14-G1 — Release Automation (ops/devops) ~0.8w
- Tasks: DEVOPS-REL-14-001 (4d)
- Acceptance: reproducible build diff tool shows zero drift across two runs; signing pipeline green.
- Gate: signed manifest + provenance published.
### Group SP14-G2 — Offline Kit Packaging (ops/offline-kit) ~0.6w
- Tasks: DEVOPS-OFFLINE-14-002 (3d)
- Acceptance: kit import <5min with integrity verification CLI.
- Gate: kit doc updated; import script included.
### Group SP14-G3 — Deployment Playbooks (ops/deployment) ~0.4w
- Tasks: DEVOPS-OPS-14-003 (2d)
- Acceptance: rollback drill recorded; compatibility matrix produced.
- Gate: playbook PR merged with Ops sign-off.
### Group SP14-G4 — Licensing Token Service (ops/licensing) ~0.4w
- Tasks: DEVOPS-LIC-14-004 (2d)
- Acceptance: token service handles ≥100 req/min; revocation latency <60s.
- Gate: monitoring dashboard links; failover doc.
---
## Sprint 15 – Notify Foundations (ID: SP15, ~3w)
### Group SP15-G1 — Models & Storage (src/StellaOps.Notify.Models + Storage.Mongo) ~0.8w
- Tasks: NOTIFY-MODELS-15-101 (2d), -102 (2d), -103 (1d); NOTIFY-STORAGE-15-201 (3d), -202 (2d), -203 (1d)
- Acceptance: rule CRUD latency <120ms; delivery retention job verified.
- Gate: schema docs + fixtures published.
### Group SP15-G2 — Engine & Queue (src/StellaOps.Notify.Engine + Queue) ~0.8w
- Tasks: NOTIFY-ENGINE-15-301..304, NOTIFY-QUEUE-15-401..403
- Acceptance: rules evaluation ≥5k events/min; queue dead-letter <0.5%.
- Gate: digest outputs committed; queue config doc updated.
### Group SP15-G3 — WebService & Worker (src/StellaOps.Notify.WebService + Worker) ~0.8w
- Tasks: NOTIFY-WEB-15-101..104, NOTIFY-WORKER-15-201..204
- Acceptance: API p95 <120ms; worker delivery success ≥99%.
- Gate: end-to-end fixture run producing delivery record.
### Group SP15-G4 — Channel Plug-ins (src/StellaOps.Notify.Connectors.*) ~0.6w
- Tasks: NOTIFY-CONN-SLACK-15-501..503, NOTIFY-CONN-TEAMS-15-601..603, NOTIFY-CONN-EMAIL-15-701..703, NOTIFY-CONN-WEBHOOK-15-801..803
- Acceptance: channel-specific retry policies verified; rate limits respected.
- Gate: plug-in manifests inside `plugins/notify/**`; test-send docs.
### Group SP15-G5 — Events & Benchmarks (src/StellaOps.Scanner.WebService + bench) ~0.5w
- Tasks: SCANNER-EVENTS-15-201 (2d), BENCH-NOTIFY-15-001 (2d)
- Acceptance: event emission latency <100ms; throughput bench results stored.
- Gate: `docs/events/samples/` contains sample payloads; bench CSV in repo.
---
## Sprint 16 – Scheduler Intelligence (ID: SP16, ~4w)
### Group SP16-G1 — Models & Storage (src/StellaOps.Scheduler.Models + Storage.Mongo) ~1w
- Tasks: SCHED-MODELS-16-101 (3d), -102 (2d), -103 (2d); SCHED-STORAGE-16-201 (3d), -202 (2d), -203 (2d)
- Acceptance: schedule CRUD latency <120ms; run retention TTL enforced.
- Gate: schema doc + integration tests passing.
### Group SP16-G2 — ImpactIndex & Queue (src/StellaOps.Scheduler.ImpactIndex + Queue + Bench) ~1.2w
- Tasks: SCHED-IMPACT-16-300 (2d, DOING), SCHED-IMPACT-16-301 (3d), -302 (3d), -303 (2d); SCHED-QUEUE-16-401..403 (each 2d); BENCH-IMPACT-16-001 (2d)
- Acceptance: impact resolve 10k productKeys <300ms hot; stub removed by sprint end.
- Gate: roaring snapshot stored; bench CSV published; removal plan for stub recorded.
### Group SP16-G3 — Scheduler WebService (src/StellaOps.Scheduler.WebService) ~0.8w
- Tasks: SCHED-WEB-16-101..104 (each 2d)
- Acceptance: preview endpoint <250ms; webhook security enforced.
- Gate: OpenAPI published; dry-run JSON fixtures stored.
### Group SP16-G4 — Scheduler Worker (src/StellaOps.Scheduler.Worker) ~1w
- Tasks: SCHED-WORKER-16-201 (3d), -202 (2d), -203 (3d), -204 (2d), -205 (2d)
- Acceptance: planner fairness metrics captured; runner success ≥98% across 1k sims.
- Gate: event emission to Notify verified; metrics dashboards live.
---
## Sprint 17 – Symbol Intelligence & Forensics (ID: SP17, ~2.5w)
### Group SP17-G1 — Scanner Forensics (src/StellaOps.Scanner.Emit + WebService) ~1.2w
- Tasks: SCANNER-EMIT-17-701 (4d), SCANNER-RUNTIME-17-401 (3d)
- Acceptance: forensic overlays add ≤150ms per image; runtime API exposes symbol hints with feature flag.
- Gate: forensic SBOM samples committed; API doc updated.
### Group SP17-G2 — Zastava Observability (src/StellaOps.Zastava.Observer) ~0.6w
- Tasks: ZASTAVA-OBS-17-005 (3d)
- Acceptance: new telemetry surfaces symbol diffs; observer CPU <10% under load.
- Gate: Grafana dashboard export, alert thresholds defined.
### Group SP17-G3 — Release Hardening (ops/devops) ~0.4w
- Tasks: DEVOPS-REL-17-002 (2d)
- Acceptance: deterministic build verifier job updated to include forensics artifacts.
- Gate: CI pipeline stage `forensics-verify` green.
### Group SP17-G4 — Documentation (docs/) ~0.3w
- Tasks: DOCS-RUNTIME-17-004 (2d)
- Acceptance: runtime forensic guide published with troubleshooting.
- Gate: docs review sign-off; links added to UI help.
---
## Integration Buffers
- **INT-A (0.3w, after SP10):** Image → SBOM → BOM-Index → Scheduler preview → UI dry-run using fixtures.
- **INT-B (0.3w, after SP11 & SP15):** SBOM → policy verdict → signed DSSE → Rekor entry → Notify delivery end-to-end.
## Parallelisation Strategy
- SP9 core modules and SP11 authority upgrades can progress in parallel; scanner clients rely on feature flags while DPoP/mTLS hardening lands.
- SP10 SBOM emission may start alongside Scheduler ImpactIndex using `samples/` fixtures; stub SCHED-IMPACT-16-300 keeps velocity while awaiting roaring index.
- Notify foundations (SP15) can begin once event schemas freeze (delivered in SP9-G9/SP12-G4), consuming canned events until Scanner emits live ones.
- UI (SP13) uses mocked endpoints early, decoupling front-end delivery from backend readiness.
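
The explicit gates named in this plan form a small dependency graph, and grouping it into "waves" shows what can start in parallel. The sketch below uses Python's standard `graphlib` module and encodes only the gates written out above. The edges from a sprint to its own group (`SP10` to `SP10-G4`, `SP11` to `SP11-G2`) and the mapping of "WebService" to `SP9-G4` and "UI" to `SP11-G4` are simplifying assumptions, and the feature-flag and fixture workarounds described in this section are not modelled.

```python
from graphlib import TopologicalSorter

# Each key lists the gates it waits on (node -> predecessors).
GATES = {
    "SP9-G4": {"SP9-G1"},
    "SP10": {"SP9-G4"},
    "SP11": {"SP9-G4"},
    "SP10-G4": {"SP10"},      # assumption: a group starts with its sprint
    "SP11-G2": {"SP11"},      # assumption: a group starts with its sprint
    "SP16": {"SP10-G4"},
    "SP11-G4": {"SP11-G2"},
    "SP13-G1": {"SP11-G4", "SP12-G4"},
    "SP15": {"SP12-G4"},
}


def waves(graph):
    """Group nodes into batches whose gates are all satisfied."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the gates form a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result
```

Under these assumptions, `SP9-G1` and `SP12-G4` have no upstream gates and land in the first wave, while `SP13-G1` lands in the last.
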
## Risk Registry
| Risk ID | Description | Owner | Mitigation | Trigger |
|---------|-------------|-------|-----------|---------|
| R1 | BOM-Index memory blow-up on large fleets | Scheduler ImpactIndex Guild | Shard + mmap plan; monitor BENCH-IMPACT-16-001 | RAM > 8GB in bench |
| R2 | Buildx plugin latency regression | BuildX Guild | DEVOPS-PERF-10-001 guard; fallback to post-build scan | Buildx job >300ms/layer |
| R3 | Notify digests flooding Slack | Notify Engine Guild | throttle defaults, BENCH-NOTIFY-15-001 coverage | Dropped messages >1% |
| R4 | Policy precedence confusion | Policy Guild | ADR, preview API, unit tests | Operator escalation about precedence |
| R5 | ImpactIndex stub lingers | Scheduler ImpactIndex Guild | Track SCHED-IMPACT-16-300 removal in sprint review | Stub present past SP16 |
| R6 | Symbol forensics slows runtime | Scanner Emit Guild | Feature flag; perf tests in SP17-G1 | Forensics adds >150ms/image |
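
The numeric triggers in the table lend themselves to an automated check in a bench or CI job. The following is a minimal sketch, assuming metrics arrive as a flat dictionary; the metric key names are hypothetical, and only the risks with numeric triggers (R1, R2, R3, R6) are covered.

```python
# Risk ID -> (hypothetical metric key, threshold taken from the Trigger column)
RISK_TRIGGERS = {
    "R1": ("bench_ram_gb", 8.0),
    "R2": ("buildx_ms_per_layer", 300.0),
    "R3": ("dropped_message_pct", 1.0),
    "R6": ("forensics_ms_per_image", 150.0),
}


def tripped_risks(metrics):
    """Return the IDs of risks whose metric strictly exceeds its threshold."""
    tripped = []
    for risk_id, (metric, limit) in RISK_TRIGGERS.items():
        value = metrics.get(metric)
        if value is not None and value > limit:
            tripped.append(risk_id)
    return tripped
```

Missing metrics are treated as "not tripped" here; a stricter job might instead fail when a metric is absent.
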
## Envelope & ADR Governance
- Event schemas (`docs/events/*.json`) versioned; producers must bump suffix on breaking changes.
- ADR template (`docs/adr/0000-template.md`) mandatory for BOM-Index format, event envelopes, DPoP nonce policy, Rekor migration.
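
The suffix-bump rule for breaking changes can be sketched as a small helper. This assumes schema identifiers carry an `@N` integer suffix, in the style of `policy-schema@1` mentioned earlier; the actual naming of the files under `docs/events/` may differ.

```python
def bump_schema_suffix(schema_id):
    """Return schema_id with its trailing @N version incremented by one."""
    name, sep, version = schema_id.rpartition("@")
    if not sep or not name or not version.isdigit():
        raise ValueError(f"expected '<name>@<integer>', got {schema_id!r}")
    return f"{name}@{int(version) + 1}"
```

For example, `bump_schema_suffix("policy-schema@1")` returns `"policy-schema@2"`, and identifiers without a numeric suffix are rejected rather than guessed at.
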
---
**Summary:** The plan keeps high-impact artifacts (policy engine, BOM-Index, signing chain) on the critical path while unlocking parallel tracks (Notify, Scheduler, UI) through early schema freezes and fixtures. Integration buffers ensure cross-team touchpoints are validated continuously, supporting rapid iteration against competitive pressure.