git.stella-ops.org/SPRINTS_IMPLEMENTION_PLAN.md
2025-10-19 18:34:15 +03:00


StellaOps Multi-Sprint Implementation Plan (Agile Track)

This plan translates the current SPRINTS.md backlog (read that file first if you have not) into parallel-friendly execution clusters. Each sprint is decomposed into groups that can run concurrently without touching the same directories. For every group we capture:

  • Tasks (ID · est. effort · path)
  • Acceptance metrics (quantitative targets to reduce rework)
  • Gate artifacts required before dependent groups can start

Durations are estimated work sizes (1d ≈ one focused engineer day). Milestones are gated by artifacts—not calendar dates—to keep us agile and adaptable to competitor pressure.


Sprint 9 Scanner Core Foundations (ID: SP9, ~3w)

Group SP9-G1 — Core Contracts & Observability (src/StellaOps.Scanner.Core) ~1w

  • Tasks:
    • SCANNER-CORE-09-501 · 3d · /src/StellaOps.Scanner.Core/TASKS.md
    • SCANNER-CORE-09-502 · 2d · same path
    • SCANNER-CORE-09-503 · 2d · same path
  • Acceptance metrics: DTO round-trip tests stable; middleware adds ≤5µs per call.
  • Gate SP9-G1 → WebService: scanner-core-contracts.md snippet plus ScannerCoreContractsTests green.

Group SP9-G2 — Queue Backbone (src/StellaOps.Scanner.Queue) ~1w

  • Tasks: SCANNER-QUEUE-09-401 (3d), -402 (2d), -403 (2d) · /src/StellaOps.Scanner.Queue/TASKS.md
  • Acceptance: dequeue latency p95 ≤20ms at 40rps; chaos test retains leases.
  • Gate: Redis/NATS adapters docs + QueueLeaseIntegrationTests passing.
  • Status: DONE (2025-10-19) Gate satisfied via Redis/NATS adapter docs and QueueLeaseIntegrationTests run under fake clock.
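
Several acceptance metrics in this plan are p95 latency targets (for example, dequeue p95 ≤20ms above). A minimal sketch of how such a target could be checked from recorded samples, assuming a nearest-rank percentile; the function names are illustrative and are not part of the StellaOps codebase:

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic avoids float rounding surprises.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def meets_latency_target(samples_ms, target_ms=20.0, pct=95):
    """True when the pct-th percentile latency is within the target."""
    return percentile(samples_ms, pct) <= target_ms
```

With this definition, a run where 95 of 100 dequeues take 5ms and 5 take 100ms passes a 20ms p95 target, while a run with 6 slow dequeues fails it.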

Group SP9-G3 — Storage Backbone (src/StellaOps.Scanner.Storage) ~1w

  • Tasks: SCANNER-STORAGE-09-301 (3d), -302 (2d), -303 (2d)
  • Acceptance: majority write/read ≤50ms; TTL verified.
  • Gate: migrations checked in; StorageDualWriteFixture passes.
  • Status: DONE (2025-10-19) Mongo bootstrapper + migrations committed; MinIO dual-write service wired; StorageDualWriteFixture green on Mongo2Go.

Group SP9-G4 — WebService Host & Policy Surfacing (src/StellaOps.Scanner.WebService) ~1.2w

  • Tasks: SCANNER-WEB-09-101 (2d), -102 (3d), -103 (2d), -104 (2d), SCANNER-POLICY-09-105 (3d), SCANNER-POLICY-09-106 (4d)
  • Acceptance: /api/v1/scans enqueue p95 ≤50ms under synthetic load; policy validation errors actionable; /reports response signed.
  • Gate SP9-G4 → SP10/SP11: /reports OpenAPI frozen; sample signed envelope committed in samples/api/reports/.
  • Status: IN PROGRESS (2025-10-19) Minimal host and /api/v1/scans endpoints delivered (SCANNER-WEB-09-101/102 done); progress streaming and policy/report surfaces remain.

Group SP9-G5 — Worker Host (src/StellaOps.Scanner.Worker) ~1w

  • Tasks: SCANNER-WORKER-09-201 (3d), -202 (3d), -203 (2d), -204 (2d), -205 (1d)
  • Acceptance: remaining job lease never drops below 3× the heartbeat interval; progress events deterministic.
  • Gate: WorkerBasicScanScenario integration recorded + optional live queue smoke validation.
  • Status: DONE (2025-10-19) Host bootstrap, heartbeat jitter clamp, deterministic stage pipeline, metrics, and Redis-backed smoke harness landed; WorkerBasicScanScenarioTests and RedisWorkerSmokeTests (flagged) green.
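
The lease acceptance metric and the heartbeat jitter clamp mentioned above can be pictured with a small sketch. It assumes the metric means "remaining lease time stays at or above three heartbeat intervals" and that jitter may only shorten the delay; both readings, and all names below, are assumptions rather than the worker's actual implementation:

```python
import random

SAFETY_FACTOR = 3  # assumed reading of the "3x heartbeat" acceptance metric


def jittered_heartbeat(interval_s, max_jitter_s, rng=random.random):
    """Next heartbeat delay in seconds, clamped to [interval - jitter, interval]."""
    if not 0 <= max_jitter_s <= interval_s:
        raise ValueError("max_jitter_s must be between 0 and interval_s")
    jitter = (rng() * 2 - 1) * max_jitter_s  # uniform in [-max_jitter_s, +max_jitter_s)
    delay = interval_s + jitter
    # Clamp so a late heartbeat can never eat into the lease margin.
    return min(max(delay, interval_s - max_jitter_s), interval_s)


def lease_is_safe(lease_remaining_s, heartbeat_interval_s):
    """True when the remaining lease covers at least SAFETY_FACTOR heartbeats."""
    return lease_remaining_s >= SAFETY_FACTOR * heartbeat_interval_s
```

Injecting `rng` keeps the delay reproducible in tests, which is in the spirit of the fake-clock approach used for the queue gate.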

Group SP9-G6 — Buildx Plug-in (src/StellaOps.Scanner.Sbomer.BuildXPlugin) ~0.8w

  • Tasks: SP9-BLDX-09-001 (3d), SP9-BLDX-09-002 (2d), SP9-BLDX-09-003 (2d), SP9-BLDX-09-004 (2d), SP9-BLDX-09-005 (1d)
  • Acceptance: build-time overhead ≤300ms/layer on 4vCPU; CAS handshake reliable in CI sample.
  • Gate: buildx demo workflow artifact + quickstart doc + determinism regression guard in CI.
  • Status: DONE (2025-10-19) — manifest+CAS scaffold, descriptor/Attestor hand-off, GitHub/Gitea determinism workflows, quickstart update, and golden tests committed.

Group SP9-G7 — Policy Engine Core (src/StellaOps.Policy) ~1w

  • Tasks: POLICY-CORE-09-001 (2d), -002 (3d), -003 (3d), -004 (3d), -005 (4d), -006 (2d)
  • Acceptance: policy parsing ≥200 files/s; preview diff response <200ms for 500-component SBOM; quieting logic audited.
  • Gate: policy-schema@1 published; revision digests stored; preview API doc updated.

Group SP9-G8 — DevOps Early Guardrails (ops/devops) ~0.4w

  • Tasks: DEVOPS-HELM-09-001 (3d) — DONE (2025-10-19)
  • Acceptance: helm/compose profiles for dev/stage/airgap lint + dry-run clean; manifests pinned to digest.
  • Gate: profiles merged under deploy/; install guide cross-link satisfied via deploy/compose/ bundles and docs/21_INSTALL_GUIDE.md.
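
The "manifests pinned to digest" criterion could be linted with a check along these lines. This is a sketch only: it assumes image references have already been extracted from the rendered helm/compose output, and the helper name is hypothetical. A digest-pinned reference ends in `@sha256:` followed by 64 lowercase hex characters:

```python
import re

# name[:tag]@sha256:<64 hex chars>; the tag is optional and ignored when a digest is present.
DIGEST_PINNED = re.compile(r"[^\s@]+@sha256:[0-9a-f]{64}")


def unpinned_images(image_refs):
    """Return the image references that are not pinned to a sha256 digest."""
    return [ref for ref in image_refs if not DIGEST_PINNED.fullmatch(ref)]
```

A lint step would fail when `unpinned_images(...)` returns a non-empty list, for example for a tag-only reference such as `registry.example.org/scanner-web:1.0` (a made-up name).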

Group SP9-G9 — Documentation & Events (docs/) ~0.4w

  • Tasks: DOCS-ADR-09-001 (2d), DOCS-EVENTS-09-002 (2d)
  • Acceptance: ADR process broadcast; event schemas validated via CI.
  • Gate: docs/adr/index.md linking template; docs/events/README.md referencing schemas.
  • Status: DONE (2025-10-19) ADR contribution guide + template updates merged, Docs CI Ajv validation wired, events catalog documented, guild announcement recorded.

Sprint 10 Scanner Analyzers & SBOM (ID: SP10, ~4w)

Group SP10-G1 — OS Analyzer Plug-ins (src/StellaOps.Scanner.Analyzers.OS) ~1w

  • Tasks: SCANNER-ANALYZERS-OS-10-201..207 (durations 2-3d each)
  • Acceptance: analyzer runtime <1.5s/image; memory <250MB.
  • Gate: plug-ins packaged under plugins/scanner/analyzers/os/; determinism CI job green.

Group SP10-G2 — Language Analyzer Plug-ins (src/StellaOps.Scanner.Analyzers.Lang) ~1.5w

  • Tasks: SCANNER-ANALYZERS-LANG-10-301..309
  • Acceptance: Node analyzer handles 10k modules <2s; Python memory <200MB.
  • Gate: golden outputs stored; plugin manifests present.

Group SP10-G3 — EntryTrace Plug-ins (src/StellaOps.Scanner.EntryTrace) ~0.8w

  • Tasks: SCANNER-ENTRYTRACE-10-401..407
  • Acceptance: ≥95% launcher resolution success on samples; unknown reasons enumerated.
  • Gate: entrytrace plug-ins packaged; explainability doc updated.

Group SP10-G4 — SBOM Composition & BOM Index (src/StellaOps.Scanner.Diff + Emit) ~1w

  • Tasks: SCANNER-DIFF-10-501..503, SCANNER-EMIT-10-601..606
  • Acceptance: BOM-Index emission <500ms/image; diff output deterministic across runs.
  • Gate SP10-G4 → SP16: docs/artifacts/bom-index/ schema + fixtures; tests BOMIndexGoldenIsStable & UsageFlagsAreAccurate green.
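
"Deterministic across runs" can be made concrete by hashing the emitted bytes of repeated runs and requiring a single digest. The sketch below uses a canonical JSON encoding as a stand-in for the real BOM-Index and diff formats, which this plan does not define; all names are illustrative:

```python
import hashlib
import json


def serialize_canonical(document):
    """Encode with sorted keys and fixed separators so equal data yields equal bytes."""
    text = json.dumps(document, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("utf-8")


def is_deterministic(produce_bytes, runs=3):
    """Call produce_bytes() several times and check every run hashes identically."""
    digests = {hashlib.sha256(produce_bytes()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

A golden-file test is the same idea with one of the digests (or the bytes themselves) committed to the repository as the expected value.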

Group SP10-G5 — Cache Subsystem (src/StellaOps.Scanner.Cache) ~0.6w

  • Tasks: SCANNER-CACHE-10-101..104
  • Acceptance: cache hit instrumentation validated; eviction keeps footprint <5GB.
  • Gate: cache configuration doc; integration test LayerCacheRoundTrip green.
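
The eviction criterion above amounts to a size-bounded cache. A minimal sketch, assuming least-recently-used eviction against a byte budget; the plan does not state the actual eviction policy, and `LayerCache` is a hypothetical name:

```python
from collections import OrderedDict


class LayerCache:
    """Tracks entry sizes and evicts least-recently-used entries over budget."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._entries = OrderedDict()  # key -> size in bytes, oldest first

    def touch(self, key):
        """Mark key as recently used; return True on a cache hit."""
        if key in self._entries:
            self._entries.move_to_end(key)
            return True
        return False

    def put(self, key, size_bytes):
        """Insert or replace an entry; return the keys evicted to stay within budget."""
        if key in self._entries:
            self.total_bytes -= self._entries.pop(key)
        self._entries[key] = size_bytes
        self.total_bytes += size_bytes
        evicted = []
        while self.total_bytes > self.max_bytes and self._entries:
            old_key, old_size = self._entries.popitem(last=False)
            self.total_bytes -= old_size
            evicted.append(old_key)
        return evicted
```

This keeps the footprint at or under `max_bytes`; to satisfy a strict "<5GB" target the budget would be set slightly below that figure. The return values of `touch` also give a natural place to count hits and misses for the instrumentation metric.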

Group SP10-G6 — Benchmarks & Samples (bench/, samples/, ops/devops) ~0.6w

  • Tasks: BENCH-SCANNER-10-001 (2d), SAMPLES-10-001 (finish 3d), DEVOPS-PERF-10-001 (2d)
  • Acceptance: analyzer benchmark CSV published; perf CI guard ensures SBOM compose <5s; sample SBOM/BOM-Index committed.
  • Gate: bench results stored under bench/; samples/ populated; CI job added.

Sprint 11 Signing Chain Bring-up (ID: SP11, ~3w)

Group SP11-G1 — Authority Sender Constraints (src/StellaOps.Authority) ~0.8w

  • Tasks: AUTH-DPOP-11-001 (3d), AUTH-MTLS-11-002 (2d)
  • Acceptance: DPoP nonce dance validated; mTLS tokens issued in ≤40ms.
  • Gate: updated Authority OpenAPI; QA scripts verifying DPoP/mTLS.

Group SP11-G2 — Signer Service (src/StellaOps.Signer) ~1.2w

  • Tasks: SIGNER-API-11-101 (4d), SIGNER-REF-11-102 (2d), SIGNER-QUOTA-11-103 (2d)
  • Acceptance: signing throughput ≥30 req/min; p95 latency ≤200ms.
  • Gate SP11-G2 → Attestor/UI: /sign/dsse OpenAPI frozen; signed DSSE bundle in repo; Rekor interop test passing.

Group SP11-G3 — Attestor Service (src/StellaOps.Attestor) ~1w

  • Tasks: ATTESTOR-API-11-201 (3d), ATTESTOR-VERIFY-11-202 (2d), ATTESTOR-OBS-11-203 (2d)
  • Acceptance: inclusion proof retrieval <500ms; audit log coverage 100%.
  • Gate: Attestor API doc + verification script.

Group SP11-G4 — UI Attestation Hooks (src/StellaOps.UI) ~0.4w

  • Tasks: UI-ATTEST-11-005 (3d)
  • Acceptance: attestation panel renders within 200ms; Rekor link verified.
  • Gate SP11-G4 → SP13-G1: recorded UX walkthrough.

Sprint 12 Runtime Guardrails (ID: SP12, ~3w)

Group SP12-G1 — Zastava Core (src/StellaOps.Zastava.Core) ~0.8w

  • Tasks: ZASTAVA-CORE-12-201..204
  • Acceptance: DTO tests stable; configuration docs produced.
  • Gate: schema doc + logging helpers integrated.

Group SP12-G2 — Zastava Observer (src/StellaOps.Zastava.Observer) ~0.8w

  • Tasks: ZASTAVA-OBS-12-001..004
  • Acceptance: observer memory <200MB; event flush ≤2s.
  • Gate: sample runtime events stored; offline buffer test passes.

Group SP12-G3 — Zastava Webhook (src/StellaOps.Zastava.Webhook) ~0.6w

  • Tasks: ZASTAVA-WEBHOOK-12-101..103
  • Acceptance: admission latency p95 ≤45ms; cache TTL adhered to.
  • Gate: TLS rotation procedure documented; readiness probe script.

Group SP12-G4 — Scanner Runtime APIs (src/StellaOps.Scanner.WebService) ~0.8w

  • Tasks: SCANNER-RUNTIME-12-301 (2d), SCANNER-RUNTIME-12-302 (3d)
  • Acceptance: /runtime/events handles 500 events/sec; /policy/runtime output matches webhook decisions.
  • Gate SP12-G4 → SP13/SP15: API documented, fixtures updated.

Sprint 13 UX & CLI Experience (ID: SP13, ~2w)

Group SP13-G1 — UI Shell & Panels (src/StellaOps.UI) ~1.6w

  • Tasks: UI-AUTH-13-001 (3d), UI-SCANS-13-002 (4d), UI-VEX-13-003 (3d), UI-ADMIN-13-004 (2d), UI-SCHED-13-005 (3d), UI-NOTIFY-13-006 (3d)
  • Acceptance: Lighthouse ≥85; Scheduler/Notify panels function against mocked APIs.
  • Gate: UI dev server fixtures committed; QA sign-off captured.

Group SP13-G2 — CLI Enhancements (src/StellaOps.Cli) ~0.8w

  • Tasks: CLI-RUNTIME-13-005 (3d), CLI-OFFLINE-13-006 (3d), CLI-PLUGIN-13-007 (2d)
  • Acceptance: runtime policy CLI completes <1s for 10 images; offline kit commands resume downloads.
  • Gate: CLI plugin manifest doc; smoke tests covering new verbs.

Sprint 14 Release & Offline Ops (ID: SP14, ~2w)

Group SP14-G1 — Release Automation (ops/devops) ~0.8w

  • Tasks: DEVOPS-REL-14-001 (4d)
  • Acceptance: reproducible build diff tool shows zero drift across two runs; signing pipeline green.
  • Gate: signed manifest + provenance published.
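
A "zero drift across two runs" check can be approximated by hashing every file in two build output trees and listing the paths whose digests differ. This is a sketch under the assumption that both runs write to separate directories; it is not the project's diff tool, and the function names are illustrative:

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def build_drift(run_a, run_b):
    """Return the sorted relative paths that are missing or differ between two runs."""
    a, b = tree_digests(run_a), tree_digests(run_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

The acceptance metric corresponds to `build_drift(run_a, run_b) == []`.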

Group SP14-G2 — Offline Kit Packaging (ops/offline-kit) ~0.6w

  • Tasks: DEVOPS-OFFLINE-14-002 (3d)
  • Acceptance: kit import <5min with integrity verification CLI.
  • Gate: kit doc updated; import script included.

Group SP14-G3 — Deployment Playbooks (ops/deployment) ~0.4w

  • Tasks: DEVOPS-OPS-14-003 (2d)
  • Acceptance: rollback drill recorded; compatibility matrix produced.
  • Gate: playbook PR merged with Ops sign-off.

Group SP14-G4 — Licensing Token Service (ops/licensing) ~0.4w

  • Tasks: DEVOPS-LIC-14-004 (2d)
  • Acceptance: token service handles 100 req/min; revocation latency <60s.
  • Gate: monitoring dashboard links; failover doc.

Sprint 15 Notify Foundations (ID: SP15, ~3w)

Group SP15-G1 — Models & Storage (src/StellaOps.Notify.Models + Storage.Mongo) ~0.8w

  • Tasks: NOTIFY-MODELS-15-101 (2d), -102 (2d), -103 (1d); NOTIFY-STORAGE-15-201 (3d), -202 (2d), -203 (1d)
  • Acceptance: rule CRUD latency <120ms; delivery retention job verified.
  • Gate: schema docs + fixtures published.

Group SP15-G2 — Engine & Queue (src/StellaOps.Notify.Engine + Queue) ~0.8w

  • Tasks: NOTIFY-ENGINE-15-301..304, NOTIFY-QUEUE-15-401..403
  • Acceptance: rules evaluation ≥5k events/min; queue dead-letter <0.5%.
  • Gate: digest outputs committed; queue config doc updated.

Group SP15-G3 — WebService & Worker (src/StellaOps.Notify.WebService + Worker) ~0.8w

  • Tasks: NOTIFY-WEB-15-101..104, NOTIFY-WORKER-15-201..204
  • Acceptance: API p95 <120ms; worker delivery success ≥99%.
  • Gate: end-to-end fixture run producing delivery record.

Group SP15-G4 — Channel Plug-ins (src/StellaOps.Notify.Connectors.*) ~0.6w

  • Tasks: NOTIFY-CONN-SLACK-15-501..503, NOTIFY-CONN-TEAMS-15-601..603, NOTIFY-CONN-EMAIL-15-701..703, NOTIFY-CONN-WEBHOOK-15-801..803
  • Acceptance: channel-specific retry policies verified; rate limits respected.
  • Gate: plug-in manifests inside plugins/notify/**; test-send docs.

Group SP15-G5 — Events & Benchmarks (src/StellaOps.Scanner.WebService + bench) ~0.5w

  • Tasks: SCANNER-EVENTS-15-201 (2d), BENCH-NOTIFY-15-001 (2d)
  • Acceptance: event emission latency <100ms; throughput bench results stored.
  • Gate: docs/events/samples/ contains sample payloads; bench CSV in repo.

Sprint 16 Scheduler Intelligence (ID: SP16, ~4w)

Group SP16-G1 — Models & Storage (src/StellaOps.Scheduler.Models + Storage.Mongo) ~1w

  • Tasks: SCHED-MODELS-16-101 (3d), -102 (2d), -103 (2d); SCHED-STORAGE-16-201 (3d), -202 (2d), -203 (2d)
  • Acceptance: schedule CRUD latency <120ms; run retention TTL enforced.
  • Gate: schema doc + integration tests passing.

Group SP16-G2 — ImpactIndex & Queue (src/StellaOps.Scheduler.ImpactIndex + Queue + Bench) ~1.2w

  • Tasks: SCHED-IMPACT-16-300 (2d, DOING), SCHED-IMPACT-16-301 (3d), -302 (3d), -303 (2d); SCHED-QUEUE-16-401..403 (each 2d); BENCH-IMPACT-16-001 (2d)
  • Acceptance: impact resolve 10k productKeys <300ms hot; stub removed by sprint end.
  • Gate: roaring snapshot stored; bench CSV published; removal plan for stub recorded.

Group SP16-G3 — Scheduler WebService (src/StellaOps.Scheduler.WebService) ~0.8w

  • Tasks: SCHED-WEB-16-101..104 (each 2d)
  • Acceptance: preview endpoint <250ms; webhook security enforced.
  • Gate: OpenAPI published; dry-run JSON fixtures stored.

Group SP16-G4 — Scheduler Worker (src/StellaOps.Scheduler.Worker) ~1w

  • Tasks: SCHED-WORKER-16-201 (3d), -202 (2d), -203 (3d), -204 (2d), -205 (2d)
  • Acceptance: planner fairness metrics captured; runner success ≥98% across 1k sims.
  • Gate: event emission to Notify verified; metrics dashboards live.

Sprint 17 Symbol Intelligence & Forensics (ID: SP17, ~2.5w)

Group SP17-G1 — Scanner Forensics (src/StellaOps.Scanner.Emit + WebService) ~1.2w

  • Tasks: SCANNER-EMIT-17-701 (4d), SCANNER-RUNTIME-17-401 (3d)
  • Acceptance: forensic overlays add ≤150ms per image; runtime API exposes symbol hints with feature flag.
  • Gate: forensic SBOM samples committed; API doc updated.

Group SP17-G2 — Zastava Observability (src/StellaOps.Zastava.Observer) ~0.6w

  • Tasks: ZASTAVA-OBS-17-005 (3d)
  • Acceptance: new telemetry surfaces symbol diffs; observer CPU <10% under load.
  • Gate: Grafana dashboard export, alert thresholds defined.

Group SP17-G3 — Release Hardening (ops/devops) ~0.4w

  • Tasks: DEVOPS-REL-17-002 (2d)
  • Acceptance: deterministic build verifier job updated to include forensics artifacts.
  • Gate: CI pipeline stage forensics-verify green.

Group SP17-G4 — Documentation (docs/) ~0.3w

  • Tasks: DOCS-RUNTIME-17-004 (2d)
  • Acceptance: runtime forensic guide published with troubleshooting.
  • Gate: docs review sign-off; links added to UI help.

Integration Buffers

  • INT-A (0.3w, after SP10): Image → SBOM → BOM-Index → Scheduler preview → UI dry-run using fixtures.
  • INT-B (0.3w, after SP11 & SP15): SBOM → policy verdict → signed DSSE → Rekor entry → Notify delivery end-to-end.

Parallelisation Strategy

  • SP9 core modules and SP11 authority upgrades can progress in parallel; scanner clients rely on feature flags while DPoP/mTLS hardening lands.
  • SP10 SBOM emission may start alongside Scheduler ImpactIndex using samples/ fixtures; stub SCHED-IMPACT-16-300 keeps velocity while awaiting roaring index.
  • Notify foundations (SP15) can begin once event schemas freeze (delivered in SP9-G9/SP12-G4), consuming canned events until Scanner emits live ones.
  • UI (SP13) uses mocked endpoints early, decoupling front-end delivery from backend readiness.
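
The gates named in the group sections form a dependency graph, and groups with no unmet gate can run in the same wave. A sketch using Python's standard `graphlib`, restricted to the group-level gates stated explicitly in this plan (mapping "Attestor/UI" to SP11-G3 and SP11-G4 is an interpretation):

```python
from graphlib import TopologicalSorter

# group -> set of groups whose gate artifacts must exist first
GATES = {
    "SP9-G4": {"SP9-G1"},     # Gate SP9-G1 -> WebService
    "SP11-G3": {"SP11-G2"},   # Gate SP11-G2 -> Attestor
    "SP11-G4": {"SP11-G2"},   # Gate SP11-G2 -> UI
    "SP13-G1": {"SP11-G4"},   # Gate SP11-G4 -> SP13-G1
}


def parallel_waves(gates):
    """Return lists of groups; each list can start once the previous one is done."""
    sorter = TopologicalSorter(gates)
    sorter.prepare()  # raises graphlib.CycleError if the gates are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the gates listed here this yields three waves: SP11-G2 and SP9-G1 first, then SP11-G3, SP11-G4 and SP9-G4, then SP13-G1. Sprint-level gates such as SP9-G4 → SP10/SP11 could be added once they are resolved to specific groups.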

Risk Registry

| Risk ID | Description | Owner | Mitigation | Trigger |
|---------|-------------|-------|------------|---------|
| R1 | BOM-Index memory blow-up on large fleets | Scheduler ImpactIndex Guild | Shard + mmap plan; monitor BENCH-IMPACT-16-001 | RAM > 8GB in bench |
| R2 | Buildx plugin latency regression | BuildX Guild | DEVOPS-PERF-10-001 guard; fallback to post-build scan | Buildx job >300ms/layer |
| R3 | Notify digests flooding Slack | Notify Engine Guild | Throttle defaults, BENCH-NOTIFY-15-001 coverage | Dropped messages >1% |
| R4 | Policy precedence confusion | Policy Guild | ADR, preview API, unit tests | Operator escalation about precedence |
| R5 | ImpactIndex stub lingers | Scheduler ImpactIndex Guild | Track SCHED-IMPACT-16-300 removal in sprint review | Stub present past SP16 |
| R6 | Symbol forensics slows runtime | Scanner Emit Guild | Feature flag; perf tests in SP17-G1 | Forensics adds >150ms/image |

Envelope & ADR Governance

  • Event schemas (docs/events/*.json) versioned; producers must bump suffix on breaking changes.
  • ADR template (docs/adr/0000-template.md) mandatory for BOM-Index format, event envelopes, DPoP nonce policy, Rekor migration.
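
The "bump suffix on breaking changes" rule could look like the sketch below. It assumes schema identifiers carry an integer version after an `@`, in the style of `policy-schema@1` mentioned earlier; the actual suffix convention for event schemas is not specified in this plan, and the event name used in the example is made up:

```python
def bump_schema_id(schema_id, breaking):
    """Return the schema id to publish under: version + 1 if the change is breaking."""
    name, sep, version = schema_id.rpartition("@")
    if not sep or not name or not version.isdigit():
        raise ValueError(f"expected '<name>@<integer version>', got {schema_id!r}")
    if not breaking:
        return schema_id  # compatible changes keep the same identifier
    return f"{name}@{int(version) + 1}"
```

For example, a breaking change to a hypothetical `scanner.report.ready@1` would be published as `scanner.report.ready@2`, leaving consumers of `@1` unaffected until they migrate.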

Summary: The plan keeps high-impact artifacts (policy engine, BOM-Index, signing chain) on the critical path while unlocking parallel tracks (Notify, Scheduler, UI) through early schema freezes and fixtures. Integration buffers ensure cross-team touchpoints are validated continuously, supporting rapid iteration against competitive pressure.