Full implementation plan (first draft)

This commit is contained in:
2025-10-19 00:28:48 +03:00
parent 6524626230
commit c4980d9625
125 changed files with 5438 additions and 166 deletions


@@ -0,0 +1,424 @@
# component_architecture_scheduler.md — **StellaOps Scheduler** (2025Q4)
> **Scope.** Implementation-ready architecture for **Scheduler**: a service that (1) **re-evaluates** already-cataloged images when intel changes (Feedser/Vexer/policy), (2) orchestrates **nightly** and **ad-hoc** runs, (3) targets only the **impacted** images using the BOM-Index, and (4) emits **report-ready** events that downstream **Notify** fans out. Default mode is **analysis-only** (no image pull); optional **content-refresh** can be enabled per schedule.
---
## 0) Mission & boundaries
**Mission.** Keep scan results **current** without rescanning the world. When new advisories or VEX claims land, **pinpoint** affected images and ask the backend to recompute **verdicts** against the **existing SBOMs**. Surface only **meaningful deltas** to humans and ticket queues.
**Boundaries.**
* Scheduler **does not** compute SBOMs and **does not** sign. It calls Scanner.WebService's **/reports (analysis-only)** endpoint and lets the backend (Policy + Vexer + Feedser) decide PASS/FAIL.
* Scheduler **may** ask Scanner to **content-refresh** selected targets (e.g., mutable tags), but the default is **no** image pull.
* Notifications are **not** sent directly; Scheduler emits events consumed by **Notify**.
---
## 1) Runtime shape & projects
```
src/
├─ StellaOps.Scheduler.WebService/ # REST (schedules CRUD, runs, admin)
├─ StellaOps.Scheduler.Worker/ # planners + runners (N replicas)
├─ StellaOps.Scheduler.ImpactIndex/ # purl→images inverted index (roaring bitmaps)
├─ StellaOps.Scheduler.Models/ # DTOs (Schedule, Run, ImpactSet, Deltas)
├─ StellaOps.Scheduler.Storage.Mongo/ # schedules, runs, cursors, locks
├─ StellaOps.Scheduler.Queue/ # Redis Streams / NATS abstraction
├─ StellaOps.Scheduler.Tests.* # unit/integration/e2e
```
**Deployables**:
* **Scheduler.WebService** (stateless)
* **Scheduler.Worker** (scale-out; planners + executors)
**Dependencies**: Authority (OpTok + DPoP/mTLS), Scanner.WebService, Feedser, Vexer, MongoDB, Redis/NATS, (optional) Notify.
---
## 2) Core responsibilities
1. **Time-based** runs: cron windows per tenant/timezone (e.g., “02:00 Europe/Sofia”).
2. **Event-driven** runs: react to **Feedser export** and **Vexer export** deltas (changed product keys / advisories / claims).
3. **Impact targeting**: map changes to **image sets** using a **global inverted index** built from Scanner's per-image **BOM-Index** sidecars.
4. **Run planning**: shard, pace, and rate-limit jobs to avoid thundering herds.
5. **Execution**: call Scanner **/reports (analysis-only)** or **/scans (content-refresh)**; aggregate **delta** results.
6. **Events**: publish `rescan.delta` and `report.ready` summaries for **Notify** & **UI**.
7. **Control plane**: CRUD schedules, **pause/resume**, dry-run previews, audit.
---
## 3) Data model (Mongo)
**Database**: `scheduler`
* `schedules`
```
{ _id, tenantId, name, enabled, whenCron, timezone,
mode: "analysis-only" | "content-refresh",
selection: { scope: "all-images" | "by-namespace" | "by-repo" | "by-digest" | "by-labels",
includeTags?: ["prod-*"], digests?: [sha256...], resolvesTags?: bool },
onlyIf: { lastReportOlderThanDays?: int, policyRevision?: string },
notify: { onNewFindings: bool, minSeverity: "low|medium|high|critical", includeKEV: bool },
limits: { maxJobs?: int, ratePerSecond?: int, parallelism?: int },
createdAt, updatedAt, createdBy, updatedBy }
```
* `runs`
```
{ _id, scheduleId?, tenantId, trigger: "cron|feedser|vexer|manual",
reason?: { feedserExportId?, vexerExportId?, cursor? },
state: "planning|queued|running|completed|error|cancelled",
stats: { candidates: int, deduped: int, queued: int, completed: int, deltas: int, newCriticals: int },
startedAt, finishedAt, error? }
```
* `impact_cursors`
```
{ _id: tenantId, feedserLastExportId, vexerLastExportId, updatedAt }
```
* `locks` (singleton schedulers, run leases)
* `audit` (CRUD actions, run outcomes)
**Indexes**:
* `schedules` on `{tenantId, enabled}`, `{whenCron}`.
* `runs` on `{tenantId, startedAt desc}`, `{state}`.
* TTL optional for completed runs (e.g., 180 days).
---
## 4) ImpactIndex (global inverted index)
Goal: translate **change keys** → **image sets** in **milliseconds**.
**Source**: Scanner produces per-image **BOM-Index** sidecars (purls, and `usedByEntrypoint` bitmaps). Scheduler ingests/refreshes them to build a **global** index.
**Representation**:
* Assign **image IDs** (dense ints) to catalog images.
* Keep **Roaring Bitmaps**:
* `Contains[purl] → bitmap(imageIds)`
* `UsedBy[purl] → bitmap(imageIds)` (subset of Contains)
* Optionally keep **Owner maps**: `{imageId → {tenantId, namespaces[], repos[]}}` for selection filters.
* Persist in RocksDB/LMDB or Redis modules; cache hot shards in memory; snapshot to Mongo for cold start.
**Update paths**:
* On new/updated image SBOM: **merge** the per-image set into the global maps.
* On image remove/expiry: **clear** id from bitmaps.
**API (internal)**:
```csharp
IImpactIndex {
ImpactSet ResolveByPurls(IEnumerable<string> purls, bool usageOnly, Selector sel);
ImpactSet ResolveByVulns(IEnumerable<string> vulnIds, bool usageOnly, Selector sel); // optional (vuln->purl precomputed by Feedser)
ImpactSet ResolveAll(Selector sel); // for nightly
}
```
**Selector filters**: tenant, namespaces, repos, labels, digest allowlists, `includeTags` patterns.
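The inverted-index idea above can be shown in miniature. The following Python mock is a sketch only (hypothetical names, plain sets standing in for Roaring Bitmaps over dense image IDs); it illustrates the merge/remove/resolve flow, not the .NET implementation:

```python
from collections import defaultdict

class ImpactIndexSketch:
    """Toy purl -> image-id index; sets emulate the Roaring Bitmaps."""

    def __init__(self):
        self.contains = defaultdict(set)   # Contains[purl] -> {imageId}
        self.used_by = defaultdict(set)    # UsedBy[purl] -> subset of Contains

    def merge_image(self, image_id, purls, used_purls):
        # Update path: fold one image's BOM-Index into the global maps.
        for p in purls:
            self.contains[p].add(image_id)
        for p in used_purls:
            self.used_by[p].add(image_id)

    def remove_image(self, image_id):
        # Update path: image removed/expired -> clear its id everywhere.
        for m in (self.contains, self.used_by):
            for ids in m.values():
                ids.discard(image_id)

    def resolve_by_purls(self, purls, usage_only=True):
        source = self.used_by if usage_only else self.contains
        impacted = set()
        for p in purls:
            impacted |= source.get(p, set())
        return impacted

idx = ImpactIndexSketch()
idx.merge_image(1, {"pkg:rpm/openssl", "pkg:rpm/zlib"}, {"pkg:rpm/openssl"})
idx.merge_image(2, {"pkg:rpm/zlib"}, set())
print(idx.resolve_by_purls(["pkg:rpm/openssl"]))                  # {1}
print(idx.resolve_by_purls(["pkg:rpm/zlib"], usage_only=False))   # {1, 2}
```

The real service would union pre-built bitmaps per purl rather than iterating members, which is what keeps resolution in the millisecond range.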
---
## 5) External interfaces (REST)
Base path: `/api/v1/scheduler` (Authority OpToks; scopes: `scheduler.read`, `scheduler.admin`).
### 5.1 Schedules CRUD
* `POST /schedules` → create
* `GET /schedules` → list (filter by tenant)
* `GET /schedules/{id}` → details + next run
* `PATCH /schedules/{id}` → pause/resume/update
* `DELETE /schedules/{id}` → delete (soft delete, optional)
### 5.2 Run control & introspection
* `POST /run` — ad-hoc run
```json
{ "mode": "analysis-only|content-refresh", "selection": {...}, "reason": "manual" }
```
* `GET /runs` — list with paging
* `GET /runs/{id}` — status, stats, links to deltas
* `POST /runs/{id}/cancel` — best-effort cancel
### 5.3 Previews (dry-run)
* `POST /preview/impact` — returns **candidate count** and a small sample of impacted digests for given change keys or selection.
### 5.4 Event webhooks (optional push from Feedser/Vexer)
* `POST /events/feedser-export`
```json
{ "exportId":"...", "changedProductKeys":["pkg:rpm/openssl", ...], "kev": ["CVE-..."], "window": { "from":"...","to":"..." } }
```
* `POST /events/vexer-export`
```json
{ "exportId":"...", "changedClaims":[ { "productKey":"pkg:deb/...", "vulnId":"CVE-...", "status":"not_affected→affected"} ], ... }
```
**Security**: webhook callers must present **mTLS** or a signed `X-Scheduler-Signature` header (HMAC-SHA-256 or an Ed25519 signature) plus an Authority token.
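For the HMAC variant, verification reduces to a constant-time digest comparison over the raw request body. A minimal Python sketch (the `sha256=<hex>` header format is an assumption for illustration, not a documented contract):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check an assumed `X-Scheduler-Signature: sha256=<hex>` header."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    provided = header_value.removeprefix("sha256=")
    # compare_digest avoids leaking the expected value via timing differences
    return hmac.compare_digest(expected, provided)

secret = b"shared-webhook-secret"
body = b'{"exportId":"exp-123","changedProductKeys":["pkg:rpm/openssl"]}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, header))         # True
print(verify_signature(secret, b"tampered", header))  # False
```

Verification must run over the exact raw bytes received, before any JSON parsing or decompression.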
---
## 6) Planner → Runner pipeline
### 6.1 Planning algorithm (event-driven)
```
On Export Event (Feedser/Vexer):
keys = Normalize(change payload) # productKeys or vulnIds→productKeys
usageOnly = schedule/policy hint? # default true
sel = Selector for tenant/scope from schedules subscribed to events
impacted = ImpactIndex.ResolveByPurls(keys, usageOnly, sel)
impacted = ApplyOwnerFilters(impacted, sel) # namespaces/repos/labels
impacted = DeduplicateByDigest(impacted)
impacted = EnforceLimits(impacted, limits.maxJobs)
shards = Shard(impacted, byHashPrefix, n=limits.parallelism)
For each shard:
Enqueue RunSegment (runId, shard, rate=limits.ratePerSecond)
```
**Fairness & pacing**
* Use a **leaky bucket** per tenant and per registry host.
* Prioritize **KEV-tagged** and **critical** findings first if oversubscribed.
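The dedupe → limit → shard steps of the plan above can be sketched as follows (a simplified Python model; sharding by a hash prefix of the digest keeps placement deterministic across retries):

```python
import hashlib

def plan_segments(digests, max_jobs, parallelism):
    """Dedupe candidates, enforce the job cap, and shard by digest hash prefix."""
    unique = sorted(set(digests))[:max_jobs]       # DeduplicateByDigest + EnforceLimits
    shards = [[] for _ in range(parallelism)]
    for d in unique:
        prefix = int(hashlib.sha256(d.encode()).hexdigest()[:8], 16)
        shards[prefix % parallelism].append(d)     # deterministic placement
    return [s for s in shards if s]                # drop empty shards

impacted = [f"sha256:{i:04x}" for i in range(100)] * 2   # duplicates on purpose
segments = plan_segments(impacted, max_jobs=50, parallelism=8)
print(sum(len(s) for s in segments))  # 50
```

Deterministic sharding matters for idempotency: a re-planned run produces the same segments, so retried enqueues collapse instead of doubling work.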
### 6.2 Nightly planning
```
At cron tick:
sel = resolve selection
candidates = ImpactIndex.ResolveAll(sel)
if lastReportOlderThanDays present → filter by report age (via Scanner catalog)
shard & enqueue as above
```
### 6.3 Execution (Runner)
* Pop a **RunSegment** job → for each image digest:
  * **analysis-only**: `POST scanner/reports { imageDigest, policyRevision? }`
  * **content-refresh**: resolve tag→digest if needed; `POST scanner/scans { imageRef, attest? false }` then `POST /reports`
* Collect the **delta**: `newFindings`, `newCriticals`/`highs`, `links` (UI deep link, Rekor if present).
* Persist per-image outcomes in `runs.{id}.stats` (incremental counters).
* Emit `scheduler.rescan.delta` events to **Notify** only when **delta > 0** and the delta matches the severity rule.
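The "delta > 0 and matches severity rule" gate could look like the sketch below. Field names mirror the schedule's `notify` block (`minSeverity`, `includeKEV`); the ranking map and function shape are assumptions for illustration:

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def should_emit(delta: dict, min_severity: str = "high", include_kev: bool = True) -> bool:
    """Emit rescan.delta only when new findings meet the schedule's notify rule."""
    if include_kev and delta.get("kevHits"):
        return True                     # KEV hits always notify when enabled
    threshold = SEVERITY_RANK[min_severity]
    return any(SEVERITY_RANK[f["severity"]] >= threshold
               for f in delta.get("newFindings", []))

delta = {"newFindings": [{"severity": "medium"}], "kevHits": []}
print(should_emit(delta))                          # False: below "high"
print(should_emit(delta, min_severity="medium"))   # True
```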
---
## 7) Event model (outbound)
**Topic**: `rescan.delta` (internal bus → Notify; UI subscribes via backend).
```json
{
"tenant": "tenant-01",
"runId": "324af…",
"imageDigest": "sha256:…",
"newCriticals": 1,
"newHigh": 2,
"kevHits": ["CVE-2025-..."],
"topFindings": [
{ "purl":"pkg:rpm/openssl@3.0.12-...","vulnId":"CVE-2025-...","severity":"critical","link":"https://ui/scans/..." }
],
"reportUrl": "https://ui/.../scans/sha256:.../report",
"attestation": { "uuid":"rekor-uuid", "verified": true },
"ts": "2025-10-18T03:12:45Z"
}
```
**Also**: `report.ready` for “no-change” summaries (digest + zero delta), which Notify can ignore by rule.
---
## 8) Security posture
* **AuthN/Z**: Authority OpToks with `aud=scheduler`; DPoP (preferred) or mTLS.
* **Multi-tenant**: every schedule, run, and event carries `tenantId`; ImpactIndex filters by tenant-visible images.
* **Webhook** callers (Feedser/Vexer) present **mTLS** or **HMAC** and an Authority token.
* **Input hardening**: size caps on changed-key lists; reject >100k keys per event; compressed payloads (zstd/gzip) allowed with limits.
* **No secrets** in logs; redact tokens and signatures.
---
## 9) Observability & SLOs
**Metrics (Prometheus)**
* `scheduler.events_total{source, result}`
* `scheduler.impact_resolve_seconds{quantile}`
* `scheduler.images_selected_total{mode}`
* `scheduler.jobs_enqueued_total{mode}`
* `scheduler.run_latency_seconds{quantile}` // event → first verdict
* `scheduler.delta_images_total{severity}`
* `scheduler.rate_limited_total{reason}`
**Targets**
* Resolve 10k changed keys → impacted set in **<300ms** (hot cache).
* Event → first rescan verdict in **≤60s** (p95).
* Nightly coverage of 50k images in **≤10min** with 10 workers (analysis-only).
**Tracing** (OTEL): spans `plan`, `resolve`, `enqueue`, `report_call`, `persist`, `emit`.
---
## 10) Configuration (YAML)
```yaml
scheduler:
authority:
issuer: "https://authority.internal"
require: "dpop" # or "mtls"
queue:
kind: "redis" # or "nats"
url: "redis://redis:6379/4"
mongo:
uri: "mongodb://mongo/scheduler"
impactIndex:
storage: "rocksdb" # "rocksdb" | "redis" | "memory"
warmOnStart: true
usageOnlyDefault: true
limits:
defaultRatePerSecond: 50
defaultParallelism: 8
maxJobsPerRun: 50000
integrates:
scannerUrl: "https://scanner-web.internal"
feedserWebhook: true
vexerWebhook: true
notifications:
emitBus: "internal" # deliver to Notify via internal bus
```
---
## 11) UI touchpoints
* **Schedules** page: CRUD, enable/pause, next run, last run stats, mode (analysis/content), selector preview.
* **Runs** page: timeline; heatmap of deltas; drill-down to affected images.
* **Dry-run preview** modal: “This Feedser export touches ~3,214 images; projected deltas: ~420 (34 KEV).”
---
## 12) Failure modes & degradations
| Condition | Behavior |
| ------------------------------------ | ---------------------------------------------------------------------------------------- |
| ImpactIndex cold / incomplete | Fall back to **All** selection for nightly; for events, cap to KEV+critical until warmed |
| Feedser/Vexer webhook storm | Coalesce by exportId; debounce 30–60s; keep the last payload |
| Scanner under load (429) | Back off with jitter; respect per-tenant leaky buckets |
| Oversubscription (too many impacted) | Prioritize KEV/critical first; spill over to the next window; UI banner shows backlog |
| Notify down | Buffer outbound events in the queue (TTL 24h) |
| Mongo slow | Cut batch sizes; sample-log; alert ops; don't drop runs unless critical |
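The "coalesce by exportId; debounce; keep last" row can be modeled as a tiny debouncer. A sketch (names hypothetical; the clock is injectable so the behavior is testable):

```python
import time

class ExportCoalescer:
    """Keep only the latest payload per exportId; release after a quiet period."""

    def __init__(self, debounce_seconds: float = 45.0):
        self.debounce = debounce_seconds
        self.pending = {}  # exportId -> (last_seen, payload)

    def offer(self, export_id, payload, now=None):
        if now is None:
            now = time.monotonic()
        self.pending[export_id] = (now, payload)   # "keep last" wins

    def drain_ready(self, now=None):
        # Return (and forget) entries whose debounce window has elapsed.
        if now is None:
            now = time.monotonic()
        ready = [eid for eid, (seen, _) in self.pending.items()
                 if now - seen >= self.debounce]
        return [(eid, self.pending.pop(eid)[1]) for eid in ready]

c = ExportCoalescer(debounce_seconds=45)
c.offer("exp-1", {"keys": 10}, now=0)
c.offer("exp-1", {"keys": 12}, now=5)   # storm: latest payload replaces earlier
print(c.drain_ready(now=20))            # [] — still inside the window
print(c.drain_ready(now=60))            # [('exp-1', {'keys': 12})]
```

Each `offer` restarts the window, so a sustained storm keeps coalescing until the source quiets down for the full debounce interval.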
---
## 13) Testing matrix
* **ImpactIndex**: correctness (purl→image sets), performance, persistence after restart, memory pressure with 1M purls.
* **Planner**: dedupe, shard, fairness, limit enforcement, KEV prioritization.
* **Runner**: parallel report calls, error backoff, partial failures, idempotency.
* **End-to-end**: Feedser export → deltas visible in the UI in ≤60s.
* **Security**: webhook auth (mTLS/HMAC), DPoP nonce dance, tenant isolation.
* **Chaos**: drop Scanner availability; simulate registry throttles (content-refresh mode).
* **Nightly**: cron-tick correctness across timezones and DST.
---
## 14) Implementation notes
* **Language**: .NET 10 minimal API; Channels-based pipeline; `System.Threading.RateLimiting`.
* **Bitmaps**: Roaring via `RoaringBitmap` bindings; memory-map large shards if RocksDB is used.
* **Cron**: Quartz-style parser with timezone support; clock skew tolerated ±60s.
* **Dry-run**: use the ImpactIndex only; never call Scanner.
* **Idempotency**: run segments carry deterministic keys; retries are safe.
* **Backpressure**: per-tenant buckets; per-host registry budgets respected when content-refresh is enabled.
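Deterministic segment keys (the idempotency note above) can be derived by hashing the run id plus the sorted shard membership, so a retried enqueue collapses onto the same queue entry. A sketch under that assumption:

```python
import hashlib

def segment_key(run_id: str, shard_digests) -> str:
    """Same run + same shard membership -> same key, regardless of order."""
    h = hashlib.sha256(run_id.encode())
    for d in sorted(shard_digests):         # sort so ordering never changes the key
        h.update(d.encode())
    return h.hexdigest()[:32]

a = segment_key("run-42", ["sha256:aa", "sha256:bb"])
b = segment_key("run-42", ["sha256:bb", "sha256:aa"])          # order-insensitive
print(a == b)                                                  # True
print(a == segment_key("run-43", ["sha256:aa", "sha256:bb"]))  # False
```

Workers can then treat the key as a dedupe token in Redis/NATS: a segment already seen under the same key is acknowledged without re-executing.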
---
## 15) Sequences (representative)
**A) Event-driven rescan (Feedser delta)**
```mermaid
sequenceDiagram
autonumber
participant FE as Feedser
participant SCH as Scheduler.Worker
participant IDX as ImpactIndex
participant SC as Scanner.WebService
participant NO as Notify
FE->>SCH: POST /events/feedser-export {exportId, changedProductKeys}
SCH->>IDX: ResolveByPurls(keys, usageOnly=true, sel)
IDX-->>SCH: bitmap(imageIds) → digests list
SCH->>SC: POST /reports {imageDigest} (batch/sequenced)
SC-->>SCH: report deltas (new criticals/highs)
alt delta>0
SCH->>NO: rescan.delta {digest, newCriticals, links}
end
```
**B) Nightly rescan**
```mermaid
sequenceDiagram
autonumber
participant CRON as Cron
participant SCH as Scheduler.Worker
participant IDX as ImpactIndex
participant SC as Scanner.WebService
CRON->>SCH: tick (02:00 Europe/Sofia)
SCH->>IDX: ResolveAll(selector)
IDX-->>SCH: candidates
SCH->>SC: POST /reports {digest} (paced)
SC-->>SCH: results
SCH-->>SCH: aggregate, store run stats
```
**C) Content-refresh (tag followers)**
```mermaid
sequenceDiagram
autonumber
participant SCH as Scheduler
participant SC as Scanner
SCH->>SC: resolve tag→digest (if changed)
alt digest changed
SCH->>SC: POST /scans {imageRef} # new SBOM
SC-->>SCH: scan complete (artifacts)
SCH->>SC: POST /reports {imageDigest}
else unchanged
SCH->>SC: POST /reports {imageDigest} # analysis-only
end
```
---
## 16) Roadmap
* **Vuln-centric impact**: pre-join vuln→purl→images to rank by **KEV** and **exploited-in-the-wild** signals.
* **Policy diff preview**: when a staged policy changes, show the projected breakage set before promotion.
* **Cross-cluster federation**: one Scheduler instance driving many Scanner clusters (tenant isolation).
* **Windows containers**: integrate Zastava runtime hints to tighten the Usage view.
---
**End — component_architecture_scheduler.md**