# component_architecture_scheduler.md — **Stella Ops Scheduler** (2025Q4)

> **Scope.** Implementation‑ready architecture for **Scheduler**: a service that (1) **re‑evaluates** already‑cataloged images when intel changes (Feedser/Vexer/policy), (2) orchestrates **nightly** and **ad‑hoc** runs, (3) targets only the **impacted** images using the BOM‑Index, and (4) emits **report‑ready** events that downstream **Notify** fans out. Default mode is **analysis‑only** (no image pull); optional **content‑refresh** can be enabled per schedule.


---

## 0) Mission & boundaries

**Mission.** Keep scan results **current** without rescanning the world. When new advisories or VEX claims land, **pinpoint** affected images and ask the backend to recompute **verdicts** against the **existing SBOMs**. Surface only **meaningful deltas** to humans and ticket queues.

**Boundaries.**

* Scheduler **does not** compute SBOMs and **does not** sign. It calls Scanner.WebService’s **/reports (analysis‑only)** endpoint and lets the backend (Policy + Vexer + Feedser) decide PASS/FAIL.
* Scheduler **may** ask Scanner to **content‑refresh** selected targets (e.g., mutable tags), but the default is **no** image pull.
* Notifications are **not** sent directly; Scheduler emits events consumed by **Notify**.


---

## 1) Runtime shape & projects

```
src/
 ├─ StellaOps.Scheduler.WebService/      # REST (schedules CRUD, runs, admin)
 ├─ StellaOps.Scheduler.Worker/          # planners + runners (N replicas)
 ├─ StellaOps.Scheduler.ImpactIndex/     # purl→images inverted index (roaring bitmaps)
 ├─ StellaOps.Scheduler.Models/          # DTOs (Schedule, Run, ImpactSet, Deltas)
 ├─ StellaOps.Scheduler.Storage.Mongo/   # schedules, runs, cursors, locks
 ├─ StellaOps.Scheduler.Queue/           # Redis Streams / NATS abstraction
 └─ StellaOps.Scheduler.Tests.*          # unit/integration/e2e
```

**Deployables**:

* **Scheduler.WebService** (stateless)
* **Scheduler.Worker** (scale‑out; planners + executors)

**Dependencies**: Authority (OpTok + DPoP/mTLS), Scanner.WebService, Feedser, Vexer, MongoDB, Redis/NATS, (optional) Notify.


---

## 2) Core responsibilities

1. **Time‑based** runs: cron windows per tenant/timezone (e.g., “02:00 Europe/Sofia”).
2. **Event‑driven** runs: react to **Feedser export** and **Vexer export** deltas (changed product keys / advisories / claims).
3. **Impact targeting**: map changes to **image sets** using a **global inverted index** built from Scanner’s per‑image **BOM‑Index** sidecars.
4. **Run planning**: shard, pace, and rate‑limit jobs to avoid thundering herds.
5. **Execution**: call Scanner **/reports (analysis‑only)** or **/scans (content‑refresh)**; aggregate **delta** results.
6. **Events**: publish `rescan.delta` and `report.ready` summaries for **Notify** & **UI**.
7. **Control plane**: CRUD schedules, **pause/resume**, dry‑run previews, audit.


---

## 3) Data model (Mongo)

**Database**: `scheduler`

* `schedules`

```
{ _id, tenantId, name, enabled, whenCron, timezone,
  mode: "analysis-only" | "content-refresh",
  selection: { scope: "all-images" | "by-namespace" | "by-repo" | "by-digest" | "by-labels",
               includeTags?: ["prod-*"], digests?: [sha256...], resolvesTags?: bool },
  onlyIf: { lastReportOlderThanDays?: int, policyRevision?: string },
  notify: { onNewFindings: bool, minSeverity: "low|medium|high|critical", includeKEV: bool },
  limits: { maxJobs?: int, ratePerSecond?: int, parallelism?: int },
  createdAt, updatedAt, createdBy, updatedBy }
```

* `runs`

```
{ _id, scheduleId?, tenantId, trigger: "cron|feedser|vexer|manual",
  reason?: { feedserExportId?, vexerExportId?, cursor? },
  state: "planning|queued|running|completed|error|cancelled",
  stats: { candidates: int, deduped: int, queued: int, completed: int, deltas: int, newCriticals: int },
  startedAt, finishedAt, error? }
```

* `impact_cursors`

```
{ _id: tenantId, feedserLastExportId, vexerLastExportId, updatedAt }
```

* `locks` (singleton schedulers, run leases)

* `audit` (CRUD actions, run outcomes)

**Indexes**:

* `schedules` on `{tenantId, enabled}`, `{whenCron}`.
* `runs` on `{tenantId, startedAt desc}`, `{state}`.
* TTL optional for completed runs (e.g., 180 days).


---

## 4) ImpactIndex (global inverted index)

Goal: translate **change keys** → **image sets** in **milliseconds**.

**Source**: Scanner produces per‑image **BOM‑Index** sidecars (purls and `usedByEntrypoint` bitmaps). Scheduler ingests/refreshes them to build a **global** index.

**Representation**:

* Assign **image IDs** (dense ints) to catalog images.
* Keep **Roaring Bitmaps**:

  * `Contains[purl] → bitmap(imageIds)`
  * `UsedBy[purl] → bitmap(imageIds)` (subset of Contains)
* Optionally keep **Owner maps**: `{imageId → {tenantId, namespaces[], repos[]}}` for selection filters.
* Persist in RocksDB/LMDB or Redis‑modules; cache hot shards in memory; snapshot to Mongo for cold start.

**Update paths**:

* On new/updated image SBOM: **merge** per‑image set into global maps.
* On image remove/expiry: **clear** id from bitmaps.

**API (internal)**:

```csharp
public interface IImpactIndex
{
    ImpactSet ResolveByPurls(IEnumerable<string> purls, bool usageOnly, Selector sel);
    ImpactSet ResolveByVulns(IEnumerable<string> vulnIds, bool usageOnly, Selector sel); // optional (vuln→purl precomputed by Feedser)
    ImpactSet ResolveAll(Selector sel); // for nightly
}
```

**Selector filters**: tenant, namespaces, repos, labels, digest allowlists, `includeTags` patterns.

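
The `Contains` / `UsedBy` maps and their update paths can be sketched in memory as follows. This is a minimal illustration, not the service's implementation: `HashSet<int>` stands in for a Roaring bitmap so the example needs no external package, the class and method names (`InMemoryImpactIndex`, `Upsert`, `Remove`) are hypothetical, and selector filtering and persistence are left out.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: HashSet<int> stands in for a Roaring bitmap.
public sealed class InMemoryImpactIndex
{
    private readonly Dictionary<string, HashSet<int>> _contains = new(StringComparer.Ordinal);
    private readonly Dictionary<string, HashSet<int>> _usedBy = new(StringComparer.Ordinal);

    // Merge one image's BOM-Index into the global maps,
    // replacing any earlier entry for the same image ID.
    public void Upsert(int imageId, IEnumerable<string> purls, IEnumerable<string> usedByEntrypoint)
    {
        Remove(imageId);
        foreach (var purl in purls) Add(_contains, purl, imageId);
        foreach (var purl in usedByEntrypoint)
        {
            Add(_contains, purl, imageId); // keeps UsedBy a subset of Contains
            Add(_usedBy, purl, imageId);
        }
    }

    // Clear the image ID from every set (image removed or expired).
    // Linear in the number of purls; acceptable for a sketch only.
    public void Remove(int imageId)
    {
        foreach (var ids in _contains.Values) ids.Remove(imageId);
        foreach (var ids in _usedBy.Values) ids.Remove(imageId);
    }

    // Union of the image sets for all given purls.
    public IReadOnlySet<int> ResolveByPurls(IEnumerable<string> purls, bool usageOnly)
    {
        var map = usageOnly ? _usedBy : _contains;
        var result = new HashSet<int>();
        foreach (var purl in purls)
        {
            if (map.TryGetValue(purl, out var ids)) result.UnionWith(ids);
        }
        return result;
    }

    private static void Add(Dictionary<string, HashSet<int>> map, string purl, int imageId)
    {
        if (!map.TryGetValue(purl, out var ids))
        {
            ids = new HashSet<int>();
            map[purl] = ids;
        }
        ids.Add(imageId);
    }
}
```

With `usageOnly: true` the lookup reads only `UsedBy`, so an image that merely ships a package without reaching it from the entrypoint is not selected.
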

---

## 5) External interfaces (REST)

Base path: `/api/v1/scheduler` (Authority OpToks; scopes: `scheduler.read`, `scheduler.admin`).

### 5.1 Schedules CRUD

* `POST /schedules` → create
* `GET /schedules` → list (filter by tenant)
* `GET /schedules/{id}` → details + next run
* `PATCH /schedules/{id}` → pause/resume/update
* `DELETE /schedules/{id}` → delete (soft delete, optional)

### 5.2 Run control & introspection

* `POST /run` — ad‑hoc run

```json
{ "mode": "analysis-only|content-refresh", "selection": {...}, "reason": "manual" }
```

* `GET /runs` — list with paging
* `GET /runs/{id}` — status, stats, links to deltas
* `POST /runs/{id}/cancel` — best‑effort cancel

### 5.3 Previews (dry‑run)

* `POST /preview/impact` — returns **candidate count** and a small sample of impacted digests for given change keys or selection.

### 5.4 Event webhooks (optional push from Feedser/Vexer)

* `POST /events/feedser-export`

```json
{ "exportId":"...", "changedProductKeys":["pkg:rpm/openssl", ...], "kev": ["CVE-..."], "window": { "from":"...","to":"..." } }
```

* `POST /events/vexer-export`

```json
{ "exportId":"...", "changedClaims":[ { "productKey":"pkg:deb/...", "vulnId":"CVE-...", "status":"not_affected→affected"} ], ... }
```

**Security**: webhooks require **mTLS** or a signed `X-Scheduler-Signature` header (**HMAC**‑SHA‑256 or Ed25519), plus an Authority token.

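
For the HMAC option, signature checking could look like the sketch below. The source names only the `X-Scheduler-Signature` header and SHA‑256; the hex encoding, signing the raw request body, and the `WebhookSignature` helper itself are assumptions made for illustration.

```csharp
using System;
using System.Security.Cryptography;

public static class WebhookSignature
{
    // Lowercase hex HMAC-SHA-256 of the raw request body (encoding is an assumption).
    public static string Compute(byte[] secret, byte[] body)
    {
        using var hmac = new HMACSHA256(secret);
        return Convert.ToHexString(hmac.ComputeHash(body)).ToLowerInvariant();
    }

    // True when the presented X-Scheduler-Signature value matches the expected MAC.
    public static bool Verify(byte[] secret, byte[] body, string? headerValue)
    {
        if (string.IsNullOrWhiteSpace(headerValue)) return false;

        byte[] presented;
        try
        {
            presented = Convert.FromHexString(headerValue.Trim());
        }
        catch (FormatException)
        {
            return false; // not valid hex
        }

        using var hmac = new HMACSHA256(secret);
        byte[] expected = hmac.ComputeHash(body);

        // Constant-time comparison; returns false when the lengths differ.
        return CryptographicOperations.FixedTimeEquals(expected, presented);
    }
}
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature were correct. The MAC must be computed over the exact bytes received, before any JSON parsing or decompression changes them.
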

---

## 6) Planner → Runner pipeline

### 6.1 Planning algorithm (event‑driven)

```
On Export Event (Feedser/Vexer):
  keys = Normalize(change payload)          # productKeys or vulnIds→productKeys
  usageOnly = schedule/policy hint?         # default true
  sel = Selector for tenant/scope from schedules subscribed to events

  impacted = ImpactIndex.ResolveByPurls(keys, usageOnly, sel)
  impacted = ApplyOwnerFilters(impacted, sel)   # namespaces/repos/labels
  impacted = DeduplicateByDigest(impacted)
  impacted = EnforceLimits(impacted, limits.maxJobs)
  shards = Shard(impacted, byHashPrefix, n=limits.parallelism)

  For each shard:
    Enqueue RunSegment (runId, shard, rate=limits.ratePerSecond)
```

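
The dedupe, limit, and shard steps of the pseudocode might be sketched like this. `RunPlanner` and `ShardOf` are hypothetical names, and using the first 8 hex characters of the digest as the shard key is one possible reading of `byHashPrefix`. The cap here keeps the lexicographically smallest digests; KEV/critical prioritization is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunPlanner
{
    // Deduplicate digests, cap the total, and split into `parallelism` shards.
    public static List<string>[] Plan(IEnumerable<string> impactedDigests, int maxJobs, int parallelism)
    {
        if (maxJobs < 0) throw new ArgumentOutOfRangeException(nameof(maxJobs));
        if (parallelism < 1) throw new ArgumentOutOfRangeException(nameof(parallelism));

        var capped = impactedDigests
            .Distinct(StringComparer.Ordinal)          // DeduplicateByDigest
            .OrderBy(d => d, StringComparer.Ordinal)   // deterministic order, so replanning gives the same result
            .Take(maxJobs);                            // EnforceLimits

        var shards = new List<string>[parallelism];
        for (int i = 0; i < parallelism; i++) shards[i] = new List<string>();

        foreach (var digest in capped)
        {
            shards[ShardOf(digest, parallelism)].Add(digest);
        }
        return shards;
    }

    // Shard key: first 8 hex characters after the "sha256:" prefix (an assumption).
    public static int ShardOf(string digest, int shardCount)
    {
        int colon = digest.IndexOf(':');
        string hex = colon >= 0 ? digest.Substring(colon + 1) : digest;
        if (hex.Length < 8) throw new FormatException("digest is too short to shard");
        uint prefix = Convert.ToUInt32(hex.Substring(0, 8), 16);
        return (int)(prefix % (uint)shardCount);
    }
}
```

Because a digest always maps to the same shard for a given `parallelism`, a retried plan produces identical segments.
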
**Fairness & pacing**

* Use **leaky bucket** per tenant and per registry host.
* Prioritize **KEV‑tagged** and **critical** first if oversubscribed.

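
A leaky bucket can be sketched with an injected clock, which keeps it deterministic in tests. This is a hand-rolled illustration with hypothetical names; a production build would more likely use `System.Threading.RateLimiting`. One instance per tenant and one per registry host is assumed, and the class is not thread-safe.

```csharp
using System;

// Leaky bucket used as a meter: each admitted job adds one unit,
// and the bucket drains at a constant rate.
public sealed class LeakyBucket
{
    private readonly double _capacity;
    private readonly double _leakPerSecond;
    private double _level;
    private DateTimeOffset _lastUpdate;

    public LeakyBucket(double capacity, double leakPerSecond, DateTimeOffset now)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        if (leakPerSecond <= 0) throw new ArgumentOutOfRangeException(nameof(leakPerSecond));
        _capacity = capacity;
        _leakPerSecond = leakPerSecond;
        _lastUpdate = now;
    }

    // Returns true and records the job if the bucket has room; otherwise false.
    public bool TryAdmit(DateTimeOffset now)
    {
        if (now > _lastUpdate)
        {
            // Drain what leaked out since the last call.
            double elapsedSeconds = (now - _lastUpdate).TotalSeconds;
            _level = Math.Max(0, _level - elapsedSeconds * _leakPerSecond);
            _lastUpdate = now;
        }

        if (_level + 1 > _capacity) return false;
        _level += 1;
        return true;
    }
}
```

A rejected job is not dropped; the caller would requeue it or delay the segment until the bucket has drained.
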
### 6.2 Nightly planning

```
At cron tick:
  sel = resolve selection
  candidates = ImpactIndex.ResolveAll(sel)
  if lastReportOlderThanDays present → filter by report age (via Scanner catalog)
  shard & enqueue as above
```

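
The cron tick for a schedule such as “02:00 Europe/Sofia” has to be converted from wall-clock time to a UTC instant. The sketch below covers only a fixed daily time, not full cron expressions; `DailySchedule` is a hypothetical helper, and running at the first valid minute after a spring-forward gap is an assumed policy.

```csharp
using System;

public static class DailySchedule
{
    // Next UTC instant strictly after `afterUtc` at which the wall clock in `tz` reads `localTime`.
    public static DateTimeOffset NextOccurrence(DateTimeOffset afterUtc, TimeOnly localTime, TimeZoneInfo tz)
    {
        // Wall-clock date in the schedule's timezone (Kind = Unspecified).
        DateTime day = TimeZoneInfo.ConvertTime(afterUtc, tz).DateTime.Date;

        for (int i = 0; i < 3; i++)
        {
            DateTime candidate = day + localTime.ToTimeSpan();

            // Spring-forward gap: this wall-clock time does not exist on that day.
            // Assumption: run at the first valid minute after the gap.
            while (tz.IsInvalidTime(candidate)) candidate = candidate.AddMinutes(1);

            var utc = new DateTimeOffset(candidate, tz.GetUtcOffset(candidate)).ToUniversalTime();
            if (utc > afterUtc) return utc;

            day = day.AddDays(1);
        }

        throw new InvalidOperationException("no occurrence found within three days");
    }
}
```

A real zone would be obtained with `TimeZoneInfo.FindSystemTimeZoneById("Europe/Sofia")`, which depends on timezone data being available on the host.
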
### 6.3 Execution (Runner)

* Pop **RunSegment** job → for each image digest:

  * **analysis‑only**: `POST scanner/reports { imageDigest, policyRevision? }`
  * **content‑refresh**: resolve tag→digest if needed; `POST scanner/scans { imageRef, attest? false }` then `POST /reports`
* Collect **delta**: `newFindings`, `newCriticals`/`newHigh`, `links` (UI deep link, Rekor if present).
* Persist per‑image outcome in `runs.{id}.stats` (incremental counters).
* Emit `rescan.delta` events to **Notify** only when **delta > 0** and the delta matches the schedule’s severity rule.

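
The emit rule can be read as a gate over the schedule's `notify` settings. The sketch below is one interpretation: the type names are hypothetical, and treating any KEV hit as notifiable regardless of severity when `includeKEV` is set is an assumption the source does not spell out.

```csharp
using System;
using System.Collections.Generic;

public enum Severity { Low = 0, Medium = 1, High = 2, Critical = 3 }

// New findings per severity for one image, plus the number of KEV-listed hits.
public sealed record ImageDelta(IReadOnlyDictionary<Severity, int> NewFindings, int KevHits);

// Mirrors schedules.notify: { onNewFindings, minSeverity, includeKEV }.
public sealed record NotifyRule(bool OnNewFindings, Severity MinSeverity, bool IncludeKev);

public static class DeltaGate
{
    // True when the delta is non-empty and passes the severity rule.
    public static bool ShouldEmit(ImageDelta delta, NotifyRule rule)
    {
        if (!rule.OnNewFindings) return false;

        // Assumption: KEV hits notify regardless of severity.
        if (rule.IncludeKev && delta.KevHits > 0) return true;

        foreach (var entry in delta.NewFindings)
        {
            if (entry.Value > 0 && entry.Key >= rule.MinSeverity) return true;
        }
        return false;
    }
}
```
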

---

## 7) Event model (outbound)

**Topic**: `rescan.delta` (internal bus → Notify; UI subscribes via backend).

```json
{
  "tenant": "tenant-01",
  "runId": "324af…",
  "imageDigest": "sha256:…",
  "newCriticals": 1,
  "newHigh": 2,
  "kevHits": ["CVE-2025-..."],
  "topFindings": [
    { "purl":"pkg:rpm/openssl@3.0.12-...","vulnId":"CVE-2025-...","severity":"critical","link":"https://ui/scans/..." }
  ],
  "reportUrl": "https://ui/.../scans/sha256:.../report",
  "attestation": { "uuid":"rekor-uuid", "verified": true },
  "ts": "2025-10-18T03:12:45Z"
}
```

**Also**: `report.ready` for “no‑change” summaries (digest + zero delta), which Notify can ignore by rule.

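
A typed view of the `rescan.delta` payload might look like the following sketch, using `System.Text.Json` with web defaults so that property names serialize in camelCase as in the example above. The record names are hypothetical, and `Attestation` is nullable on the assumption that it is absent when no Rekor entry exists.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public sealed record TopFinding(string Purl, string VulnId, string Severity, string Link);

public sealed record AttestationRef(string Uuid, bool Verified);

public sealed record RescanDelta(
    string Tenant,
    string RunId,
    string ImageDigest,
    int NewCriticals,
    int NewHigh,
    IReadOnlyList<string> KevHits,
    IReadOnlyList<TopFinding> TopFindings,
    string ReportUrl,
    AttestationRef? Attestation,
    DateTimeOffset Ts);

public static class RescanDeltaJson
{
    // Web defaults: camelCase property names, case-insensitive reads.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web);

    public static string Serialize(RescanDelta e) => JsonSerializer.Serialize(e, Options);

    public static RescanDelta? Deserialize(string json) =>
        JsonSerializer.Deserialize<RescanDelta>(json, Options);
}
```
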

---

## 8) Security posture

* **AuthN/Z**: Authority OpToks with `aud=scheduler`; DPoP (preferred) or mTLS.
* **Multi‑tenant**: every schedule, run, and event carries `tenantId`; ImpactIndex filters by tenant‑visible images.
* **Webhook** callers (Feedser/Vexer) present **mTLS** or **HMAC** and Authority token.
* **Input hardening**: size caps on changed key lists; reject >100k keys per event; compress (zstd/gzip) allowed with limits.
* **No secrets** in logs; redact tokens and signatures.


---

## 9) Observability & SLOs

**Metrics (Prometheus)**

* `scheduler.events_total{source, result}`
* `scheduler.impact_resolve_seconds{quantile}`
* `scheduler.images_selected_total{mode}`
* `scheduler.jobs_enqueued_total{mode}`
* `scheduler.run_latency_seconds{quantile}` (event → first verdict)
* `scheduler.delta_images_total{severity}`
* `scheduler.rate_limited_total{reason}`

**Targets**

* Resolve 10k changed keys → impacted set in **<300 ms** (hot cache).
* Event → first rescan verdict in **≤60 s** (p95).
* Nightly coverage 50k images in **≤10 min** with 10 workers (analysis‑only).

**Tracing** (OTEL): spans `plan`, `resolve`, `enqueue`, `report_call`, `persist`, `emit`.


---

## 10) Configuration (YAML)

```yaml
scheduler:
  authority:
    issuer: "https://authority.internal"
    require: "dpop"            # or "mtls"
  queue:
    kind: "redis"              # or "nats"
    url: "redis://redis:6379/4"
  mongo:
    uri: "mongodb://mongo/scheduler"
  impactIndex:
    storage: "rocksdb"         # "rocksdb" | "redis" | "memory"
    warmOnStart: true
    usageOnlyDefault: true
  limits:
    defaultRatePerSecond: 50
    defaultParallelism: 8
    maxJobsPerRun: 50000
  integrates:
    scannerUrl: "https://scanner-web.internal"
    feedserWebhook: true
    vexerWebhook: true
  notifications:
    emitBus: "internal"        # deliver to Notify via internal bus
```


---

## 11) UI touch‑points

* **Schedules** page: CRUD, enable/pause, next run, last run stats, mode (analysis/content), selector preview.
* **Runs** page: timeline; heat‑map of deltas; drill‑down to affected images.
* **Dry‑run preview** modal: “This Feedser export touches ~3,214 images; projected deltas: ~420 (34 KEV).”


---

## 12) Failure modes & degradations

| Condition | Behavior |
| --- | --- |
| ImpactIndex cold / incomplete | Fall back to **All** selection for nightly; for events, cap to KEV+critical until warmed |
| Feedser/Vexer webhook storm | Coalesce by exportId; debounce 30–60 s; keep last |
| Scanner under load (429) | Back off with jitter; respect the per‑tenant leaky bucket |
| Oversubscription (too many impacted) | Prioritize KEV/critical first; spillover to next window; UI banner shows backlog |
| Notify down | Buffer outbound events in queue (TTL 24h) |
| Mongo slow | Cut batch sizes; sample‑log; alert ops; don’t drop runs unless critical |

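
For the Scanner 429 case, backoff with jitter could be sketched as below. This uses the "full jitter" variant (a uniformly random delay up to an exponentially growing ceiling), which is a choice made for this example rather than something the source specifies; the `Backoff` name, base delay, and cap are assumptions.

```csharp
using System;

public static class Backoff
{
    // Full jitter: uniform random delay in [0, min(cap, baseDelay * 2^attempt)).
    public static TimeSpan NextDelay(int attempt, TimeSpan baseDelay, TimeSpan cap, Random rng)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Clamp the exponent so the multiplication cannot overflow to infinity.
        double exponential = baseDelay.TotalMilliseconds * Math.Pow(2, Math.Min(attempt, 30));
        double ceilingMs = Math.Min(cap.TotalMilliseconds, exponential);

        return TimeSpan.FromMilliseconds(rng.NextDouble() * ceilingMs);
    }
}
```

Randomizing the whole delay, instead of adding a small offset to a fixed one, spreads retries from many runners so they do not hit Scanner again at the same moment.
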

---

## 13) Testing matrix

* **ImpactIndex**: correctness (purl→image sets), performance, persistence after restart, memory pressure with 1M purls.
* **Planner**: dedupe, shard, fairness, limit enforcement, KEV prioritization.
* **Runner**: parallel report calls, error backoff, partial failures, idempotency.
* **End‑to‑end**: Feedser export → deltas visible in UI in ≤60 s.
* **Security**: webhook auth (mTLS/HMAC), DPoP nonce dance, tenant isolation.
* **Chaos**: drop scanner availability; simulate registry throttles (content‑refresh mode).
* **Nightly**: cron tick correctness across timezones and DST.


---

## 14) Implementation notes

* **Language**: .NET 10 minimal API; Channels‑based pipeline; `System.Threading.RateLimiting`.
* **Bitmaps**: Roaring via `RoaringBitmap` bindings; memory‑map large shards if RocksDB used.
* **Cron**: Quartz‑style parser with timezone support; clock skew tolerated ±60 s.
* **Dry‑run**: use ImpactIndex only; never call scanner.
* **Idempotency**: run segments carry deterministic keys; retries safe.
* **Backpressure**: per‑tenant buckets; per‑host registry budgets respected when content‑refresh enabled.

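
One way to derive a deterministic run-segment key is to hash a canonical form of the segment's identity. The sketch below assumes the key covers the run ID, the shard index, and the sorted set of digests; both that composition and the `SegmentKey` name are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class SegmentKey
{
    // Lowercase hex SHA-256 over runId, shard index, and the sorted, de-duplicated digests.
    public static string For(string runId, int shardIndex, IEnumerable<string> digests)
    {
        var parts = new List<string>
        {
            runId,
            shardIndex.ToString(CultureInfo.InvariantCulture)
        };

        // Sorting makes the key independent of enumeration order.
        parts.AddRange(digests
            .Distinct(StringComparer.Ordinal)
            .OrderBy(d => d, StringComparer.Ordinal));

        string canonical = string.Join("\n", parts);
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```

A queue consumer can then treat a segment whose key it has already completed as a no-op, which is what makes retries safe.
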

---

## 15) Sequences (representative)

**A) Event‑driven rescan (Feedser delta)**

```mermaid
sequenceDiagram
  autonumber
  participant FE as Feedser
  participant SCH as Scheduler.Worker
  participant IDX as ImpactIndex
  participant SC as Scanner.WebService
  participant NO as Notify

  FE->>SCH: POST /events/feedser-export {exportId, changedProductKeys}
  SCH->>IDX: ResolveByPurls(keys, usageOnly=true, sel)
  IDX-->>SCH: bitmap(imageIds) → digests list
  SCH->>SC: POST /reports {imageDigest} (batch/sequenced)
  SC-->>SCH: report deltas (new criticals/highs)
  alt delta>0
    SCH->>NO: rescan.delta {digest, newCriticals, links}
  end
```

**B) Nightly rescan**

```mermaid
sequenceDiagram
  autonumber
  participant CRON as Cron
  participant SCH as Scheduler.Worker
  participant IDX as ImpactIndex
  participant SC as Scanner.WebService

  CRON->>SCH: tick (02:00 Europe/Sofia)
  SCH->>IDX: ResolveAll(selector)
  IDX-->>SCH: candidates
  SCH->>SC: POST /reports {digest} (paced)
  SC-->>SCH: results
  SCH-->>SCH: aggregate, store run stats
```

**C) Content‑refresh (tag followers)**

```mermaid
sequenceDiagram
  autonumber
  participant SCH as Scheduler
  participant SC as Scanner
  SCH->>SC: resolve tag→digest (if changed)
  alt digest changed
    SCH->>SC: POST /scans {imageRef} (new SBOM)
    SC-->>SCH: scan complete (artifacts)
    SCH->>SC: POST /reports {imageDigest}
  else unchanged
    SCH->>SC: POST /reports {imageDigest} (analysis-only)
  end
```


---

## 16) Roadmap

* **Vuln‑centric impact**: pre‑join vuln→purl→images to rank by **KEV** and **exploited‑in‑the‑wild** signals.
* **Policy diff preview**: when a staged policy changes, show projected breakage set before promotion.
* **Cross‑cluster federation**: one Scheduler instance driving many Scanner clusters (tenant isolation).
* **Windows containers**: integrate Zastava runtime hints for Usage view tightening.


---

**End — component_architecture_scheduler.md**