StellaOps Console - Runs Workspace

Audience: Scheduler Guild, Console UX, operators, support engineers.
Scope: Runs dashboard, live progress, queue management, diffs, retries, evidence downloads, observability, troubleshooting, and offline behaviour (Sprint 23).

The Runs workspace surfaces Scheduler activity across tenants: upcoming schedules, active runs, progress, deltas, and evidence bundles. It helps operators monitor backlog, drill into run segments, and recover from failures without leaving the console.

1. Access and prerequisites

Route: /console/runs (list) with detail drawer /console/runs/:runId. SSE stream at /console/runs/:runId/stream.
Scopes: runs.read (baseline), runs.manage (cancel/retry), policy:runs (view policy deltas), downloads.read (evidence bundles).
Dependencies: Scheduler WebService (/runs, /schedules, /preview), Scheduler Worker event feeds, Policy Engine run summaries, Scanner WebService evidence endpoints.
Feature flags: runs.dashboard.enabled, runs.sse.enabled, runs.retry.enabled, runs.evidenceBundles.
Tenancy: Tenant selector filters list; cross-tenant admins can pin multiple tenants side-by-side (split view).
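The scope gating above can be sketched as a simple lookup. This is a minimal illustration, not the console's actual authorisation code: the action names are hypothetical, and treating `runs.read` as a hard prerequisite for every other action is an assumption drawn from its "baseline" label.

```python
# Hypothetical action names mapped to the scopes listed in this section.
REQUIRED_SCOPE = {
    "view_runs": "runs.read",
    "cancel_run": "runs.manage",
    "retry_run": "runs.manage",
    "view_policy_deltas": "policy:runs",
    "download_evidence": "downloads.read",
}


def allowed_actions(granted_scopes):
    """Return the sorted list of actions that the granted scopes permit."""
    granted = set(granted_scopes)
    # Assumption: without the baseline runs.read scope nothing is available.
    if "runs.read" not in granted:
        return []
    return sorted(
        action for action, scope in REQUIRED_SCOPE.items() if scope in granted
    )
```

For example, a token holding `runs.read` and `runs.manage` would see the list, cancel, and retry actions, but not policy deltas or evidence downloads.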

2. Layout overview

+-------------------------------------------------------------------+
| Header: Tenant badge - schedule selector - backlog metrics        |
+-------------------------------------------------------------------+
| Cards: Active runs - Queue depth - New findings - KEV deltas      |
+-------------------------------------------------------------------+
| Tabs: Active | Completed | Scheduled | Failures                   |
+-------------------------------------------------------------------+
| Runs table (virtualised)                                          |
|  Columns: Run ID | Trigger | State | Progress | Duration | Deltas |
+-------------------------------------------------------------------+
| Detail drawer: Summary | Segments | Deltas | Evidence | Logs      |
+-------------------------------------------------------------------+

The header integrates the status ticker to show ingestion deltas and planner heartbeat.

3. Runs table

| Column | Description |
|--------|-------------|
| Run ID | Deterministic identifier (`run:<tenant>:<timestamp>:<nonce>`). Clicking opens detail drawer. |
| Trigger | `cron`, `manual`, `feedser`, `vexer`, `policy`, `content-refresh`. Tooltip lists schedule and initiator. |
| State | Badges: `planning`, `queued`, `running`, `completed`, `cancelled`, `error`. Errors include error code (e.g., `ERR_RUN_005`). |
| Progress | Percentage + processed/total candidates. SSE updates increment in real time. |
| Duration | Elapsed time (auto-updating). Completed runs show total duration; running runs show timer. |
| Deltas | Count of findings deltas (`+critical`, `+high`, `-quieted`, etc.). Tooltip expands severity breakdown. |
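
The Run ID shape `run:<tenant>:<timestamp>:<nonce>` can be split into its parts as sketched below. The timestamp encoding is not specified here, so the sketch assumes it may itself contain colons (as an ISO 8601 time would) and that the tenant and nonce do not; `RunId` and `parse_run_id` are hypothetical names.

```python
from typing import NamedTuple


class RunId(NamedTuple):
    tenant: str
    timestamp: str
    nonce: str


def parse_run_id(value: str) -> RunId:
    """Split 'run:<tenant>:<timestamp>:<nonce>' into its three parts."""
    prefix, sep, rest = value.partition(":")
    if prefix != "run" or not sep:
        raise ValueError(f"not a run identifier: {value!r}")
    # Tenant is taken from the left and nonce from the right, so any
    # colons left in the middle stay inside the timestamp.
    tenant, sep_left, remainder = rest.partition(":")
    timestamp, sep_right, nonce = remainder.rpartition(":")
    if not (sep_left and sep_right and tenant and timestamp and nonce):
        raise ValueError(f"malformed run identifier: {value!r}")
    return RunId(tenant, timestamp, nonce)
```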

Row badges include KEV first, Content refresh, Policy promotion follow-up, and Retry. Selecting multiple rows enables bulk downloads and exports.

Filters: trigger type, state, schedule, severity impact (critical/high), policy revision, timeframe, planner shard, error code.

4. Detail drawer

Sections:

Summary - run metadata (tenant, trigger, linked schedule, planner shard count, started/finished timestamps, correlation ID).
Progress - segmented progress bar (planner, queue, execution, post-processing). Real-time updates via SSE; includes throughput (targets per minute).
Segments - table of run segments with state, target count, executor, retry count. Operators can retry failed segments individually (requires runs.manage).
Deltas - summary of findings changes (new findings, resolved findings, severity shifts, KEV additions). Links to Findings view filtered by run ID.
Evidence - links to evidence bundles (JSON manifest, DSSE attestation), policy run records, and explain bundles. Download buttons use /console/exports orchestration.
Logs - last 50 structured log entries with severity, message, correlation ID; scroll-to-live for streaming logs. The "Open in logs" action copies a query for use in external log tooling.
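
The percentage and throughput figures shown in the Progress section reduce to simple arithmetic. The sketch below is illustrative only: the function name, the rounding to one decimal place, and the zero-guard behaviour are assumptions, not the console's documented rules.

```python
def progress_summary(processed, total, elapsed_seconds):
    """Compute percent complete and throughput in targets per minute."""
    # Guard against empty runs and runs that have only just started.
    percent = 0.0 if total == 0 else round(100.0 * processed / total, 1)
    per_minute = (
        0.0 if elapsed_seconds <= 0
        else round(processed * 60.0 / elapsed_seconds, 1)
    )
    return {"percent": percent, "targets_per_minute": per_minute}
```

With 150 of 600 targets processed after 90 seconds, this yields 25.0 percent and 100.0 targets per minute.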

5. Queue and schedule management

Schedule side panel lists upcoming jobs with cron expressions, time zones, and enable toggles.
Queue depth chart shows current backlog per tenant and schedule (planner backlog, executor backlog).
"Preview impact" button opens modal for manual run planning (purls or vuln IDs) and shows impacted image count before launch. CLI parity: stella runs preview --tenant <id> --file keys.json.
Manual run form allows selecting mode (analysis-only, content-refresh), scope, and optional policy snapshot.
Pausing a schedule requires confirmation; UI displays earliest next run after resume.

6. Live updates and SSE stream

SSE endpoint /console/runs/:runId/stream (the per-run stream route from section 1) streams JSON events (stateChanged, segmentProgress, deltaSummary, log). UI reconnects with exponential backoff and heartbeat.
Global ticker shows planner heartbeat age; banner warns after 90 seconds of silence.
Offline mode disables SSE and falls back to polling every 30 seconds.
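
The live-update behaviour above can be sketched in three small pieces: parsing a `text/event-stream` body, computing reconnect delays, and checking heartbeat age. This is a sketch under stated assumptions: that the event type travels in the SSE `event:` field with a JSON `data:` payload, and the backoff base, factor, and cap are all illustrative values. Only the four event names and the 90-second threshold come from this section.

```python
import json

KNOWN_EVENTS = {"stateChanged", "segmentProgress", "deltaSummary", "log"}


def parse_sse(text):
    """Parse a text/event-stream body into (event, payload) tuples."""
    events = []
    event_name, data_lines = "message", []
    for line in text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((event_name, json.loads("\n".join(data_lines))))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
    return events


def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=6):
    """Exponential reconnect delays in seconds, capped (values assumed)."""
    return [min(cap, base * factor ** n) for n in range(attempts)]


def heartbeat_stale(age_seconds, limit=90):
    """True once the planner heartbeat has been silent for more than limit."""
    return age_seconds > limit
```

A client would typically ignore any event whose name is not in `KNOWN_EVENTS` rather than fail, so that new event types can be added server-side without breaking older consoles.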

7. Retry and remediation

Failed segments show retry button; UI displays reason and cooldown timers. Retry actions are scope-gated and logged.
Full run retry resets segments while preserving original run metadata; new run ID references previous run in retryOf field.
"Escalate to support" button opens incident template pre-filled with run context and correlation IDs.
Troubleshooting quick links:
- ERR_RUN_001 (planner lock)
- ERR_RUN_005 (Scanner timeout)
- ERR_RUN_009 (impact index stale)
  Each link points to corresponding runbook sections (docs/modules/scheduler/operations/worker.md).
CLI parity: stella runs retry --run <id>, stella runs cancel --run <id>.
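
The full-run retry semantics described above (segments reset, metadata preserved, `retryOf` pointing at the previous run) might look like the following. Only `retryOf` is named in this document; every other field name, and the choice of `queued` as the reset state, is a hypothetical stand-in for illustration.

```python
import copy


def build_retry(previous_run, new_run_id):
    """Derive a retry record from a previous run without mutating it."""
    retry = copy.deepcopy(previous_run)  # preserves the original metadata
    retry["runId"] = new_run_id
    retry["retryOf"] = previous_run["runId"]
    retry["state"] = "queued"  # assumed reset state
    for segment in retry["segments"]:
        segment["state"] = "queued"
        segment["retryCount"] = 0
    return retry
```

Copying rather than mutating keeps the failed run intact for audit, which matches the requirement that retry actions are logged.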

8. Evidence downloads

Evidence tab aggregates:
- Policy run summary (/policy/runs/{id})
- Findings delta CSV (/downloads/findings/{runId}.csv)
- Scanner evidence bundle (compressed JSON with manifest)
Downloads show size, hash, signature status.
"Bundle for offline" packages all evidence into single tarball with manifest/digest; UI notes CLI parity (stella runs export --run <id> --bundle).
Completed bundles stored in Downloads workspace for reuse (links provided).
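
A manifest with per-file digests, as the offline bundle implies, can be sketched as follows. The manifest layout and the choice of SHA-256 are assumptions; the actual bundle format and hash algorithm are not specified in this document.

```python
import hashlib


def build_manifest(files):
    """Build a manifest from a mapping of file name to file bytes."""
    return {
        "entries": [
            {
                "name": name,
                "size": len(content),
                "sha256": hashlib.sha256(content).hexdigest(),
            }
            # Sorting keeps the manifest deterministic across runs.
            for name, content in sorted(files.items())
        ]
    }


def verify_manifest(manifest, files):
    """Return the names whose size or digest does not match the manifest."""
    mismatched = []
    for entry in manifest["entries"]:
        content = files.get(entry["name"])
        if (
            content is None
            or len(content) != entry["size"]
            or hashlib.sha256(content).hexdigest() != entry["sha256"]
        ):
            mismatched.append(entry["name"])
    return mismatched
```

An empty result from `verify_manifest` means every listed file is present and unmodified. Signature status would need a separate check against the DSSE attestation, which this sketch does not attempt.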

9. Observability

Metrics cards: scheduler_queue_depth, scheduler_runs_active, scheduler_runs_error_total, scheduler_runs_duration_seconds.
Trend charts: queue depth (last 24h), runs per trigger, average duration, determinism score.
Alert banners: planner lag > SLA, queue depth > threshold, repeated error codes.
Telemetry panel lists latest events (e.g., scheduler.run.started, scheduler.run.completed, scheduler.run.failed).
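
The three alert banner conditions can be expressed as one evaluation function. The thresholds, parameter names, and banner labels below are placeholders chosen for illustration; the real SLA and threshold values are not stated in this document.

```python
from collections import Counter


def alert_banners(planner_lag_s, queue_depth, recent_error_codes, *,
                  lag_sla_s=120, depth_threshold=500, repeat_limit=3):
    """Return banner labels for the conditions that currently hold."""
    banners = []
    if planner_lag_s > lag_sla_s:
        banners.append("planner-lag")
    if queue_depth > depth_threshold:
        banners.append("queue-depth")
    # An error code counts as repeated once it appears repeat_limit times.
    for code, count in sorted(Counter(recent_error_codes).items()):
        if count >= repeat_limit:
            banners.append(f"repeated-error:{code}")
    return banners
```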

10. Offline and air-gap behaviour

Offline banner highlights snapshot timestamp and indicates SSE disabled.
Manual run form switches to generate CLI script for offline execution (stella runs submit --bundle <file>).
Evidence download buttons output local paths; UI reminds operators to copy the files to removable media.
Queue charts use snapshot data; manual refresh button loads latest records from Offline Kit.
Tenants absent from the snapshot are hidden to avoid showing partial data.

11. Screenshot coordination

Placeholders:
- ![Runs dashboard placeholder](../assets/ui/runs/dashboard-placeholder.png)
- ![Run detail placeholder](../assets/ui/runs/detail-placeholder.png)
Coordinate with Scheduler Guild for updated screenshots after Sprint 23 UI stabilises (tracked in #console-screenshots, entry 2025-10-26).

12. References

/docs/ui/console-overview.md - shell, SSE ticker.
/docs/ui/navigation.md - route map and deep links.
/docs/ui/findings.md - findings filtered by run.
/docs/ui/downloads.md - download manager, export retention, CLI parity.
/docs/modules/scheduler/architecture.md - scheduler architecture and data model.
/docs/policy/runs.md - policy run integration.
/docs/modules/cli/guides/policy.md (including section 5) - CLI parity (runs commands pending).
/docs/modules/scheduler/operations/worker.md - troubleshooting.

13. Compliance checklist

Runs table columns, filters, and states described.
Detail drawer sections documented (segments, deltas, evidence, logs).
Queue management, manual run, and preview coverage included.
SSE and live update behaviour detailed.
Retry, remediation, and runbook references provided.
Evidence downloads and bundle workflows documented with CLI parity.
Offline behaviour and screenshot coordination recorded.
References validated.

Last updated: 2025-10-26 (Sprint 23).
