# Console AOC Dashboard
> **Audience:** Console PMs, UI engineers, Concelier/Excititor operators, SREs monitoring ingestion health.
> **Scope:** Layout, RBAC, workflow, and observability for the Aggregation-Only Contract (AOC) dashboard that ships with Sprint 19.

The Console AOC dashboard gives operators a live view of ingestion guardrails across all configured sources. It surfaces raw Concelier/Excititor health, highlights violations raised by `AOCWriteGuard`, and lets on-call staff trigger verification without leaving the browser. Use it alongside the [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md) and the [architecture overview](../architecture/overview.md) when rolling out AOC changes.

---

## 1 · Access & prerequisites
- **Route:** `/console/sources` (dashboard) with contextual drawer routes `/console/sources/:sourceKey` and `/console/sources/:sourceKey/violations/:documentId`.
- **Feature flag:** `aocDashboard.enabled` (default `true` once Concelier WebService exposes `/aoc/verify`). Toggle is tenant-scoped to support phased rollout.
- **Scopes:**
  - `ui.read` (base navigation) plus `advisory:read` to view Concelier ingestion metrics/violations.
  - `vex:read` to see Excititor entries and run VEX verifications.
  - `aoc:verify` to trigger guard runs from the dashboard action bar.
  - `advisory:ingest` / `vex:ingest` **not** required; the dashboard uses read-only APIs.
- **Tenancy:** All data is filtered by the active tenant selector. Switching tenants re-fetches tiles and drill-down tables with tenant-scoped tokens.
- **Back-end contracts:** Requires Concelier/Excititor 19.x (AOC guards enabled) and Authority scopes updated per [Authority service docs](../ARCHITECTURE_AUTHORITY.md#new-aoc-scopes).
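
As a rough illustration of how these scopes gate the dashboard, the sketch below maps a space-delimited scope string to the features listed above. The capability names, and the assumption that `ui.read` is needed for every feature, are illustrative only; the scope strings themselves come from this section.

```python
# Hypothetical capability names; only the scope strings are taken from the docs.
# Assumption: `ui.read` (base navigation) is a prerequisite for every feature.
REQUIRED_SCOPES = {
    "view_concelier": {"ui.read", "advisory:read"},
    "view_excititor": {"ui.read", "vex:read"},
    "run_verify": {"ui.read", "aoc:verify"},
}


def dashboard_capabilities(token_scopes: str) -> dict[str, bool]:
    """Return which dashboard features a space-delimited scope string unlocks."""
    granted = set(token_scopes.split())
    # A feature is available only when all of its required scopes were granted.
    return {name: needed <= granted for name, needed in REQUIRED_SCOPES.items()}


# Example: an operator who can read advisories and trigger verification,
# but has no access to Excititor (VEX) data.
caps = dashboard_capabilities("ui.read advisory:read aoc:verify")
```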

---

## 2 · Layout overview

```
┌────────────────────────────────────────────────────────────────────────────┐
│ Header: tenant picker • live status pill • Last verify (“2h ago”) │
├────────────────────────────────────────────────────────────────────────────┤
│ Tile grid (4 per row) │
│ ┌───── Concelier sources ─────┐ ┌────── Excititor sources ────────┐ │
│ │ Red Hat | Ubuntu | OSV ... │ │ Vendor VEX | CSAF feeds ... │ │
├────────────────────────────────────────────────────────────────────────────┤
│ Violations & history table │
│ • Filters: timeframe, source, ERR_AOC code, severity (warning/block) │
│ • Columns: timestamp, source, code, summary, supersedes link, actions │
├────────────────────────────────────────────────────────────────────────────┤
│ Action bar: Run Verify • Download CSV • Open Concelier raw doc • Help │
└────────────────────────────────────────────────────────────────────────────┘
```
Tiles summarise the latest ingestion runs. The table and drawers provide drill-down views, and the action bar launches verifier workflows or exports evidence for audits.

---

## 3 · Source tiles

Each tile represents a Concelier or Excititor source and contains the fields below.

| Field | Description | Thresholds & colours |
| ----- | ----------- | -------------------- |
| **Status badge** | Aggregated health computed from the latest job. | `Healthy` (green) when the last job finished < 30 min ago and `violations24h = 0`; `Warning` (amber) when job age ≥ 30 min or there are ≤ 5 violations; `Critical` (red) on any guard rejection (`ERR_AOC_00x`) or if job age > 2 h. |
| **Last ingest** | Timestamp and relative age of last successful append to `advisory_raw`/`vex_raw`. | Clicking opens job detail drawer. |
| **Violations (24h)** | Count of guard failures grouped by `ERR_AOC` code across the last 24 hours. | Shows pill per code (e.g., `ERR_AOC_001 ×2`). |
| **Supersedes depth** | Average length of supersedes chain for the source over the last day. | Helps spot runaway revisions. |
| **Signature pass rate** | % of documents where signature/checksum verification succeeded. | Derived from `ingestion_signature_verified_total`. |
| **Latency P95** | Write latency recorded by ingestion spans / histograms. | Mirrors `ingestion_latency_seconds{quantile=0.95}`. |

Tile menus expose quick actions:
- **View history** jumps to table filtered by the selected source.
- **Open metrics** deep links to Grafana panel seeded with `source=<key>` for `ingestion_write_total` and `aoc_violation_total`.
- **Download raw sample** fetches the most recent document via `GET /advisories/raw/{id}` (or VEX equivalent) for debugging.
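
The status-badge thresholds in the table above can be read as a small decision function. The following is a minimal sketch, not the Console's implementation: the function and parameter names are hypothetical, and the table's exact violation-count limit for `Warning` is not modelled. Any non-zero violation count that is not a guard rejection is treated as `Warning`, which is an assumption.

```python
from datetime import timedelta


def tile_status(job_age: timedelta, violations_24h: int, guard_rejected: bool) -> str:
    """Classify a source tile from job age, 24 h violation count, and rejections."""
    # Critical: any guard rejection, or the last job is older than 2 hours.
    if guard_rejected or job_age > timedelta(hours=2):
        return "Critical"
    # Warning: job is at least 30 minutes old, or violations were recorded.
    # (Assumption: any non-zero count maps to Warning.)
    if job_age >= timedelta(minutes=30) or violations_24h > 0:
        return "Warning"
    # Healthy: recent job and no violations in the last 24 hours.
    return "Healthy"
```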

---

## 4 · Violation drill-down workflow

1. **Select a tile** or use table filters to focus on a source, timeframe, or `ERR_AOC` code.
2. **Inspect the violation row:** summary shows offending field, guard code, and document hash.
3. **Open detail drawer:** reveals provenance (source URI, signature info), supersedes chain, and raw JSON (redacted secrets). Drawer also lists linked `effective_finding_*` entries if Policy Engine has already materialised overlays.
4. **Remediate / annotate:** operators can add notes (stored as structured annotations) or flag as *acknowledged* (for on-call rotations). Annotations sync to Concelier audit logs.
5. **Escalate:** “Create incident” button opens the standard incident template pre-filled with context (requires `ui.incidents` scope).

The drill-down retains filter state, so back navigation returns to the scoped table without reloading the entire dashboard.
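
One common way to keep filter state across drawer navigation is to carry it in the URL query string. This document does not say how the Console actually does it, so the sketch below is only an assumption: the route shape comes from section 1, while the query parameter names (`code`, `timeframe`, `severity`) and the helper name are hypothetical.

```python
from urllib.parse import quote, urlencode


def violation_link(source_key: str, document_id: str, filters: dict[str, str]) -> str:
    """Build a drawer route that carries the active table filters with it."""
    # Route shape: /console/sources/:sourceKey/violations/:documentId
    path = (
        f"/console/sources/{quote(source_key, safe='')}"
        f"/violations/{quote(document_id, safe='')}"
    )
    # Drop empty filters and sort keys so the same state yields the same URL.
    query = urlencode(sorted((k, v) for k, v in filters.items() if v))
    return f"{path}?{query}" if query else path


link = violation_link(
    "redhat",
    "doc-1",
    {"code": "ERR_AOC_001", "timeframe": "24h", "severity": ""},
)
```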

---

## 5 · Verification & actions

- **Run Verify:** calls `POST /aoc/verify` with the chosen `since` window (default 24 h). UI displays summary cards (documents checked, violations found, top codes) and stores reports for 7 days. Results include a downloadable JSON manifest mirroring CLI output.
- **Schedule verify:** schedule modal configures automated verification (daily/weekly) and optional email/Notifier hooks.
- **Export evidence:** CSV/JSON export buttons include tile metrics, verification summaries, and violation annotations—useful for audits.
- **Open in CLI:** copies `stella aoc verify --tenant <tenant> --since <window>` for parity with automation scripts.

All verify actions are scoped by tenant and recorded in Authority audit logs (`action=aoc.verify.ui`).
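
For scripting against the same endpoint, a request might be assembled as in the sketch below. Only the `POST /aoc/verify` route and the `since` window come from this section. The JSON field names, the base URL, and the helper name are assumptions, and the DPoP-bound authentication described in section 7 is deliberately left out. The request is built but not sent.

```python
import json
import urllib.request


def build_verify_request(base_url: str, tenant: str, since: str = "24h") -> urllib.request.Request:
    """Assemble (but do not send) a verification request for one tenant."""
    # Assumed payload shape; check the AOC reference for the real contract.
    body = json.dumps({"tenant": tenant, "since": since}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/aoc/verify",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host name, used for illustration only.
req = build_verify_request("https://concelier.example.internal/", "tenant-a")
```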

---

## 6 · Metrics & observability

The dashboard consumes the same metrics emitted by Concelier/Excititor (documented in the [AOC reference](../ingestion/aggregation-only-contract.md#9-observability-and-diagnostics)):
- `ingestion_write_total{source,tenant,result}` populates success/error sparklines beneath each tile.
- `aoc_violation_total{source,tenant,code}` feeds violation pills and trend chart.
- `ingestion_signature_verified_total{source,result}` renders signature pass-rate gauge.
- `ingestion_latency_seconds{source,quantile}` drives latency badges and alert banners.
- `advisory_revision_count{source}` appears in the supersedes depth tooltip.

The page shows the correlation ID for each violation entry, matching structured logs emitted by Concelier and Excititor, enabling quick log pivoting.
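
To make the signature pass-rate gauge concrete, the sketch below derives a percentage from `ingestion_signature_verified_total` samples. The sample representation (a label dict plus a value) and the `result="ok"` label value are assumptions; the actual label values are defined in the AOC reference.

```python
from typing import Iterable, Optional


def signature_pass_rate(
    samples: Iterable[tuple[dict[str, str], float]], source: str
) -> Optional[float]:
    """Return the percentage of verified documents for one source, or None."""
    ok = 0.0
    total = 0.0
    for labels, value in samples:
        if labels.get("source") != source:
            continue
        total += value
        # Assumption: successful verifications are labelled result="ok".
        if labels.get("result") == "ok":
            ok += value
    # No samples for this source: report "unknown" rather than 0 %.
    return None if total == 0 else 100.0 * ok / total


samples = [
    ({"source": "osv", "result": "ok"}, 95.0),
    ({"source": "osv", "result": "fail"}, 5.0),
    ({"source": "ubuntu", "result": "ok"}, 10.0),
]
```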

---

## 7 · Security & tenancy

- Tokens are DPoP-bound; every API call includes the UI's DPoP proof and inherits tenant scoping from Authority.
- Violations drawer hides sensitive fields (credentials, private keys) using the same redaction rules as Concelier events.
- Run Verify honours rate limits to avoid overloading ingestion services; repeated failures trigger a cool-down banner.
- The dashboard never exposes derived severity or policy status—only raw ingestion facts and guard results, preserving AOC separation of duties.
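
The redaction behaviour of the violations drawer can be pictured as a recursive pass over the raw JSON. The sketch below is illustrative only: the key names in `SENSITIVE_KEYS` and the `[redacted]` placeholder are hypothetical, and the real rules are whatever Concelier applies to its events.

```python
# Hypothetical list of sensitive keys; the actual rules live in Concelier.
SENSITIVE_KEYS = {"password", "token", "private_key", "client_secret"}


def redact(value):
    """Return a copy of a JSON-like value with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: "[redacted]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    # Scalars (strings, numbers, booleans, None) pass through unchanged.
    return value


raw = {"source": {"uri": "https://example.test/feed", "password": "p"}, "items": [{"token": "t"}]}
clean = redact(raw)
```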

---

## 8 · Offline & air-gap behaviour

- In sealed/offline mode the dashboard shows an **“offline snapshot”** banner and reads from Offline Kit snapshots seeded via `ouk` imports.
- Verification requests queue until connectivity resumes; UI provides `Download script` to run `stella aoc verify` on a workstation and upload results later.
- Tiles display the timestamp of the last imported snapshot and flag when it exceeds the configured staleness threshold (default 48 h offline).
- CSV/JSON exports include checksums so operators can transfer evidence across air gaps securely.
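
Two of the checks above are easy to reproduce on a workstation: the snapshot staleness flag and the verification of an export checksum. The sketch below assumes SHA-256 as the checksum algorithm, which this document does not specify, and uses hypothetical helper names.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def is_stale(snapshot_time: datetime, now: datetime,
             threshold: timedelta = timedelta(hours=48)) -> bool:
    """Flag a snapshot whose age exceeds the staleness threshold (default 48 h)."""
    return now - snapshot_time > threshold


def sha256_hex(data: bytes) -> str:
    """Hex digest to compare against the checksum shipped with an export."""
    # Assumption: exports use SHA-256; confirm against the export manifest.
    return hashlib.sha256(data).hexdigest()


now = datetime(2025, 10, 26, 12, 0, tzinfo=timezone.utc)
fresh = is_stale(now - timedelta(hours=47), now)
stale = is_stale(now - timedelta(hours=49), now)
```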

---

## 9 · Related references

- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
- [Architecture overview](../architecture/overview.md)
- [Concelier architecture](../ARCHITECTURE_CONCELIER.md)
- [Excititor architecture](../ARCHITECTURE_EXCITITOR.md)
- [CLI AOC commands](../cli/cli-reference.md)

---

## 10 · Compliance checklist

- [ ] Dashboard wired to live AOC metrics (`ingestion_*`, `aoc_violation_total`).
- [ ] Verify action logs to Authority audit trail with tenant context.
- [ ] UI enforces read-only access to raw stores; no mutation endpoints invoked.
- [ ] Offline/air-gap mode documented and validated with Offline Kit snapshots.
- [ ] Violation exports include provenance and `ERR_AOC_00x` codes.
- [ ] Accessibility tested (WCAG 2.2 AA) for tiles, tables, and drawers.
- [ ] Screenshot/recording captured for Docs release notes (pending UI capture).

---

*Last updated: 2025-10-26 (Sprint 19).*