feat: Add new projects to solution and implement contract testing documentation

- Added "StellaOps.Policy.Engine", "StellaOps.Cartographer", and "StellaOps.SbomService" projects to the StellaOps solution.
- Created AGENTS.md to outline the Contract Testing Guild Charter, detailing mission, scope, and definition of done.
- Established TASKS.md for the Contract Testing Task Board, outlining tasks for Sprint 62 and Sprint 63 related to mock servers and replay testing.
2025-10-27 07:57:55 +02:00
parent 1e41ba7ffa
commit 651b8e0fa3
355 changed files with 17276 additions and 1160 deletions

EPIC_3.md Normal file

@@ -0,0 +1,531 @@
Here's Epic 3 in the same “paste-into-repo” format: exhaustive, implementation-ready, and aligned with the AOC model plus the Policy Engine from the previous epics.
---
# Epic 3: StellaOps Console (Web UI over WebServices)
> Short name: **StellaOps Console**
> Services touched: **Web API Gateway, Authority (authN/Z), Policy Engine, SBOM Service, Conseiller (Feedser), Excitator (Vexer), Scheduler/Workers, Telemetry**
> Data stores used via APIs: **MongoDB (reads only from UI), object storage for traces**, optional **Redis/NATS** for live updates
> Deliverable: **TypeScript/React web application** with a component library and feature modules, packaged as container images and static builds
---
## 1) What it is
**StellaOps Console** is the first-party Web UI for all Stella WebServices. It provides a cohesive, role-aware surface for:
* Viewing raw AOC facts (advisories, VEX, SBOMs) without mutation.
* Applying and simulating policies (VEX application rules, advisory normalization) then exploring **effective findings**.
* Navigating SBOMs as graphs, zooming into components, and seeing linked advisories/VEX with clear precedence.
* Running and monitoring evaluations, auditing why decisions were made, and exporting evidence.
* Administering tenants, users and roles, API tokens, and integrations.
* Publishing a self-hostable Console image and a **Download & Install** page covering all product containers.
The Console is a read-write client for allowable operations (policy authoring, run orchestration, approvals), and strictly read-only for **raw facts** per the AOC enforcement. It is **not** a new API; it is a UI over the existing ones with strong guardrails and deterministic behavior.
---
## 2) Why
* Teams need a single, consistent interface to explore SBOMs, advisories, VEX, and policy outcomes.
* Audits require visible provenance, replayable evidence, and explanation chains.
* Policy creation and simulation are safer when you can see deltas and traces.
* Many workflows benefit from visual tools: graph explorers, diff views, and stepwise wizards.
* Not everyone wants to live in the CLI all day. Parity and choice matter.
---
## 3) How it should work (maximum detail)
### 3.1 Information architecture
Top-level navigation with a tenant context picker:
1. **Dashboard**: high-level posture and recent changes.
2. **SBOMs**: catalog, search, and **Graph Explorer**.
3. **Advisories & VEX**: raw fact browsers with aggregation-not-merge semantics.
4. **Findings**: policy-materialized findings with filters and explanations.
5. **Policies**: editor, simulation, versioning, approvals.
6. **Runs**: orchestration, live progress, history, diffs.
7. **Reports & Export**: evidence packages, CSV/JSON exports.
8. **Admin**: users/roles, tokens, SSO, tenants, registries, settings.
9. **Downloads**: product containers and installation instructions.
Global elements:
* **Global Filters**: policy version, environment profile, severity band, time window.
* **Search Bar**: PURL, CVE/GHSA IDs, SBOM IDs.
* **Live Status**: background jobs, queue lag, last sync cursors.
* **Help & Docs**: contextual deep links into `/docs/*`.
### 3.2 Navigation & routes
```
/dashboard
/sboms
/sboms/:sbomId
/sboms/:sbomId/graph
/advisories
/advisories/:advisoryId (shows all linked sources; aggregation only)
/vex
/vex/:vexId
/findings?policy=:pid&sbom=:sid&status=:st&severity=:sev
/findings/:findingId/explain
/policies
/policies/:policyId/versions/:v
/policies/:policyId/simulate
/runs
/runs/:runId
/reports
/admin/users
/admin/roles
/admin/tenants
/admin/integrations
/admin/tokens
/downloads
```
### 3.3 Core feature modules
#### 3.3.1 Dashboard
* Cards: “Findings by severity,” “VEX overrides in last 24h,” “New advisories linked,” “Run health,” “Policy changes.”
* Click-through to filtered views.
* Data sources: aggregated endpoints exposed by Web API (no client-side aggregation over large sets).
#### 3.3.2 SBOM Explorer (catalog + graph)
* **Catalog**: table with SBOM ID, artifact name/version, source, ingest time, component count, last evaluation per policy.
* **Detail**: components tabular view with paging; filters by package type, license, scope.
* **Graph Explorer**:
* Interactive canvas with pan/zoom, focus on component, dependency paths, reachability placeholders.
* Overlay toggles: highlight components with affected findings; show VEX “not_affected” zones; show license risk overlay.
* **Policy overlays**: toggle between policy versions to see in-place severity/status changes.
* **Actions**: export component list, copy PURL, open related findings.
**AOC alignment**: SBOM content is immutable; any edits are proposed as new SBOM versions upstream. UI displays raw SBOM JSON in a read-only side panel.
#### 3.3.3 Advisories & VEX browsers
* **Advisories list**:
* Left panel: filters by source (OSV, GHSA, CSAF vendors, NVD), published/modified time, affected ecosystem.
* Middle panel: **aggregation group** keyed by linkset identity (same vulnerability across sources). No merging; show a rollup with per-source chips.
* Right panel: selected advisory source view with raw JSON, references, CVSS vectors, and “linked SBOM components” sample.
* Severity shown three ways: vendor-reported, normalized (per mapping), and **effective** under the currently selected policy.
* **VEX list**:
* Filters by vendor, product, status, justification, scope.
* Detail panel: all statements applying to the same `(component, advisory)` tuple, with precedence logic visualization and the statement that won under the current policy.
* Raw JSON viewer for each document.
**Strict rule**: Conseiller and Excitator are visualized as **aggregators only**. No UI affordance suggests server-side merging. All links route to raw documents with reference IDs.
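The aggregation-only rule can be sketched as a grouping step that leaves every source document untouched. This is a minimal illustration, not the Console's actual code: the `AdvisoryDoc` shape, the `linksetKey` field, and `groupByLinkset` are all hypothetical names, and the real linkset identity is whatever the API returns.

```typescript
// Hypothetical shape of one raw advisory document as returned by the API.
interface AdvisoryDoc {
  id: string;         // source-specific identifier
  source: string;     // e.g. "OSV", "GHSA", "NVD"
  linksetKey: string; // assumed field: identity shared by the same vulnerability across sources
}

// One roll-up row: the group key plus every untouched source document.
interface AggregationGroup {
  linksetKey: string;
  sources: string[];        // rendered as per-source chips
  documents: AdvisoryDoc[]; // raw documents, never merged or rewritten
}

// Group documents by linkset identity without combining any of their fields.
function groupByLinkset(docs: readonly AdvisoryDoc[]): AggregationGroup[] {
  const groups = new Map<string, AggregationGroup>();
  for (const doc of docs) {
    let group = groups.get(doc.linksetKey);
    if (!group) {
      group = { linksetKey: doc.linksetKey, sources: [], documents: [] };
      groups.set(doc.linksetKey, group);
    }
    if (group.sources.indexOf(doc.source) === -1) group.sources.push(doc.source);
    group.documents.push(doc);
  }
  return Array.from(groups.values());
}
```

Because each group only holds references to the original documents, the right-hand raw JSON panel can always show exactly what a source published.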
#### 3.3.4 Findings
* Virtualized table supporting millions of rows via server-side pagination and cursoring.
* Columns: policy, SBOM, component PURL, advisory IDs (chips for each source), status, severity, last updated, rationale count.
* Row click → **Explain** view with rule hits in order, references to advisories/VEX used, and links to raw docs and trace blobs.
* Bulk export with query replay (the export API reruns the same filters on the server and streams CSV/JSON).
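As a rough sketch of how a client might consume a cursor-paged findings endpoint, the helper below pulls pages until it has enough rows or the server reports no further cursor. The `Page` envelope, its `nextCursor` field, and `fetchRows` are assumptions made for illustration; the actual contract is defined by the Web API.

```typescript
// Hypothetical page envelope; real field names come from the Web API contract.
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no further pages
}

// Fetches one page for a given cursor (undefined requests the first page).
type FetchPage<T> = (cursor: string | undefined) => Promise<Page<T>>;

// Collect at most `maxRows` rows, following cursors and never requesting
// more than `maxPages` pages, so the client cannot loop unbounded.
async function fetchRows<T>(fetchPage: FetchPage<T>, maxRows: number, maxPages = 50): Promise<T[]> {
  const rows: T[] = [];
  let cursor: string | undefined = undefined;
  for (let pageCount = 0; pageCount < maxPages && rows.length < maxRows; pageCount++) {
    const page: Page<T> = await fetchPage(cursor);
    rows.push(...page.items);
    if (page.nextCursor === null) break;
    cursor = page.nextCursor;
  }
  return rows.slice(0, maxRows);
}
```

A virtualized table would typically call this with a small `maxRows` per scroll window rather than draining the whole result set.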
#### 3.3.5 Policies
* Embedded **Policy Editor** (from Epic 2) with Monaco features, simulation panel, diffs, and approval workflow.
* Pre-commit lint and compile; cannot submit with syntax errors.
* Simulation results show increase/decrease/unchanged counts, top rules impacting results, and sample affected components.
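One way to derive the simulation counts is to compare severities per finding between the baseline and the candidate policy. The sketch below assumes a five-step severity scale and a `summarizeDelta` helper, neither of which is specified in this epic; the real bands and delta payload come from the Policy Engine.

```typescript
// Assumed severity scale for illustration only.
type Severity = "none" | "low" | "medium" | "high" | "critical";
const SEVERITY_RANK: Record<Severity, number> = { none: 0, low: 1, medium: 2, high: 3, critical: 4 };

interface SimulationDelta {
  increased: number;
  decreased: number;
  unchanged: number;
}

// Compare severities keyed by finding ID between baseline and candidate results.
function summarizeDelta(baseline: Map<string, Severity>, candidate: Map<string, Severity>): SimulationDelta {
  const delta: SimulationDelta = { increased: 0, decreased: 0, unchanged: 0 };
  baseline.forEach((before, findingId) => {
    const after = candidate.get(findingId);
    if (after === undefined) return; // findings missing from the candidate are ignored in this sketch
    const diff = SEVERITY_RANK[after] - SEVERITY_RANK[before];
    if (diff > 0) delta.increased += 1;
    else if (diff < 0) delta.decreased += 1;
    else delta.unchanged += 1;
  });
  return delta;
}
```

In practice the server would likely compute these counts; a client-side version like this is mainly useful for tests and Storybook fixtures.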
#### 3.3.6 Runs
* Queue view: queued/running/succeeded/failed with timestamps and SLA hints.
* Live progress with **SSE/WebSocket** updates: tuples processed, rules fired, findings materialized.
* Diff view between runs for the same policy and SBOM set.
* Retry and cancel actions as allowed by RBAC.
#### 3.3.7 Reports & Export
* Evidence bundle creation: include policy version, run ID, sample traces, and result slices.
* Export templates (CSV for management, JSON/NDJSON for SIEM ingestion).
* Signed export manifests with checksums.
#### 3.3.8 Admin
* Users & Roles: invite, disable, role mapping.
* Tenants: create, switch, default policy bindings.
* Tokens: create scoped API tokens with expirations.
* Integrations: configure SSO (OIDC), registries, webhooks.
* Settings: environment defaults for policy evaluation (exposure, runtime hints).
#### 3.3.9 Downloads
* List of official Docker images: `stella-console`, `stella-api`, `conseiller`, `excitator`, `sbom-svc`, `policy-engine`, etc.
* Version matrix, pull commands, Helm chart snippet, offline tarballs, and system requirements.
* Link to `/docs/install/docker.md` and `/docs/deploy/console.md`.
### 3.4 UX flows (key tasks)
* **Triage a vulnerability**: search CVE → open rollup → view all sources → jump to affected findings → open Explain → see VEX precedence → decide if policy change is needed → simulate policy → if good, submit and request approval → run → verify new findings.
* **Investigate SBOM**: open SBOM → Graph Explorer → highlight affected nodes under policy PX vN → click a node → see linked advisories + VEX → open Explain for a specific finding.
* **Audit evidence**: open run → download evidence bundle with policy, run metadata, traces, and effective finding slice.
* **Onboard team**: invite users, set roles, define default tenant policies, give read-only access to auditors.
### 3.5 CLI vs UI parity
Create `/docs/cli-vs-ui-parity.md` with a matrix. Principle:
* All **read** capabilities must exist in both CLI and UI.
* All **policy lifecycle** actions exist in both.
* Long-running operations can be initiated in UI and monitored in either surface.
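The parity principle lends itself to a mechanical check over the matrix. The following is only a sketch under assumed names: the `ParityRow` shape, the `kind` categories, and `findParityGaps` are hypothetical, and the real matrix format in `/docs/cli-vs-ui-parity.md` is not defined here.

```typescript
type Surface = "cli" | "ui";

// Hypothetical row of the parity matrix.
interface ParityRow {
  operation: string;
  kind: "read" | "policy-lifecycle" | "other";
  surfaces: Surface[];
}

// Return the operations that must exist on both surfaces but are missing from one.
function findParityGaps(matrix: readonly ParityRow[]): string[] {
  return matrix
    .filter((row) => row.kind !== "other") // only read and policy-lifecycle rows are mandatory on both
    .filter((row) => row.surfaces.indexOf("cli") === -1 || row.surfaces.indexOf("ui") === -1)
    .map((row) => row.operation);
}
```

A CI job could fail whenever this list is non-empty.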
### 3.6 Security & auth
* Auth: OIDC with PKCE; short-lived ID tokens; silent refresh.
* RBAC enforced by the API; UI only gates affordances and never trusts itself.
* CSRF not applicable for token-based APIs; still set robust **CSP**, **X-Frame-Options**, and **Referrer-Policy**.
* Tenancy: every API call includes tenant header; UI shows explicit tenant badge.
* Sensitive pages require **fresh auth** (re-prompt).
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 3.7 Accessibility & i18n
* WCAG 2.1 AA: keyboard nav, focus indicators, ARIA for tables and graphs, color-contrast tests.
* i18n scaffolding via ICU messages; English shipped first; content keys stored in code, translations as JSON resources.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 3.8 Performance
* Use server-side pagination and cursoring everywhere; never fetch unbounded lists.
* Virtualized tables and lazy panels.
* Graph Explorer loads neighborhood windows, not whole graphs.
* Cache with TanStack Query; deduplicate requests; stale-while-revalidate.
* Performance budgets in CI (Lighthouse): TTI < 3.5s on reference hardware.
### 3.9 Error handling & offline
* Error boundaries per feature; retry buttons; copyable request IDs.
* Network loss banner + read-only cached views where safe.
* Clear messages for **AOC** constraints: raw facts cannot be edited.
### 3.10 Telemetry & observability
* UI event telemetry to internal sink (no third-party beacons by default).
* Metrics: UI API latency percentiles, error rates, SSE subscription health.
* Feature flags to dark-launch modules.
---
## 4) APIs consumed (representative)
* `GET /sboms`, `GET /sboms/{id}`, `GET /sboms/{id}/components?cursor=...`
* `GET /advisories?source=...`, `GET /advisories/{id}`, `GET /advisories/{id}/linked`
* `GET /vex?status=...`, `GET /vex/{id}`
* `GET /findings/{policyId}` and `GET /findings/{policyId}/{findingId}/explain`
* `POST /policies`, `POST /policies/{id}/compile`, `POST /policies/{id}/simulate`, `POST /policies/{id}/approve`
* `POST /policies/{id}/runs`, `GET /policies/{id}/runs/{runId}` with SSE for progress
* `POST /exports` for evidence bundles
* `GET /auth/user`, `GET /auth/tenants`, `POST /admin/users`, `POST /admin/tokens`
All calls include tenant scope headers and bearer tokens from Authority.
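A central place to attach these headers keeps individual hooks from forgetting them. The sketch below is an assumption-laden illustration: the header name `X-Stella-Tenant`, the `RequestContext` shape, and `withAuthHeaders` are hypothetical, since this epic does not name the tenant header.

```typescript
// Hypothetical request context held by the API client.
interface RequestContext {
  tenantId: string;
  accessToken: string; // bearer token issued by Authority
}

// Assumed header name; the real one is defined by the API gateway.
const TENANT_HEADER = "X-Stella-Tenant";

// Merge caller-supplied headers with the mandatory tenant and bearer headers.
// Mandatory headers are written last so callers cannot override them.
function withAuthHeaders(ctx: RequestContext, extra: Record<string, string> = {}): Record<string, string> {
  if (!ctx.tenantId) throw new Error("tenant context is required");
  if (!ctx.accessToken) throw new Error("access token is required");
  return {
    ...extra,
    [TENANT_HEADER]: ctx.tenantId,
    Authorization: `Bearer ${ctx.accessToken}`,
  };
}
```

Failing fast on a missing tenant avoids sending a request that the API would have to reject anyway.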
---
## 5) Implementation plan
### 5.1 Frontend architecture
* **Framework**: Next.js 14 (App Router) with TypeScript.
* **State/data**: TanStack Query for server state; Redux only if a global app state proves necessary.
* **UI toolkit**: Internal **Stella UI** component library (headless + primitives) with CSS variables and design tokens.
* **Visualization**: D3 for graph, Monaco for policy editing.
* **Testing**: Playwright (E2E), Vitest/Jest (unit), Storybook (components), Lighthouse (perf).
* **i18n**: `@formatjs/intl` + message catalogs.
* **Packaging**: static build served by Node adapter behind the API gateway; also a `stella-console` Docker image.
**Repo layout**
```
/console
/apps/web
/packages/ui # design system & components
/packages/api # typed API clients (OpenAPI codegen)
/packages/features # feature modules (sboms, advisories, vex, findings, policies, runs, admin)
/packages/utils
/e2e
/storybook
```
### 5.2 Design System (packages/ui)
* Foundation tokens: color, spacing, typography, elevation; dark/light modes.
* Components: AppShell, Nav, DataTable (virtualized), Badge/Chip, Tabs, Drawer, GraphCanvas, CodeViewer (readonly JSON), Form primitives, Modal, Toast, Pill filters.
* Accessibility baked into components; snapshot and interaction tests in Storybook.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 5.3 Feature modules
Each module has:
* `routes.tsx` pages, `api` data hooks, `components`, `tests`, `docs link`.
* Query keys standardized for caching and invalidation.
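Standardized keys could come from a small factory such as the one below. It is a sketch only: the key layout, the inclusion of the tenant in every key, and the helper names are assumptions. Keys are plain arrays, which is the shape TanStack Query accepts, and filters are serialized with sorted property names so equality does not depend on property order.

```typescript
type Filters = Record<string, string>;

// Serialize filters with sorted keys so {a, b} and {b, a} yield the same key part.
function stableFilters(filters: Filters): string {
  const pairs = Object.keys(filters).sort().map((key) => [key, filters[key]]);
  return JSON.stringify(pairs);
}

// Hypothetical key factory; every key starts with the resource name and the tenant.
const queryKeys = {
  sboms: (tenant: string) => ["sboms", tenant] as const,
  sbom: (tenant: string, sbomId: string) => ["sboms", tenant, sbomId] as const,
  findings: (tenant: string, policyId: string, filters: Filters) =>
    ["findings", tenant, policyId, stableFilters(filters)] as const,
};

// True when `key` starts with `prefix`; useful for invalidating a whole family of entries.
function hasPrefix(key: readonly string[], prefix: readonly string[]): boolean {
  return prefix.every((part, index) => key[index] === part);
}
```

With this layout, invalidating `queryKeys.sboms(tenant)` would also cover every per-SBOM entry for that tenant.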
**SBOMs**
* Hooks: `useSboms`, `useSbom(id)`, `useComponents(sbomId, query)`.
* GraphCanvas using neighborhood loaders: `/sboms/:id/graph?center=:purl&depth=1..3`.
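A neighborhood loader needs little more than a URL builder that respects the `depth=1..3` bound shown above. The function name `neighborhoodUrl` and the fallback to depth 1 for invalid input are assumptions of this sketch.

```typescript
// Build the neighborhood request path for a center component.
// Depth is clamped to the 1..3 range; non-finite input falls back to 1 (assumed behavior).
function neighborhoodUrl(sbomId: string, centerPurl: string, depth: number): string {
  const clamped = Number.isFinite(depth) ? Math.min(3, Math.max(1, Math.trunc(depth))) : 1;
  const id = encodeURIComponent(sbomId);
  const center = encodeURIComponent(centerPurl); // PURLs contain ':', '/', and '@'
  return `/sboms/${id}/graph?center=${center}&depth=${clamped}`;
}
```

Encoding the PURL matters because an unescaped `/` or `@` in a query value is easy to misparse downstream.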
**Advisories**
* `useAdvisories(filters)` and `useAdvisory(id)` plus `useLinkedAdvisories(id)`.
* UI explicitly shows aggregation groups; never collapses sources into one record.
**VEX**
* `useVex(filters)`, `useVexDoc(id)`, `useVexForTuple(purl, advisoryId)` for precedence views.
**Findings**
* `useFindings(policyId, filters, cursor)` and `useFinding(findingId)`.
* Explain viewer reading `/findings/:policyId/:findingId/explain`.
**Policies**
* Monaco editor wrapper; compile/simulate actions; approval dialog.
* Diff viewer using the compiler's diagnostics and rule stats.
**Runs**
* `useRuns`, `useRun(runId)` + SSE hook `useRunProgress(runId)`.
**Admin**
* `useUsers`, `useRoles`, `useTenants`, `useTokens`, `useIntegrations`.
**Downloads**
* Static page with dynamic image tags fetched from registry metadata endpoint; copyable commands and Helm snippets.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 5.4 Live updates
* SSE/WebSocket client with backoff, heartbeat, and resubscribe logic.
* Only Runs and slim ticker endpoints use live channels; everything else is HTTP pull with caching.
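The backoff and heartbeat pieces of such a client can be kept as pure functions, which makes them easy to test apart from the transport. This is a sketch with made-up names and tuning values (`BackoffOptions`, `reconnectDelay`, `isStale`); the epic does not prescribe a particular backoff formula.

```typescript
// Tuning knobs for reconnects; concrete values are left to the caller.
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number;  // upper bound on any single delay
  jitter: number; // fraction (0..1) of the delay that may be randomly subtracted
}

// Exponential backoff with a cap and subtractive jitter.
// `random` is injectable so the function stays deterministic in tests.
function reconnectDelay(attempt: number, opts: BackoffOptions, random: () => number = Math.random): number {
  const exponential = opts.baseMs * Math.pow(2, Math.max(0, attempt));
  const capped = Math.min(opts.maxMs, exponential);
  return Math.round(capped * (1 - opts.jitter * random()));
}

// A channel counts as stale when no heartbeat arrived within the timeout;
// the caller should then close the connection and resubscribe.
function isStale(lastHeartbeatAt: number, now: number, timeoutMs: number): boolean {
  return now - lastHeartbeatAt > timeoutMs;
}
```

Jitter spreads reconnect attempts out so that many browser tabs do not hit the gateway at the same instant after an outage.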
### 5.5 Security
* OIDC PKCE flow; token storage in memory; refresh via hidden iframe or refresh endpoint.
* CSP locked to same-origin, with hashes for inline scripts from Next.
* Feature flags control admin features visibility; RBAC double-checked on server responses.
### 5.6 Packaging & distribution
* `stella-console:<version>` image built in CI; Nginx or Node serve.
* Helm chart values include Authority issuer, API base URL, tenant defaults.
* Offline bundle artifact for air-gapped deployments.
---
## 6) Documentation changes (create/update)
1. **`/docs/ui/console-overview.md`**
* Purpose, IA, tenant model, role mapping, AOC alignment.
2. **`/docs/ui/navigation.md`**
* Route map, global filters, keyboard shortcuts, deep links.
3. **`/docs/ui/sbom-explorer.md`**
* Catalog, detail, Graph Explorer, overlays, exports.
4. **`/docs/ui/advisories-and-vex.md`**
* Aggregation-not-merge, multi-source rollups, raw viewers.
5. **`/docs/ui/findings.md`**
* Filters, table semantics, explain view, exports.
6. **`/docs/ui/policies.md`**
* Editor, simulation, diffs, approvals, links to DSL docs.
7. **`/docs/ui/runs.md`**
* Queue, live progress, diffs, retries, evidence bundles.
8. **`/docs/ui/admin.md`**
* Users, roles, tenants, tokens, integrations.
9. **`/docs/ui/downloads.md`**
* Containers list, versions, pull/install commands, air-gapped flow.
10. **`/docs/deploy/console.md`**
* Helm, ingress, TLS, CSP, environment variables, health checks.
11. **`/docs/install/docker.md`**
* All container images, pull commands, compose/Helm examples.
12. **`/docs/security/console-security.md`**
* OIDC, RBAC, CSP, tenancy, evidence of least privilege.
13. **`/docs/observability/ui-telemetry.md`**
* UI metrics, logs, dashboards, feature flags.
14. **`/docs/cli-vs-ui-parity.md`**
* Matrix of operations and surfaces.
15. **`/docs/architecture/console.md`**
* Frontend architecture, packages, data flow diagrams, SSE design.
16. **`/docs/accessibility.md`**
* WCAG checklist, testing tools, color tokens.
17. **`/docs/examples/ui-tours.md`**
* Task-centric walkthroughs: triage, audit, policy rollout.
> Each document includes a “Compliance checklist” section.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
---
## 7) Tasks (tracked per team)
### 7.1 Console scaffold & infra
* [ ] Initialize Next.js 14 TypeScript app with App Router.
* [ ] Set up TanStack Query, Auth context, Error boundaries, Toasts.
* [ ] Integrate OIDC client; implement login/logout, tenant picker.
* [ ] Add design tokens and base components in `packages/ui`.
* [ ] Configure CI: build, test, lint, typecheck, Lighthouse budgets.
* [ ] Build `stella-console` container image and Helm chart.
### 7.2 Typed API client
* [ ] Generate clients from OpenAPI; wrap with hooks in `packages/api`.
* [ ] Centralize retry, error mapping, tenant header injection.
### 7.3 Feature delivery
**SBOMs**
* [ ] Catalog page with filters, server pagination.
* [ ] SBOM detail with components table.
* [ ] Graph Explorer with overlays and neighborhood loaders.
* [ ] Raw JSON viewer drawer.
**Advisories & VEX**
* [ ] Advisory aggregation list; per-source chips; raw view.
* [ ] VEX list with filters; precedence explainer per tuple.
* [ ] Link outs to Findings and SBOMs.
**Findings**
* [ ] Virtualized table; filters; saved views.
* [ ] Explain view: rules fired, references, trace links.
* [ ] Export actions (CSV/JSON stream).
**Policies**
* [ ] Monaco editor with syntax/diagnostics; compile and simulate.
* [ ] Diff and impact panel; submit and approve workflow.
* [ ] Run from simulation context.
**Runs**
* [ ] Runs list; run detail with SSE progress.
* [ ] Diff between runs; evidence bundle download.
**Admin**
* [ ] Users/roles CRUD; token issuance; tenant management.
* [ ] Integrations: OIDC config form; registry connections.
* [ ] Settings for environment defaults.
**Downloads**
* [ ] Registry tag fetch, pull commands, Helm snippet generator.
* [ ] Air-gapped instructions and offline bundle download.
> **Imposed rule:** Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.
### 7.4 Quality gates
* [ ] Playwright E2E for core flows: triage, simulate, approve, run, explain.
* [ ] Storybook with a11y addon and interaction tests.
* [ ] Lighthouse CI budgets met; perf regressions block merges.
* [ ] i18n scaffolding ready; all strings externalized.
* [ ] Security checks: CSP effective, OIDC flows tested, RBAC enforced.
### 7.5 Docs tasks
* [ ] Populate all docs listed in section 6 with screenshots and animated GIFs.
* [ ] Add CLI vs UI parity matrix and keep it in CI to detect drift.
* [ ] Add AOC user guide callouts explaining raw fact immutability across pages.
---
## 8) Feature flags
* `ui.graph-explorer`
* `ui.policy-editor`
* `ui.ai-assist` (off by default; when enabled, renders a right-rail for human-in-the-loop summaries)
* `ui.downloads`
Flag definitions and defaults live in `/docs/observability/ui-telemetry.md` and config map.
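Resolving these flags against deployment overrides might look like the sketch below. Only `ui.ai-assist` being off by default comes from this document; treating the other three as on by default is an assumption, as are the names `DEFAULT_FLAGS` and `resolveFlags`.

```typescript
type FlagName = "ui.graph-explorer" | "ui.policy-editor" | "ui.ai-assist" | "ui.downloads";

// Defaults: `ui.ai-assist` is off per the list above; the other values are assumed.
const DEFAULT_FLAGS: Record<FlagName, boolean> = {
  "ui.graph-explorer": true,
  "ui.policy-editor": true,
  "ui.ai-assist": false,
  "ui.downloads": true,
};

// Apply overrides (for example from a config map) on top of the defaults.
// Non-boolean override values are ignored so a malformed config cannot unset a flag.
function resolveFlags(overrides: Partial<Record<FlagName, boolean>> = {}): Record<FlagName, boolean> {
  const resolved: Record<FlagName, boolean> = { ...DEFAULT_FLAGS };
  (Object.keys(DEFAULT_FLAGS) as FlagName[]).forEach((name) => {
    const value = overrides[name];
    if (typeof value === "boolean") resolved[name] = value;
  });
  return resolved;
}
```

Keeping the flag names in a union type means a typo in a feature module fails at compile time rather than silently hiding a page.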
---
## 9) Acceptance criteria
* Console ships as a container image with Helm deployment and a static build option.
* SBOM Explorer visualizes graphs and overlays policy outcomes without page crashes on large SBOMs.
* Advisories/VEX browsers display **aggregation only**, never merge sources; raw document viewers are present.
* Findings view supports server-side pagination and Explain with rule traces.
* Policy Editor compiles, simulates, diffs, and supports approval workflows.
* Runs page shows live progress and enables evidence exports.
* Admin handles users, roles, tenants, tokens, and OIDC configuration.
* Downloads page lists all images and installation paths.
* All pages meet a11y checks and pass Lighthouse budgets.
* RBAC enforced in UI affordances and validated by API responses.
---
## 10) Risks and mitigations
* **Graph performance** on very large SBOMs: use neighborhood windows and server filters; cap depth.
* **UI/CLI drift**: parity matrix in CI; failing check blocks release.
* **Over-fetching**: TanStack caching, cursor-based endpoints, and strict data-layer reviews.
* **Scope creep** in Admin: feature-flag granular sections, ship iteratively.
* **AOC confusion**: constant raw/derived labeling and view raw toggles.
---
## 11) Test plan
* **Unit**: hooks and components; data adapters; graph layout utils.
* **E2E**: Playwright flows for triage, simulation-approval-run-explain, admin RBAC.
* **A11y**: axe-core in CI and manual keyboard checks.
* **Perf**: Lighthouse against seeded data; visual regression on Storybook.
* **Security**: OIDC happy and unhappy paths, CSP violation tests, SSRF resistance for downloads metadata.
* **Resilience**: simulate API timeouts; verify error boundaries and retries.
---
## 12) Non-goals (this epic)
* No server-side report authoring engine beyond export templates.
* No proprietary graph database; server remains RESTful with indexed queries.
* No speculative automatic policy changes; all edits remain human-driven.
---
## 13) Philosophy and guiding principles
* **AOC first**: the UI respects facts vs decisions. Raw content is immutable and visible.
* **Deterministic outcomes**: what you see equals what the Policy Engine produced, with an explanation you can export.
* **Explainability** over cleverness: every badge, color, and status maps to a rule and a source.
* **Parity**: UI is not a second-class citizen, and the CLI is not an afterthought.
* **Composability**: modules are independent packages with clear contracts and tests.
> Final reminder: **Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.**