git.stella-ops.org/EPIC_3.md


Here's Epic 3 in the same “paste-into-repo” format: exhaustive, implementation-ready, and aligned with the AOC model plus the Policy Engine from the previous epics.


Epic 3: StellaOps Console (Web UI over WebServices)

Short name: StellaOps Console
Services touched: Web API Gateway, Authority (authN/Z), Policy Engine, SBOM Service, Conseiller (Feedser), Excitator (Vexer), Scheduler/Workers, Telemetry
Data stores used via APIs: MongoDB (reads only from UI), object storage for traces, optional Redis/NATS for live updates
Deliverable: TypeScript/React web application with a component library and feature modules, packaged as container images and static builds


1) What it is

StellaOps Console is the first-party Web UI for all Stella WebServices. It provides a cohesive, role-aware surface for:

  • Viewing raw AOC facts (advisories, VEX, SBOMs) without mutation.
  • Applying and simulating policies (VEX application rules, advisory normalization) then exploring effective findings.
  • Navigating SBOMs as graphs, zooming into components, and seeing linked advisories/VEX with clear precedence.
  • Running and monitoring evaluations, auditing why decisions were made, and exporting evidence.
  • Administering tenants, users and roles, API tokens, and integrations.
  • Publishing a self-hostable Console image and a Download & Install page covering all product containers.

The Console is a read-write client for allowable operations (policy authoring, run orchestration, approvals), and strictly read-only for raw facts per the AOC enforcement. It is not a new API; it is a UI over the existing ones with strong guardrails and deterministic behavior.


2) Why

  • Teams need a single, consistent interface to explore SBOMs, advisories, VEX, and policy outcomes.
  • Audits require visible provenance, replayable evidence, and explanation chains.
  • Policy creation and simulation are safer when you can see deltas and traces.
  • Many workflows benefit from visual tools: graph explorers, diff views, and stepwise wizards.
  • Not everyone wants to live in the CLI all day. Parity and choice matter.

3) How it should work (maximum detail)

3.1 Information architecture

Top-level navigation with a tenant context picker:

  1. Dashboard: high-level posture and recent changes.
  2. SBOMs: catalog, search, and Graph Explorer.
  3. Advisories & VEX: raw fact browsers with aggregation-not-merge semantics.
  4. Findings: policy-materialized findings with filters and explanations.
  5. Policies: editor, simulation, versioning, approvals.
  6. Runs: orchestration, live progress, history, diffs.
  7. Reports & Export: evidence packages, CSV/JSON exports.
  8. Admin: users/roles, tokens, SSO, tenants, registries, settings.
  9. Downloads: product containers and installation instructions.

Global elements:

  • Global Filters: policy version, environment profile, severity band, time window.
  • Search Bar: PURL, CVE/GHSA IDs, SBOM IDs.
  • Live Status: background jobs, queue lag, last sync cursors.
  • Help & Docs: contextual deep links into /docs/*.

3.2 Navigation & routes

/dashboard
/sboms
/sboms/:sbomId
/sboms/:sbomId/graph
/advisories
/advisories/:advisoryId (shows all linked sources; aggregation only)
/vex
/vex/:vexId
/findings?policy=:pid&sbom=:sid&status=:st&severity=:sev
/findings/:findingId/explain
/policies
/policies/:policyId/versions/:v
/policies/:policyId/simulate
/runs
/runs/:runId
/reports
/admin/users
/admin/roles
/admin/tenants
/admin/integrations
/admin/tokens
/downloads
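
The parameterized routes above can be produced by small typed helpers so that query strings are encoded the same way everywhere. The following is a minimal sketch; the names `FindingsQuery`, `findingsUrl`, and `explainUrl` are hypothetical and only the route shapes are taken from the list above.

```typescript
// Hypothetical filter shape; field names mirror the query parameters in the route list.
interface FindingsQuery {
  policy?: string;
  sbom?: string;
  status?: string;
  severity?: string;
}

// Builds /findings?policy=...&sbom=...&status=...&severity=..., skipping unset filters.
function findingsUrl(query: FindingsQuery): string {
  const params = new URLSearchParams();
  for (const key of ["policy", "sbom", "status", "severity"] as const) {
    const value = query[key];
    if (value !== undefined && value !== "") params.set(key, value);
  }
  const qs = params.toString();
  return qs ? `/findings?${qs}` : "/findings";
}

// Builds /findings/:findingId/explain with the ID percent-encoded.
function explainUrl(findingId: string): string {
  return `/findings/${encodeURIComponent(findingId)}/explain`;
}
```

Keeping the parameter order fixed makes deep links stable, which helps when the same URL is used as a cache key or shared between users.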

3.3 Core feature modules

3.3.1 Dashboard

  • Cards: “Findings by severity,” “VEX overrides in last 24h,” “New advisories linked,” “Run health,” “Policy changes.”
  • Click-through to filtered views.
  • Data sources: aggregated endpoints exposed by Web API (no client-side aggregation over large sets).

3.3.2 SBOM Explorer (catalog + graph)

  • Catalog: table with SBOM ID, artifact name/version, source, ingest time, component count, last evaluation per policy.

  • Detail: components tabular view with paging; filters by package type, license, scope.

  • Graph Explorer:

    • Interactive canvas with pan/zoom, focus on component, dependency paths, reachability placeholders.
    • Overlay toggles: highlight components with affected findings; show VEX “not_affected” zones; show license risk overlay.
    • Policy overlays: toggle between policy versions to see in-place severity/status changes.
  • Actions: export component list, copy PURL, open related findings.

AOC alignment: SBOM content is immutable; any edits are proposed as new SBOM versions upstream. UI displays raw SBOM JSON in a read-only side panel.

3.3.3 Advisories & VEX browsers

  • Advisories list:

    • Left panel: filters by source (OSV, GHSA, CSAF vendors, NVD), published/modified time, affected ecosystem.
    • Middle panel: aggregation group keyed by linkset identity (same vulnerability across sources). No merging; show a rollup with per-source chips.
    • Right panel: selected advisory source view with raw JSON, references, CVSS vectors, and “linked SBOM components” sample.
    • Severity shown three ways: vendor-reported, normalized (per mapping), and effective under the currently selected policy.
  • VEX list:

    • Filters by vendor, product, status, justification, scope.
    • Detail panel: all statements applying to the same (component, advisory) tuple, with precedence logic visualization and the statement that won under the current policy.
    • Raw JSON viewer for each document.

Strict rule: Conseiller and Excitator are visualized as aggregators only. No UI affordance suggests server-side merging. All links route to raw documents with reference IDs.
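
To illustrate aggregation without merging, the sketch below groups raw advisory documents by a shared linkset key while keeping every source document untouched. The record shape (`AdvisoryDoc`, `linksetKey`) and the function name are assumptions for illustration, not the actual API contract.

```typescript
// Hypothetical record shape: one raw advisory document from one source.
interface AdvisoryDoc {
  id: string;         // source-specific document ID
  source: string;     // e.g. "OSV", "GHSA", "NVD"
  linksetKey: string; // assumed identity shared by documents about the same vulnerability
}

interface AggregationGroup {
  linksetKey: string;
  sources: string[];        // distinct sources, used for per-source chips
  documents: AdvisoryDoc[]; // every raw document, unmodified
}

// Groups documents by linkset key. Nothing is merged at field level:
// each group only references the original documents.
function groupByLinkset(docs: AdvisoryDoc[]): AggregationGroup[] {
  const groups = new Map<string, AggregationGroup>();
  for (const doc of docs) {
    let group = groups.get(doc.linksetKey);
    if (!group) {
      group = { linksetKey: doc.linksetKey, sources: [], documents: [] };
      groups.set(doc.linksetKey, group);
    }
    if (group.sources.indexOf(doc.source) === -1) group.sources.push(doc.source);
    group.documents.push(doc);
  }
  return Array.from(groups.values());
}
```

Because each group holds references to the original objects, the raw JSON viewer can always show exactly what a source published.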

3.3.4 Findings

  • Virtualized table supporting millions of rows via server-side pagination and cursoring.
  • Columns: policy, SBOM, component PURL, advisory IDs (chips for each source), status, severity, last updated, rationale count.
  • Row click → Explain view with rule hits in order, references to advisories/VEX used, and links to raw docs and trace blobs.
  • Bulk export with query replay (the export API reruns the same filters on the server and streams CSV/JSON).
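
Cursor-based paging as described above can be sketched as a loop that follows the server's cursor until it runs out or a row budget is reached. The page envelope (`items`, `nextCursor`) and the helper names are assumptions; the real response shape is defined by the Web API.

```typescript
// Hypothetical page envelope returned by a cursor-based endpoint.
interface CursorPage<T> {
  items: T[];
  nextCursor: string | null; // null when no further pages exist
}

// The caller supplies the page fetcher, so this helper stays transport-agnostic.
type PageFetcher<T> = (cursor: string | null) => Promise<CursorPage<T>>;

// Collects at most maxItems rows by following cursors; never requests more
// pages than needed to fill the budget.
async function collectPages<T>(fetchPage: PageFetcher<T>, maxItems: number): Promise<T[]> {
  const rows: T[] = [];
  let cursor: string | null = null;
  while (rows.length < maxItems) {
    const page: CursorPage<T> = await fetchPage(cursor);
    rows.push(...page.items);
    if (page.nextCursor === null) break;
    cursor = page.nextCursor;
  }
  return rows.slice(0, maxItems);
}
```

A virtualized table would typically call the fetcher one page at a time as the user scrolls; the bounded loop here only shows the cursor contract.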

3.3.5 Policies

  • Embedded Policy Editor (from Epic 2) with Monaco features, simulation panel, diffs, and approval workflow.
  • Pre-commit lint and compile; cannot submit with syntax errors.
  • Simulation results show increase/decrease/unchanged counts, top rules impacting results, and sample affected components.

3.3.6 Runs

  • Queue view: queued/running/succeeded/failed with timestamps and SLA hints.
  • Live progress with SSE/WebSocket updates: tuples processed, rules fired, findings materialized.
  • Diff view between runs for the same policy and SBOM set.
  • Retry and cancel actions as allowed by RBAC.

3.3.7 Reports & Export

  • Evidence bundle creation: include policy version, run ID, sample traces, and result slices.
  • Export templates (CSV for management, JSON/NDJSON for SIEM ingestion).
  • Signed export manifests with checksums.

3.3.8 Admin

  • Users & Roles: invite, disable, role mapping.
  • Tenants: create, switch, default policy bindings.
  • Tokens: create scoped API tokens with expirations.
  • Integrations: configure SSO (OIDC), registries, webhooks.
  • Settings: environment defaults for policy evaluation (exposure, runtime hints).

3.3.9 Downloads

  • List of official Docker images: stella-console, stella-api, conseiller, excitator, sbom-svc, policy-engine, etc.
  • Version matrix, pull commands, Helm chart snippet, offline tarballs, and system requirements.
  • Link to /docs/install/docker.md and /docs/deploy/console.md.

3.4 UX flows (key tasks)

  • Triage a vulnerability: search CVE → open rollup → view all sources → jump to affected findings → open Explain → see VEX precedence → decide if policy change is needed → simulate policy → if good, submit and request approval → run → verify new findings.
  • Investigate SBOM: open SBOM → Graph Explorer → highlight affected nodes under policy PX vN → click a node → see linked advisories + VEX → open Explain for a specific finding.
  • Audit evidence: open run → download evidence bundle with policy, run metadata, traces, and effective finding slice.
  • Onboard team: invite users, set roles, define default tenant policies, give read-only access to auditors.

3.5 CLI vs UI parity

Create /docs/cli-vs-ui-parity.md with a matrix. Principle:

  • All read capabilities must exist in both CLI and UI.
  • All policy lifecycle actions exist in both.
  • Long-running operations can be initiated in UI and monitored in either surface.
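
One way to make the parity principle checkable in CI is to treat the matrix as data and flag rows that break the rules above. This is a sketch under assumptions: the row shape, the `kind` values, and the function name are hypothetical and do not describe an existing file format.

```typescript
type ParitySurface = "cli" | "ui";

// Hypothetical matrix row: which surfaces implement an operation.
interface ParityRow {
  operation: string; // e.g. "findings.list"
  kind: "read" | "policy-lifecycle" | "other";
  surfaces: ParitySurface[];
}

// Returns the operations that violate the principle: read capabilities and
// policy lifecycle actions must exist on both surfaces.
function parityViolations(rows: ParityRow[]): string[] {
  return rows
    .filter((row) => row.kind !== "other")
    .filter((row) => row.surfaces.indexOf("cli") === -1 || row.surfaces.indexOf("ui") === -1)
    .map((row) => row.operation);
}
```

A CI step could fail the build whenever the returned list is non-empty.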

3.6 Security & auth

  • Auth: OIDC with PKCE; short-lived ID tokens; silent refresh.
  • RBAC enforced by the API; UI only gates affordances and never trusts itself.
  • CSRF not applicable for token-based APIs; still set robust CSP, X-Frame-Options, and Referrer-Policy.
  • Tenancy: every API call includes tenant header; UI shows explicit tenant badge.
  • Sensitive pages require fresh auth (re-prompt).

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

3.7 Accessibility & i18n

  • WCAG 2.1 AA: keyboard nav, focus indicators, ARIA for tables and graphs, color-contrast tests.
  • i18n scaffolding via ICU messages; English shipped first; content keys stored in code, translations as JSON resources.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

3.8 Performance

  • Use server-side pagination and cursoring everywhere; never fetch unbounded lists.
  • Virtualized tables and lazy panels.
  • Graph Explorer loads neighborhood windows, not whole graphs.
  • Cache with TanStack Query; deduplicate requests; stale-while-revalidate.
  • Performance budgets in CI (Lighthouse): TTI < 3.5s on reference hardware.

3.9 Error handling & offline

  • Error boundaries per feature; retry buttons; copyable request IDs.
  • Network loss → banner + read-only cached views where safe.
  • Clear messages for AOC constraints: raw facts cannot be edited.

3.10 Telemetry & observability

  • UI event telemetry to internal sink (no third-party beacons by default).
  • Metrics: UI API latency percentiles, error rates, SSE subscription health.
  • Feature flags to dark-launch modules.

4) APIs consumed (representative)

  • GET /sboms, GET /sboms/{id}, GET /sboms/{id}/components?cursor=...
  • GET /advisories?source=..., GET /advisories/{id}, GET /advisories/{id}/linked
  • GET /vex?status=..., GET /vex/{id}
  • GET /findings/{policyId} and GET /findings/{policyId}/{findingId}/explain
  • POST /policies, POST /policies/{id}/compile, POST /policies/{id}/simulate, POST /policies/{id}/approve
  • POST /policies/{id}/runs, GET /policies/{id}/runs/{runId} with SSE for progress
  • POST /exports for evidence bundles
  • GET /auth/user, GET /auth/tenants, POST /admin/users, POST /admin/tokens

All calls include tenant scope headers and bearer tokens from Authority.
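
A central helper can attach the tenant scope and bearer token to every call. The sketch below assumes a hypothetical header name (`X-Stella-Tenant`); the actual tenant header is defined by the Web API Gateway and is not specified in this document.

```typescript
// Hypothetical header name, used for illustration only.
const TENANT_HEADER = "X-Stella-Tenant";

interface CallContext {
  accessToken: string; // bearer token issued by Authority
  tenantId: string;    // currently selected tenant
}

// Builds request headers. Auth and tenant headers are applied last so that
// caller-supplied extras cannot override them.
function buildAuthHeaders(
  ctx: CallContext,
  extra: Record<string, string> = {}
): Record<string, string> {
  if (!ctx.accessToken) throw new Error("missing access token");
  if (!ctx.tenantId) throw new Error("missing tenant");
  return {
    ...extra,
    Authorization: `Bearer ${ctx.accessToken}`,
    [TENANT_HEADER]: ctx.tenantId,
  };
}
```

Failing fast on a missing tenant keeps an unscoped request from ever leaving the browser; the API still enforces tenancy on its side.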


5) Implementation plan

5.1 Frontend architecture

  • Framework: Next.js 14 (App Router) with TypeScript.
  • State/data: TanStack Query for server state; Redux only if a global app state proves necessary.
  • UI toolkit: Internal Stella UI component library (headless + primitives) with CSS variables and design tokens.
  • Visualization: D3 for graph, Monaco for policy editing.
  • Testing: Playwright (E2E), Vitest/Jest (unit), Storybook (components), Lighthouse (perf).
  • i18n: @formatjs/intl + message catalogs.
  • Packaging: static build served by Node adapter behind the API gateway; also a stella-console Docker image.

Repo layout

/console
  /apps/web
  /packages/ui         # design system & components
  /packages/api        # typed API clients (OpenAPI codegen)
  /packages/features   # feature modules (sboms, advisories, vex, findings, policies, runs, admin)
  /packages/utils
  /e2e
  /storybook

5.2 Design System (packages/ui)

  • Foundation tokens: color, spacing, typography, elevation; dark/light modes.
  • Components: AppShell, Nav, DataTable (virtualized), Badge/Chip, Tabs, Drawer, GraphCanvas, CodeViewer (read-only JSON), Form primitives, Modal, Toast, Pill filters.
  • Accessibility baked into components; snapshot and interaction tests in Storybook.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

5.3 Feature modules

Each module has:

  • routes.tsx pages, api data hooks, components, tests, docs link.
  • Query keys standardized for caching and invalidation.
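
Standardized query keys are usually produced by a per-module key factory, so invalidating a prefix clears everything beneath it. The sketch below is illustrative: the key segments and the `sbomKeys` name are assumptions, and `keyStartsWith` only approximates the prefix matching a cache library would perform for primitive key segments.

```typescript
// Hypothetical key factory for the SBOM module; tenant is part of every key
// so cached data never leaks across tenants.
const sbomKeys = {
  all: (tenant: string) => ["sboms", tenant] as const,
  detail: (tenant: string, sbomId: string) => ["sboms", tenant, "detail", sbomId] as const,
  components: (tenant: string, sbomId: string, cursor: string | null) =>
    ["sboms", tenant, "detail", sbomId, "components", cursor] as const,
};

// True when `key` begins with `prefix`; models prefix-based invalidation
// for keys made of primitive values.
function keyStartsWith(key: readonly unknown[], prefix: readonly unknown[]): boolean {
  return prefix.length <= key.length && prefix.every((part, i) => part === key[i]);
}
```

With this layout, invalidating `sbomKeys.detail(tenant, id)` would also cover every cached component page for that SBOM.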

SBOMs

  • Hooks: useSboms, useSbom(id), useComponents(sbomId, query).
  • GraphCanvas using neighborhood loaders: /sboms/:id/graph?center=:purl&depth=1..3.
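
The neighborhood loader URL can be built with the depth clamped to the 1..3 range shown in the route above, so the client never asks for an unbounded graph. The function name is hypothetical; the path and parameter names come from the route.

```typescript
// Depth bounds taken from the route above (depth=1..3).
const MIN_DEPTH = 1;
const MAX_DEPTH = 3;

// Builds /sboms/:id/graph?center=:purl&depth=N with N clamped to [1, 3].
// Non-finite depths fall back to the minimum.
function graphNeighborhoodUrl(sbomId: string, centerPurl: string, depth: number): string {
  const requested = isFinite(depth) ? Math.floor(depth) : MIN_DEPTH;
  const clamped = Math.min(MAX_DEPTH, Math.max(MIN_DEPTH, requested));
  const params = new URLSearchParams({ center: centerPurl, depth: String(clamped) });
  return `/sboms/${encodeURIComponent(sbomId)}/graph?${params.toString()}`;
}
```

PURLs contain characters such as `:`, `/`, and `@`, which is why the center value goes through `URLSearchParams` rather than string concatenation.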

Advisories

  • useAdvisories(filters) and useAdvisory(id) plus useLinkedAdvisories(id).
  • UI explicitly shows aggregation groups; never collapses sources into one record.

VEX

  • useVex(filters), useVexDoc(id), useVexForTuple(purl, advisoryId) for precedence views.

Findings

  • useFindings(policyId, filters, cursor) and useFinding(findingId).
  • Explain viewer reading /findings/:policyId/:findingId/explain.

Policies

  • Monaco editor wrapper; compile/simulate actions; approval dialog.
  • Diff viewer using the compiler's diagnostics and rule stats.

Runs

  • useRuns, useRun(runId) + SSE hook useRunProgress(runId).

Admin

  • useUsers, useRoles, useTenants, useTokens, useIntegrations.

Downloads

  • Static page with dynamic image tags fetched from registry metadata endpoint; copyable commands and Helm snippets.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

5.4 Live updates

  • SSE/WebSocket client with backoff, heartbeat, and resubscribe logic.
  • Only Runs and slim “ticker” endpoints use live channels; everything else is HTTP pull with caching.
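
The backoff and heartbeat logic can be kept as pure functions that the SSE/WebSocket client calls when deciding whether and when to resubscribe. This is a sketch under assumptions: the option names, the jitter strategy, and the staleness rule are illustrative choices, not a prescribed design.

```typescript
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number;  // upper bound for any delay
  jitter: number; // fraction (0..1) of the delay that may be randomized away
}

// Delay before reconnect attempt `attempt` (0-based): exponential growth,
// capped at maxMs, reduced by up to `jitter` of its value to spread reconnects.
function reconnectDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const exponential = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  const jitterSpan = exponential * opts.jitter;
  return Math.round(exponential - jitterSpan * random());
}

// A stream is treated as stale when no heartbeat arrived within the timeout;
// the client would then close it and resubscribe.
function isStale(lastHeartbeatMs: number, nowMs: number, timeoutMs: number): boolean {
  return nowMs - lastHeartbeatMs > timeoutMs;
}
```

Injecting the random source keeps the delay calculation deterministic in unit tests.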

5.5 Security

  • OIDC PKCE flow; token storage in memory; refresh via hidden iframe or refresh endpoint.
  • CSP locked to same-origin, with hashes for inline scripts from Next.
  • Feature flags control admin features visibility; RBAC double-checked on server responses.

5.6 Packaging & distribution

  • stella-console:<version> image built in CI; Nginx or Node serve.
  • Helm chart values include Authority issuer, API base URL, tenant defaults.
  • Offline bundle artifact for air-gapped deployments.

6) Documentation changes (create/update)

  1. /docs/ui/console-overview.md

    • Purpose, IA, tenant model, role mapping, AOC alignment.
  2. /docs/ui/navigation.md

    • Route map, global filters, keyboard shortcuts, deep links.
  3. /docs/ui/sbom-explorer.md

    • Catalog, detail, Graph Explorer, overlays, exports.
  4. /docs/ui/advisories-and-vex.md

    • Aggregation-not-merge, multi-source rollups, raw viewers.
  5. /docs/ui/findings.md

    • Filters, table semantics, explain view, exports.
  6. /docs/ui/policies.md

    • Editor, simulation, diffs, approvals, links to DSL docs.
  7. /docs/ui/runs.md

    • Queue, live progress, diffs, retries, evidence bundles.
  8. /docs/ui/admin.md

    • Users, roles, tenants, tokens, integrations.
  9. /docs/ui/downloads.md

    • Containers list, versions, pull/install commands, air-gapped flow.
  10. /docs/deploy/console.md

    • Helm, ingress, TLS, CSP, environment variables, health checks.
  11. /docs/install/docker.md

    • All container images, pull commands, compose/Helm examples.
  12. /docs/security/console-security.md

    • OIDC, RBAC, CSP, tenancy, evidence of least privilege.
  13. /docs/observability/ui-telemetry.md

    • UI metrics, logs, dashboards, feature flags.
  14. /docs/cli-vs-ui-parity.md

    • Matrix of operations and surfaces.
  15. /docs/architecture/console.md

    • Frontend architecture, packages, data flow diagrams, SSE design.
  16. /docs/accessibility.md

    • WCAG checklist, testing tools, color tokens.
  17. /docs/examples/ui-tours.md

    • Task-centric walkthroughs: triage, audit, policy rollout.

Each document includes a “Compliance checklist” section. Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.


7) Tasks (tracked per team)

7.1 Console scaffold & infra

  • Initialize Next.js 14 TypeScript app with App Router.
  • Set up TanStack Query, Auth context, Error boundaries, Toasts.
  • Integrate OIDC client; implement login/logout, tenant picker.
  • Add design tokens and base components in packages/ui.
  • Configure CI: build, test, lint, typecheck, Lighthouse budgets.
  • Build stella-console container image and Helm chart.

7.2 Typed API client

  • Generate clients from OpenAPI; wrap with hooks in packages/api.
  • Centralize retry, error mapping, tenant header injection.

7.3 Feature delivery

SBOMs

  • Catalog page with filters, server pagination.
  • SBOM detail with components table.
  • Graph Explorer with overlays and neighborhood loaders.
  • Raw JSON viewer drawer.

Advisories & VEX

  • Advisory aggregation list; per-source chips; raw view.
  • VEX list with filters; precedence explainer per tuple.
  • Link outs to Findings and SBOMs.

Findings

  • Virtualized table; filters; saved views.
  • Explain view: rules fired, references, trace links.
  • Export actions (CSV/JSON stream).

Policies

  • Monaco editor with syntax/diagnostics; compile and simulate.
  • Diff and impact panel; submit and approve workflow.
  • Run from simulation context.

Runs

  • Runs list; run detail with SSE progress.
  • Diff between runs; evidence bundle download.

Admin

  • Users/roles CRUD; token issuance; tenant management.
  • Integrations: OIDC config form; registry connections.
  • Settings for environment defaults.

Downloads

  • Registry tag fetch, pull commands, Helm snippet generator.
  • Air-gapped instructions and offline bundle download.

Imposed rule: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.

7.4 Quality gates

  • Playwright E2E for core flows: triage, simulate, approve, run, explain.
  • Storybook with a11y addon and interaction tests.
  • Lighthouse CI budgets met; perf regressions block merges.
  • i18n scaffolding ready; all strings externalized.
  • Security checks: CSP effective, OIDC flows tested, RBAC enforced.

7.5 Docs tasks

  • Populate all docs listed in section 6 with screenshots and animated GIFs.
  • Add “CLI vs UI” parity matrix and keep it in CI to detect drift.
  • Add “AOC user guide” callouts explaining raw fact immutability across pages.

8) Feature flags

  • ui.graph-explorer
  • ui.policy-editor
  • ui.ai-assist (off by default; when enabled, renders a right-rail for human-in-the-loop summaries)
  • ui.downloads

Flag definitions and defaults live in /docs/observability/ui-telemetry.md and config map.
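
Flag resolution can be sketched as typed defaults plus validated overrides from the config map. Only the flag names and the "off by default" state of `ui.ai-assist` come from the list above; the other default values and the function name are assumptions for illustration.

```typescript
type UiFlag = "ui.graph-explorer" | "ui.policy-editor" | "ui.ai-assist" | "ui.downloads";

// Assumed defaults: only ui.ai-assist is documented as off by default;
// the remaining values are illustrative.
const DEFAULT_FLAGS: Record<UiFlag, boolean> = {
  "ui.graph-explorer": true,
  "ui.policy-editor": true,
  "ui.ai-assist": false,
  "ui.downloads": true,
};

// Applies overrides on top of defaults. Unknown keys and non-boolean values
// are ignored so a malformed config map cannot enable or break a module.
function resolveFlags(overrides: Record<string, unknown>): Record<UiFlag, boolean> {
  const resolved: Record<UiFlag, boolean> = { ...DEFAULT_FLAGS };
  for (const key of Object.keys(DEFAULT_FLAGS) as UiFlag[]) {
    const value = overrides[key];
    if (typeof value === "boolean") resolved[key] = value;
  }
  return resolved;
}
```

Typing the flag names as a union means a misspelled flag in application code fails at compile time rather than silently reading as disabled.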


9) Acceptance criteria

  • Console ships as a container image with Helm deployment and a static build option.
  • SBOM Explorer visualizes graphs and overlays policy outcomes without page crashes on large SBOMs.
  • Advisories/VEX browsers display aggregation only, never merge sources; raw document viewers are present.
  • Findings view supports server-side pagination and Explain with rule traces.
  • Policy Editor compiles, simulates, diffs, and supports approval workflows.
  • Runs page shows live progress and enables evidence exports.
  • Admin handles users, roles, tenants, tokens, and OIDC configuration.
  • Downloads page lists all images and installation paths.
  • All pages meet a11y checks and pass Lighthouse budgets.
  • RBAC enforced in UI affordances and validated by API responses.

10) Risks and mitigations

  • Graph performance on very large SBOMs → use neighborhood windows and server filters; cap depth.
  • UI/CLI drift → parity matrix in CI; failing check blocks release.
  • Over-fetching → TanStack caching, cursor-based endpoints, and strict data-layer reviews.
  • Scope creep in Admin → feature-flag granular sections, ship iteratively.
  • AOC confusion → constant raw/derived labeling and “view raw” toggles.

11) Test plan

  • Unit: hooks and components; data adapters; graph layout utils.
  • E2E: Playwright flows for triage, simulation→approval→run→explain, admin RBAC.
  • A11y: axe-core in CI and manual keyboard checks.
  • Perf: Lighthouse against seeded data; visual regression on Storybook.
  • Security: OIDC happy and unhappy paths, CSP violation tests, SSRF resistance for downloads metadata.
  • Resilience: simulate API timeouts; verify error boundaries and retries.

12) Non-goals (this epic)

  • No server-side report authoring engine beyond export templates.
  • No proprietary graph database; server remains RESTful with indexed queries.
  • No speculative automatic policy changes; all edits remain human-driven.

13) Philosophy and guiding principles

  • AOC first: the UI respects facts vs decisions. Raw content is immutable and visible.
  • Deterministic outcomes: what you see equals what the Policy Engine produced, with an explanation you can export.
  • Explainability over cleverness: every badge, color, and status maps to a rule and a source.
  • Parity: UI is not a second-class citizen, and the CLI is not an afterthought.
  • Composability: modules are independent packages with clear contracts and tests.

Final reminder: Work of this type or tasks of this type on this component must also be applied everywhere else it should be applied.