feat(docs): Add comprehensive documentation for Vexer, Vulnerability Explorer, and Zastava modules
- Introduced AGENTS.md, README.md, TASKS.md, and implementation_plan.md for Vexer, detailing mission, responsibilities, key components, and operational notes.
- Established similar documentation structure for Vulnerability Explorer and Zastava modules, including their respective workflows, integrations, and observability notes.
- Created risk scoring profiles documentation outlining the core workflow, factor model, governance, and deliverables.
- Ensured all modules adhere to the Aggregation-Only Contract and maintain determinism and provenance in outputs.
		| @@ -1,228 +1,228 @@ | ||||
| # Deploying the StellaOps Console | ||||
|  | ||||
| > **Audience:** Deployment Guild, Console Guild, operators rolling out the web console.   | ||||
| > **Scope:** Helm and Docker Compose deployment steps, ingress/TLS configuration, required environment variables, health checks, offline/air-gap operation, and compliance checklist (Sprint 23). | ||||
|  | ||||
| The StellaOps Console ships as part of the `stellaops` stack Helm chart and Compose bundles maintained under `deploy/`. This guide describes the supported deployment paths, the configuration surface, and operational checks needed to run the console in connected or air-gapped environments. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1. Prerequisites | ||||
|  | ||||
| - Kubernetes cluster (v1.28+) with ingress controller (NGINX, Traefik, or equivalent) and Cert-Manager for automated TLS, or Docker host for Compose deployments.   | ||||
| - Container registry access to `registry.stella-ops.org` (or mirrored registry) for all images listed in `deploy/releases/*.yaml`.   | ||||
| - Authority service configured with console client (`aud=ui`, scopes `ui.read`, `ui.admin`).   | ||||
| - DNS entry pointing to the console hostname (for example, `console.acme.internal`).   | ||||
| - Cosign public key for manifest verification (`deploy/releases/manifest.json.sig`).   | ||||
| - Optional: Offline Kit bundle for air-gapped sites (`stella-ops-offline-kit-<ver>.tar.gz`). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 2. Helm deployment (recommended) | ||||
|  | ||||
| ### 2.1 Install chart repository | ||||
|  | ||||
| ```bash | ||||
| helm repo add stellaops https://downloads.stella-ops.org/helm | ||||
| helm repo update stellaops | ||||
| ``` | ||||
|  | ||||
| If operating offline, copy the chart archive from the Offline Kit (`deploy/helm/stellaops-<ver>.tgz`) and run: | ||||
|  | ||||
| ```bash | ||||
| helm install stellaops ./stellaops-<ver>.tgz --namespace stellaops --create-namespace | ||||
| ``` | ||||
|  | ||||
| ### 2.2 Base installation | ||||
|  | ||||
| ```bash | ||||
| helm install stellaops stellaops/stellaops \ | ||||
|   --namespace stellaops \ | ||||
|   --create-namespace \ | ||||
|   --values deploy/helm/stellaops/values-prod.yaml | ||||
| ``` | ||||
|  | ||||
| The chart deploys Authority, Console web/API gateway, Scanner API, Scheduler, and supporting services. The console frontend pod is labelled `app=stellaops-web-ui`. | ||||
|  | ||||
| ### 2.3 Helm values highlights | ||||
|  | ||||
| Key sections in `deploy/helm/stellaops/values-prod.yaml`: | ||||
|  | ||||
| | Path | Description | | ||||
| |------|-------------| | ||||
| | `console.ingress.host` | Hostname served by the console (`console.example.com`). | | ||||
| | `console.ingress.tls.secretName` | Kubernetes secret containing TLS certificate (generated by Cert-Manager or uploaded manually). | | ||||
| | `console.config.apiGateway.baseUrl` | Internal base URL the UI uses to reach the gateway (defaults to `https://stellaops-web`). | | ||||
| | `console.env.AUTHORITY_ISSUER` | Authority issuer URL (for example, `https://authority.example.com`). | | ||||
| | `console.env.AUTHORITY_CLIENT_ID` | Authority client ID for the console UI. | | ||||
| | `console.env.AUTHORITY_SCOPES` | Space-separated scopes required by UI (`ui.read ui.admin`). | | ||||
| | `console.resources` | CPU/memory requests and limits (default 250m CPU / 512Mi memory). | | ||||
| | `console.podAnnotations` | Optional annotations for service mesh or monitoring. | | ||||
|  | ||||
| Use `values-stage.yaml`, `values-dev.yaml`, or `values-airgap.yaml` as templates for other environments. | ||||
|  | ||||
| ### 2.4 TLS and ingress | ||||
|  | ||||
| Example ingress override: | ||||
|  | ||||
| ```yaml | ||||
| console: | ||||
|   ingress: | ||||
|     enabled: true | ||||
|     className: nginx | ||||
|     host: console.acme.internal | ||||
|     tls: | ||||
|       enabled: true | ||||
|       secretName: console-tls | ||||
| ``` | ||||
|  | ||||
| Generate certificates using Cert-Manager or provide an existing secret. For air-gapped deployments, pre-create the secret with the mirrored CA chain. | ||||
|  | ||||
| ### 2.5 Health checks | ||||
|  | ||||
| Console pods expose: | ||||
|  | ||||
| | Path | Purpose | Notes | | ||||
| |------|---------|-------| | ||||
| | `/health/live` | Liveness probe | Confirms process responsive. | | ||||
| | `/health/ready` | Readiness probe | Verifies configuration bootstrap and Authority reachability. | | ||||
| | `/metrics` | Prometheus metrics | Enabled when `console.metrics.enabled=true`. | | ||||
|  | ||||
| Helm chart sets default probes (`initialDelaySeconds: 10`, `periodSeconds: 15`). Adjust via `console.livenessProbe` and `console.readinessProbe`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 3. Docker Compose deployment | ||||
|  | ||||
| Located in `deploy/compose/docker-compose.console.yaml`. Quick start: | ||||
|  | ||||
| ```bash | ||||
| cd deploy/compose | ||||
| docker compose -f docker-compose.console.yaml --env-file console.env up -d | ||||
| ``` | ||||
|  | ||||
| `console.env` should define: | ||||
|  | ||||
| ``` | ||||
| CONSOLE_PUBLIC_BASE_URL=https://console.acme.internal | ||||
| AUTHORITY_ISSUER=https://authority.acme.internal | ||||
| AUTHORITY_CLIENT_ID=console-ui | ||||
| AUTHORITY_CLIENT_SECRET=<if using confidential client> | ||||
| AUTHORITY_SCOPES=ui.read ui.admin | ||||
| CONSOLE_GATEWAY_BASE_URL=https://api.acme.internal | ||||
| ``` | ||||
|  | ||||
| The compose bundle includes Traefik as reverse proxy with TLS termination. Update `traefik/dynamic/console.yml` for custom certificates or additional middlewares (CSP headers, rate limits). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 4. Environment variables | ||||
|  | ||||
| | Variable | Description | Default | | ||||
| |----------|-------------|---------| | ||||
| | `CONSOLE_PUBLIC_BASE_URL` | External URL used for redirects, deep links, and telemetry. | None (required). | | ||||
| | `CONSOLE_GATEWAY_BASE_URL` | URL of the web gateway that proxies API calls (`/console/*`). | Chart service name. | | ||||
| | `AUTHORITY_ISSUER` | Authority issuer (`https://authority.example.com`). | None (required). | | ||||
| | `AUTHORITY_CLIENT_ID` | OIDC client configured in Authority. | None (required). | | ||||
| | `AUTHORITY_SCOPES` | Space-separated scopes assigned to the console client. | `ui.read ui.admin`. | | ||||
| | `AUTHORITY_DPOP_ENABLED` | Enables DPoP challenge/response (recommended true). | `true`. | | ||||
| | `CONSOLE_FEATURE_FLAGS` | Comma-separated feature flags (`runs`, `downloads.offline`, etc.). | `runs,downloads,policies`. | | ||||
| | `CONSOLE_LOG_LEVEL` | Minimum log level (`Information`, `Debug`, etc.). | `Information`. | | ||||
| | `CONSOLE_METRICS_ENABLED` | Expose `/metrics` endpoint. | `true`. | | ||||
| | `CONSOLE_SENTRY_DSN` | Optional error reporting DSN. | Blank. | | ||||
|  | ||||
| When running behind additional proxies, set `ASPNETCORE_FORWARDEDHEADERS_ENABLED=true` to honour `X-Forwarded-*` headers. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 5. Security headers and CSP | ||||
|  | ||||
| The console serves a strict Content Security Policy (CSP) by default: | ||||
|  | ||||
| ``` | ||||
| default-src 'self'; | ||||
| connect-src 'self' https://*.stella-ops.local; | ||||
| script-src 'self'; | ||||
| style-src 'self' 'unsafe-inline'; | ||||
| img-src 'self' data:; | ||||
| font-src 'self'; | ||||
| frame-ancestors 'none'; | ||||
| ``` | ||||
|  | ||||
| Adjust via `console.config.cspOverrides` if additional domains are required. For integrations embedding the console, update OIDC redirect URIs and Authority scopes accordingly. | ||||
|  | ||||
| TLS recommendations: | ||||
|  | ||||
| - Use TLS 1.2+ with modern cipher suite policy.   | ||||
| - Enable HSTS (`Strict-Transport-Security: max-age=31536000; includeSubDomains`).   | ||||
| - Provide custom trust bundles via `console.config.trustBundleSecret` when using private CAs. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6. Logging and metrics | ||||
|  | ||||
| - Structured logs emitted to stdout with correlation IDs. Configure log shipping via Fluent Bit or similar.   | ||||
| - Metrics available at `/metrics` in Prometheus format. Key metrics include `ui_request_duration_seconds`, `ui_tenant_switch_total`, and `ui_download_manifest_refresh_seconds`.   | ||||
| - Enable OpenTelemetry exporter by setting `OTEL_EXPORTER_OTLP_ENDPOINT` and associated headers in environment variables. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 7. Offline and air-gap deployment | ||||
|  | ||||
| - Mirror container images using the Downloads workspace or Offline Kit manifest. Example: | ||||
|  | ||||
| ```bash | ||||
| oras copy registry.stella-ops.org/stellaops/web-ui@sha256:<digest> \ | ||||
|   registry.airgap.local/stellaops/web-ui:2025.10.0 | ||||
| ``` | ||||
|  | ||||
| - Import Offline Kit using `stella ouk import` before starting the console so manifest parity checks succeed.   | ||||
| - Use `values-airgap.yaml` to disable external telemetry endpoints and configure internal certificate chains.   | ||||
| - Run `helm upgrade --install` using the mirrored chart (`stellaops-<ver>.tgz`) and set `console.offlineMode=true` to surface offline banners. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 8. Health checks and remediation | ||||
|  | ||||
| | Check | Command | Expected result | | ||||
| |-------|---------|-----------------| | ||||
| | Pod status | `kubectl get pods -n stellaops` | `Running` state with restarts = 0. | | ||||
| | Liveness | `kubectl exec deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/live` | Returns `{"status":"Healthy"}`. | | ||||
| | Readiness | `kubectl exec deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/ready` | Returns `{"status":"Ready"}`. | | ||||
| | Gateway reachability | `curl -I https://console.example.com/api/console/status` | `200 OK` with CSP headers. | | ||||
| | Static assets | `curl -I https://console.example.com/static/assets/app.js` | `200 OK` with long cache headers. | | ||||
|  | ||||
| Troubleshooting steps: | ||||
|  | ||||
| - **Authority unreachable:** readiness fails with `AUTHORITY_UNREACHABLE`. Check DNS, trust bundles, and Authority service health.   | ||||
| - **Manifest mismatch:** console logs `DOWNLOAD_MANIFEST_SIGNATURE_INVALID`. Verify cosign key and re-sync manifest.   | ||||
| - **Ingress 404:** ensure ingress controller routes host to `stellaops-web-ui` service; check TLS secret name.   | ||||
| - **SSE blocked:** confirm proxy allows HTTP/1.1 and disables buffering on `/console/runs/*`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 9. References | ||||
|  | ||||
| - `deploy/helm/stellaops/values-*.yaml` - environment-specific overrides.   | ||||
| - `deploy/compose/docker-compose.console.yaml` - Compose bundle.   | ||||
| - `/docs/ui/downloads.md` - manifest and offline bundle guidance.   | ||||
| - `/docs/security/console-security.md` - CSP and Authority scopes.   | ||||
| - `/docs/24_OFFLINE_KIT.md` - Offline kit packaging and verification.   | ||||
| - `/docs/ops/deployment-runbook.md` (pending) - wider platform deployment steps. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 10. Compliance checklist | ||||
|  | ||||
| - [ ] Helm and Compose instructions verified against `deploy/` assets.   | ||||
| - [ ] Ingress/TLS guidance aligns with Security Guild recommendations.   | ||||
| - [ ] Environment variables documented with defaults and required values.   | ||||
| - [ ] Health/liveness/readiness endpoints tested and listed.   | ||||
| - [ ] Offline workflow (mirrors, manifest parity) captured.   | ||||
| - [ ] Logging and metrics surface documented metrics.   | ||||
| - [ ] CSP and security header defaults stated alongside override guidance.   | ||||
| - [ ] Troubleshooting section linked to relevant runbooks. | ||||
|  | ||||
| --- | ||||
|  | ||||
| *Last updated: 2025-10-27 (Sprint 23).*  | ||||
| # Deploying the StellaOps Console | ||||
|  | ||||
| > **Audience:** Deployment Guild, Console Guild, operators rolling out the web console.   | ||||
| > **Scope:** Helm and Docker Compose deployment steps, ingress/TLS configuration, required environment variables, health checks, offline/air-gap operation, and compliance checklist (Sprint 23). | ||||
|  | ||||
| The StellaOps Console ships as part of the `stellaops` stack Helm chart and Compose bundles maintained under `deploy/`. This guide describes the supported deployment paths, the configuration surface, and operational checks needed to run the console in connected or air-gapped environments. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1. Prerequisites | ||||
|  | ||||
| - Kubernetes cluster (v1.28+) with ingress controller (NGINX, Traefik, or equivalent) and Cert-Manager for automated TLS, or Docker host for Compose deployments.   | ||||
| - Container registry access to `registry.stella-ops.org` (or mirrored registry) for all images listed in `deploy/releases/*.yaml`.   | ||||
| - Authority service configured with console client (`aud=ui`, scopes `ui.read`, `ui.admin`).   | ||||
| - DNS entry pointing to the console hostname (for example, `console.acme.internal`).   | ||||
| - Cosign public key for manifest verification (`deploy/releases/manifest.json.sig`).   | ||||
| - Optional: Offline Kit bundle for air-gapped sites (`stella-ops-offline-kit-<ver>.tar.gz`). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 2. Helm deployment (recommended) | ||||
|  | ||||
| ### 2.1 Install chart repository | ||||
|  | ||||
| ```bash | ||||
| helm repo add stellaops https://downloads.stella-ops.org/helm | ||||
| helm repo update stellaops | ||||
| ``` | ||||
|  | ||||
| If operating offline, copy the chart archive from the Offline Kit (`deploy/helm/stellaops-<ver>.tgz`) and run: | ||||
|  | ||||
| ```bash | ||||
| helm install stellaops ./stellaops-<ver>.tgz --namespace stellaops --create-namespace | ||||
| ``` | ||||
|  | ||||
| ### 2.2 Base installation | ||||
|  | ||||
| ```bash | ||||
| helm install stellaops stellaops/stellaops \ | ||||
|   --namespace stellaops \ | ||||
|   --create-namespace \ | ||||
|   --values deploy/helm/stellaops/values-prod.yaml | ||||
| ``` | ||||
|  | ||||
| The chart deploys Authority, Console web/API gateway, Scanner API, Scheduler, and supporting services. The console frontend pod is labelled `app=stellaops-web-ui`. | ||||
|  | ||||
| ### 2.3 Helm values highlights | ||||
|  | ||||
| Key sections in `deploy/helm/stellaops/values-prod.yaml`: | ||||
|  | ||||
| | Path | Description | | ||||
| |------|-------------| | ||||
| | `console.ingress.host` | Hostname served by the console (`console.example.com`). | | ||||
| | `console.ingress.tls.secretName` | Kubernetes secret containing TLS certificate (generated by Cert-Manager or uploaded manually). | | ||||
| | `console.config.apiGateway.baseUrl` | Internal base URL the UI uses to reach the gateway (defaults to `https://stellaops-web`). | | ||||
| | `console.env.AUTHORITY_ISSUER` | Authority issuer URL (for example, `https://authority.example.com`). | | ||||
| | `console.env.AUTHORITY_CLIENT_ID` | Authority client ID for the console UI. | | ||||
| | `console.env.AUTHORITY_SCOPES` | Space-separated scopes required by UI (`ui.read ui.admin`). | | ||||
| | `console.resources` | CPU/memory requests and limits (default 250m CPU / 512Mi memory). | | ||||
| | `console.podAnnotations` | Optional annotations for service mesh or monitoring. | | ||||
|  | ||||
| Use `values-stage.yaml`, `values-dev.yaml`, or `values-airgap.yaml` as templates for other environments. | ||||
|  | ||||
| ### 2.4 TLS and ingress | ||||
|  | ||||
| Example ingress override: | ||||
|  | ||||
| ```yaml | ||||
| console: | ||||
|   ingress: | ||||
|     enabled: true | ||||
|     className: nginx | ||||
|     host: console.acme.internal | ||||
|     tls: | ||||
|       enabled: true | ||||
|       secretName: console-tls | ||||
| ``` | ||||
|  | ||||
| Generate certificates using Cert-Manager or provide an existing secret. For air-gapped deployments, pre-create the secret with the mirrored CA chain. | ||||
|  | ||||
| ### 2.5 Health checks | ||||
|  | ||||
| Console pods expose: | ||||
|  | ||||
| | Path | Purpose | Notes | | ||||
| |------|---------|-------| | ||||
| | `/health/live` | Liveness probe | Confirms process responsive. | | ||||
| | `/health/ready` | Readiness probe | Verifies configuration bootstrap and Authority reachability. | | ||||
| | `/metrics` | Prometheus metrics | Enabled when `console.metrics.enabled=true`. | | ||||
|  | ||||
| Helm chart sets default probes (`initialDelaySeconds: 10`, `periodSeconds: 15`). Adjust via `console.livenessProbe` and `console.readinessProbe`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 3. Docker Compose deployment | ||||
|  | ||||
| Located in `deploy/compose/docker-compose.console.yaml`. Quick start: | ||||
|  | ||||
| ```bash | ||||
| cd deploy/compose | ||||
| docker compose -f docker-compose.console.yaml --env-file console.env up -d | ||||
| ``` | ||||
|  | ||||
| `console.env` should define: | ||||
|  | ||||
| ``` | ||||
| CONSOLE_PUBLIC_BASE_URL=https://console.acme.internal | ||||
| AUTHORITY_ISSUER=https://authority.acme.internal | ||||
| AUTHORITY_CLIENT_ID=console-ui | ||||
| AUTHORITY_CLIENT_SECRET=<if using confidential client> | ||||
| AUTHORITY_SCOPES=ui.read ui.admin | ||||
| CONSOLE_GATEWAY_BASE_URL=https://api.acme.internal | ||||
| ``` | ||||
|  | ||||
| The compose bundle includes Traefik as reverse proxy with TLS termination. Update `traefik/dynamic/console.yml` for custom certificates or additional middlewares (CSP headers, rate limits). | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 4. Environment variables | ||||
|  | ||||
| | Variable | Description | Default | | ||||
| |----------|-------------|---------| | ||||
| | `CONSOLE_PUBLIC_BASE_URL` | External URL used for redirects, deep links, and telemetry. | None (required). | | ||||
| | `CONSOLE_GATEWAY_BASE_URL` | URL of the web gateway that proxies API calls (`/console/*`). | Chart service name. | | ||||
| | `AUTHORITY_ISSUER` | Authority issuer (`https://authority.example.com`). | None (required). | | ||||
| | `AUTHORITY_CLIENT_ID` | OIDC client configured in Authority. | None (required). | | ||||
| | `AUTHORITY_SCOPES` | Space-separated scopes assigned to the console client. | `ui.read ui.admin`. | | ||||
| | `AUTHORITY_DPOP_ENABLED` | Enables DPoP challenge/response (recommended true). | `true`. | | ||||
| | `CONSOLE_FEATURE_FLAGS` | Comma-separated feature flags (`runs`, `downloads.offline`, etc.). | `runs,downloads,policies`. | | ||||
| | `CONSOLE_LOG_LEVEL` | Minimum log level (`Information`, `Debug`, etc.). | `Information`. | | ||||
| | `CONSOLE_METRICS_ENABLED` | Expose `/metrics` endpoint. | `true`. | | ||||
| | `CONSOLE_SENTRY_DSN` | Optional error reporting DSN. | Blank. | | ||||
|  | ||||
| When running behind additional proxies, set `ASPNETCORE_FORWARDEDHEADERS_ENABLED=true` to honour `X-Forwarded-*` headers. | ||||
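The variables marked "None (required)" in the table above can be checked before the stack is started. The following is a minimal sketch, not part of the shipped tooling: `missing_console_vars` is a hypothetical helper name, and it assumes a plain `NAME=value` env file such as the `console.env` used for Compose deployments.

```bash
# Hypothetical helper (not shipped with StellaOps): print the required console
# variables that are absent or empty in a NAME=value env file.
missing_console_vars() {
  env_file="$1"
  missing=""
  for name in CONSOLE_PUBLIC_BASE_URL AUTHORITY_ISSUER AUTHORITY_CLIENT_ID; do
    # A variable counts as set only when the line reads NAME=<non-empty value>.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      missing="${missing}${missing:+ }${name}"
    fi
  done
  # Prints an empty line when nothing is missing.
  printf '%s\n' "$missing"
}
```

A wrapper script could call `missing_console_vars console.env` and refuse to run `docker compose ... up -d` when the output is non-empty.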
|  | ||||
| --- | ||||
|  | ||||
| ## 5. Security headers and CSP | ||||
|  | ||||
| The console serves a strict Content Security Policy (CSP) by default: | ||||
|  | ||||
| ``` | ||||
| default-src 'self'; | ||||
| connect-src 'self' https://*.stella-ops.local; | ||||
| script-src 'self'; | ||||
| style-src 'self' 'unsafe-inline'; | ||||
| img-src 'self' data:; | ||||
| font-src 'self'; | ||||
| frame-ancestors 'none'; | ||||
| ``` | ||||
|  | ||||
| Adjust via `console.config.cspOverrides` if additional domains are required. For integrations embedding the console, update OIDC redirect URIs and Authority scopes accordingly. | ||||
|  | ||||
| TLS recommendations: | ||||
|  | ||||
| - Use TLS 1.2+ with modern cipher suite policy.   | ||||
| - Enable HSTS (`Strict-Transport-Security: max-age=31536000; includeSubDomains`).   | ||||
| - Provide custom trust bundles via `console.config.trustBundleSecret` when using private CAs. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6. Logging and metrics | ||||
|  | ||||
| - Structured logs emitted to stdout with correlation IDs. Configure log shipping via Fluent Bit or similar.   | ||||
| - Metrics available at `/metrics` in Prometheus format. Key metrics include `ui_request_duration_seconds`, `ui_tenant_switch_total`, and `ui_download_manifest_refresh_seconds`.   | ||||
| - Enable OpenTelemetry exporter by setting `OTEL_EXPORTER_OTLP_ENDPOINT` and associated headers in environment variables. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 7. Offline and air-gap deployment | ||||
|  | ||||
| - Mirror container images using the Downloads workspace or Offline Kit manifest. Example: | ||||
|  | ||||
| ```bash | ||||
| oras copy registry.stella-ops.org/stellaops/web-ui@sha256:<digest> \ | ||||
|   registry.airgap.local/stellaops/web-ui:2025.10.0 | ||||
| ``` | ||||
|  | ||||
| - Import Offline Kit using `stella ouk import` before starting the console so manifest parity checks succeed.   | ||||
| - Use `values-airgap.yaml` to disable external telemetry endpoints and configure internal certificate chains.   | ||||
| - Run `helm upgrade --install` using the mirrored chart (`stellaops-<ver>.tgz`) and set `console.offlineMode=true` to surface offline banners. | ||||
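When several images have to be mirrored, the destination reference for each `oras copy` call can be derived from the source reference. The sketch below only illustrates that mapping: `mirror_target` is a hypothetical helper, and the mirror host and tag are operator-supplied assumptions, since a digest reference carries no tag of its own.

```bash
# Hypothetical helper: map a digest-pinned source reference to a tagged
# reference on the mirror registry, keeping the repository path unchanged.
mirror_target() {
  src="$1"     # e.g. registry.stella-ops.org/stellaops/web-ui@sha256:<digest>
  mirror="$2"  # e.g. registry.airgap.local
  tag="$3"     # e.g. 2025.10.0
  repo="${src%%@*}"   # drop the @sha256:<digest> suffix
  path="${repo#*/}"   # drop the source registry host
  printf '%s/%s:%s\n' "$mirror" "$path" "$tag"
}
```

With this in place, a mirroring loop could run `oras copy "$src" "$(mirror_target "$src" registry.airgap.local 2025.10.0)"` for each source reference taken from the Offline Kit manifest.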
|  | ||||
| --- | ||||
|  | ||||
| ## 8. Health checks and remediation | ||||
|  | ||||
| | Check | Command | Expected result | | ||||
| |-------|---------|-----------------| | ||||
| | Pod status | `kubectl get pods -n stellaops` | `Running` state with restarts = 0. | | ||||
| | Liveness | `kubectl exec deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/live` | Returns `{"status":"Healthy"}`. | | ||||
| | Readiness | `kubectl exec deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/ready` | Returns `{"status":"Ready"}`. | | ||||
| | Gateway reachability | `curl -I https://console.example.com/api/console/status` | `200 OK` with CSP headers. | | ||||
| | Static assets | `curl -I https://console.example.com/static/assets/app.js` | `200 OK` with long cache headers. | | ||||
|  | ||||
| Troubleshooting steps: | ||||
|  | ||||
| - **Authority unreachable:** readiness fails with `AUTHORITY_UNREACHABLE`. Check DNS, trust bundles, and Authority service health.   | ||||
| - **Manifest mismatch:** console logs `DOWNLOAD_MANIFEST_SIGNATURE_INVALID`. Verify cosign key and re-sync manifest.   | ||||
| - **Ingress 404:** ensure ingress controller routes host to `stellaops-web-ui` service; check TLS secret name.   | ||||
| - **SSE blocked:** confirm proxy allows HTTP/1.1 and disables buffering on `/console/runs/*`. | ||||
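The liveness and readiness rows in the table above can be scripted for post-deployment checks. This is a minimal sketch under the assumption that the endpoints return the compact JSON bodies shown in the table; `health_status_is` is a hypothetical helper, and it does a literal substring match rather than parsing JSON.

```bash
# Hypothetical helper: succeed when a health response body reports the
# expected status. Matches the compact form {"status":"<value>"} literally,
# so a body with extra whitespace around the colon will not match.
health_status_is() {
  expected="$1"
  body="$2"
  case "$body" in
    *"\"status\":\"${expected}\""*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `health_status_is Ready "$(kubectl exec deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/ready)"` returns a non-zero exit code when the pod is not ready, which a rollout script can use to stop early.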
|  | ||||
| --- | ||||
|  | ||||
| ## 9. References | ||||
|  | ||||
| - `deploy/helm/stellaops/values-*.yaml` - environment-specific overrides.   | ||||
| - `deploy/compose/docker-compose.console.yaml` - Compose bundle.   | ||||
| - `/docs/ui/downloads.md` - manifest and offline bundle guidance.   | ||||
| - `/docs/security/console-security.md` - CSP and Authority scopes.   | ||||
| - `/docs/24_OFFLINE_KIT.md` - Offline kit packaging and verification.   | ||||
| - `/docs/modules/devops/runbooks/deployment-runbook.md` (pending) - wider platform deployment steps. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 10. Compliance checklist | ||||
|  | ||||
| - [ ] Helm and Compose instructions verified against `deploy/` assets.   | ||||
| - [ ] Ingress/TLS guidance aligns with Security Guild recommendations.   | ||||
| - [ ] Environment variables documented with defaults and required values.   | ||||
| - [ ] Health/liveness/readiness endpoints tested and listed.   | ||||
| - [ ] Offline workflow (mirrors, manifest parity) captured.   | ||||
| - [ ] Logging and metrics surface documented metrics.   | ||||
| - [ ] CSP and security header defaults stated alongside override guidance.   | ||||
| - [ ] Troubleshooting section linked to relevant runbooks. | ||||
|  | ||||
| --- | ||||
|  | ||||
| *Last updated: 2025-10-27 (Sprint 23).*  | ||||
|   | ||||
| @@ -1,160 +1,160 @@ | ||||
| # Container Deployment Guide — AOC Update | ||||
|  | ||||
| > **Audience:** DevOps Guild, platform operators deploying StellaOps services.   | ||||
| > **Scope:** Deployment configuration changes required by the Aggregation-Only Contract (AOC), including schema validators, guard environment flags, and verifier identities. | ||||
|  | ||||
| This guide supplements existing deployment manuals with AOC-specific configuration. It assumes familiarity with the base Compose/Helm manifests described in `ops/deployment/` and `docs/ARCHITECTURE_DEVOPS.md`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1 · Schema validator enablement | ||||
|  | ||||
| ### 1.1 MongoDB validators | ||||
|  | ||||
| - Apply JSON schema validators to `advisory_raw` and `vex_raw` collections before enabling AOC guards. | ||||
| - Before enabling validators or the idempotency index, run the duplicate audit helper to confirm no conflicting raw advisories remain: | ||||
|   ```bash | ||||
|   mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;' | ||||
|   ``` | ||||
|   Resolve any reported rows prior to rollout. | ||||
| - Use the migration script provided in `ops/devops/scripts/apply-aoc-validators.js`: | ||||
|  | ||||
| ```bash | ||||
| kubectl exec -n concelier deploy/concelier-mongo -- \ | ||||
|   mongo concelier ops/devops/scripts/apply-aoc-validators.js | ||||
|  | ||||
| kubectl exec -n excititor deploy/excititor-mongo -- \ | ||||
|   mongo excititor ops/devops/scripts/apply-aoc-validators.js | ||||
| ``` | ||||
|  | ||||
| - Validators enforce required fields (`tenant`, `source`, `upstream`, `linkset`) and reject forbidden keys at DB level. | ||||
| - Rollback plan: validators are applied with `validationLevel: moderate`—downgrade via the same script with `--remove` if required. | ||||
|  | ||||
| ### 1.2 Migration order | ||||
|  | ||||
| 1. Deploy validators in maintenance window. | ||||
| 2. Roll out Concelier/Excititor images with guard middleware enabled (`AOC_GUARD_ENABLED=true`). | ||||
| 3. Run smoke tests (`stella sources ingest --dry-run` fixtures) before resuming production ingestion. | ||||
|  | ||||
| ### 1.3 Supersedes backfill verification | ||||
|  | ||||
| 1. **Duplicate audit:** Confirm `mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'` reports no conflicts before restarting Concelier with the new migrations. | ||||
| 2. **Post-migration check:** After the service restarts, validate that `db.advisory` is a view pointing to `advisory_backup_20251028`: | ||||
|    ```bash | ||||
|    mongo concelier --quiet --eval 'db.getCollectionInfos({ name: "advisory" })[0]' | ||||
|    ``` | ||||
|    The `type` should be `"view"` and `options.viewOn` should equal `"advisory_backup_20251028"`. | ||||
| 3. **Supersedes chain spot-check:** Inspect a sample set to ensure deterministic chaining: | ||||
|    ```bash | ||||
|    mongo concelier --quiet --eval ' | ||||
|      db.advisory_raw.aggregate([ | ||||
|        { $match: { "upstream.upstream_id": { $exists: true } } }, | ||||
|        { $sort: { "tenant": 1, "source.vendor": 1, "upstream.upstream_id": 1, "upstream.retrieved_at": 1 } }, | ||||
|        { $limit: 5 }, | ||||
|        { $project: { _id: 1, supersedes: 1 } } | ||||
|      ]).forEach(printjson)' | ||||
|    ``` | ||||
|    Each revision should reference the previous `_id` (or `null` for the first revision). Record findings in the change ticket before proceeding to production. | ||||
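The chaining rule in step 3 can also be checked mechanically. The sketch below assumes the spot-check output has already been reduced to two whitespace-separated columns, `_id` and `supersedes`, for the revisions of a single upstream document in `retrieved_at` order; that reduction and the helper name `check_supersedes_chain` are assumptions for illustration, not part of the migration scripts.

```bash
# Hypothetical helper: read "<_id> <supersedes>" lines for one upstream
# document, oldest revision first, and fail if the chain is broken.
check_supersedes_chain() {
  awk '
    NR == 1 { if ($2 != "null") bad = 1 }  # first revision supersedes nothing
    NR > 1  { if ($2 != prev) bad = 1 }    # later revisions point at the previous _id
    { prev = $1 }
    END { exit bad ? 1 : 0 }
  '
}
```

A non-zero exit status signals a revision whose `supersedes` value does not match the preceding `_id`, which is worth recording in the change ticket alongside the manual findings.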
|  | ||||
| --- | ||||
|  | ||||
| ## 2 · Container environment flags | ||||
|  | ||||
| Add the following environment variables to Concelier/Excititor deployments: | ||||
|  | ||||
| | Variable | Default | Description | | ||||
| |----------|---------|-------------| | ||||
| | `AOC_GUARD_ENABLED` | `true` | Enables `AOCWriteGuard` interception. Set `false` only for controlled rollback. | | ||||
| | `AOC_ALLOW_SUPERSEDES_RETROFIT` | `false` | Allows temporary supersedes backfill during migration. Remove after cutover. | | ||||
| | `AOC_METRICS_ENABLED` | `true` | Emits `ingestion_write_total`, `aoc_violation_total`, etc. | | ||||
| | `AOC_TENANT_HEADER` | `X-Stella-Tenant` | Header name expected from Gateway. | | ||||
| | `AOC_VERIFIER_USER` | `stella-aoc-verify` | Read-only service user used by UI/CLI verification. | | ||||
|  | ||||
| Compose snippet: | ||||
|  | ||||
| ```yaml | ||||
| environment: | ||||
|   - AOC_GUARD_ENABLED=true | ||||
|   - AOC_ALLOW_SUPERSEDES_RETROFIT=false | ||||
|   - AOC_METRICS_ENABLED=true | ||||
|   - AOC_TENANT_HEADER=X-Stella-Tenant | ||||
|   - AOC_VERIFIER_USER=stella-aoc-verify | ||||
| ``` | ||||
|  | ||||
| Ensure `AOC_VERIFIER_USER` exists in Authority with `aoc:verify` scope and no write permissions. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 3 · Verifier identity | ||||
|  | ||||
| - Create a dedicated client (`stella-aoc-verify`) via Authority bootstrap: | ||||
|  | ||||
| ```yaml | ||||
| clients: | ||||
|   - clientId: stella-aoc-verify | ||||
|     grantTypes: [client_credentials] | ||||
|     scopes: [aoc:verify, advisory:read, vex:read] | ||||
|     tenants: [default] | ||||
| ``` | ||||
|  | ||||
| - Store credentials in secret store (`Kubernetes Secret`, `Docker swarm secret`). | ||||
| - Bind credentials to `stella aoc verify` CI jobs and Console verification service. | ||||
| - Rotate quarterly; document in `ops/authority-key-rotation.md`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 4 · Deployment steps | ||||
|  | ||||
| 1. **Pre-checks:** Confirm database backups, alerting in maintenance mode, and staging environment validated. | ||||
| 2. **Apply validators:** Run scripts per § 1.1. | ||||
| 3. **Update manifests:** Inject environment variables (§ 2) and mount guard configuration configmaps. | ||||
| 4. **Redeploy services:** Rolling restart Concelier/Excititor pods. Monitor `ingestion_write_total` for steady throughput. | ||||
| 5. **Seed verifier:** Deploy read-only verifier user and store credentials. | ||||
| 6. **Run verification:** Execute `stella aoc verify --since 24h` and ensure exit code `0`. | ||||
| 7. **Update dashboards:** Point Grafana panels to new metrics (`aoc_violation_total`). | ||||
| 8. **Record handoff:** Capture console screenshots and verification logs for release notes. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 5 · Offline Kit updates | ||||
|  | ||||
| - Ship validator scripts with Offline Kit (`offline-kit/scripts/apply-aoc-validators.js`). | ||||
| - Include pre-generated verification reports for air-gapped deployments. | ||||
| - Document offline CLI workflow in bundle README referencing `docs/cli/cli-reference.md`. | ||||
| - Ensure `stella-aoc-verify` credentials are scoped to offline tenant and rotated during bundle refresh. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6 · Rollback plan | ||||
|  | ||||
| 1. Disable guard via `AOC_GUARD_ENABLED=false` on Concelier/Excititor and rollout. | ||||
| 2. Remove validators with the migration script (`--remove`). | ||||
| 3. Pause verification jobs to prevent noise. | ||||
| 4. Investigate and remediate upstream issues before re-enabling guards. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 7 · References | ||||
|  | ||||
| - [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md) | ||||
| - [Authority scopes & tenancy](../security/authority-scopes.md) | ||||
| - [Observability guide](../observability/observability.md) | ||||
| - [CLI AOC commands](../cli/cli-reference.md) | ||||
| - [Concelier architecture](../ARCHITECTURE_CONCELIER.md) | ||||
| - [Excititor architecture](../ARCHITECTURE_EXCITITOR.md) | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 8 · Compliance checklist | ||||
|  | ||||
| - [ ] Validators documented and scripts referenced for online/offline deployments. | ||||
| - [ ] Environment variables cover guard enablement, metrics, and tenant header. | ||||
| - [ ] Read-only verifier user installation steps included. | ||||
| - [ ] Offline kit instructions align with validator/verification workflow. | ||||
| - [ ] Rollback procedure captured. | ||||
| - [ ] Cross-links to AOC docs, Authority scopes, and observability guides present. | ||||
| - [ ] DevOps Guild sign-off tracked (owner: @devops-guild, due 2025-10-29). | ||||
|  | ||||
| --- | ||||
|  | ||||
| *Last updated: 2025-10-26 (Sprint 19).*  | ||||
| # Container Deployment Guide — AOC Update | ||||
|  | ||||
| > **Audience:** DevOps Guild, platform operators deploying StellaOps services.   | ||||
| > **Scope:** Deployment configuration changes required by the Aggregation-Only Contract (AOC), including schema validators, guard environment flags, and verifier identities. | ||||
|  | ||||
| This guide supplements existing deployment manuals with AOC-specific configuration. It assumes familiarity with the base Compose/Helm manifests described in `ops/deployment/` and `docs/modules/devops/architecture.md`. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 1 · Schema validator enablement | ||||
|  | ||||
| ### 1.1 MongoDB validators | ||||
|  | ||||
| - Apply JSON schema validators to `advisory_raw` and `vex_raw` collections before enabling AOC guards. | ||||
| - Before enabling validators or the idempotency index, run the duplicate audit helper to confirm no conflicting raw advisories remain: | ||||
|   ```bash | ||||
|   mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;' | ||||
|   ``` | ||||
|   Resolve any reported rows prior to rollout. | ||||
| - Use the migration script provided in `ops/devops/scripts/apply-aoc-validators.js`: | ||||
|  | ||||
| ```bash | ||||
| kubectl exec -n concelier deploy/concelier-mongo -- \ | ||||
|   mongo concelier ops/devops/scripts/apply-aoc-validators.js | ||||
|  | ||||
| kubectl exec -n excititor deploy/excititor-mongo -- \ | ||||
|   mongo excititor ops/devops/scripts/apply-aoc-validators.js | ||||
| ``` | ||||
|  | ||||
| - Validators enforce required fields (`tenant`, `source`, `upstream`, `linkset`) and reject forbidden keys at DB level. | ||||
| - Rollback plan: validators are applied with `validationLevel: moderate`; if required, remove them by rerunning the same script with `--remove`. | ||||
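
To make the validator behaviour above concrete, the following is a minimal sketch of the kind of `collMod` command such a script might issue. It is not the content of `apply-aoc-validators.js`: the function name `buildValidatorCommand` and the `FORBIDDEN_KEYS` placeholder are assumptions made for illustration, and only the required fields and `validationLevel: moderate` come from this guide.

```javascript
// Sketch only: the authoritative schema lives in apply-aoc-validators.js.
// Required fields are taken from this guide.
const REQUIRED_FIELDS = ["tenant", "source", "upstream", "linkset"];
// Hypothetical placeholder; the real forbidden-key list is not given here.
const FORBIDDEN_KEYS = ["example_forbidden_key"];

function buildValidatorCommand(collection, forbiddenKeys = FORBIDDEN_KEYS) {
  if (forbiddenKeys.length === 0) {
    throw new Error("forbiddenKeys must not be empty (anyOf needs one entry)");
  }
  return {
    collMod: collection,
    validator: {
      $jsonSchema: {
        bsonType: "object",
        // Documents missing any of these fields are rejected.
        required: REQUIRED_FIELDS,
        // Documents carrying any forbidden top-level key are rejected.
        not: { anyOf: forbiddenKeys.map((key) => ({ required: [key] })) },
      },
    },
    // "moderate" leaves existing non-conforming documents untouched.
    validationLevel: "moderate",
  };
}

// In a mongo shell session the command would be applied with:
//   db.runCommand(buildValidatorCommand("advisory_raw"));
```

Building the command as a plain object keeps it reviewable before it touches the database; the same object could be generated for `vex_raw` by changing the collection argument.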
|  | ||||
| ### 1.2 Migration order | ||||
|  | ||||
| 1. Deploy validators during a maintenance window. | ||||
| 2. Roll out Concelier/Excititor images with guard middleware enabled (`AOC_GUARD_ENABLED=true`). | ||||
| 3. Run smoke tests (`stella sources ingest --dry-run` fixtures) before resuming production ingestion. | ||||
|  | ||||
| ### 1.3 Supersedes backfill verification | ||||
|  | ||||
| 1. **Duplicate audit:** Confirm `mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'` reports no conflicts before restarting Concelier with the new migrations. | ||||
| 2. **Post-migration check:** After the service restarts, validate that `db.advisory` is a view pointing to `advisory_backup_20251028`: | ||||
|    ```bash | ||||
|    mongo concelier --quiet --eval 'db.getCollectionInfos({ name: "advisory" })[0]' | ||||
|    ``` | ||||
|    The `type` should be `"view"` and `options.viewOn` should equal `"advisory_backup_20251028"`. | ||||
| 3. **Supersedes chain spot-check:** Inspect a sample set to ensure deterministic chaining: | ||||
|    ```bash | ||||
|    mongo concelier --quiet --eval ' | ||||
|      db.advisory_raw.aggregate([ | ||||
|        { $match: { "upstream.upstream_id": { $exists: true } } }, | ||||
|        { $sort: { "tenant": 1, "source.vendor": 1, "upstream.upstream_id": 1, "upstream.retrieved_at": 1 } }, | ||||
|        { $limit: 5 }, | ||||
|        { $project: { _id: 1, supersedes: 1 } } | ||||
|      ]).forEach(printjson)' | ||||
|    ``` | ||||
|    Each revision should reference the previous `_id` (or `null` for the first revision). Record findings in the change ticket before proceeding to production. | ||||
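
The chaining rule in step 3 can be checked mechanically rather than by eye. The sketch below assumes the revisions for one `(tenant, source.vendor, upstream.upstream_id)` group have already been fetched and sorted oldest-first, and that `_id` values are plain strings; `findSupersedesBreaks` is a hypothetical helper, not part of the StellaOps tooling.

```javascript
// Returns every revision whose `supersedes` does not point at the
// immediately preceding revision. An empty array means the chain is intact.
// Assumes string _id values; ObjectIds would need an equality helper instead.
function findSupersedesBreaks(revisions) {
  const breaks = [];
  let previousId = null; // the first revision is expected to supersede nothing
  for (const rev of revisions) {
    const actual = rev.supersedes ?? null; // treat a missing field as null
    if (actual !== previousId) {
      breaks.push({ id: rev._id, expected: previousId, actual });
    }
    previousId = rev._id;
  }
  return breaks;
}
```

Any entries returned identify the revision where the chain diverges, which is the detail worth recording in the change ticket.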
|  | ||||
| --- | ||||
|  | ||||
| ## 2 · Container environment flags | ||||
|  | ||||
| Add the following environment variables to Concelier/Excititor deployments: | ||||
|  | ||||
| | Variable | Default | Description | | ||||
| |----------|---------|-------------| | ||||
| | `AOC_GUARD_ENABLED` | `true` | Enables `AOCWriteGuard` interception. Set `false` only for controlled rollback. | | ||||
| | `AOC_ALLOW_SUPERSEDES_RETROFIT` | `false` | Allows temporary supersedes backfill during migration. Remove after cutover. | | ||||
| | `AOC_METRICS_ENABLED` | `true` | Emits `ingestion_write_total`, `aoc_violation_total`, etc. | | ||||
| | `AOC_TENANT_HEADER` | `X-Stella-Tenant` | Header name expected from Gateway. | | ||||
| | `AOC_VERIFIER_USER` | `stella-aoc-verify` | Read-only service user used by UI/CLI verification. | | ||||
|  | ||||
| Compose snippet: | ||||
|  | ||||
| ```yaml | ||||
| environment: | ||||
|   - AOC_GUARD_ENABLED=true | ||||
|   - AOC_ALLOW_SUPERSEDES_RETROFIT=false | ||||
|   - AOC_METRICS_ENABLED=true | ||||
|   - AOC_TENANT_HEADER=X-Stella-Tenant | ||||
|   - AOC_VERIFIER_USER=stella-aoc-verify | ||||
| ``` | ||||
|  | ||||
| Ensure `AOC_VERIFIER_USER` exists in Authority with `aoc:verify` scope and no write permissions. | ||||
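
As an illustration of how the defaults in the table combine with overrides, here is a small pre-deployment lint sketch. The variable names and default values come from the table above; the helper `resolveAocFlags` and its warning messages are assumptions, and how Concelier/Excititor actually parse these values is not specified in this guide.

```javascript
// Defaults copied from the environment-variable table in this section.
const AOC_DEFAULTS = {
  AOC_GUARD_ENABLED: "true",
  AOC_ALLOW_SUPERSEDES_RETROFIT: "false",
  AOC_METRICS_ENABLED: "true",
  AOC_TENANT_HEADER: "X-Stella-Tenant",
  AOC_VERIFIER_USER: "stella-aoc-verify",
};

// Merges overrides onto the defaults and flags settings that the guide
// describes as temporary (rollback or migration only).
function resolveAocFlags(env) {
  const flags = { ...AOC_DEFAULTS };
  for (const name of Object.keys(AOC_DEFAULTS)) {
    if (env[name] !== undefined && env[name] !== "") {
      flags[name] = env[name];
    }
  }
  const warnings = [];
  if (flags.AOC_GUARD_ENABLED !== "true") {
    warnings.push("AOC_GUARD_ENABLED is off: acceptable only for controlled rollback");
  }
  if (flags.AOC_ALLOW_SUPERSEDES_RETROFIT === "true") {
    warnings.push("AOC_ALLOW_SUPERSEDES_RETROFIT is on: remove after cutover");
  }
  return { flags, warnings };
}
```

Calling `resolveAocFlags(process.env)` in a CI step would surface a leftover retrofit flag before the manifests are applied.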
|  | ||||
| --- | ||||
|  | ||||
| ## 3 · Verifier identity | ||||
|  | ||||
| - Create a dedicated client (`stella-aoc-verify`) via Authority bootstrap: | ||||
|  | ||||
| ```yaml | ||||
| clients: | ||||
|   - clientId: stella-aoc-verify | ||||
|     grantTypes: [client_credentials] | ||||
|     scopes: [aoc:verify, advisory:read, vex:read] | ||||
|     tenants: [default] | ||||
| ``` | ||||
|  | ||||
| - Store credentials in a secret store (`Kubernetes Secret`, `Docker Swarm secret`). | ||||
| - Bind credentials to `stella aoc verify` CI jobs and the Console verification service. | ||||
| - Rotate credentials quarterly and document the rotation in `ops/authority-key-rotation.md`. | ||||
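
Because the verifier must stay read-only, it can help to audit the client definition before bootstrapping it. The sketch below treats the three scopes in the bootstrap snippet above as an allow-list; `auditVerifierClient` is a hypothetical helper, and the assumption that any other scope is unwanted is ours rather than a documented Authority rule.

```javascript
// Allow-list taken from the bootstrap snippet in this section.
const VERIFIER_ALLOWED_SCOPES = ["aoc:verify", "advisory:read", "vex:read"];

// Returns a list of human-readable problems; an empty list means the
// client definition matches the read-only profile described here.
function auditVerifierClient(client) {
  const problems = [];
  const scopes = client.scopes ?? [];
  if (!scopes.includes("aoc:verify")) {
    problems.push("missing required scope aoc:verify");
  }
  for (const scope of scopes) {
    if (!VERIFIER_ALLOWED_SCOPES.includes(scope)) {
      problems.push(`unexpected scope ${scope}`);
    }
  }
  const grants = client.grantTypes ?? [];
  if (grants.length !== 1 || grants[0] !== "client_credentials") {
    problems.push("grantTypes should be exactly [client_credentials]");
  }
  return problems;
}
```

An allow-list fails closed: a scope added later is reported until someone deliberately extends the list.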
|  | ||||
| --- | ||||
|  | ||||
| ## 4 · Deployment steps | ||||
|  | ||||
| 1. **Pre-checks:** Confirm that database backups exist, alerting is in maintenance mode, and the staging environment has been validated. | ||||
| 2. **Apply validators:** Run scripts per § 1.1. | ||||
| 3. **Update manifests:** Inject environment variables (§ 2) and mount the guard configuration ConfigMaps. | ||||
| 4. **Redeploy services:** Perform a rolling restart of the Concelier/Excititor pods. Monitor `ingestion_write_total` for steady throughput. | ||||
| 5. **Seed verifier:** Create the read-only verifier user (§ 3) and store its credentials. | ||||
| 6. **Run verification:** Execute `stella aoc verify --since 24h` and confirm it exits with code `0`. | ||||
| 7. **Update dashboards:** Point Grafana panels to new metrics (`aoc_violation_total`). | ||||
| 8. **Record handoff:** Capture console screenshots and verification logs for release notes. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 5 · Offline Kit updates | ||||
|  | ||||
| - Ship validator scripts with the Offline Kit (`offline-kit/scripts/apply-aoc-validators.js`). | ||||
| - Include pre-generated verification reports for air-gapped deployments. | ||||
| - Document the offline CLI workflow in the bundle README, referencing `docs/modules/cli/guides/cli-reference.md`. | ||||
| - Ensure `stella-aoc-verify` credentials are scoped to the offline tenant and rotated during bundle refresh. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 6 · Rollback plan | ||||
|  | ||||
| 1. Disable the guard by setting `AOC_GUARD_ENABLED=false` on Concelier/Excititor, then roll out the change. | ||||
| 2. Remove validators with the migration script (`--remove`). | ||||
| 3. Pause verification jobs to prevent noise. | ||||
| 4. Investigate and remediate upstream issues before re-enabling guards. | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 7 · References | ||||
|  | ||||
| - [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md) | ||||
| - [Authority scopes & tenancy](../security/authority-scopes.md) | ||||
| - [Observability guide](../observability/observability.md) | ||||
| - [CLI AOC commands](../modules/cli/guides/cli-reference.md) | ||||
| - [Concelier architecture](../modules/concelier/architecture.md) | ||||
| - [Excititor architecture](../modules/excititor/architecture.md) | ||||
|  | ||||
| --- | ||||
|  | ||||
| ## 8 · Compliance checklist | ||||
|  | ||||
| - [ ] Validators documented and scripts referenced for online/offline deployments. | ||||
| - [ ] Environment variables cover guard enablement, metrics, and tenant header. | ||||
| - [ ] Read-only verifier user installation steps included. | ||||
| - [ ] Offline kit instructions align with validator/verification workflow. | ||||
| - [ ] Rollback procedure captured. | ||||
| - [ ] Cross-links to AOC docs, Authority scopes, and observability guides present. | ||||
| - [ ] DevOps Guild sign-off tracked (owner: @devops-guild, due 2025-10-29). | ||||
|  | ||||
| --- | ||||
|  | ||||
| *Last updated: 2025-10-26 (Sprint 19).*  | ||||
|   | ||||