Align AOC tasks for Excititor and Concelier
@@ -1,228 +1,228 @@

# Deploying the StellaOps Console

> **Audience:** Deployment Guild, Console Guild, operators rolling out the web console.
> **Scope:** Helm and Docker Compose deployment steps, ingress/TLS configuration, required environment variables, health checks, offline/air-gap operation, and compliance checklist (Sprint 23).

The StellaOps Console ships as part of the `stellaops` stack Helm chart and Compose bundles maintained under `deploy/`. This guide describes the supported deployment paths, the configuration surface, and operational checks needed to run the console in connected or air-gapped environments.

---

## 1. Prerequisites

- A Kubernetes cluster (v1.28+) with an ingress controller (NGINX, Traefik, or equivalent) and Cert-Manager for automated TLS, or a Docker host for Compose deployments.
- Container registry access to `registry.stella-ops.org` (or a mirrored registry) for all images listed in `deploy/releases/*.yaml`.
- An Authority service configured with the console client (`aud=ui`, scopes `ui.read`, `ui.admin`).
- A DNS entry pointing to the console hostname (for example, `console.acme.internal`).
- The Cosign public key for verifying the release manifest signature (`deploy/releases/manifest.json.sig`).
- Optional: the Offline Kit bundle for air-gapped sites (`stella-ops-offline-kit-<ver>.tar.gz`).

---

## 2. Helm deployment (recommended)

### 2.1 Install chart repository

```bash
helm repo add stellaops https://downloads.stella-ops.org/helm
helm repo update stellaops
```

If operating offline, copy the chart archive from the Offline Kit (`deploy/helm/stellaops-<ver>.tgz`) and run:

```bash
helm install stellaops ./stellaops-<ver>.tgz --namespace stellaops --create-namespace
```

### 2.2 Base installation

```bash
helm install stellaops stellaops/stellaops \
  --namespace stellaops \
  --create-namespace \
  --values deploy/helm/stellaops/values-prod.yaml
```

The chart deploys Authority, Console web/API gateway, Scanner API, Scheduler, and supporting services. The console frontend pod is labelled `app=stellaops-web-ui`.

### 2.3 Helm values highlights

Key sections in `deploy/helm/stellaops/values-prod.yaml`:

| Path | Description |
|------|-------------|
| `console.ingress.host` | Hostname served by the console (`console.example.com`). |
| `console.ingress.tls.secretName` | Kubernetes secret containing TLS certificate (generated by Cert-Manager or uploaded manually). |
| `console.config.apiGateway.baseUrl` | Internal base URL the UI uses to reach the gateway (defaults to `https://stellaops-web`). |
| `console.env.AUTHORITY_ISSUER` | Authority issuer URL (for example, `https://authority.example.com`). |
| `console.env.AUTHORITY_CLIENT_ID` | Authority client ID for the console UI. |
| `console.env.AUTHORITY_SCOPES` | Space-separated scopes required by UI (`ui.read ui.admin`). |
| `console.resources` | CPU/memory requests and limits (default 250m CPU / 512Mi memory). |
| `console.podAnnotations` | Optional annotations for service mesh or monitoring. |

Use `values-stage.yaml`, `values-dev.yaml`, or `values-airgap.yaml` as templates for other environments.

### 2.4 TLS and ingress

Example ingress override:

```yaml
console:
  ingress:
    enabled: true
    className: nginx
    host: console.acme.internal
    tls:
      enabled: true
      secretName: console-tls
```

Generate certificates using Cert-Manager or provide an existing secret. For air-gapped deployments, pre-create the secret with the mirrored CA chain.
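
For the pre-created secret path, the step can be sketched with `kubectl create secret tls`. This is an illustration under assumptions: the function name is hypothetical, the certificate and key file paths are placeholders you supply, and the secret name and namespace are taken from the examples in this guide.

```bash
#!/usr/bin/env bash
# Sketch: pre-create the TLS secret referenced by console.ingress.tls.secretName.
# $1 = PEM certificate (leaf plus CA chain), $2 = matching private key.
# Both paths are placeholders, not files shipped with the chart.
create_console_tls_secret() {
  kubectl create secret tls console-tls \
    --namespace stellaops \
    --cert="$1" \
    --key="$2"
}
```

A call such as `create_console_tls_secret chain.pem key.pem` would then be run once before `helm install`, so the ingress finds the secret on first reconcile.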

### 2.5 Health checks

Console pods expose:

| Path | Purpose | Notes |
|------|---------|-------|
| `/health/live` | Liveness probe | Confirms the process is responsive. |
| `/health/ready` | Readiness probe | Verifies configuration bootstrap and Authority reachability. |
| `/metrics` | Prometheus metrics | Enabled when `console.metrics.enabled=true`. |

The Helm chart sets default probes (`initialDelaySeconds: 10`, `periodSeconds: 15`). Adjust them via `console.livenessProbe` and `console.readinessProbe`.

---

## 3. Docker Compose deployment

The Compose bundle is located in `deploy/compose/docker-compose.console.yaml`. Quick start:

```bash
cd deploy/compose
docker compose -f docker-compose.console.yaml --env-file console.env up -d
```

`console.env` should define:

```
CONSOLE_PUBLIC_BASE_URL=https://console.acme.internal
AUTHORITY_ISSUER=https://authority.acme.internal
AUTHORITY_CLIENT_ID=console-ui
AUTHORITY_CLIENT_SECRET=<if using confidential client>
AUTHORITY_SCOPES=ui.read ui.admin
CONSOLE_GATEWAY_BASE_URL=https://api.acme.internal
```

The Compose bundle includes Traefik as a reverse proxy with TLS termination. Update `traefik/dynamic/console.yml` for custom certificates or additional middlewares (CSP headers, rate limits).

---

## 4. Environment variables

| Variable | Description | Default |
|----------|-------------|---------|
| `CONSOLE_PUBLIC_BASE_URL` | External URL used for redirects, deep links, and telemetry. | None (required). |
| `CONSOLE_GATEWAY_BASE_URL` | URL of the web gateway that proxies API calls (`/console/*`). | Chart service name. |
| `AUTHORITY_ISSUER` | Authority issuer (`https://authority.example.com`). | None (required). |
| `AUTHORITY_CLIENT_ID` | OIDC client configured in Authority. | None (required). |
| `AUTHORITY_SCOPES` | Space-separated scopes assigned to the console client. | `ui.read ui.admin`. |
| `AUTHORITY_DPOP_ENABLED` | Enables DPoP challenge/response (recommended true). | `true`. |
| `CONSOLE_FEATURE_FLAGS` | Comma-separated feature flags (`runs`, `downloads.offline`, etc.). | `runs,downloads,policies`. |
| `CONSOLE_LOG_LEVEL` | Minimum log level (`Information`, `Debug`, etc.). | `Information`. |
| `CONSOLE_METRICS_ENABLED` | Expose `/metrics` endpoint. | `true`. |
| `CONSOLE_SENTRY_DSN` | Optional error reporting DSN. | Blank. |

When running behind additional proxies, set `ASPNETCORE_FORWARDEDHEADERS_ENABLED=true` to honour `X-Forwarded-*` headers.
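
As a pre-flight step before starting the stack, a small check can catch an incomplete env file. The sketch below is an illustration rather than part of the bundle: the function name `check_console_env` is hypothetical, and it tests only the three variables the table marks as required. Called as `check_console_env console.env`, it returns non-zero and names each missing variable on stderr.

```bash
#!/usr/bin/env bash
# Sketch: fail fast when a required console variable is missing or empty.
check_console_env() {
  local file="$1" missing=0 name
  for name in CONSOLE_PUBLIC_BASE_URL AUTHORITY_ISSUER AUTHORITY_CLIENT_ID; do
    # A variable counts as set when a line "NAME=<non-empty value>" exists.
    if ! grep -Eq "^${name}=.+" "$file"; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```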

---

## 5. Security headers and CSP

The console serves a strict Content Security Policy (CSP) by default:

```
default-src 'self';
connect-src 'self' https://*.stella-ops.local;
script-src 'self';
style-src 'self' 'unsafe-inline';
img-src 'self' data:;
font-src 'self';
frame-ancestors 'none';
```

Adjust via `console.config.cspOverrides` if additional domains are required. For integrations embedding the console, update OIDC redirect URIs and Authority scopes accordingly.
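
To confirm that the policy actually reaches clients after ingress or proxy rewrites, the response headers can be inspected. A minimal sketch, assuming HSTS has been enabled as this section recommends and that headers are piped in, for example from `curl -sI https://console.example.com/`. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: read HTTP response headers on stdin and confirm the security headers.
check_security_headers() {
  local headers csp
  # Normalise CRLF line endings and header-name case before matching.
  headers="$(tr -d '\r' | tr '[:upper:]' '[:lower:]')"
  csp="$(printf '%s\n' "$headers" | grep '^content-security-policy:')" || {
    echo "no Content-Security-Policy header" >&2
    return 1
  }
  case "$csp" in
    *"frame-ancestors 'none'"*) ;;
    *) echo "CSP does not set frame-ancestors 'none'" >&2; return 1 ;;
  esac
  printf '%s\n' "$headers" | grep -q '^strict-transport-security:' || {
    echo "no Strict-Transport-Security header" >&2
    return 1
  }
}
```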

TLS recommendations:

- Use TLS 1.2+ with a modern cipher suite policy.
- Enable HSTS (`Strict-Transport-Security: max-age=31536000; includeSubDomains`).
- Provide custom trust bundles via `console.config.trustBundleSecret` when using private CAs.

---

## 6. Logging and metrics

- Structured logs are emitted to stdout with correlation IDs. Configure log shipping via Fluent Bit or similar.
- Metrics are available at `/metrics` in Prometheus format. Key metrics include `ui_request_duration_seconds`, `ui_tenant_switch_total`, and `ui_download_manifest_refresh_seconds`.
- Enable the OpenTelemetry exporter by setting `OTEL_EXPORTER_OTLP_ENDPOINT` and associated headers in environment variables.
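
A quick way to confirm the key metrics are being exported is to scan a scrape for their names. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it expects Prometheus exposition text on stdin, for example from `curl -fsS http://localhost:8080/metrics` run inside the pod (port 8080 is taken from the health-check commands in this guide; that `/metrics` shares it is an assumption). It prints each key metric that is absent.

```bash
#!/usr/bin/env bash
# Sketch: report which of the documented key metrics are missing from a scrape.
missing_console_metrics() {
  local body name
  body="$(cat)"
  for name in ui_request_duration_seconds ui_tenant_switch_total \
              ui_download_manifest_refresh_seconds; do
    # Match the name at line start, followed by a suffix (_bucket, _sum, ...),
    # a label set, a space, or end of line.
    printf '%s\n' "$body" | grep -Eq "^${name}([_{ ]|\$)" || echo "$name"
  done
}
```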

---

## 7. Offline and air-gap deployment

- Mirror container images using the Downloads workspace or Offline Kit manifest. Example:

  ```bash
  oras copy registry.stella-ops.org/stellaops/web-ui@sha256:<digest> \
    registry.airgap.local/stellaops/web-ui:2025.10.0
  ```

- Import Offline Kit using `stella ouk import` before starting the console so manifest parity checks succeed.
- Use `values-airgap.yaml` to disable external telemetry endpoints and configure internal certificate chains.
- Run `helm upgrade --install` using the mirrored chart (`stellaops-<ver>.tgz`) and set `console.offlineMode=true` to surface offline banners.
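
The last two bullets can be combined into a single command. The sketch below assembles it and, by default, only prints it for review; the function name and the `APPLY` switch are hypothetical conveniences, while the release name, namespace, values file, and `console.offlineMode` flag come from this guide. Run `APPLY=1 airgap_upgrade ./stellaops-<ver>.tgz` to execute it.

```bash
#!/usr/bin/env bash
# Sketch: build the air-gapped upgrade command from the mirrored chart archive.
airgap_upgrade() {
  local chart="$1"   # path to stellaops-<ver>.tgz copied from the Offline Kit
  local cmd=(helm upgrade --install stellaops "$chart"
             --namespace stellaops --create-namespace
             --values deploy/helm/stellaops/values-airgap.yaml
             --set console.offlineMode=true)
  if [ "${APPLY:-0}" = "1" ]; then
    "${cmd[@]}"                  # run the upgrade
  else
    printf '%s\n' "${cmd[*]}"    # dry run: print the command only
  fi
}
```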

---

## 8. Health checks and remediation

| Check | Command | Expected result |
|-------|---------|-----------------|
| Pod status | `kubectl get pods -n stellaops` | `Running` state with restarts = 0. |
| Liveness | `kubectl exec -n stellaops deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/live` | Returns `{"status":"Healthy"}`. |
| Readiness | `kubectl exec -n stellaops deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/ready` | Returns `{"status":"Ready"}`. |
| Gateway reachability | `curl -I https://console.example.com/api/console/status` | `200 OK` with CSP headers. |
| Static assets | `curl -I https://console.example.com/static/assets/app.js` | `200 OK` with long cache headers. |

Troubleshooting steps:

- **Authority unreachable:** readiness fails with `AUTHORITY_UNREACHABLE`. Check DNS, trust bundles, and Authority service health.
- **Manifest mismatch:** console logs `DOWNLOAD_MANIFEST_SIGNATURE_INVALID`. Verify the cosign key and re-sync the manifest.
- **Ingress 404:** ensure the ingress controller routes the host to the `stellaops-web-ui` service; check the TLS secret name.
- **SSE blocked:** confirm the proxy allows HTTP/1.1 and disables buffering on `/console/runs/*`.
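
Because readiness depends on configuration bootstrap and Authority reachability, a single probe immediately after a rollout may fail even on a healthy deployment. A small retry wrapper, sketched below with a hypothetical function name, avoids misreading that as an outage. For example, `retry 10 6 kubectl exec -n stellaops deploy/stellaops-web-ui -- curl -fsS http://localhost:8080/health/ready` would allow roughly a minute (the attempt count and delay are arbitrary choices).

```bash
#!/usr/bin/env bash
# Sketch: run a command up to <attempts> times, sleeping <delay> seconds between tries.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
  done
  return 1
}
```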

---

## 9. References

- `deploy/helm/stellaops/values-*.yaml` - environment-specific overrides.
- `deploy/compose/docker-compose.console.yaml` - Compose bundle.
- `/docs/ui/downloads.md` - manifest and offline bundle guidance.
- `/docs/security/console-security.md` - CSP and Authority scopes.
- `/docs/24_OFFLINE_KIT.md` - Offline kit packaging and verification.
- `/docs/modules/devops/runbooks/deployment-runbook.md` (pending) - wider platform deployment steps.

---

## 10. Compliance checklist

- [ ] Helm and Compose instructions verified against `deploy/` assets.
- [ ] Ingress/TLS guidance aligns with Security Guild recommendations.
- [ ] Environment variables documented with defaults and required values.
- [ ] Health/liveness/readiness endpoints tested and listed.
- [ ] Offline workflow (mirrors, manifest parity) captured.
- [ ] Logging surface and key metrics documented.
- [ ] CSP and security header defaults stated alongside override guidance.
- [ ] Troubleshooting section linked to relevant runbooks.

---

*Last updated: 2025-10-27 (Sprint 23).*
@@ -1,160 +1,160 @@
# Container Deployment Guide — AOC Update

> **Audience:** DevOps Guild, platform operators deploying StellaOps services.
> **Scope:** Deployment configuration changes required by the Aggregation-Only Contract (AOC), including schema validators, guard environment flags, and verifier identities.

This guide supplements existing deployment manuals with AOC-specific configuration. It assumes familiarity with the base Compose/Helm manifests described in `ops/deployment/` and `docs/modules/devops/architecture.md`.

---

## 1 · Schema validator enablement

### 1.1 MongoDB validators

- Apply JSON schema validators to `advisory_raw` and `vex_raw` collections before enabling AOC guards.
- Before enabling validators or the idempotency index, run the duplicate audit helper to confirm no conflicting raw advisories remain:
  ```bash
  mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'
  ```
  Resolve any reported rows prior to rollout.
- Use the migration script provided in `ops/devops/scripts/apply-aoc-validators.js`:

```bash
kubectl exec -n concelier deploy/concelier-mongo -- \
  mongo concelier ops/devops/scripts/apply-aoc-validators.js

kubectl exec -n excititor deploy/excititor-mongo -- \
  mongo excititor ops/devops/scripts/apply-aoc-validators.js
```

- Validators enforce required fields (`tenant`, `source`, `upstream`, `linkset`) and reject forbidden keys at DB level.
- Rollback plan: validators are applied with `validationLevel: moderate`; remove them via the same script with `--remove` if required.
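
To make the validator concept concrete, the sketch below writes a small mongo shell script that attaches a `$jsonSchema` validator with `collMod`. It is a hypothetical, trimmed-down illustration covering only the required fields; the actual rules, including forbidden-key rejection, live in `ops/devops/scripts/apply-aoc-validators.js`, whose contents are not reproduced here. The output path and variable name are assumptions.

```bash
#!/usr/bin/env bash
# Sketch only: writes an illustrative validator script; it does not contact MongoDB.
sketch="${TMPDIR:-/tmp}/aoc-validator-sketch.js"
cat > "$sketch" <<'EOF'
// Hypothetical, simplified validator: required fields only.
db.runCommand({
  collMod: "advisory_raw",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["tenant", "source", "upstream", "linkset"]
    }
  },
  // moderate: inserts and updates to already-valid documents are validated;
  // updates to existing documents that are already invalid are not.
  validationLevel: "moderate"
});
EOF
```

If you want to experiment, run the generated file against a scratch database rather than production, for example `mongo scratch "$sketch"` (the database name `scratch` is a placeholder).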

### 1.2 Migration order

1. Deploy validators in maintenance window.
2. Roll out Concelier/Excititor images with guard middleware enabled (`AOC_GUARD_ENABLED=true`).
3. Run smoke tests (`stella sources ingest --dry-run` fixtures) before resuming production ingestion.

### 1.3 Supersedes backfill verification

1. **Duplicate audit:** Confirm `mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'` reports no conflicts before restarting Concelier with the new migrations.
2. **Post-migration check:** After the service restarts, validate that `db.advisory` is a view pointing to `advisory_backup_20251028`:
   ```bash
   mongo concelier --quiet --eval 'db.getCollectionInfos({ name: "advisory" })[0]'
   ```
   The `type` should be `"view"` and `options.viewOn` should equal `"advisory_backup_20251028"`.
3. **Supersedes chain spot-check:** Inspect a sample set to ensure deterministic chaining:
   ```bash
   mongo concelier --quiet --eval '
     db.advisory_raw.aggregate([
       { $match: { "upstream.upstream_id": { $exists: true } } },
       { $sort: { "tenant": 1, "source.vendor": 1, "upstream.upstream_id": 1, "upstream.retrieved_at": 1 } },
       { $limit: 5 },
       { $project: { _id: 1, supersedes: 1 } }
     ]).forEach(printjson)'
   ```
   Each revision should reference the previous `_id` (or `null` for the first revision). Record findings in the change ticket before proceeding to production.
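
The chaining rule in step 3 can also be checked mechanically. The sketch below assumes the revisions of a single `(tenant, source.vendor, upstream.upstream_id)` group have been exported as one `<_id> <supersedes>` pair per line, oldest first, with the literal `null` for the first revision. That export format and the function name are assumptions for illustration, not output of the query above.

```bash
#!/usr/bin/env bash
# Sketch: verify that each revision supersedes the one directly before it.
# Input on stdin: "<_id> <supersedes>" per line, oldest revision first.
check_supersedes_chain() {
  awk '
    NR == 1 && $2 != "null" { printf "line 1: expected null, got %s\n", $2; bad = 1 }
    NR > 1  && $2 != prev   { printf "line %d: expected %s, got %s\n", NR, prev, $2; bad = 1 }
    { prev = $1 }            # remember this _id for the next line
    END { exit bad + 0 }     # non-zero exit when any link is broken
  '
}
```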

---

## 2 · Container environment flags

Add the following environment variables to Concelier/Excititor deployments:

| Variable | Default | Description |
|----------|---------|-------------|
| `AOC_GUARD_ENABLED` | `true` | Enables `AOCWriteGuard` interception. Set `false` only for controlled rollback. |
| `AOC_ALLOW_SUPERSEDES_RETROFIT` | `false` | Allows temporary supersedes backfill during migration. Remove after cutover. |
| `AOC_METRICS_ENABLED` | `true` | Emits `ingestion_write_total`, `aoc_violation_total`, etc. |
| `AOC_TENANT_HEADER` | `X-Stella-Tenant` | Header name expected from Gateway. |
| `AOC_VERIFIER_USER` | `stella-aoc-verify` | Read-only service user used by UI/CLI verification. |

Compose snippet:

```yaml
environment:
  - AOC_GUARD_ENABLED=true
  - AOC_ALLOW_SUPERSEDES_RETROFIT=false
  - AOC_METRICS_ENABLED=true
  - AOC_TENANT_HEADER=X-Stella-Tenant
  - AOC_VERIFIER_USER=stella-aoc-verify
```

Ensure `AOC_VERIFIER_USER` exists in Authority with `aoc:verify` scope and no write permissions.

---

## 3 · Verifier identity

- Create a dedicated client (`stella-aoc-verify`) via Authority bootstrap:

```yaml
clients:
  - clientId: stella-aoc-verify
    grantTypes: [client_credentials]
    scopes: [aoc:verify, advisory:read, vex:read]
    tenants: [default]
```

- Store credentials in a secret store (Kubernetes Secret, Docker Swarm secret).
- Bind credentials to `stella aoc verify` CI jobs and the Console verification service.
- Rotate quarterly; document in `ops/authority-key-rotation.md`.

---

## 4 · Deployment steps

1. **Pre-checks:** Confirm that database backups exist, alerting is in maintenance mode, and the staging environment has been validated.
2. **Apply validators:** Run scripts per § 1.1.
3. **Update manifests:** Inject environment variables (§ 2) and mount guard configuration configmaps.
4. **Redeploy services:** Perform a rolling restart of the Concelier/Excititor pods. Monitor `ingestion_write_total` for steady throughput.
5. **Seed verifier:** Deploy the read-only verifier user and store its credentials.
6. **Run verification:** Execute `stella aoc verify --since 24h` and ensure exit code `0`.
7. **Update dashboards:** Point Grafana panels to the new metrics (`aoc_violation_total`).
8. **Record handoff:** Capture console screenshots and verification logs for release notes.

---

## 5 · Offline Kit updates

- Ship validator scripts with Offline Kit (`offline-kit/scripts/apply-aoc-validators.js`).
- Include pre-generated verification reports for air-gapped deployments.
- Document offline CLI workflow in bundle README referencing `docs/modules/cli/guides/cli-reference.md`.
- Ensure `stella-aoc-verify` credentials are scoped to offline tenant and rotated during bundle refresh.

---

## 6 · Rollback plan

1. Disable the guard by setting `AOC_GUARD_ENABLED=false` on Concelier/Excititor and roll out the change.
2. Remove validators with the migration script (`--remove`).
3. Pause verification jobs to prevent noise.
4. Investigate and remediate upstream issues before re-enabling guards.
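
Step 1 can be sketched with `kubectl set env`, which updates the variable and triggers a rollout. The deployment names `concelier` and `excititor` are assumptions (only the namespaces appear in this guide), as is the function name; adjust both to match your manifests.

```bash
#!/usr/bin/env bash
# Sketch: switch the AOC guard off on both services and wait for each rollout.
# Deployment names are assumed to match the namespace names.
disable_aoc_guard() {
  local svc
  for svc in concelier excititor; do
    kubectl set env "deploy/${svc}" --namespace "$svc" AOC_GUARD_ENABLED=false &&
      kubectl rollout status "deploy/${svc}" --namespace "$svc" || return 1
  done
}
```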

---

## 7 · References

- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
- [Authority scopes & tenancy](../security/authority-scopes.md)
- [Observability guide](../observability/observability.md)
- [CLI AOC commands](../modules/cli/guides/cli-reference.md)
- [Concelier architecture](../modules/concelier/architecture.md)
- [Excititor architecture](../modules/excititor/architecture.md)

---

## 8 · Compliance checklist

- [ ] Validators documented and scripts referenced for online/offline deployments.
- [ ] Environment variables cover guard enablement, metrics, and tenant header.
- [ ] Read-only verifier user installation steps included.
- [ ] Offline kit instructions align with validator/verification workflow.
- [ ] Rollback procedure captured.
- [ ] Cross-links to AOC docs, Authority scopes, and observability guides present.
- [ ] DevOps Guild sign-off tracked (owner: @devops-guild, due 2025-10-29).

---

*Last updated: 2025-10-26 (Sprint 19).*
|
||||
# Container Deployment Guide — AOC Update
|
||||
|
||||
> **Audience:** DevOps Guild, platform operators deploying StellaOps services.
|
||||
> **Scope:** Deployment configuration changes required by the Aggregation-Only Contract (AOC), including schema validators, guard environment flags, and verifier identities.
|
||||
|
||||
This guide supplements existing deployment manuals with AOC-specific configuration. It assumes familiarity with the base Compose/Helm manifests described in `ops/deployment/` and `docs/modules/devops/architecture.md`.
|
||||
|
||||
---
|
||||
|
||||
## 1 · Schema validator enablement

### 1.1 MongoDB validators

- Apply JSON schema validators to the `advisory_raw` and `vex_raw` collections before enabling AOC guards.
- Before enabling validators or the idempotency index, run the duplicate audit helper to confirm no conflicting raw advisories remain:

  ```bash
  mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'
  ```

  Resolve any reported rows prior to rollout.
- Use the migration script provided in `ops/devops/scripts/apply-aoc-validators.js`:

  ```bash
  kubectl exec -n concelier deploy/concelier-mongo -- \
    mongo concelier ops/devops/scripts/apply-aoc-validators.js

  kubectl exec -n excititor deploy/excititor-mongo -- \
    mongo excititor ops/devops/scripts/apply-aoc-validators.js
  ```

- Validators enforce required fields (`tenant`, `source`, `upstream`, `linkset`) and reject forbidden keys at the database level.
- Rollback plan: validators are applied with `validationLevel: moderate`. If a rollback is required, remove them by running the same script with `--remove`.

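
The migration script itself is not reproduced in this guide. Purely as an illustration of what such a validator might look like, the sketch below renders a `collMod` command that requires the four fields named above and uses `validationLevel: "moderate"`. The helper name `render_validator_js` is hypothetical, and the forbidden-key rules are left out because the guide does not list them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a mongo shell snippet that attaches a JSON schema
# validator to one collection. This is NOT the content of apply-aoc-validators.js.
render_validator_js() {
  local collection="$1"
  cat <<EOF
db.runCommand({
  collMod: "${collection}",
  validator: {
    \$jsonSchema: {
      bsonType: "object",
      required: ["tenant", "source", "upstream", "linkset"]
    }
  },
  validationLevel: "moderate"
});
EOF
}

# Only prints the snippet for review; nothing is applied to a database here.
render_validator_js advisory_raw
```

Rendering the command first makes it easy to review in a change ticket before anything touches the database. In a real rollout the provided script remains the source of truth.
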

### 1.2 Migration order

1. Deploy validators in a maintenance window.
2. Roll out Concelier/Excititor images with guard middleware enabled (`AOC_GUARD_ENABLED=true`).
3. Run smoke tests (`stella sources ingest --dry-run` with fixtures) before resuming production ingestion.

### 1.3 Supersedes backfill verification

1. **Duplicate audit:** Confirm `mongo concelier ops/devops/scripts/check-advisory-raw-duplicates.js --eval 'var LIMIT=200;'` reports no conflicts before restarting Concelier with the new migrations.
2. **Post-migration check:** After the service restarts, validate that `db.advisory` is a view pointing to `advisory_backup_20251028`:

   ```bash
   mongo concelier --quiet --eval 'db.getCollectionInfos({ name: "advisory" })[0]'
   ```

   The `type` should be `"view"` and `options.viewOn` should equal `"advisory_backup_20251028"`.
3. **Supersedes chain spot-check:** Inspect a sample set to ensure deterministic chaining:

   ```bash
   mongo concelier --quiet --eval '
     db.advisory_raw.aggregate([
       { $match: { "upstream.upstream_id": { $exists: true } } },
       { $sort: { "tenant": 1, "source.vendor": 1, "upstream.upstream_id": 1, "upstream.retrieved_at": 1 } },
       { $limit: 5 },
       { $project: { _id: 1, supersedes: 1 } }
     ]).forEach(printjson)'
   ```

   Each revision should reference the previous `_id` (or `null` for the first revision). Record findings in the change ticket before proceeding to production.

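
The chaining rule in step 3 can be checked mechanically once the `_id`/`supersedes` pairs for a single advisory have been exported in retrieval order. The sketch below assumes a hypothetical input format of one revision per line, `<_id> <supersedes>`, with the literal word `null` for the first revision. The function name `check_supersedes_chain` is illustrative and not part of the StellaOps tooling.

```bash
#!/usr/bin/env bash
# Reads "<_id> <supersedes>" pairs from stdin (oldest revision first) and
# verifies that every revision points at the revision directly before it.
check_supersedes_chain() {
  local prev="null" id supersedes line_no=0
  while read -r id supersedes; do
    line_no=$((line_no + 1))
    if [ "$supersedes" != "$prev" ]; then
      echo "revision ${line_no} (${id}): expected supersedes=${prev}, got ${supersedes}" >&2
      return 1
    fi
    prev="$id"   # the next revision must reference this one
  done
  return 0
}

# Example with made-up identifiers forming a valid three-revision chain.
printf '%s\n' 'rev-1 null' 'rev-2 rev-1' 'rev-3 rev-2' | check_supersedes_chain && echo "chain OK"
```

Note that the aggregation above samples across advisories, so rows would need to be grouped per `tenant`, `source.vendor`, and `upstream.upstream_id` before being fed to a check like this one.
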

---

## 2 · Container environment flags

Add the following environment variables to Concelier/Excititor deployments:

| Variable | Default | Description |
|----------|---------|-------------|
| `AOC_GUARD_ENABLED` | `true` | Enables `AOCWriteGuard` interception. Set `false` only for controlled rollback. |
| `AOC_ALLOW_SUPERSEDES_RETROFIT` | `false` | Allows temporary supersedes backfill during migration. Remove after cutover. |
| `AOC_METRICS_ENABLED` | `true` | Emits `ingestion_write_total`, `aoc_violation_total`, etc. |
| `AOC_TENANT_HEADER` | `X-Stella-Tenant` | Header name expected from Gateway. |
| `AOC_VERIFIER_USER` | `stella-aoc-verify` | Read-only service user used by UI/CLI verification. |

Compose snippet:

```yaml
environment:
  - AOC_GUARD_ENABLED=true
  - AOC_ALLOW_SUPERSEDES_RETROFIT=false
  - AOC_METRICS_ENABLED=true
  - AOC_TENANT_HEADER=X-Stella-Tenant
  - AOC_VERIFIER_USER=stella-aoc-verify
```

Ensure the account named by `AOC_VERIFIER_USER` exists in Authority with the `aoc:verify` scope and no write permissions.

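
A small pre-flight check can catch a flag left in its migration or rollback state. The sketch below is one possible approach, not a shipped tool: the function name `check_aoc_env` is hypothetical, the defaults are copied from the table above, and it assumes the services read the boolean flags as the literal strings `true` and `false`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for the AOC flags in the current environment.
# Returns non-zero if a boolean flag is malformed or not in its steady-state value.
check_aoc_env() {
  local status=0 value
  # Fall back to the documented defaults when a variable is unset.
  local guard="${AOC_GUARD_ENABLED:-true}"
  local retrofit="${AOC_ALLOW_SUPERSEDES_RETROFIT:-false}"
  local metrics="${AOC_METRICS_ENABLED:-true}"

  for value in "$guard" "$retrofit" "$metrics"; do
    case "$value" in
      true|false) ;;
      *) echo "invalid boolean flag value: '${value}'" >&2; status=1 ;;
    esac
  done

  if [ "$guard" != "true" ]; then
    echo "AOC_GUARD_ENABLED is not true (expected only during controlled rollback)" >&2
    status=1
  fi
  if [ "$retrofit" != "false" ]; then
    echo "AOC_ALLOW_SUPERSEDES_RETROFIT is still enabled (remove after cutover)" >&2
    status=1
  fi
  return "$status"
}

# Usage inside a container entrypoint or CI step:
#   check_aoc_env || exit 1
```

Treating a disabled guard as a failure is a deliberate choice for steady-state environments. During a controlled rollback the check would need to be skipped or relaxed.
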

---

## 3 · Verifier identity

- Create a dedicated client (`stella-aoc-verify`) via Authority bootstrap:

  ```yaml
  clients:
    - clientId: stella-aoc-verify
      grantTypes: [client_credentials]
      scopes: [aoc:verify, advisory:read, vex:read]
      tenants: [default]
  ```

- Store credentials in a secret store (Kubernetes Secret, Docker Swarm secret).
- Bind credentials to `stella aoc verify` CI jobs and the Console verification service.
- Rotate quarterly; document in `ops/authority-key-rotation.md`.

---

## 4 · Deployment steps

1. **Pre-checks:** Confirm that database backups exist, alerting is in maintenance mode, and the staging environment has been validated.
2. **Apply validators:** Run scripts per § 1.1.
3. **Update manifests:** Inject environment variables (§ 2) and mount the guard configuration ConfigMaps.
4. **Redeploy services:** Perform a rolling restart of the Concelier/Excititor pods. Monitor `ingestion_write_total` for steady throughput.
5. **Seed verifier:** Deploy the read-only verifier user and store its credentials (§ 3).
6. **Run verification:** Execute `stella aoc verify --since 24h` and ensure exit code `0`.
7. **Update dashboards:** Point Grafana panels to new metrics (`aoc_violation_total`).
8. **Record handoff:** Capture console screenshots and verification logs for release notes.

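
Step 6 lends itself to a small wrapper that fails a pipeline when verification does not exit cleanly. The sketch below uses only the `stella aoc verify --since` invocation shown above; the wrapper name `run_aoc_verify` and its messages are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the verification command from step 6.
# Propagates the CLI's exit code so CI jobs can gate on it.
run_aoc_verify() {
  local since="${1:-24h}" rc=0
  stella aoc verify --since "$since" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "AOC verification passed (window: ${since})"
  else
    echo "AOC verification failed with exit code ${rc}; investigate before proceeding" >&2
  fi
  return "$rc"
}

# Usage in a deployment pipeline:
#   run_aoc_verify 24h || exit 1
```

Capturing the output of this step also gives the verification log that step 8 asks for.
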

---

## 5 · Offline Kit updates

- Ship validator scripts with the Offline Kit (`offline-kit/scripts/apply-aoc-validators.js`).
- Include pre-generated verification reports for air-gapped deployments.
- Document the offline CLI workflow in the bundle README, referencing `docs/modules/cli/guides/cli-reference.md`.
- Ensure `stella-aoc-verify` credentials are scoped to the offline tenant and rotated during bundle refresh.

---

## 6 · Rollback plan

1. Disable the guard by setting `AOC_GUARD_ENABLED=false` on Concelier/Excititor and roll out the change.
2. Remove validators with the migration script (`--remove`).
3. Pause verification jobs to prevent noise.
4. Investigate and remediate upstream issues before re-enabling guards.

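
On Kubernetes, step 1 might be scripted roughly as follows. This is a sketch under assumptions: it presumes each service runs as a Deployment named `concelier` or `excititor` in a namespace of the same name, which this guide does not confirm, and the function name `rollback_aoc_guard` is hypothetical. It defaults to a dry run that only prints the commands.

```bash
#!/usr/bin/env bash
# Hypothetical rollback helper for step 1. Set DRY_RUN=0 to execute for real.
rollback_aoc_guard() {
  local runner="echo" svc          # dry run by default: print instead of execute
  if [ "${DRY_RUN:-1}" = "0" ]; then
    runner=""
  fi

  for svc in concelier excititor; do
    # Assumed names: Deployment "<svc>" in namespace "<svc>".
    # Changing the pod template env triggers a rolling update.
    $runner kubectl set env "deployment/${svc}" -n "$svc" AOC_GUARD_ENABLED=false
    $runner kubectl rollout status "deployment/${svc}" -n "$svc"
  done
}

# Usage:
#   rollback_aoc_guard              # prints the four kubectl commands
#   DRY_RUN=0 rollback_aoc_guard    # applies them
```

Validator removal (step 2) is intentionally not scripted here, because the exact invocation of the migration script with `--remove` is not shown in this guide.
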

---

## 7 · References

- [Aggregation-Only Contract reference](../ingestion/aggregation-only-contract.md)
- [Authority scopes & tenancy](../security/authority-scopes.md)
- [Observability guide](../observability/observability.md)
- [CLI AOC commands](../modules/cli/guides/cli-reference.md)
- [Concelier architecture](../modules/concelier/architecture.md)
- [Excititor architecture](../modules/excititor/architecture.md)

---

## 8 · Compliance checklist

- [ ] Validators documented and scripts referenced for online/offline deployments.
- [ ] Environment variables cover guard enablement, metrics, and tenant header.
- [ ] Read-only verifier user installation steps included.
- [ ] Offline kit instructions align with validator/verification workflow.
- [ ] Rollback procedure captured.
- [ ] Cross-links to AOC docs, Authority scopes, and observability guides present.
- [ ] DevOps Guild sign-off tracked (owner: @devops-guild, due 2025-10-29).

---

*Last updated: 2025-10-26 (Sprint 19).*