Container Deployment Guide — AOC Update

Audience: DevOps Guild, platform operators deploying StellaOps services.
Scope: Deployment configuration changes required by the Aggregation-Only Contract (AOC), including schema validators, guard environment flags, and verifier identities.

This guide supplements existing deployment manuals with AOC-specific configuration. It assumes familiarity with the base Compose/Helm manifests described in ops/deployment/ and docs/modules/devops/architecture.md.


1 · Schema constraint enablement

1.1 PostgreSQL constraints

  • Apply CHECK constraints and NOT NULL rules to advisory_raw and vex_raw tables before enabling AOC guards.
  • Before enabling constraints or the idempotency index, run the duplicate audit helper to confirm no conflicting raw advisories remain:
    psql -d concelier -f ops/devops/scripts/check-advisory-raw-duplicates.sql -v LIMIT=200
    
    Resolve any reported rows prior to rollout.
  • Use the migration script provided in ops/devops/scripts/apply-aoc-constraints.sql:
    kubectl exec -n concelier deploy/concelier-postgres -- \
      psql -d concelier -f ops/devops/scripts/apply-aoc-constraints.sql

    kubectl exec -n excititor deploy/excititor-postgres -- \
      psql -d excititor -f ops/devops/scripts/apply-aoc-constraints.sql
  • Constraints enforce required fields (tenant, source, upstream, linkset) and reject forbidden keys at the database level.
  • Rollback plan: constraints can be dropped via the same script with --remove if required.

1.2 Migration order

  1. Deploy constraints in a maintenance window.
  2. Roll out Concelier/Excititor images with guard middleware enabled (AOC_GUARD_ENABLED=true).
  3. Run smoke tests (stella sources ingest --dry-run fixtures) before resuming production ingestion.
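
The ordering above can be sketched as a small bash helper. This is only an illustration: the `run` and `aoc_migrate` names are hypothetical, and it assumes the service deployments are called `concelier` and `excititor` in namespaces of the same name, and that setting `AOC_GUARD_ENABLED` on an already-updated image is how the guard is switched on. `DRY_RUN` defaults to `1`, so the commands are printed rather than executed.

```shell
# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the three steps of section 1.2.
aoc_migrate() {
  local svc

  # Step 1: apply constraints (commands taken from section 1.1).
  for svc in concelier excititor; do
    run kubectl exec -n "$svc" "deploy/${svc}-postgres" -- \
      psql -d "$svc" -f ops/devops/scripts/apply-aoc-constraints.sql || return 1
  done

  # Step 2: enable the guard and wait for each rollout to finish.
  # Deployment names are assumptions; adjust to your manifests.
  for svc in concelier excititor; do
    run kubectl set env -n "$svc" "deployment/$svc" AOC_GUARD_ENABLED=true || return 1
    run kubectl rollout status -n "$svc" "deployment/$svc" || return 1
  done

  # Step 3: smoke test before resuming production ingestion.
  run stella sources ingest --dry-run fixtures || return 1
}
```

Calling `aoc_migrate` prints the plan; `DRY_RUN=0 aoc_migrate` executes it and stops at the first failing step, so a failed constraint migration never reaches the rollout.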

1.3 Supersedes backfill verification

  1. Duplicate audit: Confirm psql -d concelier -f ops/devops/scripts/check-advisory-raw-duplicates.sql -v LIMIT=200 reports no conflicts before restarting Concelier with the new migrations.
  2. Post-migration check: After the service restarts, validate that the advisory view points to advisory_backup_20251028:
    psql -d concelier -c "SELECT viewname, definition FROM pg_views WHERE viewname = 'advisory';"
    
    The definition should reference advisory_backup_20251028.
  3. Supersedes chain spot-check: Inspect a sample set to ensure deterministic chaining:
    psql -d concelier -c "
      SELECT id, supersedes FROM advisory_raw
      WHERE upstream_id IS NOT NULL
      ORDER BY tenant, source_vendor, upstream_id, retrieved_at
      LIMIT 5;"
    
    Each revision should reference the previous id (or null for the first revision). Record findings in the change ticket before proceeding to production.
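
For larger samples, the spot-check can be automated. The sketch below assumes the query is extended to emit `tenant`, `source_vendor`, `upstream_id`, `id`, `supersedes` (in that order) and is run with `psql -At`, which prints unaligned rows separated by `|` and renders NULL as an empty field. The function name is hypothetical, and values that themselves contain `|` would break the parsing.

```shell
# Reads rows "tenant|source_vendor|upstream_id|id|supersedes" on stdin,
# ordered as in the query above, and reports any broken chain link.
check_supersedes_chain() {
  awk -F'|' '
    {
      key = $1 FS $2 FS $3
      # The first revision of a chain must have an empty (NULL) supersedes;
      # every later revision must point at the id of the previous row.
      expected = (key == prev_key) ? prev_id : ""
      if (($5 "") != (expected "")) {
        printf "broken chain at id=%s: supersedes=%s expected=%s\n", $4, $5, expected
        bad = 1
      }
      prev_key = key
      prev_id = $4
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example invocation (not executed here):
#   psql -d concelier -At -c "
#     SELECT tenant, source_vendor, upstream_id, id, supersedes
#     FROM advisory_raw
#     WHERE upstream_id IS NOT NULL
#     ORDER BY tenant, source_vendor, upstream_id, retrieved_at
#     LIMIT 200;" | check_supersedes_chain
```

The function exits non-zero when any link is broken, so it can gate a pipeline step; its output can be attached to the change ticket.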

2 · Container environment flags

Add the following environment variables to Concelier/Excititor deployments:

| Variable | Default | Description |
|----------|---------|-------------|
| AOC_GUARD_ENABLED | true | Enables AOCWriteGuard interception. Set false only for controlled rollback. |
| AOC_ALLOW_SUPERSEDES_RETROFIT | false | Allows temporary supersedes backfill during migration. Remove after cutover. |
| AOC_METRICS_ENABLED | true | Emits ingestion_write_total, aoc_violation_total, etc. |
| AOC_TENANT_HEADER | X-Stella-Tenant | Header name expected from Gateway. |
| AOC_VERIFIER_USER | stella-aoc-verify | Read-only service user used by UI/CLI verification. |

Compose snippet:

environment:
  - AOC_GUARD_ENABLED=true
  - AOC_ALLOW_SUPERSEDES_RETROFIT=false
  - AOC_METRICS_ENABLED=true
  - AOC_TENANT_HEADER=X-Stella-Tenant
  - AOC_VERIFIER_USER=stella-aoc-verify

Ensure AOC_VERIFIER_USER exists in Authority with aoc:verify scope and no write permissions.
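
A pre-flight check can catch flags left in a migration or rollback state. The following is a minimal sketch, not part of the StellaOps tooling: `check_aoc_env` is a hypothetical helper that compares the variables in the current environment against the defaults in the table, using exact string matches (how the services themselves parse these values is not covered here).

```shell
# Warn about AOC flags that differ from their steady-state values.
# Unset variables are treated as their documented defaults.
check_aoc_env() {
  local problems=0

  if [ "${AOC_GUARD_ENABLED:-true}" != "true" ]; then
    echo "AOC_GUARD_ENABLED is not 'true' (acceptable only during controlled rollback)" >&2
    problems=$((problems + 1))
  fi

  if [ "${AOC_ALLOW_SUPERSEDES_RETROFIT:-false}" != "false" ]; then
    echo "AOC_ALLOW_SUPERSEDES_RETROFIT is still set; remove it after cutover" >&2
    problems=$((problems + 1))
  fi

  if [ "${AOC_METRICS_ENABLED:-true}" != "true" ]; then
    echo "AOC_METRICS_ENABLED is not 'true'; AOC metrics will not be emitted" >&2
    problems=$((problems + 1))
  fi

  # Succeed only when no problems were found.
  [ "$problems" -eq 0 ]
}
```

Run it inside the container or in a CI step that loads the same environment, for example `check_aoc_env || exit 1`.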


3 · Verifier identity

  • Create a dedicated client (stella-aoc-verify) via Authority bootstrap:
    clients:
      - clientId: stella-aoc-verify
        grantTypes: [client_credentials]
        scopes: [aoc:verify, advisory:read, vex:read]
        tenants: [default]
  • Store credentials in a secret store (Kubernetes Secret, Docker Swarm secret).
  • Bind credentials to stella aoc verify CI jobs and Console verification service.
  • Rotate quarterly; document in ops/authority-key-rotation.md.
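
For the Kubernetes case, storing the credentials might look like the sketch below. The secret name, the key names (`client-id`, `client-secret`), and the helper function are assumptions made for illustration. The client secret is read from a file so that it does not end up in shell history.

```shell
# Create a Kubernetes Secret holding the verifier client credentials.
#   $1: target namespace
#   $2: path to a file containing the client secret
# KUBECTL can be overridden (for example with "echo") to preview the command.
create_verifier_secret() {
  local namespace="$1" secret_file="$2"

  "${KUBECTL:-kubectl}" create secret generic stella-aoc-verify \
    --namespace "$namespace" \
    --from-literal=client-id=stella-aoc-verify \
    --from-file=client-secret="$secret_file"
}
```

Example: `create_verifier_secret concelier ./verifier-secret.txt`. During quarterly rotation, the existing secret has to be deleted or replaced first, because `kubectl create` fails when the object already exists.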

4 · Deployment steps

  1. Pre-checks: Confirm that database backups are in place, alerting is in maintenance mode, and the staging environment has been validated.
  2. Apply validators: Run scripts per §1.1.
  3. Update manifests: Inject environment variables (§2) and mount guard configuration ConfigMaps.
  4. Redeploy services: Perform a rolling restart of the Concelier/Excititor pods. Monitor ingestion_write_total for steady throughput.
  5. Seed verifier: Deploy the read-only verifier user and store its credentials.
  6. Run verification: Execute stella aoc verify --since 24h and ensure exit code 0.
  7. Update dashboards: Point Grafana panels to new metrics (aoc_violation_total).
  8. Record handoff: Capture console screenshots and verification logs for release notes.
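
Steps 6 and 8 can be combined so that the verification output is always kept for the handoff record. A minimal sketch follows, with a hypothetical function name; when no command is passed, it runs the `stella aoc verify --since 24h` invocation from step 6.

```shell
# Run the verification command, keep its output in a log file,
# and propagate the exit code.
#   $1:   log file path
#   $2..: optional command override (defaults to the step 6 command)
run_verification() {
  local log="$1"
  shift
  if [ "$#" -eq 0 ]; then
    set -- stella aoc verify --since 24h
  fi

  local status=0
  "$@" >"$log" 2>&1 || status=$?

  if [ "$status" -eq 0 ]; then
    echo "verification passed; log saved to $log"
  else
    echo "verification failed (exit $status); see $log" >&2
  fi
  return "$status"
}
```

Example: `run_verification "aoc-verify-$(date -u +%Y%m%d).log"`. A non-zero return should block the dashboard update and handoff steps.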

5 · Offline Kit updates

  • Ship validator scripts with Offline Kit (offline-kit/scripts/apply-aoc-validators.js).
  • Include pre-generated verification reports for air-gapped deployments.
  • Document offline CLI workflow in bundle README referencing docs/modules/cli/guides/cli-reference.md.
  • Ensure stella-aoc-verify credentials are scoped to offline tenant and rotated during bundle refresh.

6 · Rollback plan

  1. Disable the guard by setting AOC_GUARD_ENABLED=false on Concelier/Excititor and roll out the change.
  2. Remove validators with the migration script (--remove).
  3. Pause verification jobs to prevent noise.
  4. Investigate and remediate upstream issues before re-enabling guards.
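
Step 1 of the rollback could be scripted as below. This is a sketch under the assumption that each service runs as a Deployment named after the service, in a namespace of the same name; the function name is hypothetical. Removing the constraints (step 2) is left to the migration script and is not reproduced here.

```shell
# Disable the AOC guard on the given services and wait for each rollout.
# KUBECTL can be overridden (for example with "echo") to preview the commands.
aoc_disable_guard() {
  local kubectl="${KUBECTL:-kubectl}" svc

  for svc in "$@"; do
    "$kubectl" set env -n "$svc" "deployment/$svc" AOC_GUARD_ENABLED=false || return 1
    "$kubectl" rollout status -n "$svc" "deployment/$svc" || return 1
  done
}
```

Example: `aoc_disable_guard concelier excititor`. Waiting on `rollout status` ensures the guard is actually off on all pods before the constraints are removed.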

7 · References


8 · Compliance checklist

  • Validators documented and scripts referenced for online/offline deployments.
  • Environment variables cover guard enablement, metrics, and tenant header.
  • Read-only verifier user installation steps included.
  • Offline kit instructions align with validator/verification workflow.
  • Rollback procedure captured.
  • Cross-links to AOC docs, Authority scopes, and observability guides present.
  • DevOps Guild sign-off tracked (owner: @devops-guild, due 2025-10-29).

Last updated: 2025-10-26 (Sprint 19).