diff --git a/deploy/README.md b/deploy/README.md new file mode 100644 index 000000000..119d1c463 --- /dev/null +++ b/deploy/README.md @@ -0,0 +1,164 @@ +# Deploy + +Deployment infrastructure for StellaOps. Clean, consolidated deployment configurations. + +## Infrastructure Stack + +| Component | Technology | Version | +|-----------|------------|---------| +| Database | PostgreSQL | 18.1 | +| Messaging/Cache | Valkey | 9.0.1 | +| Object Storage | RustFS | latest | +| Transparency Log | Rekor | v2 (tiles) | + +## Directory Structure + +``` +deploy/ +├── compose/ # Docker Compose configurations +│ ├── docker-compose.stella-ops.yml # Main stack +│ ├── docker-compose.telemetry.yml # Observability (OTEL, Prometheus, Tempo, Loki) +│ ├── docker-compose.testing.yml # CI/testing infrastructure +│ ├── docker-compose.compliance-*.yml # Regional crypto overlays +│ ├── env/ # Environment templates +│ └── scripts/ # Compose lifecycle scripts +│ +├── helm/ # Kubernetes Helm charts +│ └── stellaops/ # Main chart with env-specific values +│ ├── values-dev.yaml +│ ├── values-stage.yaml +│ ├── values-prod.yaml +│ └── values-airgap.yaml +│ +├── docker/ # Container build infrastructure +│ ├── Dockerfile.hardened.template # Multi-stage hardened template +│ ├── Dockerfile.console # Angular UI +│ ├── build-all.sh # Build matrix +│ └── services-matrix.env # Service build args +│ +├── database/ # PostgreSQL infrastructure +│ ├── migrations/ # Schema migrations +│ ├── postgres/ # CloudNativePG configs +│ ├── postgres-partitioning/ # Table partitioning +│ └── postgres-validation/ # RLS validation +│ +├── scripts/ # Operational scripts +│ ├── bootstrap-trust.sh # TrustMonger initialization +│ ├── rotate-rekor-key.sh # Key rotation +│ ├── test-local.sh # Local testing +│ └── lib/ # Shared script libraries +│ +├── offline/ # Air-gap deployment +│ ├── airgap/ # Bundle creation tools +│ ├── kit/ # Installation kit +│ └── templates/ # Offline config templates +│ +├── telemetry/ # Observability (consolidated) +│ ├── alerts/ # Prometheus/Alertmanager rules +│ ├── dashboards/ # Grafana dashboards +│ ├── collectors/ # OTEL collector configs +│ └── storage/ # Prometheus/Loki/Tempo configs +│ +├── secrets/ # Secret management templates +│ └── *.example # Example secret structures +│ +├── releases/ # Release manifests +│ └── *.yaml # Version pinning per channel +│ +└── tools/ # Curated operational tools + ├── ci/ # Build/CI tools (nuget-prime, determinism) + ├── feeds/ # Feed management (concelier, vex) + ├── security/ # Security (attest, cosign, crypto) + └── validation/ # Validation scripts +``` + +## Quick Start + +### Local Development (Docker Compose) + +```bash +# Start full stack +docker compose -f deploy/compose/docker-compose.stella-ops.yml up -d + +# Start with telemetry +docker compose -f deploy/compose/docker-compose.stella-ops.yml \ + -f deploy/compose/docker-compose.telemetry.yml up -d + +# Regional compliance overlay (e.g., China SM2/SM3/SM4) +docker compose -f deploy/compose/docker-compose.stella-ops.yml \ + -f deploy/compose/docker-compose.compliance-china.yml up -d +``` + +### Kubernetes (Helm) + +```bash +# Install to dev environment +helm install stellaops deploy/helm/stellaops \ + -f deploy/helm/stellaops/values-dev.yaml \ + -n stellaops --create-namespace + +# Install to production +helm install stellaops deploy/helm/stellaops \ + -f deploy/helm/stellaops/values-prod.yaml \ + -n stellaops --create-namespace +``` + +### Air-Gapped Installation + +```bash +# Create offline bundle +python 
deploy/offline/airgap/build_bootstrap_pack.py --version 2026.04 + +# Import on air-gapped system +deploy/offline/airgap/import-bundle.sh stellaops-2026.04-bundle.tar.gz +``` + +## Compose Profiles + +| File | Purpose | Services | +|------|---------|----------| +| `stella-ops.yml` | Main stack | PostgreSQL, Valkey, RustFS, Rekor, all StellaOps services | +| `telemetry.yml` | Observability | OTEL Collector, Prometheus, Tempo, Loki | +| `testing.yml` | CI/Testing | postgres-test, valkey-test, mock-registry | +| `compliance-china.yml` | China crypto | SM2/SM3/SM4 overlays | +| `compliance-russia.yml` | Russia crypto | GOST R 34.10 overlays | +| `compliance-eu.yml` | EU crypto | eIDAS overlays | +| `dev.yml` | Development | Minimal stack with hot-reload | + +## Connection Strings + +```bash +# PostgreSQL +Host=stellaops-postgres;Port=5432;Database=stellaops;Username=stellaops;Password= + +# Valkey +stellaops-valkey:6379 + +# RustFS (S3-compatible) +http://stellaops-rustfs:8080 +``` + +## Migration from devops/ + +This `deploy/` directory is the consolidated replacement for the scattered `devops/` directory. +Content has been reorganized: + +| Old Location | New Location | +|--------------|--------------| +| `devops/compose/` | `deploy/compose/` | +| `devops/helm/` | `deploy/helm/` | +| `devops/docker/` | `deploy/docker/` | +| `devops/database/` | `deploy/database/` | +| `devops/scripts/` | `deploy/scripts/` | +| `devops/offline/` | `deploy/offline/` | +| `devops/observability/` + `devops/telemetry/` | `deploy/telemetry/` | +| `devops/secrets/` | `deploy/secrets/` | +| `devops/releases/` | `deploy/releases/` | + +The following `devops/` content was archived or removed: +- `devops/services/` - Scattered service configs (use compose overlays or helm values) +- `devops/tools/` - Move operational tools to `tools/` at repo root +- `devops/artifacts/` - CI artifacts (transient, should not be committed) +- `devops/.nuget/` - Package cache (restore during build) +- `devops/docs/` - Move to `docs/operations/` +- `devops/gitlab/` - Legacy CI templates (repo uses Gitea) diff --git a/deploy/compose/README.md b/deploy/compose/README.md new file mode 100644 index 000000000..d218bc597 --- /dev/null +++ b/deploy/compose/README.md @@ -0,0 +1,459 @@ +# Stella Ops Docker Compose Profiles + +Consolidated Docker Compose configuration for the StellaOps platform. All profiles use immutable image digests from `deploy/releases/*.yaml` and are validated via `docker compose config` in CI. + +## Quick Reference + +| I want to... 
| Command | +|--------------|---------| +| Run the full platform | `docker compose -f docker-compose.stella-ops.yml up -d` | +| Add observability | `docker compose -f docker-compose.stella-ops.yml -f docker-compose.telemetry.yml up -d` | +| Run CI/testing infrastructure | `docker compose -f docker-compose.testing.yml --profile ci up -d` | +| Deploy with China compliance | See [China Compliance](#china-compliance-sm2sm3sm4) | +| Deploy with Russia compliance | See [Russia Compliance](#russia-compliance-gost) | +| Deploy with EU compliance | See [EU Compliance](#eu-compliance-eidas) | + +--- + +## File Structure + +### Core Stack Files + +| File | Purpose | +|------|---------| +| `docker-compose.stella-ops.yml` | **Main stack**: PostgreSQL 18.1, Valkey 9.0.1, RustFS, Rekor v2, all StellaOps services | +| `docker-compose.telemetry.yml` | **Observability**: OpenTelemetry collector, Prometheus, Tempo, Loki | +| `docker-compose.testing.yml` | **CI/Testing**: Test databases, mock services, Gitea for integration tests | +| `docker-compose.dev.yml` | **Minimal dev infrastructure**: PostgreSQL, Valkey, RustFS only | + +### Specialized Infrastructure + +| File | Purpose | +|------|---------| +| `docker-compose.bsim.yml` | **BSim analysis**: PostgreSQL for Ghidra binary similarity corpus | +| `docker-compose.corpus.yml` | **Function corpus**: PostgreSQL for function behavior database | +| `docker-compose.sealed-ci.yml` | **Air-gapped CI**: Sealed testing environment with authority, signer, attestor | +| `docker-compose.telemetry-offline.yml` | **Offline observability**: Air-gapped Loki, Promtail, OTEL collector, Tempo, Prometheus | + +### Regional Compliance Overlays + +| File | Purpose | Jurisdiction | +|------|---------|--------------| +| `docker-compose.compliance-china.yml` | SM2/SM3/SM4 ShangMi crypto configuration | China (OSCCA) | +| `docker-compose.compliance-russia.yml` | GOST R 34.10-2012 crypto configuration | Russia (FSB) | +| `docker-compose.compliance-eu.yml` | eIDAS qualified trust services configuration | EU | + +### Crypto Provider Overlays + +| File | Purpose | Use Case | +|------|---------|----------| +| `docker-compose.crypto-sim.yml` | Universal crypto simulation | Testing without licensed crypto | +| `docker-compose.cryptopro.yml` | CryptoPro CSP (real GOST) | Production Russia deployments | +| `docker-compose.sm-remote.yml` | SM Remote service (real SM2) | Production China deployments | + +### Additional Overlays + +| File | Purpose | Use Case | +|------|---------|----------| +| `docker-compose.gpu.yaml` | NVIDIA GPU acceleration | Advisory AI inference with GPU | +| `docker-compose.cas.yaml` | Content Addressable Storage | Dedicated CAS with retention policies | +| `docker-compose.tile-proxy.yml` | Rekor tile caching proxy | Air-gapped Sigstore deployments | + +### Supporting Files + +| Path | Purpose | +|------|---------| +| `env/*.env.example` | Environment variable templates per profile | +| `scripts/backup.sh` | Create deterministic volume snapshots | +| `scripts/reset.sh` | Stop stack and remove volumes (with confirmation) | + +--- + +## Usage Patterns + +### Basic Development + +```bash +# Copy environment template +cp env/stellaops.env.example .env + +# Validate configuration +docker compose -f docker-compose.stella-ops.yml config + +# Start the platform +docker compose -f docker-compose.stella-ops.yml up -d + +# View logs +docker compose -f docker-compose.stella-ops.yml logs -f scanner-web +``` + +### With Observability + +```bash +# Generate TLS certificates for 
telemetry +./ops/devops/telemetry/generate_dev_tls.sh + +# Start platform with telemetry +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.telemetry.yml up -d +``` + +### CI/Testing Infrastructure + +```bash +# Start CI infrastructure only (different ports to avoid conflicts) +docker compose -f docker-compose.testing.yml --profile ci up -d + +# Start mock services for integration testing +docker compose -f docker-compose.testing.yml --profile mock up -d + +# Start Gitea for SCM integration tests +docker compose -f docker-compose.testing.yml --profile gitea up -d + +# Start everything +docker compose -f docker-compose.testing.yml --profile all up -d +``` + +**Test Infrastructure Ports:** +| Service | Port | Purpose | +|---------|------|---------| +| postgres-test | 5433 | PostgreSQL 18 for tests | +| valkey-test | 6380 | Valkey for cache/queue tests | +| rustfs-test | 8180 | S3-compatible storage | +| mock-registry | 5001 | Container registry mock | +| gitea | 3000 | Git hosting for SCM tests | + +--- + +## Regional Compliance Deployments + +### China Compliance (SM2/SM3/SM4) + +**For Testing (simulation):** +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-china.yml \ + -f docker-compose.crypto-sim.yml up -d +``` + +**For Production (real SM crypto):** +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-china.yml \ + -f docker-compose.sm-remote.yml up -d +``` + +**With OSCCA-certified HSM:** +```bash +# Set HSM connection details in environment +export SM_REMOTE_HSM_URL="https://sm-hsm.example.com:8900" +export SM_REMOTE_HSM_API_KEY="your-api-key" + +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-china.yml \ + -f docker-compose.sm-remote.yml up -d +``` + +**Algorithms:** +- SM2: Public key cryptography (GM/T 0003-2012) +- SM3: Hash function, 256-bit (GM/T 0004-2012) +- SM4: Block cipher, 128-bit (GM/T 0002-2012) + +--- + +### Russia Compliance (GOST) + +**For Testing (simulation):** +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-russia.yml \ + -f docker-compose.crypto-sim.yml up -d +``` + +**For Production (CryptoPro CSP):** +```bash +# CryptoPro requires EULA acceptance +CRYPTOPRO_ACCEPT_EULA=1 docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-russia.yml \ + -f docker-compose.cryptopro.yml up -d +``` + +**Requirements for CryptoPro:** +- CryptoPro CSP license files in `opt/cryptopro/downloads/` +- `CRYPTOPRO_ACCEPT_EULA=1` environment variable +- Valid CryptoPro container images + +**Algorithms:** +- GOST R 34.10-2012: Digital signature (256/512-bit) +- GOST R 34.11-2012: Hash function (Streebog, 256/512-bit) +- GOST R 34.12-2015: Block cipher (Kuznyechik, Magma) + +--- + +### EU Compliance (eIDAS) + +**For Testing (simulation):** +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-eu.yml \ + -f docker-compose.crypto-sim.yml up -d +``` + +**For Production:** +EU eIDAS deployments typically integrate with external Qualified Trust Service Providers (QTSPs) rather than hosting crypto locally. Configure your QTSP integration in the application settings. 
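+As an illustration only: the exact settings depend on your QTSP and on the keys exposed by `appsettings.crypto.eu.yaml`, so the variable names below (`QTSP_ENDPOINT_URL`, `QTSP_CLIENT_CERT_PATH`) are hypothetical placeholders rather than documented options. See [Crypto Profile Configuration](../../docs/security/crypto-profile-configuration.md) for the actual settings.
+
+```bash
+# Hypothetical QTSP wiring for the eu profile; replace these names with the
+# settings your QTSP integration actually requires, then start the stack.
+export STELLAOPS_CRYPTO_PROFILE=eu
+export QTSP_ENDPOINT_URL="https://qtsp.example.eu"
+export QTSP_CLIENT_CERT_PATH="/etc/stellaops/qtsp/client.pem"
+```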
+ +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.compliance-eu.yml up -d +``` + +**Standards:** +- ETSI TS 119 312 compliant algorithms +- Qualified electronic signatures +- QTSP integration for qualified trust services + +--- + +## Crypto Simulation Details + +The `docker-compose.crypto-sim.yml` overlay provides a unified simulation service for all sovereign crypto profiles: + +| Algorithm ID | Simulation | Use Case | +|--------------|------------|----------| +| `SM2`, `sm.sim` | HMAC-SHA256 | China testing | +| `GOST12-256`, `GOST12-512` | HMAC-SHA256 | Russia testing | +| `ru.magma.sim`, `ru.kuznyechik.sim` | HMAC-SHA256 | Russia testing | +| `DILITHIUM3`, `FALCON512`, `pq.sim` | HMAC-SHA256 | Post-quantum testing | +| `fips.sim`, `eidas.sim`, `kcmvp.sim` | ECDSA P-256 | FIPS/EU/Korea testing | + +**Important:** Simulation is for testing only. Uses deterministic HMAC or static ECDSA keys—not suitable for production or compliance certification. + +--- + +## Configuration Reference + +### Infrastructure Services + +| Service | Default Port | Purpose | +|---------|--------------|---------| +| PostgreSQL | 5432 | Primary database | +| Valkey | 6379 | Cache, queues, events | +| RustFS | 8080 | S3-compatible artifact storage | +| Rekor v2 | (internal) | Sigstore transparency log | + +### Application Services + +| Service | Default Port | Purpose | +|---------|--------------|---------| +| Authority | 8440 | OAuth2/OIDC identity provider | +| Signer | 8441 | Cryptographic signing | +| Attestor | 8442 | SLSA attestation | +| Scanner Web | 8444 | SBOM/vulnerability scanning API | +| Concelier | 8445 | Advisory aggregation | +| Notify Web | 8446 | Notification service | +| Issuer Directory | 8447 | CSAF publisher registry | +| Advisory AI Web | 8448 | AI-powered advisory analysis | +| Web UI | 8443 | Angular frontend | + +### Environment Variables + +Key variables (see `env/*.env.example` for complete list): + +```bash +# Database +POSTGRES_USER=stellaops +POSTGRES_PASSWORD= +POSTGRES_DB=stellaops_platform + +# Authority +AUTHORITY_ISSUER=https://authority.example.com + +# Scanner +SCANNER_EVENTS_ENABLED=false +SCANNER_OFFLINEKIT_ENABLED=false + +# Crypto (for compliance overlays) +STELLAOPS_CRYPTO_PROFILE=default # or: china, russia, eu +STELLAOPS_CRYPTO_ENABLE_SIM=0 # set to 1 for simulation + +# CryptoPro (Russia only) +CRYPTOPRO_ACCEPT_EULA=0 # must be 1 to use CryptoPro + +# SM Remote (China only) +SM_SOFT_ALLOWED=1 # software-only SM2 +SM_REMOTE_HSM_URL= # optional: OSCCA-certified HSM +``` + +--- + +## Networking + +All profiles use a shared `stellaops` Docker network. Production deployments can attach a `frontdoor` network for reverse proxy integration: + +```bash +# Create external network for load balancer +docker network create stellaops_frontdoor + +# Set in environment +export FRONTDOOR_NETWORK=stellaops_frontdoor +``` + +Only externally-reachable services (Authority, Signer, Attestor, Concelier, Scanner Web, Notify Web, UI) attach to the frontdoor network. Infrastructure services (PostgreSQL, Valkey, RustFS) remain on the private network. + +--- + +## Sigstore Tools + +Enable Sigstore CLI tools (rekor-cli, cosign) with the `sigstore` profile: + +```bash +docker compose -f docker-compose.stella-ops.yml --profile sigstore up -d +``` + +--- + +## GPU Support for Advisory AI + +GPU is disabled by default. 
To enable NVIDIA GPU inference: + +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.gpu.yaml up -d +``` + +**Requirements:** +- NVIDIA GPU with CUDA support +- nvidia-container-toolkit installed +- Docker configured with nvidia runtime + +--- + +## Content Addressable Storage (CAS) + +The CAS overlay provides dedicated RustFS instances with retention policies for different artifact types: + +```bash +# Standalone CAS infrastructure +docker compose -f docker-compose.cas.yaml up -d + +# Combined with main stack +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.cas.yaml up -d +``` + +**CAS Services:** +| Service | Port | Purpose | +|---------|------|---------| +| rustfs-cas | 8180 | Runtime facts, signals, replay artifacts | +| rustfs-evidence | 8181 | Merkle roots, hash chains, evidence bundles (immutable) | +| rustfs-attestation | 8182 | DSSE envelopes, in-toto attestations (immutable) | + +**Retention Policies (configurable via `env/cas.env.example`):** +- Vulnerability DB: 7 days +- SBOM artifacts: 365 days +- Scan results: 90 days +- Evidence bundles: Indefinite (immutable) +- Attestations: Indefinite (immutable) + +--- + +## Tile Proxy (Air-Gapped Sigstore) + +For air-gapped deployments, the tile-proxy caches Rekor transparency log tiles locally from public Sigstore: + +```bash +docker compose -f docker-compose.stella-ops.yml \ + -f docker-compose.tile-proxy.yml up -d +``` + +**Tile Proxy vs Rekor v2:** +- Use `--profile sigstore` when running your own Rekor transparency log locally +- Use `docker-compose.tile-proxy.yml` when caching tiles from public Sigstore (rekor.sigstore.dev) + +**Configuration:** +| Variable | Default | Purpose | +|----------|---------|---------| +| `REKOR_SERVER_URL` | `https://rekor.sigstore.dev` | Upstream Rekor to proxy | +| `TILE_PROXY_SYNC_ENABLED` | `true` | Enable periodic tile sync | +| `TILE_PROXY_SYNC_SCHEDULE` | `0 */6 * * *` | Sync every 6 hours | +| `TILE_PROXY_CACHE_MAX_SIZE_GB` | `10` | Local cache size limit | + +The proxy syncs tiles on schedule and serves them to internal services for offline verification. + +--- + +## Maintenance + +### Backup + +```bash +./scripts/backup.sh # Creates timestamped tar.gz of volumes +``` + +### Reset + +```bash +./scripts/reset.sh # Stops stack, removes volumes (requires confirmation) +``` + +### Validate Configuration + +```bash +docker compose -f docker-compose.stella-ops.yml config +``` + +### Update to New Release + +1. Import new manifest to `deploy/releases/` +2. Update image digests in compose files +3. Run `docker compose config` to validate +4. Run `deploy/tools/validate-profiles.sh` for audit + +--- + +## Troubleshooting + +### Port Conflicts + +Override ports in your `.env` file: +```bash +POSTGRES_PORT=5433 +VALKEY_PORT=6380 +SCANNER_WEB_PORT=8544 +``` + +### Service Dependencies + +Services declare `depends_on` with health checks. 
If a service fails to start, check its dependencies: +```bash +docker compose -f docker-compose.stella-ops.yml ps +docker compose -f docker-compose.stella-ops.yml logs postgres +docker compose -f docker-compose.stella-ops.yml logs valkey +``` + +### Crypto Provider Issues + +For crypto simulation issues: +```bash +# Check sim-crypto service +docker compose logs sim-crypto +curl http://localhost:18090/keys +``` + +For CryptoPro issues: +```bash +# Verify EULA acceptance +echo $CRYPTOPRO_ACCEPT_EULA # must be 1 + +# Check CryptoPro service +docker compose logs cryptopro-csp +``` + +--- + +## Related Documentation + +- [Deployment Upgrade Runbook](../../docs/operations/devops/runbooks/deployment-upgrade.md) +- [Local CI Guide](../../docs/technical/testing/LOCAL_CI_GUIDE.md) +- [Crypto Profile Configuration](../../docs/security/crypto-profile-configuration.md) +- [Regional Deployments](../../docs/operations/regional-deployments.md) diff --git a/deploy/compose/docker-compose.bsim.yml b/deploy/compose/docker-compose.bsim.yml new file mode 100644 index 000000000..43353dc93 --- /dev/null +++ b/deploy/compose/docker-compose.bsim.yml @@ -0,0 +1,73 @@ +# ============================================================================= +# BSIM - BINARY SIMILARITY ANALYSIS +# ============================================================================= +# BSim PostgreSQL Database and Ghidra Headless Services for binary analysis. +# +# Usage: +# docker compose -f docker-compose.bsim.yml up -d +# +# Environment: +# BSIM_DB_PASSWORD - PostgreSQL password for BSim database +# ============================================================================= + +services: + bsim-postgres: + image: postgres:18.1-alpine + container_name: stellaops-bsim-db + environment: + POSTGRES_DB: bsim_corpus + POSTGRES_USER: bsim_user + POSTGRES_PASSWORD: ${BSIM_DB_PASSWORD:-stellaops_bsim_dev} + POSTGRES_INITDB_ARGS: "-E UTF8 --locale=C" + volumes: + - bsim-data:/var/lib/postgresql/data + - ../docker/ghidra/scripts/init-bsim.sql:/docker-entrypoint-initdb.d/10-init-bsim.sql:ro + ports: + - "${BSIM_DB_PORT:-5433}:5432" + networks: + - stellaops-bsim + healthcheck: + test: ["CMD-SHELL", "pg_isready -U bsim_user -d bsim_corpus"] + interval: 10s + timeout: 5s + retries: 5 + restart: unless-stopped + + ghidra-headless: + build: + context: ../docker/ghidra + dockerfile: Dockerfile.headless + image: stellaops/ghidra-headless:11.2 + container_name: stellaops-ghidra + depends_on: + bsim-postgres: + condition: service_healthy + environment: + BSIM_DB_URL: "postgresql://bsim-postgres:5432/bsim_corpus" + BSIM_DB_USER: bsim_user + BSIM_DB_PASSWORD: ${BSIM_DB_PASSWORD:-stellaops_bsim_dev} + JAVA_HOME: /opt/java/openjdk + MAXMEM: 4G + volumes: + - ghidra-projects:/projects + - ghidra-scripts:/scripts + - ghidra-output:/output + networks: + - stellaops-bsim + deploy: + resources: + limits: + cpus: '4' + memory: 8G + entrypoint: ["tail", "-f", "/dev/null"] + restart: unless-stopped + +volumes: + bsim-data: + ghidra-projects: + ghidra-scripts: + ghidra-output: + +networks: + stellaops-bsim: + driver: bridge diff --git a/deploy/compose/docker-compose.cas.yaml b/deploy/compose/docker-compose.cas.yaml new file mode 100644 index 000000000..5739034a8 --- /dev/null +++ b/deploy/compose/docker-compose.cas.yaml @@ -0,0 +1,212 @@ +# Content Addressable Storage (CAS) Infrastructure +# Uses RustFS for S3-compatible immutable object storage +# Aligned with best-in-class vulnerability scanner retention policies +# +# Usage (standalone): +# docker compose -f 
docker-compose.cas.yaml up -d +# +# Usage (with main stack): +# docker compose -f docker-compose.stella-ops.yml -f docker-compose.cas.yaml up -d + +x-release-labels: &release-labels + com.stellaops.release.version: "2025.10.0-edge" + com.stellaops.release.channel: "edge" + com.stellaops.profile: "cas" + +x-cas-config: &cas-config + # Retention policies (aligned with Trivy/Grype/Anchore Enterprise) + # - vulnerability-db: 7 days (matches Trivy default) + # - sbom-artifacts: 365 days (audit compliance) + # - scan-results: 90 days (SOC2/ISO27001 typical) + # - evidence-bundles: indefinite (immutable, content-addressed) + # - attestations: indefinite (in-toto/DSSE signed) + CAS__RETENTION__VULNERABILITY_DB_DAYS: "7" + CAS__RETENTION__SBOM_ARTIFACTS_DAYS: "365" + CAS__RETENTION__SCAN_RESULTS_DAYS: "90" + CAS__RETENTION__EVIDENCE_BUNDLES_DAYS: "0" # 0 = indefinite + CAS__RETENTION__ATTESTATIONS_DAYS: "0" # 0 = indefinite + CAS__RETENTION__TEMP_ARTIFACTS_DAYS: "1" + +networks: + cas: + driver: bridge + +volumes: + rustfs-cas-data: + driver: local + driver_opts: + type: none + o: bind + device: ${CAS_DATA_PATH:-/var/lib/stellaops/cas} + rustfs-evidence-data: + driver: local + driver_opts: + type: none + o: bind + device: ${CAS_EVIDENCE_PATH:-/var/lib/stellaops/evidence} + rustfs-attestation-data: + driver: local + driver_opts: + type: none + o: bind + device: ${CAS_ATTESTATION_PATH:-/var/lib/stellaops/attestations} + +services: + # Primary CAS storage - runtime facts, signals, replay artifacts + rustfs-cas: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + command: ["serve", "--listen", "0.0.0.0:8080", "--root", "/data"] + restart: unless-stopped + environment: + RUSTFS__LOG__LEVEL: "${RUSTFS_LOG_LEVEL:-info}" + RUSTFS__STORAGE__PATH: /data + RUSTFS__STORAGE__DEDUP: "true" + RUSTFS__STORAGE__COMPRESSION: "${RUSTFS_COMPRESSION:-zstd}" + RUSTFS__STORAGE__COMPRESSION_LEVEL: "${RUSTFS_COMPRESSION_LEVEL:-3}" + # Bucket lifecycle (retention enforcement) + RUSTFS__LIFECYCLE__ENABLED: "true" + RUSTFS__LIFECYCLE__SCAN_INTERVAL_HOURS: "24" + RUSTFS__LIFECYCLE__DEFAULT_RETENTION_DAYS: "90" + # Access control + RUSTFS__AUTH__ENABLED: "${RUSTFS_AUTH_ENABLED:-true}" + RUSTFS__AUTH__API_KEY: "${RUSTFS_CAS_API_KEY:-cas-api-key-change-me}" + RUSTFS__AUTH__READONLY_KEY: "${RUSTFS_CAS_READONLY_KEY:-cas-readonly-key-change-me}" + # Service account configuration + RUSTFS__ACCOUNTS__SCANNER__KEY: "${RUSTFS_SCANNER_KEY:-scanner-svc-key}" + RUSTFS__ACCOUNTS__SCANNER__BUCKETS: "scanner-artifacts,surface-cache,runtime-facts" + RUSTFS__ACCOUNTS__SCANNER__PERMISSIONS: "read,write" + RUSTFS__ACCOUNTS__SIGNALS__KEY: "${RUSTFS_SIGNALS_KEY:-signals-svc-key}" + RUSTFS__ACCOUNTS__SIGNALS__BUCKETS: "runtime-facts,signals-data,provenance-feed" + RUSTFS__ACCOUNTS__SIGNALS__PERMISSIONS: "read,write" + RUSTFS__ACCOUNTS__REPLAY__KEY: "${RUSTFS_REPLAY_KEY:-replay-svc-key}" + RUSTFS__ACCOUNTS__REPLAY__BUCKETS: "replay-bundles,inputs-lock" + RUSTFS__ACCOUNTS__REPLAY__PERMISSIONS: "read,write" + RUSTFS__ACCOUNTS__READONLY__KEY: "${RUSTFS_READONLY_KEY:-readonly-svc-key}" + RUSTFS__ACCOUNTS__READONLY__BUCKETS: "*" + RUSTFS__ACCOUNTS__READONLY__PERMISSIONS: "read" + <<: *cas-config + volumes: + - rustfs-cas-data:/data + ports: + - "${RUSTFS_CAS_PORT:-8180}:8080" + networks: + - cas + labels: *release-labels + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 10s + + # Evidence storage - Merkle roots, hash chains, evidence bundles (immutable) + 
rustfs-evidence: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + command: ["serve", "--listen", "0.0.0.0:8080", "--root", "/data", "--immutable"] + restart: unless-stopped + environment: + RUSTFS__LOG__LEVEL: "${RUSTFS_LOG_LEVEL:-info}" + RUSTFS__STORAGE__PATH: /data + RUSTFS__STORAGE__DEDUP: "true" + RUSTFS__STORAGE__COMPRESSION: "${RUSTFS_COMPRESSION:-zstd}" + RUSTFS__STORAGE__IMMUTABLE: "true" # Write-once, never delete + # Access control + RUSTFS__AUTH__ENABLED: "true" + RUSTFS__AUTH__API_KEY: "${RUSTFS_EVIDENCE_API_KEY:-evidence-api-key-change-me}" + RUSTFS__AUTH__READONLY_KEY: "${RUSTFS_EVIDENCE_READONLY_KEY:-evidence-readonly-key-change-me}" + # Service accounts + RUSTFS__ACCOUNTS__LEDGER__KEY: "${RUSTFS_LEDGER_KEY:-ledger-svc-key}" + RUSTFS__ACCOUNTS__LEDGER__BUCKETS: "evidence-bundles,merkle-roots,hash-chains" + RUSTFS__ACCOUNTS__LEDGER__PERMISSIONS: "read,write" + RUSTFS__ACCOUNTS__EXPORTER__KEY: "${RUSTFS_EXPORTER_KEY:-exporter-svc-key}" + RUSTFS__ACCOUNTS__EXPORTER__BUCKETS: "evidence-bundles" + RUSTFS__ACCOUNTS__EXPORTER__PERMISSIONS: "read" + volumes: + - rustfs-evidence-data:/data + ports: + - "${RUSTFS_EVIDENCE_PORT:-8181}:8080" + networks: + - cas + labels: *release-labels + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 10s + + # Attestation storage - DSSE envelopes, in-toto attestations (immutable) + rustfs-attestation: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + command: ["serve", "--listen", "0.0.0.0:8080", "--root", "/data", "--immutable"] + restart: unless-stopped + environment: + RUSTFS__LOG__LEVEL: "${RUSTFS_LOG_LEVEL:-info}" + RUSTFS__STORAGE__PATH: /data + RUSTFS__STORAGE__DEDUP: "true" + RUSTFS__STORAGE__COMPRESSION: "${RUSTFS_COMPRESSION:-zstd}" + RUSTFS__STORAGE__IMMUTABLE: "true" # Write-once, never delete + # Access control + RUSTFS__AUTH__ENABLED: "true" + RUSTFS__AUTH__API_KEY: "${RUSTFS_ATTESTATION_API_KEY:-attestation-api-key-change-me}" + RUSTFS__AUTH__READONLY_KEY: "${RUSTFS_ATTESTATION_READONLY_KEY:-attestation-readonly-key-change-me}" + # Service accounts + RUSTFS__ACCOUNTS__ATTESTOR__KEY: "${RUSTFS_ATTESTOR_KEY:-attestor-svc-key}" + RUSTFS__ACCOUNTS__ATTESTOR__BUCKETS: "attestations,dsse-envelopes,rekor-receipts" + RUSTFS__ACCOUNTS__ATTESTOR__PERMISSIONS: "read,write" + RUSTFS__ACCOUNTS__VERIFIER__KEY: "${RUSTFS_VERIFIER_KEY:-verifier-svc-key}" + RUSTFS__ACCOUNTS__VERIFIER__BUCKETS: "attestations,dsse-envelopes,rekor-receipts" + RUSTFS__ACCOUNTS__VERIFIER__PERMISSIONS: "read" + volumes: + - rustfs-attestation-data:/data + ports: + - "${RUSTFS_ATTESTATION_PORT:-8182}:8080" + networks: + - cas + labels: *release-labels + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 10s + + rekor-cli: + image: ghcr.io/sigstore/rekor-cli:v1.4.3 + entrypoint: ["rekor-cli"] + command: ["version"] + profiles: ["sigstore"] + networks: + - cas + labels: *release-labels + + cosign: + image: ghcr.io/sigstore/cosign:v3.0.4 + entrypoint: ["cosign"] + command: ["version"] + profiles: ["sigstore"] + networks: + - cas + labels: *release-labels + + # Lifecycle manager - enforces retention policies + cas-lifecycle: + image: registry.stella-ops.org/stellaops/cas-lifecycle:2025.10.0-edge + restart: unless-stopped + depends_on: + rustfs-cas: + condition: service_healthy + environment: + LIFECYCLE__CAS__ENDPOINT: "http://rustfs-cas:8080" + LIFECYCLE__CAS__API_KEY: 
"${RUSTFS_CAS_API_KEY:-cas-api-key-change-me}" + LIFECYCLE__SCHEDULE__CRON: "${LIFECYCLE_CRON:-0 3 * * *}" # 3 AM daily + LIFECYCLE__POLICIES__VULNERABILITY_DB: "7d" + LIFECYCLE__POLICIES__SBOM_ARTIFACTS: "365d" + LIFECYCLE__POLICIES__SCAN_RESULTS: "90d" + LIFECYCLE__POLICIES__TEMP_ARTIFACTS: "1d" + LIFECYCLE__TELEMETRY__ENABLED: "${LIFECYCLE_TELEMETRY:-true}" + LIFECYCLE__TELEMETRY__OTLP_ENDPOINT: "${OTLP_ENDPOINT:-}" + networks: + - cas + labels: *release-labels + diff --git a/deploy/compose/docker-compose.compliance-china.yml b/deploy/compose/docker-compose.compliance-china.yml new file mode 100644 index 000000000..d1ec22334 --- /dev/null +++ b/deploy/compose/docker-compose.compliance-china.yml @@ -0,0 +1,197 @@ +# ============================================================================= +# STELLA OPS - COMPLIANCE OVERLAY: CHINA +# ============================================================================= +# SM2/SM3/SM4 ShangMi (Commercial Cipher) crypto overlay. +# This file extends docker-compose.stella-ops.yml with China-specific crypto. +# +# Usage: +# docker compose -f devops/compose/docker-compose.stella-ops.yml \ +# -f devops/compose/docker-compose.compliance-china.yml up -d +# +# Cryptography: +# - SM2: Elliptic curve cryptography (signature, key exchange) +# - SM3: Hash function (256-bit digest) +# - SM4: Block cipher (128-bit) +# +# ============================================================================= + +x-crypto-env: &crypto-env + STELLAOPS_CRYPTO_PROFILE: "china" + STELLAOPS_CRYPTO_CONFIG_PATH: "/app/etc/appsettings.crypto.yaml" + STELLAOPS_CRYPTO_MANIFEST_PATH: "/app/etc/crypto-plugins-manifest.json" + +x-crypto-volumes: &crypto-volumes + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + +services: + # --------------------------------------------------------------------------- + # Authority - China crypto overlay + # --------------------------------------------------------------------------- + authority: + image: registry.stella-ops.org/stellaops/authority:china + environment: + <<: *crypto-env + volumes: + - ../../etc/authority:/app/etc/authority:ro + - ../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Signer - China crypto overlay + # --------------------------------------------------------------------------- + signer: + image: registry.stella-ops.org/stellaops/signer:china + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Attestor - China crypto overlay + # --------------------------------------------------------------------------- + attestor: + image: registry.stella-ops.org/stellaops/attestor:china + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: 
"china" + + # --------------------------------------------------------------------------- + # Concelier - China crypto overlay + # --------------------------------------------------------------------------- + concelier: + image: registry.stella-ops.org/stellaops/concelier:china + environment: + <<: *crypto-env + volumes: + - concelier-jobs:/var/lib/concelier/jobs + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Scanner Web - China crypto overlay + # --------------------------------------------------------------------------- + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web:china + environment: + <<: *crypto-env + volumes: + - ../../etc/scanner:/app/etc/scanner:ro + - ../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ${SCANNER_OFFLINEKIT_TRUSTROOTS_HOST_PATH:-./offline/trust-roots}:${SCANNER_OFFLINEKIT_TRUSTROOTDIRECTORY:-/etc/stellaops/trust-roots}:ro + - ${SCANNER_OFFLINEKIT_REKOR_SNAPSHOT_HOST_PATH:-./offline/rekor-snapshot}:${SCANNER_OFFLINEKIT_REKORSNAPSHOTDIRECTORY:-/var/lib/stellaops/rekor-snapshot}:ro + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Scanner Worker - China crypto overlay + # --------------------------------------------------------------------------- + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker:china + environment: + <<: *crypto-env + volumes: + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Scheduler Worker - China crypto overlay + # --------------------------------------------------------------------------- + scheduler-worker: + image: registry.stella-ops.org/stellaops/scheduler-worker:china + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Notify Web - China crypto overlay + # --------------------------------------------------------------------------- + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:china + environment: + <<: *crypto-env + volumes: + - ../../etc/notify:/app/etc/notify:ro + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # 
--------------------------------------------------------------------------- + # Excititor - China crypto overlay + # --------------------------------------------------------------------------- + excititor: + image: registry.stella-ops.org/stellaops/excititor:china + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Advisory AI Web - China crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:china + environment: + <<: *crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Advisory AI Worker - China crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:china + environment: + <<: *crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.china.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "china" + + # --------------------------------------------------------------------------- + # Web UI - China crypto overlay + # --------------------------------------------------------------------------- + web-ui: + image: registry.stella-ops.org/stellaops/web-ui:china + labels: + com.stellaops.crypto.profile: "china" diff --git a/deploy/compose/docker-compose.compliance-eu.yml b/deploy/compose/docker-compose.compliance-eu.yml new file mode 100644 index 000000000..62b5743db --- /dev/null +++ b/deploy/compose/docker-compose.compliance-eu.yml @@ -0,0 +1,209 @@ +# ============================================================================= +# STELLA OPS - COMPLIANCE OVERLAY: EU +# ============================================================================= +# eIDAS qualified trust services crypto overlay. +# This file extends docker-compose.stella-ops.yml with EU-specific crypto. 
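+# Requires ../../etc/appsettings.crypto.eu.yaml and ../../etc/crypto-plugins-manifest.json
+# to exist on the host; both are mounted read-only into every service below.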
+# +# Usage: +# docker compose -f devops/compose/docker-compose.stella-ops.yml \ +# -f devops/compose/docker-compose.compliance-eu.yml up -d +# +# Cryptography: +# - eIDAS-compliant qualified electronic signatures +# - ETSI TS 119 312 compliant algorithms +# - Qualified Trust Service Provider (QTSP) integration +# +# ============================================================================= + +x-crypto-env: &crypto-env + STELLAOPS_CRYPTO_PROFILE: "eu" + STELLAOPS_CRYPTO_CONFIG_PATH: "/app/etc/appsettings.crypto.yaml" + STELLAOPS_CRYPTO_MANIFEST_PATH: "/app/etc/crypto-plugins-manifest.json" + +x-crypto-volumes: &crypto-volumes + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + +services: + # --------------------------------------------------------------------------- + # Authority - EU crypto overlay + # --------------------------------------------------------------------------- + authority: + image: registry.stella-ops.org/stellaops/authority:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/authority:/app/etc/authority:ro + - ../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Signer - EU crypto overlay + # --------------------------------------------------------------------------- + signer: + image: registry.stella-ops.org/stellaops/signer:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Attestor - EU crypto overlay + # --------------------------------------------------------------------------- + attestor: + image: registry.stella-ops.org/stellaops/attestor:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Concelier - EU crypto overlay + # --------------------------------------------------------------------------- + concelier: + image: registry.stella-ops.org/stellaops/concelier:eu + environment: + <<: *crypto-env + volumes: + - concelier-jobs:/var/lib/concelier/jobs + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Scanner Web - EU crypto overlay + # --------------------------------------------------------------------------- + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/scanner:/app/etc/scanner:ro + - 
../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ${SCANNER_OFFLINEKIT_TRUSTROOTS_HOST_PATH:-./offline/trust-roots}:${SCANNER_OFFLINEKIT_TRUSTROOTDIRECTORY:-/etc/stellaops/trust-roots}:ro + - ${SCANNER_OFFLINEKIT_REKOR_SNAPSHOT_HOST_PATH:-./offline/rekor-snapshot}:${SCANNER_OFFLINEKIT_REKORSNAPSHOTDIRECTORY:-/var/lib/stellaops/rekor-snapshot}:ro + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Scanner Worker - EU crypto overlay + # --------------------------------------------------------------------------- + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker:eu + environment: + <<: *crypto-env + volumes: + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Scheduler Worker - EU crypto overlay + # --------------------------------------------------------------------------- + scheduler-worker: + image: registry.stella-ops.org/stellaops/scheduler-worker:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Notify Web - EU crypto overlay + # --------------------------------------------------------------------------- + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/notify:/app/etc/notify:ro + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Excititor - EU crypto overlay + # --------------------------------------------------------------------------- + excititor: + image: registry.stella-ops.org/stellaops/excititor:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Advisory AI Web - EU crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:eu + environment: + <<: 
*crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Advisory AI Worker - EU crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:eu + environment: + <<: *crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.eu.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" + + # --------------------------------------------------------------------------- + # Web UI - EU crypto overlay + # --------------------------------------------------------------------------- + web-ui: + image: registry.stella-ops.org/stellaops/web-ui:eu + labels: + com.stellaops.crypto.profile: "eu" + com.stellaops.compliance: "eidas" diff --git a/deploy/compose/docker-compose.compliance-russia.yml b/deploy/compose/docker-compose.compliance-russia.yml new file mode 100644 index 000000000..d387d5a40 --- /dev/null +++ b/deploy/compose/docker-compose.compliance-russia.yml @@ -0,0 +1,216 @@ +# ============================================================================= +# STELLA OPS - COMPLIANCE OVERLAY: RUSSIA +# ============================================================================= +# GOST R 34.10-2012, GOST R 34.11-2012 (Streebog) crypto overlay. +# This file extends docker-compose.stella-ops.yml with Russia-specific crypto. 
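+# Requires ../../etc/appsettings.crypto.russia.yaml and ../../etc/crypto-plugins-manifest.json
+# on the host (mounted read-only into every service). Certified GOST operations
+# additionally require the CryptoPro CSP overlay shown under Usage below.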
+# +# Usage: +# docker compose -f devops/compose/docker-compose.stella-ops.yml \ +# -f devops/compose/docker-compose.compliance-russia.yml up -d +# +# With CryptoPro CSP: +# docker compose -f devops/compose/docker-compose.stella-ops.yml \ +# -f devops/compose/docker-compose.compliance-russia.yml \ +# -f devops/compose/docker-compose.cryptopro.yml up -d +# +# Cryptography: +# - GOST R 34.10-2012: Digital signature +# - GOST R 34.11-2012: Hash function (Streebog, 256/512-bit) +# - GOST R 34.12-2015: Block cipher (Kuznyechik) +# +# Providers: openssl.gost, pkcs11.gost, cryptopro.gost +# +# ============================================================================= + +x-crypto-env: &crypto-env + STELLAOPS_CRYPTO_PROFILE: "russia" + STELLAOPS_CRYPTO_CONFIG_PATH: "/app/etc/appsettings.crypto.yaml" + STELLAOPS_CRYPTO_MANIFEST_PATH: "/app/etc/crypto-plugins-manifest.json" + STELLAOPS_CRYPTO_PROVIDERS: "openssl.gost,pkcs11.gost,cryptopro.gost" + +x-crypto-volumes: &crypto-volumes + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + +services: + # --------------------------------------------------------------------------- + # Authority - Russia crypto overlay + # --------------------------------------------------------------------------- + authority: + image: registry.stella-ops.org/stellaops/authority:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/authority:/app/etc/authority:ro + - ../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Signer - Russia crypto overlay + # --------------------------------------------------------------------------- + signer: + image: registry.stella-ops.org/stellaops/signer:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Attestor - Russia crypto overlay + # --------------------------------------------------------------------------- + attestor: + image: registry.stella-ops.org/stellaops/attestor:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Concelier - Russia crypto overlay + # --------------------------------------------------------------------------- + concelier: + image: registry.stella-ops.org/stellaops/concelier:russia + environment: + <<: *crypto-env + volumes: + - concelier-jobs:/var/lib/concelier/jobs + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - 
../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Scanner Web - Russia crypto overlay + # --------------------------------------------------------------------------- + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/scanner:/app/etc/scanner:ro + - ../../etc/certificates/trust-roots:/etc/ssl/certs/stellaops:ro + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ${SCANNER_OFFLINEKIT_TRUSTROOTS_HOST_PATH:-./offline/trust-roots}:${SCANNER_OFFLINEKIT_TRUSTROOTDIRECTORY:-/etc/stellaops/trust-roots}:ro + - ${SCANNER_OFFLINEKIT_REKOR_SNAPSHOT_HOST_PATH:-./offline/rekor-snapshot}:${SCANNER_OFFLINEKIT_REKORSNAPSHOTDIRECTORY:-/var/lib/stellaops/rekor-snapshot}:ro + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Scanner Worker - Russia crypto overlay + # --------------------------------------------------------------------------- + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker:russia + environment: + <<: *crypto-env + volumes: + - scanner-surface-cache:/var/lib/stellaops/surface + - ${SURFACE_SECRETS_HOST_PATH:-./offline/surface-secrets}:${SCANNER_SURFACE_SECRETS_ROOT:-/etc/stellaops/secrets}:ro + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Scheduler Worker - Russia crypto overlay + # --------------------------------------------------------------------------- + scheduler-worker: + image: registry.stella-ops.org/stellaops/scheduler-worker:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Notify Web - Russia crypto overlay + # --------------------------------------------------------------------------- + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/notify:/app/etc/notify:ro + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # 
--------------------------------------------------------------------------- + # Excititor - Russia crypto overlay + # --------------------------------------------------------------------------- + excititor: + image: registry.stella-ops.org/stellaops/excititor:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Advisory AI Web - Russia crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Advisory AI Worker - Russia crypto overlay + # --------------------------------------------------------------------------- + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:russia + environment: + <<: *crypto-env + volumes: + - ../../etc/llm-providers:/app/etc/llm-providers:ro + - advisory-ai-queue:/var/lib/advisory-ai/queue + - advisory-ai-plans:/var/lib/advisory-ai/plans + - advisory-ai-outputs:/var/lib/advisory-ai/outputs + - ../../etc/appsettings.crypto.russia.yaml:/app/etc/appsettings.crypto.yaml:ro + - ../../etc/crypto-plugins-manifest.json:/app/etc/crypto-plugins-manifest.json:ro + labels: + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.provider: "openssl.gost,pkcs11.gost,cryptopro.gost" + + # --------------------------------------------------------------------------- + # Web UI - Russia crypto overlay + # --------------------------------------------------------------------------- + web-ui: + image: registry.stella-ops.org/stellaops/web-ui:russia + labels: + com.stellaops.crypto.profile: "russia" diff --git a/deploy/compose/docker-compose.corpus.yml b/deploy/compose/docker-compose.corpus.yml new file mode 100644 index 000000000..a4cb45a5a --- /dev/null +++ b/deploy/compose/docker-compose.corpus.yml @@ -0,0 +1,42 @@ +# ============================================================================= +# CORPUS - FUNCTION BEHAVIOR DATABASE +# ============================================================================= +# PostgreSQL database for function behavior corpus analysis. 
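+# Default connection (host port ${CORPUS_DB_PORT:-5435}; user and database name
+# come from the service definition below):
+#   postgresql://corpus_user:<CORPUS_DB_PASSWORD>@localhost:5435/stellaops_corpus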
+# +# Usage: +# docker compose -f docker-compose.corpus.yml up -d +# +# Environment: +# CORPUS_DB_PASSWORD - PostgreSQL password for corpus database +# ============================================================================= + +services: + corpus-postgres: + image: postgres:18.1-alpine + container_name: stellaops-corpus-db + environment: + POSTGRES_DB: stellaops_corpus + POSTGRES_USER: corpus_user + POSTGRES_PASSWORD: ${CORPUS_DB_PASSWORD:-stellaops_corpus_dev} + POSTGRES_INITDB_ARGS: "-E UTF8 --locale=C" + volumes: + - corpus-data:/var/lib/postgresql/data + - ../../docs/db/schemas/corpus.sql:/docker-entrypoint-initdb.d/10-corpus-schema.sql:ro + - ../docker/corpus/scripts/init-test-data.sql:/docker-entrypoint-initdb.d/20-test-data.sql:ro + ports: + - "${CORPUS_DB_PORT:-5435}:5432" + networks: + - stellaops-corpus + healthcheck: + test: ["CMD-SHELL", "pg_isready -U corpus_user -d stellaops_corpus"] + interval: 10s + timeout: 5s + retries: 5 + restart: unless-stopped + +volumes: + corpus-data: + +networks: + stellaops-corpus: + driver: bridge diff --git a/deploy/compose/docker-compose.crypto-sim.yml b/deploy/compose/docker-compose.crypto-sim.yml new file mode 100644 index 000000000..73f794609 --- /dev/null +++ b/deploy/compose/docker-compose.crypto-sim.yml @@ -0,0 +1,119 @@ +# ============================================================================= +# STELLA OPS - CRYPTO SIMULATION OVERLAY +# ============================================================================= +# Universal crypto simulation service for testing sovereign crypto without +# licensed hardware or certified modules. +# +# This overlay provides the sim-crypto-service which simulates: +# - GOST R 34.10-2012 (Russia): GOST12-256, GOST12-512, ru.magma.sim, ru.kuznyechik.sim +# - SM2/SM3/SM4 (China): SM2, sm.sim, sm2.sim +# - Post-Quantum: DILITHIUM3, FALCON512, pq.sim +# - FIPS/eIDAS/KCMVP: fips.sim, eidas.sim, kcmvp.sim, world.sim +# +# Usage with China compliance: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-china.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Usage with Russia compliance: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-russia.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Usage with EU compliance: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-eu.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# IMPORTANT: This is for TESTING/DEVELOPMENT ONLY. 
+# - Uses deterministic HMAC-SHA256 for SM/GOST/PQ (not real algorithms) +# - Uses static ECDSA P-256 key for FIPS/eIDAS/KCMVP +# - NOT suitable for production or compliance certification +# +# ============================================================================= + +x-crypto-sim-labels: &crypto-sim-labels + com.stellaops.component: "crypto-sim" + com.stellaops.profile: "simulation" + com.stellaops.production: "false" + +x-sim-crypto-env: &sim-crypto-env + STELLAOPS_CRYPTO_ENABLE_SIM: "1" + STELLAOPS_CRYPTO_SIM_URL: "http://sim-crypto:8080" + +networks: + stellaops: + external: true + name: stellaops + +services: + # --------------------------------------------------------------------------- + # Sim Crypto Service - Universal sovereign crypto simulator + # --------------------------------------------------------------------------- + sim-crypto: + build: + context: ../services/crypto/sim-crypto-service + dockerfile: Dockerfile + image: registry.stella-ops.org/stellaops/sim-crypto:dev + container_name: stellaops-sim-crypto + restart: unless-stopped + environment: + ASPNETCORE_URLS: "http://0.0.0.0:8080" + ASPNETCORE_ENVIRONMENT: "Development" + ports: + - "${SIM_CRYPTO_PORT:-18090}:8080" + networks: + - stellaops + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/keys"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 10s + labels: *crypto-sim-labels + + # --------------------------------------------------------------------------- + # Override services to use sim-crypto + # --------------------------------------------------------------------------- + + # Authority - Enable sim crypto + authority: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" + + # Signer - Enable sim crypto + signer: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" + + # Attestor - Enable sim crypto + attestor: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" + + # Scanner Web - Enable sim crypto + scanner-web: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" + + # Scanner Worker - Enable sim crypto + scanner-worker: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" + + # Excititor - Enable sim crypto + excititor: + environment: + <<: *sim-crypto-env + labels: + com.stellaops.crypto.simulator: "enabled" diff --git a/deploy/compose/docker-compose.cryptopro.yml b/deploy/compose/docker-compose.cryptopro.yml new file mode 100644 index 000000000..eec9c6040 --- /dev/null +++ b/deploy/compose/docker-compose.cryptopro.yml @@ -0,0 +1,149 @@ +# ============================================================================= +# STELLA OPS - CRYPTOPRO CSP OVERLAY (Russia) +# ============================================================================= +# CryptoPro CSP licensed provider overlay for compliance-russia.yml. +# Adds real CryptoPro CSP service for certified GOST R 34.10-2012 operations. +# +# IMPORTANT: Requires EULA acceptance before use. 
+# +# Usage (MUST be combined with stella-ops AND compliance-russia): +# CRYPTOPRO_ACCEPT_EULA=1 docker compose \ +# -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-russia.yml \ +# -f docker-compose.cryptopro.yml up -d +# +# For development/testing without CryptoPro license, use crypto-sim.yml instead: +# docker compose \ +# -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-russia.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Requirements: +# - CryptoPro CSP license files in opt/cryptopro/downloads/ +# - CRYPTOPRO_ACCEPT_EULA=1 environment variable +# - CryptoPro container images with GOST engine +# +# GOST Algorithms Provided: +# - GOST R 34.10-2012: Digital signature (256/512-bit) +# - GOST R 34.11-2012: Hash function (Streebog, 256/512-bit) +# - GOST R 34.12-2015: Block cipher (Kuznyechik, Magma) +# +# ============================================================================= + +x-cryptopro-labels: &cryptopro-labels + com.stellaops.component: "cryptopro-csp" + com.stellaops.crypto.provider: "cryptopro" + com.stellaops.crypto.profile: "russia" + com.stellaops.crypto.certified: "true" + +x-cryptopro-env: &cryptopro-env + STELLAOPS_CRYPTO_PROVIDERS: "cryptopro.gost" + STELLAOPS_CRYPTO_CRYPTOPRO_URL: "http://cryptopro-csp:8080" + STELLAOPS_CRYPTO_CRYPTOPRO_ENABLED: "true" + +networks: + stellaops: + external: true + name: stellaops + +services: + # --------------------------------------------------------------------------- + # CryptoPro CSP - Certified GOST cryptography provider + # --------------------------------------------------------------------------- + cryptopro-csp: + build: + context: ../.. + dockerfile: devops/services/cryptopro/linux-csp-service/Dockerfile + args: + CRYPTOPRO_ACCEPT_EULA: "${CRYPTOPRO_ACCEPT_EULA:-0}" + image: registry.stella-ops.org/stellaops/cryptopro-csp:2025.10.0 + container_name: stellaops-cryptopro-csp + restart: unless-stopped + environment: + ASPNETCORE_URLS: "http://0.0.0.0:8080" + CRYPTOPRO_ACCEPT_EULA: "${CRYPTOPRO_ACCEPT_EULA:-0}" + # GOST algorithm configuration + CRYPTOPRO_GOST_SIGNATURE_ALGORITHM: "GOST R 34.10-2012" + CRYPTOPRO_GOST_HASH_ALGORITHM: "GOST R 34.11-2012" + # Container and key store settings + CRYPTOPRO_CONTAINER_NAME: "${CRYPTOPRO_CONTAINER_NAME:-stellaops-signing}" + CRYPTOPRO_USE_MACHINE_STORE: "${CRYPTOPRO_USE_MACHINE_STORE:-true}" + CRYPTOPRO_PROVIDER_TYPE: "${CRYPTOPRO_PROVIDER_TYPE:-80}" + volumes: + - ../../opt/cryptopro/downloads:/opt/cryptopro/downloads:ro + - ../../etc/cryptopro:/app/etc/cryptopro:ro + # Optional: Mount key containers + - cryptopro-keys:/var/opt/cprocsp/keys + ports: + - "${CRYPTOPRO_PORT:-18080}:8080" + networks: + - stellaops + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 30s + labels: *cryptopro-labels + + # --------------------------------------------------------------------------- + # Override services to use CryptoPro + # --------------------------------------------------------------------------- + + # Authority - Use CryptoPro for GOST signatures + authority: + environment: + <<: *cryptopro-env + depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + + # Signer - Use CryptoPro for GOST signatures + signer: + environment: + <<: *cryptopro-env + depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + + # Attestor - Use CryptoPro for GOST signatures + attestor: + environment: + <<: *cryptopro-env + 
depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + + # Scanner Web - Use CryptoPro for verification + scanner-web: + environment: + <<: *cryptopro-env + depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + + # Scanner Worker - Use CryptoPro for verification + scanner-worker: + environment: + <<: *cryptopro-env + depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + + # Excititor - Use CryptoPro for VEX signing + excititor: + environment: + <<: *cryptopro-env + depends_on: + - cryptopro-csp + labels: + com.stellaops.crypto.provider: "cryptopro" + +volumes: + cryptopro-keys: + name: stellaops-cryptopro-keys diff --git a/deploy/compose/docker-compose.dev.yml b/deploy/compose/docker-compose.dev.yml new file mode 100644 index 000000000..ada7997ac --- /dev/null +++ b/deploy/compose/docker-compose.dev.yml @@ -0,0 +1,73 @@ +# ============================================================================= +# DEVELOPMENT STACK - MINIMAL LOCAL DEVELOPMENT +# ============================================================================= +# Minimal infrastructure for local development. Use this when you only need +# the core infrastructure without all application services. +# +# For full platform, use docker-compose.stella-ops.yml instead. +# +# Usage: +# docker compose -f docker-compose.dev.yml up -d +# +# This provides: +# - PostgreSQL 18.1 on port 5432 +# - Valkey 9.0.1 on port 6379 +# - RustFS on port 8080 +# ============================================================================= + +services: + postgres: + image: postgres:18.1-alpine + container_name: stellaops-dev-postgres + restart: unless-stopped + environment: + POSTGRES_USER: ${POSTGRES_USER:-stellaops} + POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-stellaops} + POSTGRES_DB: ${POSTGRES_DB:-stellaops_dev} + volumes: + - postgres-data:/var/lib/postgresql/data + ports: + - "${POSTGRES_PORT:-5432}:5432" + healthcheck: + test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-stellaops}"] + interval: 10s + timeout: 5s + retries: 5 + + valkey: + image: valkey/valkey:9.0.1-alpine + container_name: stellaops-dev-valkey + restart: unless-stopped + command: ["valkey-server", "--appendonly", "yes"] + volumes: + - valkey-data:/data + ports: + - "${VALKEY_PORT:-6379}:6379" + healthcheck: + test: ["CMD", "valkey-cli", "ping"] + interval: 10s + timeout: 5s + retries: 5 + + rustfs: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + container_name: stellaops-dev-rustfs + restart: unless-stopped + command: ["serve", "--listen", "0.0.0.0:8080", "--root", "/data"] + environment: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumes: + - rustfs-data:/data + ports: + - "${RUSTFS_PORT:-8080}:8080" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/health"] + interval: 30s + timeout: 10s + retries: 3 + +volumes: + postgres-data: + valkey-data: + rustfs-data: diff --git a/deploy/compose/docker-compose.gpu.yaml b/deploy/compose/docker-compose.gpu.yaml new file mode 100644 index 000000000..999330cfe --- /dev/null +++ b/deploy/compose/docker-compose.gpu.yaml @@ -0,0 +1,40 @@ +# ============================================================================= +# STELLA OPS GPU OVERLAY +# ============================================================================= +# Enables NVIDIA GPU acceleration for Advisory AI inference services. 
+# +# Prerequisites: +# - NVIDIA GPU with CUDA support +# - nvidia-container-toolkit installed +# - Docker configured with nvidia runtime +# +# Usage: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.gpu.yaml up -d +# +# ============================================================================= + +services: + advisory-ai-worker: + deploy: + resources: + reservations: + devices: + - capabilities: [gpu] + driver: nvidia + count: 1 + environment: + ADVISORY_AI_INFERENCE_GPU: "true" + runtime: nvidia + + advisory-ai-web: + deploy: + resources: + reservations: + devices: + - capabilities: [gpu] + driver: nvidia + count: 1 + environment: + ADVISORY_AI_INFERENCE_GPU: "true" + runtime: nvidia diff --git a/deploy/compose/docker-compose.sealed-ci.yml b/deploy/compose/docker-compose.sealed-ci.yml new file mode 100644 index 000000000..e677a7acd --- /dev/null +++ b/deploy/compose/docker-compose.sealed-ci.yml @@ -0,0 +1,121 @@ +# ============================================================================= +# SEALED CI - AIR-GAPPED TESTING ENVIRONMENT +# ============================================================================= +# Sealed/air-gapped CI environment for testing offline functionality. +# All services run in isolated network with no external egress. +# +# Usage: +# docker compose -f docker-compose.sealed-ci.yml up -d +# ============================================================================= + +x-release-labels: &release-labels + com.stellaops.profile: 'sealed-ci' + com.stellaops.airgap.mode: 'sealed' + +networks: + sealed-ci: + driver: bridge + +volumes: + sealed-postgres-data: + sealed-valkey-data: + +services: + postgres: + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + restart: unless-stopped + environment: + POSTGRES_USER: sealedci + POSTGRES_PASSWORD: sealedci-secret + POSTGRES_DB: stellaops + volumes: + - sealed-postgres-data:/var/lib/postgresql/data + networks: + - sealed-ci + healthcheck: + test: ["CMD-SHELL", "pg_isready -U sealedci -d stellaops"] + interval: 10s + timeout: 5s + retries: 5 + labels: *release-labels + + valkey: + image: docker.io/valkey/valkey:9.0.1-alpine + restart: unless-stopped + command: ["valkey-server", "--appendonly", "yes"] + volumes: + - sealed-valkey-data:/data + networks: + - sealed-ci + healthcheck: + test: ["CMD", "valkey-cli", "ping"] + interval: 10s + timeout: 5s + retries: 5 + labels: *release-labels + + authority: + image: registry.stella-ops.org/stellaops/authority@sha256:a8e8faec44a579aa5714e58be835f25575710430b1ad2ccd1282a018cd9ffcdd + depends_on: + postgres: + condition: service_healthy + valkey: + condition: service_healthy + restart: unless-stopped + environment: + ASPNETCORE_URLS: http://+:5088 + STELLAOPS_AUTHORITY__ISSUER: http://authority.sealed-ci.local + STELLAOPS_AUTHORITY__STORAGE__DRIVER: postgres + STELLAOPS_AUTHORITY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres;Port=5432;Database=authority;Username=sealedci;Password=sealedci-secret" + STELLAOPS_AUTHORITY__CACHE__REDIS__CONNECTIONSTRING: "valkey:6379" + STELLAOPS_AUTHORITY__PLUGINDIRECTORIES__0: /app/plugins + STELLAOPS_AUTHORITY__PLUGINS__CONFIGURATIONDIRECTORY: /app/plugins + STELLAOPS_AUTHORITY__SECURITY__SENDERCONSTRAINTS__DPOP__ENABLED: 'true' + STELLAOPS_AUTHORITY__SECURITY__SENDERCONSTRAINTS__MTLS__ENABLED: 'true' + STELLAOPS_AUTHORITY__AIRGAP__EGRESS__MODE: Sealed + volumes: + - ../services/sealed-mode-ci/authority.harness.yaml:/etc/authority.yaml:ro + - 
../services/sealed-mode-ci/plugins:/app/plugins:ro + - ../../certificates:/certificates:ro + ports: + - '5088:5088' + networks: + - sealed-ci + labels: *release-labels + + signer: + image: registry.stella-ops.org/stellaops/signer@sha256:8bfef9a75783883d49fc18e3566553934e970b00ee090abee9cb110d2d5c3298 + depends_on: + - authority + restart: unless-stopped + environment: + ASPNETCORE_URLS: http://+:6088 + SIGNER__AUTHORITY__BASEURL: http://authority:5088 + SIGNER__POE__INTROSPECTURL: http://authority:5088/device-code + SIGNER__STORAGE__DRIVER: postgres + SIGNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres;Port=5432;Database=signer;Username=sealedci;Password=sealedci-secret" + SIGNER__CACHE__REDIS__CONNECTIONSTRING: "valkey:6379" + SIGNER__SEALED__MODE: Enabled + ports: + - '6088:6088' + networks: + - sealed-ci + labels: *release-labels + + attestor: + image: registry.stella-ops.org/stellaops/attestor@sha256:5cc417948c029da01dccf36e4645d961a3f6d8de7e62fe98d845f07cd2282114 + depends_on: + - signer + restart: unless-stopped + environment: + ASPNETCORE_URLS: http://+:7088 + ATTESTOR__SIGNER__BASEURL: http://signer:6088 + ATTESTOR__STORAGE__DRIVER: postgres + ATTESTOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres;Port=5432;Database=attestor;Username=sealedci;Password=sealedci-secret" + ATTESTOR__CACHE__REDIS__CONNECTIONSTRING: "valkey:6379" + ATTESTOR__SEALED__MODE: Enabled + ports: + - '7088:7088' + networks: + - sealed-ci + labels: *release-labels diff --git a/deploy/compose/docker-compose.sm-remote.yml b/deploy/compose/docker-compose.sm-remote.yml new file mode 100644 index 000000000..78143d025 --- /dev/null +++ b/deploy/compose/docker-compose.sm-remote.yml @@ -0,0 +1,153 @@ +# ============================================================================= +# STELLA OPS - SM REMOTE OVERLAY (China) +# ============================================================================= +# SM Remote service overlay for compliance-china.yml. +# Provides SM2/SM3/SM4 (ShangMi) cryptographic operations via software provider +# or integration with OSCCA-certified hardware security modules. 
+# +# Usage (MUST be combined with stella-ops AND compliance-china): +# docker compose \ +# -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-china.yml \ +# -f docker-compose.sm-remote.yml up -d +# +# For development/testing without SM hardware, use crypto-sim.yml instead: +# docker compose \ +# -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-china.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# SM Algorithms Provided: +# - SM2: Public key cryptography (ECDSA-like, 256-bit curve) - GM/T 0003-2012 +# - SM3: Cryptographic hash function (256-bit output) - GM/T 0004-2012 +# - SM4: Block cipher (128-bit key/block, AES-like) - GM/T 0002-2012 +# - SM9: Identity-based cryptography - GM/T 0044-2016 +# +# Providers: +# - cn.sm.soft: Software-only implementation using BouncyCastle +# - cn.sm.remote.http: Remote HSM integration via HTTP API +# +# OSCCA Compliance: +# - All cryptographic operations use SM algorithms exclusively +# - Hardware Security Modules should be OSCCA-certified +# - Certificates comply with GM/T 0015 (Certificate Profile) +# +# ============================================================================= + +x-sm-remote-labels: &sm-remote-labels + com.stellaops.component: "sm-remote" + com.stellaops.crypto.provider: "sm" + com.stellaops.crypto.profile: "china" + com.stellaops.crypto.jurisdiction: "china" + +x-sm-remote-env: &sm-remote-env + STELLAOPS_CRYPTO_PROVIDERS: "cn.sm.soft,cn.sm.remote.http" + STELLAOPS_CRYPTO_SM_REMOTE_URL: "http://sm-remote:56080" + STELLAOPS_CRYPTO_SM_ENABLED: "true" + SM_SOFT_ALLOWED: "1" + +networks: + stellaops: + external: true + name: stellaops + +services: + # --------------------------------------------------------------------------- + # SM Remote Service - ShangMi cryptography provider + # --------------------------------------------------------------------------- + sm-remote: + build: + context: ../.. 
+ dockerfile: devops/services/sm-remote/Dockerfile + image: registry.stella-ops.org/stellaops/sm-remote:2025.10.0 + container_name: stellaops-sm-remote + restart: unless-stopped + environment: + ASPNETCORE_URLS: "http://0.0.0.0:56080" + ASPNETCORE_ENVIRONMENT: "Production" + # Enable software-only SM2 provider (for testing/development) + SM_SOFT_ALLOWED: "${SM_SOFT_ALLOWED:-1}" + # Optional: Remote HSM configuration (for production with OSCCA-certified HSM) + SM_REMOTE_HSM_URL: "${SM_REMOTE_HSM_URL:-}" + SM_REMOTE_HSM_API_KEY: "${SM_REMOTE_HSM_API_KEY:-}" + SM_REMOTE_HSM_TIMEOUT: "${SM_REMOTE_HSM_TIMEOUT:-30000}" + # Optional: Client certificate authentication for HSM + SM_REMOTE_CLIENT_CERT_PATH: "${SM_REMOTE_CLIENT_CERT_PATH:-}" + SM_REMOTE_CLIENT_CERT_PASSWORD: "${SM_REMOTE_CLIENT_CERT_PASSWORD:-}" + volumes: + - ../../etc/sm-remote:/app/etc/sm-remote:ro + # Optional: Mount SM key containers + - sm-remote-keys:/var/lib/stellaops/sm-keys + ports: + - "${SM_REMOTE_PORT:-56080}:56080" + networks: + - stellaops + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:56080/status"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 15s + labels: *sm-remote-labels + + # --------------------------------------------------------------------------- + # Override services to use SM Remote + # --------------------------------------------------------------------------- + + # Authority - Use SM Remote for SM2 signatures + authority: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + + # Signer - Use SM Remote for SM2 signatures + signer: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + + # Attestor - Use SM Remote for SM2 signatures + attestor: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + + # Scanner Web - Use SM Remote for verification + scanner-web: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + + # Scanner Worker - Use SM Remote for verification + scanner-worker: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + + # Excititor - Use SM Remote for VEX signing + excititor: + environment: + <<: *sm-remote-env + depends_on: + - sm-remote + labels: + com.stellaops.crypto.provider: "sm" + +volumes: + sm-remote-keys: + name: stellaops-sm-remote-keys diff --git a/deploy/compose/docker-compose.telemetry-offline.yml b/deploy/compose/docker-compose.telemetry-offline.yml new file mode 100644 index 000000000..6b35f3b69 --- /dev/null +++ b/deploy/compose/docker-compose.telemetry-offline.yml @@ -0,0 +1,90 @@ +# ============================================================================= +# TELEMETRY OFFLINE - AIR-GAPPED OBSERVABILITY +# ============================================================================= +# Offline-compatible telemetry stack for air-gapped deployments. +# Does not require external connectivity. +# +# Usage: +# docker compose -f docker-compose.telemetry-offline.yml up -d +# +# For online deployments, use docker-compose.telemetry.yml instead. 
+# ============================================================================= + +services: + loki: + image: grafana/loki:3.0.1 + container_name: stellaops-loki-offline + command: ["-config.file=/etc/loki/local-config.yaml"] + volumes: + - loki-data:/loki + - ../offline/airgap/observability/loki-config.yaml:/etc/loki/local-config.yaml:ro + ports: + - "${LOKI_PORT:-3100}:3100" + networks: + - sealed + restart: unless-stopped + + promtail: + image: grafana/promtail:3.0.1 + container_name: stellaops-promtail-offline + command: ["-config.file=/etc/promtail/config.yml"] + volumes: + - promtail-data:/var/log + - ../offline/airgap/promtail-config.yaml:/etc/promtail/config.yml:ro + networks: + - sealed + restart: unless-stopped + + otel-collector: + image: otel/opentelemetry-collector-contrib:0.97.0 + container_name: stellaops-otel-offline + command: ["--config=/etc/otel/config.yaml"] + volumes: + - ../offline/airgap/otel-offline.yaml:/etc/otel/config.yaml:ro + - otel-data:/var/otel + ports: + - "${OTEL_GRPC_PORT:-4317}:4317" + - "${OTEL_HTTP_PORT:-4318}:4318" + networks: + - sealed + restart: unless-stopped + + tempo: + image: grafana/tempo:2.4.1 + container_name: stellaops-tempo-offline + command: ["-config.file=/etc/tempo/config.yaml"] + volumes: + - tempo-data:/var/tempo + - ../offline/airgap/observability/tempo-config.yaml:/etc/tempo/config.yaml:ro + ports: + - "${TEMPO_PORT:-3200}:3200" + networks: + - sealed + restart: unless-stopped + + prometheus: + image: prom/prometheus:v2.51.0 + container_name: stellaops-prometheus-offline + command: + - '--config.file=/etc/prometheus/prometheus.yml' + - '--storage.tsdb.path=/prometheus' + - '--storage.tsdb.retention.time=15d' + volumes: + - prometheus-data:/prometheus + - ../offline/airgap/observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro + ports: + - "${PROMETHEUS_PORT:-9090}:9090" + networks: + - sealed + restart: unless-stopped + +networks: + sealed: + driver: bridge + +volumes: + loki-data: + promtail-data: + otel-data: + tempo-data: + prometheus-data: diff --git a/deploy/compose/docker-compose.telemetry.yml b/deploy/compose/docker-compose.telemetry.yml new file mode 100644 index 000000000..eca075313 --- /dev/null +++ b/deploy/compose/docker-compose.telemetry.yml @@ -0,0 +1,144 @@ +# ============================================================================= +# STELLA OPS - TELEMETRY STACK +# ============================================================================= +# All-in-one observability: OpenTelemetry Collector, Prometheus, Tempo, Loki +# +# Usage: +# docker compose -f devops/compose/docker-compose.telemetry.yml up -d +# +# With main stack: +# docker compose -f devops/compose/docker-compose.stella-ops.yml \ +# -f devops/compose/docker-compose.telemetry.yml up -d +# +# ============================================================================= + +x-telemetry-labels: &telemetry-labels + com.stellaops.component: "telemetry" + com.stellaops.profile: "observability" + +networks: + stellaops-telemetry: + driver: bridge + name: stellaops-telemetry + stellaops: + external: true + name: stellaops + +volumes: + prometheus-data: + tempo-data: + loki-data: + +services: + # --------------------------------------------------------------------------- + # OpenTelemetry Collector - Unified telemetry ingestion + # --------------------------------------------------------------------------- + otel-collector: + image: otel/opentelemetry-collector:0.105.0 + container_name: stellaops-otel-collector + restart: unless-stopped + command: + - 
"--config=/etc/otel-collector/config.yaml" + environment: + STELLAOPS_OTEL_TLS_CERT: /etc/otel-collector/tls/collector.crt + STELLAOPS_OTEL_TLS_KEY: /etc/otel-collector/tls/collector.key + STELLAOPS_OTEL_TLS_CA: /etc/otel-collector/tls/ca.crt + STELLAOPS_OTEL_PROMETHEUS_ENDPOINT: 0.0.0.0:9464 + STELLAOPS_OTEL_REQUIRE_CLIENT_CERT: "true" + STELLAOPS_TENANT_ID: ${STELLAOPS_TENANT_ID:-default} + STELLAOPS_TEMPO_ENDPOINT: http://tempo:3200 + STELLAOPS_TEMPO_TLS_CERT_FILE: /etc/otel-collector/tls/client.crt + STELLAOPS_TEMPO_TLS_KEY_FILE: /etc/otel-collector/tls/client.key + STELLAOPS_TEMPO_TLS_CA_FILE: /etc/otel-collector/tls/ca.crt + STELLAOPS_LOKI_ENDPOINT: http://loki:3100/loki/api/v1/push + STELLAOPS_LOKI_TLS_CERT_FILE: /etc/otel-collector/tls/client.crt + STELLAOPS_LOKI_TLS_KEY_FILE: /etc/otel-collector/tls/client.key + STELLAOPS_LOKI_TLS_CA_FILE: /etc/otel-collector/tls/ca.crt + volumes: + - ../telemetry/otel-collector-config.yaml:/etc/otel-collector/config.yaml:ro + - ../telemetry/certs:/etc/otel-collector/tls:ro + ports: + - "${OTEL_GRPC_PORT:-4317}:4317" # OTLP gRPC + - "${OTEL_HTTP_PORT:-4318}:4318" # OTLP HTTP + - "${OTEL_PROMETHEUS_PORT:-9464}:9464" # Prometheus exporter + - "${OTEL_HEALTH_PORT:-13133}:13133" # Health check + - "${OTEL_PPROF_PORT:-1777}:1777" # pprof + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:13133/healthz"] + interval: 30s + start_period: 15s + timeout: 5s + retries: 3 + networks: + - stellaops-telemetry + - stellaops + labels: *telemetry-labels + + # --------------------------------------------------------------------------- + # Prometheus - Metrics storage + # --------------------------------------------------------------------------- + prometheus: + image: prom/prometheus:v2.53.0 + container_name: stellaops-prometheus + restart: unless-stopped + command: + - "--config.file=/etc/prometheus/prometheus.yaml" + - "--storage.tsdb.path=/prometheus" + - "--storage.tsdb.retention.time=${PROMETHEUS_RETENTION:-15d}" + - "--web.enable-lifecycle" + volumes: + - ../telemetry/storage/prometheus.yaml:/etc/prometheus/prometheus.yaml:ro + - prometheus-data:/prometheus + - ../telemetry/certs:/etc/telemetry/tls:ro + - ../telemetry/storage/auth:/etc/telemetry/auth:ro + environment: + PROMETHEUS_COLLECTOR_TARGET: otel-collector:9464 + ports: + - "${PROMETHEUS_PORT:-9090}:9090" + depends_on: + - otel-collector + networks: + - stellaops-telemetry + labels: *telemetry-labels + + # --------------------------------------------------------------------------- + # Tempo - Distributed tracing backend + # --------------------------------------------------------------------------- + tempo: + image: grafana/tempo:2.5.0 + container_name: stellaops-tempo + restart: unless-stopped + command: + - "-config.file=/etc/tempo/tempo.yaml" + volumes: + - ../telemetry/storage/tempo.yaml:/etc/tempo/tempo.yaml:ro + - ../telemetry/storage/tenants/tempo-overrides.yaml:/etc/telemetry/tenants/tempo-overrides.yaml:ro + - ../telemetry/certs:/etc/telemetry/tls:ro + - tempo-data:/var/tempo + environment: + TEMPO_ZONE: docker + ports: + - "${TEMPO_PORT:-3200}:3200" + networks: + - stellaops-telemetry + labels: *telemetry-labels + + # --------------------------------------------------------------------------- + # Loki - Log aggregation + # --------------------------------------------------------------------------- + loki: + image: grafana/loki:3.1.0 + container_name: stellaops-loki + restart: unless-stopped + command: + - "-config.file=/etc/loki/loki.yaml" + volumes: + - 
../telemetry/storage/loki.yaml:/etc/loki/loki.yaml:ro
+      - ../telemetry/storage/tenants/loki-overrides.yaml:/etc/telemetry/tenants/loki-overrides.yaml:ro
+      - ../telemetry/certs:/etc/telemetry/tls:ro
+      - loki-data:/var/loki
+    ports:
+      - "${LOKI_PORT:-3100}:3100"
+    networks:
+      - stellaops-telemetry
+    labels: *telemetry-labels
diff --git a/deploy/compose/docker-compose.testing.yml b/deploy/compose/docker-compose.testing.yml
new file mode 100644
index 000000000..d3540b9f6
--- /dev/null
+++ b/deploy/compose/docker-compose.testing.yml
@@ -0,0 +1,327 @@
+# =============================================================================
+# STELLA OPS - TESTING STACK
+# =============================================================================
+# Consolidated CI, mock services, and Gitea for integration testing.
+# Uses different ports to avoid conflicts with development/production services.
+#
+# Usage:
+#   docker compose -f deploy/compose/docker-compose.testing.yml up -d
+#
+# CI infrastructure only:
+#   docker compose -f deploy/compose/docker-compose.testing.yml --profile ci up -d
+#
+# Mock services only:
+#   docker compose -f deploy/compose/docker-compose.testing.yml --profile mock up -d
+#
+# Gitea only:
+#   docker compose -f deploy/compose/docker-compose.testing.yml --profile gitea up -d
+#
+# =============================================================================
+
+x-testing-labels: &testing-labels
+  com.stellaops.profile: "testing"
+  com.stellaops.environment: "ci"
+
+networks:
+  testing-net:
+    driver: bridge
+    name: stellaops-testing
+
+volumes:
+  # CI volumes
+  ci-postgres-data:
+    name: stellaops-ci-postgres
+  ci-valkey-data:
+    name: stellaops-ci-valkey
+  ci-rustfs-data:
+    name: stellaops-ci-rustfs
+  # Gitea volumes
+  gitea-data:
+  gitea-config:
+
+services:
+  # ===========================================================================
+  # CI INFRASTRUCTURE (different ports to avoid conflicts)
+  # ===========================================================================
+
+  # ---------------------------------------------------------------------------
+  # PostgreSQL 18.1 - Test database (port 5433)
+  # ---------------------------------------------------------------------------
+  postgres-test:
+    image: postgres:18.1-alpine
+    container_name: stellaops-postgres-test
+    profiles: ["ci", "all"]
+    environment:
+      POSTGRES_USER: stellaops_ci
+      POSTGRES_PASSWORD: ci_test_password
+      POSTGRES_DB: stellaops_test
+      POSTGRES_INITDB_ARGS: "--data-checksums"
+    ports:
+      - "${TEST_POSTGRES_PORT:-5433}:5432"
+    volumes:
+      - ci-postgres-data:/var/lib/postgresql/data
+    networks:
+      - testing-net
+    healthcheck:
+      test: ["CMD-SHELL", "pg_isready -U stellaops_ci -d stellaops_test"]
+      interval: 5s
+      timeout: 5s
+      retries: 10
+      start_period: 10s
+    restart: unless-stopped
+    labels: *testing-labels
+
+  # ---------------------------------------------------------------------------
+  # Valkey 9.0.1 - Test cache/queue (port 6380)
+  # ---------------------------------------------------------------------------
+  valkey-test:
+    image: valkey/valkey:9.0.1-alpine
+    container_name: stellaops-valkey-test
+    profiles: ["ci", "all"]
+    command: ["valkey-server", "--appendonly", "yes", "--maxmemory", "256mb", "--maxmemory-policy", "allkeys-lru"]
+    ports:
+      - "${TEST_VALKEY_PORT:-6380}:6379"
+    volumes:
+      - ci-valkey-data:/data
+    networks:
+      - testing-net
+    healthcheck:
+      test: ["CMD", "valkey-cli", "ping"]
+      interval: 5s
+      timeout: 5s
+      retries: 5
+    restart: unless-stopped
+    labels: *testing-labels
+
+  #
--------------------------------------------------------------------------- + # RustFS - Test artifact storage (port 8180) + # --------------------------------------------------------------------------- + rustfs-test: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + container_name: stellaops-rustfs-test + profiles: ["ci", "all"] + command: ["serve", "--listen", "0.0.0.0:8080", "--root", "/data"] + environment: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + ports: + - "${TEST_RUSTFS_PORT:-8180}:8080" + volumes: + - ci-rustfs-data:/data + networks: + - testing-net + restart: unless-stopped + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Mock Container Registry (port 5001) + # --------------------------------------------------------------------------- + mock-registry: + image: registry:2 + container_name: stellaops-registry-test + profiles: ["ci", "all"] + ports: + - "${TEST_REGISTRY_PORT:-5001}:5000" + environment: + REGISTRY_STORAGE_DELETE_ENABLED: "true" + networks: + - testing-net + restart: unless-stopped + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Sigstore CLI tools (on-demand) + # --------------------------------------------------------------------------- + rekor-cli: + image: ghcr.io/sigstore/rekor-cli:v1.4.3 + entrypoint: ["rekor-cli"] + command: ["version"] + profiles: ["sigstore"] + networks: + - testing-net + labels: *testing-labels + + cosign: + image: ghcr.io/sigstore/cosign:v3.0.4 + entrypoint: ["cosign"] + command: ["version"] + profiles: ["sigstore"] + networks: + - testing-net + labels: *testing-labels + + # =========================================================================== + # MOCK SERVICES (for extended integration testing) + # =========================================================================== + + # --------------------------------------------------------------------------- + # Orchestrator mock + # --------------------------------------------------------------------------- + orchestrator: + image: registry.stella-ops.org/stellaops/orchestrator@sha256:97f12856ce870bafd3328bda86833bcccbf56d255941d804966b5557f6610119 + container_name: stellaops-orchestrator-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.Orchestrator.WebService.dll"] + depends_on: + - postgres-test + - valkey-test + environment: + ORCHESTRATOR__STORAGE__DRIVER: "postgres" + ORCHESTRATOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + ORCHESTRATOR__QUEUE__DRIVER: "valkey" + ORCHESTRATOR__QUEUE__VALKEY__URL: "valkey-test:6379" + networks: + - testing-net + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Policy Registry mock + # --------------------------------------------------------------------------- + policy-registry: + image: registry.stella-ops.org/stellaops/policy-registry@sha256:c6cad8055e9827ebcbebb6ad4d6866dce4b83a0a49b0a8a6500b736a5cb26fa7 + container_name: stellaops-policy-registry-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.Policy.Engine.dll"] + depends_on: + - postgres-test + environment: + POLICY__STORAGE__DRIVER: "postgres" + POLICY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + networks: + - testing-net + labels: *testing-labels + + # 
--------------------------------------------------------------------------- + # VEX Lens mock + # --------------------------------------------------------------------------- + vex-lens: + image: registry.stella-ops.org/stellaops/vex-lens@sha256:b44e63ecfeebc345a70c073c1ce5ace709c58be0ffaad0e2862758aeee3092fb + container_name: stellaops-vex-lens-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.VexLens.dll"] + depends_on: + - postgres-test + environment: + VEXLENS__STORAGE__DRIVER: "postgres" + VEXLENS__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + networks: + - testing-net + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Findings Ledger mock + # --------------------------------------------------------------------------- + findings-ledger: + image: registry.stella-ops.org/stellaops/findings-ledger@sha256:71d4c361ba8b2f8b69d652597bc3f2efc8a64f93fab854ce25272a88506df49c + container_name: stellaops-findings-ledger-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.Findings.Ledger.WebService.dll"] + depends_on: + - postgres-test + environment: + FINDINGSLEDGER__STORAGE__DRIVER: "postgres" + FINDINGSLEDGER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + networks: + - testing-net + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Vuln Explorer API mock + # --------------------------------------------------------------------------- + vuln-explorer-api: + image: registry.stella-ops.org/stellaops/vuln-explorer-api@sha256:7fc7e43a05cbeb0106ce7d4d634612e83de6fdc119aaab754a71c1d60b82841d + container_name: stellaops-vuln-explorer-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.VulnExplorer.Api.dll"] + depends_on: + - findings-ledger + networks: + - testing-net + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Packs Registry mock + # --------------------------------------------------------------------------- + packs-registry: + image: registry.stella-ops.org/stellaops/packs-registry@sha256:1f5e9416c4dc608594ad6fad87c24d72134427f899c192b494e22b268499c791 + container_name: stellaops-packs-registry-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.PacksRegistry.dll"] + depends_on: + - postgres-test + environment: + PACKSREGISTRY__STORAGE__DRIVER: "postgres" + PACKSREGISTRY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + networks: + - testing-net + labels: *testing-labels + + # --------------------------------------------------------------------------- + # Task Runner mock + # --------------------------------------------------------------------------- + task-runner: + image: registry.stella-ops.org/stellaops/task-runner@sha256:eb5ad992b49a41554f41516be1a6afcfa6522faf2111c08ff2b3664ad2fc954b + container_name: stellaops-task-runner-mock + profiles: ["mock", "all"] + command: ["dotnet", "StellaOps.TaskRunner.WebService.dll"] + depends_on: + - packs-registry + - postgres-test + environment: + TASKRUNNER__STORAGE__DRIVER: "postgres" + TASKRUNNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=postgres-test;Port=5432;Database=stellaops_test;Username=stellaops_ci;Password=ci_test_password" + 
networks: + - testing-net + labels: *testing-labels + + # =========================================================================== + # GITEA (SCM integration testing) + # =========================================================================== + + # --------------------------------------------------------------------------- + # Gitea - Git hosting with package registry + # --------------------------------------------------------------------------- + gitea: + image: gitea/gitea:1.21 + container_name: stellaops-gitea-test + profiles: ["gitea", "all"] + environment: + - USER_UID=1000 + - USER_GID=1000 + # Enable package registry + - GITEA__packages__ENABLED=true + - GITEA__packages__CHUNKED_UPLOAD_PATH=/data/tmp/package-upload + # Enable NuGet + - GITEA__packages__NUGET_ENABLED=true + # Enable Container registry + - GITEA__packages__CONTAINER_ENABLED=true + # Database (SQLite for simplicity) + - GITEA__database__DB_TYPE=sqlite3 + - GITEA__database__PATH=/data/gitea/gitea.db + # Server config + - GITEA__server__ROOT_URL=http://localhost:${TEST_GITEA_PORT:-3000}/ + - GITEA__server__HTTP_PORT=3000 + # Disable metrics/telemetry + - GITEA__metrics__ENABLED=false + # Session config + - GITEA__session__PROVIDER=memory + # Cache config + - GITEA__cache__ADAPTER=memory + # Log level + - GITEA__log__LEVEL=Warn + volumes: + - gitea-data:/data + - gitea-config:/etc/gitea + ports: + - "${TEST_GITEA_PORT:-3000}:3000" + - "${TEST_GITEA_SSH_PORT:-3022}:22" + networks: + - testing-net + restart: unless-stopped + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:3000/api/healthz"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 60s + labels: *testing-labels diff --git a/deploy/compose/docker-compose.tile-proxy.yml b/deploy/compose/docker-compose.tile-proxy.yml new file mode 100644 index 000000000..424c53ad9 --- /dev/null +++ b/deploy/compose/docker-compose.tile-proxy.yml @@ -0,0 +1,80 @@ +# ============================================================================= +# STELLA OPS TILE PROXY OVERLAY +# ============================================================================= +# Rekor tile caching proxy for air-gapped and offline deployments. +# Caches tiles from upstream Rekor (public Sigstore or private) locally. +# +# Use Cases: +# - Air-gapped deployments with periodic sync +# - Reduce latency by caching frequently-accessed tiles +# - Offline verification when upstream is unavailable +# +# Note: This is an ALTERNATIVE to running your own rekor-v2 instance. +# Use tile-proxy when you want to cache from public Sigstore. +# Use rekor-v2 (--profile sigstore) when running your own transparency log. +# +# Usage: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.tile-proxy.yml up -d +# +# ============================================================================= + +x-release-labels: &release-labels + com.stellaops.release.version: "2025.10.0" + com.stellaops.release.channel: "stable" + com.stellaops.component: "tile-proxy" + +volumes: + tile-cache: + driver: local + tuf-cache: + driver: local + +services: + tile-proxy: + build: + context: ../.. 
+ dockerfile: src/Attestor/StellaOps.Attestor.TileProxy/Dockerfile + image: registry.stella-ops.org/stellaops/tile-proxy:2025.10.0 + container_name: stellaops-tile-proxy + restart: unless-stopped + ports: + - "${TILE_PROXY_PORT:-8090}:8080" + volumes: + - tile-cache:/var/cache/stellaops/tiles + - tuf-cache:/var/cache/stellaops/tuf + environment: + # Upstream Rekor configuration + TILE_PROXY__UPSTREAMURL: "${REKOR_SERVER_URL:-https://rekor.sigstore.dev}" + TILE_PROXY__ORIGIN: "${REKOR_ORIGIN:-rekor.sigstore.dev - 1985497715}" + + # TUF configuration (optional - for checkpoint signature validation) + TILE_PROXY__TUF__ENABLED: "${TILE_PROXY_TUF_ENABLED:-false}" + TILE_PROXY__TUF__URL: "${TILE_PROXY_TUF_ROOT_URL:-}" + TILE_PROXY__TUF__VALIDATECHECKPOINTSIGNATURE: "${TILE_PROXY_TUF_VALIDATE_CHECKPOINT:-true}" + + # Cache configuration + TILE_PROXY__CACHE__BASEPATH: /var/cache/stellaops/tiles + TILE_PROXY__CACHE__MAXSIZEGB: "${TILE_PROXY_CACHE_MAX_SIZE_GB:-10}" + TILE_PROXY__CACHE__CHECKPOINTTTLMINUTES: "${TILE_PROXY_CHECKPOINT_TTL_MINUTES:-5}" + + # Sync job configuration (for air-gapped pre-fetching) + TILE_PROXY__SYNC__ENABLED: "${TILE_PROXY_SYNC_ENABLED:-true}" + TILE_PROXY__SYNC__SCHEDULE: "${TILE_PROXY_SYNC_SCHEDULE:-0 */6 * * *}" + TILE_PROXY__SYNC__DEPTH: "${TILE_PROXY_SYNC_DEPTH:-10000}" + + # Request handling + TILE_PROXY__REQUEST__COALESCINGENABLED: "${TILE_PROXY_COALESCING_ENABLED:-true}" + TILE_PROXY__REQUEST__TIMEOUTSECONDS: "${TILE_PROXY_REQUEST_TIMEOUT_SECONDS:-30}" + + # Logging + Serilog__MinimumLevel__Default: "${TILE_PROXY_LOG_LEVEL:-Information}" + healthcheck: + test: ["CMD", "curl", "-f", "http://localhost:8080/_admin/health"] + interval: 30s + timeout: 10s + retries: 3 + start_period: 5s + networks: + - stellaops + labels: *release-labels diff --git a/deploy/compose/env/cas.env.example b/deploy/compose/env/cas.env.example new file mode 100644 index 000000000..377e5b8f7 --- /dev/null +++ b/deploy/compose/env/cas.env.example @@ -0,0 +1,118 @@ +# CAS (Content Addressable Storage) Environment Configuration +# Copy to .env and customize for your deployment +# +# Aligned with best-in-class vulnerability scanner retention policies: +# - Trivy: 7 days vulnerability DB +# - Grype: 5 days DB, configurable +# - Anchore Enterprise: 90-365 days typical +# - Snyk Enterprise: 365 days + +# ============================================================================= +# DATA PATHS (ensure directories exist with proper permissions) +# ============================================================================= +CAS_DATA_PATH=/var/lib/stellaops/cas +CAS_EVIDENCE_PATH=/var/lib/stellaops/evidence +CAS_ATTESTATION_PATH=/var/lib/stellaops/attestations + +# ============================================================================= +# RUSTFS CONFIGURATION +# ============================================================================= +RUSTFS_LOG_LEVEL=info +RUSTFS_COMPRESSION=zstd +RUSTFS_COMPRESSION_LEVEL=3 + +# ============================================================================= +# PORTS +# ============================================================================= +RUSTFS_CAS_PORT=8180 +RUSTFS_EVIDENCE_PORT=8181 +RUSTFS_ATTESTATION_PORT=8182 + +# ============================================================================= +# ACCESS CONTROL - API KEYS +# IMPORTANT: Change these in production! 
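+# The values below are placeholders; one possible way (example only) to
+# generate a unique random value per key is:
+#   openssl rand -hex 32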
+# ============================================================================= + +# CAS Storage (mutable, lifecycle-managed) +RUSTFS_CAS_API_KEY=cas-api-key-CHANGE-IN-PRODUCTION +RUSTFS_CAS_READONLY_KEY=cas-readonly-key-CHANGE-IN-PRODUCTION + +# Evidence Storage (immutable) +RUSTFS_EVIDENCE_API_KEY=evidence-api-key-CHANGE-IN-PRODUCTION +RUSTFS_EVIDENCE_READONLY_KEY=evidence-readonly-key-CHANGE-IN-PRODUCTION + +# Attestation Storage (immutable) +RUSTFS_ATTESTATION_API_KEY=attestation-api-key-CHANGE-IN-PRODUCTION +RUSTFS_ATTESTATION_READONLY_KEY=attestation-readonly-key-CHANGE-IN-PRODUCTION + +# ============================================================================= +# SERVICE ACCOUNT KEYS +# Each service has its own key for fine-grained access control +# IMPORTANT: Generate unique keys per environment! +# ============================================================================= + +# Scanner service - access to scanner artifacts, surface cache, runtime facts +RUSTFS_SCANNER_KEY=scanner-svc-key-GENERATE-UNIQUE +# Bucket access: scanner-artifacts (rw), surface-cache (rw), runtime-facts (rw) + +# Signals service - access to runtime facts, signals data, provenance feed +RUSTFS_SIGNALS_KEY=signals-svc-key-GENERATE-UNIQUE +# Bucket access: runtime-facts (rw), signals-data (rw), provenance-feed (rw) + +# Replay service - access to replay bundles, inputs lock files +RUSTFS_REPLAY_KEY=replay-svc-key-GENERATE-UNIQUE +# Bucket access: replay-bundles (rw), inputs-lock (rw) + +# Ledger service - access to evidence bundles, merkle roots, hash chains +RUSTFS_LEDGER_KEY=ledger-svc-key-GENERATE-UNIQUE +# Bucket access: evidence-bundles (rw), merkle-roots (rw), hash-chains (rw) + +# Exporter service - read-only access to evidence bundles +RUSTFS_EXPORTER_KEY=exporter-svc-key-GENERATE-UNIQUE +# Bucket access: evidence-bundles (r) + +# Attestor service - access to attestations, DSSE envelopes, Rekor receipts +RUSTFS_ATTESTOR_KEY=attestor-svc-key-GENERATE-UNIQUE +# Bucket access: attestations (rw), dsse-envelopes (rw), rekor-receipts (rw) + +# Verifier service - read-only access to attestations +RUSTFS_VERIFIER_KEY=verifier-svc-key-GENERATE-UNIQUE +# Bucket access: attestations (r), dsse-envelopes (r), rekor-receipts (r) + +# Global read-only key (for debugging/auditing) +RUSTFS_READONLY_KEY=readonly-global-key-GENERATE-UNIQUE +# Bucket access: * (r) + +# ============================================================================= +# LIFECYCLE MANAGEMENT +# ============================================================================= +# Cron schedule for retention policy enforcement (default: 3 AM daily) +LIFECYCLE_CRON=0 3 * * * +LIFECYCLE_TELEMETRY=true + +# ============================================================================= +# RETENTION POLICIES (days, 0 = indefinite) +# Aligned with enterprise vulnerability scanner best practices +# ============================================================================= +# Vulnerability DB: 7 days (matches Trivy default, Grype uses 5) +CAS_RETENTION_VULNERABILITY_DB_DAYS=7 + +# SBOM artifacts: 365 days (audit compliance - SOC2, ISO27001, FedRAMP) +CAS_RETENTION_SBOM_ARTIFACTS_DAYS=365 + +# Scan results: 90 days (common compliance window) +CAS_RETENTION_SCAN_RESULTS_DAYS=90 + +# Evidence bundles: indefinite (content-addressed, immutable, audit trail) +CAS_RETENTION_EVIDENCE_BUNDLES_DAYS=0 + +# Attestations: indefinite (signed, immutable, verifiable) +CAS_RETENTION_ATTESTATIONS_DAYS=0 + +# Temporary artifacts: 1 day (work-in-progress, intermediate 
files) +CAS_RETENTION_TEMP_ARTIFACTS_DAYS=1 + +# ============================================================================= +# TELEMETRY (optional) +# ============================================================================= +OTLP_ENDPOINT= diff --git a/deploy/compose/env/compliance-china.env.example b/deploy/compose/env/compliance-china.env.example new file mode 100644 index 000000000..b157b0d10 --- /dev/null +++ b/deploy/compose/env/compliance-china.env.example @@ -0,0 +1,48 @@ +# ============================================================================= +# STELLA OPS CHINA COMPLIANCE ENVIRONMENT +# ============================================================================= +# Environment template for China (SM2/SM3/SM4) compliance deployments. +# +# Usage with simulation: +# cp env/compliance-china.env.example .env +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-china.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Usage with SM Remote (production): +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-china.yml \ +# -f docker-compose.sm-remote.yml up -d +# +# ============================================================================= + +# Crypto profile +STELLAOPS_CRYPTO_PROFILE=china + +# ============================================================================= +# SM REMOTE SERVICE CONFIGURATION +# ============================================================================= + +SM_REMOTE_PORT=56080 + +# Software-only SM2 provider (for testing/development) +SM_SOFT_ALLOWED=1 + +# OSCCA-certified HSM configuration (for production) +# Set these when using a certified hardware security module +SM_REMOTE_HSM_URL= +SM_REMOTE_HSM_API_KEY= +SM_REMOTE_HSM_TIMEOUT=30000 + +# Client certificate authentication for HSM (optional) +SM_REMOTE_CLIENT_CERT_PATH= +SM_REMOTE_CLIENT_CERT_PASSWORD= + +# ============================================================================= +# CRYPTO SIMULATION (for testing only) +# ============================================================================= + +# Enable simulation mode +STELLAOPS_CRYPTO_ENABLE_SIM=1 +STELLAOPS_CRYPTO_SIM_URL=http://sim-crypto:8080 +SIM_CRYPTO_PORT=18090 diff --git a/deploy/compose/env/compliance-eu.env.example b/deploy/compose/env/compliance-eu.env.example new file mode 100644 index 000000000..227af769a --- /dev/null +++ b/deploy/compose/env/compliance-eu.env.example @@ -0,0 +1,40 @@ +# ============================================================================= +# STELLA OPS EU COMPLIANCE ENVIRONMENT +# ============================================================================= +# Environment template for EU (eIDAS) compliance deployments. +# +# Usage with simulation: +# cp env/compliance-eu.env.example .env +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-eu.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Usage for production: +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-eu.yml up -d +# +# Note: EU eIDAS deployments typically integrate with external Qualified Trust +# Service Providers (QTSPs) rather than hosting crypto locally. 
+# +# ============================================================================= + +# Crypto profile +STELLAOPS_CRYPTO_PROFILE=eu + +# ============================================================================= +# eIDAS / QTSP CONFIGURATION +# ============================================================================= + +# Qualified Trust Service Provider integration (configure in application settings) +# EIDAS_QTSP_URL=https://qtsp.example.eu +# EIDAS_QTSP_CLIENT_ID= +# EIDAS_QTSP_CLIENT_SECRET= + +# ============================================================================= +# CRYPTO SIMULATION (for testing only) +# ============================================================================= + +# Enable simulation mode +STELLAOPS_CRYPTO_ENABLE_SIM=1 +STELLAOPS_CRYPTO_SIM_URL=http://sim-crypto:8080 +SIM_CRYPTO_PORT=18090 diff --git a/deploy/compose/env/compliance-russia.env.example b/deploy/compose/env/compliance-russia.env.example new file mode 100644 index 000000000..63c4b6a29 --- /dev/null +++ b/deploy/compose/env/compliance-russia.env.example @@ -0,0 +1,51 @@ +# ============================================================================= +# STELLA OPS RUSSIA COMPLIANCE ENVIRONMENT +# ============================================================================= +# Environment template for Russia (GOST R 34.10-2012) compliance deployments. +# +# Usage with simulation: +# cp env/compliance-russia.env.example .env +# docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-russia.yml \ +# -f docker-compose.crypto-sim.yml up -d +# +# Usage with CryptoPro CSP (production): +# CRYPTOPRO_ACCEPT_EULA=1 docker compose -f docker-compose.stella-ops.yml \ +# -f docker-compose.compliance-russia.yml \ +# -f docker-compose.cryptopro.yml up -d +# +# ============================================================================= + +# Crypto profile +STELLAOPS_CRYPTO_PROFILE=russia + +# ============================================================================= +# CRYPTOPRO CSP CONFIGURATION +# ============================================================================= + +CRYPTOPRO_PORT=18080 + +# IMPORTANT: Set to 1 to accept CryptoPro EULA (required for production) +CRYPTOPRO_ACCEPT_EULA=0 + +# CryptoPro container settings +CRYPTOPRO_CONTAINER_NAME=stellaops-signing +CRYPTOPRO_USE_MACHINE_STORE=true +CRYPTOPRO_PROVIDER_TYPE=80 + +# ============================================================================= +# GOST ALGORITHM CONFIGURATION +# ============================================================================= + +# Default GOST algorithms +CRYPTOPRO_GOST_SIGNATURE_ALGORITHM=GOST R 34.10-2012 +CRYPTOPRO_GOST_HASH_ALGORITHM=GOST R 34.11-2012 + +# ============================================================================= +# CRYPTO SIMULATION (for testing only) +# ============================================================================= + +# Enable simulation mode +STELLAOPS_CRYPTO_ENABLE_SIM=1 +STELLAOPS_CRYPTO_SIM_URL=http://sim-crypto:8080 +SIM_CRYPTO_PORT=18090 diff --git a/deploy/compose/env/stellaops.env.example b/deploy/compose/env/stellaops.env.example new file mode 100644 index 000000000..879c8294e --- /dev/null +++ b/deploy/compose/env/stellaops.env.example @@ -0,0 +1,171 @@ +# ============================================================================= +# STELLA OPS ENVIRONMENT CONFIGURATION +# ============================================================================= +# Main environment template for docker-compose.stella-ops.yml +# Copy 
to .env and customize for your deployment. +# +# Usage: +# cp env/stellaops.env.example .env +# docker compose -f docker-compose.stella-ops.yml up -d +# +# ============================================================================= + +# ============================================================================= +# INFRASTRUCTURE +# ============================================================================= + +# PostgreSQL Database +POSTGRES_USER=stellaops +POSTGRES_PASSWORD=REPLACE_WITH_STRONG_PASSWORD +POSTGRES_DB=stellaops_platform +POSTGRES_PORT=5432 + +# Valkey (Redis-compatible cache and messaging) +VALKEY_PORT=6379 + +# RustFS Object Storage +RUSTFS_HTTP_PORT=8080 + +# ============================================================================= +# CORE SERVICES +# ============================================================================= + +# Authority (OAuth2/OIDC) +AUTHORITY_ISSUER=https://authority.example.com +AUTHORITY_PORT=8440 +AUTHORITY_OFFLINE_CACHE_TOLERANCE=00:30:00 + +# Signer +SIGNER_POE_INTROSPECT_URL=https://licensing.example.com/introspect +SIGNER_PORT=8441 + +# Attestor +ATTESTOR_PORT=8442 + +# Issuer Directory +ISSUER_DIRECTORY_PORT=8447 +ISSUER_DIRECTORY_SEED_CSAF=true + +# Concelier +CONCELIER_PORT=8445 + +# Notify +NOTIFY_WEB_PORT=8446 + +# Web UI +UI_PORT=8443 + +# ============================================================================= +# SCANNER CONFIGURATION +# ============================================================================= + +SCANNER_WEB_PORT=8444 + +# Queue configuration (Valkey only - NATS removed) +SCANNER__QUEUE__BROKER=valkey://valkey:6379 + +# Event streaming +SCANNER_EVENTS_ENABLED=false +SCANNER_EVENTS_DRIVER=valkey +SCANNER_EVENTS_DSN=valkey:6379 +SCANNER_EVENTS_STREAM=stella.events +SCANNER_EVENTS_PUBLISH_TIMEOUT_SECONDS=5 +SCANNER_EVENTS_MAX_STREAM_LENGTH=10000 + +# Surface cache configuration +SCANNER_SURFACE_FS_ENDPOINT=http://rustfs:8080 +SCANNER_SURFACE_FS_BUCKET=surface-cache +SCANNER_SURFACE_CACHE_ROOT=/var/lib/stellaops/surface +SCANNER_SURFACE_CACHE_QUOTA_MB=4096 +SCANNER_SURFACE_PREFETCH_ENABLED=false +SCANNER_SURFACE_TENANT=default +SCANNER_SURFACE_FEATURES= +SCANNER_SURFACE_SECRETS_PROVIDER=file +SCANNER_SURFACE_SECRETS_NAMESPACE= +SCANNER_SURFACE_SECRETS_ROOT=/etc/stellaops/secrets +SCANNER_SURFACE_SECRETS_FALLBACK_PROVIDER= +SCANNER_SURFACE_SECRETS_ALLOW_INLINE=false +SURFACE_SECRETS_HOST_PATH=./offline/surface-secrets + +# Offline Kit configuration +SCANNER_OFFLINEKIT_ENABLED=false +SCANNER_OFFLINEKIT_REQUIREDSSE=true +SCANNER_OFFLINEKIT_REKOROFFLINEMODE=true +SCANNER_OFFLINEKIT_TRUSTROOTDIRECTORY=/etc/stellaops/trust-roots +SCANNER_OFFLINEKIT_REKORSNAPSHOTDIRECTORY=/var/lib/stellaops/rekor-snapshot +SCANNER_OFFLINEKIT_TRUSTROOTS_HOST_PATH=./offline/trust-roots +SCANNER_OFFLINEKIT_REKOR_SNAPSHOT_HOST_PATH=./offline/rekor-snapshot + +# ============================================================================= +# SCHEDULER CONFIGURATION +# ============================================================================= + +# Queue configuration (Valkey only - NATS removed) +SCHEDULER__QUEUE__KIND=Valkey +SCHEDULER__QUEUE__VALKEY__URL=valkey:6379 +SCHEDULER_SCANNER_BASEADDRESS=http://scanner-web:8444 + +# ============================================================================= +# REKOR / SIGSTORE CONFIGURATION +# ============================================================================= + +# Rekor server URL (default: public Sigstore, use http://rekor-v2:3000 for local) 
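+# Illustrative local override (commented out; points at the bundled rekor-v2 service mentioned above):
+# REKOR_SERVER_URL=http://rekor-v2:3000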
+REKOR_SERVER_URL=https://rekor.sigstore.dev +REKOR_VERSION=V2 +REKOR_TILE_BASE_URL= +REKOR_LOG_ID=c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d +REKOR_TILES_IMAGE=ghcr.io/sigstore/rekor-tiles:latest + +# ============================================================================= +# ADVISORY AI CONFIGURATION +# ============================================================================= + +ADVISORY_AI_WEB_PORT=8448 +ADVISORY_AI_SBOM_BASEADDRESS=http://scanner-web:8444 +ADVISORY_AI_INFERENCE_MODE=Local +ADVISORY_AI_REMOTE_BASEADDRESS= +ADVISORY_AI_REMOTE_APIKEY= + +# ============================================================================= +# CRYPTO CONFIGURATION +# ============================================================================= + +# Crypto profile: default, china, russia, eu +STELLAOPS_CRYPTO_PROFILE=default + +# Enable crypto simulation (for testing) +STELLAOPS_CRYPTO_ENABLE_SIM=0 +STELLAOPS_CRYPTO_SIM_URL=http://sim-crypto:8080 + +# CryptoPro (Russia only) - requires EULA acceptance +CRYPTOPRO_PORT=18080 +CRYPTOPRO_ACCEPT_EULA=0 +CRYPTOPRO_CONTAINER_NAME=stellaops-signing +CRYPTOPRO_USE_MACHINE_STORE=true +CRYPTOPRO_PROVIDER_TYPE=80 + +# SM Remote (China only) +SM_REMOTE_PORT=56080 +SM_SOFT_ALLOWED=1 +SM_REMOTE_HSM_URL= +SM_REMOTE_HSM_API_KEY= +SM_REMOTE_HSM_TIMEOUT=30000 + +# ============================================================================= +# NETWORKING +# ============================================================================= + +# External reverse proxy network (Traefik, Envoy, etc.) +FRONTDOOR_NETWORK=stellaops_frontdoor + +# ============================================================================= +# TELEMETRY (optional) +# ============================================================================= + +OTEL_GRPC_PORT=4317 +OTEL_HTTP_PORT=4318 +OTEL_PROMETHEUS_PORT=9464 +PROMETHEUS_PORT=9090 +TEMPO_PORT=3200 +LOKI_PORT=3100 +PROMETHEUS_RETENTION=15d diff --git a/deploy/compose/env/testing.env.example b/deploy/compose/env/testing.env.example new file mode 100644 index 000000000..0e71938a3 --- /dev/null +++ b/deploy/compose/env/testing.env.example @@ -0,0 +1,45 @@ +# ============================================================================= +# STELLA OPS TESTING ENVIRONMENT CONFIGURATION +# ============================================================================= +# Environment template for docker-compose.testing.yml +# Uses different ports to avoid conflicts with development/production. 
+# +# Usage: +# cp env/testing.env.example .env +# docker compose -f docker-compose.testing.yml --profile ci up -d +# +# ============================================================================= + +# ============================================================================= +# CI INFRASTRUCTURE (different ports to avoid conflicts) +# ============================================================================= + +# PostgreSQL Test Database (port 5433) +TEST_POSTGRES_PORT=5433 +TEST_POSTGRES_USER=stellaops_ci +TEST_POSTGRES_PASSWORD=ci_test_password +TEST_POSTGRES_DB=stellaops_test + +# Valkey Test (port 6380) +TEST_VALKEY_PORT=6380 + +# RustFS Test (port 8180) +TEST_RUSTFS_PORT=8180 + +# Mock Registry (port 5001) +TEST_REGISTRY_PORT=5001 + +# ============================================================================= +# GITEA CONFIGURATION +# ============================================================================= + +TEST_GITEA_PORT=3000 +TEST_GITEA_SSH_PORT=3022 + +# ============================================================================= +# SIGSTORE TOOLS +# ============================================================================= + +# Rekor CLI and Cosign versions (for sigstore profile) +REKOR_CLI_VERSION=v1.4.3 +COSIGN_VERSION=v3.0.4 diff --git a/deploy/compose/scripts/backup.sh b/deploy/compose/scripts/backup.sh new file mode 100644 index 000000000..1a033325f --- /dev/null +++ b/deploy/compose/scripts/backup.sh @@ -0,0 +1,28 @@ +#!/usr/bin/env bash +set -euo pipefail + +echo "StellaOps Compose Backup" +echo "This will create a tar.gz of PostgreSQL, RustFS (object-store), and Valkey data volumes." +read -rp "Proceed? [y/N] " ans +[[ ${ans:-N} =~ ^[Yy]$ ]] || { echo "Aborted."; exit 1; } + +TS=$(date -u +%Y%m%dT%H%M%SZ) +OUT_DIR=${BACKUP_DIR:-backups} +mkdir -p "$OUT_DIR" + +docker compose ps >/dev/null + +echo "Pausing worker containers for consistency..." +docker compose pause scanner-worker scheduler-worker taskrunner-worker || true + +echo "Backing up volumes..." +docker run --rm \ + -v stellaops-postgres:/data/postgres:ro \ + -v stellaops-rustfs:/data/rustfs:ro \ + -v stellaops-valkey:/data/valkey:ro \ + -v "$PWD/$OUT_DIR":/out \ + alpine sh -c "cd / && tar czf /out/stellaops-backup-$TS.tar.gz data" + +docker compose unpause scanner-worker scheduler-worker taskrunner-worker || true + +echo "Backup written to $OUT_DIR/stellaops-backup-$TS.tar.gz" diff --git a/deploy/compose/scripts/quickstart.sh b/deploy/compose/scripts/quickstart.sh new file mode 100644 index 000000000..ec85460b6 --- /dev/null +++ b/deploy/compose/scripts/quickstart.sh @@ -0,0 +1,25 @@ +#!/usr/bin/env bash +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +COMPOSE_DIR="$(cd "$SCRIPT_DIR/.." && pwd)" + +ENV_FILE="${1:-$COMPOSE_DIR/env/dev.env.example}" +USE_MOCK="${USE_MOCK:-0}" + +FILES=(-f "$COMPOSE_DIR/docker-compose.dev.yaml") +ENV_FILES=(--env-file "$ENV_FILE") + +if [[ "$USE_MOCK" == "1" ]]; then + FILES+=(-f "$COMPOSE_DIR/docker-compose.mock.yaml") + ENV_FILES+=(--env-file "$COMPOSE_DIR/env/mock.env.example") +fi + +echo "Validating compose config..." +docker compose "${ENV_FILES[@]}" "${FILES[@]}" config > /tmp/compose-validated.yaml +echo "Config written to /tmp/compose-validated.yaml" + +echo "Starting stack..." +docker compose "${ENV_FILES[@]}" "${FILES[@]}" up -d + +echo "Stack started. 
To stop: docker compose ${ENV_FILES[*]} ${FILES[*]} down" diff --git a/deploy/compose/scripts/reset.sh b/deploy/compose/scripts/reset.sh new file mode 100644 index 000000000..248f94aa5 --- /dev/null +++ b/deploy/compose/scripts/reset.sh @@ -0,0 +1,15 @@ +#!/usr/bin/env bash +set -euo pipefail + +echo "WARNING: This will stop the stack and wipe PostgreSQL, RustFS, and Valkey volumes." +read -rp "Type 'RESET' to continue: " ans +[[ ${ans:-} == "RESET" ]] || { echo "Aborted."; exit 1; } + +docker compose down + +for vol in stellaops-postgres stellaops-rustfs stellaops-valkey; do + echo "Removing volume $vol" + docker volume rm "$vol" || true +done + +echo "Reset complete. Re-run compose with your env file to recreate volumes." diff --git a/deploy/database/migrations/005_timestamp_evidence.sql b/deploy/database/migrations/005_timestamp_evidence.sql new file mode 100644 index 000000000..46366b8d0 --- /dev/null +++ b/deploy/database/migrations/005_timestamp_evidence.sql @@ -0,0 +1,69 @@ +-- ----------------------------------------------------------------------------- +-- 005_timestamp_evidence.sql +-- Sprint: SPRINT_20260119_009 Evidence Storage for Timestamps +-- Task: EVT-002 - PostgreSQL Schema Extension +-- Description: Schema for storing timestamp and revocation evidence. +-- ----------------------------------------------------------------------------- + +-- Ensure the evidence schema exists +CREATE SCHEMA IF NOT EXISTS evidence; + +-- Timestamp evidence storage +CREATE TABLE IF NOT EXISTS evidence.timestamp_tokens ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + artifact_digest TEXT NOT NULL, + digest_algorithm TEXT NOT NULL, + tst_blob BYTEA NOT NULL, + generation_time TIMESTAMPTZ NOT NULL, + tsa_name TEXT NOT NULL, + tsa_policy_oid TEXT NOT NULL, + serial_number TEXT NOT NULL, + tsa_chain_pem TEXT NOT NULL, + ocsp_response BYTEA, + crl_snapshot BYTEA, + captured_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + provider_name TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + CONSTRAINT uq_timestamp_artifact_time UNIQUE (artifact_digest, generation_time) +); + +-- Indexes for timestamp queries +CREATE INDEX IF NOT EXISTS idx_timestamp_artifact ON evidence.timestamp_tokens(artifact_digest); +CREATE INDEX IF NOT EXISTS idx_timestamp_generation ON evidence.timestamp_tokens(generation_time); +CREATE INDEX IF NOT EXISTS idx_timestamp_provider ON evidence.timestamp_tokens(provider_name); +CREATE INDEX IF NOT EXISTS idx_timestamp_created ON evidence.timestamp_tokens(created_at); + +-- Revocation evidence storage +CREATE TABLE IF NOT EXISTS evidence.revocation_snapshots ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + certificate_fingerprint TEXT NOT NULL, + source TEXT NOT NULL CHECK (source IN ('Ocsp', 'Crl', 'None')), + raw_response BYTEA NOT NULL, + response_time TIMESTAMPTZ NOT NULL, + valid_until TIMESTAMPTZ NOT NULL, + status TEXT NOT NULL CHECK (status IN ('Good', 'Revoked', 'Unknown')), + revocation_time TIMESTAMPTZ, + reason TEXT, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Indexes for revocation queries +CREATE INDEX IF NOT EXISTS idx_revocation_cert ON evidence.revocation_snapshots(certificate_fingerprint); +CREATE INDEX IF NOT EXISTS idx_revocation_valid ON evidence.revocation_snapshots(valid_until); +CREATE INDEX IF NOT EXISTS idx_revocation_status ON evidence.revocation_snapshots(status); +CREATE INDEX IF NOT EXISTS idx_revocation_created ON evidence.revocation_snapshots(created_at); + +-- Comments +COMMENT ON TABLE evidence.timestamp_tokens IS 'RFC-3161 
TimeStampToken evidence for long-term validation'; +COMMENT ON TABLE evidence.revocation_snapshots IS 'OCSP/CRL certificate revocation evidence snapshots'; + +COMMENT ON COLUMN evidence.timestamp_tokens.artifact_digest IS 'SHA-256 digest of the timestamped artifact'; +COMMENT ON COLUMN evidence.timestamp_tokens.tst_blob IS 'Raw DER-encoded RFC 3161 TimeStampToken'; +COMMENT ON COLUMN evidence.timestamp_tokens.tsa_chain_pem IS 'PEM-encoded TSA certificate chain for LTV'; +COMMENT ON COLUMN evidence.timestamp_tokens.ocsp_response IS 'Stapled OCSP response at signing time'; +COMMENT ON COLUMN evidence.timestamp_tokens.crl_snapshot IS 'CRL snapshot at signing time (fallback for OCSP)'; + +COMMENT ON COLUMN evidence.revocation_snapshots.certificate_fingerprint IS 'SHA-256 fingerprint of the certificate'; +COMMENT ON COLUMN evidence.revocation_snapshots.raw_response IS 'Raw OCSP response or CRL bytes'; +COMMENT ON COLUMN evidence.revocation_snapshots.response_time IS 'thisUpdate from the response'; +COMMENT ON COLUMN evidence.revocation_snapshots.valid_until IS 'nextUpdate from the response'; diff --git a/deploy/database/migrations/005_timestamp_evidence_rollback.sql b/deploy/database/migrations/005_timestamp_evidence_rollback.sql new file mode 100644 index 000000000..304944e52 --- /dev/null +++ b/deploy/database/migrations/005_timestamp_evidence_rollback.sql @@ -0,0 +1,21 @@ +-- ----------------------------------------------------------------------------- +-- 005_timestamp_evidence_rollback.sql +-- Sprint: SPRINT_20260119_009 Evidence Storage for Timestamps +-- Task: EVT-002 - PostgreSQL Schema Extension +-- Description: Rollback migration for timestamp and revocation evidence. +-- ----------------------------------------------------------------------------- + +-- Drop indexes first +DROP INDEX IF EXISTS evidence.idx_timestamp_artifact; +DROP INDEX IF EXISTS evidence.idx_timestamp_generation; +DROP INDEX IF EXISTS evidence.idx_timestamp_provider; +DROP INDEX IF EXISTS evidence.idx_timestamp_created; + +DROP INDEX IF EXISTS evidence.idx_revocation_cert; +DROP INDEX IF EXISTS evidence.idx_revocation_valid; +DROP INDEX IF EXISTS evidence.idx_revocation_status; +DROP INDEX IF EXISTS evidence.idx_revocation_created; + +-- Drop tables +DROP TABLE IF EXISTS evidence.revocation_snapshots; +DROP TABLE IF EXISTS evidence.timestamp_tokens; diff --git a/deploy/database/migrations/005_validation_harness.sql b/deploy/database/migrations/005_validation_harness.sql new file mode 100644 index 000000000..fec063b64 --- /dev/null +++ b/deploy/database/migrations/005_validation_harness.sql @@ -0,0 +1,120 @@ +-- Validation harness schema for tracking validation runs and match results +-- Migration: 005_validation_harness.sql + +-- Validation runs table +CREATE TABLE IF NOT EXISTS groundtruth.validation_runs ( + run_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + name TEXT NOT NULL, + description TEXT, + status TEXT NOT NULL DEFAULT 'pending', + + -- Configuration (stored as JSONB) + config JSONB NOT NULL, + + -- Timestamps + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + started_at TIMESTAMPTZ, + completed_at TIMESTAMPTZ, + + -- Metrics (populated after completion) + total_pairs INT, + total_functions INT, + true_positives INT, + false_positives INT, + true_negatives INT, + false_negatives INT, + match_rate DOUBLE PRECISION, + precision_score DOUBLE PRECISION, + recall_score DOUBLE PRECISION, + f1_score DOUBLE PRECISION, + average_match_score DOUBLE PRECISION, + + -- Mismatch counts by bucket (JSONB map) + 
mismatch_counts JSONB, + + -- Metadata + corpus_snapshot_id TEXT, + matcher_version TEXT, + error_message TEXT, + tags TEXT[] DEFAULT '{}', + + -- Constraints + CONSTRAINT valid_status CHECK (status IN ('pending', 'running', 'completed', 'failed', 'cancelled')) +); + +-- Indexes for validation runs +CREATE INDEX IF NOT EXISTS idx_validation_runs_status ON groundtruth.validation_runs(status); +CREATE INDEX IF NOT EXISTS idx_validation_runs_created_at ON groundtruth.validation_runs(created_at DESC); +CREATE INDEX IF NOT EXISTS idx_validation_runs_tags ON groundtruth.validation_runs USING GIN (tags); + +-- Match results table +CREATE TABLE IF NOT EXISTS groundtruth.match_results ( + result_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + run_id UUID NOT NULL REFERENCES groundtruth.validation_runs(run_id) ON DELETE CASCADE, + security_pair_id UUID NOT NULL, + + -- Source function + source_name TEXT NOT NULL, + source_demangled_name TEXT, + source_address BIGINT NOT NULL, + source_size BIGINT, + source_build_id TEXT NOT NULL, + source_binary_name TEXT NOT NULL, + + -- Expected target + expected_name TEXT NOT NULL, + expected_demangled_name TEXT, + expected_address BIGINT NOT NULL, + expected_size BIGINT, + expected_build_id TEXT NOT NULL, + expected_binary_name TEXT NOT NULL, + + -- Actual matched target (nullable if no match found) + actual_name TEXT, + actual_demangled_name TEXT, + actual_address BIGINT, + actual_size BIGINT, + actual_build_id TEXT, + actual_binary_name TEXT, + + -- Outcome + outcome TEXT NOT NULL, + match_score DOUBLE PRECISION, + confidence TEXT, + + -- Mismatch analysis + inferred_cause TEXT, + mismatch_detail JSONB, + + -- Performance + match_duration_ms DOUBLE PRECISION, + + -- Constraints + CONSTRAINT valid_outcome CHECK (outcome IN ('true_positive', 'false_positive', 'true_negative', 'false_negative')) +); + +-- Indexes for match results +CREATE INDEX IF NOT EXISTS idx_match_results_run_id ON groundtruth.match_results(run_id); +CREATE INDEX IF NOT EXISTS idx_match_results_security_pair_id ON groundtruth.match_results(security_pair_id); +CREATE INDEX IF NOT EXISTS idx_match_results_outcome ON groundtruth.match_results(outcome); +CREATE INDEX IF NOT EXISTS idx_match_results_inferred_cause ON groundtruth.match_results(inferred_cause) WHERE inferred_cause IS NOT NULL; + +-- View for run summaries +CREATE OR REPLACE VIEW groundtruth.validation_run_summaries AS +SELECT + run_id AS id, + name, + status, + created_at, + completed_at, + match_rate, + f1_score, + total_pairs AS pair_count, + total_functions AS function_count, + tags +FROM groundtruth.validation_runs; + +-- Comments +COMMENT ON TABLE groundtruth.validation_runs IS 'Validation harness runs with aggregate metrics'; +COMMENT ON TABLE groundtruth.match_results IS 'Per-function match results from validation runs'; +COMMENT ON VIEW groundtruth.validation_run_summaries IS 'Summary view for listing validation runs'; diff --git a/deploy/database/migrations/006_timestamp_supersession.sql b/deploy/database/migrations/006_timestamp_supersession.sql new file mode 100644 index 000000000..04421a91f --- /dev/null +++ b/deploy/database/migrations/006_timestamp_supersession.sql @@ -0,0 +1,27 @@ +-- ----------------------------------------------------------------------------- +-- 006_timestamp_supersession.sql +-- Sprint: SPRINT_20260119_009 Evidence Storage for Timestamps +-- Task: EVT-005 - Re-Timestamping Support +-- Description: Schema extension for timestamp supersession chain. 
+-- ----------------------------------------------------------------------------- + +-- Add supersession column for re-timestamping chain +ALTER TABLE evidence.timestamp_tokens +ADD COLUMN IF NOT EXISTS supersedes_id UUID REFERENCES evidence.timestamp_tokens(id); + +-- Index for finding superseding timestamps +CREATE INDEX IF NOT EXISTS idx_timestamp_supersedes ON evidence.timestamp_tokens(supersedes_id); + +-- Index for finding timestamps by expiry (for re-timestamp scheduling) +-- Note: We need to track TSA certificate expiry separately - for now use generation_time + typical cert lifetime +CREATE INDEX IF NOT EXISTS idx_timestamp_for_retimestamp +ON evidence.timestamp_tokens(generation_time) +WHERE supersedes_id IS NULL; -- Only query leaf timestamps (not already superseded) + +-- Comments +COMMENT ON COLUMN evidence.timestamp_tokens.supersedes_id IS 'ID of the timestamp this supersedes (for re-timestamping chain)'; + +-- Rollback script (execute separately if needed): +-- ALTER TABLE evidence.timestamp_tokens DROP COLUMN IF EXISTS supersedes_id; +-- DROP INDEX IF EXISTS evidence.idx_timestamp_supersedes; +-- DROP INDEX IF EXISTS evidence.idx_timestamp_for_retimestamp; diff --git a/deploy/database/migrations/V20260108__opsmemory_advisoryai_schema.sql b/deploy/database/migrations/V20260108__opsmemory_advisoryai_schema.sql new file mode 100644 index 000000000..e0a262c07 --- /dev/null +++ b/deploy/database/migrations/V20260108__opsmemory_advisoryai_schema.sql @@ -0,0 +1,108 @@ +-- OpsMemory and AdvisoryAI PostgreSQL Schema Migration +-- Version: 20260108 +-- Author: StellaOps Agent +-- Sprint: SPRINT_20260107_006_004 (OpsMemory), SPRINT_20260107_006_003 (AdvisoryAI) + +-- ============================================================================ +-- OpsMemory Schema +-- ============================================================================ + +CREATE SCHEMA IF NOT EXISTS opsmemory; + +-- Decision records table +CREATE TABLE IF NOT EXISTS opsmemory.decisions ( + memory_id TEXT PRIMARY KEY, + tenant_id TEXT NOT NULL, + recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + + -- Situation context + cve_id TEXT, + component_purl TEXT, + severity TEXT, + reachability TEXT, + epss_score DECIMAL(5, 4), + cvss_score DECIMAL(3, 1), + context_tags TEXT[], + similarity_vector DOUBLE PRECISION[], + + -- Decision details + action TEXT NOT NULL, + rationale TEXT, + decided_by TEXT NOT NULL, + policy_reference TEXT, + mitigation_type TEXT, + mitigation_details TEXT, + + -- Outcome (nullable until recorded) + outcome_status TEXT, + resolution_time INTERVAL, + actual_impact TEXT, + lessons_learned TEXT, + outcome_recorded_by TEXT, + outcome_recorded_at TIMESTAMPTZ +); + +-- Indexes for querying +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_tenant ON opsmemory.decisions(tenant_id); +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_cve ON opsmemory.decisions(cve_id); +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_component ON opsmemory.decisions(component_purl); +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_recorded ON opsmemory.decisions(recorded_at); +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_action ON opsmemory.decisions(action); +CREATE INDEX IF NOT EXISTS idx_opsmemory_decisions_outcome ON opsmemory.decisions(outcome_status); + +-- ============================================================================ +-- AdvisoryAI Schema +-- ============================================================================ + +CREATE SCHEMA IF NOT EXISTS advisoryai; + +-- Conversations 
table +CREATE TABLE IF NOT EXISTS advisoryai.conversations ( + conversation_id TEXT PRIMARY KEY, + tenant_id TEXT NOT NULL, + user_id TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + context JSONB, + metadata JSONB +); + +-- Conversation turns table +CREATE TABLE IF NOT EXISTS advisoryai.turns ( + turn_id TEXT PRIMARY KEY, + conversation_id TEXT NOT NULL REFERENCES advisoryai.conversations(conversation_id) ON DELETE CASCADE, + role TEXT NOT NULL, + content TEXT NOT NULL, + timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(), + evidence_links JSONB, + proposed_actions JSONB, + metadata JSONB +); + +-- Indexes for querying +CREATE INDEX IF NOT EXISTS idx_advisoryai_conv_tenant ON advisoryai.conversations(tenant_id); +CREATE INDEX IF NOT EXISTS idx_advisoryai_conv_user ON advisoryai.conversations(user_id); +CREATE INDEX IF NOT EXISTS idx_advisoryai_conv_updated ON advisoryai.conversations(updated_at); +CREATE INDEX IF NOT EXISTS idx_advisoryai_turns_conv ON advisoryai.turns(conversation_id); +CREATE INDEX IF NOT EXISTS idx_advisoryai_turns_timestamp ON advisoryai.turns(timestamp); + +-- ============================================================================ +-- Comments for documentation +-- ============================================================================ + +COMMENT ON SCHEMA opsmemory IS 'OpsMemory: Decision ledger for security playbook learning'; +COMMENT ON SCHEMA advisoryai IS 'AdvisoryAI: Chat conversation storage'; + +COMMENT ON TABLE opsmemory.decisions IS 'Stores security decisions and their outcomes for playbook suggestions'; +COMMENT ON TABLE advisoryai.conversations IS 'Stores AI chat conversations with context'; +COMMENT ON TABLE advisoryai.turns IS 'Individual messages in conversations'; + +-- ============================================================================ +-- Grants (adjust as needed for your environment) +-- ============================================================================ + +-- GRANT USAGE ON SCHEMA opsmemory TO stellaops_app; +-- GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA opsmemory TO stellaops_app; + +-- GRANT USAGE ON SCHEMA advisoryai TO stellaops_app; +-- GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA advisoryai TO stellaops_app; diff --git a/deploy/database/migrations/V20260110__reachability_cve_mapping_schema.sql b/deploy/database/migrations/V20260110__reachability_cve_mapping_schema.sql new file mode 100644 index 000000000..e2156acb9 --- /dev/null +++ b/deploy/database/migrations/V20260110__reachability_cve_mapping_schema.sql @@ -0,0 +1,220 @@ +-- CVE-Symbol Mapping PostgreSQL Schema Migration +-- Version: 20260110 +-- Author: StellaOps Agent +-- Sprint: SPRINT_20260109_009_003_BE_cve_symbol_mapping + +-- ============================================================================ +-- Reachability Schema +-- ============================================================================ + +CREATE SCHEMA IF NOT EXISTS reachability; + +-- ============================================================================ +-- CVE-Symbol Mapping Tables +-- ============================================================================ + +-- Mapping source enumeration type +CREATE TYPE reachability.mapping_source AS ENUM ( + 'patch_analysis', + 'osv_advisory', + 'nvd_cpe', + 'manual_curation', + 'fuzzing_corpus', + 'exploit_database', + 'unknown' +); + +-- Vulnerability type enumeration (for taint analysis) +CREATE TYPE reachability.vulnerability_type AS 
ENUM ( + 'source', + 'sink', + 'gadget', + 'both_source_and_sink', + 'unknown' +); + +-- Main CVE-symbol mapping table +CREATE TABLE IF NOT EXISTS reachability.cve_symbol_mappings ( + mapping_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + + -- CVE identification + cve_id TEXT NOT NULL, + cve_id_normalized TEXT NOT NULL GENERATED ALWAYS AS (UPPER(cve_id)) STORED, + + -- Affected package (PURL format) + purl TEXT NOT NULL, + affected_versions TEXT[], -- Version ranges like [">=1.0.0,<2.0.0"] + fixed_versions TEXT[], -- Versions where fix is applied + + -- Vulnerable symbol details + symbol_name TEXT NOT NULL, + canonical_id TEXT, -- Normalized symbol ID from canonicalization service + file_path TEXT, + start_line INTEGER, + end_line INTEGER, + + -- Metadata + source reachability.mapping_source NOT NULL DEFAULT 'unknown', + vulnerability_type reachability.vulnerability_type NOT NULL DEFAULT 'unknown', + confidence DECIMAL(3, 2) NOT NULL DEFAULT 0.5 CHECK (confidence >= 0 AND confidence <= 1), + + -- Provenance + evidence_uri TEXT, -- stella:// URI to evidence + source_commit_url TEXT, + patch_url TEXT, + + -- Timestamps + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + verified_at TIMESTAMPTZ, + verified_by TEXT, + + -- Tenant support + tenant_id TEXT NOT NULL DEFAULT 'default' +); + +-- Vulnerable symbol detail records (for additional symbol metadata) +CREATE TABLE IF NOT EXISTS reachability.vulnerable_symbols ( + symbol_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + mapping_id UUID NOT NULL REFERENCES reachability.cve_symbol_mappings(mapping_id) ON DELETE CASCADE, + + -- Symbol identification + symbol_name TEXT NOT NULL, + canonical_id TEXT, + symbol_type TEXT, -- 'function', 'method', 'class', 'module' + + -- Location + file_path TEXT, + start_line INTEGER, + end_line INTEGER, + + -- Code context + signature TEXT, -- Function signature + containing_class TEXT, + namespace TEXT, + + -- Vulnerability context + vulnerability_type reachability.vulnerability_type NOT NULL DEFAULT 'unknown', + is_entry_point BOOLEAN DEFAULT FALSE, + requires_control_flow BOOLEAN DEFAULT FALSE, + + -- Metadata + confidence DECIMAL(3, 2) NOT NULL DEFAULT 0.5, + notes TEXT, + + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Patch analysis results (cached) +CREATE TABLE IF NOT EXISTS reachability.patch_analysis ( + analysis_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + + -- Source identification + commit_url TEXT NOT NULL UNIQUE, + repository_url TEXT, + commit_sha TEXT, + + -- Analysis results (stored as JSONB for flexibility) + diff_content TEXT, + extracted_symbols JSONB NOT NULL DEFAULT '[]', + language_detected TEXT, + + -- Metadata + analyzed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + analyzer_version TEXT, + + -- Error tracking + analysis_status TEXT NOT NULL DEFAULT 'pending', + error_message TEXT +); + +-- ============================================================================ +-- Indexes +-- ============================================================================ + +-- CVE lookup indexes +CREATE INDEX IF NOT EXISTS idx_cve_mapping_cve_normalized ON reachability.cve_symbol_mappings(cve_id_normalized); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_purl ON reachability.cve_symbol_mappings(purl); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_symbol ON reachability.cve_symbol_mappings(symbol_name); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_canonical ON reachability.cve_symbol_mappings(canonical_id) WHERE canonical_id IS NOT NULL; +CREATE 
INDEX IF NOT EXISTS idx_cve_mapping_tenant ON reachability.cve_symbol_mappings(tenant_id); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_source ON reachability.cve_symbol_mappings(source); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_confidence ON reachability.cve_symbol_mappings(confidence); +CREATE INDEX IF NOT EXISTS idx_cve_mapping_created ON reachability.cve_symbol_mappings(created_at); + +-- Composite index for common queries +CREATE INDEX IF NOT EXISTS idx_cve_mapping_cve_purl ON reachability.cve_symbol_mappings(cve_id_normalized, purl); + +-- Symbol indexes +CREATE INDEX IF NOT EXISTS idx_vuln_symbol_mapping ON reachability.vulnerable_symbols(mapping_id); +CREATE INDEX IF NOT EXISTS idx_vuln_symbol_name ON reachability.vulnerable_symbols(symbol_name); +CREATE INDEX IF NOT EXISTS idx_vuln_symbol_canonical ON reachability.vulnerable_symbols(canonical_id) WHERE canonical_id IS NOT NULL; + +-- Patch analysis indexes +CREATE INDEX IF NOT EXISTS idx_patch_analysis_commit ON reachability.patch_analysis(commit_sha); +CREATE INDEX IF NOT EXISTS idx_patch_analysis_repo ON reachability.patch_analysis(repository_url); + +-- ============================================================================ +-- Full-text search +-- ============================================================================ + +-- Add tsvector column for symbol search +ALTER TABLE reachability.cve_symbol_mappings +ADD COLUMN IF NOT EXISTS symbol_search_vector tsvector +GENERATED ALWAYS AS (to_tsvector('simple', coalesce(symbol_name, '') || ' ' || coalesce(file_path, ''))) STORED; + +CREATE INDEX IF NOT EXISTS idx_cve_mapping_fts ON reachability.cve_symbol_mappings USING GIN(symbol_search_vector); + +-- ============================================================================ +-- Trigger for updated_at +-- ============================================================================ + +CREATE OR REPLACE FUNCTION reachability.update_modified_column() +RETURNS TRIGGER AS $$ +BEGIN + NEW.updated_at = NOW(); + RETURN NEW; +END; +$$ LANGUAGE plpgsql; + +CREATE TRIGGER update_cve_mapping_modtime + BEFORE UPDATE ON reachability.cve_symbol_mappings + FOR EACH ROW + EXECUTE FUNCTION reachability.update_modified_column(); + +-- ============================================================================ +-- Comments for documentation +-- ============================================================================ + +COMMENT ON SCHEMA reachability IS 'Hybrid reachability analysis: CVE-symbol mappings, static/runtime evidence'; + +COMMENT ON TABLE reachability.cve_symbol_mappings IS 'Maps CVE IDs to vulnerable symbols with confidence scores'; +COMMENT ON COLUMN reachability.cve_symbol_mappings.cve_id_normalized IS 'Uppercase normalized CVE ID for case-insensitive lookup'; +COMMENT ON COLUMN reachability.cve_symbol_mappings.canonical_id IS 'Symbol canonical ID from canonicalization service'; +COMMENT ON COLUMN reachability.cve_symbol_mappings.evidence_uri IS 'stella:// URI pointing to evidence bundle'; + +COMMENT ON TABLE reachability.vulnerable_symbols IS 'Additional symbol details for a CVE mapping'; +COMMENT ON TABLE reachability.patch_analysis IS 'Cached patch analysis results for commit URLs'; + +-- ============================================================================ +-- Initial data / seed (optional well-known CVEs for testing) +-- ============================================================================ + +-- Example: Log4Shell (CVE-2021-44228) +INSERT INTO reachability.cve_symbol_mappings (cve_id, purl, 
symbol_name, file_path, source, confidence, vulnerability_type) +VALUES + ('CVE-2021-44228', 'pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1', 'JndiLookup.lookup', 'log4j-core/src/main/java/org/apache/logging/log4j/core/lookup/JndiLookup.java', 'manual_curation', 0.99, 'sink'), + ('CVE-2021-44228', 'pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1', 'JndiManager.lookup', 'log4j-core/src/main/java/org/apache/logging/log4j/core/net/JndiManager.java', 'manual_curation', 0.95, 'sink') +ON CONFLICT DO NOTHING; + +-- Example: Spring4Shell (CVE-2022-22965) +INSERT INTO reachability.cve_symbol_mappings (cve_id, purl, symbol_name, file_path, source, confidence, vulnerability_type) +VALUES + ('CVE-2022-22965', 'pkg:maven/org.springframework/spring-beans@5.3.17', 'CachedIntrospectionResults.getBeanInfo', 'spring-beans/src/main/java/org/springframework/beans/CachedIntrospectionResults.java', 'patch_analysis', 0.90, 'source') +ON CONFLICT DO NOTHING; + +-- Example: polyfill.io supply chain (CVE-2024-38526) +INSERT INTO reachability.cve_symbol_mappings (cve_id, purl, symbol_name, source, confidence, vulnerability_type) +VALUES + ('CVE-2024-38526', 'pkg:npm/polyfill.io', 'window.polyfill', 'manual_curation', 0.85, 'source') +ON CONFLICT DO NOTHING; diff --git a/deploy/database/migrations/V20260117__create_doctor_reports_table.sql b/deploy/database/migrations/V20260117__create_doctor_reports_table.sql new file mode 100644 index 000000000..779138f87 --- /dev/null +++ b/deploy/database/migrations/V20260117__create_doctor_reports_table.sql @@ -0,0 +1,38 @@ +-- ----------------------------------------------------------------------------- +-- V20260117__create_doctor_reports_table.sql +-- Sprint: SPRINT_20260117_025_Doctor_coverage_expansion +-- Task: DOC-EXP-005 - Persistent Report Storage +-- Description: Migration to create doctor_reports table for persistent storage +-- ----------------------------------------------------------------------------- + +-- Doctor reports table for persistent storage +CREATE TABLE IF NOT EXISTS doctor_reports ( + run_id VARCHAR(64) PRIMARY KEY, + started_at TIMESTAMPTZ NOT NULL, + completed_at TIMESTAMPTZ, + overall_severity VARCHAR(16) NOT NULL, + passed_count INTEGER NOT NULL DEFAULT 0, + warning_count INTEGER NOT NULL DEFAULT 0, + failed_count INTEGER NOT NULL DEFAULT 0, + skipped_count INTEGER NOT NULL DEFAULT 0, + info_count INTEGER NOT NULL DEFAULT 0, + total_count INTEGER NOT NULL DEFAULT 0, + report_json_compressed BYTEA NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Index for listing reports by date +CREATE INDEX IF NOT EXISTS idx_doctor_reports_started_at + ON doctor_reports (started_at DESC); + +-- Index for retention cleanup +CREATE INDEX IF NOT EXISTS idx_doctor_reports_created_at + ON doctor_reports (created_at); + +-- Index for filtering by severity +CREATE INDEX IF NOT EXISTS idx_doctor_reports_severity + ON doctor_reports (overall_severity); + +-- Comment on table +COMMENT ON TABLE doctor_reports IS 'Stores Doctor diagnostic reports with compression for audit trail'; +COMMENT ON COLUMN doctor_reports.report_json_compressed IS 'GZip compressed JSON report data'; diff --git a/deploy/database/migrations/V20260117__vex_rekor_linkage.sql b/deploy/database/migrations/V20260117__vex_rekor_linkage.sql new file mode 100644 index 000000000..2b12774b1 --- /dev/null +++ b/deploy/database/migrations/V20260117__vex_rekor_linkage.sql @@ -0,0 +1,153 @@ +-- Migration: V20260117__vex_rekor_linkage.sql +-- Sprint: 
SPRINT_20260117_002_EXCITITOR_vex_rekor_linkage +-- Task: VRL-004, VRL-005 - Create Excititor and VexHub database migrations +-- Description: Add Rekor transparency log linkage columns to VEX tables +-- Author: StellaOps +-- Date: 2026-01-17 + +-- ============================================================================ +-- EXCITITOR SCHEMA: vex_observations table +-- ============================================================================ + +-- Add Rekor linkage columns to vex_observations +ALTER TABLE IF EXISTS excititor.vex_observations +ADD COLUMN IF NOT EXISTS rekor_uuid TEXT, +ADD COLUMN IF NOT EXISTS rekor_log_index BIGINT, +ADD COLUMN IF NOT EXISTS rekor_integrated_time TIMESTAMPTZ, +ADD COLUMN IF NOT EXISTS rekor_log_url TEXT, +ADD COLUMN IF NOT EXISTS rekor_tree_root TEXT, +ADD COLUMN IF NOT EXISTS rekor_tree_size BIGINT, +ADD COLUMN IF NOT EXISTS rekor_inclusion_proof JSONB, +ADD COLUMN IF NOT EXISTS rekor_entry_body_hash TEXT, +ADD COLUMN IF NOT EXISTS rekor_entry_kind TEXT, +ADD COLUMN IF NOT EXISTS rekor_linked_at TIMESTAMPTZ; + +-- Index for Rekor queries by UUID +CREATE INDEX IF NOT EXISTS idx_vex_observations_rekor_uuid +ON excititor.vex_observations(rekor_uuid) +WHERE rekor_uuid IS NOT NULL; + +-- Index for Rekor queries by log index (for ordered traversal) +CREATE INDEX IF NOT EXISTS idx_vex_observations_rekor_log_index +ON excititor.vex_observations(rekor_log_index DESC) +WHERE rekor_log_index IS NOT NULL; + +-- Index for finding unlinked observations (for retry/backfill) +CREATE INDEX IF NOT EXISTS idx_vex_observations_pending_rekor +ON excititor.vex_observations(created_at) +WHERE rekor_uuid IS NULL; + +-- Comment on columns +COMMENT ON COLUMN excititor.vex_observations.rekor_uuid IS 'Rekor entry UUID (64-char hex)'; +COMMENT ON COLUMN excititor.vex_observations.rekor_log_index IS 'Monotonically increasing log position'; +COMMENT ON COLUMN excititor.vex_observations.rekor_integrated_time IS 'Time entry was integrated into Rekor log'; +COMMENT ON COLUMN excititor.vex_observations.rekor_log_url IS 'Rekor server URL where entry was submitted'; +COMMENT ON COLUMN excititor.vex_observations.rekor_tree_root IS 'Merkle tree root hash at submission time (base64)'; +COMMENT ON COLUMN excititor.vex_observations.rekor_tree_size IS 'Tree size at submission time'; +COMMENT ON COLUMN excititor.vex_observations.rekor_inclusion_proof IS 'RFC 6962 inclusion proof for offline verification'; +COMMENT ON COLUMN excititor.vex_observations.rekor_entry_body_hash IS 'SHA-256 hash of entry body'; +COMMENT ON COLUMN excititor.vex_observations.rekor_entry_kind IS 'Entry kind (dsse, intoto, hashedrekord)'; +COMMENT ON COLUMN excititor.vex_observations.rekor_linked_at IS 'When linkage was recorded locally'; + +-- ============================================================================ +-- EXCITITOR SCHEMA: vex_statement_change_events table +-- ============================================================================ + +-- Add Rekor linkage to change events +ALTER TABLE IF EXISTS excititor.vex_statement_change_events +ADD COLUMN IF NOT EXISTS rekor_entry_id TEXT, +ADD COLUMN IF NOT EXISTS rekor_log_index BIGINT; + +-- Index for Rekor queries on change events +CREATE INDEX IF NOT EXISTS idx_vex_change_events_rekor +ON excititor.vex_statement_change_events(rekor_entry_id) +WHERE rekor_entry_id IS NOT NULL; + +COMMENT ON COLUMN excititor.vex_statement_change_events.rekor_entry_id IS 'Rekor entry UUID for change attestation'; +COMMENT ON COLUMN 
excititor.vex_statement_change_events.rekor_log_index IS 'Rekor log index for change attestation'; + +-- ============================================================================ +-- VEXHUB SCHEMA: vex_statements table +-- ============================================================================ + +-- Add Rekor linkage columns to vex_statements +ALTER TABLE IF EXISTS vexhub.vex_statements +ADD COLUMN IF NOT EXISTS rekor_uuid TEXT, +ADD COLUMN IF NOT EXISTS rekor_log_index BIGINT, +ADD COLUMN IF NOT EXISTS rekor_integrated_time TIMESTAMPTZ, +ADD COLUMN IF NOT EXISTS rekor_inclusion_proof JSONB; + +-- Index for Rekor queries +CREATE INDEX IF NOT EXISTS idx_vexhub_statements_rekor_uuid +ON vexhub.vex_statements(rekor_uuid) +WHERE rekor_uuid IS NOT NULL; + +CREATE INDEX IF NOT EXISTS idx_vexhub_statements_rekor_log_index +ON vexhub.vex_statements(rekor_log_index DESC) +WHERE rekor_log_index IS NOT NULL; + +COMMENT ON COLUMN vexhub.vex_statements.rekor_uuid IS 'Rekor entry UUID for statement attestation'; +COMMENT ON COLUMN vexhub.vex_statements.rekor_log_index IS 'Rekor log index for statement attestation'; +COMMENT ON COLUMN vexhub.vex_statements.rekor_integrated_time IS 'Time statement was integrated into Rekor log'; +COMMENT ON COLUMN vexhub.vex_statements.rekor_inclusion_proof IS 'RFC 6962 inclusion proof for offline verification'; + +-- ============================================================================ +-- ATTESTOR SCHEMA: rekor_entries verification tracking +-- Sprint: SPRINT_20260117_001_ATTESTOR_periodic_rekor_verification (PRV-003) +-- ============================================================================ + +-- Add verification tracking columns to existing rekor_entries table +ALTER TABLE IF EXISTS attestor.rekor_entries +ADD COLUMN IF NOT EXISTS last_verified_at TIMESTAMPTZ, +ADD COLUMN IF NOT EXISTS verification_count INT NOT NULL DEFAULT 0, +ADD COLUMN IF NOT EXISTS last_verification_result TEXT; + +-- Index for verification queries (find entries needing verification) +CREATE INDEX IF NOT EXISTS idx_rekor_entries_verification +ON attestor.rekor_entries(created_at DESC, last_verified_at NULLS FIRST) +WHERE last_verification_result IS DISTINCT FROM 'invalid'; + +-- Index for finding never-verified entries +CREATE INDEX IF NOT EXISTS idx_rekor_entries_unverified +ON attestor.rekor_entries(created_at DESC) +WHERE last_verified_at IS NULL; + +COMMENT ON COLUMN attestor.rekor_entries.last_verified_at IS 'Timestamp of last successful verification'; +COMMENT ON COLUMN attestor.rekor_entries.verification_count IS 'Number of times entry has been verified'; +COMMENT ON COLUMN attestor.rekor_entries.last_verification_result IS 'Result of last verification: valid, invalid, skipped'; + +-- ============================================================================ +-- ATTESTOR SCHEMA: rekor_root_checkpoints table +-- Stores tree root checkpoints for consistency verification +-- ============================================================================ + +CREATE TABLE IF NOT EXISTS attestor.rekor_root_checkpoints ( + id BIGSERIAL PRIMARY KEY, + tree_root TEXT NOT NULL, + tree_size BIGINT NOT NULL, + log_id TEXT NOT NULL, + log_url TEXT, + checkpoint_envelope TEXT, + captured_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + verified_at TIMESTAMPTZ, + is_consistent BOOLEAN, + inconsistency_reason TEXT, + CONSTRAINT uq_root_checkpoint UNIQUE (log_id, tree_root, tree_size) +); + +-- Index for finding latest checkpoints per log +CREATE INDEX IF NOT EXISTS 
idx_rekor_root_checkpoints_latest +ON attestor.rekor_root_checkpoints(log_id, captured_at DESC); + +-- Index for consistency verification +CREATE INDEX IF NOT EXISTS idx_rekor_root_checkpoints_unverified +ON attestor.rekor_root_checkpoints(captured_at DESC) +WHERE verified_at IS NULL; + +COMMENT ON TABLE attestor.rekor_root_checkpoints IS 'Stores Rekor tree root checkpoints for consistency verification'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.tree_root IS 'Merkle tree root hash (base64)'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.tree_size IS 'Tree size at checkpoint'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.log_id IS 'Rekor log identifier (hash of public key)'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.checkpoint_envelope IS 'Signed checkpoint in note format'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.is_consistent IS 'Whether checkpoint was consistent with previous'; +COMMENT ON COLUMN attestor.rekor_root_checkpoints.inconsistency_reason IS 'Reason for inconsistency if detected'; diff --git a/deploy/database/migrations/V20260119_001__Add_UnderReview_Escalated_Rejected_States.sql b/deploy/database/migrations/V20260119_001__Add_UnderReview_Escalated_Rejected_States.sql new file mode 100644 index 000000000..1e41173c6 --- /dev/null +++ b/deploy/database/migrations/V20260119_001__Add_UnderReview_Escalated_Rejected_States.sql @@ -0,0 +1,139 @@ +-- ----------------------------------------------------------------------------- +-- V20260119_001__Add_UnderReview_Escalated_Rejected_States.sql +-- Sprint: SPRINT_20260118_018_Unknowns_queue_enhancement +-- Task: UQ-005 - Migration for existing entries (map to new states) +-- Description: Adds new state machine states and required columns +-- ----------------------------------------------------------------------------- + +-- Add new columns for UnderReview and Escalated states +ALTER TABLE grey_queue_entries +ADD COLUMN IF NOT EXISTS assignee VARCHAR(255) NULL, +ADD COLUMN IF NOT EXISTS assigned_at TIMESTAMPTZ NULL, +ADD COLUMN IF NOT EXISTS escalated_at TIMESTAMPTZ NULL, +ADD COLUMN IF NOT EXISTS escalation_reason TEXT NULL; + +-- Add new enum values to grey_queue_status +-- Note: PostgreSQL requires special handling for enum additions + +-- First, check if we need to add the values (idempotent) +DO $$ +BEGIN + -- Add 'under_review' if not exists + IF NOT EXISTS ( + SELECT 1 FROM pg_enum + WHERE enumlabel = 'under_review' + AND enumtypid = 'grey_queue_status'::regtype + ) THEN + ALTER TYPE grey_queue_status ADD VALUE 'under_review' AFTER 'retrying'; + END IF; + + -- Add 'escalated' if not exists + IF NOT EXISTS ( + SELECT 1 FROM pg_enum + WHERE enumlabel = 'escalated' + AND enumtypid = 'grey_queue_status'::regtype + ) THEN + ALTER TYPE grey_queue_status ADD VALUE 'escalated' AFTER 'under_review'; + END IF; + + -- Add 'rejected' if not exists + IF NOT EXISTS ( + SELECT 1 FROM pg_enum + WHERE enumlabel = 'rejected' + AND enumtypid = 'grey_queue_status'::regtype + ) THEN + ALTER TYPE grey_queue_status ADD VALUE 'rejected' AFTER 'resolved'; + END IF; +EXCEPTION + WHEN others THEN + -- Enum values may already exist, which is fine + NULL; +END $$; + +-- Add indexes for new query patterns +CREATE INDEX IF NOT EXISTS idx_grey_queue_assignee + ON grey_queue_entries(assignee) + WHERE assignee IS NOT NULL; + +CREATE INDEX IF NOT EXISTS idx_grey_queue_status_assignee + ON grey_queue_entries(status, assignee) + WHERE status IN ('under_review', 'escalated'); + +CREATE INDEX IF NOT EXISTS idx_grey_queue_escalated_at 
+ ON grey_queue_entries(escalated_at DESC) + WHERE escalated_at IS NOT NULL; + +-- Add audit trigger for state transitions +CREATE TABLE IF NOT EXISTS grey_queue_state_transitions ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + entry_id UUID NOT NULL REFERENCES grey_queue_entries(id), + tenant_id VARCHAR(128) NOT NULL, + from_state VARCHAR(32) NOT NULL, + to_state VARCHAR(32) NOT NULL, + transitioned_by VARCHAR(255), + reason TEXT, + transitioned_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + metadata JSONB +); + +CREATE INDEX IF NOT EXISTS idx_grey_queue_transitions_entry + ON grey_queue_state_transitions(entry_id); + +CREATE INDEX IF NOT EXISTS idx_grey_queue_transitions_tenant_time + ON grey_queue_state_transitions(tenant_id, transitioned_at DESC); + +-- Function to record state transitions +CREATE OR REPLACE FUNCTION record_grey_queue_transition() +RETURNS TRIGGER AS $$ +BEGIN + IF OLD.status IS DISTINCT FROM NEW.status THEN + INSERT INTO grey_queue_state_transitions ( + entry_id, tenant_id, from_state, to_state, + transitioned_by, transitioned_at + ) VALUES ( + NEW.id, + NEW.tenant_id, + OLD.status::text, + NEW.status::text, + COALESCE(NEW.assignee, current_user), + NOW() + ); + END IF; + RETURN NEW; +END; +$$ LANGUAGE plpgsql; + +-- Create trigger if not exists +DROP TRIGGER IF EXISTS trg_grey_queue_state_transition ON grey_queue_entries; +CREATE TRIGGER trg_grey_queue_state_transition + AFTER UPDATE ON grey_queue_entries + FOR EACH ROW + EXECUTE FUNCTION record_grey_queue_transition(); + +-- Update summary view to include new states +CREATE OR REPLACE VIEW grey_queue_summary AS +SELECT + tenant_id, + COUNT(*) FILTER (WHERE status = 'pending') as pending_count, + COUNT(*) FILTER (WHERE status = 'processing') as processing_count, + COUNT(*) FILTER (WHERE status = 'retrying') as retrying_count, + COUNT(*) FILTER (WHERE status = 'under_review') as under_review_count, + COUNT(*) FILTER (WHERE status = 'escalated') as escalated_count, + COUNT(*) FILTER (WHERE status = 'resolved') as resolved_count, + COUNT(*) FILTER (WHERE status = 'rejected') as rejected_count, + COUNT(*) FILTER (WHERE status = 'failed') as failed_count, + COUNT(*) FILTER (WHERE status = 'expired') as expired_count, + COUNT(*) FILTER (WHERE status = 'dismissed') as dismissed_count, + COUNT(*) as total_count +FROM grey_queue_entries +GROUP BY tenant_id; + +-- Comment for documentation +COMMENT ON COLUMN grey_queue_entries.assignee IS + 'Assignee for entries in UnderReview state (Sprint UQ-005)'; +COMMENT ON COLUMN grey_queue_entries.assigned_at IS + 'When the entry was assigned for review (Sprint UQ-005)'; +COMMENT ON COLUMN grey_queue_entries.escalated_at IS + 'When the entry was escalated to security team (Sprint UQ-005)'; +COMMENT ON COLUMN grey_queue_entries.escalation_reason IS + 'Reason for escalation (Sprint UQ-005)'; diff --git a/deploy/database/migrations/V20260119__scanner_layer_diffid.sql b/deploy/database/migrations/V20260119__scanner_layer_diffid.sql new file mode 100644 index 000000000..d860ecbcf --- /dev/null +++ b/deploy/database/migrations/V20260119__scanner_layer_diffid.sql @@ -0,0 +1,130 @@ +-- Migration: Add diff_id column to scanner layers table +-- Sprint: SPRINT_025_Scanner_layer_manifest_infrastructure +-- Task: TASK-025-03 + +-- Add diff_id column to layers table (sha256:64hex = 71 chars) +ALTER TABLE scanner.layers +ADD COLUMN IF NOT EXISTS diff_id VARCHAR(71); + +-- Add timestamp for when diffID was computed +ALTER TABLE scanner.layers +ADD COLUMN IF NOT EXISTS diff_id_computed_at_utc TIMESTAMP; 
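+
+-- Illustrative lookup once diff_id is populated (the digest value below is a
+-- hypothetical placeholder, not data created by this migration):
+--   SELECT * FROM scanner.layers WHERE diff_id = 'sha256:<64-hex-digest>';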
+ +-- Create index on diff_id for fast lookups +CREATE INDEX IF NOT EXISTS idx_layers_diff_id +ON scanner.layers (diff_id) +WHERE diff_id IS NOT NULL; + +-- Create image_layers junction table if it doesn't exist +-- This tracks which layers belong to which images +CREATE TABLE IF NOT EXISTS scanner.image_layers ( + image_reference VARCHAR(512) NOT NULL, + layer_digest VARCHAR(71) NOT NULL, + layer_index INT NOT NULL, + created_at_utc TIMESTAMP NOT NULL DEFAULT NOW(), + PRIMARY KEY (image_reference, layer_digest) +); + +CREATE INDEX IF NOT EXISTS idx_image_layers_digest +ON scanner.image_layers (layer_digest); + +-- DiffID cache table for resolved diffIDs +CREATE TABLE IF NOT EXISTS scanner.scanner_diffid_cache ( + layer_digest VARCHAR(71) PRIMARY KEY, + diff_id VARCHAR(71) NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +-- Base image fingerprint tables for layer reuse detection +CREATE TABLE IF NOT EXISTS scanner.scanner_base_image_fingerprints ( + image_reference VARCHAR(512) PRIMARY KEY, + layer_count INT NOT NULL, + registered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + detection_count BIGINT NOT NULL DEFAULT 0 +); + +CREATE TABLE IF NOT EXISTS scanner.scanner_base_image_layers ( + image_reference VARCHAR(512) NOT NULL REFERENCES scanner.scanner_base_image_fingerprints(image_reference) ON DELETE CASCADE, + layer_index INT NOT NULL, + diff_id VARCHAR(71) NOT NULL, + PRIMARY KEY (image_reference, layer_index) +); + +CREATE INDEX IF NOT EXISTS idx_base_image_layers_diff_id +ON scanner.scanner_base_image_layers (diff_id); + +-- Manifest snapshots table for IOciManifestSnapshotService +CREATE TABLE IF NOT EXISTS scanner.manifest_snapshots ( + id UUID PRIMARY KEY, + image_reference VARCHAR(512) NOT NULL, + registry VARCHAR(256) NOT NULL, + repository VARCHAR(256) NOT NULL, + tag VARCHAR(128), + manifest_digest VARCHAR(71) NOT NULL, + config_digest VARCHAR(71) NOT NULL, + media_type VARCHAR(128) NOT NULL, + layers JSONB NOT NULL, + diff_ids JSONB NOT NULL, + platform JSONB, + total_size BIGINT NOT NULL, + captured_at TIMESTAMPTZ NOT NULL, + snapshot_version VARCHAR(32), + UNIQUE (manifest_digest) +); + +CREATE INDEX IF NOT EXISTS idx_manifest_snapshots_image_ref +ON scanner.manifest_snapshots (image_reference); + +CREATE INDEX IF NOT EXISTS idx_manifest_snapshots_repository +ON scanner.manifest_snapshots (registry, repository); + +CREATE INDEX IF NOT EXISTS idx_manifest_snapshots_captured_at +ON scanner.manifest_snapshots (captured_at DESC); + +-- Layer scan history for reuse detection (TASK-025-04) +CREATE TABLE IF NOT EXISTS scanner.layer_scans ( + diff_id VARCHAR(71) PRIMARY KEY, + scanned_at TIMESTAMPTZ NOT NULL, + finding_count INT, + scanned_by VARCHAR(128) NOT NULL, + scanner_version VARCHAR(64) +); + +-- Layer reuse counts for statistics +CREATE TABLE IF NOT EXISTS scanner.layer_reuse_counts ( + diff_id VARCHAR(71) PRIMARY KEY, + reuse_count INT NOT NULL DEFAULT 1, + first_seen_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +CREATE INDEX IF NOT EXISTS idx_layer_reuse_counts_count +ON scanner.layer_reuse_counts (reuse_count DESC); + +COMMENT ON COLUMN scanner.layers.diff_id IS 'Uncompressed layer content hash (sha256:hex64). Immutable once computed.'; +COMMENT ON TABLE scanner.scanner_diffid_cache IS 'Cache of layer digest to diffID mappings. 
Layer digests are immutable so cache entries never expire.'; +COMMENT ON TABLE scanner.scanner_base_image_fingerprints IS 'Known base image fingerprints for layer reuse detection.'; +COMMENT ON TABLE scanner.manifest_snapshots IS 'Point-in-time captures of OCI image manifests for delta scanning.'; +COMMENT ON TABLE scanner.layer_scans IS 'History of layer scans for deduplication. One entry per diffID.'; +COMMENT ON TABLE scanner.layer_reuse_counts IS 'Counts of how many times each layer appears across images.'; + +-- Layer SBOM CAS for per-layer SBOM storage (TASK-026-02) +CREATE TABLE IF NOT EXISTS scanner.layer_sbom_cas ( + diff_id VARCHAR(71) NOT NULL, + format VARCHAR(20) NOT NULL, + content BYTEA NOT NULL, + size_bytes BIGINT NOT NULL, + compressed BOOLEAN NOT NULL DEFAULT TRUE, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + last_accessed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + PRIMARY KEY (diff_id, format) +); + +CREATE INDEX IF NOT EXISTS idx_layer_sbom_cas_last_accessed +ON scanner.layer_sbom_cas (last_accessed_at); + +CREATE INDEX IF NOT EXISTS idx_layer_sbom_cas_format +ON scanner.layer_sbom_cas (format); + +COMMENT ON TABLE scanner.layer_sbom_cas IS 'Content-addressable storage for per-layer SBOMs. Keyed by diffID (immutable).'; +COMMENT ON COLUMN scanner.layer_sbom_cas.content IS 'Compressed (gzip) SBOM content.'; +COMMENT ON COLUMN scanner.layer_sbom_cas.last_accessed_at IS 'For TTL-based eviction of cold entries.'; diff --git a/deploy/database/postgres-partitioning/001_partition_infrastructure.sql b/deploy/database/postgres-partitioning/001_partition_infrastructure.sql new file mode 100644 index 000000000..7aedf2e69 --- /dev/null +++ b/deploy/database/postgres-partitioning/001_partition_infrastructure.sql @@ -0,0 +1,561 @@ +-- Partitioning Infrastructure Migration 001: Foundation +-- Sprint: SPRINT_3422_0001_0001 - Time-Based Partitioning +-- Category: C (infrastructure setup, requires planned maintenance) +-- +-- Purpose: Create partition management infrastructure including: +-- - Helper functions for partition creation and maintenance +-- - Utility functions for BRIN index optimization +-- - Partition maintenance scheduling support +-- +-- This migration creates the foundation; table conversion is done in separate migrations. 
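+--
+-- Illustrative usage after this migration is applied (the schema/table/column names
+-- below are hypothetical examples, not objects created here; ensure_future_partitions
+-- only acts on tables registered in partition_mgmt.managed_tables):
+--   SELECT partition_mgmt.create_monthly_partitions('telemetry', 'events_raw', 'recorded_at', DATE '2026-01-01', 3);
+--   SELECT partition_mgmt.ensure_future_partitions('telemetry', 'events_raw', 3);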
+ +BEGIN; + +-- ============================================================================ +-- Step 1: Create partition management schema +-- ============================================================================ + +CREATE SCHEMA IF NOT EXISTS partition_mgmt; + +COMMENT ON SCHEMA partition_mgmt IS + 'Partition management utilities for time-series tables'; + +-- ============================================================================ +-- Step 2: Managed table registration +-- ============================================================================ + +CREATE TABLE IF NOT EXISTS partition_mgmt.managed_tables ( + schema_name TEXT NOT NULL, + table_name TEXT NOT NULL, + partition_key TEXT NOT NULL, + partition_type TEXT NOT NULL, + retention_months INT NOT NULL DEFAULT 0, + months_ahead INT NOT NULL DEFAULT 3, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + PRIMARY KEY (schema_name, table_name) +); + +COMMENT ON TABLE partition_mgmt.managed_tables IS + 'Tracks partitioned tables with retention and creation settings'; + +-- ============================================================================ +-- Step 3: Partition creation function +-- ============================================================================ + +-- Creates a new partition for a given table and date range +CREATE OR REPLACE FUNCTION partition_mgmt.create_partition( + p_schema_name TEXT, + p_table_name TEXT, + p_partition_column TEXT, + p_start_date DATE, + p_end_date DATE, + p_partition_suffix TEXT DEFAULT NULL +) +RETURNS TEXT +LANGUAGE plpgsql +AS $$ +DECLARE + v_partition_name TEXT; + v_parent_table TEXT; + v_sql TEXT; +BEGIN + v_parent_table := format('%I.%I', p_schema_name, p_table_name); + + -- Generate partition name: tablename_YYYY_MM or tablename_YYYY_Q# + IF p_partition_suffix IS NOT NULL THEN + v_partition_name := format('%s_%s', p_table_name, p_partition_suffix); + ELSE + v_partition_name := format('%s_%s', p_table_name, to_char(p_start_date, 'YYYY_MM')); + END IF; + + -- Check if partition already exists + IF EXISTS ( + SELECT 1 FROM pg_class c + JOIN pg_namespace n ON c.relnamespace = n.oid + WHERE n.nspname = p_schema_name AND c.relname = v_partition_name + ) THEN + RAISE NOTICE 'Partition % already exists, skipping', v_partition_name; + RETURN v_partition_name; + END IF; + + -- Create partition + v_sql := format( + 'CREATE TABLE %I.%I PARTITION OF %s FOR VALUES FROM (%L) TO (%L)', + p_schema_name, + v_partition_name, + v_parent_table, + p_start_date, + p_end_date + ); + + EXECUTE v_sql; + + RAISE NOTICE 'Created partition %.%', p_schema_name, v_partition_name; + RETURN v_partition_name; +END; +$$; + +-- ============================================================================ +-- Step 4: Monthly partition creation helper +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.create_monthly_partitions( + p_schema_name TEXT, + p_table_name TEXT, + p_partition_column TEXT, + p_start_month DATE, + p_months_ahead INT DEFAULT 3 +) +RETURNS SETOF TEXT +LANGUAGE plpgsql +AS $$ +DECLARE + v_current_month DATE; + v_end_month DATE; + v_partition_name TEXT; +BEGIN + v_current_month := date_trunc('month', p_start_month)::DATE; + v_end_month := date_trunc('month', NOW() + (p_months_ahead || ' months')::INTERVAL)::DATE; + + WHILE v_current_month <= v_end_month LOOP + v_partition_name := partition_mgmt.create_partition( + p_schema_name, + p_table_name, + p_partition_column, + v_current_month, + (v_current_month + INTERVAL '1 
month')::DATE + ); + RETURN NEXT v_partition_name; + v_current_month := (v_current_month + INTERVAL '1 month')::DATE; + END LOOP; +END; +$$; + +-- ============================================================================ +-- Step 5: Quarterly partition creation helper +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.create_quarterly_partitions( + p_schema_name TEXT, + p_table_name TEXT, + p_partition_column TEXT, + p_start_quarter DATE, + p_quarters_ahead INT DEFAULT 2 +) +RETURNS SETOF TEXT +LANGUAGE plpgsql +AS $$ +DECLARE + v_current_quarter DATE; + v_end_quarter DATE; + v_partition_name TEXT; + v_suffix TEXT; +BEGIN + v_current_quarter := date_trunc('quarter', p_start_quarter)::DATE; + v_end_quarter := date_trunc('quarter', NOW() + (p_quarters_ahead * 3 || ' months')::INTERVAL)::DATE; + + WHILE v_current_quarter <= v_end_quarter LOOP + -- Generate suffix like 2025_Q1, 2025_Q2, etc. + v_suffix := to_char(v_current_quarter, 'YYYY') || '_Q' || + EXTRACT(QUARTER FROM v_current_quarter)::TEXT; + + v_partition_name := partition_mgmt.create_partition( + p_schema_name, + p_table_name, + p_partition_column, + v_current_quarter, + (v_current_quarter + INTERVAL '3 months')::DATE, + v_suffix + ); + RETURN NEXT v_partition_name; + v_current_quarter := (v_current_quarter + INTERVAL '3 months')::DATE; + END LOOP; +END; +$$; + +-- ============================================================================ +-- Step 6: Ensure future partitions exist +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.ensure_future_partitions( + p_schema_name TEXT, + p_table_name TEXT, + p_months_ahead INT +) +RETURNS INT +LANGUAGE plpgsql +AS $$ +DECLARE + v_partition_key TEXT; + v_partition_type TEXT; + v_months_ahead INT; + v_created INT := 0; + v_current DATE; + v_end DATE; + v_suffix TEXT; + v_partition_name TEXT; +BEGIN + SELECT partition_key, partition_type, months_ahead + INTO v_partition_key, v_partition_type, v_months_ahead + FROM partition_mgmt.managed_tables + WHERE schema_name = p_schema_name + AND table_name = p_table_name; + + IF v_partition_key IS NULL THEN + RETURN 0; + END IF; + + IF p_months_ahead IS NOT NULL AND p_months_ahead > 0 THEN + v_months_ahead := p_months_ahead; + END IF; + + IF v_months_ahead IS NULL OR v_months_ahead <= 0 THEN + RETURN 0; + END IF; + + v_partition_type := lower(coalesce(v_partition_type, 'monthly')); + + IF v_partition_type = 'monthly' THEN + v_current := date_trunc('month', NOW())::DATE; + v_end := date_trunc('month', NOW() + (v_months_ahead || ' months')::INTERVAL)::DATE; + + WHILE v_current <= v_end LOOP + v_partition_name := format('%s_%s', p_table_name, to_char(v_current, 'YYYY_MM')); + IF NOT EXISTS ( + SELECT 1 FROM pg_class c + JOIN pg_namespace n ON c.relnamespace = n.oid + WHERE n.nspname = p_schema_name AND c.relname = v_partition_name + ) THEN + PERFORM partition_mgmt.create_partition( + p_schema_name, + p_table_name, + v_partition_key, + v_current, + (v_current + INTERVAL '1 month')::DATE + ); + v_created := v_created + 1; + END IF; + + v_current := (v_current + INTERVAL '1 month')::DATE; + END LOOP; + ELSIF v_partition_type = 'quarterly' THEN + v_current := date_trunc('quarter', NOW())::DATE; + v_end := date_trunc('quarter', NOW() + (v_months_ahead || ' months')::INTERVAL)::DATE; + + WHILE v_current <= v_end LOOP + v_suffix := to_char(v_current, 'YYYY') || '_Q' || + EXTRACT(QUARTER FROM v_current)::TEXT; + 
v_partition_name := format('%s_%s', p_table_name, v_suffix); + + IF NOT EXISTS ( + SELECT 1 FROM pg_class c + JOIN pg_namespace n ON c.relnamespace = n.oid + WHERE n.nspname = p_schema_name AND c.relname = v_partition_name + ) THEN + PERFORM partition_mgmt.create_partition( + p_schema_name, + p_table_name, + v_partition_key, + v_current, + (v_current + INTERVAL '3 months')::DATE, + v_suffix + ); + v_created := v_created + 1; + END IF; + + v_current := (v_current + INTERVAL '3 months')::DATE; + END LOOP; + END IF; + + RETURN v_created; +END; +$$; + +-- ============================================================================ +-- Step 7: Retention enforcement function +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.enforce_retention( + p_schema_name TEXT, + p_table_name TEXT, + p_retention_months INT +) +RETURNS INT +LANGUAGE plpgsql +AS $$ +DECLARE + v_retention_months INT; + v_cutoff_date DATE; + v_partition RECORD; + v_dropped INT := 0; +BEGIN + SELECT retention_months + INTO v_retention_months + FROM partition_mgmt.managed_tables + WHERE schema_name = p_schema_name + AND table_name = p_table_name; + + IF p_retention_months IS NOT NULL AND p_retention_months > 0 THEN + v_retention_months := p_retention_months; + END IF; + + IF v_retention_months IS NULL OR v_retention_months <= 0 THEN + RETURN 0; + END IF; + + v_cutoff_date := (NOW() - (v_retention_months || ' months')::INTERVAL)::DATE; + + FOR v_partition IN + SELECT partition_name, partition_end + FROM partition_mgmt.partition_stats + WHERE schema_name = p_schema_name + AND table_name = p_table_name + LOOP + IF v_partition.partition_end IS NOT NULL AND v_partition.partition_end < v_cutoff_date THEN + EXECUTE format('DROP TABLE IF EXISTS %I.%I', p_schema_name, v_partition.partition_name); + v_dropped := v_dropped + 1; + END IF; + END LOOP; + + RETURN v_dropped; +END; +$$; + +-- ============================================================================ +-- Step 8: Partition detach and archive function +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.detach_partition( + p_schema_name TEXT, + p_table_name TEXT, + p_partition_name TEXT, + p_archive_schema TEXT DEFAULT 'archive' +) +RETURNS BOOLEAN +LANGUAGE plpgsql +AS $$ +DECLARE + v_parent_table TEXT; + v_partition_full TEXT; + v_archive_table TEXT; +BEGIN + v_parent_table := format('%I.%I', p_schema_name, p_table_name); + v_partition_full := format('%I.%I', p_schema_name, p_partition_name); + v_archive_table := format('%I.%I', p_archive_schema, p_partition_name); + + -- Create archive schema if not exists + EXECUTE format('CREATE SCHEMA IF NOT EXISTS %I', p_archive_schema); + + -- Detach partition + EXECUTE format( + 'ALTER TABLE %s DETACH PARTITION %s', + v_parent_table, + v_partition_full + ); + + -- Move to archive schema + EXECUTE format( + 'ALTER TABLE %s SET SCHEMA %I', + v_partition_full, + p_archive_schema + ); + + RAISE NOTICE 'Detached and archived partition % to %', p_partition_name, v_archive_table; + RETURN TRUE; +EXCEPTION + WHEN OTHERS THEN + RAISE WARNING 'Failed to detach partition %: %', p_partition_name, SQLERRM; + RETURN FALSE; +END; +$$; + +-- ============================================================================ +-- Step 9: Partition retention cleanup function +-- ============================================================================ + +CREATE OR REPLACE FUNCTION 
partition_mgmt.cleanup_old_partitions( + p_schema_name TEXT, + p_table_name TEXT, + p_retention_months INT, + p_archive_schema TEXT DEFAULT 'archive', + p_dry_run BOOLEAN DEFAULT TRUE +) +RETURNS TABLE(partition_name TEXT, action TEXT) +LANGUAGE plpgsql +AS $$ +DECLARE + v_cutoff_date DATE; + v_partition RECORD; + v_partition_end DATE; +BEGIN + v_cutoff_date := (NOW() - (p_retention_months || ' months')::INTERVAL)::DATE; + + FOR v_partition IN + SELECT c.relname as name, + pg_get_expr(c.relpartbound, c.oid) as bound_expr + FROM pg_class c + JOIN pg_namespace n ON c.relnamespace = n.oid + JOIN pg_inherits i ON c.oid = i.inhrelid + JOIN pg_class parent ON i.inhparent = parent.oid + WHERE n.nspname = p_schema_name + AND parent.relname = p_table_name + AND c.relkind = 'r' + LOOP + -- Parse the partition bound to get end date + -- Format: FOR VALUES FROM ('2024-01-01') TO ('2024-02-01') + v_partition_end := (regexp_match(v_partition.bound_expr, + 'TO \(''([^'']+)''\)'))[1]::DATE; + + IF v_partition_end IS NOT NULL AND v_partition_end < v_cutoff_date THEN + partition_name := v_partition.name; + + IF p_dry_run THEN + action := 'WOULD_ARCHIVE'; + ELSE + IF partition_mgmt.detach_partition( + p_schema_name, p_table_name, v_partition.name, p_archive_schema + ) THEN + action := 'ARCHIVED'; + ELSE + action := 'FAILED'; + END IF; + END IF; + + RETURN NEXT; + END IF; + END LOOP; +END; +$$; + +-- ============================================================================ +-- Step 10: Partition statistics view +-- ============================================================================ + +CREATE OR REPLACE VIEW partition_mgmt.partition_stats AS +SELECT + n.nspname AS schema_name, + parent.relname AS table_name, + c.relname AS partition_name, + pg_get_expr(c.relpartbound, c.oid) AS partition_range, + (regexp_match(pg_get_expr(c.relpartbound, c.oid), 'FROM \(''([^'']+)''\)'))[1]::DATE AS partition_start, + (regexp_match(pg_get_expr(c.relpartbound, c.oid), 'TO \(''([^'']+)''\)'))[1]::DATE AS partition_end, + pg_size_pretty(pg_relation_size(c.oid)) AS size, + pg_relation_size(c.oid) AS size_bytes, + COALESCE(s.n_live_tup, 0) AS estimated_rows, + s.last_vacuum, + s.last_autovacuum, + s.last_analyze, + s.last_autoanalyze +FROM pg_class c +JOIN pg_namespace n ON c.relnamespace = n.oid +JOIN pg_inherits i ON c.oid = i.inhrelid +JOIN pg_class parent ON i.inhparent = parent.oid +LEFT JOIN pg_stat_user_tables s ON c.oid = s.relid +WHERE c.relkind = 'r' + AND parent.relkind = 'p' +ORDER BY n.nspname, parent.relname, c.relname; + +COMMENT ON VIEW partition_mgmt.partition_stats IS + 'Statistics for all partitioned tables in the database'; + +-- ============================================================================ +-- Step 11: BRIN index optimization helper +-- ============================================================================ + +CREATE OR REPLACE FUNCTION partition_mgmt.create_brin_index_if_not_exists( + p_schema_name TEXT, + p_table_name TEXT, + p_column_name TEXT, + p_pages_per_range INT DEFAULT 128 +) +RETURNS BOOLEAN +LANGUAGE plpgsql +AS $$ +DECLARE + v_index_name TEXT; + v_sql TEXT; +BEGIN + v_index_name := format('brin_%s_%s', p_table_name, p_column_name); + + -- Check if index exists + IF EXISTS ( + SELECT 1 FROM pg_indexes + WHERE schemaname = p_schema_name AND indexname = v_index_name + ) THEN + RAISE NOTICE 'BRIN index % already exists', v_index_name; + RETURN FALSE; + END IF; + + v_sql := format( + 'CREATE INDEX %I ON %I.%I USING brin (%I) WITH (pages_per_range = %s)', + v_index_name, + 
p_schema_name, + p_table_name, + p_column_name, + p_pages_per_range + ); + + EXECUTE v_sql; + + RAISE NOTICE 'Created BRIN index % on %.%(%)', + v_index_name, p_schema_name, p_table_name, p_column_name; + RETURN TRUE; +END; +$$; + +-- ============================================================================ +-- Step 12: Maintenance job tracking table +-- ============================================================================ + +CREATE TABLE IF NOT EXISTS partition_mgmt.maintenance_log ( + id BIGSERIAL PRIMARY KEY, + operation TEXT NOT NULL, + schema_name TEXT NOT NULL, + table_name TEXT NOT NULL, + partition_name TEXT, + status TEXT NOT NULL DEFAULT 'started', + details JSONB NOT NULL DEFAULT '{}', + started_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + completed_at TIMESTAMPTZ, + error_message TEXT +); + +CREATE INDEX idx_maintenance_log_table ON partition_mgmt.maintenance_log(schema_name, table_name); +CREATE INDEX idx_maintenance_log_status ON partition_mgmt.maintenance_log(status, started_at); + +-- ============================================================================ +-- Step 13: Archive schema for detached partitions +-- ============================================================================ + +CREATE SCHEMA IF NOT EXISTS archive; + +COMMENT ON SCHEMA archive IS + 'Storage for detached/archived partitions awaiting deletion or offload'; + +COMMIT; + +-- ============================================================================ +-- Usage Examples (commented out) +-- ============================================================================ + +/* +-- Create monthly partitions for audit table, 3 months ahead +SELECT partition_mgmt.create_monthly_partitions( + 'scheduler', 'audit', 'created_at', '2024-01-01'::DATE, 3 +); + +-- Preview old partitions that would be archived (dry run) +SELECT * FROM partition_mgmt.cleanup_old_partitions( + 'scheduler', 'audit', 12, 'archive', TRUE +); + +-- Actually archive old partitions +SELECT * FROM partition_mgmt.cleanup_old_partitions( + 'scheduler', 'audit', 12, 'archive', FALSE +); + +-- View partition statistics +SELECT * FROM partition_mgmt.partition_stats +WHERE schema_name = 'scheduler' +ORDER BY table_name, partition_name; +*/ diff --git a/deploy/database/postgres-partitioning/002_calibration_schema.sql b/deploy/database/postgres-partitioning/002_calibration_schema.sql new file mode 100644 index 000000000..f5341201f --- /dev/null +++ b/deploy/database/postgres-partitioning/002_calibration_schema.sql @@ -0,0 +1,143 @@ +-- Migration: Trust Vector Calibration Schema +-- Sprint: 7100.0002.0002 +-- Description: Creates schema and tables for trust vector calibration system + +-- Create calibration schema +CREATE SCHEMA IF NOT EXISTS excititor_calibration; + +-- Calibration manifests table +-- Stores signed manifests for each calibration epoch +CREATE TABLE IF NOT EXISTS excititor_calibration.calibration_manifests ( + manifest_id TEXT PRIMARY KEY, + tenant_id TEXT NOT NULL, + epoch_number INTEGER NOT NULL, + epoch_start_utc TIMESTAMP NOT NULL, + epoch_end_utc TIMESTAMP NOT NULL, + sample_count INTEGER NOT NULL, + learning_rate DOUBLE PRECISION NOT NULL, + policy_hash TEXT, + lattice_version TEXT NOT NULL, + manifest_json JSONB NOT NULL, + signature_envelope JSONB, + created_at_utc TIMESTAMP NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC'), + created_by TEXT NOT NULL, + + CONSTRAINT uq_calibration_manifest_tenant_epoch UNIQUE (tenant_id, epoch_number) +); + +CREATE INDEX idx_calibration_manifests_tenant + ON 
excititor_calibration.calibration_manifests(tenant_id); +CREATE INDEX idx_calibration_manifests_created + ON excititor_calibration.calibration_manifests(created_at_utc DESC); + +-- Trust vector adjustments table +-- Records each provider's trust vector changes per epoch +CREATE TABLE IF NOT EXISTS excititor_calibration.trust_vector_adjustments ( + adjustment_id BIGSERIAL PRIMARY KEY, + manifest_id TEXT NOT NULL REFERENCES excititor_calibration.calibration_manifests(manifest_id), + source_id TEXT NOT NULL, + old_provenance DOUBLE PRECISION NOT NULL, + old_coverage DOUBLE PRECISION NOT NULL, + old_replayability DOUBLE PRECISION NOT NULL, + new_provenance DOUBLE PRECISION NOT NULL, + new_coverage DOUBLE PRECISION NOT NULL, + new_replayability DOUBLE PRECISION NOT NULL, + adjustment_magnitude DOUBLE PRECISION NOT NULL, + confidence_in_adjustment DOUBLE PRECISION NOT NULL, + sample_count_for_source INTEGER NOT NULL, + created_at_utc TIMESTAMP NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC'), + + CONSTRAINT chk_old_provenance_range CHECK (old_provenance >= 0 AND old_provenance <= 1), + CONSTRAINT chk_old_coverage_range CHECK (old_coverage >= 0 AND old_coverage <= 1), + CONSTRAINT chk_old_replayability_range CHECK (old_replayability >= 0 AND old_replayability <= 1), + CONSTRAINT chk_new_provenance_range CHECK (new_provenance >= 0 AND new_provenance <= 1), + CONSTRAINT chk_new_coverage_range CHECK (new_coverage >= 0 AND new_coverage <= 1), + CONSTRAINT chk_new_replayability_range CHECK (new_replayability >= 0 AND new_replayability <= 1), + CONSTRAINT chk_confidence_range CHECK (confidence_in_adjustment >= 0 AND confidence_in_adjustment <= 1) +); + +CREATE INDEX idx_trust_adjustments_manifest + ON excititor_calibration.trust_vector_adjustments(manifest_id); +CREATE INDEX idx_trust_adjustments_source + ON excititor_calibration.trust_vector_adjustments(source_id); + +-- Calibration feedback samples table +-- Stores empirical evidence used for calibration +CREATE TABLE IF NOT EXISTS excititor_calibration.calibration_samples ( + sample_id BIGSERIAL PRIMARY KEY, + tenant_id TEXT NOT NULL, + source_id TEXT NOT NULL, + cve_id TEXT NOT NULL, + purl TEXT NOT NULL, + expected_status TEXT NOT NULL, + actual_status TEXT NOT NULL, + verdict_confidence DOUBLE PRECISION NOT NULL, + is_match BOOLEAN NOT NULL, + feedback_source TEXT NOT NULL, -- 'reachability', 'customer_feedback', 'integration_tests' + feedback_weight DOUBLE PRECISION NOT NULL DEFAULT 1.0, + scan_id TEXT, + collected_at_utc TIMESTAMP NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC'), + processed BOOLEAN NOT NULL DEFAULT FALSE, + processed_in_manifest_id TEXT REFERENCES excititor_calibration.calibration_manifests(manifest_id), + + CONSTRAINT chk_verdict_confidence_range CHECK (verdict_confidence >= 0 AND verdict_confidence <= 1), + CONSTRAINT chk_feedback_weight_range CHECK (feedback_weight >= 0 AND feedback_weight <= 1) +); + +CREATE INDEX idx_calibration_samples_tenant + ON excititor_calibration.calibration_samples(tenant_id); +CREATE INDEX idx_calibration_samples_source + ON excititor_calibration.calibration_samples(source_id); +CREATE INDEX idx_calibration_samples_collected + ON excititor_calibration.calibration_samples(collected_at_utc DESC); +CREATE INDEX idx_calibration_samples_processed + ON excititor_calibration.calibration_samples(processed) WHERE NOT processed; + +-- Calibration metrics table +-- Tracks performance metrics per source/severity/status +CREATE TABLE IF NOT EXISTS excititor_calibration.calibration_metrics ( + metric_id BIGSERIAL 
PRIMARY KEY, + manifest_id TEXT NOT NULL REFERENCES excititor_calibration.calibration_manifests(manifest_id), + source_id TEXT, + severity TEXT, + status TEXT, + precision DOUBLE PRECISION NOT NULL, + recall DOUBLE PRECISION NOT NULL, + f1_score DOUBLE PRECISION NOT NULL, + false_positive_rate DOUBLE PRECISION NOT NULL, + false_negative_rate DOUBLE PRECISION NOT NULL, + sample_count INTEGER NOT NULL, + created_at_utc TIMESTAMP NOT NULL DEFAULT (NOW() AT TIME ZONE 'UTC'), + + CONSTRAINT chk_precision_range CHECK (precision >= 0 AND precision <= 1), + CONSTRAINT chk_recall_range CHECK (recall >= 0 AND recall <= 1), + CONSTRAINT chk_f1_range CHECK (f1_score >= 0 AND f1_score <= 1), + CONSTRAINT chk_fpr_range CHECK (false_positive_rate >= 0 AND false_positive_rate <= 1), + CONSTRAINT chk_fnr_range CHECK (false_negative_rate >= 0 AND false_negative_rate <= 1) +); + +CREATE INDEX idx_calibration_metrics_manifest + ON excititor_calibration.calibration_metrics(manifest_id); +CREATE INDEX idx_calibration_metrics_source + ON excititor_calibration.calibration_metrics(source_id) WHERE source_id IS NOT NULL; + +-- Grant permissions to excititor service role +DO $$ +BEGIN + IF EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'excititor_service') THEN + GRANT USAGE ON SCHEMA excititor_calibration TO excititor_service; + GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA excititor_calibration TO excititor_service; + GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA excititor_calibration TO excititor_service; + ALTER DEFAULT PRIVILEGES IN SCHEMA excititor_calibration + GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO excititor_service; + ALTER DEFAULT PRIVILEGES IN SCHEMA excititor_calibration + GRANT USAGE, SELECT ON SEQUENCES TO excititor_service; + END IF; +END $$; + +-- Comments for documentation +COMMENT ON SCHEMA excititor_calibration IS 'Trust vector calibration data for VEX source scoring'; +COMMENT ON TABLE excititor_calibration.calibration_manifests IS 'Signed calibration epoch results'; +COMMENT ON TABLE excititor_calibration.trust_vector_adjustments IS 'Per-source trust vector changes per epoch'; +COMMENT ON TABLE excititor_calibration.calibration_samples IS 'Empirical feedback samples for calibration'; +COMMENT ON TABLE excititor_calibration.calibration_metrics IS 'Performance metrics per calibration epoch'; diff --git a/deploy/database/postgres-partitioning/provcache/create_provcache_schema.sql b/deploy/database/postgres-partitioning/provcache/create_provcache_schema.sql new file mode 100644 index 000000000..9ce86d3b2 --- /dev/null +++ b/deploy/database/postgres-partitioning/provcache/create_provcache_schema.sql @@ -0,0 +1,97 @@ +-- Provcache schema migration +-- Run as: psql -d stellaops -f create_provcache_schema.sql + +-- Create schema +CREATE SCHEMA IF NOT EXISTS provcache; + +-- Main cache items table +CREATE TABLE IF NOT EXISTS provcache.provcache_items ( + verikey TEXT PRIMARY KEY, + digest_version TEXT NOT NULL DEFAULT 'v1', + verdict_hash TEXT NOT NULL, + proof_root TEXT NOT NULL, + replay_seed JSONB NOT NULL, + policy_hash TEXT NOT NULL, + signer_set_hash TEXT NOT NULL, + feed_epoch TEXT NOT NULL, + trust_score INTEGER NOT NULL CHECK (trust_score >= 0 AND trust_score <= 100), + hit_count BIGINT NOT NULL DEFAULT 0, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + expires_at TIMESTAMPTZ NOT NULL, + updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + last_accessed_at TIMESTAMPTZ, + + -- Constraint: expires_at must be after created_at + CONSTRAINT provcache_items_expires_check 
CHECK (expires_at > created_at) +); + +-- Indexes for invalidation queries +CREATE INDEX IF NOT EXISTS idx_provcache_policy_hash + ON provcache.provcache_items(policy_hash); +CREATE INDEX IF NOT EXISTS idx_provcache_signer_set_hash + ON provcache.provcache_items(signer_set_hash); +CREATE INDEX IF NOT EXISTS idx_provcache_feed_epoch + ON provcache.provcache_items(feed_epoch); +CREATE INDEX IF NOT EXISTS idx_provcache_expires_at + ON provcache.provcache_items(expires_at); +CREATE INDEX IF NOT EXISTS idx_provcache_created_at + ON provcache.provcache_items(created_at); + +-- Evidence chunks table for large evidence storage +CREATE TABLE IF NOT EXISTS provcache.prov_evidence_chunks ( + chunk_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + proof_root TEXT NOT NULL, + chunk_index INTEGER NOT NULL, + chunk_hash TEXT NOT NULL, + blob BYTEA NOT NULL, + blob_size INTEGER NOT NULL, + content_type TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), + + CONSTRAINT prov_evidence_chunks_unique_index + UNIQUE (proof_root, chunk_index) +); + +CREATE INDEX IF NOT EXISTS idx_prov_chunks_proof_root + ON provcache.prov_evidence_chunks(proof_root); + +-- Revocation audit log +CREATE TABLE IF NOT EXISTS provcache.prov_revocations ( + revocation_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + revocation_type TEXT NOT NULL, + target_hash TEXT NOT NULL, + reason TEXT, + actor TEXT, + entries_affected BIGINT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT NOW() +); + +CREATE INDEX IF NOT EXISTS idx_prov_revocations_created_at + ON provcache.prov_revocations(created_at); +CREATE INDEX IF NOT EXISTS idx_prov_revocations_target_hash + ON provcache.prov_revocations(target_hash); + +-- Function to update updated_at timestamp +CREATE OR REPLACE FUNCTION provcache.update_updated_at_column() +RETURNS TRIGGER AS $$ +BEGIN + NEW.updated_at = NOW(); + RETURN NEW; +END; +$$ language 'plpgsql'; + +-- Trigger for auto-updating updated_at +DROP TRIGGER IF EXISTS update_provcache_items_updated_at ON provcache.provcache_items; +CREATE TRIGGER update_provcache_items_updated_at + BEFORE UPDATE ON provcache.provcache_items + FOR EACH ROW + EXECUTE FUNCTION provcache.update_updated_at_column(); + +-- Grant permissions (adjust role as needed) +-- GRANT USAGE ON SCHEMA provcache TO stellaops_app; +-- GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA provcache TO stellaops_app; +-- GRANT USAGE ON ALL SEQUENCES IN SCHEMA provcache TO stellaops_app; + +COMMENT ON TABLE provcache.provcache_items IS 'Provenance cache entries for cached security decisions'; +COMMENT ON TABLE provcache.prov_evidence_chunks IS 'Chunked evidence storage for large SBOMs and attestations'; +COMMENT ON TABLE provcache.prov_revocations IS 'Audit log of cache invalidation events'; diff --git a/deploy/database/postgres-validation/001_validate_rls.sql b/deploy/database/postgres-validation/001_validate_rls.sql new file mode 100644 index 000000000..8d9b28cb9 --- /dev/null +++ b/deploy/database/postgres-validation/001_validate_rls.sql @@ -0,0 +1,159 @@ +-- RLS Validation Script +-- Sprint: SPRINT_3421_0001_0001 - RLS Expansion +-- +-- Purpose: Verify that RLS is properly configured on all tenant-scoped tables +-- Run this script after deploying RLS migrations to validate configuration + +-- ============================================================================ +-- Part 1: List all tables with RLS status +-- ============================================================================ + +\echo '=== RLS Status for All Schemas ===' + +SELECT + 
schemaname AS schema, + tablename AS table_name, + rowsecurity AS rls_enabled, + forcerowsecurity AS rls_forced, + CASE + WHEN rowsecurity AND forcerowsecurity THEN 'OK' + WHEN rowsecurity AND NOT forcerowsecurity THEN 'WARN: Not forced' + ELSE 'MISSING' + END AS status +FROM pg_tables +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') +ORDER BY schemaname, tablename; + +-- ============================================================================ +-- Part 2: List all RLS policies +-- ============================================================================ + +\echo '' +\echo '=== RLS Policies ===' + +SELECT + schemaname AS schema, + tablename AS table_name, + policyname AS policy_name, + permissive, + roles, + cmd AS applies_to, + qual IS NOT NULL AS has_using, + with_check IS NOT NULL AS has_check +FROM pg_policies +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') +ORDER BY schemaname, tablename, policyname; + +-- ============================================================================ +-- Part 3: Tables missing RLS that should have it (have tenant_id column) +-- ============================================================================ + +\echo '' +\echo '=== Tables with tenant_id but NO RLS ===' + +SELECT + c.table_schema AS schema, + c.table_name AS table_name, + 'MISSING RLS' AS issue +FROM information_schema.columns c +JOIN pg_tables t ON c.table_schema = t.schemaname AND c.table_name = t.tablename +WHERE c.column_name IN ('tenant_id', 'tenant') + AND c.table_schema IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') + AND NOT t.rowsecurity +ORDER BY c.table_schema, c.table_name; + +-- ============================================================================ +-- Part 4: Verify helper functions exist +-- ============================================================================ + +\echo '' +\echo '=== RLS Helper Functions ===' + +SELECT + n.nspname AS schema, + p.proname AS function_name, + CASE + WHEN p.prosecdef THEN 'SECURITY DEFINER' + ELSE 'SECURITY INVOKER' + END AS security, + CASE + WHEN p.provolatile = 's' THEN 'STABLE' + WHEN p.provolatile = 'i' THEN 'IMMUTABLE' + ELSE 'VOLATILE' + END AS volatility +FROM pg_proc p +JOIN pg_namespace n ON p.pronamespace = n.oid +WHERE p.proname = 'require_current_tenant' + AND n.nspname LIKE '%_app' +ORDER BY n.nspname; + +-- ============================================================================ +-- Part 5: Test RLS enforcement (expect failure without tenant context) +-- ============================================================================ + +\echo '' +\echo '=== RLS Enforcement Test ===' +\echo 'Testing RLS on scheduler.runs (should fail without tenant context)...' 
+ +-- Reset tenant context +SELECT set_config('app.tenant_id', '', false); + +DO $$ +BEGIN + -- This should raise an exception if RLS is working + PERFORM * FROM scheduler.runs LIMIT 1; + RAISE NOTICE 'WARNING: Query succeeded without tenant context - RLS may not be working!'; +EXCEPTION + WHEN OTHERS THEN + RAISE NOTICE 'OK: RLS blocked query without tenant context: %', SQLERRM; +END +$$; + +-- ============================================================================ +-- Part 6: Admin bypass role verification +-- ============================================================================ + +\echo '' +\echo '=== Admin Bypass Roles ===' + +SELECT + rolname AS role_name, + rolbypassrls AS can_bypass_rls, + rolcanlogin AS can_login +FROM pg_roles +WHERE rolname LIKE '%_admin' + AND rolbypassrls = TRUE +ORDER BY rolname; + +-- ============================================================================ +-- Summary +-- ============================================================================ + +\echo '' +\echo '=== Summary ===' + +SELECT + 'Total Tables' AS metric, + COUNT(*)::TEXT AS value +FROM pg_tables +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') +UNION ALL +SELECT + 'Tables with RLS Enabled', + COUNT(*)::TEXT +FROM pg_tables +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') + AND rowsecurity = TRUE +UNION ALL +SELECT + 'Tables with RLS Forced', + COUNT(*)::TEXT +FROM pg_tables +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns') + AND forcerowsecurity = TRUE +UNION ALL +SELECT + 'Active Policies', + COUNT(*)::TEXT +FROM pg_policies +WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns'); diff --git a/deploy/database/postgres-validation/002_validate_partitions.sql b/deploy/database/postgres-validation/002_validate_partitions.sql new file mode 100644 index 000000000..3b7aeea3a --- /dev/null +++ b/deploy/database/postgres-validation/002_validate_partitions.sql @@ -0,0 +1,238 @@ +-- Partition Validation Script +-- Sprint: SPRINT_3422_0001_0001 - Time-Based Partitioning +-- +-- Purpose: Verify that partitioned tables are properly configured and healthy + +-- ============================================================================ +-- Part 1: List all partitioned tables +-- ============================================================================ + +\echo '=== Partitioned Tables ===' + +SELECT + n.nspname AS schema, + c.relname AS table_name, + CASE pt.partstrat + WHEN 'r' THEN 'RANGE' + WHEN 'l' THEN 'LIST' + WHEN 'h' THEN 'HASH' + END AS partition_strategy, + array_to_string(array_agg(a.attname ORDER BY k.col), ', ') AS partition_key +FROM pg_class c +JOIN pg_namespace n ON c.relnamespace = n.oid +JOIN pg_partitioned_table pt ON c.oid = pt.partrelid +JOIN LATERAL unnest(pt.partattrs) WITH ORDINALITY AS k(col, idx) ON true +LEFT JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = k.col +WHERE n.nspname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln') +GROUP BY n.nspname, c.relname, pt.partstrat +ORDER BY n.nspname, c.relname; + +-- ============================================================================ +-- Part 2: Partition inventory with sizes +-- ============================================================================ + +\echo '' +\echo '=== Partition Inventory ===' + +SELECT + n.nspname AS schema, + parent.relname AS parent_table, + c.relname AS partition_name, + pg_get_expr(c.relpartbound, 
c.oid) AS bounds,
+    pg_size_pretty(pg_relation_size(c.oid)) AS size,
+    s.n_live_tup AS estimated_rows
+FROM pg_class c
+JOIN pg_namespace n ON c.relnamespace = n.oid
+JOIN pg_inherits i ON c.oid = i.inhrelid
+JOIN pg_class parent ON i.inhparent = parent.oid
+LEFT JOIN pg_stat_user_tables s ON c.oid = s.relid
+WHERE n.nspname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln')
+  AND c.relkind = 'r'
+  AND parent.relkind = 'p'
+ORDER BY n.nspname, parent.relname, c.relname;
+
+-- ============================================================================
+-- Part 3: Check for missing future partitions
+-- ============================================================================
+
+\echo ''
+\echo '=== Future Partition Coverage ==='
+
+WITH partition_bounds AS (
+    SELECT
+        n.nspname AS schema_name,
+        parent.relname AS table_name,
+        c.relname AS partition_name,
+        -- Extract the TO date from partition bound
+        (regexp_match(pg_get_expr(c.relpartbound, c.oid), 'TO \(''([^'']+)''\)'))[1]::DATE AS end_date
+    FROM pg_class c
+    JOIN pg_namespace n ON c.relnamespace = n.oid
+    JOIN pg_inherits i ON c.oid = i.inhrelid
+    JOIN pg_class parent ON i.inhparent = parent.oid
+    WHERE c.relkind = 'r'
+      AND parent.relkind = 'p'
+      AND c.relname NOT LIKE '%_default'
+),
+max_bounds AS (
+    SELECT
+        schema_name,
+        table_name,
+        MAX(end_date) AS max_partition_date
+    FROM partition_bounds
+    WHERE end_date IS NOT NULL
+    GROUP BY schema_name, table_name
+)
+SELECT
+    schema_name,
+    table_name,
+    max_partition_date,
+    (max_partition_date - CURRENT_DATE) AS days_ahead,
+    CASE
+        WHEN (max_partition_date - CURRENT_DATE) < 30 THEN 'CRITICAL: Create partitions!'
+        WHEN (max_partition_date - CURRENT_DATE) < 60 THEN 'WARNING: Running low'
+        ELSE 'OK'
+    END AS status
+FROM max_bounds
+ORDER BY days_ahead;
+
+-- ============================================================================
+-- Part 4: Check for orphaned data in default partitions
+-- ============================================================================
+
+\echo ''
+\echo '=== Default Partition Data (should be empty) ==='
+
+DO $$
+DECLARE
+    v_schema TEXT;
+    v_table TEXT;
+    v_count BIGINT;
+    v_sql TEXT;
+BEGIN
+    FOR v_schema, v_table IN
+        SELECT n.nspname, c.relname
+        FROM pg_class c
+        JOIN pg_namespace n ON c.relnamespace = n.oid
+        WHERE c.relname LIKE '%_default'
+          AND n.nspname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln')
+    LOOP
+        v_sql := format('SELECT COUNT(*) FROM %I.%I', v_schema, v_table);
+        EXECUTE v_sql INTO v_count;
+
+        IF v_count > 0 THEN
+            RAISE NOTICE 'WARNING: %.% has % rows in default partition!',
+                v_schema, v_table, v_count;
+        ELSE
+            RAISE NOTICE 'OK: %.% is empty', v_schema, v_table;
+        END IF;
+    END LOOP;
+END
+$$;
+
+-- ============================================================================
+-- Part 5: Index health on partitions
+-- ============================================================================
+
+\echo ''
+\echo '=== Partition Index Coverage ==='
+
+SELECT
+    schemaname AS schema,
+    tablename AS table_name,
+    indexname AS index_name,
+    indexdef
+FROM pg_indexes
+WHERE schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln')
+  AND (tablename LIKE '%_partitioned' OR tablename LIKE '%_202%')
+ORDER BY schemaname, tablename, indexname;
+
+-- ============================================================================
+-- Part 6: BRIN index effectiveness check
+-- ============================================================================
+ +\echo '' +\echo '=== BRIN Index Statistics ===' + +SELECT + schemaname AS schema, + tablename AS table_name, + indexrelname AS index_name, + idx_scan AS scans, + idx_tup_read AS tuples_read, + idx_tup_fetch AS tuples_fetched, + pg_size_pretty(pg_relation_size(indexrelid)) AS index_size +FROM pg_stat_user_indexes +WHERE indexrelname LIKE 'brin_%' +ORDER BY schemaname, tablename; + +-- ============================================================================ +-- Part 7: Partition maintenance recommendations +-- ============================================================================ + +\echo '' +\echo '=== Maintenance Recommendations ===' + +WITH partition_ages AS ( + SELECT + n.nspname AS schema_name, + parent.relname AS table_name, + c.relname AS partition_name, + (regexp_match(pg_get_expr(c.relpartbound, c.oid), 'FROM \(''([^'']+)''\)'))[1]::DATE AS start_date, + (regexp_match(pg_get_expr(c.relpartbound, c.oid), 'TO \(''([^'']+)''\)'))[1]::DATE AS end_date + FROM pg_class c + JOIN pg_namespace n ON c.relnamespace = n.oid + JOIN pg_inherits i ON c.oid = i.inhrelid + JOIN pg_class parent ON i.inhparent = parent.oid + WHERE c.relkind = 'r' + AND parent.relkind = 'p' + AND c.relname NOT LIKE '%_default' +) +SELECT + schema_name, + table_name, + partition_name, + start_date, + end_date, + (CURRENT_DATE - end_date) AS days_old, + CASE + WHEN (CURRENT_DATE - end_date) > 365 THEN 'Consider archiving (>1 year old)' + WHEN (CURRENT_DATE - end_date) > 180 THEN 'Review retention policy (>6 months old)' + ELSE 'Current' + END AS recommendation +FROM partition_ages +WHERE start_date IS NOT NULL +ORDER BY schema_name, table_name, start_date; + +-- ============================================================================ +-- Summary +-- ============================================================================ + +\echo '' +\echo '=== Summary ===' + +SELECT + 'Partitioned Tables' AS metric, + COUNT(DISTINCT parent.relname)::TEXT AS value +FROM pg_class c +JOIN pg_namespace n ON c.relnamespace = n.oid +JOIN pg_inherits i ON c.oid = i.inhrelid +JOIN pg_class parent ON i.inhparent = parent.oid +WHERE n.nspname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln') + AND parent.relkind = 'p' +UNION ALL +SELECT + 'Total Partitions', + COUNT(*)::TEXT +FROM pg_class c +JOIN pg_namespace n ON c.relnamespace = n.oid +JOIN pg_inherits i ON c.oid = i.inhrelid +JOIN pg_class parent ON i.inhparent = parent.oid +WHERE n.nspname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln') + AND parent.relkind = 'p' +UNION ALL +SELECT + 'BRIN Indexes', + COUNT(*)::TEXT +FROM pg_indexes +WHERE indexname LIKE 'brin_%' + AND schemaname IN ('scheduler', 'notify', 'authority', 'vex', 'policy', 'unknowns', 'vuln'); diff --git a/deploy/database/postgres/README.md b/deploy/database/postgres/README.md new file mode 100644 index 000000000..d1aa8c446 --- /dev/null +++ b/deploy/database/postgres/README.md @@ -0,0 +1,66 @@ +# PostgreSQL 16 Cluster (staging / production) + +This directory provisions StellaOps PostgreSQL clusters with **CloudNativePG (CNPG)**. It is pinned to Postgres 16.x, includes connection pooling (PgBouncer), Prometheus scraping, and S3-compatible backups. Everything is air-gap friendly: fetch the operator and images once, then render/apply manifests offline. 
+
+## Targets
+- **Staging:** `stellaops-pg-stg` (2 instances, 200 Gi data, WAL 64 Gi, PgBouncer x2)
+- **Production:** `stellaops-pg-prod` (3 instances, 500 Gi data, WAL 128 Gi, PgBouncer x3)
+- **Namespace:** `platform-postgres`
+
+## Prerequisites
+- Kubernetes ≥ 1.27 with CSI storage classes `fast-ssd` (data) and `fast-wal` (WAL) available.
+- CloudNativePG operator 1.23.x mirrored or downloaded to `artifacts/cloudnative-pg-1.23.0.yaml`.
+- Images mirrored to your registry (example tags):
+  - `ghcr.io/cloudnative-pg/postgresql:16.4`
+  - `ghcr.io/cloudnative-pg/postgresql-operator:1.23.0`
+  - `ghcr.io/cloudnative-pg/pgbouncer:1.23.0`
+- Secrets created from the templates under `deploy/database/postgres/secrets/` (superuser, app user, backup credentials).
+
+## Render & Apply (deterministic)
+```bash
+# 1) Create namespace
+kubectl apply -f deploy/database/postgres/namespace.yaml
+
+# 2) Install operator (offline-friendly: use the pinned manifest you mirrored)
+kubectl apply -f artifacts/cloudnative-pg-1.23.0.yaml
+
+# 3) Create secrets (replace passwords/keys first)
+kubectl apply -f deploy/database/postgres/secrets/example-superuser.yaml
+kubectl apply -f deploy/database/postgres/secrets/example-app.yaml
+kubectl apply -f deploy/database/postgres/secrets/example-backup-credentials.yaml
+
+# 4) Apply the cluster and pooler for the target environment
+kubectl apply -f deploy/database/postgres/cluster-staging.yaml
+kubectl apply -f deploy/database/postgres/pooler-staging.yaml
+# or
+kubectl apply -f deploy/database/postgres/cluster-production.yaml
+kubectl apply -f deploy/database/postgres/pooler-production.yaml
+```
+
+## Connection Endpoints
+- RW service: `<cluster>-rw` (e.g., `stellaops-pg-stg-rw:5432`)
+- RO service: `<cluster>-ro`
+- PgBouncer pooler: `<cluster>-pooler` (e.g., `stellaops-pg-stg-pooler:6432`)
+
+**Application connection string (matches library defaults):**
+`Host=stellaops-pg-stg-pooler;Port=6432;Username=stellaops_app;Password=;Database=stellaops;Pooling=true;Timeout=15;CommandTimeout=30;Ssl Mode=Require;`
+
+## Monitoring & Backups
+- `monitoring.enablePodMonitor: true` exposes a PodMonitor for Prometheus Operator.
+- Barman/S3 backups are enabled by default; set `backup.barmanObjectStore.destinationPath` per environment and populate `stellaops-pg-backup` credentials.
+- WAL compression is `gzip`; retention is operator-managed (configure via Barman bucket policies).
+
+## Alignment with code defaults
+- Session settings: UTC timezone, 30s `statement_timeout`, tenant context via `set_config('app.current_tenant', ...)`.
+- The connection pooler uses **transaction** mode with a `server_reset_query` that clears session state, keeping RepositoryBase deterministic.
+
+## Verification checklist
+- `kubectl get cluster -n platform-postgres` shows `Ready` replicas matching `instances`.
+- `kubectl logs deploy/cnpg-controller-manager -n cnpg-system` has no failing webhooks.
+- `kubectl get podmonitor -n platform-postgres` returns entries for the cluster and pooler.
+- `psql "<connection-string>" -c 'select 1'` works from the CI runner subnet.
+- The `cnpg` plugin / `barman-cloud-backup-list` shows successful full + WAL backups.
+
+## Offline notes
+- Mirror the operator manifest and container images to the approved registry first; no live downloads occur at runtime.
+- If Prometheus is not present, leave the PodMonitor applied; it is inert without the CRD.
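+
+## Tenant context example (illustrative)
+A minimal sketch of how an application session is expected to interact with the transaction-mode pooler. The GUC name `app.current_tenant` follows the note above; `scheduler.runs` stands in for any RLS-protected table and may differ in your deployment:
+
+```sql
+-- Set the tenant for the current transaction only (third argument = true),
+-- so the setting cannot leak across server connections shared by PgBouncer.
+BEGIN;
+SELECT set_config('app.current_tenant', 'tenant-123', true);
+SELECT count(*) FROM scheduler.runs;  -- tenant-scoped reads now see only this tenant's rows
+COMMIT;
+```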
diff --git a/deploy/database/postgres/cluster-production.yaml b/deploy/database/postgres/cluster-production.yaml new file mode 100644 index 000000000..27d5c7bd7 --- /dev/null +++ b/deploy/database/postgres/cluster-production.yaml @@ -0,0 +1,57 @@ +apiVersion: postgresql.cnpg.io/v1 +kind: Cluster +metadata: + name: stellaops-pg-prod + namespace: platform-postgres +spec: + instances: 3 + imageName: ghcr.io/cloudnative-pg/postgresql:16.4 + primaryUpdateStrategy: unsupervised + storage: + size: 500Gi + storageClass: fast-ssd + walStorage: + size: 128Gi + storageClass: fast-wal + superuserSecret: + name: stellaops-pg-superuser + bootstrap: + initdb: + database: stellaops + owner: stellaops_app + secret: + name: stellaops-pg-app + monitoring: + enablePodMonitor: true + postgresql: + parameters: + max_connections: "900" + shared_buffers: "4096MB" + work_mem: "96MB" + maintenance_work_mem: "768MB" + wal_level: "replica" + max_wal_size: "4GB" + timezone: "UTC" + log_min_duration_statement: "250" + statement_timeout: "30000" + resources: + requests: + cpu: "4" + memory: "16Gi" + limits: + cpu: "8" + memory: "24Gi" + backup: + barmanObjectStore: + destinationPath: s3://stellaops-backups/production + s3Credentials: + accessKeyId: + name: stellaops-pg-backup + key: ACCESS_KEY_ID + secretAccessKey: + name: stellaops-pg-backup + key: SECRET_ACCESS_KEY + wal: + compression: gzip + maxParallel: 4 + logLevel: info diff --git a/deploy/database/postgres/cluster-staging.yaml b/deploy/database/postgres/cluster-staging.yaml new file mode 100644 index 000000000..aa327276d --- /dev/null +++ b/deploy/database/postgres/cluster-staging.yaml @@ -0,0 +1,57 @@ +apiVersion: postgresql.cnpg.io/v1 +kind: Cluster +metadata: + name: stellaops-pg-stg + namespace: platform-postgres +spec: + instances: 2 + imageName: ghcr.io/cloudnative-pg/postgresql:16.4 + primaryUpdateStrategy: unsupervised + storage: + size: 200Gi + storageClass: fast-ssd + walStorage: + size: 64Gi + storageClass: fast-wal + superuserSecret: + name: stellaops-pg-superuser + bootstrap: + initdb: + database: stellaops + owner: stellaops_app + secret: + name: stellaops-pg-app + monitoring: + enablePodMonitor: true + postgresql: + parameters: + max_connections: "600" + shared_buffers: "2048MB" + work_mem: "64MB" + maintenance_work_mem: "512MB" + wal_level: "replica" + max_wal_size: "2GB" + timezone: "UTC" + log_min_duration_statement: "500" + statement_timeout: "30000" + resources: + requests: + cpu: "2" + memory: "8Gi" + limits: + cpu: "4" + memory: "12Gi" + backup: + barmanObjectStore: + destinationPath: s3://stellaops-backups/staging + s3Credentials: + accessKeyId: + name: stellaops-pg-backup + key: ACCESS_KEY_ID + secretAccessKey: + name: stellaops-pg-backup + key: SECRET_ACCESS_KEY + wal: + compression: gzip + maxParallel: 2 + logLevel: info diff --git a/deploy/database/postgres/namespace.yaml b/deploy/database/postgres/namespace.yaml new file mode 100644 index 000000000..793ef0de8 --- /dev/null +++ b/deploy/database/postgres/namespace.yaml @@ -0,0 +1,4 @@ +apiVersion: v1 +kind: Namespace +metadata: + name: platform-postgres diff --git a/deploy/database/postgres/pooler-production.yaml b/deploy/database/postgres/pooler-production.yaml new file mode 100644 index 000000000..7cd184fc5 --- /dev/null +++ b/deploy/database/postgres/pooler-production.yaml @@ -0,0 +1,29 @@ +apiVersion: postgresql.cnpg.io/v1 +kind: Pooler +metadata: + name: stellaops-pg-prod-pooler + namespace: platform-postgres +spec: + cluster: + name: stellaops-pg-prod + instances: 3 + type: rw + 
pgbouncer: + parameters: + pool_mode: transaction + max_client_conn: "1500" + default_pool_size: "80" + server_reset_query: "RESET ALL; SET SESSION AUTHORIZATION DEFAULT; SET TIME ZONE 'UTC';" + authQuerySecret: + name: stellaops-pg-app + template: + spec: + containers: + - name: pgbouncer + resources: + requests: + cpu: "150m" + memory: "192Mi" + limits: + cpu: "750m" + memory: "384Mi" diff --git a/deploy/database/postgres/pooler-staging.yaml b/deploy/database/postgres/pooler-staging.yaml new file mode 100644 index 000000000..3b8d744e0 --- /dev/null +++ b/deploy/database/postgres/pooler-staging.yaml @@ -0,0 +1,29 @@ +apiVersion: postgresql.cnpg.io/v1 +kind: Pooler +metadata: + name: stellaops-pg-stg-pooler + namespace: platform-postgres +spec: + cluster: + name: stellaops-pg-stg + instances: 2 + type: rw + pgbouncer: + parameters: + pool_mode: transaction + max_client_conn: "800" + default_pool_size: "50" + server_reset_query: "RESET ALL; SET SESSION AUTHORIZATION DEFAULT; SET TIME ZONE 'UTC';" + authQuerySecret: + name: stellaops-pg-app + template: + spec: + containers: + - name: pgbouncer + resources: + requests: + cpu: "100m" + memory: "128Mi" + limits: + cpu: "500m" + memory: "256Mi" diff --git a/deploy/database/postgres/secrets/example-app.yaml b/deploy/database/postgres/secrets/example-app.yaml new file mode 100644 index 000000000..fbe9e0628 --- /dev/null +++ b/deploy/database/postgres/secrets/example-app.yaml @@ -0,0 +1,9 @@ +apiVersion: v1 +kind: Secret +metadata: + name: stellaops-pg-app + namespace: platform-postgres +type: kubernetes.io/basic-auth +stringData: + username: stellaops_app + password: CHANGEME_APP_PASSWORD diff --git a/deploy/database/postgres/secrets/example-backup-credentials.yaml b/deploy/database/postgres/secrets/example-backup-credentials.yaml new file mode 100644 index 000000000..a5d79ad3a --- /dev/null +++ b/deploy/database/postgres/secrets/example-backup-credentials.yaml @@ -0,0 +1,9 @@ +apiVersion: v1 +kind: Secret +metadata: + name: stellaops-pg-backup + namespace: platform-postgres +type: Opaque +stringData: + ACCESS_KEY_ID: CHANGEME_ACCESS_KEY + SECRET_ACCESS_KEY: CHANGEME_SECRET_KEY diff --git a/deploy/database/postgres/secrets/example-superuser.yaml b/deploy/database/postgres/secrets/example-superuser.yaml new file mode 100644 index 000000000..dbaec5695 --- /dev/null +++ b/deploy/database/postgres/secrets/example-superuser.yaml @@ -0,0 +1,9 @@ +apiVersion: v1 +kind: Secret +metadata: + name: stellaops-pg-superuser + namespace: platform-postgres +type: kubernetes.io/basic-auth +stringData: + username: postgres + password: CHANGEME_SUPERUSER_PASSWORD diff --git a/deploy/docker/Dockerfile.ci b/deploy/docker/Dockerfile.ci new file mode 100644 index 000000000..39f95a377 --- /dev/null +++ b/deploy/docker/Dockerfile.ci @@ -0,0 +1,173 @@ +# Dockerfile.ci - Local CI testing container matching Gitea runner environment +# Sprint: SPRINT_20251226_006_CICD +# +# Usage: +# docker build -t stellaops-ci:local -f devops/docker/Dockerfile.ci . 
+# docker run --rm -v $(pwd):/src stellaops-ci:local ./devops/scripts/test-local.sh + +FROM ubuntu:22.04 + +LABEL org.opencontainers.image.title="StellaOps CI" +LABEL org.opencontainers.image.description="Local CI testing environment matching Gitea runner" +LABEL org.opencontainers.image.source="https://git.stella-ops.org/stella-ops.org/git.stella-ops.org" + +# Environment variables +ENV DEBIAN_FRONTEND=noninteractive +ENV DOTNET_VERSION=10.0.100 +ENV NODE_VERSION=20 +ENV HELM_VERSION=3.16.0 +ENV COSIGN_VERSION=3.0.4 +ENV REKOR_VERSION=1.4.3 +ENV TZ=UTC + +# Disable .NET telemetry +ENV DOTNET_NOLOGO=1 +ENV DOTNET_CLI_TELEMETRY_OPTOUT=1 + +# .NET paths +ENV DOTNET_ROOT=/usr/share/dotnet +ENV PATH="/usr/share/dotnet:/root/.dotnet/tools:${PATH}" + +# =========================================================================== +# BASE DEPENDENCIES +# =========================================================================== + +RUN apt-get update && apt-get install -y --no-install-recommends \ + # Core utilities + curl \ + wget \ + gnupg2 \ + ca-certificates \ + git \ + unzip \ + jq \ + # Build tools + build-essential \ + # Cross-compilation + binutils-aarch64-linux-gnu \ + # Python (for scripts) + python3 \ + python3-pip \ + # .NET dependencies + libicu70 \ + # Locales + locales \ + && rm -rf /var/lib/apt/lists/* + +# =========================================================================== +# DOCKER CLI & COMPOSE (from official Docker repo) +# =========================================================================== + +RUN install -m 0755 -d /etc/apt/keyrings \ + && curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc \ + && chmod a+r /etc/apt/keyrings/docker.asc \ + && echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu jammy stable" > /etc/apt/sources.list.d/docker.list \ + && apt-get update \ + && apt-get install -y --no-install-recommends docker-ce-cli docker-compose-plugin \ + && rm -rf /var/lib/apt/lists/* \ + && docker --version + +# Set locale +RUN locale-gen en_US.UTF-8 +ENV LANG=en_US.UTF-8 +ENV LANGUAGE=en_US:en +ENV LC_ALL=en_US.UTF-8 + +# =========================================================================== +# POSTGRESQL CLIENT 16 +# =========================================================================== + +RUN curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | gpg --dearmor -o /usr/share/keyrings/postgresql-archive-keyring.gpg \ + && echo "deb [signed-by=/usr/share/keyrings/postgresql-archive-keyring.gpg] http://apt.postgresql.org/pub/repos/apt jammy-pgdg main" > /etc/apt/sources.list.d/pgdg.list \ + && apt-get update \ + && apt-get install -y --no-install-recommends postgresql-client-16 \ + && rm -rf /var/lib/apt/lists/* + +# =========================================================================== +# .NET 10 SDK +# =========================================================================== + +RUN curl -fsSL https://dot.net/v1/dotnet-install.sh -o /tmp/dotnet-install.sh \ + && chmod +x /tmp/dotnet-install.sh \ + && /tmp/dotnet-install.sh --version ${DOTNET_VERSION} --install-dir /usr/share/dotnet \ + && rm /tmp/dotnet-install.sh \ + && dotnet --version + +# Install common .NET tools +RUN dotnet tool install -g trx2junit \ + && dotnet tool install -g dotnet-reportgenerator-globaltool + +# =========================================================================== +# NODE.JS 20 +# 
=========================================================================== + +RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \ + && apt-get install -y --no-install-recommends nodejs \ + && rm -rf /var/lib/apt/lists/* \ + && node --version \ + && npm --version + +# =========================================================================== +# HELM 3.16.0 +# =========================================================================== + +RUN curl -fsSL https://get.helm.sh/helm-v${HELM_VERSION}-linux-amd64.tar.gz | \ + tar -xzf - -C /tmp \ + && mv /tmp/linux-amd64/helm /usr/local/bin/helm \ + && rm -rf /tmp/linux-amd64 \ + && helm version + +# =========================================================================== +# COSIGN +# =========================================================================== + +RUN curl -fsSL https://github.com/sigstore/cosign/releases/download/v${COSIGN_VERSION}/cosign-linux-amd64 \ + -o /usr/local/bin/cosign \ + && chmod +x /usr/local/bin/cosign \ + && cosign version + +# =========================================================================== +# REKOR CLI +# =========================================================================== + +RUN curl -fsSL https://github.com/sigstore/rekor/releases/download/v${REKOR_VERSION}/rekor-cli-linux-amd64 \ + -o /usr/local/bin/rekor-cli \ + && chmod +x /usr/local/bin/rekor-cli \ + && rekor-cli version + +# =========================================================================== +# SYFT (SBOM generation) +# =========================================================================== + +RUN curl -fsSL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin + +# =========================================================================== +# SETUP +# =========================================================================== + +WORKDIR /src + +# Create non-root user for safer execution (optional) +RUN useradd -m -s /bin/bash ciuser \ + && mkdir -p /home/ciuser/.dotnet/tools \ + && chown -R ciuser:ciuser /home/ciuser + +# Health check script +RUN printf '%s\n' \ + '#!/bin/bash' \ + 'set -e' \ + 'echo "=== CI Environment Health Check ==="' \ + 'echo "OS: $(cat /etc/os-release | grep PRETTY_NAME | cut -d= -f2)"' \ + 'echo ".NET: $(dotnet --version)"' \ + 'echo "Node: $(node --version)"' \ + 'echo "npm: $(npm --version)"' \ + 'echo "Helm: $(helm version --short)"' \ + 'echo "Cosign: $(cosign version 2>&1 | head -1)"' \ + 'echo "Rekor CLI: $(rekor-cli version 2>&1 | head -1)"' \ + 'echo "Docker: $(docker --version 2>/dev/null || echo Not available)"' \ + 'echo "PostgreSQL client: $(psql --version)"' \ + 'echo "=== All checks passed ==="' \ + > /usr/local/bin/ci-health-check \ + && chmod +x /usr/local/bin/ci-health-check + +ENTRYPOINT ["/bin/bash"] diff --git a/deploy/docker/Dockerfile.console b/deploy/docker/Dockerfile.console new file mode 100644 index 000000000..ebe47db1d --- /dev/null +++ b/deploy/docker/Dockerfile.console @@ -0,0 +1,40 @@ +# syntax=docker/dockerfile:1.7 +# Multi-stage Angular console image with non-root runtime (DOCKER-44-001) +ARG NODE_IMAGE=node:20-bullseye-slim +ARG NGINX_IMAGE=nginxinc/nginx-unprivileged:1.27-alpine +ARG APP_DIR=src/Web/StellaOps.Web +ARG DIST_DIR=dist +ARG APP_PORT=8080 + +FROM ${NODE_IMAGE} AS build +ENV npm_config_fund=false npm_config_audit=false SOURCE_DATE_EPOCH=1704067200 +WORKDIR /app +COPY ${APP_DIR}/package*.json ./ +RUN npm ci --prefer-offline --no-progress --cache .npm +COPY ${APP_DIR}/ ./ +RUN npm run build -- 
--configuration=production --output-path=${DIST_DIR} + +FROM ${NGINX_IMAGE} AS runtime +ARG APP_PORT +ENV APP_PORT=${APP_PORT} +USER 101 +WORKDIR / +COPY --from=build /app/${DIST_DIR}/ /usr/share/nginx/html/ +COPY ops/devops/docker/healthcheck-frontend.sh /usr/local/bin/healthcheck-frontend.sh +RUN rm -f /etc/nginx/conf.d/default.conf && \ + cat > /etc/nginx/conf.d/default.conf <&2 + exit 1 +fi + +echo "Building services from ${MATRIX} -> ${REGISTRY}/:${TAG_SUFFIX}" >&2 + +while IFS='|' read -r service dockerfile project binary port; do + [[ -z "${service}" || "${service}" =~ ^# ]] && continue + image="${REGISTRY}/${service}:${TAG_SUFFIX}" + df_path="${ROOT}/${dockerfile}" + if [[ ! -f "${df_path}" ]]; then + echo "skipping ${service}: dockerfile missing (${df_path})" >&2 + continue + fi + + if [[ "${dockerfile}" == *"Dockerfile.console"* ]]; then + # Angular console build uses its dedicated Dockerfile + echo "[console] ${service} -> ${image}" >&2 + docker build \ + -f "${df_path}" "${ROOT}" \ + --build-arg APP_DIR="${project}" \ + --build-arg APP_PORT="${port}" \ + -t "${image}" + else + echo "[service] ${service} -> ${image}" >&2 + docker build \ + -f "${df_path}" "${ROOT}" \ + --build-arg SDK_IMAGE="${SDK_IMAGE}" \ + --build-arg RUNTIME_IMAGE="${RUNTIME_IMAGE}" \ + --build-arg APP_PROJECT="${project}" \ + --build-arg APP_BINARY="${binary}" \ + --build-arg APP_PORT="${port}" \ + -t "${image}" + fi + +done < "${MATRIX}" + +echo "Build complete. Remember to enforce readOnlyRootFilesystem at deploy time and run sbom_attest.sh (DOCKER-44-002)." >&2 diff --git a/deploy/docker/healthcheck-frontend.sh b/deploy/docker/healthcheck-frontend.sh new file mode 100644 index 000000000..fe282fa62 --- /dev/null +++ b/deploy/docker/healthcheck-frontend.sh @@ -0,0 +1,10 @@ +#!/bin/sh +set -eu +HOST="${HEALTH_HOST:-127.0.0.1}" +PORT="${HEALTH_PORT:-8080}" +PATH_CHECK="${HEALTH_PATH:-/}" +USER_AGENT="stellaops-frontend-healthcheck" + +wget -qO- "http://${HOST}:${PORT}${PATH_CHECK}" \ + --header="User-Agent: ${USER_AGENT}" \ + --timeout="${HEALTH_TIMEOUT:-4}" >/dev/null diff --git a/deploy/docker/healthcheck.sh b/deploy/docker/healthcheck.sh new file mode 100644 index 000000000..4c865269a --- /dev/null +++ b/deploy/docker/healthcheck.sh @@ -0,0 +1,24 @@ +#!/bin/sh +set -eu +HOST="${HEALTH_HOST:-127.0.0.1}" +PORT="${HEALTH_PORT:-8080}" +LIVENESS_PATH="${LIVENESS_PATH:-/health/liveness}" +READINESS_PATH="${READINESS_PATH:-/health/readiness}" +USER_AGENT="stellaops-healthcheck" + +fetch() { + target_path="$1" + # BusyBox wget is available in Alpine; curl not assumed. + wget -qO- "http://${HOST}:${PORT}${target_path}" \ + --header="User-Agent: ${USER_AGENT}" \ + --timeout="${HEALTH_TIMEOUT:-4}" >/dev/null +} + +fail=0 +if ! fetch "$LIVENESS_PATH"; then + fail=1 +fi +if ! fetch "$READINESS_PATH"; then + fail=1 +fi +exit "$fail" diff --git a/deploy/docker/sbom_attest.sh b/deploy/docker/sbom_attest.sh new file mode 100644 index 000000000..5ec525fa9 --- /dev/null +++ b/deploy/docker/sbom_attest.sh @@ -0,0 +1,48 @@ +#!/usr/bin/env bash +# Deterministic SBOM + attestation helper for DOCKER-44-002 +# Usage: ./sbom_attest.sh [output-dir] [cosign-key] +# - image-ref: fully qualified image (e.g., ghcr.io/stellaops/policy:1.2.3) +# - output-dir: defaults to ./sbom +# - cosign-key: path to cosign key (PEM). 
If omitted, uses keyless if allowed (COSIGN_EXPERIMENTAL=1) + +set -euo pipefail +IMAGE_REF=${1:?"image ref required"} +OUT_DIR=${2:-sbom} +COSIGN_KEY=${3:-} + +mkdir -p "${OUT_DIR}" + +# Normalize filename (replace / and : with _) +name_safe() { + echo "$1" | tr '/:' '__' +} + +BASENAME=$(name_safe "${IMAGE_REF}") +SPDX_JSON="${OUT_DIR}/${BASENAME}.spdx.json" +CDX_JSON="${OUT_DIR}/${BASENAME}.cdx.json" +ATTESTATION="${OUT_DIR}/${BASENAME}.sbom.att" + +# Freeze timestamps for reproducibility +export SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH:-1704067200} + +# Generate SPDX 3.0-ish JSON (syft formats are stable and offline-friendly) +syft "${IMAGE_REF}" -o spdx-json > "${SPDX_JSON}" +# Generate CycloneDX 1.6 JSON +syft "${IMAGE_REF}" -o cyclonedx-json > "${CDX_JSON}" + +# Attach SBOMs as cosign attestations (one per format) +export COSIGN_EXPERIMENTAL=${COSIGN_EXPERIMENTAL:-1} +COSIGN_ARGS=("attest" "--predicate" "${SPDX_JSON}" "--type" "spdx" "${IMAGE_REF}") +if [[ -n "${COSIGN_KEY}" ]]; then + COSIGN_ARGS+=("--key" "${COSIGN_KEY}") +fi +cosign "${COSIGN_ARGS[@]}" + +COSIGN_ARGS=("attest" "--predicate" "${CDX_JSON}" "--type" "cyclonedx" "${IMAGE_REF}") +if [[ -n "${COSIGN_KEY}" ]]; then + COSIGN_ARGS+=("--key" "${COSIGN_KEY}") +fi +cosign "${COSIGN_ARGS[@]}" + +echo "SBOMs written to ${SPDX_JSON} and ${CDX_JSON}" >&2 +echo "Attestations pushed for ${IMAGE_REF}" >&2 diff --git a/deploy/docker/services-matrix.env b/deploy/docker/services-matrix.env new file mode 100644 index 000000000..4a3a35f73 --- /dev/null +++ b/deploy/docker/services-matrix.env @@ -0,0 +1,12 @@ +# service|dockerfile|project|binary|port +# Paths are relative to repo root; dockerfile is usually the shared hardened template. +api|ops/devops/docker/Dockerfile.hardened.template|src/VulnExplorer/StellaOps.VulnExplorer.Api/StellaOps.VulnExplorer.Api.csproj|StellaOps.VulnExplorer.Api|8080 +orchestrator|ops/devops/docker/Dockerfile.hardened.template|src/Orchestrator/StellaOps.Orchestrator.WebService/StellaOps.Orchestrator.WebService.csproj|StellaOps.Orchestrator.WebService|8080 +task-runner|ops/devops/docker/Dockerfile.hardened.template|src/Orchestrator/StellaOps.Orchestrator.Worker/StellaOps.Orchestrator.Worker.csproj|StellaOps.Orchestrator.Worker|8081 +concelier|ops/devops/docker/Dockerfile.hardened.template|src/Concelier/StellaOps.Concelier.WebService/StellaOps.Concelier.WebService.csproj|StellaOps.Concelier.WebService|8080 +excititor|ops/devops/docker/Dockerfile.hardened.template|src/Excititor/StellaOps.Excititor.WebService/StellaOps.Excititor.WebService.csproj|StellaOps.Excititor.WebService|8080 +policy|ops/devops/docker/Dockerfile.hardened.template|src/Policy/StellaOps.Policy.Gateway/StellaOps.Policy.Gateway.csproj|StellaOps.Policy.Gateway|8084 +notify|ops/devops/docker/Dockerfile.hardened.template|src/Notify/StellaOps.Notify.WebService/StellaOps.Notify.WebService.csproj|StellaOps.Notify.WebService|8080 +export|ops/devops/docker/Dockerfile.hardened.template|src/ExportCenter/StellaOps.ExportCenter.WebService/StellaOps.ExportCenter.WebService.csproj|StellaOps.ExportCenter.WebService|8080 +advisoryai|ops/devops/docker/Dockerfile.hardened.template|src/AdvisoryAI/StellaOps.AdvisoryAI.WebService/StellaOps.AdvisoryAI.WebService.csproj|StellaOps.AdvisoryAI.WebService|8080 +console|ops/devops/docker/Dockerfile.console|src/Web/StellaOps.Web|StellaOps.Web|8080 diff --git a/deploy/docker/verify_health_endpoints.sh b/deploy/docker/verify_health_endpoints.sh new file mode 100644 index 000000000..d45ee9b7f --- /dev/null +++ 
b/deploy/docker/verify_health_endpoints.sh @@ -0,0 +1,70 @@ +#!/usr/bin/env bash +# Smoke-check /health and capability endpoints for a built image (DOCKER-44-003) +# Usage: ./verify_health_endpoints.sh [port] +# Requires: docker, curl or wget +set -euo pipefail +IMAGE=${1:?"image ref required"} +PORT=${2:-8080} +CONTAINER_NAME="healthcheck-$$" +TIMEOUT=30 +SLEEP=1 + +have_curl=1 +if ! command -v curl >/dev/null 2>&1; then + have_curl=0 +fi + +req() { + local path=$1 + local url="http://127.0.0.1:${PORT}${path}" + if [[ $have_curl -eq 1 ]]; then + curl -fsS --max-time 3 "$url" >/dev/null + else + wget -qO- --timeout=3 "$url" >/dev/null + fi +} + +cleanup() { + docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true +} +trap cleanup EXIT + +echo "[info] starting container ${IMAGE} on port ${PORT}" >&2 +cleanup +if ! docker run -d --rm --name "$CONTAINER_NAME" -p "${PORT}:${PORT}" "$IMAGE" >/dev/null; then + echo "[error] failed to start image ${IMAGE}" >&2 + exit 1 +fi + +# wait for readiness +start=$(date +%s) +while true; do + if req /health/liveness 2>/dev/null; then break; fi + now=$(date +%s) + if (( now - start > TIMEOUT )); then + echo "[error] liveness endpoint did not come up in ${TIMEOUT}s" >&2 + exit 1 + fi + sleep $SLEEP +done + +# verify endpoints +fail=0 +for path in /health/liveness /health/readiness /version /metrics; do + if ! req "$path"; then + echo "[error] missing or failing ${path}" >&2 + fail=1 + fi +done + +# capability endpoint optional; if present ensure merge=false for Concelier/Excititor +if req /capabilities 2>/dev/null; then + body="$(curl -fsS "http://127.0.0.1:${PORT}/capabilities" 2>/dev/null || true)" + if echo "$body" | grep -q '"merge"[[:space:]]*:[[:space:]]*false'; then + : + else + echo "[warn] /capabilities present but merge flag not false" >&2 + fi +fi + +exit $fail diff --git a/deploy/helm/stellaops/Chart.yaml b/deploy/helm/stellaops/Chart.yaml new file mode 100644 index 000000000..f5b57d429 --- /dev/null +++ b/deploy/helm/stellaops/Chart.yaml @@ -0,0 +1,6 @@ +apiVersion: v2 +name: stellaops +description: Stella Ops core stack (authority, signing, scanner, UI) with infrastructure primitives. +type: application +version: 0.1.0 +appVersion: "2025.10.0" diff --git a/deploy/helm/stellaops/INSTALL.md b/deploy/helm/stellaops/INSTALL.md new file mode 100644 index 000000000..909d7e783 --- /dev/null +++ b/deploy/helm/stellaops/INSTALL.md @@ -0,0 +1,64 @@ +# StellaOps Helm Install Guide + +This guide ships with the `stellaops` chart and provides deterministic install steps for **prod** and **airgap** profiles. All images are pinned by digest from `deploy/releases/.yaml`. + +## Prerequisites +- Helm ≥ 3.14 and kubectl configured for the target cluster. +- Pull secrets for `registry.stella-ops.org` (or your mirrored registry in air-gapped mode). +- TLS/ingress secrets created if you enable ingress in the values files. + +## Channels and values +- Prod/stable: `deploy/releases/2025.09-stable.yaml` + `values-prod.yaml` +- Airgap: `deploy/releases/2025.09-airgap.yaml` + `values-airgap.yaml` +- Mirror (optional): `values-mirror.yaml` overlays registry endpoints when using a private mirror. 
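+
+The channel/values pairing above can also be scripted so the `--set global.release.*` flags in the commands below always match the pinned manifest. A minimal sketch, assuming mikefarah `yq` v4 is installed and the release manifest exposes top-level `version` and `manifestSha256` keys (confirm both against the actual manifest schema before use):
+
+```bash
+RELEASE_MANIFEST=deploy/releases/2025.09-stable.yaml
+
+# Hypothetical key names - verify against the manifest before relying on them.
+RELEASE_VERSION="$(yq e '.version' "$RELEASE_MANIFEST")"
+RELEASE_SHA="$(yq e '.manifestSha256' "$RELEASE_MANIFEST")"
+
+helm upgrade --install stellaops ./deploy/helm/stellaops \
+  --namespace stellaops --create-namespace \
+  -f deploy/helm/stellaops/values-prod.yaml \
+  --set global.release.channel=stable \
+  --set global.release.version="$RELEASE_VERSION" \
+  --set global.release.manifestSha256="$RELEASE_SHA"
+```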
+ +## Quick install (prod) +```bash +export RELEASE_CHANNEL=2025.09-stable +export NAMESPACE=stellaops + +helm upgrade --install stellaops ./deploy/helm/stellaops \ + --namespace "$NAMESPACE" --create-namespace \ + -f deploy/helm/stellaops/values-prod.yaml \ + --set global.release.channel=stable \ + --set global.release.version="2025.09.2" \ + --set global.release.manifestSha256="dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7" +``` + +## Quick install (airgap) +Assumes images are already loaded into your private registry and `values-airgap.yaml` points to that registry. +```bash +export NAMESPACE=stellaops + +helm upgrade --install stellaops ./deploy/helm/stellaops \ + --namespace "$NAMESPACE" --create-namespace \ + -f deploy/helm/stellaops/values-airgap.yaml \ + --set global.release.channel=airgap \ + --set global.release.version="2025.09.0-airgap" \ + --set global.release.manifestSha256="d422ae3ea01d5f27ea8b5fdc5b19667cb4e3e2c153a35cb761cb53a6ce4f6ba4" +``` + +## Mirror overlay +If using a mirrored registry, layer the mirror values: +```bash +helm upgrade --install stellaops ./deploy/helm/stellaops \ + --namespace "$NAMESPACE" --create-namespace \ + -f deploy/helm/stellaops/values-prod.yaml \ + -f deploy/helm/stellaops/values-mirror.yaml \ + --set global.release.version="2025.09.2" \ + --set global.release.manifestSha256="dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7" +``` + +## Validate chart and digests +```bash +deploy/tools/check-channel-alignment.py --manifest deploy/releases/$RELEASE_CHANNEL.yaml \ + --values deploy/helm/stellaops/values-prod.yaml + +helm lint ./deploy/helm/stellaops +helm template stellaops ./deploy/helm/stellaops -f deploy/helm/stellaops/values-prod.yaml >/tmp/stellaops.yaml +``` + +## Notes +- Surface.Env and Surface.Secrets defaults are defined in `values*.yaml`; adjust endpoints, cache roots, and providers before promotion. +- Keep `global.release.*` in sync with the chosen release manifest; never deploy with empty version/channel/manifestSha256. +- For offline clusters, run image preload and secret creation before `helm upgrade` to avoid pull failures. diff --git a/deploy/helm/stellaops/README-mock.md b/deploy/helm/stellaops/README-mock.md new file mode 100644 index 000000000..2683f1665 --- /dev/null +++ b/deploy/helm/stellaops/README-mock.md @@ -0,0 +1,16 @@ +# Mock Overlay (Dev Only) + +Purpose: let deployment tasks progress with placeholder digests until real releases land. + +Use: +```bash +helm template mock ./deploy/helm/stellaops -f deploy/helm/stellaops/values-mock.yaml +``` + +Contents: +- Mock deployments for orchestrator, policy-registry, packs-registry, task-runner, VEX Lens, issuer-directory, findings-ledger, vuln-explorer-api. +- Image pins pulled from `deploy/releases/2025.09-mock-dev.yaml`. + +Notes: +- Annotated with `stellaops.dev/mock: "true"` to discourage production use. +- Swap to real values once official digests publish; keep mock overlay gated behind `mock.enabled`. 
diff --git a/deploy/helm/stellaops/files/otel-collector-config.yaml b/deploy/helm/stellaops/files/otel-collector-config.yaml new file mode 100644 index 000000000..d5d0167ea --- /dev/null +++ b/deploy/helm/stellaops/files/otel-collector-config.yaml @@ -0,0 +1,64 @@ +receivers: + otlp: + protocols: + grpc: + endpoint: 0.0.0.0:4317 + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + require_client_certificate: ${STELLAOPS_OTEL_REQUIRE_CLIENT_CERT:true} + http: + endpoint: 0.0.0.0:4318 + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + require_client_certificate: ${STELLAOPS_OTEL_REQUIRE_CLIENT_CERT:true} + +processors: + attributes/tenant-tag: + actions: + - key: tenant.id + action: insert + value: ${STELLAOPS_TENANT_ID:unknown} + batch: + send_batch_size: 1024 + timeout: 5s + +exporters: + logging: + verbosity: normal + prometheus: + endpoint: ${STELLAOPS_OTEL_PROMETHEUS_ENDPOINT:0.0.0.0:9464} + enable_open_metrics: true + metric_expiration: 5m + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + +extensions: + health_check: + endpoint: ${STELLAOPS_OTEL_HEALTH_ENDPOINT:0.0.0.0:13133} + pprof: + endpoint: ${STELLAOPS_OTEL_PPROF_ENDPOINT:0.0.0.0:1777} + +service: + telemetry: + logs: + level: ${STELLAOPS_OTEL_LOG_LEVEL:info} + extensions: [health_check, pprof] + pipelines: + traces: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging] + metrics: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging, prometheus] + logs: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging] diff --git a/deploy/helm/stellaops/templates/_helpers.tpl b/deploy/helm/stellaops/templates/_helpers.tpl new file mode 100644 index 000000000..d69efc321 --- /dev/null +++ b/deploy/helm/stellaops/templates/_helpers.tpl @@ -0,0 +1,43 @@ +{{- define "stellaops.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{- define "stellaops.telemetryCollector.config" -}} +{{- if .Values.telemetry.collector.config }} +{{ tpl .Values.telemetry.collector.config . }} +{{- else }} +{{ tpl (.Files.Get "files/otel-collector-config.yaml") . }} +{{- end }} +{{- end -}} + +{{- define "stellaops.telemetryCollector.fullname" -}} +{{- printf "%s-otel-collector" (include "stellaops.name" .) 
| trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{- define "stellaops.fullname" -}} +{{- $name := default .root.Chart.Name .root.Values.fullnameOverride -}} +{{- printf "%s-%s" $name .name | trunc 63 | trimSuffix "-" -}} +{{- end -}} + +{{- define "stellaops.selectorLabels" -}} +app.kubernetes.io/name: {{ include "stellaops.name" .root | quote }} +app.kubernetes.io/instance: {{ .root.Release.Name | quote }} +app.kubernetes.io/component: {{ .name | quote }} +{{- if .svc.class }} +app.kubernetes.io/part-of: {{ printf "stellaops-%s" .svc.class | quote }} +{{- else }} +app.kubernetes.io/part-of: "stellaops-core" +{{- end }} +{{- end -}} + +{{- define "stellaops.labels" -}} +{{ include "stellaops.selectorLabels" . }} +helm.sh/chart: {{ printf "%s-%s" .root.Chart.Name .root.Chart.Version | quote }} +app.kubernetes.io/version: {{ .root.Values.global.release.version | quote }} +app.kubernetes.io/managed-by: {{ .root.Release.Service | quote }} +stellaops.release/channel: {{ .root.Values.global.release.channel | quote }} +stellaops.profile: {{ .root.Values.global.profile | quote }} +{{- range $k, $v := .root.Values.global.labels }} +{{ $k }}: {{ $v | quote }} +{{- end }} +{{- end -}} diff --git a/deploy/helm/stellaops/templates/configmap-release.yaml b/deploy/helm/stellaops/templates/configmap-release.yaml new file mode 100644 index 000000000..e788ba99a --- /dev/null +++ b/deploy/helm/stellaops/templates/configmap-release.yaml @@ -0,0 +1,10 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "stellaops.fullname" (dict "root" . "name" "release") }} + labels: + {{- include "stellaops.labels" (dict "root" . "name" "release" "svc" (dict "class" "meta")) | nindent 4 }} +data: + version: {{ .Values.global.release.version | quote }} + channel: {{ .Values.global.release.channel | quote }} + manifestSha256: {{ default "" .Values.global.release.manifestSha256 | quote }} diff --git a/deploy/helm/stellaops/templates/configmaps.yaml b/deploy/helm/stellaops/templates/configmaps.yaml new file mode 100644 index 000000000..e67dd0935 --- /dev/null +++ b/deploy/helm/stellaops/templates/configmaps.yaml @@ -0,0 +1,15 @@ +{{- $root := . -}} +{{- range $name, $cfg := .Values.configMaps }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "stellaops.fullname" (dict "root" $root "name" $name) }} + labels: + {{- include "stellaops.labels" (dict "root" $root "name" $name "svc" (dict "class" "config")) | nindent 4 }} +data: +{{- range $fileName, $content := $cfg.data }} + {{ $fileName }}: | +{{ tpl $content $root | nindent 4 }} +{{- end }} +--- +{{- end }} diff --git a/deploy/helm/stellaops/templates/console.yaml b/deploy/helm/stellaops/templates/console.yaml new file mode 100644 index 000000000..08904a10f --- /dev/null +++ b/deploy/helm/stellaops/templates/console.yaml @@ -0,0 +1,108 @@ +{{- if .Values.console.enabled }} +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "stellaops.fullname" . }}-console + labels: + app.kubernetes.io/component: console + {{- include "stellaops.labels" . | nindent 4 }} +spec: + replicas: {{ .Values.console.replicas | default 1 }} + selector: + matchLabels: + app.kubernetes.io/component: console + {{- include "stellaops.selectorLabels" . | nindent 6 }} + template: + metadata: + labels: + app.kubernetes.io/component: console + {{- include "stellaops.selectorLabels" . 
| nindent 8 }} + spec: + securityContext: + {{- toYaml .Values.console.securityContext | nindent 8 }} + containers: + - name: console + image: {{ .Values.console.image }} + imagePullPolicy: {{ .Values.global.image.pullPolicy | default "IfNotPresent" }} + ports: + - name: http + containerPort: {{ .Values.console.port | default 8080 }} + protocol: TCP + securityContext: + {{- toYaml .Values.console.containerSecurityContext | nindent 12 }} + livenessProbe: + {{- toYaml .Values.console.livenessProbe | nindent 12 }} + readinessProbe: + {{- toYaml .Values.console.readinessProbe | nindent 12 }} + resources: + {{- toYaml .Values.console.resources | nindent 12 }} + volumeMounts: + {{- toYaml .Values.console.volumeMounts | nindent 12 }} + env: + - name: APP_PORT + value: "{{ .Values.console.port | default 8080 }}" + volumes: + {{- toYaml .Values.console.volumes | nindent 8 }} +--- +apiVersion: v1 +kind: Service +metadata: + name: {{ include "stellaops.fullname" . }}-console + labels: + app.kubernetes.io/component: console + {{- include "stellaops.labels" . | nindent 4 }} +spec: + type: {{ .Values.console.service.type | default "ClusterIP" }} + ports: + - port: {{ .Values.console.service.port | default 80 }} + targetPort: {{ .Values.console.service.targetPort | default 8080 }} + protocol: TCP + name: http + selector: + app.kubernetes.io/component: console + {{- include "stellaops.selectorLabels" . | nindent 4 }} +{{- if .Values.console.ingress.enabled }} +--- +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: {{ include "stellaops.fullname" . }}-console + labels: + app.kubernetes.io/component: console + {{- include "stellaops.labels" . | nindent 4 }} + {{- with .Values.console.ingress.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + {{- if .Values.console.ingress.className }} + ingressClassName: {{ .Values.console.ingress.className }} + {{- end }} + {{- if .Values.console.ingress.tls }} + tls: + {{- range .Values.console.ingress.tls }} + - hosts: + {{- range .hosts }} + - {{ . | quote }} + {{- end }} + secretName: {{ .secretName }} + {{- end }} + {{- end }} + rules: + {{- range .Values.console.ingress.hosts }} + - host: {{ .host | quote }} + http: + paths: + {{- range .paths }} + - path: {{ .path }} + pathType: {{ .pathType | default "Prefix" }} + backend: + service: + name: {{ include "stellaops.fullname" $ }}-console + port: + name: http + {{- end }} + {{- end }} +{{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/core.yaml b/deploy/helm/stellaops/templates/core.yaml new file mode 100644 index 000000000..9158c5905 --- /dev/null +++ b/deploy/helm/stellaops/templates/core.yaml @@ -0,0 +1,225 @@ +{{- $root := . 
-}} +{{- $configMaps := default (dict) .Values.configMaps -}} +{{- $hasPolicyActivationConfig := hasKey $configMaps "policy-engine-activation" -}} +{{- $policyActivationConfigName := "" -}} +{{- if $hasPolicyActivationConfig -}} +{{- $policyActivationConfigName = include "stellaops.fullname" (dict "root" $root "name" "policy-engine-activation") -}} +{{- end -}} +{{- $policyActivationTargets := dict "policy-engine" true "policy-gateway" true -}} +{{- range $name, $svc := .Values.services }} +{{- $configMounts := (default (list) $svc.configMounts) }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "stellaops.fullname" (dict "root" $root "name" $name) }} + labels: + {{- include "stellaops.labels" (dict "root" $root "name" $name "svc" $svc) | nindent 4 }} +spec: + replicas: {{ default 1 $svc.replicas }} + selector: + matchLabels: + {{- include "stellaops.selectorLabels" (dict "root" $root "name" $name "svc" $svc) | nindent 6 }} + template: + metadata: + labels: + {{- include "stellaops.selectorLabels" (dict "root" $root "name" $name "svc" $svc) | nindent 8 }} + {{- if $svc.podAnnotations }} + annotations: +{{ toYaml $svc.podAnnotations | nindent 8 }} + {{- end }} + annotations: + stellaops.release/version: {{ $root.Values.global.release.version | quote }} + stellaops.release/channel: {{ $root.Values.global.release.channel | quote }} + spec: + {{- if $svc.podSecurityContext }} + securityContext: +{{ toYaml $svc.podSecurityContext | nindent 6 }} + {{- end }} + containers: + - name: {{ $name }} + image: {{ $svc.image | quote }} + imagePullPolicy: {{ default $root.Values.global.image.pullPolicy $svc.imagePullPolicy }} +{{- if $svc.securityContext }} + securityContext: +{{ toYaml $svc.securityContext | nindent 12 }} +{{- end }} +{{- if $svc.command }} + command: +{{- range $cmd := $svc.command }} + - {{ $cmd | quote }} +{{- end }} +{{- end }} +{{- if $svc.args }} + args: +{{- range $arg := $svc.args }} + - {{ $arg | quote }} +{{- end }} +{{- end }} +{{- if $svc.env }} + env: +{{- range $envName, $envValue := $svc.env }} + - name: {{ $envName }} + value: {{ $envValue | quote }} +{{- end }} +{{- end }} +{{- $needsPolicyActivation := and $hasPolicyActivationConfig (hasKey $policyActivationTargets $name) }} +{{- $envFrom := default (list) $svc.envFrom }} +{{- if and (hasKey $root.Values.configMaps "surface-env") (or (hasPrefix "scanner-" $name) (hasPrefix "zastava-" $name)) }} + {{- $envFrom = append $envFrom (dict "configMapRef" (dict "name" (include "stellaops.fullname" (dict "root" $root "name" "surface-env")))) }} +{{- end }} +{{- if and $needsPolicyActivation (ne $policyActivationConfigName "") }} +{{- $hasActivationReference := false }} +{{- range $envFromEntry := $envFrom }} + {{- if and (hasKey $envFromEntry "configMapRef") (eq (index (index $envFromEntry "configMapRef") "name") $policyActivationConfigName) }} + {{- $hasActivationReference = true }} + {{- end }} +{{- end }} +{{- if not $hasActivationReference }} +{{- $envFrom = append $envFrom (dict "configMapRef" (dict "name" $policyActivationConfigName)) }} +{{- end }} +{{- end }} +{{- if $envFrom }} + envFrom: +{{ toYaml $envFrom | nindent 12 }} +{{- end }} +{{- if $svc.ports }} + ports: +{{- range $port := $svc.ports }} + - name: {{ default (printf "%s-%v" $name $port.containerPort) $port.name | trunc 63 | trimSuffix "-" }} + containerPort: {{ $port.containerPort }} + protocol: {{ default "TCP" $port.protocol }} +{{- end }} +{{- else if and $svc.service (hasKey $svc.service "port") }} + {{- $svcService := $svc.service 
}} + ports: + - name: {{ printf "%s-http" $name | trunc 63 | trimSuffix "-" }} + containerPort: {{ default (index $svcService "port") (index $svcService "targetPort") }} + protocol: {{ default "TCP" (index $svcService "protocol") }} +{{- end }} +{{- if $svc.resources }} + resources: +{{ toYaml $svc.resources | nindent 12 }} +{{- end }} +{{- if $svc.securityContext }} + securityContext: +{{ toYaml $svc.securityContext | nindent 12 }} +{{- end }} +{{- if $svc.securityContext }} + securityContext: +{{ toYaml $svc.securityContext | nindent 12 }} +{{- end }} +{{- if $svc.livenessProbe }} + livenessProbe: +{{ toYaml $svc.livenessProbe | nindent 12 }} +{{- end }} +{{- if $svc.readinessProbe }} + readinessProbe: +{{ toYaml $svc.readinessProbe | nindent 12 }} +{{- end }} +{{- if $svc.prometheus }} + {{- $pr := $svc.prometheus }} + {{- if $pr.enabled }} + {{- if not $svc.podAnnotations }} + {{- $svc = merge $svc (dict "podAnnotations" (dict)) }} + {{- end }} + {{- $svc.podAnnotations = merge $svc.podAnnotations (dict "prometheus.io/scrape" "true" "prometheus.io/path" (default "/metrics" $pr.path) "prometheus.io/port" (toString (default 8080 $pr.port)) "prometheus.io/scheme" (default "http" $pr.scheme))) }} + {{- end }} +{{- end }} +{{- if or $svc.volumeMounts $configMounts }} + volumeMounts: +{{- if $svc.volumeMounts }} +{{ toYaml $svc.volumeMounts | nindent 12 }} +{{- end }} +{{- range $mount := $configMounts }} + - name: {{ $mount.name }} + mountPath: {{ $mount.mountPath }} +{{- if $mount.subPath }} + subPath: {{ $mount.subPath }} +{{- end }} +{{- if hasKey $mount "readOnly" }} + readOnly: {{ $mount.readOnly }} +{{- else }} + readOnly: true +{{- end }} +{{- end }} +{{- end }} + {{- if or $svc.volumes (or $svc.volumeClaims $configMounts) }} + volumes: +{{- if $svc.volumes }} +{{ toYaml $svc.volumes | nindent 8 }} +{{- end }} +{{- if $svc.volumeClaims }} +{{- range $claim := $svc.volumeClaims }} + - name: {{ $claim.name }} + persistentVolumeClaim: + claimName: {{ $claim.claimName }} +{{- end }} +{{- end }} +{{- range $mount := $configMounts }} + - name: {{ $mount.name }} + configMap: + name: {{ include "stellaops.fullname" (dict "root" $root "name" $mount.configMap) }} +{{- if $mount.items }} + items: +{{ toYaml $mount.items | nindent 12 }} +{{- else if $mount.subPath }} + items: + - key: {{ $mount.subPath }} + path: {{ $mount.subPath }} +{{- end }} +{{- end }} + {{- end }} + {{- if $svc.serviceAccount }} + serviceAccountName: {{ $svc.serviceAccount | quote }} + {{- end }} + {{- if $svc.nodeSelector }} + nodeSelector: +{{ toYaml $svc.nodeSelector | nindent 8 }} + {{- end }} + {{- if $svc.affinity }} + affinity: +{{ toYaml $svc.affinity | nindent 8 }} + {{- end }} +{{- if $svc.tolerations }} + tolerations: +{{ toYaml $svc.tolerations | nindent 8 }} + {{- end }} + {{- if $svc.pdb }} +--- +apiVersion: policy/v1 +kind: PodDisruptionBudget +metadata: + name: {{ include "stellaops.fullname" (dict "root" $root "name" $name) }} + labels: + {{- include "stellaops.labels" (dict "root" $root "name" $name "svc" $svc) | nindent 4 }} +spec: + {{- if $svc.pdb.minAvailable }} + minAvailable: {{ $svc.pdb.minAvailable }} + {{- end }} + {{- if $svc.pdb.maxUnavailable }} + maxUnavailable: {{ $svc.pdb.maxUnavailable }} + {{- end }} + selector: + matchLabels: + {{- include "stellaops.selectorLabels" (dict "root" $root "name" $name "svc" $svc) | nindent 6 }} + {{- end }} +--- +{{- if $svc.service }} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "stellaops.fullname" (dict "root" $root "name" $name) }} + 
labels: + {{- include "stellaops.labels" (dict "root" $root "name" $name "svc" $svc) | nindent 4 }} +spec: + type: {{ default "ClusterIP" $svc.service.type }} + selector: + {{- include "stellaops.selectorLabels" (dict "root" $root "name" $name "svc" $svc) | nindent 4 }} + ports: + - name: {{ default "http" $svc.service.portName }} + port: {{ $svc.service.port }} + targetPort: {{ $svc.service.targetPort | default $svc.service.port }} + protocol: {{ default "TCP" $svc.service.protocol }} +--- +{{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/externalsecrets.yaml b/deploy/helm/stellaops/templates/externalsecrets.yaml new file mode 100644 index 000000000..7702500d8 --- /dev/null +++ b/deploy/helm/stellaops/templates/externalsecrets.yaml @@ -0,0 +1,28 @@ +{{- if and .Values.externalSecrets.enabled .Values.externalSecrets.secrets }} +{{- range $secret := .Values.externalSecrets.secrets }} +apiVersion: external-secrets.io/v1beta1 +kind: ExternalSecret +metadata: + name: {{ include "stellaops.fullname" $ }}-{{ $secret.name }} + labels: + {{- include "stellaops.labels" $ | nindent 4 }} +spec: + refreshInterval: {{ default "1h" $secret.refreshInterval }} + secretStoreRef: + name: {{ $secret.storeRef.name }} + kind: {{ default "ClusterSecretStore" $secret.storeRef.kind }} + target: + name: {{ $secret.target.name | default (printf "%s-%s" (include "stellaops.fullname" $) $secret.name) }} + creationPolicy: {{ default "Owner" $secret.target.creationPolicy }} + data: + {{- range $secret.data }} + - secretKey: {{ .key }} + remoteRef: + key: {{ .remoteKey }} + {{- if .property }} + property: {{ .property }} + {{- end }} + {{- end }} +--- +{{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/hpa.yaml b/deploy/helm/stellaops/templates/hpa.yaml new file mode 100644 index 000000000..2c8660a5d --- /dev/null +++ b/deploy/helm/stellaops/templates/hpa.yaml @@ -0,0 +1,39 @@ +{{- if and .Values.hpa.enabled .Values.services }} +{{- range $name, $svc := .Values.services }} +{{- if and $svc.hpa $svc.hpa.enabled }} +apiVersion: autoscaling/v2 +kind: HorizontalPodAutoscaler +metadata: + name: {{ include "stellaops.fullname" (dict "root" $ "name" $name) }} + labels: + {{- include "stellaops.labels" (dict "root" $ "name" $name "svc" $svc) | nindent 4 }} +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: {{ include "stellaops.fullname" (dict "root" $ "name" $name) }} + minReplicas: {{ default $.Values.hpa.minReplicas $svc.hpa.minReplicas }} + maxReplicas: {{ default $.Values.hpa.maxReplicas $svc.hpa.maxReplicas }} + metrics: + {{- $cpu := coalesce $svc.hpa.cpu.targetPercentage $.Values.hpa.cpu.targetPercentage -}} + {{- if $cpu }} + - type: Resource + resource: + name: cpu + target: + type: Utilization + averageUtilization: {{ $cpu }} + {{- end }} + {{- $mem := coalesce $svc.hpa.memory.targetPercentage $.Values.hpa.memory.targetPercentage -}} + {{- if $mem }} + - type: Resource + resource: + name: memory + target: + type: Utilization + averageUtilization: {{ $mem }} + {{- end }} +--- +{{- end }} +{{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/ingress.yaml b/deploy/helm/stellaops/templates/ingress.yaml new file mode 100644 index 000000000..636f35ccf --- /dev/null +++ b/deploy/helm/stellaops/templates/ingress.yaml @@ -0,0 +1,32 @@ +{{- if and .Values.ingress.enabled .Values.ingress.hosts }} +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: {{ include "stellaops.fullname" . }} + labels: + {{- include "stellaops.labels" . 
| nindent 4 }} + annotations: + {{- range $k, $v := .Values.ingress.annotations }} + {{ $k }}: {{ $v | quote }} + {{- end }} +spec: + ingressClassName: {{ .Values.ingress.className | default "nginx" | quote }} + tls: + {{- range .Values.ingress.tls }} + - hosts: {{ toYaml .hosts | nindent 6 }} + secretName: {{ .secretName }} + {{- end }} + rules: + {{- range .Values.ingress.hosts }} + - host: {{ .host }} + http: + paths: + - path: {{ .path | default "/" }} + pathType: Prefix + backend: + service: + name: {{ include "stellaops.fullname" $ }}-gateway + port: + number: {{ .servicePort | default 80 }} + {{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/migrations.yaml b/deploy/helm/stellaops/templates/migrations.yaml new file mode 100644 index 000000000..cce478fb4 --- /dev/null +++ b/deploy/helm/stellaops/templates/migrations.yaml @@ -0,0 +1,50 @@ +{{- if and .Values.migrations.enabled .Values.migrations.jobs }} +{{- range $job := .Values.migrations.jobs }} +apiVersion: batch/v1 +kind: Job +metadata: + name: {{ include "stellaops.fullname" $ }}-migration-{{ $job.name | trunc 30 | trimSuffix "-" }} + labels: + {{- include "stellaops.labels" $ | nindent 4 }} + stellaops.io/component: migration + stellaops.io/migration-name: {{ $job.name | quote }} +spec: + backoffLimit: {{ default 3 $job.backoffLimit }} + ttlSecondsAfterFinished: {{ default 3600 $job.ttlSecondsAfterFinished }} + template: + metadata: + labels: + {{- include "stellaops.selectorLabels" $ | nindent 8 }} + stellaops.io/component: migration + stellaops.io/migration-name: {{ $job.name | quote }} + spec: + restartPolicy: {{ default "Never" $job.restartPolicy }} + serviceAccountName: {{ default "default" $job.serviceAccountName }} + containers: + - name: {{ $job.name | trunc 50 | trimSuffix "-" }} + image: {{ $job.image | quote }} + imagePullPolicy: {{ default "IfNotPresent" $job.imagePullPolicy }} + command: {{- if $job.command }} {{ toJson $job.command }} {{- else }} null {{- end }} + args: {{- if $job.args }} {{ toJson $job.args }} {{- else }} null {{- end }} + env: + {{- if $job.env }} + {{- range $k, $v := $job.env }} + - name: {{ $k }} + value: {{ $v | quote }} + {{- end }} + {{- end }} + envFrom: + {{- if $job.envFrom }} + {{- toYaml $job.envFrom | nindent 12 }} + {{- end }} + resources: + {{- if $job.resources }} + {{- toYaml $job.resources | nindent 12 }} + {{- else }}{} + {{- end }} + imagePullSecrets: + {{- if $.Values.global.image.pullSecrets }} + {{- toYaml $.Values.global.image.pullSecrets | nindent 8 }} + {{- end }} +{{- end }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/networkpolicy.yaml b/deploy/helm/stellaops/templates/networkpolicy.yaml new file mode 100644 index 000000000..3533464ae --- /dev/null +++ b/deploy/helm/stellaops/templates/networkpolicy.yaml @@ -0,0 +1,45 @@ +{{- if .Values.networkPolicy.enabled }} +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: {{ include "stellaops.fullname" . }}-default + labels: + {{- include "stellaops.labels" . | nindent 4 }} +spec: + podSelector: + matchLabels: + {{- include "stellaops.selectorLabelsRoot" . 
| nindent 6 }} + policyTypes: + - Ingress + - Egress + ingress: + - from: + {{- if .Values.networkPolicy.ingressNamespaces }} + - namespaceSelector: + matchLabels: + {{- toYaml .Values.networkPolicy.ingressNamespaces | nindent 14 }} + {{- end }} + {{- if .Values.networkPolicy.ingressPods }} + - podSelector: + matchLabels: + {{- toYaml .Values.networkPolicy.ingressPods | nindent 14 }} + {{- end }} + ports: + - protocol: TCP + port: {{ default 80 .Values.networkPolicy.ingressPort }} + egress: + - to: + {{- if .Values.networkPolicy.egressNamespaces }} + - namespaceSelector: + matchLabels: + {{- toYaml .Values.networkPolicy.egressNamespaces | nindent 14 }} + {{- end }} + {{- if .Values.networkPolicy.egressPods }} + - podSelector: + matchLabels: + {{- toYaml .Values.networkPolicy.egressPods | nindent 14 }} + {{- end }} + ports: + - protocol: TCP + port: {{ default 443 .Values.networkPolicy.egressPort }} +{{- end }} diff --git a/deploy/helm/stellaops/templates/orchestrator-mock.yaml b/deploy/helm/stellaops/templates/orchestrator-mock.yaml new file mode 100644 index 000000000..6b51c5944 --- /dev/null +++ b/deploy/helm/stellaops/templates/orchestrator-mock.yaml @@ -0,0 +1,22 @@ +{{- if .Values.mock.enabled }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: orchestrator-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: orchestrator-mock + template: + metadata: + labels: + app: orchestrator-mock + spec: + containers: + - name: orchestrator + image: "{{ .Values.mock.orchestrator.image }}" + args: ["dotnet", "StellaOps.Orchestrator.WebService.dll"] +{{- end }} diff --git a/deploy/helm/stellaops/templates/otel-collector.yaml b/deploy/helm/stellaops/templates/otel-collector.yaml new file mode 100644 index 000000000..f4f10f349 --- /dev/null +++ b/deploy/helm/stellaops/templates/otel-collector.yaml @@ -0,0 +1,121 @@ +{{- if .Values.telemetry.collector.enabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "stellaops.telemetryCollector.fullname" . }} + labels: + {{- include "stellaops.labels" (dict "root" . "name" "otel-collector" "svc" (dict "class" "telemetry")) | nindent 4 }} +data: + config.yaml: | +{{ include "stellaops.telemetryCollector.config" . | indent 4 }} +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "stellaops.telemetryCollector.fullname" . }} + labels: + {{- include "stellaops.labels" (dict "root" . "name" "otel-collector" "svc" (dict "class" "telemetry")) | nindent 4 }} +spec: + replicas: {{ .Values.telemetry.collector.replicas | default 1 }} + selector: + matchLabels: + app.kubernetes.io/name: {{ include "stellaops.name" . | quote }} + app.kubernetes.io/component: "otel-collector" + template: + metadata: + labels: + app.kubernetes.io/name: {{ include "stellaops.name" . 
| quote }} + app.kubernetes.io/component: "otel-collector" + stellaops.profile: {{ .Values.global.profile | quote }} + spec: + containers: + - name: otel-collector + image: {{ .Values.telemetry.collector.image | default "otel/opentelemetry-collector:0.105.0" | quote }} + args: + - "--config=/etc/otel/config.yaml" + ports: + - name: otlp-grpc + containerPort: 4317 + - name: otlp-http + containerPort: 4318 + - name: metrics + containerPort: 9464 + - name: health + containerPort: 13133 + - name: pprof + containerPort: 1777 + env: + - name: STELLAOPS_OTEL_TLS_CERT + value: {{ .Values.telemetry.collector.tls.certPath | default "/etc/otel/tls/tls.crt" | quote }} + - name: STELLAOPS_OTEL_TLS_KEY + value: {{ .Values.telemetry.collector.tls.keyPath | default "/etc/otel/tls/tls.key" | quote }} + - name: STELLAOPS_OTEL_TLS_CA + value: {{ .Values.telemetry.collector.tls.caPath | default "/etc/otel/tls/ca.crt" | quote }} + - name: STELLAOPS_OTEL_PROMETHEUS_ENDPOINT + value: {{ .Values.telemetry.collector.prometheusEndpoint | default "0.0.0.0:9464" | quote }} + - name: STELLAOPS_OTEL_REQUIRE_CLIENT_CERT + value: {{ .Values.telemetry.collector.requireClientCert | default true | quote }} + - name: STELLAOPS_TENANT_ID + value: {{ .Values.telemetry.collector.defaultTenant | default "unknown" | quote }} + - name: STELLAOPS_OTEL_LOG_LEVEL + value: {{ .Values.telemetry.collector.logLevel | default "info" | quote }} + volumeMounts: + - name: config + mountPath: /etc/otel/config.yaml + subPath: config.yaml + readOnly: true + - name: tls + mountPath: /etc/otel/tls + readOnly: true + livenessProbe: + httpGet: + scheme: HTTPS + port: health + path: /healthz + initialDelaySeconds: 10 + periodSeconds: 30 + readinessProbe: + httpGet: + scheme: HTTPS + port: health + path: /healthz + initialDelaySeconds: 5 + periodSeconds: 15 +{{- with .Values.telemetry.collector.resources }} + resources: +{{ toYaml . | indent 12 }} +{{- end }} + volumes: + - name: config + configMap: + name: {{ include "stellaops.telemetryCollector.fullname" . }} + - name: tls + secret: + secretName: {{ .Values.telemetry.collector.tls.secretName | required "telemetry.collector.tls.secretName is required" }} +{{- if .Values.telemetry.collector.tls.items }} + items: +{{ toYaml .Values.telemetry.collector.tls.items | indent 14 }} +{{- end }} +--- +apiVersion: v1 +kind: Service +metadata: + name: {{ include "stellaops.telemetryCollector.fullname" . }} + labels: + {{- include "stellaops.labels" (dict "root" . "name" "otel-collector" "svc" (dict "class" "telemetry")) | nindent 4 }} +spec: + type: ClusterIP + selector: + app.kubernetes.io/name: {{ include "stellaops.name" . 
| quote }} + app.kubernetes.io/component: "otel-collector" + ports: + - name: otlp-grpc + port: {{ .Values.telemetry.collector.service.grpcPort | default 4317 }} + targetPort: otlp-grpc + - name: otlp-http + port: {{ .Values.telemetry.collector.service.httpPort | default 4318 }} + targetPort: otlp-http + - name: metrics + port: {{ .Values.telemetry.collector.service.metricsPort | default 9464 }} + targetPort: metrics +{{- end }} diff --git a/deploy/helm/stellaops/templates/packs-mock.yaml b/deploy/helm/stellaops/templates/packs-mock.yaml new file mode 100644 index 000000000..b3c6cc7fc --- /dev/null +++ b/deploy/helm/stellaops/templates/packs-mock.yaml @@ -0,0 +1,44 @@ +{{- if .Values.mock.enabled }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: packs-registry-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: packs-registry-mock + template: + metadata: + labels: + app: packs-registry-mock + spec: + containers: + - name: packs-registry + image: "{{ .Values.mock.packsRegistry.image }}" + args: ["dotnet", "StellaOps.PacksRegistry.dll"] + +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: task-runner-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: task-runner-mock + template: + metadata: + labels: + app: task-runner-mock + spec: + containers: + - name: task-runner + image: "{{ .Values.mock.taskRunner.image }}" + args: ["dotnet", "StellaOps.TaskRunner.WebService.dll"] +{{- end }} diff --git a/deploy/helm/stellaops/templates/policy-mock.yaml b/deploy/helm/stellaops/templates/policy-mock.yaml new file mode 100644 index 000000000..7dec60676 --- /dev/null +++ b/deploy/helm/stellaops/templates/policy-mock.yaml @@ -0,0 +1,22 @@ +{{- if .Values.mock.enabled }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: policy-registry-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: policy-registry-mock + template: + metadata: + labels: + app: policy-registry-mock + spec: + containers: + - name: policy-registry + image: "{{ .Values.mock.policyRegistry.image }}" + args: ["dotnet", "StellaOps.Policy.Engine.dll"] +{{- end }} diff --git a/deploy/helm/stellaops/templates/vex-mock.yaml b/deploy/helm/stellaops/templates/vex-mock.yaml new file mode 100644 index 000000000..9a5acc595 --- /dev/null +++ b/deploy/helm/stellaops/templates/vex-mock.yaml @@ -0,0 +1,22 @@ +{{- if .Values.mock.enabled }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: vex-lens-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: vex-lens-mock + template: + metadata: + labels: + app: vex-lens-mock + spec: + containers: + - name: vex-lens + image: "{{ .Values.mock.vexLens.image }}" + args: ["dotnet", "StellaOps.VexLens.dll"] +{{- end }} diff --git a/deploy/helm/stellaops/templates/vuln-mock.yaml b/deploy/helm/stellaops/templates/vuln-mock.yaml new file mode 100644 index 000000000..b8c90af49 --- /dev/null +++ b/deploy/helm/stellaops/templates/vuln-mock.yaml @@ -0,0 +1,44 @@ +{{- if .Values.mock.enabled }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: findings-ledger-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: findings-ledger-mock + template: + metadata: + labels: + app: findings-ledger-mock + spec: + containers: + - name: findings-ledger + image: "{{ .Values.mock.findingsLedger.image }}" + args: ["dotnet", 
"StellaOps.Findings.Ledger.WebService.dll"] + +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: vuln-explorer-api-mock + annotations: + stellaops.dev/mock: "true" +spec: + replicas: 1 + selector: + matchLabels: + app: vuln-explorer-api-mock + template: + metadata: + labels: + app: vuln-explorer-api-mock + spec: + containers: + - name: vuln-explorer-api + image: "{{ .Values.mock.vulnExplorerApi.image }}" + args: ["dotnet", "StellaOps.VulnExplorer.Api.dll"] +{{- end }} diff --git a/deploy/helm/stellaops/values-airgap.yaml b/deploy/helm/stellaops/values-airgap.yaml new file mode 100644 index 000000000..428839f45 --- /dev/null +++ b/deploy/helm/stellaops/values-airgap.yaml @@ -0,0 +1,318 @@ +global: + profile: airgap + release: + version: "2025.09.2-airgap" + channel: airgap + manifestSha256: "b787b833dddd73960c31338279daa0b0a0dce2ef32bd32ef1aaf953d66135f94" + image: + pullPolicy: IfNotPresent + labels: + stellaops.io/channel: airgap + +migrations: + enabled: false + jobs: [] + +networkPolicy: + enabled: true + ingressPort: 8443 + egressPort: 443 + ingressNamespaces: + kubernetes.io/metadata.name: stellaops + egressNamespaces: + kubernetes.io/metadata.name: stellaops + +ingress: + enabled: false + className: nginx + annotations: {} + hosts: [] + tls: [] + +externalSecrets: + enabled: false + secrets: [] + +prometheus: + enabled: true + path: /metrics + port: 8080 + scheme: http + +hpa: + enabled: false + minReplicas: 1 + maxReplicas: 3 + cpu: + targetPercentage: 70 + memory: + targetPercentage: 80 + +configMaps: + notify-config: + data: + notify.yaml: | + storage: + driver: postgres + connectionString: "Host=stellaops-postgres;Port=5432;Database=notify;Username=stellaops;Password=stellaops" + commandTimeoutSeconds: 60 + + authority: + enabled: true + issuer: "https://authority.stella-ops.org" + metadataAddress: "https://authority.stella-ops.org/.well-known/openid-configuration" + requireHttpsMetadata: true + allowAnonymousFallback: false + backchannelTimeoutSeconds: 30 + tokenClockSkewSeconds: 60 + audiences: + - notify + readScope: notify.read + adminScope: notify.admin + + api: + basePath: "/api/v1/notify" + internalBasePath: "/internal/notify" + tenantHeader: "X-StellaOps-Tenant" + + plugins: + baseDirectory: "/var/opt/stellaops" + directory: "plugins/notify" + searchPatterns: + - "StellaOps.Notify.Connectors.*.dll" + orderedPlugins: + - StellaOps.Notify.Connectors.Slack + - StellaOps.Notify.Connectors.Teams + - StellaOps.Notify.Connectors.Email + - StellaOps.Notify.Connectors.Webhook + + telemetry: + enableRequestLogging: true + minimumLogLevel: Warning + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" + + +services: + authority: + image: registry.stella-ops.org/stellaops/authority@sha256:5551a3269b7008cd5aceecf45df018c67459ed519557ccbe48b093b926a39bcc + service: + port: 8440 + env: + STELLAOPS_AUTHORITY__ISSUER: "https://stellaops-authority:8440" + STELLAOPS_AUTHORITY__STORAGE__DRIVER: "postgres" + STELLAOPS_AUTHORITY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=authority;Username=stellaops;Password=stellaops" + STELLAOPS_AUTHORITY__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + STELLAOPS_AUTHORITY__ALLOWANONYMOUSFALLBACK: "false" + signer: + image: 
registry.stella-ops.org/stellaops/signer@sha256:ddbbd664a42846cea6b40fca6465bc679b30f72851158f300d01a8571c5478fc + service: + port: 8441 + env: + SIGNER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + SIGNER__POE__INTROSPECTURL: "file:///offline/poe/introspect.json" + SIGNER__STORAGE__DRIVER: "postgres" + SIGNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=signer;Username=stellaops;Password=stellaops" + SIGNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + attestor: + image: registry.stella-ops.org/stellaops/attestor@sha256:1ff0a3124d66d3a2702d8e421df40fbd98cc75cb605d95510598ebbae1433c50 + service: + port: 8442 + env: + ATTESTOR__SIGNER__BASEURL: "https://stellaops-signer:8441" + ATTESTOR__STORAGE__DRIVER: "postgres" + ATTESTOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=attestor;Username=stellaops;Password=stellaops" + ATTESTOR__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + concelier: + image: registry.stella-ops.org/stellaops/concelier@sha256:29e2e1a0972707e092cbd3d370701341f9fec2aa9316fb5d8100480f2a1c76b5 + service: + port: 8445 + env: + CONCELIER__STORAGE__DRIVER: "postgres" + CONCELIER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops" + CONCELIER__STORAGE__S3__ENDPOINT: "http://stellaops-rustfs:8080" + CONCELIER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + CONCELIER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + CONCELIER__AUTHORITY__RESILIENCE__ALLOWOFFLINECACHEFALLBACK: "true" + CONCELIER__AUTHORITY__RESILIENCE__OFFLINECACHETOLERANCE: "00:45:00" + volumeMounts: + - name: concelier-jobs + mountPath: /var/lib/concelier/jobs + volumeClaims: + - name: concelier-jobs + claimName: stellaops-concelier-jobs + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web@sha256:3df8ca21878126758203c1a0444e39fd97f77ddacf04a69685cda9f1e5e94718 + service: + port: 8444 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER__OFFLINEKIT__ENABLED: "false" + SCANNER__OFFLINEKIT__REQUIREDSSE: "true" + SCANNER__OFFLINEKIT__REKOROFFLINEMODE: "true" + SCANNER__OFFLINEKIT__TRUSTROOTDIRECTORY: "/etc/stellaops/trust-roots" + SCANNER__OFFLINEKIT__REKORSNAPSHOTDIRECTORY: "/var/lib/stellaops/rekor-snapshot" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "file" + SCANNER_SURFACE_SECRETS_ROOT: "/etc/stellaops/secrets" + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:eea5d6cfe7835950c5ec7a735a651f2f0d727d3e470cf9027a4a402ea89c4fb5 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: 
"Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "file" + SCANNER_SURFACE_SECRETS_ROOT: "/etc/stellaops/secrets" + # Secret Detection Rules Bundle + SCANNER__FEATURES__EXPERIMENTAL__SECRETLEAKDETECTION: "false" + SCANNER__SECRETS__BUNDLEPATH: "/opt/stellaops/plugins/scanner/analyzers/secrets" + SCANNER__SECRETS__REQUIRESIGNATURE: "true" + volumeMounts: + - name: secrets-rules + mountPath: /opt/stellaops/plugins/scanner/analyzers/secrets + readOnly: true + volumeClaims: + - name: secrets-rules + claimName: stellaops-secrets-rules + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:2025.09.2 + service: + port: 8446 + env: + DOTNET_ENVIRONMENT: Production + NOTIFY__QUEUE__DRIVER: "valkey" + NOTIFY__QUEUE__VALKEY__URL: "stellaops-valkey:6379" + configMounts: + - name: notify-config + mountPath: /app/etc/notify.yaml + subPath: notify.yaml + configMap: notify-config + excititor: + image: registry.stella-ops.org/stellaops/excititor@sha256:65c0ee13f773efe920d7181512349a09d363ab3f3e177d276136bd2742325a68 + env: + EXCITITOR__CONCELIER__BASEURL: "https://stellaops-concelier:8445" + EXCITITOR__STORAGE__DRIVER: "postgres" + EXCITITOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops" + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.09.2-airgap + service: + port: 8448 + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: https://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.09.2-airgap + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: https://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: 
advisory-ai-data + claimName: stellaops-advisory-ai-data + web-ui: + image: registry.stella-ops.org/stellaops/web-ui@sha256:bee9668011ff414572131dc777faab4da24473fe12c230893f161cabee092a1d + service: + port: 9443 + targetPort: 8443 + env: + STELLAOPS_UI__BACKEND__BASEURL: "https://stellaops-scanner-web:8444" + + # Infrastructure services + postgres: + class: infrastructure + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + service: + port: 5432 + env: + POSTGRES_USER: stellaops + POSTGRES_PASSWORD: stellaops + POSTGRES_DB: stellaops + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumeClaims: + - name: postgres-data + claimName: stellaops-postgres-data + valkey: + class: infrastructure + image: docker.io/valkey/valkey:9.0.1-alpine + service: + port: 6379 + command: + - valkey-server + - --appendonly + - "yes" + volumeMounts: + - name: valkey-data + mountPath: /data + volumeClaims: + - name: valkey-data + claimName: stellaops-valkey-data + rustfs: + class: infrastructure + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + service: + port: 8080 + command: + - serve + - --listen + - 0.0.0.0:8080 + - --root + - /data + env: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumeMounts: + - name: rustfs-data + mountPath: /data + volumeClaims: + - name: rustfs-data + claimName: stellaops-rustfs-data diff --git a/deploy/helm/stellaops/values-bluegreen-blue.yaml b/deploy/helm/stellaops/values-bluegreen-blue.yaml new file mode 100644 index 000000000..191fc11c1 --- /dev/null +++ b/deploy/helm/stellaops/values-bluegreen-blue.yaml @@ -0,0 +1,104 @@ +# Blue/Green Deployment: Blue Environment +# Use this file alongside values-prod.yaml for the blue (current) environment +# +# Deploy with: +# helm upgrade stellaops-blue ./devops/helm/stellaops \ +# --namespace stellaops-blue \ +# --values devops/helm/stellaops/values-prod.yaml \ +# --values devops/helm/stellaops/values-bluegreen-blue.yaml \ +# --wait + +# Environment identification +global: + profile: prod-blue + labels: + stellaops.io/environment: blue + stellaops.io/deployment-strategy: blue-green + +# Deployment identification +deployment: + environment: blue + color: blue + namespace: stellaops-blue + +# Ingress for direct blue access (for validation/debugging) +ingress: + enabled: true + hosts: + - host: stellaops-blue.example.com + path: / + servicePort: 80 + annotations: + # Not a canary - this is the primary ingress for blue + nginx.ingress.kubernetes.io/canary: "false" + +# Service naming for traffic routing +services: + api: + name: stellaops-blue-api + web: + name: stellaops-blue-web + scanner: + name: stellaops-blue-scanner + +# Pod labels for service selector +podLabels: + stellaops.io/color: blue + +# Shared resources (same for both blue and green) +database: + # IMPORTANT: Blue and Green share the same database + # Ensure migrations are N-1 compatible + host: postgres.shared.svc.cluster.local + database: stellaops_production + # Connection pool tuning for blue/green (half of normal) + pool: + minSize: 5 + maxSize: 25 + +valkey: + # Separate Valkey (Redis-compatible) instance per environment to avoid cache conflicts + host: valkey-blue.stellaops-blue.svc.cluster.local + database: 0 + +evidence: + storage: + # IMPORTANT: Shared evidence storage for continuity + bucket: stellaops-evidence-production + prefix: "" # No prefix - shared namespace + +# Health check configuration +healthCheck: + readiness: + path: /health/ready + 
initialDelaySeconds: 10 + periodSeconds: 15 + liveness: + path: /health/live + initialDelaySeconds: 30 + periodSeconds: 10 + +# Resource allocation (half of normal for blue/green) +resources: + api: + requests: + cpu: 500m + memory: 512Mi + limits: + cpu: 2000m + memory: 2Gi + scanner: + requests: + cpu: 1000m + memory: 1Gi + limits: + cpu: 4000m + memory: 4Gi + +# Replica count (half of normal for blue/green) +replicaCount: + api: 2 + web: 2 + scanner: 2 + signer: 1 + attestor: 1 diff --git a/deploy/helm/stellaops/values-bluegreen-green.yaml b/deploy/helm/stellaops/values-bluegreen-green.yaml new file mode 100644 index 000000000..c28ba12bb --- /dev/null +++ b/deploy/helm/stellaops/values-bluegreen-green.yaml @@ -0,0 +1,126 @@ +# Blue/Green Deployment: Green Environment +# Use this file alongside values-prod.yaml for the green (new version) environment +# +# Deploy with: +# helm upgrade stellaops-green ./deploy/helm/stellaops \ +# --namespace stellaops-green \ +# --create-namespace \ +# --values deploy/helm/stellaops/values-prod.yaml \ +# --values deploy/helm/stellaops/values-bluegreen-green.yaml \ +# --set global.release.version="NEW_VERSION" \ +# --wait + +# Environment identification +global: + profile: prod-green + labels: + stellaops.io/environment: green + stellaops.io/deployment-strategy: blue-green + +# Deployment identification +deployment: + environment: green + color: green + namespace: stellaops-green + +# Ingress for green - starts as canary with 0% weight +ingress: + enabled: true + hosts: + - host: stellaops-green.example.com + path: / + servicePort: 80 + annotations: + # Canary ingress for gradual traffic shifting + nginx.ingress.kubernetes.io/canary: "true" + nginx.ingress.kubernetes.io/canary-weight: "0" + # Optional: header-based routing for testing + nginx.ingress.kubernetes.io/canary-by-header: "X-Canary" + nginx.ingress.kubernetes.io/canary-by-header-value: "green" + +# Canary ingress for production hostname (traffic shifting) +canaryIngress: + enabled: true + host: stellaops.example.com + annotations: + nginx.ingress.kubernetes.io/canary: "true" + nginx.ingress.kubernetes.io/canary-weight: "0" # Start at 0%, increase during cutover + +# Service naming for traffic routing +services: + api: + name: stellaops-green-api + web: + name: stellaops-green-web + scanner: + name: stellaops-green-scanner + +# Pod labels for service selector +podLabels: + stellaops.io/color: green + +# Shared resources (same for both blue and green) +database: + # IMPORTANT: Blue and Green share the same database + # Ensure migrations are N-1 compatible + host: postgres.shared.svc.cluster.local + database: stellaops_production + # Connection pool tuning for blue/green (half of normal) + pool: + minSize: 5 + maxSize: 25 + +valkey: + # Separate Valkey (Redis-compatible) instance per environment to avoid cache conflicts + host: valkey-green.stellaops-green.svc.cluster.local + database: 0 + +evidence: + storage: + # IMPORTANT: Shared evidence storage for continuity + bucket: stellaops-evidence-production + prefix: "" # No prefix - shared namespace + +# Health check configuration +healthCheck: + readiness: + path: /health/ready + initialDelaySeconds: 10 + periodSeconds: 15 + liveness: + path: /health/live + initialDelaySeconds: 30 + periodSeconds: 10 + +# Resource allocation (half of normal for blue/green) +resources: + api: + requests: + cpu: 500m + memory: 512Mi + limits: + cpu: 2000m + memory: 2Gi + scanner: + requests: + cpu: 1000m + memory: 1Gi + limits: + cpu: 4000m + memory: 4Gi + +# Replica 
count (half of normal for blue/green) +replicaCount: + api: 2 + web: 2 + scanner: 2 + signer: 1 + attestor: 1 + +# Migration jobs - enable for green environment +migrations: + enabled: true + # Run migrations before main deployment + preUpgrade: + enabled: true + backoffLimit: 3 diff --git a/deploy/helm/stellaops/values-console.yaml b/deploy/helm/stellaops/values-console.yaml new file mode 100644 index 000000000..2eb70b35d --- /dev/null +++ b/deploy/helm/stellaops/values-console.yaml @@ -0,0 +1,84 @@ +# Console (Angular SPA) values overlay +# Use: helm install stellaops . -f values-console.yaml + +console: + enabled: true + image: registry.stella-ops.org/stellaops/console:2025.10.0-edge + replicas: 1 + port: 8080 + + # Backend API URL injected via config.json at startup + apiBaseUrl: "" + # Authority URL for OAuth/OIDC + authorityUrl: "" + # Tenant header name + tenantHeader: "X-StellaOps-Tenant" + + # Resource limits (nginx is lightweight) + resources: + limits: + cpu: "200m" + memory: "128Mi" + requests: + cpu: "50m" + memory: "64Mi" + + # Service configuration + service: + type: ClusterIP + port: 80 + targetPort: 8080 + + # Ingress configuration (enable for external access) + ingress: + enabled: false + className: nginx + annotations: + nginx.ingress.kubernetes.io/proxy-body-size: "10m" + hosts: + - host: console.local + paths: + - path: / + pathType: Prefix + tls: [] + + # Health probes + livenessProbe: + httpGet: + path: / + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 30 + readinessProbe: + httpGet: + path: / + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 10 + + # Pod security context (non-root per DOCKER-44-001) + securityContext: + runAsNonRoot: true + runAsUser: 101 + runAsGroup: 101 + fsGroup: 101 + + # Container security context + containerSecurityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL + + # Volume mounts for nginx temp directories (RO rootfs) + volumeMounts: + - name: nginx-cache + mountPath: /var/cache/nginx + - name: nginx-run + mountPath: /var/run + volumes: + - name: nginx-cache + emptyDir: {} + - name: nginx-run + emptyDir: {} diff --git a/deploy/helm/stellaops/values-dev.yaml b/deploy/helm/stellaops/values-dev.yaml new file mode 100644 index 000000000..06e5f9e45 --- /dev/null +++ b/deploy/helm/stellaops/values-dev.yaml @@ -0,0 +1,266 @@ +global: + profile: dev + release: + version: "2025.10.0-edge" + channel: edge + manifestSha256: "822f82987529ea38d2321dbdd2ef6874a4062a117116a20861c26a8df1807beb" + image: + pullPolicy: IfNotPresent + labels: + stellaops.io/channel: edge + +telemetry: + collector: + enabled: true + defaultTenant: dev + tls: + secretName: stellaops-otel-tls + +configMaps: + notify-config: + data: + notify.yaml: | + storage: + driver: postgres + connectionString: "Host=stellaops-postgres;Port=5432;Database=notify;Username=stellaops;Password=stellaops" + commandTimeoutSeconds: 30 + + authority: + enabled: true + issuer: "https://authority.dev.stella-ops.local" + metadataAddress: "https://authority.dev.stella-ops.local/.well-known/openid-configuration" + requireHttpsMetadata: false + allowAnonymousFallback: false + backchannelTimeoutSeconds: 30 + tokenClockSkewSeconds: 60 + audiences: + - notify.dev + readScope: notify.read + adminScope: notify.admin + + api: + basePath: "/api/v1/notify" + internalBasePath: "/internal/notify" + tenantHeader: "X-StellaOps-Tenant" + + plugins: + baseDirectory: "../" + directory: "plugins/notify" + searchPatterns: + - 
"StellaOps.Notify.Connectors.*.dll" + orderedPlugins: + - StellaOps.Notify.Connectors.Slack + - StellaOps.Notify.Connectors.Teams + - StellaOps.Notify.Connectors.Email + - StellaOps.Notify.Connectors.Webhook + + telemetry: + enableRequestLogging: true + minimumLogLevel: Debug + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "false" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "false" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" + +services: + authority: + image: registry.stella-ops.org/stellaops/authority@sha256:a8e8faec44a579aa5714e58be835f25575710430b1ad2ccd1282a018cd9ffcdd + service: + port: 8440 + env: + STELLAOPS_AUTHORITY__ISSUER: "https://stellaops-authority:8440" + STELLAOPS_AUTHORITY__STORAGE__DRIVER: "postgres" + STELLAOPS_AUTHORITY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=authority;Username=stellaops;Password=stellaops" + STELLAOPS_AUTHORITY__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + STELLAOPS_AUTHORITY__PLUGINDIRECTORIES__0: "/app/plugins" + STELLAOPS_AUTHORITY__PLUGINS__CONFIGURATIONDIRECTORY: "/app/etc/authority.plugins" + signer: + image: registry.stella-ops.org/stellaops/signer@sha256:8bfef9a75783883d49fc18e3566553934e970b00ee090abee9cb110d2d5c3298 + service: + port: 8441 + env: + SIGNER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + SIGNER__POE__INTROSPECTURL: "https://licensing.svc.local/introspect" + SIGNER__STORAGE__DRIVER: "postgres" + SIGNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=signer;Username=stellaops;Password=stellaops" + SIGNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + attestor: + image: registry.stella-ops.org/stellaops/attestor@sha256:5cc417948c029da01dccf36e4645d961a3f6d8de7e62fe98d845f07cd2282114 + service: + port: 8442 + env: + ATTESTOR__SIGNER__BASEURL: "https://stellaops-signer:8441" + ATTESTOR__STORAGE__DRIVER: "postgres" + ATTESTOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=attestor;Username=stellaops;Password=stellaops" + ATTESTOR__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + concelier: + image: registry.stella-ops.org/stellaops/concelier@sha256:dafef3954eb4b837e2c424dd2d23e1e4d60fa83794840fac9cd3dea1d43bd085 + service: + port: 8445 + env: + CONCELIER__STORAGE__DRIVER: "postgres" + CONCELIER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops" + CONCELIER__STORAGE__S3__ENDPOINT: "http://stellaops-rustfs:8080" + CONCELIER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + CONCELIER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + volumeMounts: + - name: concelier-jobs + mountPath: /var/lib/concelier/jobs + volumes: + - name: concelier-jobs + emptyDir: {} + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web@sha256:e0dfdb087e330585a5953029fb4757f5abdf7610820a085bd61b457dbead9a11 + service: + port: 8444 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + 
SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER__OFFLINEKIT__ENABLED: "false" + SCANNER__OFFLINEKIT__REQUIREDSSE: "true" + SCANNER__OFFLINEKIT__REKOROFFLINEMODE: "true" + SCANNER__OFFLINEKIT__TRUSTROOTDIRECTORY: "/etc/stellaops/trust-roots" + SCANNER__OFFLINEKIT__REKORSNAPSHOTDIRECTORY: "/var/lib/stellaops/rekor-snapshot" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "inline" + SCANNER_SURFACE_SECRETS_ROOT: "" + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:92dda42f6f64b2d9522104a5c9ffb61d37b34dd193132b68457a259748008f37 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "inline" + SCANNER_SURFACE_SECRETS_ROOT: "" + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:2025.10.0-edge + service: + port: 8446 + env: + DOTNET_ENVIRONMENT: Development + NOTIFY__QUEUE__DRIVER: "valkey" + NOTIFY__QUEUE__VALKEY__URL: "stellaops-valkey:6379" + configMounts: + - name: notify-config + mountPath: /app/etc/notify.yaml + subPath: notify.yaml + configMap: notify-config + excititor: + image: registry.stella-ops.org/stellaops/excititor@sha256:d9bd5cadf1eab427447ce3df7302c30ded837239771cc6433b9befb895054285 + env: + EXCITITOR__CONCELIER__BASEURL: "https://stellaops-concelier:8445" + EXCITITOR__STORAGE__DRIVER: "postgres" + EXCITITOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops" + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.10.0-edge + service: + port: 8448 + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: http://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + advisory-ai-worker: + image: 
registry.stella-ops.org/stellaops/advisory-ai-worker:2025.10.0-edge + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: http://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + web-ui: + image: registry.stella-ops.org/stellaops/web-ui@sha256:38b225fa7767a5b94ebae4dae8696044126aac429415e93de514d5dd95748dcf + service: + port: 8443 + env: + STELLAOPS_UI__BACKEND__BASEURL: "https://stellaops-scanner-web:8444" + + # Infrastructure services + postgres: + class: infrastructure + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + service: + port: 5432 + env: + POSTGRES_USER: stellaops + POSTGRES_PASSWORD: stellaops + POSTGRES_DB: stellaops + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumes: + - name: postgres-data + emptyDir: {} + valkey: + class: infrastructure + image: docker.io/valkey/valkey:9.0.1-alpine + service: + port: 6379 + command: + - valkey-server + - --appendonly + - "yes" + volumeMounts: + - name: valkey-data + mountPath: /data + volumes: + - name: valkey-data + emptyDir: {} + rustfs: + class: infrastructure + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + service: + port: 8080 + env: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumeMounts: + - name: rustfs-data + mountPath: /data + volumes: + - name: rustfs-data + emptyDir: {} diff --git a/deploy/helm/stellaops/values-export.yaml b/deploy/helm/stellaops/values-export.yaml new file mode 100644 index 000000000..35c918652 --- /dev/null +++ b/deploy/helm/stellaops/values-export.yaml @@ -0,0 +1,14 @@ +exportcenter: + image: + repository: registry.stella-ops.org/export-center + tag: latest + objectStorage: + endpoint: http://rustfs:8080 + bucket: export-prod + accessKeySecret: exportcenter-rustfs + secretKeySecret: exportcenter-rustfs + signing: + kmsKey: exportcenter-kms + kmsRegion: us-east-1 + dsse: + enabled: true diff --git a/deploy/helm/stellaops/values-exporter.yaml b/deploy/helm/stellaops/values-exporter.yaml new file mode 100644 index 000000000..bb30c69dd --- /dev/null +++ b/deploy/helm/stellaops/values-exporter.yaml @@ -0,0 +1,58 @@ +# Exporter (Export Center) values overlay +# Use: helm install stellaops . 
-f values-exporter.yaml + +exporter: + enabled: true + image: registry.stella-ops.org/stellaops/exporter:2025.10.0-edge + replicas: 1 + port: 8080 + + # Export configuration + storage: + # Object store for export artifacts + endpoint: "" + bucket: "stellaops-exports" + region: "" + + # Retention policy + retention: + defaultDays: 30 + maxDays: 365 + + resources: + limits: + cpu: "500m" + memory: "512Mi" + requests: + cpu: "100m" + memory: "256Mi" + + service: + type: ClusterIP + port: 8080 + + livenessProbe: + httpGet: + path: /health/liveness + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 30 + + readinessProbe: + httpGet: + path: /health/readiness + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 10 + + securityContext: + runAsNonRoot: true + runAsUser: 10001 + runAsGroup: 10001 + + containerSecurityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL diff --git a/deploy/helm/stellaops/values-ledger.yaml b/deploy/helm/stellaops/values-ledger.yaml new file mode 100644 index 000000000..09a8c4def --- /dev/null +++ b/deploy/helm/stellaops/values-ledger.yaml @@ -0,0 +1,59 @@ +# Ledger (Findings Ledger) values overlay +# Use: helm install stellaops . -f values-ledger.yaml + +ledger: + enabled: true + image: registry.stella-ops.org/stellaops/findings-ledger:2025.10.0-edge + replicas: 1 + port: 8080 + + # Database configuration + postgres: + host: "" + port: 5432 + database: "stellaops_ledger" + schema: "findings" + # Connection string override (takes precedence) + connectionString: "" + + # Tenant isolation + multiTenant: true + defaultTenant: "default" + + resources: + limits: + cpu: "1000m" + memory: "1Gi" + requests: + cpu: "200m" + memory: "512Mi" + + service: + type: ClusterIP + port: 8080 + + livenessProbe: + httpGet: + path: /health/liveness + port: 8080 + initialDelaySeconds: 15 + periodSeconds: 30 + + readinessProbe: + httpGet: + path: /health/readiness + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 10 + + securityContext: + runAsNonRoot: true + runAsUser: 10001 + runAsGroup: 10001 + + containerSecurityContext: + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + capabilities: + drop: + - ALL diff --git a/deploy/helm/stellaops/values-mirror.yaml b/deploy/helm/stellaops/values-mirror.yaml new file mode 100644 index 000000000..13c6b9706 --- /dev/null +++ b/deploy/helm/stellaops/values-mirror.yaml @@ -0,0 +1,305 @@ +global: + profile: mirror-managed + release: + version: "2025.10.0-edge" + channel: edge + manifestSha256: "822f82987529ea38d2321dbdd2ef6874a4062a117116a20861c26a8df1807beb" + image: + pullPolicy: IfNotPresent + labels: + stellaops.io/channel: edge + +configMaps: + mirror-gateway: + data: + mirror.conf: | + proxy_cache_path /var/cache/nginx/mirror levels=1:2 keys_zone=mirror_cache:100m max_size=10g inactive=12h use_temp_path=off; + + map $request_uri $mirror_cache_key { + default $scheme$request_method$host$request_uri; + } + + upstream concelier_backend { + server stellaops-concelier:8445; + keepalive 32; + } + + upstream excititor_backend { + server stellaops-excititor:8448; + keepalive 32; + } + + server { + listen 80; + server_name _; + return 301 https://$host$request_uri; + } + + server { + listen 443 ssl http2; + server_name mirror-primary.stella-ops.org; + + ssl_certificate /etc/nginx/tls/mirror-primary.crt; + ssl_certificate_key /etc/nginx/tls/mirror-primary.key; + ssl_protocols TLSv1.2 TLSv1.3; + ssl_prefer_server_ciphers on; + + auth_basic "StellaOps Mirror – primary"; + 
auth_basic_user_file /etc/nginx/secrets/mirror-primary.htpasswd; + + include /etc/nginx/conf.d/mirror-locations.conf; + } + + server { + listen 443 ssl http2; + server_name mirror-community.stella-ops.org; + + ssl_certificate /etc/nginx/tls/mirror-community.crt; + ssl_certificate_key /etc/nginx/tls/mirror-community.key; + ssl_protocols TLSv1.2 TLSv1.3; + ssl_prefer_server_ciphers on; + + auth_basic "StellaOps Mirror – community"; + auth_basic_user_file /etc/nginx/secrets/mirror-community.htpasswd; + + include /etc/nginx/conf.d/mirror-locations.conf; + } + mirror-locations.conf: | + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + proxy_redirect off; + + add_header X-Cache-Status $upstream_cache_status always; + + location = /healthz { + default_type application/json; + return 200 '{"status":"ok"}'; + } + + location /concelier/exports/ { + proxy_pass http://concelier_backend/concelier/exports/; + proxy_cache mirror_cache; + proxy_cache_key $mirror_cache_key; + proxy_cache_valid 200 5m; + proxy_cache_valid 404 1m; + add_header Cache-Control "public, max-age=300, immutable" always; + } + + location /concelier/ { + proxy_pass http://concelier_backend/concelier/; + proxy_cache off; + } + + location /excititor/mirror/ { + proxy_pass http://excititor_backend/excititor/mirror/; + proxy_cache mirror_cache; + proxy_cache_key $mirror_cache_key; + proxy_cache_valid 200 5m; + proxy_cache_valid 404 1m; + add_header Cache-Control "public, max-age=300, immutable" always; + } + + location /excititor/ { + proxy_pass http://excititor_backend/excititor/; + proxy_cache off; + } + + location / { + return 404; + } + + + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" + +services: + concelier: + image: registry.stella-ops.org/stellaops/concelier@sha256:dafef3954eb4b837e2c424dd2d23e1e4d60fa83794840fac9cd3dea1d43bd085 + service: + port: 8445 + env: + ASPNETCORE_URLS: "http://+:8445" + CONCELIER__STORAGE__DRIVER: "postgres" + CONCELIER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops" + CONCELIER__STORAGE__S3__ENDPOINT: "http://stellaops-rustfs:8080" + CONCELIER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + CONCELIER__TELEMETRY__SERVICENAME: "stellaops-concelier-mirror" + CONCELIER__MIRROR__ENABLED: "true" + CONCELIER__MIRROR__EXPORTROOT: "/exports/json" + CONCELIER__MIRROR__LATESTDIRECTORYNAME: "latest" + CONCELIER__MIRROR__MIRRORDIRECTORYNAME: "mirror" + CONCELIER__MIRROR__REQUIREAUTHENTICATION: "true" + CONCELIER__MIRROR__MAXINDEXREQUESTSPERHOUR: "600" + CONCELIER__MIRROR__DOMAINS__0__ID: "primary" + CONCELIER__MIRROR__DOMAINS__0__DISPLAYNAME: "Primary Mirror" + CONCELIER__MIRROR__DOMAINS__0__REQUIREAUTHENTICATION: "true" + CONCELIER__MIRROR__DOMAINS__0__MAXDOWNLOADREQUESTSPERHOUR: "3600" + CONCELIER__MIRROR__DOMAINS__1__ID: "community" + CONCELIER__MIRROR__DOMAINS__1__DISPLAYNAME: "Community Mirror" + CONCELIER__MIRROR__DOMAINS__1__REQUIREAUTHENTICATION: "false" + CONCELIER__MIRROR__DOMAINS__1__MAXDOWNLOADREQUESTSPERHOUR: "1800" + CONCELIER__AUTHORITY__ENABLED: "true" + CONCELIER__AUTHORITY__ALLOWANONYMOUSFALLBACK: "false" + CONCELIER__AUTHORITY__ISSUER: "https://authority.stella-ops.org" + 
CONCELIER__AUTHORITY__METADATAADDRESS: "" + CONCELIER__AUTHORITY__CLIENTID: "stellaops-concelier-mirror" + CONCELIER__AUTHORITY__CLIENTSECRETFILE: "/run/secrets/concelier-authority-client" + CONCELIER__AUTHORITY__CLIENTSCOPES__0: "concelier.mirror.read" + CONCELIER__AUTHORITY__AUDIENCES__0: "api://concelier.mirror" + CONCELIER__AUTHORITY__BYPASSNETWORKS__0: "10.0.0.0/8" + CONCELIER__AUTHORITY__BYPASSNETWORKS__1: "127.0.0.1/32" + CONCELIER__AUTHORITY__BYPASSNETWORKS__2: "::1/128" + CONCELIER__AUTHORITY__RESILIENCE__ENABLERETRIES: "true" + CONCELIER__AUTHORITY__RESILIENCE__RETRYDELAYS__0: "00:00:01" + CONCELIER__AUTHORITY__RESILIENCE__RETRYDELAYS__1: "00:00:02" + CONCELIER__AUTHORITY__RESILIENCE__RETRYDELAYS__2: "00:00:05" + CONCELIER__AUTHORITY__RESILIENCE__ALLOWOFFLINECACHEFALLBACK: "true" + CONCELIER__AUTHORITY__RESILIENCE__OFFLINECACHETOLERANCE: "00:10:00" + volumeMounts: + - name: concelier-jobs + mountPath: /var/lib/concelier/jobs + - name: concelier-exports + mountPath: /exports/json + - name: concelier-secrets + mountPath: /run/secrets + readOnly: true + volumes: + - name: concelier-jobs + persistentVolumeClaim: + claimName: concelier-mirror-jobs + - name: concelier-exports + persistentVolumeClaim: + claimName: concelier-mirror-exports + - name: concelier-secrets + secret: + secretName: concelier-mirror-auth + + excititor: + image: registry.stella-ops.org/stellaops/excititor@sha256:d9bd5cadf1eab427447ce3df7302c30ded837239771cc6433b9befb895054285 + env: + ASPNETCORE_URLS: "http://+:8448" + EXCITITOR__STORAGE__DRIVER: "postgres" + EXCITITOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops" + EXCITITOR__ARTIFACTS__FILESYSTEM__ROOT: "/exports" + EXCITITOR__ARTIFACTS__FILESYSTEM__OVERWRITEEXISTING: "false" + EXCITITOR__MIRROR__DOMAINS__0__ID: "primary" + EXCITITOR__MIRROR__DOMAINS__0__DISPLAYNAME: "Primary Mirror" + EXCITITOR__MIRROR__DOMAINS__0__REQUIREAUTHENTICATION: "true" + EXCITITOR__MIRROR__DOMAINS__0__MAXINDEXREQUESTSPERHOUR: "300" + EXCITITOR__MIRROR__DOMAINS__0__MAXDOWNLOADREQUESTSPERHOUR: "2400" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__0__KEY: "consensus-json" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__0__FORMAT: "json" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__0__VIEW: "consensus" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__1__KEY: "consensus-openvex" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__1__FORMAT: "openvex" + EXCITITOR__MIRROR__DOMAINS__0__EXPORTS__1__VIEW: "consensus" + EXCITITOR__MIRROR__DOMAINS__1__ID: "community" + EXCITITOR__MIRROR__DOMAINS__1__DISPLAYNAME: "Community Mirror" + EXCITITOR__MIRROR__DOMAINS__1__REQUIREAUTHENTICATION: "false" + EXCITITOR__MIRROR__DOMAINS__1__MAXINDEXREQUESTSPERHOUR: "120" + EXCITITOR__MIRROR__DOMAINS__1__MAXDOWNLOADREQUESTSPERHOUR: "600" + EXCITITOR__MIRROR__DOMAINS__1__EXPORTS__0__KEY: "community-consensus" + EXCITITOR__MIRROR__DOMAINS__1__EXPORTS__0__FORMAT: "json" + EXCITITOR__MIRROR__DOMAINS__1__EXPORTS__0__VIEW: "consensus" + volumeMounts: + - name: excititor-exports + mountPath: /exports + - name: excititor-secrets + mountPath: /run/secrets + readOnly: true + volumes: + - name: excititor-exports + persistentVolumeClaim: + claimName: excititor-mirror-exports + - name: excititor-secrets + secret: + secretName: excititor-mirror-auth + + # Infrastructure services + postgres: + class: infrastructure + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + service: + port: 5432 + env: + POSTGRES_USER: 
stellaops + POSTGRES_PASSWORD: stellaops + POSTGRES_DB: stellaops + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumeClaims: + - name: postgres-data + claimName: mirror-postgres-data + + valkey: + class: infrastructure + image: docker.io/valkey/valkey:9.0.1-alpine + service: + port: 6379 + command: + - valkey-server + - --appendonly + - "yes" + volumeMounts: + - name: valkey-data + mountPath: /data + volumeClaims: + - name: valkey-data + claimName: mirror-valkey-data + + rustfs: + class: infrastructure + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + service: + port: 8080 + command: + - serve + - --listen + - 0.0.0.0:8080 + - --root + - /data + env: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumeMounts: + - name: rustfs-data + mountPath: /data + volumeClaims: + - name: rustfs-data + claimName: mirror-rustfs-data + + mirror-gateway: + image: docker.io/library/nginx@sha256:208b70eefac13ee9be00e486f79c695b15cef861c680527171a27d253d834be9 + service: + type: LoadBalancer + port: 443 + portName: https + targetPort: 443 + configMounts: + - name: mirror-gateway-conf + mountPath: /etc/nginx/conf.d + configMap: mirror-gateway + volumeMounts: + - name: mirror-gateway-tls + mountPath: /etc/nginx/tls + readOnly: true + - name: mirror-gateway-secrets + mountPath: /etc/nginx/secrets + readOnly: true + - name: mirror-cache + mountPath: /var/cache/nginx + volumes: + - name: mirror-gateway-tls + secret: + secretName: mirror-gateway-tls + - name: mirror-gateway-secrets + secret: + secretName: mirror-gateway-htpasswd + - name: mirror-cache + emptyDir: {} diff --git a/deploy/helm/stellaops/values-mock.yaml b/deploy/helm/stellaops/values-mock.yaml new file mode 100644 index 000000000..bbaa05118 --- /dev/null +++ b/deploy/helm/stellaops/values-mock.yaml @@ -0,0 +1,18 @@ +mock: + enabled: true + orchestrator: + image: registry.stella-ops.org/stellaops/orchestrator@sha256:97f12856ce870bafd3328bda86833bcccbf56d255941d804966b5557f6610119 + policyRegistry: + image: registry.stella-ops.org/stellaops/policy-registry@sha256:c6cad8055e9827ebcbebb6ad4d6866dce4b83a0a49b0a8a6500b736a5cb26fa7 + packsRegistry: + image: registry.stella-ops.org/stellaops/packs-registry@sha256:1f5e9416c4dc608594ad6fad87c24d72134427f899c192b494e22b268499c791 + taskRunner: + image: registry.stella-ops.org/stellaops/task-runner@sha256:eb5ad992b49a41554f41516be1a6afcfa6522faf2111c08ff2b3664ad2fc954b + vexLens: + image: registry.stella-ops.org/stellaops/vex-lens@sha256:b44e63ecfeebc345a70c073c1ce5ace709c58be0ffaad0e2862758aeee3092fb + issuerDirectory: + image: registry.stella-ops.org/stellaops/issuer-directory@sha256:67e8ef02c97d3156741e857756994888f30c373ace8e84886762edba9dc51914 + findingsLedger: + image: registry.stella-ops.org/stellaops/findings-ledger@sha256:71d4c361ba8b2f8b69d652597bc3f2efc8a64f93fab854ce25272a88506df49c + vulnExplorerApi: + image: registry.stella-ops.org/stellaops/vuln-explorer-api@sha256:7fc7e43a05cbeb0106ce7d4d634612e83de6fdc119aaab754a71c1d60b82841d diff --git a/deploy/helm/stellaops/values-notify.yaml b/deploy/helm/stellaops/values-notify.yaml new file mode 100644 index 000000000..e352e109b --- /dev/null +++ b/deploy/helm/stellaops/values-notify.yaml @@ -0,0 +1,15 @@ +notify: + image: + repository: registry.stella-ops.org/notify + tag: latest + smtp: + host: smtp.example.com + port: 587 + usernameSecret: notify-smtp + passwordSecret: notify-smtp + webhook: + allowedHosts: ["https://hooks.slack.com"] + chat: + webhookSecret: notify-chat + tls: + secretName: 
notify-tls diff --git a/deploy/helm/stellaops/values-orchestrator.yaml b/deploy/helm/stellaops/values-orchestrator.yaml new file mode 100644 index 000000000..a4e889e8b --- /dev/null +++ b/deploy/helm/stellaops/values-orchestrator.yaml @@ -0,0 +1,209 @@ +# Orchestrator Service Helm Values Overlay +# Enables job scheduling, DAG planning, and worker coordination. +# +# Usage: +# helm upgrade stellaops ./stellaops -f values.yaml -f values-orchestrator.yaml + +global: + labels: + stellaops.io/component: orchestrator + +# Orchestrator-specific ConfigMaps +configMaps: + orchestrator-config: + data: + orchestrator.yaml: | + Orchestrator: + # Telemetry configuration + telemetry: + minimumLogLevel: Information + enableRequestLogging: true + otelEndpoint: "" + + # Authority integration (disable for standalone testing) + authority: + enabled: true + issuer: https://authority.svc.cluster.local/realms/stellaops + requireHttpsMetadata: true + audiences: + - stellaops-platform + readScope: orchestrator:read + writeScope: orchestrator:write + adminScope: orchestrator:admin + + # Tenant resolution + tenantHeader: X-StellaOps-Tenant + + # PostgreSQL connection + storage: + connectionString: "Host=orchestrator-postgres;Database=stellaops_orchestrator;Username=orchestrator;Password=${POSTGRES_PASSWORD}" + commandTimeoutSeconds: 60 + enableSensitiveDataLogging: false + + # Scheduler configuration + scheduler: + # Maximum concurrent jobs per tenant + defaultConcurrencyLimit: 100 + # Default rate limit (requests per second) + defaultRateLimit: 50 + # Job claim timeout before re-queue + claimTimeoutMinutes: 30 + # Heartbeat interval for active jobs + heartbeatIntervalSeconds: 30 + # Maximum heartbeat misses before job marked stale + maxHeartbeatMisses: 3 + + # Autoscaling configuration + autoscaling: + # Enable autoscaling metrics endpoint + enabled: true + # Queue depth threshold for scale-up signal + queueDepthThreshold: 10000 + # Dispatch latency P95 threshold (ms) + latencyP95ThresholdMs: 150 + # Scale-up cooldown period + scaleUpCooldownSeconds: 60 + # Scale-down cooldown period + scaleDownCooldownSeconds: 300 + + # Load shedding configuration + loadShedding: + enabled: true + # Warning threshold (load factor) + warningThreshold: 0.8 + # Critical threshold (load factor) + criticalThreshold: 1.0 + # Emergency threshold (load factor) + emergencyThreshold: 1.5 + # Recovery cooldown + recoveryCooldownSeconds: 30 + + # Dead letter configuration + deadLetter: + # Maximum replay attempts + maxReplayAttempts: 3 + # Entry expiration (days) + expirationDays: 30 + # Purge interval + purgeIntervalHours: 24 + + # Backfill configuration + backfill: + # Maximum concurrent backfill requests + maxConcurrentRequests: 5 + # Default batch size + defaultBatchSize: 1000 + # Maximum retention lookback (days) + maxRetentionDays: 90 + +# Service definitions +services: + orchestrator-web: + image: registry.stella-ops.org/stellaops/orchestrator-web:2025.10.0-edge + replicas: 2 + service: + port: 8080 + configMounts: + - name: orchestrator-config + configMap: orchestrator-config + mountPath: /app/etc/orchestrator.yaml + subPath: orchestrator.yaml + envFrom: + - secretRef: + name: orchestrator-secrets + env: + ASPNETCORE_ENVIRONMENT: Production + ORCHESTRATOR__CONFIG: /app/etc/orchestrator.yaml + ports: + - containerPort: 8080 + resources: + requests: + memory: "256Mi" + cpu: "250m" + limits: + memory: "1Gi" + cpu: "1000m" + readinessProbe: + httpGet: + path: /readyz + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 10 + 
timeoutSeconds: 5 + failureThreshold: 3 + livenessProbe: + httpGet: + path: /livez + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 20 + timeoutSeconds: 5 + failureThreshold: 3 + startupProbe: + httpGet: + path: /startupz + port: 8080 + initialDelaySeconds: 3 + periodSeconds: 5 + timeoutSeconds: 3 + failureThreshold: 30 + + orchestrator-worker: + image: registry.stella-ops.org/stellaops/orchestrator-worker:2025.10.0-edge + replicas: 1 + configMounts: + - name: orchestrator-config + configMap: orchestrator-config + mountPath: /app/etc/orchestrator.yaml + subPath: orchestrator.yaml + envFrom: + - secretRef: + name: orchestrator-secrets + env: + DOTNET_ENVIRONMENT: Production + ORCHESTRATOR__CONFIG: /app/etc/orchestrator.yaml + resources: + requests: + memory: "128Mi" + cpu: "100m" + limits: + memory: "512Mi" + cpu: "500m" + + orchestrator-postgres: + class: infrastructure + image: docker.io/library/postgres:16-alpine + service: + port: 5432 + envFrom: + - secretRef: + name: orchestrator-postgres-secrets + env: + POSTGRES_DB: stellaops_orchestrator + POSTGRES_USER: orchestrator + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumeClaims: + - name: postgres-data + claimName: orchestrator-postgres-data + readinessProbe: + exec: + command: + - pg_isready + - -U + - orchestrator + - -d + - stellaops_orchestrator + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + exec: + command: + - pg_isready + - -U + - orchestrator + - -d + - stellaops_orchestrator + initialDelaySeconds: 15 + periodSeconds: 30 diff --git a/deploy/helm/stellaops/values-prod.yaml b/deploy/helm/stellaops/values-prod.yaml new file mode 100644 index 000000000..4427dc686 --- /dev/null +++ b/deploy/helm/stellaops/values-prod.yaml @@ -0,0 +1,356 @@ +global: + profile: prod + release: + version: "2025.09.2" + channel: stable + manifestSha256: "dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7" + image: + pullPolicy: IfNotPresent + labels: + stellaops.io/channel: stable + stellaops.io/profile: prod + +# Migration jobs for controlled rollouts (disabled by default) +migrations: + enabled: false + jobs: [] + +networkPolicy: + enabled: true + ingressPort: 8443 + egressPort: 443 + ingressNamespaces: + kubernetes.io/metadata.name: stellaops + egressNamespaces: + kubernetes.io/metadata.name: stellaops + +ingress: + enabled: true + className: nginx + annotations: + nginx.ingress.kubernetes.io/proxy-body-size: "50m" + nginx.ingress.kubernetes.io/ssl-redirect: "true" + cert-manager.io/cluster-issuer: "letsencrypt-prod" + hosts: + - host: gateway.prod.stella-ops.org + path: / + servicePort: 80 + tls: + - secretName: stellaops-prod-tls + hosts: + - gateway.prod.stella-ops.org + +externalSecrets: + enabled: true + secrets: + - name: core-secrets + storeRef: + name: stellaops-secret-store + kind: ClusterSecretStore + target: + name: stellaops-prod-core + data: + - key: STELLAOPS_AUTHORITY__JWT__SIGNINGKEY + remoteKey: prod/authority/jwt-signing-key + - key: STELLAOPS_SECRETS_ENCRYPTION_KEY + remoteKey: prod/core/secrets-encryption-key + +prometheus: + enabled: true + path: /metrics + port: 8080 + scheme: http + +hpa: + enabled: true + minReplicas: 2 + maxReplicas: 6 + cpu: + targetPercentage: 70 + memory: + targetPercentage: 75 + +configMaps: + notify-config: + data: + notify.yaml: | + storage: + driver: postgres + connectionString: "Host=stellaops-postgres;Port=5432;Database=notify;Username=stellaops;Password=stellaops" + commandTimeoutSeconds: 45 + + authority: + enabled: true + 
issuer: "https://authority.prod.stella-ops.org" + metadataAddress: "https://authority.prod.stella-ops.org/.well-known/openid-configuration" + requireHttpsMetadata: true + allowAnonymousFallback: false + backchannelTimeoutSeconds: 30 + tokenClockSkewSeconds: 60 + audiences: + - notify + readScope: notify.read + adminScope: notify.admin + + api: + basePath: "/api/v1/notify" + internalBasePath: "/internal/notify" + tenantHeader: "X-StellaOps-Tenant" + + plugins: + baseDirectory: "/opt/stellaops" + directory: "plugins/notify" + searchPatterns: + - "StellaOps.Notify.Connectors.*.dll" + orderedPlugins: + - StellaOps.Notify.Connectors.Slack + - StellaOps.Notify.Connectors.Teams + - StellaOps.Notify.Connectors.Email + - StellaOps.Notify.Connectors.Webhook + + telemetry: + enableRequestLogging: true + minimumLogLevel: Information + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" +services: + authority: + image: registry.stella-ops.org/stellaops/authority@sha256:b0348bad1d0b401cc3c71cb40ba034c8043b6c8874546f90d4783c9dbfcc0bf5 + service: + port: 8440 + env: + STELLAOPS_AUTHORITY__ISSUER: "https://authority.prod.stella-ops.org" + STELLAOPS_AUTHORITY__STORAGE__DRIVER: "postgres" + STELLAOPS_AUTHORITY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=authority;Username=stellaops;Password=stellaops" + STELLAOPS_AUTHORITY__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + STELLAOPS_AUTHORITY__PLUGINDIRECTORIES__0: "/app/plugins" + STELLAOPS_AUTHORITY__PLUGINS__CONFIGURATIONDIRECTORY: "/app/etc/authority.plugins" + envFrom: + - secretRef: + name: stellaops-prod-core + signer: + image: registry.stella-ops.org/stellaops/signer@sha256:8ad574e61f3a9e9bda8a58eb2700ae46813284e35a150b1137bc7c2b92ac0f2e + service: + port: 8441 + env: + SIGNER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + SIGNER__POE__INTROSPECTURL: "https://licensing.prod.stella-ops.org/introspect" + SIGNER__STORAGE__DRIVER: "postgres" + SIGNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=signer;Username=stellaops;Password=stellaops" + SIGNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + envFrom: + - secretRef: + name: stellaops-prod-core + attestor: + image: registry.stella-ops.org/stellaops/attestor@sha256:0534985f978b0b5d220d73c96fddd962cd9135f616811cbe3bff4666c5af568f + service: + port: 8442 + env: + ATTESTOR__SIGNER__BASEURL: "https://stellaops-signer:8441" + ATTESTOR__STORAGE__DRIVER: "postgres" + ATTESTOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=attestor;Username=stellaops;Password=stellaops" + ATTESTOR__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + envFrom: + - secretRef: + name: stellaops-prod-core + concelier: + image: registry.stella-ops.org/stellaops/concelier@sha256:c58cdcaee1d266d68d498e41110a589dd204b487d37381096bd61ab345a867c5 + service: + port: 8445 + env: + CONCELIER__STORAGE__DRIVER: "postgres" + CONCELIER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops" + CONCELIER__STORAGE__S3__ENDPOINT: "http://stellaops-rustfs:8080" + CONCELIER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + CONCELIER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + envFrom: + - secretRef: + name: 
stellaops-prod-core + volumeMounts: + - name: concelier-jobs + mountPath: /var/lib/concelier/jobs + volumeClaims: + - name: concelier-jobs + claimName: stellaops-concelier-jobs + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web@sha256:14b23448c3f9586a9156370b3e8c1991b61907efa666ca37dd3aaed1e79fe3b7 + service: + port: 8444 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "true" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER__OFFLINEKIT__ENABLED: "false" + SCANNER__OFFLINEKIT__REQUIREDSSE: "true" + SCANNER__OFFLINEKIT__REKOROFFLINEMODE: "true" + SCANNER__OFFLINEKIT__TRUSTROOTDIRECTORY: "/etc/stellaops/trust-roots" + SCANNER__OFFLINEKIT__REKORSNAPSHOTDIRECTORY: "/var/lib/stellaops/rekor-snapshot" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "kubernetes" + SCANNER_SURFACE_SECRETS_ROOT: "stellaops/scanner" + envFrom: + - secretRef: + name: stellaops-prod-core + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:32e25e76386eb9ea8bee0a1ad546775db9a2df989fab61ac877e351881960dab + replicas: 3 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "true" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "kubernetes" + SCANNER_SURFACE_SECRETS_ROOT: "stellaops/scanner" + envFrom: + - secretRef: + name: stellaops-prod-core + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:2025.09.2 + service: + port: 8446 + env: + DOTNET_ENVIRONMENT: Production + NOTIFY__QUEUE__DRIVER: "valkey" + NOTIFY__QUEUE__VALKEY__URL: "stellaops-valkey:6379" + envFrom: + - secretRef: + name: stellaops-prod-notify + configMounts: + - name: notify-config + mountPath: /app/etc/notify.yaml + subPath: notify.yaml + configMap: notify-config + excititor: + image: registry.stella-ops.org/stellaops/excititor@sha256:59022e2016aebcef5c856d163ae705755d3f81949d41195256e935ef40a627fa + env: + EXCITITOR__CONCELIER__BASEURL: "https://stellaops-concelier:8445" + 
EXCITITOR__STORAGE__DRIVER: "postgres" + EXCITITOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops" + envFrom: + - secretRef: + name: stellaops-prod-core + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.09.2 + service: + port: 8448 + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: https://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + envFrom: + - secretRef: + name: stellaops-prod-core + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.09.2 + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: https://stellaops-scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + envFrom: + - secretRef: + name: stellaops-prod-core + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + web-ui: + image: registry.stella-ops.org/stellaops/web-ui@sha256:10d924808c48e4353e3a241da62eb7aefe727a1d6dc830eb23a8e181013b3a23 + service: + port: 8443 + env: + STELLAOPS_UI__BACKEND__BASEURL: "https://stellaops-scanner-web:8444" + # Infrastructure services + postgres: + class: infrastructure + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + service: + port: 5432 + env: + POSTGRES_USER: stellaops + POSTGRES_PASSWORD: stellaops + POSTGRES_DB: stellaops + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumeClaims: + - name: postgres-data + claimName: stellaops-postgres-data + valkey: + class: infrastructure + image: docker.io/valkey/valkey:9.0.1-alpine + service: + port: 6379 + command: + - valkey-server + - --appendonly + - "yes" + volumeMounts: + - name: valkey-data + mountPath: /data + volumeClaims: + - name: valkey-data + claimName: stellaops-valkey-data + rustfs: + class: infrastructure + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + service: + port: 8080 + command: + - serve + - --listen + - 0.0.0.0:8080 + - --root + - /data + env: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumeMounts: + - name: rustfs-data + mountPath: /data + volumeClaims: + - name: rustfs-data + claimName: stellaops-rustfs-data + diff --git a/deploy/helm/stellaops/values-stage.yaml b/deploy/helm/stellaops/values-stage.yaml new file mode 100644 index 000000000..385084de9 --- /dev/null +++ b/deploy/helm/stellaops/values-stage.yaml @@ -0,0 +1,238 @@ +global: + profile: stage + release: + version: "2025.09.2" + channel: stable + manifestSha256: 
"dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7" + image: + pullPolicy: IfNotPresent + labels: + stellaops.io/channel: stable + +telemetry: + collector: + enabled: true + defaultTenant: stage + tls: + secretName: stellaops-otel-tls-stage + +configMaps: + notify-config: + data: + notify.yaml: | + storage: + driver: postgres + connectionString: "Host=stellaops-postgres;Port=5432;Database=notify;Username=stellaops;Password=stellaops" + commandTimeoutSeconds: 45 + + authority: + enabled: true + issuer: "https://authority.stage.stella-ops.org" + metadataAddress: "https://authority.stage.stella-ops.org/.well-known/openid-configuration" + requireHttpsMetadata: true + allowAnonymousFallback: false + backchannelTimeoutSeconds: 30 + tokenClockSkewSeconds: 60 + audiences: + - notify + readScope: notify.read + adminScope: notify.admin + + api: + basePath: "/api/v1/notify" + internalBasePath: "/internal/notify" + tenantHeader: "X-StellaOps-Tenant" + + plugins: + baseDirectory: "/opt/stellaops" + directory: "plugins/notify" + searchPatterns: + - "StellaOps.Notify.Connectors.*.dll" + orderedPlugins: + - StellaOps.Notify.Connectors.Slack + - StellaOps.Notify.Connectors.Teams + - StellaOps.Notify.Connectors.Email + - StellaOps.Notify.Connectors.Webhook + + telemetry: + enableRequestLogging: true + minimumLogLevel: Information + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "true" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" +services: + authority: + image: registry.stella-ops.org/stellaops/authority@sha256:b0348bad1d0b401cc3c71cb40ba034c8043b6c8874546f90d4783c9dbfcc0bf5 + service: + port: 8440 + env: + STELLAOPS_AUTHORITY__ISSUER: "https://stellaops-authority:8440" + STELLAOPS_AUTHORITY__STORAGE__DRIVER: "postgres" + STELLAOPS_AUTHORITY__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=authority;Username=stellaops;Password=stellaops" + STELLAOPS_AUTHORITY__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + STELLAOPS_AUTHORITY__PLUGINDIRECTORIES__0: "/app/plugins" + STELLAOPS_AUTHORITY__PLUGINS__CONFIGURATIONDIRECTORY: "/app/etc/authority.plugins" + signer: + image: registry.stella-ops.org/stellaops/signer@sha256:8ad574e61f3a9e9bda8a58eb2700ae46813284e35a150b1137bc7c2b92ac0f2e + service: + port: 8441 + env: + SIGNER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + SIGNER__POE__INTROSPECTURL: "https://licensing.stage.stella-ops.internal/introspect" + SIGNER__STORAGE__DRIVER: "postgres" + SIGNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=signer;Username=stellaops;Password=stellaops" + SIGNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + attestor: + image: registry.stella-ops.org/stellaops/attestor@sha256:0534985f978b0b5d220d73c96fddd962cd9135f616811cbe3bff4666c5af568f + service: + port: 8442 + env: + ATTESTOR__SIGNER__BASEURL: "https://stellaops-signer:8441" + ATTESTOR__STORAGE__DRIVER: "postgres" + ATTESTOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=attestor;Username=stellaops;Password=stellaops" + ATTESTOR__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + concelier: + image: registry.stella-ops.org/stellaops/concelier@sha256:c58cdcaee1d266d68d498e41110a589dd204b487d37381096bd61ab345a867c5 + service: + port: 8445 + env: + CONCELIER__STORAGE__DRIVER: "postgres" + 
CONCELIER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=concelier;Username=stellaops;Password=stellaops" + CONCELIER__STORAGE__S3__ENDPOINT: "http://stellaops-rustfs:8080" + CONCELIER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + CONCELIER__AUTHORITY__BASEURL: "https://stellaops-authority:8440" + volumeMounts: + - name: concelier-jobs + mountPath: /var/lib/concelier/jobs + volumeClaims: + - name: concelier-jobs + claimName: stellaops-concelier-jobs + scanner-web: + image: registry.stella-ops.org/stellaops/scanner-web@sha256:14b23448c3f9586a9156370b3e8c1991b61907efa666ca37dd3aaed1e79fe3b7 + service: + port: 8444 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER__OFFLINEKIT__ENABLED: "false" + SCANNER__OFFLINEKIT__REQUIREDSSE: "true" + SCANNER__OFFLINEKIT__REKOROFFLINEMODE: "true" + SCANNER__OFFLINEKIT__TRUSTROOTDIRECTORY: "/etc/stellaops/trust-roots" + SCANNER__OFFLINEKIT__REKORSNAPSHOTDIRECTORY: "/var/lib/stellaops/rekor-snapshot" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "kubernetes" + SCANNER_SURFACE_SECRETS_ROOT: "stellaops/scanner" + scanner-worker: + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:32e25e76386eb9ea8bee0a1ad546775db9a2df989fab61ac877e351881960dab + replicas: 2 + env: + SCANNER__STORAGE__DRIVER: "postgres" + SCANNER__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=scanner;Username=stellaops;Password=stellaops" + SCANNER__CACHE__REDIS__CONNECTIONSTRING: "stellaops-valkey:6379" + SCANNER__ARTIFACTSTORE__DRIVER: "rustfs" + SCANNER__ARTIFACTSTORE__ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER__ARTIFACTSTORE__BUCKET: "scanner-artifacts" + SCANNER__ARTIFACTSTORE__TIMEOUTSECONDS: "30" + SCANNER__QUEUE__BROKER: "valkey://stellaops-valkey:6379" + SCANNER__EVENTS__ENABLED: "false" + SCANNER__EVENTS__DRIVER: "valkey" + SCANNER__EVENTS__DSN: "stellaops-valkey:6379" + SCANNER__EVENTS__STREAM: "stella.events" + SCANNER__EVENTS__PUBLISHTIMEOUTSECONDS: "5" + SCANNER__EVENTS__MAXSTREAMLENGTH: "10000" + SCANNER_SURFACE_FS_ENDPOINT: "http://stellaops-rustfs:8080/api/v1" + SCANNER_SURFACE_CACHE_ROOT: "/var/lib/stellaops/surface" + SCANNER_SURFACE_SECRETS_PROVIDER: "kubernetes" + SCANNER_SURFACE_SECRETS_ROOT: "stellaops/scanner" + notify-web: + image: registry.stella-ops.org/stellaops/notify-web:2025.09.2 + service: + port: 8446 + env: + DOTNET_ENVIRONMENT: Production + NOTIFY__QUEUE__DRIVER: "valkey" + NOTIFY__QUEUE__VALKEY__URL: "stellaops-valkey:6379" + configMounts: + - name: notify-config + mountPath: /app/etc/notify.yaml + subPath: notify.yaml + configMap: notify-config + excititor: + image: 
registry.stella-ops.org/stellaops/excititor@sha256:59022e2016aebcef5c856d163ae705755d3f81949d41195256e935ef40a627fa + env: + EXCITITOR__CONCELIER__BASEURL: "https://stellaops-concelier:8445" + EXCITITOR__STORAGE__DRIVER: "postgres" + EXCITITOR__STORAGE__POSTGRES__CONNECTIONSTRING: "Host=stellaops-postgres;Port=5432;Database=excititor;Username=stellaops;Password=stellaops" + web-ui: + image: registry.stella-ops.org/stellaops/web-ui@sha256:10d924808c48e4353e3a241da62eb7aefe727a1d6dc830eb23a8e181013b3a23 + service: + port: 8443 + env: + STELLAOPS_UI__BACKEND__BASEURL: "https://stellaops-scanner-web:8444" + + # Infrastructure services + postgres: + class: infrastructure + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + service: + port: 5432 + env: + POSTGRES_USER: stellaops + POSTGRES_PASSWORD: stellaops + POSTGRES_DB: stellaops + volumeMounts: + - name: postgres-data + mountPath: /var/lib/postgresql/data + volumeClaims: + - name: postgres-data + claimName: stellaops-postgres-data + valkey: + class: infrastructure + image: docker.io/valkey/valkey:9.0.1-alpine + service: + port: 6379 + command: + - valkey-server + - --appendonly + - "yes" + volumeMounts: + - name: valkey-data + mountPath: /data + volumeClaims: + - name: valkey-data + claimName: stellaops-valkey-data + rustfs: + class: infrastructure + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + service: + port: 8080 + command: + - serve + - --listen + - 0.0.0.0:8080 + - --root + - /data + env: + RUSTFS__LOG__LEVEL: info + RUSTFS__STORAGE__PATH: /data + volumeMounts: + - name: rustfs-data + mountPath: /data + volumeClaims: + - name: rustfs-data + claimName: stellaops-rustfs-data diff --git a/deploy/helm/stellaops/values.yaml b/deploy/helm/stellaops/values.yaml new file mode 100644 index 000000000..e76b39311 --- /dev/null +++ b/deploy/helm/stellaops/values.yaml @@ -0,0 +1,281 @@ +global: + release: + version: "" + channel: "" + manifestSha256: "" + profile: "" + image: + pullPolicy: IfNotPresent + labels: {} + +migrations: + enabled: false + jobs: [] + +networkPolicy: + enabled: false + ingressPort: 80 + egressPort: 443 + ingressNamespaces: {} + ingressPods: {} + egressNamespaces: {} + egressPods: {} + +ingress: + enabled: false + className: nginx + annotations: {} + hosts: [] + tls: [] + +externalSecrets: + enabled: false + secrets: [] + +prometheus: + enabled: false + path: /metrics + port: 8080 + scheme: http + +hpa: + enabled: false + minReplicas: 1 + maxReplicas: 3 + cpu: + targetPercentage: 75 + memory: + targetPercentage: null + +# Surface.Env configuration for Scanner/Zastava components +# See docs/modules/scanner/design/surface-env.md for details +surface: + # Surface.FS storage configuration + fs: + # Base URI for Surface.FS / RustFS / S3-compatible store (required) + endpoint: "" + # Bucket/container for manifests and artefacts + bucket: "surface-cache" + # Optional region for S3-compatible stores (AWS/GCS) + region: "" + # Local cache configuration + cache: + # Local directory for warm caches + root: "/var/lib/stellaops/surface" + # Soft limit for on-disk cache usage in MB (64-262144) + quotaMb: 4096 + # Enable manifest prefetch threads + prefetchEnabled: false + # Tenant configuration + tenant: "default" + # Comma-separated feature switches + features: "" + # TLS configuration for client authentication + tls: + # Path to PEM/PKCS#12 certificate file + certPath: "" + # Optional private key path when cert/key stored separately + keyPath: "" + # Secret name 
containing TLS cert/key + secretName: "" + # Secrets provider configuration + secrets: + # Provider ID: kubernetes, file, inline + provider: "kubernetes" + # Kubernetes namespace for secrets provider + namespace: "" + # Path or base for file provider + root: "" + # Optional fallback provider ID + fallbackProvider: "" + # Allow inline secrets (disable in production) + allowInline: false + +telemetry: + collector: + enabled: false + replicas: 1 + image: otel/opentelemetry-collector:0.105.0 + requireClientCert: true + defaultTenant: unknown + logLevel: info + tls: + secretName: "" + certPath: /etc/otel/tls/tls.crt + keyPath: /etc/otel/tls/tls.key + caPath: /etc/otel/tls/ca.crt + items: + - key: tls.crt + path: tls.crt + - key: tls.key + path: tls.key + - key: ca.crt + path: ca.crt + service: + grpcPort: 4317 + httpPort: 4318 + metricsPort: 9464 + resources: {} + +configMaps: + # Surface.Env environment variables for Scanner/Zastava components + surface-env: + data: + SCANNER_SURFACE_FS_ENDPOINT: "{{ .Values.surface.fs.endpoint }}" + SCANNER_SURFACE_FS_BUCKET: "{{ .Values.surface.fs.bucket }}" + SCANNER_SURFACE_FS_REGION: "{{ .Values.surface.fs.region }}" + SCANNER_SURFACE_CACHE_ROOT: "{{ .Values.surface.cache.root }}" + SCANNER_SURFACE_CACHE_QUOTA_MB: "{{ .Values.surface.cache.quotaMb }}" + SCANNER_SURFACE_PREFETCH_ENABLED: "{{ .Values.surface.cache.prefetchEnabled }}" + SCANNER_SURFACE_TENANT: "{{ .Values.surface.tenant }}" + SCANNER_SURFACE_FEATURES: "{{ .Values.surface.features }}" + SCANNER_SURFACE_TLS_CERT_PATH: "{{ .Values.surface.tls.certPath }}" + SCANNER_SURFACE_TLS_KEY_PATH: "{{ .Values.surface.tls.keyPath }}" + SCANNER_SURFACE_SECRETS_PROVIDER: "{{ .Values.surface.secrets.provider }}" + SCANNER_SURFACE_SECRETS_NAMESPACE: "{{ .Values.surface.secrets.namespace }}" + SCANNER_SURFACE_SECRETS_ROOT: "{{ .Values.surface.secrets.root }}" + SCANNER_SURFACE_SECRETS_FALLBACK_PROVIDER: "{{ .Values.surface.secrets.fallbackProvider }}" + SCANNER_SURFACE_SECRETS_ALLOW_INLINE: "{{ .Values.surface.secrets.allowInline }}" + # Zastava consumers inherit Scanner defaults but can be overridden via ZASTAVA_* envs + ZASTAVA_SURFACE_FS_ENDPOINT: "{{ .Values.surface.fs.endpoint }}" + ZASTAVA_SURFACE_FS_BUCKET: "{{ .Values.surface.fs.bucket }}" + ZASTAVA_SURFACE_FS_REGION: "{{ .Values.surface.fs.region }}" + ZASTAVA_SURFACE_CACHE_ROOT: "{{ .Values.surface.cache.root }}" + ZASTAVA_SURFACE_CACHE_QUOTA_MB: "{{ .Values.surface.cache.quotaMb }}" + ZASTAVA_SURFACE_PREFETCH_ENABLED: "{{ .Values.surface.cache.prefetchEnabled }}" + ZASTAVA_SURFACE_TENANT: "{{ .Values.surface.tenant }}" + ZASTAVA_SURFACE_FEATURES: "{{ .Values.surface.features }}" + ZASTAVA_SURFACE_TLS_CERT_PATH: "{{ .Values.surface.tls.certPath }}" + ZASTAVA_SURFACE_TLS_KEY_PATH: "{{ .Values.surface.tls.keyPath }}" + ZASTAVA_SURFACE_SECRETS_PROVIDER: "{{ .Values.surface.secrets.provider }}" + ZASTAVA_SURFACE_SECRETS_NAMESPACE: "{{ .Values.surface.secrets.namespace }}" + ZASTAVA_SURFACE_SECRETS_ROOT: "{{ .Values.surface.secrets.root }}" + ZASTAVA_SURFACE_SECRETS_FALLBACK_PROVIDER: "{{ .Values.surface.secrets.fallbackProvider }}" + ZASTAVA_SURFACE_SECRETS_ALLOW_INLINE: "{{ .Values.surface.secrets.allowInline }}" + + issuer-directory-config: + data: + issuer-directory.yaml: | + IssuerDirectory: + telemetry: + minimumLogLevel: Information + authority: + enabled: true + issuer: https://authority.svc.cluster.local/realms/stellaops + requireHttpsMetadata: true + audiences: + - stellaops-platform + readScope: issuer-directory:read + writeScope: 
issuer-directory:write + adminScope: issuer-directory:admin + tenantHeader: X-StellaOps-Tenant + seedCsafPublishers: true + csafSeedPath: data/csaf-publishers.json + Storage: + Driver: postgres + Postgres: + ConnectionString: Host=postgres;Port=5432;Database=issuer_directory;Username=stellaops;Password=stellaops + + policy-engine-activation: + data: + STELLAOPS_POLICY_ENGINE__ACTIVATION__FORCETWOPERSONAPPROVAL: "false" + STELLAOPS_POLICY_ENGINE__ACTIVATION__DEFAULTREQUIRESTWOPERSONAPPROVAL: "false" + STELLAOPS_POLICY_ENGINE__ACTIVATION__EMITAUDITLOGS: "true" + +services: + issuer-directory: + image: registry.stella-ops.org/stellaops/issuer-directory-web:2025.10.0-edge + replicas: 1 + configMounts: + - name: issuer-directory-config + configMap: issuer-directory-config + mountPath: /etc/issuer-directory.yaml + subPath: issuer-directory.yaml + envFrom: + - secretRef: + name: issuer-directory-secrets + env: + ISSUERDIRECTORY__CONFIG: /etc/issuer-directory.yaml + ISSUERDIRECTORY__AUTHORITY__BASEURL: https://authority:8440 + ISSUERDIRECTORY__SEEDCSAFPUBLISHERS: "true" + ports: + - containerPort: 8080 + service: + port: 8080 + readinessProbe: + httpGet: + path: /health/live + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + httpGet: + path: /health/live + port: 8080 + initialDelaySeconds: 10 + periodSeconds: 20 + scheduler-worker: + image: registry.stella-ops.org/stellaops/scheduler-worker:2025.10.0-edge + replicas: 1 + command: + - dotnet + - StellaOps.Scheduler.Worker.Host.dll + env: + SCHEDULER__QUEUE__KIND: Valkey + SCHEDULER__QUEUE__VALKEY__URL: valkey:6379 + SCHEDULER__STORAGE__DRIVER: postgres + SCHEDULER__STORAGE__POSTGRES__CONNECTIONSTRING: Host=postgres;Port=5432;Database=scheduler;Username=stellaops;Password=stellaops + SCHEDULER__WORKER__RUNNER__SCANNER__BASEADDRESS: http://scanner-web:8444 + advisory-ai-web: + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.10.0-edge + service: + port: 8448 + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: http://scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + advisory-ai-worker: + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.10.0-edge + env: + ADVISORYAI__AdvisoryAI__SbomBaseAddress: http://scanner-web:8444 + ADVISORYAI__AdvisoryAI__Queue__DirectoryPath: /var/lib/advisory-ai/queue + ADVISORYAI__AdvisoryAI__Storage__PlanCacheDirectory: /var/lib/advisory-ai/plans + ADVISORYAI__AdvisoryAI__Storage__OutputDirectory: /var/lib/advisory-ai/outputs + ADVISORYAI__AdvisoryAI__Inference__Mode: Local + ADVISORYAI__AdvisoryAI__Inference__Remote__BaseAddress: "" + ADVISORYAI__AdvisoryAI__Inference__Remote__ApiKey: "" + volumeMounts: + - name: advisory-ai-data + mountPath: /var/lib/advisory-ai + volumeClaims: + - name: advisory-ai-data + claimName: stellaops-advisory-ai-data + +mock: + enabled: false + orchestrator: + image: registry.stella-ops.org/stellaops/orchestrator@sha256:97f12856ce870bafd3328bda86833bcccbf56d255941d804966b5557f6610119 + policyRegistry: + image: 
registry.stella-ops.org/stellaops/policy-registry@sha256:c6cad8055e9827ebcbebb6ad4d6866dce4b83a0a49b0a8a6500b736a5cb26fa7 + packsRegistry: + image: registry.stella-ops.org/stellaops/packs-registry@sha256:1f5e9416c4dc608594ad6fad87c24d72134427f899c192b494e22b268499c791 + taskRunner: + image: registry.stella-ops.org/stellaops/task-runner@sha256:eb5ad992b49a41554f41516be1a6afcfa6522faf2111c08ff2b3664ad2fc954b + vexLens: + image: registry.stella-ops.org/stellaops/vex-lens@sha256:b44e63ecfeebc345a70c073c1ce5ace709c58be0ffaad0e2862758aeee3092fb + issuerDirectory: + image: registry.stella-ops.org/stellaops/issuer-directory@sha256:67e8ef02c97d3156741e857756994888f30c373ace8e84886762edba9dc51914 + findingsLedger: + image: registry.stella-ops.org/stellaops/findings-ledger@sha256:71d4c361ba8b2f8b69d652597bc3f2efc8a64f93fab854ce25272a88506df49c + vulnExplorerApi: + image: registry.stella-ops.org/stellaops/vuln-explorer-api@sha256:7fc7e43a05cbeb0106ce7d4d634612e83de6fdc119aaab754a71c1d60b82841d diff --git a/deploy/offline/airgap/README.md b/deploy/offline/airgap/README.md new file mode 100644 index 000000000..e675fad29 --- /dev/null +++ b/deploy/offline/airgap/README.md @@ -0,0 +1,22 @@ +# Air-gap Egress Guard Rails + +Artifacts supporting `DEVOPS-AIRGAP-56-001`: + +- `k8s-deny-egress.yaml` — NetworkPolicy template that denies all egress for pods labeled `sealed=true`, except optional in-cluster DNS when enabled. +- `compose-egress-guard.sh` — Idempotent iptables guard for Docker/compose using the `DOCKER-USER` chain to drop all outbound traffic from a compose project network while allowing loopback and RFC1918 intra-cluster ranges. +- `verify-egress-block.sh` — Verification harness that runs curl probes from Docker or Kubernetes and reports JSON results; exits non-zero if any target is reachable. +- `bundle_stage_import.py` — Deterministic bundle staging helper: validates sha256 manifest, copies bundles to staging dir as `-`, emits `staging-report.json` for evidence. +- `stage-bundle.sh` — Thin wrapper around `bundle_stage_import.py` with positional args. +- `build_bootstrap_pack.py` — Builds a Bootstrap Pack from images/charts/extras listed in a JSON config, writing `bootstrap-manifest.json` + `checksums.sha256` deterministically. +- `build_bootstrap_pack.sh` — Wrapper for the bootstrap pack builder. +- `build_mirror_bundle.py` — Generates mirror bundle manifest + checksums with dual-control approvals; optional cosign signing. Outputs `mirror-bundle-manifest.json`, `checksums.sha256`, and optional signature/cert. +- `compose-syslog-smtp.yaml` — Local SMTP (MailHog) + syslog-ng stack for sealed environments. +- `health_syslog_smtp.sh` — Brings up the syslog/SMTP stack via docker compose and performs health checks (MailHog API + syslog logger). +- `compose-observability.yaml` — Sealed-mode observability stack (Prometheus, Grafana, Tempo, Loki) with offline configs and healthchecks. +- `health_observability.sh` — Starts the observability stack and probes Prometheus/Grafana/Tempo/Loki readiness. +- `compose-syslog-smtp.yaml` + `syslog-ng.conf` — Local SMTP + syslog stack for sealed-mode notifications; run via `scripts/devops/run-smtp-syslog.sh` (health check `health_syslog_smtp.sh`). +- `observability-offline-compose.yml` + `otel-offline.yaml` + `promtail-config.yaml` — Sealed-mode observability stack (Loki, Promtail, OTEL collector with file exporters) to satisfy DEVOPS-AIRGAP-58-002. +- `compose-syslog-smtp.yaml` — Local SMTP (MailHog) + syslog-ng stack for sealed environments. 
+- `health_syslog_smtp.sh` — Brings up the syslog/SMTP stack via docker compose and performs health checks (MailHog API + syslog logger). + +See also `ops/devops/sealed-mode-ci/` for the full sealed-mode compose harness and `egress_probe.py`, which this verification script wraps. diff --git a/deploy/offline/airgap/build_bootstrap_pack.py b/deploy/offline/airgap/build_bootstrap_pack.py new file mode 100644 index 000000000..5ec1a5c72 --- /dev/null +++ b/deploy/offline/airgap/build_bootstrap_pack.py @@ -0,0 +1,174 @@ +#!/usr/bin/env python3 +"""Build a deterministic Bootstrap Pack bundle for sealed/offline transfer. + +- Reads a JSON config listing artefacts to include (images, Helm charts, extras). +- Copies artefacts into an output directory with preserved basenames. +- Generates `bootstrap-manifest.json` and `checksums.sha256` with sha256 hashes + and sizes for evidence/verification. +- Intended to satisfy DEVOPS-AIRGAP-56-003. + +Config schema (JSON): +{ + "name": "bootstrap-pack", + "images": ["release/containers/taskrunner.tar", "release/containers/orchestrator.tar"], + "charts": ["deploy/helm/stella.tgz"], + "extras": ["docs/24_OFFLINE_KIT.md"] +} + +Usage: + build_bootstrap_pack.py --config bootstrap.json --output out/bootstrap-pack + build_bootstrap_pack.py --self-test +""" +from __future__ import annotations + +import argparse +import hashlib +import json +import os +import shutil +import sys +from datetime import datetime, timezone +from pathlib import Path +from typing import Dict, List, Tuple + +DEFAULT_NAME = "bootstrap-pack" + + +def sha256_file(path: Path) -> Tuple[str, int]: + h = hashlib.sha256() + size = 0 + with path.open("rb") as f: + for chunk in iter(lambda: f.read(1024 * 1024), b""): + h.update(chunk) + size += len(chunk) + return h.hexdigest(), size + + +def load_config(path: Path) -> Dict: + with path.open("r", encoding="utf-8") as handle: + cfg = json.load(handle) + if not isinstance(cfg, dict): + raise ValueError("config must be a JSON object") + return cfg + + +def ensure_list(cfg: Dict, key: str) -> List[str]: + value = cfg.get(key, []) + if value is None: + return [] + if not isinstance(value, list): + raise ValueError(f"config.{key} must be a list") + return [str(x) for x in value] + + +def copy_item(src: Path, dest_root: Path, rel_dir: str) -> Tuple[str, str, int]: + dest_dir = dest_root / rel_dir + dest_dir.mkdir(parents=True, exist_ok=True) + dest_path = dest_dir / src.name + shutil.copy2(src, dest_path) + digest, size = sha256_file(dest_path) + rel_path = dest_path.relative_to(dest_root).as_posix() + return rel_path, digest, size + + +def build_pack(config_path: Path, output_dir: Path) -> Dict: + cfg = load_config(config_path) + name = cfg.get("name", DEFAULT_NAME) + images = ensure_list(cfg, "images") + charts = ensure_list(cfg, "charts") + extras = ensure_list(cfg, "extras") + + output_dir.mkdir(parents=True, exist_ok=True) + items = [] + + def process_list(paths: List[str], kind: str, rel_dir: str): + for raw in sorted(paths): + src = Path(raw).expanduser().resolve() + if not src.exists(): + items.append({ + "type": kind, + "source": raw, + "status": "missing" + }) + continue + rel_path, digest, size = copy_item(src, output_dir, rel_dir) + items.append({ + "type": kind, + "source": raw, + "path": rel_path, + "sha256": digest, + "size": size, + "status": "ok", + }) + + process_list(images, "image", "images") + process_list(charts, "chart", "charts") + process_list(extras, "extra", "extras") + + manifest = { + "name": name, + "created": 
datetime.now(timezone.utc).isoformat(), + "items": items, + } + + # checksums file (only for ok items) + checksum_lines = [f"{item['sha256']} {item['path']}" for item in items if item.get("status") == "ok"] + (output_dir / "checksums.sha256").write_text("\n".join(checksum_lines) + ("\n" if checksum_lines else ""), encoding="utf-8") + (output_dir / "bootstrap-manifest.json").write_text(json.dumps(manifest, ensure_ascii=False, indent=2) + "\n", encoding="utf-8") + return manifest + + +def parse_args(argv: List[str]) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument("--config", type=Path, help="Path to bootstrap pack config JSON") + parser.add_argument("--output", type=Path, help="Output directory for the pack") + parser.add_argument("--self-test", action="store_true", help="Run internal self-test and exit") + return parser.parse_args(argv) + + +def self_test() -> int: + import tempfile + + with tempfile.TemporaryDirectory() as tmp: + tmpdir = Path(tmp) + files = [] + for name, content in [("img1.tar", b"image-one"), ("chart1.tgz", b"chart-one"), ("readme.txt", b"hello")]: + p = tmpdir / name + p.write_bytes(content) + files.append(p) + cfg = { + "images": [str(files[0])], + "charts": [str(files[1])], + "extras": [str(files[2])], + } + cfg_path = tmpdir / "bootstrap.json" + cfg_path.write_text(json.dumps(cfg), encoding="utf-8") + outdir = tmpdir / "out" + manifest = build_pack(cfg_path, outdir) + assert all(item.get("status") == "ok" for item in manifest["items"]), manifest + for rel in ["images/img1.tar", "charts/chart1.tgz", "extras/readme.txt", "checksums.sha256", "bootstrap-manifest.json"]: + assert (outdir / rel).exists(), f"missing {rel}" + print("self-test passed") + return 0 + + +def main(argv: List[str]) -> int: + args = parse_args(argv) + if args.self_test: + return self_test() + if not (args.config and args.output): + print("--config and --output are required unless --self-test", file=sys.stderr) + return 2 + manifest = build_pack(args.config, args.output) + missing = [i for i in manifest["items"] if i.get("status") == "missing"] + if missing: + print("Pack built with missing items:") + for item in missing: + print(f" - {item['source']}") + return 1 + print(f"Bootstrap pack written to {args.output}") + return 0 + + +if __name__ == "__main__": # pragma: no cover + sys.exit(main(sys.argv[1:])) diff --git a/deploy/offline/airgap/build_bootstrap_pack.sh b/deploy/offline/airgap/build_bootstrap_pack.sh new file mode 100644 index 000000000..9e8ace6f8 --- /dev/null +++ b/deploy/offline/airgap/build_bootstrap_pack.sh @@ -0,0 +1,10 @@ +#!/usr/bin/env bash +# Thin wrapper for build_bootstrap_pack.py +# Usage: ./build_bootstrap_pack.sh config.json out/bootstrap-pack +set -euo pipefail +if [[ $# -lt 2 ]]; then + echo "Usage: $0 " >&2 + exit 2 +fi +SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd) +python3 "$SCRIPT_DIR/build_bootstrap_pack.py" --config "$1" --output "$2" diff --git a/deploy/offline/airgap/build_mirror_bundle.py b/deploy/offline/airgap/build_mirror_bundle.py new file mode 100644 index 000000000..f40213056 --- /dev/null +++ b/deploy/offline/airgap/build_mirror_bundle.py @@ -0,0 +1,154 @@ +#!/usr/bin/env python3 +"""Automate mirror bundle manifest + checksums with dual-control approvals. + +Implements DEVOPS-AIRGAP-57-001. + +Features: +- Deterministic manifest (`mirror-bundle-manifest.json`) with sha256/size per file. +- `checksums.sha256` for quick verification. 
+- Dual-control approvals recorded via `--approver` (min 2 required to mark approved). +- Optional cosign signing of the manifest via `--cosign-key` (sign-blob); writes + `mirror-bundle-manifest.sig` and `mirror-bundle-manifest.pem` when available. +- Offline-friendly: purely local file reads; no network access. + +Usage: + build_mirror_bundle.py --root /path/to/bundles --output out/mirror \ + --approver alice@example.com --approver bob@example.com + + build_mirror_bundle.py --self-test +""" +from __future__ import annotations + +import argparse +import hashlib +import json +import os +import shutil +import subprocess +import sys +from datetime import datetime, timezone +from pathlib import Path +from typing import Dict, List, Optional + + +def sha256_file(path: Path) -> Dict[str, int | str]: + h = hashlib.sha256() + size = 0 + with path.open("rb") as f: + for chunk in iter(lambda: f.read(1024 * 1024), b""): + h.update(chunk) + size += len(chunk) + return {"sha256": h.hexdigest(), "size": size} + + +def find_files(root: Path) -> List[Path]: + files: List[Path] = [] + for p in sorted(root.rglob("*")): + if p.is_file(): + files.append(p) + return files + + +def write_checksums(items: List[Dict], output_dir: Path) -> None: + lines = [f"{item['sha256']} {item['path']}" for item in items] + (output_dir / "checksums.sha256").write_text("\n".join(lines) + ("\n" if lines else ""), encoding="utf-8") + + +def maybe_sign(manifest_path: Path, key: Optional[str]) -> Dict[str, str]: + if not key: + return {"status": "skipped", "reason": "no key provided"} + if shutil.which("cosign") is None: + return {"status": "skipped", "reason": "cosign not found"} + sig = manifest_path.with_suffix(manifest_path.suffix + ".sig") + pem = manifest_path.with_suffix(manifest_path.suffix + ".pem") + try: + subprocess.run( + ["cosign", "sign-blob", "--key", key, "--output-signature", str(sig), "--output-certificate", str(pem), str(manifest_path)], + check=True, + capture_output=True, + text=True, + ) + return { + "status": "signed", + "signature": sig.name, + "certificate": pem.name, + } + except subprocess.CalledProcessError as exc: # pragma: no cover + return {"status": "failed", "reason": exc.stderr or str(exc)} + + +def build_manifest(root: Path, output_dir: Path, approvers: List[str], cosign_key: Optional[str]) -> Dict: + files = find_files(root) + items: List[Dict] = [] + for p in files: + rel = p.relative_to(root).as_posix() + info = sha256_file(p) + items.append({"path": rel, **info}) + manifest = { + "created": datetime.now(timezone.utc).isoformat(), + "root": str(root), + "total": len(items), + "items": items, + "approvals": sorted(set(approvers)), + "approvalStatus": "approved" if len(set(approvers)) >= 2 else "pending", + } + output_dir.mkdir(parents=True, exist_ok=True) + manifest_path = output_dir / "mirror-bundle-manifest.json" + manifest_path.write_text(json.dumps(manifest, ensure_ascii=False, indent=2) + "\n", encoding="utf-8") + write_checksums(items, output_dir) + signing = maybe_sign(manifest_path, cosign_key) + manifest["signing"] = signing + # Persist signing status in manifest for traceability + manifest_path.write_text(json.dumps(manifest, ensure_ascii=False, indent=2) + "\n", encoding="utf-8") + return manifest + + +def parse_args(argv: List[str]) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument("--root", type=Path, help="Root directory containing bundle files") + parser.add_argument("--output", type=Path, help="Output directory for manifest + 
checksums") + parser.add_argument("--approver", action="append", default=[], help="Approver identity (email or handle); provide twice for dual-control") + parser.add_argument("--cosign-key", help="Path or KMS URI for cosign signing key (optional)") + parser.add_argument("--self-test", action="store_true", help="Run internal self-test and exit") + return parser.parse_args(argv) + + +def self_test() -> int: + import tempfile + + with tempfile.TemporaryDirectory() as tmp: + tmpdir = Path(tmp) + root = tmpdir / "bundles" + root.mkdir() + (root / "a.txt").write_text("hello", encoding="utf-8") + (root / "b.bin").write_bytes(b"world") + out = tmpdir / "out" + manifest = build_manifest(root, out, ["alice", "bob"], cosign_key=None) + assert manifest["approvalStatus"] == "approved" + assert (out / "mirror-bundle-manifest.json").exists() + assert (out / "checksums.sha256").exists() + print("self-test passed") + return 0 + + +def main(argv: List[str]) -> int: + args = parse_args(argv) + if args.self_test: + return self_test() + if not (args.root and args.output): + print("--root and --output are required unless --self-test", file=sys.stderr) + return 2 + manifest = build_manifest(args.root.resolve(), args.output.resolve(), args.approver, args.cosign_key) + if manifest["approvalStatus"] != "approved": + print("Manifest generated but approvalStatus=pending (need >=2 distinct approvers).", file=sys.stderr) + return 1 + missing = [i for i in manifest["items"] if not (args.root / i["path"]).exists()] + if missing: + print(f"Missing files in manifest: {missing}", file=sys.stderr) + return 1 + print(f"Mirror bundle manifest written to {args.output}") + return 0 + + +if __name__ == "__main__": # pragma: no cover + sys.exit(main(sys.argv[1:])) diff --git a/deploy/offline/airgap/bundle_stage_import.py b/deploy/offline/airgap/bundle_stage_import.py new file mode 100644 index 000000000..087b4e444 --- /dev/null +++ b/deploy/offline/airgap/bundle_stage_import.py @@ -0,0 +1,169 @@ +#!/usr/bin/env python3 +"""Bundle staging helper for sealed-mode imports. + +Validates bundle files against a manifest and stages them into a target directory +with deterministic names (`-`). Emits a JSON report detailing +success/failure per file for evidence capture. + +Manifest format (JSON): +[ + {"file": "bundle1.tar.gz", "sha256": "..."}, + {"file": "bundle2.ndjson", "sha256": "..."} +] + +Usage: + bundle_stage_import.py --manifest bundles.json --root /path/to/files --out staging + bundle_stage_import.py --manifest bundles.json --root . 
--out staging --prefix mirror/ + bundle_stage_import.py --self-test +""" +from __future__ import annotations + +import argparse +import hashlib +import json +import os +import shutil +import sys +from datetime import datetime, timezone +from pathlib import Path +from typing import Dict, List + + +def sha256_file(path: Path) -> str: + h = hashlib.sha256() + with path.open('rb') as f: + for chunk in iter(lambda: f.read(1024 * 1024), b""): + h.update(chunk) + return h.hexdigest() + + +def load_manifest(path: Path) -> List[Dict[str, str]]: + with path.open('r', encoding='utf-8') as handle: + data = json.load(handle) + if not isinstance(data, list): + raise ValueError("Manifest must be a list of objects") + normalized = [] + for idx, entry in enumerate(data): + if not isinstance(entry, dict): + raise ValueError(f"Manifest entry {idx} is not an object") + file = entry.get("file") + digest = entry.get("sha256") + if not file or not digest: + raise ValueError(f"Manifest entry {idx} missing file or sha256") + normalized.append({"file": str(file), "sha256": str(digest).lower()}) + return normalized + + +def stage_file(src: Path, digest: str, out_dir: Path, prefix: str) -> Path: + dest_name = f"{digest}-{src.name}" + dest_rel = Path(prefix) / dest_name if prefix else Path(dest_name) + dest_path = out_dir / dest_rel + dest_path.parent.mkdir(parents=True, exist_ok=True) + shutil.copy2(src, dest_path) + return dest_rel + + +def process(manifest: Path, root: Path, out_dir: Path, prefix: str) -> Dict: + items = load_manifest(manifest) + results = [] + success = True + for entry in items: + rel = Path(entry["file"]) + src = (root / rel).resolve() + expected = entry["sha256"].lower() + status = "ok" + actual = None + staged = None + message = "" + if not src.exists(): + status = "missing" + message = "file not found" + success = False + else: + actual = sha256_file(src) + if actual != expected: + status = "checksum-mismatch" + message = "sha256 mismatch" + success = False + else: + staged = str(stage_file(src, expected, out_dir, prefix)) + results.append( + { + "file": str(rel), + "expectedSha256": expected, + "actualSha256": actual, + "status": status, + "stagedPath": staged, + "message": message, + } + ) + report = { + "timestamp": datetime.now(timezone.utc).isoformat(), + "root": str(root), + "output": str(out_dir), + "prefix": prefix, + "summary": { + "total": len(results), + "success": success, + "ok": sum(1 for r in results if r["status"] == "ok"), + "missing": sum(1 for r in results if r["status"] == "missing"), + "checksumMismatch": sum(1 for r in results if r["status"] == "checksum-mismatch"), + }, + "items": results, + } + return report + + +def parse_args(argv: List[str]) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument("--manifest", type=Path, help="Path to bundle manifest JSON") + parser.add_argument("--root", type=Path, help="Root directory containing bundle files") + parser.add_argument("--out", type=Path, help="Output directory for staged bundles and report") + parser.add_argument("--prefix", default="", help="Optional prefix within output dir (e.g., mirror/)") + parser.add_argument("--report", type=Path, help="Override report path (defaults to /staging-report.json)") + parser.add_argument("--self-test", action="store_true", help="Run internal self-test and exit") + return parser.parse_args(argv) + + +def write_report(report: Dict, report_path: Path) -> None: + report_path.parent.mkdir(parents=True, exist_ok=True) + with 
report_path.open('w', encoding='utf-8') as handle: + json.dump(report, handle, ensure_ascii=False, indent=2) + handle.write("\n") + + +def self_test() -> int: + import tempfile + + with tempfile.TemporaryDirectory() as tmp: + tmpdir = Path(tmp) + sample = tmpdir / "sample.bin" + sample.write_bytes(b"offline-bundle") + digest = sha256_file(sample) + manifest = tmpdir / "manifest.json" + manifest.write_text(json.dumps([{ "file": "sample.bin", "sha256": digest }]), encoding='utf-8') + out = tmpdir / "out" + report = process(manifest, tmpdir, out, prefix="mirror/") + assert report["summary"]["success"] is True, report + staged = out / report["items"][0]["stagedPath"] + assert staged.exists(), f"staged file missing: {staged}" + print("self-test passed") + return 0 + + +def main(argv: List[str]) -> int: + args = parse_args(argv) + if args.self_test: + return self_test() + if not (args.manifest and args.root and args.out): + print("--manifest, --root, and --out are required unless --self-test", file=sys.stderr) + return 2 + report = process(args.manifest, args.root, args.out, args.prefix) + report_path = args.report or args.out / "staging-report.json" + write_report(report, report_path) + print(f"Staged bundles → {args.out} (report {report_path})") + return 0 if report["summary"]["success"] else 1 + + +if __name__ == "__main__": # pragma: no cover + sys.exit(main(sys.argv[1:])) diff --git a/deploy/offline/airgap/compose-egress-guard.sh b/deploy/offline/airgap/compose-egress-guard.sh new file mode 100644 index 000000000..28266c160 --- /dev/null +++ b/deploy/offline/airgap/compose-egress-guard.sh @@ -0,0 +1,54 @@ +#!/usr/bin/env bash +# Enforce deny-all egress for a Docker/Compose project using DOCKER-USER chain. +# Usage: COMPOSE_PROJECT=stella ./compose-egress-guard.sh +# Optional env: ALLOW_RFC1918=true to allow east-west traffic inside 10/172/192 ranges. 
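+# A typical sealed-mode check (illustrative pairing; both commands appear elsewhere in
+# this directory) is to apply the guard and then confirm egress is actually blocked with
+# the companion verification harness:
+#   COMPOSE_PROJECT=stella ./compose-egress-guard.sh
+#   ./verify-egress-block.sh docker stella_default out/airgap-probe.json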
+set -euo pipefail + +PROJECT=${COMPOSE_PROJECT:-stella} +ALLOW_RFC1918=${ALLOW_RFC1918:-true} +NETWORK=${COMPOSE_NETWORK:-${PROJECT}_default} + +chain=STELLAOPS_SEALED_${PROJECT^^} +ipset_name=${PROJECT}_cidrs + +insert_accept() { + local dest=$1 + iptables -C DOCKER-USER -d "$dest" -j ACCEPT 2>/dev/null || iptables -I DOCKER-USER -d "$dest" -j ACCEPT +} + +# 1) Ensure DOCKER-USER exists +iptables -nL DOCKER-USER >/dev/null 2>&1 || iptables -N DOCKER-USER + +# 2) Create dedicated chain per project for clarity +iptables -nL "$chain" >/dev/null 2>&1 || iptables -N "$chain" + +# 2b) Populate ipset with compose network CIDRs (if available) +if command -v ipset >/dev/null; then + ipset list "$ipset_name" >/dev/null 2>&1 || ipset create "$ipset_name" hash:net -exist + cidrs=$(docker network inspect "$NETWORK" -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}') + for cidr in $cidrs; do + ipset add "$ipset_name" "$cidr" 2>/dev/null || true + done +fi + +# 3) Allow loopback and optional RFC1918 intra-cluster ranges, then drop everything else +insert_accept 127.0.0.0/8 +if [[ "$ALLOW_RFC1918" == "true" ]]; then + insert_accept 10.0.0.0/8 + insert_accept 172.16.0.0/12 + insert_accept 192.168.0.0/16 +fi +iptables -C "$chain" -j DROP 2>/dev/null || iptables -A "$chain" -j DROP + +# 4) Hook chain into DOCKER-USER for containers in this project network +iptables -C DOCKER-USER -m addrtype --src-type LOCAL -j RETURN 2>/dev/null || true +if command -v ipset >/dev/null && ipset list "$ipset_name" >/dev/null 2>&1; then + iptables -C DOCKER-USER -m set --match-set "$ipset_name" dst -j "$chain" 2>/dev/null || iptables -I DOCKER-USER -m set --match-set "$ipset_name" dst -j "$chain" +else + # Fallback: match by destination subnet from docker inspect (first subnet only) + first_cidr=$(docker network inspect "$NETWORK" -f '{{(index .IPAM.Config 0).Subnet}}') + iptables -C DOCKER-USER -d "$first_cidr" -j "$chain" 2>/dev/null || iptables -I DOCKER-USER -d "$first_cidr" -j "$chain" +fi + +echo "Applied compose egress guard via DOCKER-USER -> $chain" >&2 +iptables -vnL "$chain" diff --git a/deploy/offline/airgap/compose-observability.yaml b/deploy/offline/airgap/compose-observability.yaml new file mode 100644 index 000000000..8b1a6865f --- /dev/null +++ b/deploy/offline/airgap/compose-observability.yaml @@ -0,0 +1,77 @@ +version: "3.9" + +services: + prometheus: + image: prom/prometheus:v2.53.0 + container_name: prometheus + command: + - --config.file=/etc/prometheus/prometheus.yml + volumes: + - ./observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro + ports: + - "9090:9090" + healthcheck: + test: ["CMD", "wget", "-qO-", "http://localhost:9090/-/ready"] + interval: 15s + timeout: 5s + retries: 5 + start_period: 10s + restart: unless-stopped + + loki: + image: grafana/loki:3.0.0 + container_name: loki + command: ["-config.file=/etc/loki/config.yaml"] + volumes: + - ./observability/loki-config.yaml:/etc/loki/config.yaml:ro + - ./observability/data/loki:/loki + ports: + - "3100:3100" + healthcheck: + test: ["CMD", "wget", "-qO-", "http://localhost:3100/ready"] + interval: 15s + timeout: 5s + retries: 5 + start_period: 15s + restart: unless-stopped + + tempo: + image: grafana/tempo:2.4.1 + container_name: tempo + command: ["-config.file=/etc/tempo/tempo.yaml"] + volumes: + - ./observability/tempo-config.yaml:/etc/tempo/tempo.yaml:ro + - ./observability/data/tempo:/var/tempo + ports: + - "3200:3200" + healthcheck: + test: ["CMD", "wget", "-qO-", "http://localhost:3200/ready"] + interval: 15s + timeout: 5s + retries: 
5 + start_period: 15s + restart: unless-stopped + + grafana: + image: grafana/grafana:10.4.2 + container_name: grafana + environment: + - GF_AUTH_ANONYMOUS_ENABLED=true + - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin + - GF_SECURITY_ADMIN_PASSWORD=admin + - GF_SECURITY_ADMIN_USER=admin + volumes: + - ./observability/grafana/provisioning/datasources:/etc/grafana/provisioning/datasources:ro + ports: + - "3000:3000" + depends_on: + - prometheus + - loki + - tempo + healthcheck: + test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/health"] + interval: 15s + timeout: 5s + retries: 5 + start_period: 20s + restart: unless-stopped diff --git a/deploy/offline/airgap/compose-syslog-smtp.yaml b/deploy/offline/airgap/compose-syslog-smtp.yaml new file mode 100644 index 000000000..5cb9e0359 --- /dev/null +++ b/deploy/offline/airgap/compose-syslog-smtp.yaml @@ -0,0 +1,23 @@ +version: '3.8' +services: + smtp: + image: bytemark/smtp + restart: unless-stopped + environment: + - MAILNAME=sealed.local + networks: [sealed] + ports: + - "2525:25" + syslog: + image: balabit/syslog-ng:4.7.1 + restart: unless-stopped + command: ["syslog-ng", "-F", "--no-caps"] + networks: [sealed] + ports: + - "5514:514/udp" + - "5515:601/tcp" + volumes: + - ./syslog-ng.conf:/etc/syslog-ng/syslog-ng.conf:ro +networks: + sealed: + driver: bridge diff --git a/deploy/offline/airgap/health_observability.sh b/deploy/offline/airgap/health_observability.sh new file mode 100644 index 000000000..2a3f89ed2 --- /dev/null +++ b/deploy/offline/airgap/health_observability.sh @@ -0,0 +1,28 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Health check for compose-observability.yaml (DEVOPS-AIRGAP-58-002) + +COMPOSE_FILE="$(cd "$(dirname "$0")" && pwd)/compose-observability.yaml" + +echo "Starting observability stack (Prometheus/Grafana/Tempo/Loki)..." +docker compose -f "$COMPOSE_FILE" up -d + +echo "Waiting for containers to report healthy..." +docker compose -f "$COMPOSE_FILE" wait >/dev/null 2>&1 || true + +docker compose -f "$COMPOSE_FILE" ps + +echo "Probing Prometheus /-/ready" +curl -sf http://127.0.0.1:9090/-/ready + +echo "Probing Grafana /api/health" +curl -sf http://127.0.0.1:3000/api/health + +echo "Probing Loki /ready" +curl -sf http://127.0.0.1:3100/ready + +echo "Probing Tempo /ready" +curl -sf http://127.0.0.1:3200/ready + +echo "All probes succeeded." diff --git a/deploy/offline/airgap/health_syslog_smtp.sh b/deploy/offline/airgap/health_syslog_smtp.sh new file mode 100644 index 000000000..29b4f6ccf --- /dev/null +++ b/deploy/offline/airgap/health_syslog_smtp.sh @@ -0,0 +1,33 @@ +#!/usr/bin/env bash +set -euo pipefail +# Health check for compose-syslog-smtp.yaml (DEVOPS-AIRGAP-58-001) +ROOT=${ROOT:-$(git rev-parse --show-toplevel)} +COMPOSE_FILE="${COMPOSE_FILE:-$ROOT/ops/devops/airgap/compose-syslog-smtp.yaml}" +SMTP_PORT=${SMTP_PORT:-2525} +SYSLOG_TCP=${SYSLOG_TCP:-5515} +SYSLOG_UDP=${SYSLOG_UDP:-5514} + +export COMPOSE_FILE +# ensure stack up +if ! docker compose ps >/dev/null 2>&1; then + docker compose up -d +fi +sleep 2 + +# probe smtp banner +if ! timeout 5 bash -lc "echo QUIT | nc -w2 127.0.0.1 ${SMTP_PORT}" >/dev/null 2>&1; then + echo "smtp service not responding on ${SMTP_PORT}" >&2 + exit 1 +fi +# probe syslog tcp +if ! echo "test" | nc -w2 127.0.0.1 ${SYSLOG_TCP} >/dev/null 2>&1; then + echo "syslog tcp not responding on ${SYSLOG_TCP}" >&2 + exit 1 +fi +# probe syslog udp +if ! 
echo "test" | nc -w2 -u 127.0.0.1 ${SYSLOG_UDP} >/dev/null 2>&1; then + echo "syslog udp not responding on ${SYSLOG_UDP}" >&2 + exit 1 +fi + +echo "smtp/syslog stack healthy" diff --git a/deploy/offline/airgap/import-bundle.sh b/deploy/offline/airgap/import-bundle.sh new file mode 100644 index 000000000..a088b1a05 --- /dev/null +++ b/deploy/offline/airgap/import-bundle.sh @@ -0,0 +1,130 @@ +#!/usr/bin/env bash +# Import air-gap bundle into isolated environment +# Usage: ./import-bundle.sh [registry] +# Example: ./import-bundle.sh /media/usb/stellaops-bundle localhost:5000 + +set -euo pipefail + +BUNDLE_DIR="${1:?Bundle directory required}" +REGISTRY="${2:-localhost:5000}" + +echo "==> Importing air-gap bundle from ${BUNDLE_DIR}" + +# Verify bundle structure +if [[ ! -f "${BUNDLE_DIR}/manifest.json" ]]; then + echo "ERROR: manifest.json not found in bundle" >&2 + exit 1 +fi + +# Verify checksums first +echo "==> Verifying checksums..." +cd "${BUNDLE_DIR}" +for sha_file in *.sha256; do + if [[ -f "${sha_file}" ]]; then + echo " Checking ${sha_file}..." + sha256sum -c "${sha_file}" || { echo "CHECKSUM FAILED: ${sha_file}" >&2; exit 1; } + fi +done + +# Load container images +echo "==> Loading container images..." +for tarball in images/*.tar images/*.tar.gz 2>/dev/null; do + if [[ -f "${tarball}" ]]; then + echo " Loading ${tarball}..." + docker load -i "${tarball}" + fi +done + +# Re-tag and push to local registry +echo "==> Pushing images to ${REGISTRY}..." +IMAGES=$(jq -r '.images[]?.name // empty' manifest.json 2>/dev/null || true) +for IMAGE in ${IMAGES}; do + LOCAL_TAG="${REGISTRY}/${IMAGE##*/}" + echo " ${IMAGE} -> ${LOCAL_TAG}" + docker tag "${IMAGE}" "${LOCAL_TAG}" 2>/dev/null || true + docker push "${LOCAL_TAG}" 2>/dev/null || echo " (push skipped - registry may be unavailable)" +done + +# Import Helm charts +echo "==> Importing Helm charts..." +if [[ -d "${BUNDLE_DIR}/charts" ]]; then + for chart in "${BUNDLE_DIR}"/charts/*.tgz; do + if [[ -f "${chart}" ]]; then + echo " Installing ${chart}..." + helm push "${chart}" "oci://${REGISTRY}/charts" 2>/dev/null || \ + echo " (OCI push skipped - copying to local)" + fi + done +fi + +# Import NuGet packages +echo "==> Importing NuGet packages..." +if [[ -d "${BUNDLE_DIR}/nugets" ]]; then + NUGET_CACHE="${HOME}/.nuget/packages" + mkdir -p "${NUGET_CACHE}" + for nupkg in "${BUNDLE_DIR}"/nugets/*.nupkg; do + if [[ -f "${nupkg}" ]]; then + PKG_NAME=$(basename "${nupkg}" .nupkg) + echo " Caching ${PKG_NAME}..." + # Extract to NuGet cache structure + unzip -q -o "${nupkg}" -d "${NUGET_CACHE}/${PKG_NAME,,}" 2>/dev/null || true + fi + done +fi + +# Import npm packages +echo "==> Importing npm packages..." +if [[ -d "${BUNDLE_DIR}/npm" ]]; then + NPM_CACHE="${HOME}/.npm/_cacache" + mkdir -p "${NPM_CACHE}" + if [[ -f "${BUNDLE_DIR}/npm/cache.tar.gz" ]]; then + tar -xzf "${BUNDLE_DIR}/npm/cache.tar.gz" -C "${HOME}/.npm" 2>/dev/null || true + fi +fi + +# Import advisory feeds +echo "==> Importing advisory feeds..." +if [[ -d "${BUNDLE_DIR}/feeds" ]]; then + FEEDS_DIR="/var/lib/stellaops/feeds" + sudo mkdir -p "${FEEDS_DIR}" 2>/dev/null || mkdir -p "${FEEDS_DIR}" + for feed in "${BUNDLE_DIR}"/feeds/*.ndjson.gz; do + if [[ -f "${feed}" ]]; then + FEED_NAME=$(basename "${feed}") + echo " Installing ${FEED_NAME}..." + cp "${feed}" "${FEEDS_DIR}/" 2>/dev/null || sudo cp "${feed}" "${FEEDS_DIR}/" + fi + done +fi + +# Import symbol bundles +echo "==> Importing symbol bundles..." 
+if [[ -d "${BUNDLE_DIR}/symbols" ]]; then + SYMBOLS_DIR="/var/lib/stellaops/symbols" + sudo mkdir -p "${SYMBOLS_DIR}" 2>/dev/null || mkdir -p "${SYMBOLS_DIR}" + for bundle in "${BUNDLE_DIR}"/symbols/*.zip; do + if [[ -f "${bundle}" ]]; then + echo " Extracting ${bundle}..." + unzip -q -o "${bundle}" -d "${SYMBOLS_DIR}" 2>/dev/null || true + fi + done +fi + +# Generate import report +echo "==> Generating import report..." +cat > "${BUNDLE_DIR}/import-report.json" < Import complete" +echo " Registry: ${REGISTRY}" +echo " Report: ${BUNDLE_DIR}/import-report.json" +echo "" +echo "Next steps:" +echo " 1. Update Helm values with registry: ${REGISTRY}" +echo " 2. Deploy: helm install stellaops deploy/helm/stellaops -f values-airgap.yaml" +echo " 3. Verify: kubectl get pods -n stellaops" diff --git a/deploy/offline/airgap/k8s-deny-egress.yaml b/deploy/offline/airgap/k8s-deny-egress.yaml new file mode 100644 index 000000000..44f55cc83 --- /dev/null +++ b/deploy/offline/airgap/k8s-deny-egress.yaml @@ -0,0 +1,42 @@ +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: sealed-deny-all-egress + namespace: default + labels: + stellaops.dev/owner: devops + stellaops.dev/purpose: sealed-mode +spec: + podSelector: + matchLabels: + sealed: "true" + policyTypes: + - Egress + egress: [] +--- +# Optional patch to allow in-cluster DNS while still blocking external egress. +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: sealed-allow-dns + namespace: default + labels: + stellaops.dev/owner: devops + stellaops.dev/purpose: sealed-mode +spec: + podSelector: + matchLabels: + sealed: "true" + policyTypes: + - Egress + egress: + - to: + - namespaceSelector: + matchLabels: + kubernetes.io/metadata.name: kube-system + podSelector: + matchLabels: + k8s-app: kube-dns + ports: + - protocol: UDP + port: 53 diff --git a/deploy/offline/airgap/observability-offline-compose.yml b/deploy/offline/airgap/observability-offline-compose.yml new file mode 100644 index 000000000..1d7662c25 --- /dev/null +++ b/deploy/offline/airgap/observability-offline-compose.yml @@ -0,0 +1,32 @@ +version: '3.8' +services: + loki: + image: grafana/loki:3.0.1 + command: ["-config.file=/etc/loki/local-config.yaml"] + volumes: + - loki-data:/loki + networks: [sealed] + promtail: + image: grafana/promtail:3.0.1 + command: ["-config.file=/etc/promtail/config.yml"] + volumes: + - promtail-data:/var/log + - ./promtail-config.yaml:/etc/promtail/config.yml:ro + networks: [sealed] + otel: + image: otel/opentelemetry-collector-contrib:0.97.0 + command: ["--config=/etc/otel/otel-offline.yaml"] + volumes: + - ./otel-offline.yaml:/etc/otel/otel-offline.yaml:ro + - otel-data:/var/otel + ports: + - "4317:4317" + - "4318:4318" + networks: [sealed] +networks: + sealed: + driver: bridge +volumes: + loki-data: + promtail-data: + otel-data: diff --git a/deploy/offline/airgap/observability/grafana/provisioning/datasources/datasources.yaml b/deploy/offline/airgap/observability/grafana/provisioning/datasources/datasources.yaml new file mode 100644 index 000000000..5d0e6fc33 --- /dev/null +++ b/deploy/offline/airgap/observability/grafana/provisioning/datasources/datasources.yaml @@ -0,0 +1,16 @@ +apiVersion: 1 + +datasources: + - name: Prometheus + type: prometheus + access: proxy + url: http://prometheus:9090 + isDefault: true + - name: Loki + type: loki + access: proxy + url: http://loki:3100 + - name: Tempo + type: tempo + access: proxy + url: http://tempo:3200 diff --git a/deploy/offline/airgap/observability/loki-config.yaml 
b/deploy/offline/airgap/observability/loki-config.yaml new file mode 100644 index 000000000..1342fda3b --- /dev/null +++ b/deploy/offline/airgap/observability/loki-config.yaml @@ -0,0 +1,35 @@ +server: + http_listen_port: 3100 + log_level: warn + +common: + ring: + instance_addr: loki + kvstore: + store: inmemory + replication_factor: 1 + +table_manager: + retention_deletes_enabled: true + retention_period: 168h + +schema_config: + configs: + - from: 2024-01-01 + store: boltdb-shipper + object_store: filesystem + schema: v13 + index: + prefix: index_ + period: 24h + +storage_config: + filesystem: + directory: /loki/chunks + boltdb_shipper: + active_index_directory: /loki/index + cache_location: /loki/cache + shared_store: filesystem + +limits_config: + retention_period: 168h diff --git a/deploy/offline/airgap/observability/prometheus.yml b/deploy/offline/airgap/observability/prometheus.yml new file mode 100644 index 000000000..1b49895e8 --- /dev/null +++ b/deploy/offline/airgap/observability/prometheus.yml @@ -0,0 +1,14 @@ +global: + scrape_interval: 15s + evaluation_interval: 15s + +scrape_configs: + - job_name: prometheus + static_configs: + - targets: ['prometheus:9090'] + - job_name: loki + static_configs: + - targets: ['loki:3100'] + - job_name: tempo + static_configs: + - targets: ['tempo:3200'] diff --git a/deploy/offline/airgap/observability/tempo-config.yaml b/deploy/offline/airgap/observability/tempo-config.yaml new file mode 100644 index 000000000..4b43e2195 --- /dev/null +++ b/deploy/offline/airgap/observability/tempo-config.yaml @@ -0,0 +1,26 @@ +server: + http_listen_port: 3200 + log_level: warn + +distributor: + receivers: + jaeger: + protocols: + thrift_http: + otlp: + protocols: + http: + grpc: + zipkin: + +storage: + trace: + backend: local + wal: + path: /var/tempo/wal + local: + path: /var/tempo/traces + +compactor: + compaction: + block_retention: 168h diff --git a/deploy/offline/airgap/otel-offline.yaml b/deploy/offline/airgap/otel-offline.yaml new file mode 100644 index 000000000..7879cb072 --- /dev/null +++ b/deploy/offline/airgap/otel-offline.yaml @@ -0,0 +1,40 @@ +receivers: + prometheus: + config: + scrape_configs: + - job_name: 'self' + static_configs: + - targets: ['localhost:8888'] + otlp: + protocols: + grpc: + endpoint: 0.0.0.0:4317 + http: + endpoint: 0.0.0.0:4318 +processors: + batch: + timeout: 1s + send_batch_size: 512 +exporters: + file/metrics: + path: /var/otel/metrics.prom + file/traces: + path: /var/otel/traces.ndjson + loki/offline: + endpoint: http://loki:3100/loki/api/v1/push + labels: + job: sealed-observability + tenant_id: "sealed" +service: + telemetry: + logs: + level: info + pipelines: + metrics: + receivers: [prometheus] + processors: [batch] + exporters: [file/metrics] + traces: + receivers: [otlp] + processors: [batch] + exporters: [file/traces] diff --git a/deploy/offline/airgap/promtail-config.yaml b/deploy/offline/airgap/promtail-config.yaml new file mode 100644 index 000000000..8cf66b98f --- /dev/null +++ b/deploy/offline/airgap/promtail-config.yaml @@ -0,0 +1,14 @@ +server: + http_listen_port: 9080 + grpc_listen_port: 0 +positions: + filename: /tmp/positions.yaml +clients: + - url: http://loki:3100/loki/api/v1/push +scrape_configs: + - job_name: promtail + static_configs: + - targets: [localhost] + labels: + job: promtail + __path__: /var/log/*.log diff --git a/deploy/offline/airgap/sealed-ci-smoke.sh b/deploy/offline/airgap/sealed-ci-smoke.sh new file mode 100644 index 000000000..0326667c6 --- /dev/null +++ 
b/deploy/offline/airgap/sealed-ci-smoke.sh @@ -0,0 +1,42 @@ +#!/usr/bin/env bash +set -euo pipefail +# Simple sealed-mode CI smoke: block egress, resolve mock DNS, assert services start. +ROOT=${ROOT:-$(cd "$(dirname "$0")/../.." && pwd)} +LOGDIR=${LOGDIR:-$ROOT/out/airgap-smoke} +mkdir -p "$LOGDIR" + +# 1) Start mock DNS (returns 0.0.0.0 for everything) +DNS_PORT=${DNS_PORT:-53535} +python - </dev/null +DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_HTTP2SUPPORT=false \ +DOTNET_CLI_TELEMETRY_OPTOUT=1 \ +DNS_SERVER=127.0.0.1:${DNS_PORT} \ +dotnet --info > "$LOGDIR/dotnet-info.txt" +popd >/dev/null + +echo "sealed CI smoke complete; logs at $LOGDIR" diff --git a/deploy/offline/airgap/stage-bundle.sh b/deploy/offline/airgap/stage-bundle.sh new file mode 100644 index 000000000..a1299aa03 --- /dev/null +++ b/deploy/offline/airgap/stage-bundle.sh @@ -0,0 +1,14 @@ +#!/usr/bin/env bash +# Wrapper for bundle_stage_import.py with sane defaults. +# Usage: ./stage-bundle.sh manifest.json /path/to/files out/staging [prefix] +set -euo pipefail +if [[ $# -lt 3 ]]; then + echo "Usage: $0 [prefix]" >&2 + exit 2 +fi +manifest=$1 +root=$2 +out=$3 +prefix=${4:-} +SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd) +python3 "$SCRIPT_DIR/bundle_stage_import.py" --manifest "$manifest" --root "$root" --out "$out" --prefix "$prefix" diff --git a/deploy/offline/airgap/syslog-ng.conf b/deploy/offline/airgap/syslog-ng.conf new file mode 100644 index 000000000..89292a704 --- /dev/null +++ b/deploy/offline/airgap/syslog-ng.conf @@ -0,0 +1,19 @@ +@version: 4.7 +@include "scl.conf" + +options { + time-reopen(10); + log-msg-size(8192); + ts-format(iso); +}; + +source s_net { + tcp(port(601)); + udp(port(514)); +}; + +destination d_file { + file("/var/log/syslog-ng/sealed.log" create-dirs(yes) perm(0644)); +}; + +log { source(s_net); destination(d_file); }; diff --git a/deploy/offline/airgap/verify-egress-block.sh b/deploy/offline/airgap/verify-egress-block.sh new file mode 100644 index 000000000..6732c4ecc --- /dev/null +++ b/deploy/offline/airgap/verify-egress-block.sh @@ -0,0 +1,88 @@ +#!/usr/bin/env bash +# Verification harness for sealed-mode egress: Docker/Compose or Kubernetes. +# Examples: +# ./verify-egress-block.sh docker stella_default out/airgap-probe.json +# ./verify-egress-block.sh k8s default out/k8s-probe.json +set -euo pipefail + +mode=${1:-} +context=${2:-} +out=${3:-} + +if [[ -z "$mode" || -z "$context" || -z "$out" ]]; then + echo "Usage: $0 [target ...]" >&2 + exit 2 +fi +shift 3 +TARGETS=($@) + +ROOT=$(cd "$(dirname "$0")/../.." 
&& pwd) +PROBE_PY="$ROOT/ops/devops/sealed-mode-ci/egress_probe.py" + +case "$mode" in + docker) + network="$context" + python3 "$PROBE_PY" --network "$network" --output "$out" "${TARGETS[@]}" + ;; + k8s|kubernetes) + ns="$context" + targets=("${TARGETS[@]}") + if [[ ${#targets[@]} -eq 0 ]]; then + targets=("https://example.com" "https://www.cloudflare.com" "https://releases.stella-ops.org/healthz") + fi + image="curlimages/curl:8.6.0" + tmpfile=$(mktemp) + cat > "$tmpfile" < + set -euo pipefail; + rc=0; + for url in ${targets[@]}; do + echo "PROBE $url"; + if curl -fsS --max-time 8 "$url"; then + echo "UNEXPECTED_SUCCESS $url"; + rc=1; + else + echo "BLOCKED $url"; + fi; + done; + exit $rc; + securityContext: + runAsNonRoot: true + readOnlyRootFilesystem: true +MANIFEST + kubectl apply -f "$tmpfile" >/dev/null + kubectl wait --for=condition=Ready pod/sealed-egress-probe -n "$ns" --timeout=30s >/dev/null 2>&1 || true + set +e + kubectl logs -n "$ns" sealed-egress-probe > "$out.log" 2>&1 + kubectl wait --for=condition=Succeeded pod/sealed-egress-probe -n "$ns" --timeout=60s + pod_rc=$? + kubectl get pod/sealed-egress-probe -n "$ns" -o json > "$out" + kubectl delete pod/sealed-egress-probe -n "$ns" >/dev/null 2>&1 || true + set -e + if [[ $pod_rc -ne 0 ]]; then + echo "Egress check failed; see $out and $out.log" >&2 + exit 1 + fi + ;; + *) + echo "Unknown mode: $mode" >&2 + exit 2 + ;; +esac + +echo "Egress verification complete → $out" diff --git a/deploy/offline/kit/AGENTS.md b/deploy/offline/kit/AGENTS.md new file mode 100644 index 000000000..c5578c06f --- /dev/null +++ b/deploy/offline/kit/AGENTS.md @@ -0,0 +1,15 @@ +# Offline Kit — Agent Charter + +## Mission +Package Offline Update Kit per `docs/modules/devops/ARCHITECTURE.md` and `docs/24_OFFLINE_KIT.md` with deterministic digests and import tooling. + +## Required Reading +- `docs/modules/platform/architecture-overview.md` +- `docs/modules/airgap/airgap-mode.md` + +## Working Agreement +- 1. Update task status to `DOING`/`DONE` inside the corresponding `docs/implplan/SPRINT_*.md` entry when you start or finish work. +- 2. Review this charter and the Required Reading documents before coding; confirm prerequisites are met. +- 3. Keep changes deterministic (stable ordering, timestamps, hashes) and align with offline/air-gap expectations. +- 4. Coordinate doc updates, tests, and cross-guild communication whenever contracts or workflows change. +- 5. Revert to `TODO` if you pause the task without shipping changes; leave notes in commit/PR descriptions for context. diff --git a/deploy/offline/kit/TASKS.completed.md b/deploy/offline/kit/TASKS.completed.md new file mode 100644 index 000000000..9f8ee8087 --- /dev/null +++ b/deploy/offline/kit/TASKS.completed.md @@ -0,0 +1,8 @@ +# Completed Tasks + +| ID | Status | Owner(s) | Depends on | Description | Exit Criteria | +|----|--------|----------|------------|-------------|---------------| +| DEVOPS-OFFLINE-14-002 | DONE (2025-10-26) | Offline Kit Guild | DEVOPS-REL-14-001 | Build offline kit packaging workflow (artifact bundling, manifest generation, signature verification). | Offline tarball generated with manifest + checksums + signatures; `ops/offline-kit/run-python-analyzer-smoke.sh` invoked as part of packaging; `debug/.build-id` tree mirrored from release output; import script verifies integrity; docs updated. 
| +| DEVOPS-OFFLINE-18-004 | DONE (2025-10-22) | Offline Kit Guild, Scanner Guild | DEVOPS-OFFLINE-18-003, SCANNER-ANALYZERS-LANG-10-309G | Rebuild Offline Kit bundle with Go analyzer plug-in and updated manifest/signature set. | Kit tarball includes Go analyzer artifacts; manifest/signature refreshed; verification steps executed and logged; docs updated with new bundle version. | +| DEVOPS-OFFLINE-18-005 | DONE (2025-10-26) | Offline Kit Guild, Scanner Guild | DEVOPS-REL-14-004, SCANNER-ANALYZERS-LANG-10-309P | Repackage Offline Kit with Python analyzer plug-in artefacts and refreshed manifest/signature set. | Kit tarball includes Python analyzer DLL/PDB/manifest; signature + manifest updated; Offline Kit guide references Python coverage; smoke import validated. | +| DEVOPS-OFFLINE-17-003 | DONE (2025-10-26) | Offline Kit Guild, DevOps Guild | DEVOPS-REL-17-002 | Mirror release debug-store artefacts ( `.build-id/` tree and `debug-manifest.json`) into Offline Kit packaging and document import validation. | Offline kit archives `debug/.build-id/` with manifest/sha256, docs cover symbol lookup workflow, smoke job confirms build-id lookup succeeds on air-gapped install. | diff --git a/deploy/offline/kit/build_offline_kit.py b/deploy/offline/kit/build_offline_kit.py new file mode 100644 index 000000000..b73876d66 --- /dev/null +++ b/deploy/offline/kit/build_offline_kit.py @@ -0,0 +1,580 @@ +#!/usr/bin/env python3 +"""Package the StellaOps Offline Kit with deterministic artefacts and manifest.""" + +from __future__ import annotations + +import argparse +import datetime as dt +import hashlib +import json +import os +import re +import shutil +import subprocess +import sys +import tarfile +from collections import OrderedDict +from pathlib import Path +from typing import Any, Iterable, Mapping, MutableMapping, Optional + +REPO_ROOT = Path(__file__).resolve().parents[2] +RELEASE_TOOLS_DIR = REPO_ROOT / "ops" / "devops" / "release" +TELEMETRY_TOOLS_DIR = REPO_ROOT / "ops" / "devops" / "telemetry" +TELEMETRY_BUNDLE_PATH = REPO_ROOT / "out" / "telemetry" / "telemetry-offline-bundle.tar.gz" + +if str(RELEASE_TOOLS_DIR) not in sys.path: + sys.path.insert(0, str(RELEASE_TOOLS_DIR)) + +from verify_release import ( # type: ignore import-not-found + load_manifest, + resolve_path, + verify_release, +) + +import mirror_debug_store # type: ignore import-not-found + +DEFAULT_RELEASE_DIR = REPO_ROOT / "out" / "release" +DEFAULT_STAGING_DIR = REPO_ROOT / "out" / "offline-kit" / "staging" +DEFAULT_OUTPUT_DIR = REPO_ROOT / "out" / "offline-kit" / "dist" + +ARTIFACT_TARGETS = { + "sbom": Path("sboms"), + "provenance": Path("attest"), + "signature": Path("signatures"), + "metadata": Path("metadata/docker"), +} + + +class CommandError(RuntimeError): + """Raised when an external command fails.""" + + +def run(cmd: Iterable[str], *, cwd: Optional[Path] = None, env: Optional[Mapping[str, str]] = None) -> str: + process_env = dict(os.environ) + if env: + process_env.update(env) + result = subprocess.run( + list(cmd), + cwd=str(cwd) if cwd else None, + env=process_env, + check=False, + capture_output=True, + text=True, + ) + if result.returncode != 0: + raise CommandError( + f"Command failed ({result.returncode}): {' '.join(cmd)}\nSTDOUT:\n{result.stdout}\nSTDERR:\n{result.stderr}" + ) + return result.stdout + + +def compute_sha256(path: Path) -> str: + sha = hashlib.sha256() + with path.open("rb") as handle: + for chunk in iter(lambda: handle.read(1024 * 1024), b""): + sha.update(chunk) + return sha.hexdigest() + + +def 
utc_now_iso() -> str: + return dt.datetime.now(tz=dt.timezone.utc).replace(microsecond=0).isoformat().replace("+00:00", "Z") + + +def safe_component_name(name: str) -> str: + return re.sub(r"[^A-Za-z0-9_.-]", "-", name.strip().lower()) + + +def clean_directory(path: Path) -> None: + if path.exists(): + shutil.rmtree(path) + path.mkdir(parents=True, exist_ok=True) + + +def run_python_analyzer_smoke() -> None: + script = REPO_ROOT / "ops" / "offline-kit" / "run-python-analyzer-smoke.sh" + run(["bash", str(script)], cwd=REPO_ROOT) + + +def run_rust_analyzer_smoke() -> None: + script = REPO_ROOT / "ops" / "offline-kit" / "run-rust-analyzer-smoke.sh" + run(["bash", str(script)], cwd=REPO_ROOT) + + +def copy_if_exists(source: Path, target: Path) -> None: + if source.is_dir(): + shutil.copytree(source, target, dirs_exist_ok=True) + elif source.is_file(): + target.parent.mkdir(parents=True, exist_ok=True) + shutil.copy2(source, target) + + +def copy_release_manifests(release_dir: Path, staging_dir: Path) -> None: + manifest_dir = staging_dir / "manifest" + manifest_dir.mkdir(parents=True, exist_ok=True) + for name in ("release.yaml", "release.yaml.sha256", "release.json", "release.json.sha256"): + source = release_dir / name + if source.exists(): + shutil.copy2(source, manifest_dir / source.name) + + +def copy_component_artifacts( + manifest: Mapping[str, Any], + release_dir: Path, + staging_dir: Path, +) -> None: + components = manifest.get("components") or [] + for component in sorted(components, key=lambda entry: str(entry.get("name", ""))): + if not isinstance(component, Mapping): + continue + component_name = safe_component_name(str(component.get("name", "component"))) + for key, target_root in ARTIFACT_TARGETS.items(): + entry = component.get(key) + if not entry or not isinstance(entry, Mapping): + continue + path_str = entry.get("path") + if not path_str: + continue + resolved = resolve_path(str(path_str), release_dir) + if not resolved.exists(): + raise FileNotFoundError(f"Component '{component_name}' {key} artefact not found: {resolved}") + target_dir = staging_dir / target_root + target_dir.mkdir(parents=True, exist_ok=True) + target_name = f"{component_name}-{resolved.name}" if resolved.name else component_name + shutil.copy2(resolved, target_dir / target_name) + + +def copy_collections( + manifest: Mapping[str, Any], + release_dir: Path, + staging_dir: Path, +) -> None: + for collection, subdir in (("charts", Path("charts")), ("compose", Path("compose"))): + entries = manifest.get(collection) or [] + for entry in entries: + if not isinstance(entry, Mapping): + continue + path_str = entry.get("path") + if not path_str: + continue + resolved = resolve_path(str(path_str), release_dir) + if not resolved.exists(): + raise FileNotFoundError(f"{collection} artefact not found: {resolved}") + target_dir = staging_dir / subdir + target_dir.mkdir(parents=True, exist_ok=True) + shutil.copy2(resolved, target_dir / resolved.name) + + +def copy_debug_store(release_dir: Path, staging_dir: Path) -> None: + mirror_debug_store.main( + [ + "--release-dir", + str(release_dir), + "--offline-kit-dir", + str(staging_dir), + ] + ) + + +def copy_plugins_and_assets(staging_dir: Path) -> None: + copy_if_exists(REPO_ROOT / "plugins" / "scanner", staging_dir / "plugins" / "scanner") + copy_if_exists(REPO_ROOT / "certificates", staging_dir / "certificates") + copy_if_exists(REPO_ROOT / "src" / "__Tests" / "__Datasets" / "seed-data", staging_dir / "seed-data") + docs_dir = staging_dir / "docs" + 
docs_dir.mkdir(parents=True, exist_ok=True) + copy_if_exists(REPO_ROOT / "docs" / "24_OFFLINE_KIT.md", docs_dir / "24_OFFLINE_KIT.md") + copy_if_exists(REPO_ROOT / "docs" / "ops" / "telemetry-collector.md", docs_dir / "telemetry-collector.md") + copy_if_exists(REPO_ROOT / "docs" / "ops" / "telemetry-storage.md", docs_dir / "telemetry-storage.md") + copy_if_exists(REPO_ROOT / "docs" / "airgap" / "mirror-bundles.md", docs_dir / "mirror-bundles.md") + + +def copy_cli_and_taskrunner_assets(release_dir: Path, staging_dir: Path) -> None: + """Bundle CLI binaries, task pack docs, and Task Runner samples when available.""" + cli_src = release_dir / "cli" + if cli_src.exists(): + copy_if_exists(cli_src, staging_dir / "cli") + + taskrunner_bootstrap = staging_dir / "bootstrap" / "task-runner" + taskrunner_bootstrap.mkdir(parents=True, exist_ok=True) + copy_if_exists(REPO_ROOT / "etc" / "task-runner.yaml.sample", taskrunner_bootstrap / "task-runner.yaml.sample") + + docs_dir = staging_dir / "docs" + copy_if_exists(REPO_ROOT / "docs" / "task-packs", docs_dir / "task-packs") + copy_if_exists(REPO_ROOT / "docs" / "modules" / "taskrunner", docs_dir / "modules" / "taskrunner") + + +def copy_orchestrator_assets(release_dir: Path, staging_dir: Path) -> None: + """Copy orchestrator service, worker SDK, postgres snapshot, and dashboards when present.""" + mapping = { + release_dir / "orchestrator" / "service": staging_dir / "orchestrator" / "service", + release_dir / "orchestrator" / "worker-sdk": staging_dir / "orchestrator" / "worker-sdk", + release_dir / "orchestrator" / "postgres": staging_dir / "orchestrator" / "postgres", + release_dir / "orchestrator" / "dashboards": staging_dir / "orchestrator" / "dashboards", + } + for src, dest in mapping.items(): + copy_if_exists(src, dest) + + +def copy_export_and_notifier_assets(release_dir: Path, staging_dir: Path) -> None: + """Copy Export Center and Notifier offline bundles and tooling when present.""" + copy_if_exists(release_dir / "export-center", staging_dir / "export-center") + copy_if_exists(release_dir / "notifier", staging_dir / "notifier") + + +def copy_surface_secrets(release_dir: Path, staging_dir: Path) -> None: + """Include Surface.Secrets bundles and manifests if present.""" + copy_if_exists(release_dir / "surface-secrets", staging_dir / "surface-secrets") + + +def copy_bootstrap_configs(staging_dir: Path) -> None: + notify_config = REPO_ROOT / "etc" / "notify.airgap.yaml" + notify_secret = REPO_ROOT / "etc" / "secrets" / "notify-web-airgap.secret.example" + notify_doc = REPO_ROOT / "docs" / "modules" / "notify" / "bootstrap-pack.md" + + if not notify_config.exists(): + raise FileNotFoundError(f"Missing notifier air-gap config: {notify_config}") + if not notify_secret.exists(): + raise FileNotFoundError(f"Missing notifier air-gap secret template: {notify_secret}") + + notify_bootstrap_dir = staging_dir / "bootstrap" / "notify" + notify_bootstrap_dir.mkdir(parents=True, exist_ok=True) + copy_if_exists(REPO_ROOT / "etc" / "bootstrap" / "notify", notify_bootstrap_dir) + + copy_if_exists(notify_config, notify_bootstrap_dir / "notify.yaml") + copy_if_exists(notify_secret, notify_bootstrap_dir / "notify-web.secret.example") + copy_if_exists(notify_doc, notify_bootstrap_dir / "README.md") + + +def verify_required_seed_data(repo_root: Path) -> None: + ruby_git_sources = repo_root / "src" / "__Tests" / "__Datasets" / "seed-data" / "analyzers" / "ruby" / "git-sources" + if not ruby_git_sources.is_dir(): + raise FileNotFoundError(f"Missing Ruby git-sources 
seed directory: {ruby_git_sources}") + + required_files = [ + ruby_git_sources / "Gemfile.lock", + ruby_git_sources / "expected.json", + ] + for path in required_files: + if not path.exists(): + raise FileNotFoundError(f"Offline kit seed artefact missing: {path}") + + +def copy_third_party_licenses(staging_dir: Path) -> None: + licenses_src = REPO_ROOT / "third-party-licenses" + if not licenses_src.is_dir(): + return + + target_dir = staging_dir / "third-party-licenses" + target_dir.mkdir(parents=True, exist_ok=True) + + entries = sorted(licenses_src.iterdir(), key=lambda entry: entry.name.lower()) + for entry in entries: + if entry.is_dir(): + shutil.copytree(entry, target_dir / entry.name, dirs_exist_ok=True) + elif entry.is_file(): + shutil.copy2(entry, target_dir / entry.name) + + +def package_telemetry_bundle(staging_dir: Path) -> None: + script = TELEMETRY_TOOLS_DIR / "package_offline_bundle.py" + if not script.exists(): + return + TELEMETRY_BUNDLE_PATH.parent.mkdir(parents=True, exist_ok=True) + run(["python", str(script), "--output", str(TELEMETRY_BUNDLE_PATH)], cwd=REPO_ROOT) + telemetry_dir = staging_dir / "telemetry" + telemetry_dir.mkdir(parents=True, exist_ok=True) + shutil.copy2(TELEMETRY_BUNDLE_PATH, telemetry_dir / TELEMETRY_BUNDLE_PATH.name) + sha_path = TELEMETRY_BUNDLE_PATH.with_suffix(TELEMETRY_BUNDLE_PATH.suffix + ".sha256") + if sha_path.exists(): + shutil.copy2(sha_path, telemetry_dir / sha_path.name) + + +def scan_files(staging_dir: Path, exclude: Optional[set[str]] = None) -> list[OrderedDict[str, Any]]: + entries: list[OrderedDict[str, Any]] = [] + exclude = exclude or set() + for path in sorted(staging_dir.rglob("*")): + if not path.is_file(): + continue + rel = path.relative_to(staging_dir).as_posix() + if rel in exclude: + continue + entries.append( + OrderedDict( + ( + ("name", rel), + ("sha256", compute_sha256(path)), + ("size", path.stat().st_size), + ) + ) + ) + return entries + + +def summarize_counts(staging_dir: Path) -> Mapping[str, int]: + def count_files(rel: str) -> int: + root = staging_dir / rel + if not root.exists(): + return 0 + return sum(1 for path in root.rglob("*") if path.is_file()) + + return { + "cli": count_files("cli"), + "taskPacksDocs": count_files("docs/task-packs"), + "containers": count_files("containers"), + "orchestrator": count_files("orchestrator"), + "exportCenter": count_files("export-center"), + "notifier": count_files("notifier"), + "surfaceSecrets": count_files("surface-secrets"), + } + + +def copy_container_bundles(release_dir: Path, staging_dir: Path) -> None: + """Copy container air-gap bundles if present in the release directory.""" + candidates = [release_dir / "containers", release_dir / "images"] + target_dir = staging_dir / "containers" + for root in candidates: + if not root.exists(): + continue + for bundle in sorted(root.glob("**/*")): + if bundle.is_file() and bundle.suffix in {".gz", ".tar", ".tgz"}: + target_path = target_dir / bundle.relative_to(root) + target_path.parent.mkdir(parents=True, exist_ok=True) + shutil.copy2(bundle, target_path) + + +def write_offline_manifest( + staging_dir: Path, + version: str, + channel: str, + release_manifest_sha: Optional[str], +) -> tuple[Path, str]: + manifest_dir = staging_dir / "manifest" + manifest_dir.mkdir(parents=True, exist_ok=True) + offline_manifest_path = manifest_dir / "offline-manifest.json" + files = scan_files(staging_dir, exclude={"manifest/offline-manifest.json", "manifest/offline-manifest.json.sha256"}) + manifest_data = OrderedDict( + ( + ( + "bundle", 
+ OrderedDict( + ( + ("version", version), + ("channel", channel), + ("capturedAt", utc_now_iso()), + ("releaseManifestSha256", release_manifest_sha), + ) + ), + ), + ("artifacts", files), + ) + ) + with offline_manifest_path.open("w", encoding="utf-8") as handle: + json.dump(manifest_data, handle, indent=2) + handle.write("\n") + manifest_sha = compute_sha256(offline_manifest_path) + (offline_manifest_path.with_suffix(".json.sha256")).write_text( + f"{manifest_sha} {offline_manifest_path.name}\n", + encoding="utf-8", + ) + return offline_manifest_path, manifest_sha + + +def tarinfo_filter(tarinfo: tarfile.TarInfo) -> tarfile.TarInfo: + tarinfo.uid = 0 + tarinfo.gid = 0 + tarinfo.uname = "" + tarinfo.gname = "" + tarinfo.mtime = 0 + return tarinfo + + +def create_tarball(staging_dir: Path, output_dir: Path, bundle_name: str) -> Path: + output_dir.mkdir(parents=True, exist_ok=True) + bundle_path = output_dir / f"{bundle_name}.tar.gz" + if bundle_path.exists(): + bundle_path.unlink() + with tarfile.open(bundle_path, "w:gz", compresslevel=9) as tar: + for path in sorted(staging_dir.rglob("*")): + if path.is_file(): + arcname = path.relative_to(staging_dir).as_posix() + tar.add(path, arcname=arcname, filter=tarinfo_filter) + return bundle_path + + +def sign_blob( + path: Path, + *, + key_ref: Optional[str], + identity_token: Optional[str], + password: Optional[str], + tlog_upload: bool, +) -> Optional[Path]: + if not key_ref and not identity_token: + return None + cmd = ["cosign", "sign-blob", "--yes", str(path)] + if key_ref: + cmd.extend(["--key", key_ref]) + if identity_token: + cmd.extend(["--identity-token", identity_token]) + if not tlog_upload: + cmd.append("--tlog-upload=false") + env = {"COSIGN_PASSWORD": password or ""} + signature = run(cmd, env=env) + sig_path = path.with_suffix(path.suffix + ".sig") + sig_path.write_text(signature, encoding="utf-8") + return sig_path + + +def build_offline_kit(args: argparse.Namespace) -> MutableMapping[str, Any]: + release_dir = args.release_dir.resolve() + staging_dir = args.staging_dir.resolve() + output_dir = args.output_dir.resolve() + + verify_release(release_dir) + verify_required_seed_data(REPO_ROOT) + if not args.skip_smoke: + run_rust_analyzer_smoke() + run_python_analyzer_smoke() + clean_directory(staging_dir) + copy_debug_store(release_dir, staging_dir) + + manifest_data = load_manifest(release_dir) + release_manifest_sha = None + checksums = manifest_data.get("checksums") + if isinstance(checksums, Mapping): + release_manifest_sha = checksums.get("sha256") + + copy_release_manifests(release_dir, staging_dir) + copy_component_artifacts(manifest_data, release_dir, staging_dir) + copy_collections(manifest_data, release_dir, staging_dir) + copy_plugins_and_assets(staging_dir) + copy_bootstrap_configs(staging_dir) + copy_cli_and_taskrunner_assets(release_dir, staging_dir) + copy_container_bundles(release_dir, staging_dir) + copy_orchestrator_assets(release_dir, staging_dir) + copy_export_and_notifier_assets(release_dir, staging_dir) + copy_surface_secrets(release_dir, staging_dir) + copy_third_party_licenses(staging_dir) + package_telemetry_bundle(staging_dir) + + offline_manifest_path, offline_manifest_sha = write_offline_manifest( + staging_dir, + args.version, + args.channel, + release_manifest_sha, + ) + bundle_name = f"stella-ops-offline-kit-{args.version}-{args.channel}" + bundle_path = create_tarball(staging_dir, output_dir, bundle_name) + bundle_sha = compute_sha256(bundle_path) + bundle_sha_prefixed = f"sha256:{bundle_sha}" + 
(bundle_path.with_suffix(".tar.gz.sha256")).write_text( + f"{bundle_sha} {bundle_path.name}\n", + encoding="utf-8", + ) + + signature_paths: dict[str, str] = {} + sig = sign_blob( + bundle_path, + key_ref=args.cosign_key, + identity_token=args.cosign_identity_token, + password=args.cosign_password, + tlog_upload=not args.no_transparency, + ) + if sig: + signature_paths["bundleSignature"] = str(sig) + manifest_sig = sign_blob( + offline_manifest_path, + key_ref=args.cosign_key, + identity_token=args.cosign_identity_token, + password=args.cosign_password, + tlog_upload=not args.no_transparency, + ) + if manifest_sig: + signature_paths["manifestSignature"] = str(manifest_sig) + + metadata = OrderedDict( + ( + ("bundleId", args.bundle_id or f"{args.version}-{args.channel}-{utc_now_iso()}"), + ("bundleName", bundle_path.name), + ("bundleSha256", bundle_sha_prefixed), + ("bundleSize", bundle_path.stat().st_size), + ("manifestName", offline_manifest_path.name), + ("manifestSha256", f"sha256:{offline_manifest_sha}"), + ("manifestSize", offline_manifest_path.stat().st_size), + ("channel", args.channel), + ("version", args.version), + ("capturedAt", utc_now_iso()), + ("counts", summarize_counts(staging_dir)), + ) + ) + + if sig: + metadata["bundleSignatureName"] = Path(sig).name + if manifest_sig: + metadata["manifestSignatureName"] = Path(manifest_sig).name + + metadata_path = output_dir / f"{bundle_name}.metadata.json" + with metadata_path.open("w", encoding="utf-8") as handle: + json.dump(metadata, handle, indent=2) + handle.write("\n") + + return OrderedDict( + ( + ("bundlePath", str(bundle_path)), + ("bundleSha256", bundle_sha), + ("manifestPath", str(offline_manifest_path)), + ("metadataPath", str(metadata_path)), + ("signatures", signature_paths), + ) + ) + + +def parse_args(argv: Optional[list[str]] = None) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument("--version", required=True, help="Bundle version (e.g. 
2025.10.0)") + parser.add_argument("--channel", default="edge", help="Release channel (default: %(default)s)") + parser.add_argument("--bundle-id", help="Optional explicit bundle identifier") + parser.add_argument( + "--release-dir", + type=Path, + default=DEFAULT_RELEASE_DIR, + help="Release artefact directory (default: %(default)s)", + ) + parser.add_argument( + "--staging-dir", + type=Path, + default=DEFAULT_STAGING_DIR, + help="Temporary staging directory (default: %(default)s)", + ) + parser.add_argument( + "--output-dir", + type=Path, + default=DEFAULT_OUTPUT_DIR, + help="Destination directory for packaged bundles (default: %(default)s)", + ) + parser.add_argument("--cosign-key", dest="cosign_key", help="Cosign key reference for signing") + parser.add_argument("--cosign-password", dest="cosign_password", help="Cosign key password (if applicable)") + parser.add_argument("--cosign-identity-token", dest="cosign_identity_token", help="Cosign identity token") + parser.add_argument("--no-transparency", action="store_true", help="Disable Rekor transparency log uploads") + parser.add_argument("--skip-smoke", action="store_true", help="Skip analyzer smoke execution (testing only)") + return parser.parse_args(argv) + + +def main(argv: Optional[list[str]] = None) -> int: + args = parse_args(argv) + try: + result = build_offline_kit(args) + except Exception as exc: # pylint: disable=broad-except + print(f"offline-kit packaging failed: {exc}", file=sys.stderr) + return 1 + print("✅ Offline kit packaged") + for key, value in result.items(): + if isinstance(value, dict): + for sub_key, sub_val in value.items(): + print(f" - {key}.{sub_key}: {sub_val}") + else: + print(f" - {key}: {value}") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/deploy/offline/kit/mirror_debug_store.py b/deploy/offline/kit/mirror_debug_store.py new file mode 100644 index 000000000..334e40d9d --- /dev/null +++ b/deploy/offline/kit/mirror_debug_store.py @@ -0,0 +1,221 @@ +#!/usr/bin/env python3 +"""Mirror release debug-store artefacts into the Offline Kit staging tree. + +This helper copies the release `debug/` directory (including `.build-id/`, +`debug-manifest.json`, and the `.sha256` companion) into the Offline Kit +output directory and verifies the manifest hashes after the copy. A summary +document is written under `metadata/debug-store.json` so packaging jobs can +surface the available build-ids and validation status. +""" + +from __future__ import annotations + +import argparse +import datetime as dt +import json +import pathlib +import shutil +import sys +from typing import Iterable, Tuple + +REPO_ROOT = pathlib.Path(__file__).resolve().parents[2] + + +def compute_sha256(path: pathlib.Path) -> str: + import hashlib + + sha = hashlib.sha256() + with path.open("rb") as handle: + for chunk in iter(lambda: handle.read(1024 * 1024), b""): + sha.update(chunk) + return sha.hexdigest() + + +def load_manifest(manifest_path: pathlib.Path) -> dict: + with manifest_path.open("r", encoding="utf-8") as handle: + return json.load(handle) + + +def parse_manifest_sha(sha_path: pathlib.Path) -> str | None: + if not sha_path.exists(): + return None + text = sha_path.read_text(encoding="utf-8").strip() + if not text: + return None + # Allow either "" or " filename" formats. 
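+    # That is, a bare hex digest or the "digest  filename" layout written by sha256sum; only the digest token is kept.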
+ return text.split()[0] + + +def iter_debug_files(base_dir: pathlib.Path) -> Iterable[pathlib.Path]: + for path in base_dir.rglob("*"): + if path.is_file(): + yield path + + +def copy_debug_store(source_root: pathlib.Path, target_root: pathlib.Path, *, dry_run: bool) -> None: + if dry_run: + print(f"[dry-run] Would copy '{source_root}' -> '{target_root}'") + return + + if target_root.exists(): + shutil.rmtree(target_root) + shutil.copytree(source_root, target_root) + + +def verify_debug_store(manifest: dict, offline_root: pathlib.Path) -> Tuple[int, int]: + """Return (verified_count, total_entries).""" + + artifacts = manifest.get("artifacts", []) + verified = 0 + for entry in artifacts: + debug_path = entry.get("debugPath") + expected_sha = entry.get("sha256") + expected_size = entry.get("size") + + if not debug_path or not expected_sha: + continue + + relative = pathlib.PurePosixPath(debug_path) + resolved = (offline_root.parent / relative).resolve() + + if not resolved.exists(): + raise FileNotFoundError(f"Debug artefact missing after mirror: {relative}") + + actual_sha = compute_sha256(resolved) + if actual_sha != expected_sha: + raise ValueError( + f"Digest mismatch for {relative}: expected {expected_sha}, found {actual_sha}" + ) + + if expected_size is not None: + actual_size = resolved.stat().st_size + if actual_size != expected_size: + raise ValueError( + f"Size mismatch for {relative}: expected {expected_size}, found {actual_size}" + ) + + verified += 1 + + return verified, len(artifacts) + + +def summarize_store(manifest: dict, manifest_sha: str | None, offline_root: pathlib.Path, summary_path: pathlib.Path) -> None: + debug_files = [ + path + for path in iter_debug_files(offline_root) + if path.suffix == ".debug" + ] + + total_size = sum(path.stat().st_size for path in debug_files) + build_ids = sorted( + {entry.get("buildId") for entry in manifest.get("artifacts", []) if entry.get("buildId")} + ) + + summary = { + "generatedAt": dt.datetime.now(tz=dt.timezone.utc) + .replace(microsecond=0) + .isoformat() + .replace("+00:00", "Z"), + "manifestGeneratedAt": manifest.get("generatedAt"), + "manifestSha256": manifest_sha, + "platforms": manifest.get("platforms") + or sorted({entry.get("platform") for entry in manifest.get("artifacts", []) if entry.get("platform")}), + "artifactCount": len(manifest.get("artifacts", [])), + "buildIds": { + "total": len(build_ids), + "samples": build_ids[:10], + }, + "debugFiles": { + "count": len(debug_files), + "totalSizeBytes": total_size, + }, + } + + summary_path.parent.mkdir(parents=True, exist_ok=True) + with summary_path.open("w", encoding="utf-8") as handle: + json.dump(summary, handle, indent=2) + handle.write("\n") + + +def resolve_release_debug_dir(base: pathlib.Path) -> pathlib.Path: + debug_dir = base / "debug" + if debug_dir.exists(): + return debug_dir + + # Allow specifying the channel directory directly (e.g. 
out/release/stable) + if base.name == "debug": + return base + + raise FileNotFoundError(f"Debug directory not found under '{base}'") + + +def parse_args(argv: list[str] | None = None) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--release-dir", + type=pathlib.Path, + default=REPO_ROOT / "out" / "release", + help="Release output directory containing the debug store (default: %(default)s)", + ) + parser.add_argument( + "--offline-kit-dir", + type=pathlib.Path, + default=REPO_ROOT / "out" / "offline-kit", + help="Offline Kit staging directory (default: %(default)s)", + ) + parser.add_argument( + "--verify-only", + action="store_true", + help="Skip copying and only verify the existing offline kit debug store", + ) + parser.add_argument( + "--dry-run", + action="store_true", + help="Print actions without copying files", + ) + return parser.parse_args(argv) + + +def main(argv: list[str] | None = None) -> int: + args = parse_args(argv) + + try: + source_debug = resolve_release_debug_dir(args.release_dir.resolve()) + except FileNotFoundError as exc: + print(f"error: {exc}", file=sys.stderr) + return 2 + + target_root = (args.offline_kit_dir / "debug").resolve() + + if not args.verify_only: + copy_debug_store(source_debug, target_root, dry_run=args.dry_run) + if args.dry_run: + return 0 + + manifest_path = target_root / "debug-manifest.json" + if not manifest_path.exists(): + print(f"error: offline kit manifest missing at {manifest_path}", file=sys.stderr) + return 3 + + manifest = load_manifest(manifest_path) + manifest_sha_path = manifest_path.with_suffix(manifest_path.suffix + ".sha256") + recorded_sha = parse_manifest_sha(manifest_sha_path) + recomputed_sha = compute_sha256(manifest_path) + if recorded_sha and recorded_sha != recomputed_sha: + print( + f"warning: manifest SHA mismatch (recorded {recorded_sha}, recomputed {recomputed_sha}); updating checksum", + file=sys.stderr, + ) + manifest_sha_path.write_text(f"{recomputed_sha} {manifest_path.name}\n", encoding="utf-8") + + verified, total = verify_debug_store(manifest, target_root) + print(f"✔ verified {verified}/{total} debug artefacts (manifest SHA {recomputed_sha})") + + summary_path = args.offline_kit_dir / "metadata" / "debug-store.json" + summarize_store(manifest, recomputed_sha, target_root, summary_path) + print(f"ℹ summary written to {summary_path}") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/deploy/offline/kit/run-python-analyzer-smoke.sh b/deploy/offline/kit/run-python-analyzer-smoke.sh new file mode 100644 index 000000000..cb4712f95 --- /dev/null +++ b/deploy/offline/kit/run-python-analyzer-smoke.sh @@ -0,0 +1,36 @@ +#!/usr/bin/env bash +set -euo pipefail + +repo_root="$(git -C "${BASH_SOURCE%/*}/.." 
rev-parse --show-toplevel 2>/dev/null || pwd)" +project_path="${repo_root}/src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Python/StellaOps.Scanner.Analyzers.Lang.Python.csproj" +output_dir="${repo_root}/out/analyzers/python" +plugin_dir="${repo_root}/plugins/scanner/analyzers/lang/StellaOps.Scanner.Analyzers.Lang.Python" + +to_win_path() { + if command -v wslpath >/dev/null 2>&1; then + wslpath -w "$1" + else + printf '%s\n' "$1" + fi +} + +rm -rf "${output_dir}" +project_path_win="$(to_win_path "$project_path")" +output_dir_win="$(to_win_path "$output_dir")" + +dotnet publish "$project_path_win" \ + --configuration Release \ + --output "$output_dir_win" \ + --self-contained false + +mkdir -p "${plugin_dir}" +cp "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Python.dll" "${plugin_dir}/" +if [[ -f "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Python.pdb" ]]; then + cp "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Python.pdb" "${plugin_dir}/" +fi + +repo_root_win="$(to_win_path "$repo_root")" +exec dotnet run \ + --project "${repo_root_win}/src/Tools/LanguageAnalyzerSmoke/LanguageAnalyzerSmoke.csproj" \ + --configuration Release \ + -- --repo-root "${repo_root_win}" diff --git a/deploy/offline/kit/run-rust-analyzer-smoke.sh b/deploy/offline/kit/run-rust-analyzer-smoke.sh new file mode 100644 index 000000000..04df06fdc --- /dev/null +++ b/deploy/offline/kit/run-rust-analyzer-smoke.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash +set -euo pipefail + +repo_root="$(git -C "${BASH_SOURCE%/*}/.." rev-parse --show-toplevel 2>/dev/null || pwd)" +project_path="${repo_root}/src/Scanner/__Libraries/StellaOps.Scanner.Analyzers.Lang.Rust/StellaOps.Scanner.Analyzers.Lang.Rust.csproj" +output_dir="${repo_root}/out/analyzers/rust" +plugin_dir="${repo_root}/plugins/scanner/analyzers/lang/StellaOps.Scanner.Analyzers.Lang.Rust" + +to_win_path() { + if command -v wslpath >/dev/null 2>&1; then + wslpath -w "$1" + else + printf '%s\n' "$1" + fi +} + +rm -rf "${output_dir}" +project_path_win="$(to_win_path "$project_path")" +output_dir_win="$(to_win_path "$output_dir")" + +dotnet publish "$project_path_win" \ + --configuration Release \ + --output "$output_dir_win" \ + --self-contained false + +mkdir -p "${plugin_dir}" +cp "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Rust.dll" "${plugin_dir}/" +if [[ -f "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Rust.pdb" ]]; then + cp "${output_dir}/StellaOps.Scanner.Analyzers.Lang.Rust.pdb" "${plugin_dir}/" +fi + +repo_root_win="$(to_win_path "$repo_root")" +exec dotnet run \ + --project "${repo_root_win}/src/Tools/LanguageAnalyzerSmoke/LanguageAnalyzerSmoke.csproj" \ + --configuration Release \ + -- --repo-root "${repo_root_win}" \ + --analyzer rust diff --git a/deploy/offline/kit/test_build_offline_kit.py b/deploy/offline/kit/test_build_offline_kit.py new file mode 100644 index 000000000..b6111cf03 --- /dev/null +++ b/deploy/offline/kit/test_build_offline_kit.py @@ -0,0 +1,334 @@ +from __future__ import annotations + +import json +import tarfile +import tempfile +import unittest +import argparse +import sys +from collections import OrderedDict +from pathlib import Path + +current_dir = Path(__file__).resolve().parent +sys.path.append(str(current_dir)) +sys.path.append(str(current_dir.parent / "devops" / "release")) + +from build_release import write_manifest # type: ignore import-not-found + +from build_offline_kit import build_offline_kit, compute_sha256 # type: ignore import-not-found + + +class OfflineKitBuilderTests(unittest.TestCase): + def setUp(self) -> 
None: + self._temp = tempfile.TemporaryDirectory() + self.base_path = Path(self._temp.name) + self.out_dir = self.base_path / "out" + self.release_dir = self.out_dir / "release" + self.staging_dir = self.base_path / "staging" + self.output_dir = self.base_path / "dist" + self._create_sample_release() + + def tearDown(self) -> None: + self._temp.cleanup() + + def _relative_to_out(self, path: Path) -> str: + return path.relative_to(self.out_dir).as_posix() + + def _write_json(self, path: Path, payload: dict[str, object]) -> None: + path.parent.mkdir(parents=True, exist_ok=True) + with path.open("w", encoding="utf-8") as handle: + json.dump(payload, handle, indent=2) + handle.write("\n") + + def _create_sample_release(self) -> None: + self.release_dir.mkdir(parents=True, exist_ok=True) + + cli_archive = self.release_dir / "cli" / "stellaops-cli-linux-x64.tar.gz" + cli_archive.parent.mkdir(parents=True, exist_ok=True) + cli_archive.write_bytes(b"cli-bytes") + compute_sha256(cli_archive) + + container_bundle = self.release_dir / "containers" / "stellaops-containers.tar.gz" + container_bundle.parent.mkdir(parents=True, exist_ok=True) + container_bundle.write_bytes(b"container-bundle") + compute_sha256(container_bundle) + + orchestrator_service = self.release_dir / "orchestrator" / "service" / "orchestrator-service.tar.gz" + orchestrator_service.parent.mkdir(parents=True, exist_ok=True) + orchestrator_service.write_bytes(b"orch-service") + compute_sha256(orchestrator_service) + + orchestrator_dash = self.release_dir / "orchestrator" / "dashboards" / "dash.json" + orchestrator_dash.parent.mkdir(parents=True, exist_ok=True) + orchestrator_dash.write_text("{}\n", encoding="utf-8") + + export_bundle = self.release_dir / "export-center" / "export-offline-bundle.tar.gz" + export_bundle.parent.mkdir(parents=True, exist_ok=True) + export_bundle.write_bytes(b"export") + compute_sha256(export_bundle) + + notifier_pack = self.release_dir / "notifier" / "notifier-offline-pack.tar.gz" + notifier_pack.parent.mkdir(parents=True, exist_ok=True) + notifier_pack.write_bytes(b"notifier") + compute_sha256(notifier_pack) + + secrets_bundle = self.release_dir / "surface-secrets" / "secrets-bundle.tar.gz" + secrets_bundle.parent.mkdir(parents=True, exist_ok=True) + secrets_bundle.write_bytes(b"secrets") + compute_sha256(secrets_bundle) + + sbom_path = self.release_dir / "artifacts/sboms/sample.cyclonedx.json" + sbom_path.parent.mkdir(parents=True, exist_ok=True) + sbom_path.write_text('{"bomFormat":"CycloneDX","specVersion":"1.5"}\n', encoding="utf-8") + sbom_sha = compute_sha256(sbom_path) + + provenance_path = self.release_dir / "artifacts/provenance/sample.provenance.json" + self._write_json( + provenance_path, + { + "buildDefinition": {"buildType": "https://example/build"}, + "runDetails": {"builder": {"id": "https://example/ci"}}, + }, + ) + provenance_sha = compute_sha256(provenance_path) + + signature_path = self.release_dir / "artifacts/signatures/sample.signature" + signature_path.parent.mkdir(parents=True, exist_ok=True) + signature_path.write_text("signature-data\n", encoding="utf-8") + signature_sha = compute_sha256(signature_path) + + metadata_path = self.release_dir / "artifacts/metadata/sample.metadata.json" + self._write_json(metadata_path, {"digest": "sha256:1234"}) + metadata_sha = compute_sha256(metadata_path) + + chart_path = self.release_dir / "helm/stellaops-1.0.0.tgz" + chart_path.parent.mkdir(parents=True, exist_ok=True) + chart_path.write_bytes(b"helm-chart-data") + chart_sha = 
compute_sha256(chart_path) + + compose_path = self.release_dir.parent / "deploy/compose/docker-compose.dev.yaml" + compose_path.parent.mkdir(parents=True, exist_ok=True) + compose_path.write_text("services: {}\n", encoding="utf-8") + compose_sha = compute_sha256(compose_path) + + debug_file = self.release_dir / "debug/.build-id/ab/cdef.debug" + debug_file.parent.mkdir(parents=True, exist_ok=True) + debug_file.write_bytes(b"\x7fELFDEBUGDATA") + debug_sha = compute_sha256(debug_file) + + debug_manifest_path = self.release_dir / "debug/debug-manifest.json" + debug_manifest = OrderedDict( + ( + ("generatedAt", "2025-10-26T00:00:00Z"), + ("version", "1.0.0"), + ("channel", "edge"), + ( + "artifacts", + [ + OrderedDict( + ( + ("buildId", "abcdef1234"), + ("platform", "linux/amd64"), + ("debugPath", "debug/.build-id/ab/cdef.debug"), + ("sha256", debug_sha), + ("size", debug_file.stat().st_size), + ("components", ["sample"]), + ("images", ["registry.example/sample@sha256:feedface"]), + ("sources", ["app/sample.dll"]), + ) + ) + ], + ), + ) + ) + self._write_json(debug_manifest_path, debug_manifest) + debug_manifest_sha = compute_sha256(debug_manifest_path) + (debug_manifest_path.with_suffix(debug_manifest_path.suffix + ".sha256")).write_text( + f"{debug_manifest_sha} {debug_manifest_path.name}\n", + encoding="utf-8", + ) + + manifest = OrderedDict( + ( + ( + "release", + OrderedDict( + ( + ("version", "1.0.0"), + ("channel", "edge"), + ("date", "2025-10-26T00:00:00Z"), + ("calendar", "2025.10"), + ) + ), + ), + ( + "components", + [ + OrderedDict( + ( + ("name", "sample"), + ("image", "registry.example/sample@sha256:feedface"), + ("tags", ["registry.example/sample:1.0.0"]), + ( + "sbom", + OrderedDict( + ( + ("path", self._relative_to_out(sbom_path)), + ("sha256", sbom_sha), + ) + ), + ), + ( + "provenance", + OrderedDict( + ( + ("path", self._relative_to_out(provenance_path)), + ("sha256", provenance_sha), + ) + ), + ), + ( + "signature", + OrderedDict( + ( + ("path", self._relative_to_out(signature_path)), + ("sha256", signature_sha), + ("ref", "sigstore://example"), + ("tlogUploaded", True), + ) + ), + ), + ( + "metadata", + OrderedDict( + ( + ("path", self._relative_to_out(metadata_path)), + ("sha256", metadata_sha), + ) + ), + ), + ) + ) + ], + ), + ( + "charts", + [ + OrderedDict( + ( + ("name", "stellaops"), + ("version", "1.0.0"), + ("path", self._relative_to_out(chart_path)), + ("sha256", chart_sha), + ) + ) + ], + ), + ( + "compose", + [ + OrderedDict( + ( + ("name", "docker-compose.dev.yaml"), + ("path", compose_path.relative_to(self.out_dir).as_posix()), + ("sha256", compose_sha), + ) + ) + ], + ), + ( + "debugStore", + OrderedDict( + ( + ("manifest", "debug/debug-manifest.json"), + ("sha256", debug_manifest_sha), + ("entries", 1), + ("platforms", ["linux/amd64"]), + ("directory", "debug/.build-id"), + ) + ), + ), + ) + ) + write_manifest(manifest, self.release_dir) + + def test_build_offline_kit(self) -> None: + args = argparse.Namespace( + version="2025.10.0", + channel="edge", + bundle_id="bundle-001", + release_dir=self.release_dir, + staging_dir=self.staging_dir, + output_dir=self.output_dir, + cosign_key=None, + cosign_password=None, + cosign_identity_token=None, + no_transparency=False, + skip_smoke=True, + ) + result = build_offline_kit(args) + bundle_path = Path(result["bundlePath"]) + self.assertTrue(bundle_path.exists()) + offline_manifest = self.output_dir.parent / "staging" / "manifest" / "offline-manifest.json" + self.assertTrue(offline_manifest.exists()) + + 
bootstrap_notify = self.staging_dir / "bootstrap" / "notify" + self.assertTrue((bootstrap_notify / "notify.yaml").exists()) + self.assertTrue((bootstrap_notify / "notify-web.secret.example").exists()) + + taskrunner_bootstrap = self.staging_dir / "bootstrap" / "task-runner" + self.assertTrue((taskrunner_bootstrap / "task-runner.yaml.sample").exists()) + + docs_taskpacks = self.staging_dir / "docs" / "task-packs" + self.assertTrue(docs_taskpacks.exists()) + self.assertTrue((self.staging_dir / "docs" / "mirror-bundles.md").exists()) + + containers_dir = self.staging_dir / "containers" + self.assertTrue((containers_dir / "stellaops-containers.tar.gz").exists()) + + orchestrator_dir = self.staging_dir / "orchestrator" + self.assertTrue((orchestrator_dir / "service" / "orchestrator-service.tar.gz").exists()) + self.assertTrue((orchestrator_dir / "dashboards" / "dash.json").exists()) + + export_dir = self.staging_dir / "export-center" + self.assertTrue((export_dir / "export-offline-bundle.tar.gz").exists()) + + notifier_dir = self.staging_dir / "notifier" + self.assertTrue((notifier_dir / "notifier-offline-pack.tar.gz").exists()) + + secrets_dir = self.staging_dir / "surface-secrets" + self.assertTrue((secrets_dir / "secrets-bundle.tar.gz").exists()) + + with offline_manifest.open("r", encoding="utf-8") as handle: + manifest_data = json.load(handle) + artifacts = manifest_data["artifacts"] + self.assertTrue(any(item["name"].startswith("sboms/") for item in artifacts)) + self.assertTrue(any(item["name"].startswith("cli/") for item in artifacts)) + + metadata_path = Path(result["metadataPath"]) + data = json.loads(metadata_path.read_text(encoding="utf-8")) + self.assertTrue(data["bundleSha256"].startswith("sha256:")) + self.assertTrue(data["manifestSha256"].startswith("sha256:")) + counts = data["counts"] + self.assertGreaterEqual(counts["cli"], 1) + self.assertGreaterEqual(counts["containers"], 1) + self.assertGreaterEqual(counts["orchestrator"], 2) + self.assertGreaterEqual(counts["exportCenter"], 1) + self.assertGreaterEqual(counts["notifier"], 1) + self.assertGreaterEqual(counts["surfaceSecrets"], 1) + + with tarfile.open(bundle_path, "r:gz") as tar: + members = tar.getnames() + self.assertIn("manifest/release.yaml", members) + self.assertTrue(any(name.startswith("sboms/sample-") for name in members)) + self.assertIn("bootstrap/notify/notify.yaml", members) + self.assertIn("bootstrap/notify/notify-web.secret.example", members) + self.assertIn("containers/stellaops-containers.tar.gz", members) + self.assertIn("orchestrator/service/orchestrator-service.tar.gz", members) + self.assertIn("export-center/export-offline-bundle.tar.gz", members) + self.assertIn("notifier/notifier-offline-pack.tar.gz", members) + self.assertIn("surface-secrets/secrets-bundle.tar.gz", members) + + +if __name__ == "__main__": + unittest.main() diff --git a/deploy/offline/scripts/install-secrets-bundle.sh b/deploy/offline/scripts/install-secrets-bundle.sh new file mode 100644 index 000000000..e29db46ef --- /dev/null +++ b/deploy/offline/scripts/install-secrets-bundle.sh @@ -0,0 +1,231 @@ +#!/usr/bin/env bash +# ----------------------------------------------------------------------------- +# install-secrets-bundle.sh +# Sprint: SPRINT_20260104_005_AIRGAP (Secret Offline Kit Integration) +# Task: OKS-005 - Create bundle installation script +# Description: Install signed secrets rule bundle for offline environments +# ----------------------------------------------------------------------------- +# Usage: 
./install-secrets-bundle.sh <bundle-path> [install-path] [attestor-mirror] +# Example: ./install-secrets-bundle.sh /mnt/offline-kit/rules/secrets/2026.01 + +set -euo pipefail + +# Configuration +BUNDLE_PATH="${1:?Bundle path required (e.g., /mnt/offline-kit/rules/secrets/2026.01)}" +INSTALL_PATH="${2:-/opt/stellaops/plugins/scanner/analyzers/secrets}" +ATTESTOR_MIRROR="${3:-}" +BUNDLE_ID="${BUNDLE_ID:-secrets.ruleset}" +REQUIRE_SIGNATURE="${REQUIRE_SIGNATURE:-true}" +STELLAOPS_USER="${STELLAOPS_USER:-stellaops}" +STELLAOPS_GROUP="${STELLAOPS_GROUP:-stellaops}" + +# Color output helpers (disabled if not a terminal) +if [[ -t 1 ]]; then + RED='\033[0;31m' + GREEN='\033[0;32m' + YELLOW='\033[0;33m' + NC='\033[0m' # No Color +else + RED='' + GREEN='' + YELLOW='' + NC='' +fi + +log_info() { echo -e "${GREEN}==>${NC} $*"; } +log_warn() { echo -e "${YELLOW}WARN:${NC} $*" >&2; } +log_error() { echo -e "${RED}ERROR:${NC} $*" >&2; } + +# Validate bundle path +log_info "Validating secrets bundle at ${BUNDLE_PATH}" + +if [[ ! -d "${BUNDLE_PATH}" ]]; then + log_error "Bundle directory not found: ${BUNDLE_PATH}" + exit 1 +fi + +MANIFEST_FILE="${BUNDLE_PATH}/${BUNDLE_ID}.manifest.json" +RULES_FILE="${BUNDLE_PATH}/${BUNDLE_ID}.rules.jsonl" +SIGNATURE_FILE="${BUNDLE_PATH}/${BUNDLE_ID}.dsse.json" + +if [[ ! -f "${MANIFEST_FILE}" ]]; then + log_error "Manifest not found: ${MANIFEST_FILE}" + exit 1 +fi + +if [[ ! -f "${RULES_FILE}" ]]; then + log_error "Rules file not found: ${RULES_FILE}" + exit 1 +fi + +# Extract bundle version +BUNDLE_VERSION=$(jq -r '.version // "unknown"' "${MANIFEST_FILE}" 2>/dev/null || echo "unknown") +RULE_COUNT=$(jq -r '.ruleCount // 0' "${MANIFEST_FILE}" 2>/dev/null || echo "0") +SIGNER_KEY_ID=$(jq -r '.signerKeyId // "unknown"' "${MANIFEST_FILE}" 2>/dev/null || echo "unknown") + +log_info "Bundle version: ${BUNDLE_VERSION}" +log_info "Rule count: ${RULE_COUNT}" +log_info "Signer key ID: ${SIGNER_KEY_ID}" + +# Verify signature if required +if [[ "${REQUIRE_SIGNATURE}" == "true" ]]; then + log_info "Verifying bundle signature..." + + if [[ ! -f "${SIGNATURE_FILE}" ]]; then + log_error "Signature file not found: ${SIGNATURE_FILE}" + log_error "Set REQUIRE_SIGNATURE=false to skip signature verification (not recommended)" + exit 1 + fi + + # Set attestor mirror URL if provided + if [[ -n "${ATTESTOR_MIRROR}" ]]; then + export STELLA_ATTESTOR_URL="file://${ATTESTOR_MIRROR}" + log_info "Using attestor mirror: ${STELLA_ATTESTOR_URL}" + fi + + # Verify using stella CLI if available + if command -v stella &>/dev/null; then + if ! stella secrets bundle verify --bundle "${BUNDLE_PATH}" --bundle-id "${BUNDLE_ID}"; then + log_error "Bundle signature verification failed" + exit 1 + fi + log_info "Signature verification passed" + else + log_warn "stella CLI not found, performing basic signature file check only" + + # Basic check: verify signature file is valid JSON with expected structure + if ! 
jq -e '.payloadType and .payload and .signatures' "${SIGNATURE_FILE}" >/dev/null 2>&1; then + log_error "Invalid DSSE envelope structure in ${SIGNATURE_FILE}" + exit 1 + fi + + # Verify payload digest matches + EXPECTED_DIGEST=$(jq -r '.payload' "${SIGNATURE_FILE}" | base64 -d | sha256sum | cut -d' ' -f1) + ACTUAL_DIGEST=$(sha256sum "${MANIFEST_FILE}" | cut -d' ' -f1) + + if [[ "${EXPECTED_DIGEST}" != "${ACTUAL_DIGEST}" ]]; then + log_error "Payload digest mismatch" + log_error "Expected: ${EXPECTED_DIGEST}" + log_error "Actual: ${ACTUAL_DIGEST}" + exit 1 + fi + + log_warn "Basic signature structure verified (full cryptographic verification requires stella CLI)" + fi +else + log_warn "Signature verification skipped (REQUIRE_SIGNATURE=false)" +fi + +# Verify file digests listed in manifest +log_info "Verifying file digests..." +DIGEST_ERRORS=() + +while IFS= read -r file_entry; do + FILE_NAME=$(echo "${file_entry}" | jq -r '.name') + EXPECTED_DIGEST=$(echo "${file_entry}" | jq -r '.digest' | sed 's/sha256://') + FILE_PATH="${BUNDLE_PATH}/${FILE_NAME}" + + if [[ ! -f "${FILE_PATH}" ]]; then + DIGEST_ERRORS+=("File missing: ${FILE_NAME}") + continue + fi + + ACTUAL_DIGEST=$(sha256sum "${FILE_PATH}" | cut -d' ' -f1) + if [[ "${EXPECTED_DIGEST}" != "${ACTUAL_DIGEST}" ]]; then + DIGEST_ERRORS+=("Digest mismatch: ${FILE_NAME}") + fi +done < <(jq -c '.files[]' "${MANIFEST_FILE}" 2>/dev/null) + +if [[ ${#DIGEST_ERRORS[@]} -gt 0 ]]; then + log_error "File digest verification failed:" + for err in "${DIGEST_ERRORS[@]}"; do + log_error " - ${err}" + done + exit 1 +fi +log_info "File digests verified" + +# Check existing installation +if [[ -d "${INSTALL_PATH}" ]]; then + EXISTING_MANIFEST="${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" + if [[ -f "${EXISTING_MANIFEST}" ]]; then + EXISTING_VERSION=$(jq -r '.version // "unknown"' "${EXISTING_MANIFEST}" 2>/dev/null || echo "unknown") + log_info "Existing installation found: version ${EXISTING_VERSION}" + + # Version comparison (CalVer: YYYY.MM) + if [[ "${EXISTING_VERSION}" > "${BUNDLE_VERSION}" ]]; then + log_warn "Existing version (${EXISTING_VERSION}) is newer than bundle (${BUNDLE_VERSION})" + log_warn "Use FORCE_INSTALL=true to override" + if [[ "${FORCE_INSTALL:-false}" != "true" ]]; then + exit 1 + fi + fi + fi +fi + +# Create installation directory +log_info "Creating installation directory: ${INSTALL_PATH}" +mkdir -p "${INSTALL_PATH}" + +# Install bundle files +log_info "Installing bundle files..." +for file in "${BUNDLE_PATH}"/${BUNDLE_ID}.*; do + if [[ -f "${file}" ]]; then + FILE_NAME=$(basename "${file}") + echo " ${FILE_NAME}" + cp -f "${file}" "${INSTALL_PATH}/" + fi +done + +# Set permissions +log_info "Setting file permissions..." 
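+# Rule files stay group-readable only (0640); ownership is handed to the service account below when running as root.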
+chmod 640 "${INSTALL_PATH}"/${BUNDLE_ID}.* 2>/dev/null || true + +# Set ownership if running as root +if [[ "${EUID:-$(id -u)}" -eq 0 ]]; then + if id "${STELLAOPS_USER}" &>/dev/null; then + chown "${STELLAOPS_USER}:${STELLAOPS_GROUP}" "${INSTALL_PATH}"/${BUNDLE_ID}.* 2>/dev/null || true + log_info "Set ownership to ${STELLAOPS_USER}:${STELLAOPS_GROUP}" + else + log_warn "User ${STELLAOPS_USER} does not exist, skipping ownership change" + fi +else + log_info "Not running as root, skipping ownership change" +fi + +# Create installation receipt (the JSON field set below is an assumed, representative layout) +RECEIPT_FILE="${INSTALL_PATH}/.install-receipt.json" +cat > "${RECEIPT_FILE}" <<EOF +{ + "bundleId": "${BUNDLE_ID}", + "version": "${BUNDLE_VERSION}", + "ruleCount": ${RULE_COUNT}, + "installPath": "${INSTALL_PATH}", + "installedAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)", + "hostname": "$(hostname -f 2>/dev/null || hostname)" +} +EOF + +# Verify installation +INSTALLED_VERSION=$(jq -r '.version' "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" 2>/dev/null || echo "unknown") +log_info "Successfully installed secrets bundle version ${INSTALLED_VERSION}" + +echo "" +echo "Installation summary:" +echo " Bundle ID: ${BUNDLE_ID}" +echo " Version: ${INSTALLED_VERSION}" +echo " Rule count: ${RULE_COUNT}" +echo " Install path: ${INSTALL_PATH}" +echo " Receipt: ${RECEIPT_FILE}" +echo "" +echo "Next steps:" +echo " 1. Restart Scanner Worker to load the new bundle:" +echo " systemctl restart stellaops-scanner-worker" +echo "" +echo " Or with Kubernetes:" +echo " kubectl rollout restart deployment/scanner-worker -n stellaops" +echo "" +echo " 2. Verify bundle is loaded:" +echo " kubectl logs -l app=scanner-worker --tail=100 | grep SecretsAnalyzerHost" diff --git a/deploy/offline/scripts/rotate-secrets-bundle.sh b/deploy/offline/scripts/rotate-secrets-bundle.sh new file mode 100644 index 000000000..693cb0c99 --- /dev/null +++ b/deploy/offline/scripts/rotate-secrets-bundle.sh @@ -0,0 +1,299 @@ +#!/usr/bin/env bash +# ----------------------------------------------------------------------------- +# rotate-secrets-bundle.sh +# Sprint: SPRINT_20260104_005_AIRGAP (Secret Offline Kit Integration) +# Task: OKS-006 - Add bundle rotation/upgrade workflow +# Description: Safely rotate/upgrade secrets rule bundle with backup and rollback +# ----------------------------------------------------------------------------- +# Usage: ./rotate-secrets-bundle.sh <new-bundle-path> [install-path] +# Example: ./rotate-secrets-bundle.sh /mnt/offline-kit/rules/secrets/2026.02 + +set -euo pipefail + +# Configuration +NEW_BUNDLE_PATH="${1:?New bundle path required (e.g., /mnt/offline-kit/rules/secrets/2026.02)}" +INSTALL_PATH="${2:-/opt/stellaops/plugins/scanner/analyzers/secrets}" +BACKUP_BASE="${BACKUP_BASE:-/opt/stellaops/backups/secrets-bundles}" +BUNDLE_ID="${BUNDLE_ID:-secrets.ruleset}" +ATTESTOR_MIRROR="${ATTESTOR_MIRROR:-}" +RESTART_WORKERS="${RESTART_WORKERS:-true}" +KUBERNETES_NAMESPACE="${KUBERNETES_NAMESPACE:-stellaops}" +KUBERNETES_DEPLOYMENT="${KUBERNETES_DEPLOYMENT:-scanner-worker}" +MAX_BACKUPS="${MAX_BACKUPS:-5}" + +# Script directory for calling install script +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# Color output helpers +if [[ -t 1 ]]; then + RED='\033[0;31m' + GREEN='\033[0;32m' + YELLOW='\033[0;33m' + BLUE='\033[0;34m' + NC='\033[0m' +else + RED='' + GREEN='' + YELLOW='' + BLUE='' + NC='' +fi + +log_info() { echo -e "${GREEN}==>${NC} $*"; } +log_warn() { echo -e "${YELLOW}WARN:${NC} $*" >&2; } +log_error() { echo -e "${RED}ERROR:${NC} $*" >&2; } +log_step() { echo -e "${BLUE}--->${NC} $*"; } + +# Error handler +cleanup_on_error() { + log_error "Rotation failed! Attempting rollback..." 
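+    # BACKUP_DIR is only set once main() has snapshotted the previous install, so a failed fresh install aborts without attempting a rollback.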
+ if [[ -n "${BACKUP_DIR:-}" && -d "${BACKUP_DIR}" ]]; then + perform_rollback "${BACKUP_DIR}" + fi +} + +perform_rollback() { + local backup_dir="$1" + log_info "Rolling back to backup: ${backup_dir}" + + if [[ ! -d "${backup_dir}" ]]; then + log_error "Backup directory not found: ${backup_dir}" + return 1 + fi + + # Restore files + cp -a "${backup_dir}"/* "${INSTALL_PATH}/" 2>/dev/null || { + log_error "Failed to restore files from backup" + return 1 + } + + log_info "Rollback completed" + + # Restart workers after rollback + if [[ "${RESTART_WORKERS}" == "true" ]]; then + restart_workers "rollback" + fi + + return 0 +} + +restart_workers() { + local reason="${1:-upgrade}" + log_info "Restarting scanner workers (${reason})..." + + # Try Kubernetes first + if command -v kubectl &>/dev/null; then + if kubectl get deployment "${KUBERNETES_DEPLOYMENT}" -n "${KUBERNETES_NAMESPACE}" &>/dev/null; then + log_step "Performing Kubernetes rolling restart..." + kubectl rollout restart deployment/"${KUBERNETES_DEPLOYMENT}" -n "${KUBERNETES_NAMESPACE}" + log_step "Waiting for rollout to complete..." + kubectl rollout status deployment/"${KUBERNETES_DEPLOYMENT}" -n "${KUBERNETES_NAMESPACE}" --timeout=300s || { + log_warn "Rollout status check timed out (workers may still be restarting)" + } + return 0 + fi + fi + + # Try systemd + if command -v systemctl &>/dev/null; then + if systemctl is-active stellaops-scanner-worker &>/dev/null 2>&1; then + log_step "Restarting systemd service..." + systemctl restart stellaops-scanner-worker + return 0 + fi + fi + + log_warn "Could not auto-restart workers (no Kubernetes or systemd found)" + log_warn "Please restart scanner workers manually" +} + +cleanup_old_backups() { + log_info "Cleaning up old backups (keeping last ${MAX_BACKUPS})..." + + if [[ ! -d "${BACKUP_BASE}" ]]; then + return 0 + fi + + # List backups sorted by name (which includes timestamp) + local backups + backups=$(find "${BACKUP_BASE}" -maxdepth 1 -type d -name "20*" | sort -r) + local count=0 + + for backup in ${backups}; do + count=$((count + 1)) + if [[ ${count} -gt ${MAX_BACKUPS} ]]; then + log_step "Removing old backup: ${backup}" + rm -rf "${backup}" + fi + done +} + +# Main rotation logic +main() { + echo "" + log_info "Secrets Bundle Rotation" + echo "========================================" + echo "" + + # Validate new bundle + log_info "Step 1/6: Validating new bundle..." + if [[ ! -d "${NEW_BUNDLE_PATH}" ]]; then + log_error "New bundle directory not found: ${NEW_BUNDLE_PATH}" + exit 1 + fi + + NEW_MANIFEST="${NEW_BUNDLE_PATH}/${BUNDLE_ID}.manifest.json" + if [[ ! -f "${NEW_MANIFEST}" ]]; then + log_error "New bundle manifest not found: ${NEW_MANIFEST}" + exit 1 + fi + + NEW_VERSION=$(jq -r '.version // "unknown"' "${NEW_MANIFEST}" 2>/dev/null || echo "unknown") + NEW_RULE_COUNT=$(jq -r '.ruleCount // 0' "${NEW_MANIFEST}" 2>/dev/null || echo "0") + log_step "New version: ${NEW_VERSION} (${NEW_RULE_COUNT} rules)" + + # Check current installation + log_info "Step 2/6: Checking current installation..." 
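+    # "(none)" is a sentinel for fresh installs; the version comparison below is skipped in that case.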
+ CURRENT_VERSION="(none)" + CURRENT_RULE_COUNT="0" + + if [[ -f "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" ]]; then + CURRENT_VERSION=$(jq -r '.version // "unknown"' "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" 2>/dev/null || echo "unknown") + CURRENT_RULE_COUNT=$(jq -r '.ruleCount // 0' "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" 2>/dev/null || echo "0") + log_step "Current version: ${CURRENT_VERSION} (${CURRENT_RULE_COUNT} rules)" + else + log_step "No current installation found" + fi + + # Version comparison + if [[ "${CURRENT_VERSION}" != "(none)" ]]; then + if [[ "${CURRENT_VERSION}" == "${NEW_VERSION}" ]]; then + log_warn "New version (${NEW_VERSION}) is the same as current" + if [[ "${FORCE_ROTATION:-false}" != "true" ]]; then + log_warn "Use FORCE_ROTATION=true to reinstall" + exit 0 + fi + elif [[ "${CURRENT_VERSION}" > "${NEW_VERSION}" ]]; then + log_warn "New version (${NEW_VERSION}) is older than current (${CURRENT_VERSION})" + if [[ "${FORCE_ROTATION:-false}" != "true" ]]; then + log_warn "Use FORCE_ROTATION=true to downgrade" + exit 1 + fi + fi + fi + + echo "" + log_info "Upgrade: ${CURRENT_VERSION} -> ${NEW_VERSION}" + echo "" + + # Backup current installation + log_info "Step 3/6: Creating backup..." + BACKUP_DIR="${BACKUP_BASE}/$(date +%Y%m%d_%H%M%S)_${CURRENT_VERSION}" + + if [[ -d "${INSTALL_PATH}" && -f "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" ]]; then + mkdir -p "${BACKUP_DIR}" + cp -a "${INSTALL_PATH}"/* "${BACKUP_DIR}/" 2>/dev/null || { + log_error "Failed to create backup" + exit 1 + } + log_step "Backup created: ${BACKUP_DIR}" + + # Create backup metadata (assumed field layout) + cat > "${BACKUP_DIR}/.backup-metadata.json" <<EOF +{ + "bundleId": "${BUNDLE_ID}", + "version": "${CURRENT_VERSION}", + "backedUpAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)", + "hostname": "$(hostname -f 2>/dev/null || hostname)" +} +EOF + else + log_step "No existing installation to backup" + BACKUP_DIR="" + fi + + # Set up error handler for rollback + trap cleanup_on_error ERR + + # Install new bundle + log_info "Step 4/6: Installing new bundle..." + export FORCE_INSTALL=true + export REQUIRE_SIGNATURE="${REQUIRE_SIGNATURE:-true}" + + if [[ -n "${ATTESTOR_MIRROR}" ]]; then + "${SCRIPT_DIR}/install-secrets-bundle.sh" "${NEW_BUNDLE_PATH}" "${INSTALL_PATH}" "${ATTESTOR_MIRROR}" + else + "${SCRIPT_DIR}/install-secrets-bundle.sh" "${NEW_BUNDLE_PATH}" "${INSTALL_PATH}" + fi + + # Verify installation + log_info "Step 5/6: Verifying installation..." + INSTALLED_VERSION=$(jq -r '.version' "${INSTALL_PATH}/${BUNDLE_ID}.manifest.json" 2>/dev/null || echo "unknown") + + if [[ "${INSTALLED_VERSION}" != "${NEW_VERSION}" ]]; then + log_error "Installation verification failed" + log_error "Expected version: ${NEW_VERSION}" + log_error "Installed version: ${INSTALLED_VERSION}" + exit 1 + fi + log_step "Installation verified: ${INSTALLED_VERSION}" + + # Remove error trap since installation succeeded + trap - ERR + + # Restart workers + log_info "Step 6/6: Restarting workers..." + if [[ "${RESTART_WORKERS}" == "true" ]]; then + restart_workers "upgrade" + else + log_step "Worker restart skipped (RESTART_WORKERS=false)" + fi + + # Cleanup old backups + cleanup_old_backups + + # Generate rotation report (assumed field layout) + REPORT_FILE="${INSTALL_PATH}/.rotation-report.json" + cat > "${REPORT_FILE}" <<EOF +{ + "bundleId": "${BUNDLE_ID}", + "previousVersion": "${CURRENT_VERSION}", + "newVersion": "${NEW_VERSION}", + "backupPath": "${BACKUP_DIR}", + "rotatedAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)", + "hostname": "$(hostname -f 2>/dev/null || hostname)" +} +EOF + + echo "" + echo "========================================" + log_info "Rotation completed successfully!" 
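+    # Operator-facing summary; the rollback hint at the end reuses the backup taken in step 3.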
+ echo "" + echo "Summary:" + echo " Previous version: ${CURRENT_VERSION} (${CURRENT_RULE_COUNT} rules)" + echo " New version: ${NEW_VERSION} (${NEW_RULE_COUNT} rules)" + if [[ -n "${BACKUP_DIR}" ]]; then + echo " Backup path: ${BACKUP_DIR}" + fi + echo " Report: ${REPORT_FILE}" + echo "" + echo "To verify the upgrade:" + echo " kubectl logs -l app=scanner-worker --tail=100 | grep SecretsAnalyzerHost" + echo "" + echo "To rollback if needed:" + echo " $0 --rollback ${BACKUP_DIR:-/path/to/backup}" +} + +# Handle rollback command +if [[ "${1:-}" == "--rollback" ]]; then + ROLLBACK_BACKUP="${2:?Backup directory required for rollback}" + perform_rollback "${ROLLBACK_BACKUP}" + if [[ "${RESTART_WORKERS}" == "true" ]]; then + restart_workers "rollback" + fi + exit 0 +fi + +# Run main +main "$@" diff --git a/deploy/offline/templates/mirror-thin-v1.manifest.json b/deploy/offline/templates/mirror-thin-v1.manifest.json new file mode 100644 index 000000000..cfde5f290 --- /dev/null +++ b/deploy/offline/templates/mirror-thin-v1.manifest.json @@ -0,0 +1,6 @@ +{ + "created": "$CREATED", + "indexes": [], + "layers": [], + "version": "1.0.0" +} diff --git a/deploy/releases/2025.09-airgap.yaml b/deploy/releases/2025.09-airgap.yaml new file mode 100644 index 000000000..86a91bb4b --- /dev/null +++ b/deploy/releases/2025.09-airgap.yaml @@ -0,0 +1,35 @@ +release: + version: "2025.09.2-airgap" + channel: "airgap" + date: "2025-09-20T00:00:00Z" + calendar: "2025.09" + components: + - name: authority + image: registry.stella-ops.org/stellaops/authority@sha256:5551a3269b7008cd5aceecf45df018c67459ed519557ccbe48b093b926a39bcc + - name: signer + image: registry.stella-ops.org/stellaops/signer@sha256:ddbbd664a42846cea6b40fca6465bc679b30f72851158f300d01a8571c5478fc + - name: attestor + image: registry.stella-ops.org/stellaops/attestor@sha256:1ff0a3124d66d3a2702d8e421df40fbd98cc75cb605d95510598ebbae1433c50 + - name: scanner-web + image: registry.stella-ops.org/stellaops/scanner-web@sha256:3df8ca21878126758203c1a0444e39fd97f77ddacf04a69685cda9f1e5e94718 + - name: scanner-worker + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:eea5d6cfe7835950c5ec7a735a651f2f0d727d3e470cf9027a4a402ea89c4fb5 + - name: concelier + image: registry.stella-ops.org/stellaops/concelier@sha256:29e2e1a0972707e092cbd3d370701341f9fec2aa9316fb5d8100480f2a1c76b5 + - name: excititor + image: registry.stella-ops.org/stellaops/excititor@sha256:65c0ee13f773efe920d7181512349a09d363ab3f3e177d276136bd2742325a68 + - name: advisory-ai-web + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.09.2-airgap + - name: advisory-ai-worker + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.09.2-airgap + - name: web-ui + image: registry.stella-ops.org/stellaops/web-ui@sha256:bee9668011ff414572131dc777faab4da24473fe12c230893f161cabee092a1d + infrastructure: + postgres: + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + valkey: + image: docker.io/valkey/valkey@sha256:9a2cf7c980f2f28678a5e34b1c8d74e4b7b7b6c8c4d5e6f7a8b9c0d1e2f3a4b5 + rustfs: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + checksums: + releaseManifestSha256: b787b833dddd73960c31338279daa0b0a0dce2ef32bd32ef1aaf953d66135f94 diff --git a/deploy/releases/2025.09-mock-dev.yaml b/deploy/releases/2025.09-mock-dev.yaml new file mode 100644 index 000000000..60555e16d --- /dev/null +++ b/deploy/releases/2025.09-mock-dev.yaml @@ -0,0 +1,51 @@ +release: + version: 2025.09.2 + channel: stable + date: 
'2025-09-20T00:00:00Z' + calendar: '2025.09' + components: + - name: authority + image: registry.stella-ops.org/stellaops/authority@sha256:b0348bad1d0b401cc3c71cb40ba034c8043b6c8874546f90d4783c9dbfcc0bf5 + - name: signer + image: registry.stella-ops.org/stellaops/signer@sha256:8ad574e61f3a9e9bda8a58eb2700ae46813284e35a150b1137bc7c2b92ac0f2e + - name: attestor + image: registry.stella-ops.org/stellaops/attestor@sha256:0534985f978b0b5d220d73c96fddd962cd9135f616811cbe3bff4666c5af568f + - name: scanner-web + image: registry.stella-ops.org/stellaops/scanner-web@sha256:14b23448c3f9586a9156370b3e8c1991b61907efa666ca37dd3aaed1e79fe3b7 + - name: scanner-worker + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:32e25e76386eb9ea8bee0a1ad546775db9a2df989fab61ac877e351881960dab + - name: concelier + image: registry.stella-ops.org/stellaops/concelier@sha256:c58cdcaee1d266d68d498e41110a589dd204b487d37381096bd61ab345a867c5 + - name: excititor + image: registry.stella-ops.org/stellaops/excititor@sha256:59022e2016aebcef5c856d163ae705755d3f81949d41195256e935ef40a627fa + - name: advisory-ai-web + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.09.2 + - name: advisory-ai-worker + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.09.2 + - name: web-ui + image: registry.stella-ops.org/stellaops/web-ui@sha256:10d924808c48e4353e3a241da62eb7aefe727a1d6dc830eb23a8e181013b3a23 + - name: orchestrator + image: registry.stella-ops.org/stellaops/orchestrator@sha256:97f12856ce870bafd3328bda86833bcccbf56d255941d804966b5557f6610119 + - name: policy-registry + image: registry.stella-ops.org/stellaops/policy-registry@sha256:c6cad8055e9827ebcbebb6ad4d6866dce4b83a0a49b0a8a6500b736a5cb26fa7 + - name: vex-lens + image: registry.stella-ops.org/stellaops/vex-lens@sha256:b44e63ecfeebc345a70c073c1ce5ace709c58be0ffaad0e2862758aeee3092fb + - name: issuer-directory + image: registry.stella-ops.org/stellaops/issuer-directory@sha256:67e8ef02c97d3156741e857756994888f30c373ace8e84886762edba9dc51914 + - name: findings-ledger + image: registry.stella-ops.org/stellaops/findings-ledger@sha256:71d4c361ba8b2f8b69d652597bc3f2efc8a64f93fab854ce25272a88506df49c + - name: vuln-explorer-api + image: registry.stella-ops.org/stellaops/vuln-explorer-api@sha256:7fc7e43a05cbeb0106ce7d4d634612e83de6fdc119aaab754a71c1d60b82841d + - name: packs-registry + image: registry.stella-ops.org/stellaops/packs-registry@sha256:1f5e9416c4dc608594ad6fad87c24d72134427f899c192b494e22b268499c791 + - name: task-runner + image: registry.stella-ops.org/stellaops/task-runner@sha256:eb5ad992b49a41554f41516be1a6afcfa6522faf2111c08ff2b3664ad2fc954b + infrastructure: + postgres: + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + valkey: + image: docker.io/valkey/valkey@sha256:9a2cf7c980f2f28678a5e34b1c8d74e4b7b7b6c8c4d5e6f7a8b9c0d1e2f3a4b5 + rustfs: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + checksums: + releaseManifestSha256: dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7 diff --git a/deploy/releases/2025.09-stable.yaml b/deploy/releases/2025.09-stable.yaml new file mode 100644 index 000000000..c37fa20e0 --- /dev/null +++ b/deploy/releases/2025.09-stable.yaml @@ -0,0 +1,35 @@ +release: + version: "2025.09.2" + channel: "stable" + date: "2025-09-20T00:00:00Z" + calendar: "2025.09" + components: + - name: authority + image: registry.stella-ops.org/stellaops/authority@sha256:b0348bad1d0b401cc3c71cb40ba034c8043b6c8874546f90d4783c9dbfcc0bf5 + - 
name: signer + image: registry.stella-ops.org/stellaops/signer@sha256:8ad574e61f3a9e9bda8a58eb2700ae46813284e35a150b1137bc7c2b92ac0f2e + - name: attestor + image: registry.stella-ops.org/stellaops/attestor@sha256:0534985f978b0b5d220d73c96fddd962cd9135f616811cbe3bff4666c5af568f + - name: scanner-web + image: registry.stella-ops.org/stellaops/scanner-web@sha256:14b23448c3f9586a9156370b3e8c1991b61907efa666ca37dd3aaed1e79fe3b7 + - name: scanner-worker + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:32e25e76386eb9ea8bee0a1ad546775db9a2df989fab61ac877e351881960dab + - name: concelier + image: registry.stella-ops.org/stellaops/concelier@sha256:c58cdcaee1d266d68d498e41110a589dd204b487d37381096bd61ab345a867c5 + - name: excititor + image: registry.stella-ops.org/stellaops/excititor@sha256:59022e2016aebcef5c856d163ae705755d3f81949d41195256e935ef40a627fa + - name: advisory-ai-web + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.09.2 + - name: advisory-ai-worker + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.09.2 + - name: web-ui + image: registry.stella-ops.org/stellaops/web-ui@sha256:10d924808c48e4353e3a241da62eb7aefe727a1d6dc830eb23a8e181013b3a23 + infrastructure: + postgres: + image: docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + valkey: + image: docker.io/valkey/valkey@sha256:9a2cf7c980f2f28678a5e34b1c8d74e4b7b7b6c8c4d5e6f7a8b9c0d1e2f3a4b5 + rustfs: + image: registry.stella-ops.org/stellaops/rustfs:2025.09.2 + checksums: + releaseManifestSha256: dc3c8fe1ab83941c838ccc5a8a5862f7ddfa38c2078e580b5649db26554565b7 diff --git a/deploy/releases/2025.10-edge.yaml b/deploy/releases/2025.10-edge.yaml new file mode 100644 index 000000000..7e8cb0608 --- /dev/null +++ b/deploy/releases/2025.10-edge.yaml @@ -0,0 +1,37 @@ + release: + version: "2025.10.0-edge" + channel: "edge" + date: "2025-10-01T00:00:00Z" + calendar: "2025.10" + components: + - name: authority + image: registry.stella-ops.org/stellaops/authority@sha256:a8e8faec44a579aa5714e58be835f25575710430b1ad2ccd1282a018cd9ffcdd + - name: signer + image: registry.stella-ops.org/stellaops/signer@sha256:8bfef9a75783883d49fc18e3566553934e970b00ee090abee9cb110d2d5c3298 + - name: attestor + image: registry.stella-ops.org/stellaops/attestor@sha256:5cc417948c029da01dccf36e4645d961a3f6d8de7e62fe98d845f07cd2282114 + - name: issuer-directory-web + image: registry.stella-ops.org/stellaops/issuer-directory-web:2025.10.0-edge + - name: scanner-web + image: registry.stella-ops.org/stellaops/scanner-web@sha256:e0dfdb087e330585a5953029fb4757f5abdf7610820a085bd61b457dbead9a11 + - name: scanner-worker + image: registry.stella-ops.org/stellaops/scanner-worker@sha256:92dda42f6f64b2d9522104a5c9ffb61d37b34dd193132b68457a259748008f37 + - name: concelier + image: registry.stella-ops.org/stellaops/concelier@sha256:dafef3954eb4b837e2c424dd2d23e1e4d60fa83794840fac9cd3dea1d43bd085 + - name: excititor + image: registry.stella-ops.org/stellaops/excititor@sha256:d9bd5cadf1eab427447ce3df7302c30ded837239771cc6433b9befb895054285 + - name: advisory-ai-web + image: registry.stella-ops.org/stellaops/advisory-ai-web:2025.10.0-edge + - name: advisory-ai-worker + image: registry.stella-ops.org/stellaops/advisory-ai-worker:2025.10.0-edge + - name: web-ui + image: registry.stella-ops.org/stellaops/web-ui@sha256:38b225fa7767a5b94ebae4dae8696044126aac429415e93de514d5dd95748dcf + infrastructure: + postgres: + image: 
docker.io/library/postgres@sha256:8e97b8526ed19304b144f7478bc9201646acf0723cdc6e4b19bc9eb34879a27e + valkey: + image: docker.io/valkey/valkey@sha256:9a2cf7c980f2f28678a5e34b1c8d74e4b7b7b6c8c4d5e6f7a8b9c0d1e2f3a4b5 + rustfs: + image: registry.stella-ops.org/stellaops/rustfs:2025.10.0-edge + checksums: + releaseManifestSha256: 64d5b05c864bbfaeb29dad3958f4e7ff43d13393059da558ab355cebb9aba2b7 diff --git a/deploy/releases/service-versions.json b/deploy/releases/service-versions.json new file mode 100644 index 000000000..3738b3722 --- /dev/null +++ b/deploy/releases/service-versions.json @@ -0,0 +1,143 @@ +{ + "$schema": "./service-versions.schema.json", + "schemaVersion": "1.0.0", + "lastUpdated": "2025-01-01T00:00:00Z", + "registry": "git.stella-ops.org/stella-ops.org", + "services": { + "authority": { + "name": "Authority", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "attestor": { + "name": "Attestor", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "concelier": { + "name": "Concelier", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "scanner": { + "name": "Scanner", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "policy": { + "name": "Policy", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "signer": { + "name": "Signer", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "excititor": { + "name": "Excititor", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "gateway": { + "name": "Gateway", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "scheduler": { + "name": "Scheduler", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "cli": { + "name": "CLI", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "orchestrator": { + "name": "Orchestrator", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "notify": { + "name": "Notify", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "sbomservice": { + "name": "SbomService", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "vexhub": { + "name": "VexHub", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + }, + "evidencelocker": { + "name": "EvidenceLocker", + "version": "1.0.0", + "dockerTag": null, + "releasedAt": null, + "gitSha": null, + "sbomDigest": null, + "signatureDigest": null + } + } +} diff --git a/deploy/scripts/bootstrap-trust-offline.sh b/deploy/scripts/bootstrap-trust-offline.sh new file mode 100644 index 
000000000..55900c1ab --- /dev/null +++ b/deploy/scripts/bootstrap-trust-offline.sh @@ -0,0 +1,170 @@ +#!/bin/bash +# ----------------------------------------------------------------------------- +# bootstrap-trust-offline.sh +# Sprint: SPRINT_20260125_003_Attestor_trust_workflows_conformance +# Task: WORKFLOW-001 - Create bootstrap workflow script +# Description: Initialize trust for air-gapped StellaOps deployment +# ----------------------------------------------------------------------------- + +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } +log_step() { echo -e "${BLUE}[STEP]${NC} $1"; } + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Initialize trust for an air-gapped StellaOps deployment." + echo "" + echo "Arguments:" + echo " trust-bundle Path to trust bundle (tar.zst or directory)" + echo "" + echo "Options:" + echo " --key-dir DIR Directory for signing keys (default: /etc/stellaops/keys)" + echo " --reject-if-stale D Reject bundle if older than D (e.g., 7d, 24h)" + echo " --skip-keygen Skip signing key generation" + echo " --force Force import even if validation fails" + echo " -h, --help Show this help message" + echo "" + echo "Example:" + echo " $0 /media/usb/trust-bundle-2026-01-25.tar.zst" + exit 1 +} + +BUNDLE_PATH="" +KEY_DIR="/etc/stellaops/keys" +REJECT_STALE="" +SKIP_KEYGEN=false +FORCE=false + +while [[ $# -gt 0 ]]; do + case $1 in + --key-dir) KEY_DIR="$2"; shift 2 ;; + --reject-if-stale) REJECT_STALE="$2"; shift 2 ;; + --skip-keygen) SKIP_KEYGEN=true; shift ;; + --force) FORCE=true; shift ;; + -h|--help) usage ;; + -*) log_error "Unknown option: $1"; usage ;; + *) + if [[ -z "$BUNDLE_PATH" ]]; then + BUNDLE_PATH="$1" + else + log_error "Unexpected argument: $1" + usage + fi + shift + ;; + esac +done + +if [[ -z "$BUNDLE_PATH" ]]; then + log_error "Trust bundle path is required" + usage +fi + +if [[ ! -e "$BUNDLE_PATH" ]]; then + log_error "Trust bundle not found: $BUNDLE_PATH" + exit 1 +fi + +echo "" +echo "================================================" +echo " StellaOps Offline Trust Bootstrap" +echo "================================================" +echo "" +log_info "Trust Bundle: $BUNDLE_PATH" +log_info "Key Directory: $KEY_DIR" +if [[ -n "$REJECT_STALE" ]]; then + log_info "Staleness Threshold: $REJECT_STALE" +fi +echo "" + +# Step 1: Generate signing keys (if using local keys) +if [[ "$SKIP_KEYGEN" != "true" ]]; then + log_step "Step 1: Generating signing keys..." + + mkdir -p "$KEY_DIR" + chmod 700 "$KEY_DIR" + + if [[ ! -f "$KEY_DIR/signing-key.pem" ]]; then + openssl ecparam -name prime256v1 -genkey -noout -out "$KEY_DIR/signing-key.pem" + chmod 600 "$KEY_DIR/signing-key.pem" + log_info "Generated signing key: $KEY_DIR/signing-key.pem" + else + log_info "Signing key already exists: $KEY_DIR/signing-key.pem" + fi +else + log_step "Step 1: Skipping key generation (--skip-keygen)" +fi + +# Step 2: Import trust bundle +log_step "Step 2: Importing trust bundle..." + +IMPORT_ARGS="--verify-manifest" +if [[ -n "$REJECT_STALE" ]]; then + IMPORT_ARGS="$IMPORT_ARGS --reject-if-stale $REJECT_STALE" +fi +if [[ "$FORCE" == "true" ]]; then + IMPORT_ARGS="$IMPORT_ARGS --force" +fi + +stella trust import "$BUNDLE_PATH" $IMPORT_ARGS + +if [[ $? 
-ne 0 ]]; then + log_error "Failed to import trust bundle" + exit 1 +fi + +log_info "Trust bundle imported successfully" + +# Step 3: Verify trust state +log_step "Step 3: Verifying trust state..." + +stella trust status --show-keys + +if [[ $? -ne 0 ]]; then + log_error "Failed to verify trust status" + exit 1 +fi + +# Step 4: Test offline verification +log_step "Step 4: Testing offline verification capability..." + +# Check that we have TUF metadata +CACHE_DIR="${HOME}/.local/share/StellaOps/TufCache" +if [[ -f "$CACHE_DIR/root.json" ]] && [[ -f "$CACHE_DIR/timestamp.json" ]]; then + log_info "TUF metadata present" +else + log_warn "TUF metadata may be incomplete" +fi + +# Check for tiles (if snapshot included them) +if [[ -d "$CACHE_DIR/tiles" ]]; then + TILE_COUNT=$(find "$CACHE_DIR/tiles" -name "*.tile" 2>/dev/null | wc -l) + log_info "Tiles cached: $TILE_COUNT" +fi + +echo "" +echo "================================================" +echo -e "${GREEN} Offline Bootstrap Complete!${NC}" +echo "================================================" +echo "" +log_info "Trust state imported to: $CACHE_DIR" +log_info "Signing key (if generated): $KEY_DIR/signing-key.pem" +echo "" +log_info "This system can now verify attestations offline using the imported trust state." +log_warn "Remember to periodically update the trust bundle to maintain freshness." +echo "" +log_info "To update trust state:" +echo " 1. On connected system: stella trust snapshot export --out bundle.tar.zst" +echo " 2. Transfer bundle to this system" +echo " 3. Run: $0 bundle.tar.zst" +echo "" diff --git a/deploy/scripts/bootstrap-trust.sh b/deploy/scripts/bootstrap-trust.sh new file mode 100644 index 000000000..3cdb4ceb1 --- /dev/null +++ b/deploy/scripts/bootstrap-trust.sh @@ -0,0 +1,196 @@ +#!/bin/bash +# ----------------------------------------------------------------------------- +# bootstrap-trust.sh +# Sprint: SPRINT_20260125_003_Attestor_trust_workflows_conformance +# Task: WORKFLOW-001 - Create bootstrap workflow script +# Description: Initialize trust for new StellaOps deployment +# ----------------------------------------------------------------------------- + +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } +log_step() { echo -e "${BLUE}[STEP]${NC} $1"; } + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Initialize trust for a new StellaOps deployment." 
+ echo "" + echo "Options:" + echo " --tuf-url URL TUF repository URL (required)" + echo " --service-map NAME Service map target name (default: sigstore-services-v1)" + echo " --pin KEY Rekor key to pin (can specify multiple)" + echo " --key-dir DIR Directory for signing keys (default: /etc/stellaops/keys)" + echo " --skip-keygen Skip signing key generation" + echo " --skip-test Skip sign/verify test" + echo " --offline Initialize in offline mode" + echo " -h, --help Show this help message" + echo "" + echo "Example:" + echo " $0 --tuf-url https://trust.example.com/tuf/ --pin rekor-key-v1" + exit 1 +} + +TUF_URL="" +SERVICE_MAP="sigstore-services-v1" +PIN_KEYS=() +KEY_DIR="/etc/stellaops/keys" +SKIP_KEYGEN=false +SKIP_TEST=false +OFFLINE=false + +while [[ $# -gt 0 ]]; do + case $1 in + --tuf-url) TUF_URL="$2"; shift 2 ;; + --service-map) SERVICE_MAP="$2"; shift 2 ;; + --pin) PIN_KEYS+=("$2"); shift 2 ;; + --key-dir) KEY_DIR="$2"; shift 2 ;; + --skip-keygen) SKIP_KEYGEN=true; shift ;; + --skip-test) SKIP_TEST=true; shift ;; + --offline) OFFLINE=true; shift ;; + -h|--help) usage ;; + *) log_error "Unknown option: $1"; usage ;; + esac +done + +if [[ -z "$TUF_URL" ]]; then + log_error "TUF URL is required" + usage +fi + +if [[ ${#PIN_KEYS[@]} -eq 0 ]]; then + PIN_KEYS=("rekor-key-v1") +fi + +echo "" +echo "================================================" +echo " StellaOps Trust Bootstrap" +echo "================================================" +echo "" +log_info "TUF URL: $TUF_URL" +log_info "Service Map: $SERVICE_MAP" +log_info "Pinned Keys: ${PIN_KEYS[*]}" +log_info "Key Directory: $KEY_DIR" +echo "" + +# Step 1: Generate signing keys (if using local keys) +if [[ "$SKIP_KEYGEN" != "true" ]]; then + log_step "Step 1: Generating signing keys..." + + mkdir -p "$KEY_DIR" + chmod 700 "$KEY_DIR" + + if [[ ! -f "$KEY_DIR/signing-key.pem" ]]; then + stella keys generate --type ecdsa-p256 --out "$KEY_DIR/signing-key.pem" 2>/dev/null || \ + openssl ecparam -name prime256v1 -genkey -noout -out "$KEY_DIR/signing-key.pem" + + chmod 600 "$KEY_DIR/signing-key.pem" + log_info "Generated signing key: $KEY_DIR/signing-key.pem" + else + log_info "Signing key already exists: $KEY_DIR/signing-key.pem" + fi +else + log_step "Step 1: Skipping key generation (--skip-keygen)" +fi + +# Step 2: Initialize TUF client +log_step "Step 2: Initializing TUF client..." + +PIN_ARGS="" +for key in "${PIN_KEYS[@]}"; do + PIN_ARGS="$PIN_ARGS --pin $key" +done + +OFFLINE_ARG="" +if [[ "$OFFLINE" == "true" ]]; then + OFFLINE_ARG="--offline" +fi + +stella trust init \ + --tuf-url "$TUF_URL" \ + --service-map "$SERVICE_MAP" \ + $PIN_ARGS \ + $OFFLINE_ARG \ + --force + +if [[ $? -ne 0 ]]; then + log_error "Failed to initialize TUF client" + exit 1 +fi + +log_info "TUF client initialized successfully" + +# Step 3: Verify TUF metadata loaded +log_step "Step 3: Verifying TUF metadata..." + +stella trust status --show-keys --show-endpoints + +if [[ $? -ne 0 ]]; then + log_error "Failed to verify TUF status" + exit 1 +fi + +# Step 4: Test sign/verify cycle +if [[ "$SKIP_TEST" != "true" ]] && [[ "$SKIP_KEYGEN" != "true" ]]; then + log_step "Step 4: Testing sign/verify cycle..." 
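+  # Sign a throwaway payload and confirm a non-empty signature file is produced; falls back to openssl if the stella CLI cannot sign.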
+ + TEST_FILE=$(mktemp) + TEST_SIG=$(mktemp) + echo "StellaOps bootstrap test $(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$TEST_FILE" + + stella sign "$TEST_FILE" --key "$KEY_DIR/signing-key.pem" --out "$TEST_SIG" 2>/dev/null || { + # Fallback to openssl if stella sign not available + openssl dgst -sha256 -sign "$KEY_DIR/signing-key.pem" -out "$TEST_SIG" "$TEST_FILE" + } + + if [[ -f "$TEST_SIG" ]] && [[ -s "$TEST_SIG" ]]; then + log_info "Sign/verify test passed" + else + log_warn "Sign test could not be verified (this may be expected)" + fi + + rm -f "$TEST_FILE" "$TEST_SIG" +else + log_step "Step 4: Skipping sign/verify test" +fi + +# Step 5: Test Rekor connectivity (if online) +if [[ "$OFFLINE" != "true" ]]; then + log_step "Step 5: Testing Rekor connectivity..." + + REKOR_URL=$(stella trust status --output json 2>/dev/null | grep -o '"rekor_url"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | cut -d'"' -f4 || echo "") + + if [[ -n "$REKOR_URL" ]]; then + if curl -sf "${REKOR_URL}/api/v1/log" >/dev/null 2>&1; then + log_info "Rekor connectivity: OK" + else + log_warn "Rekor connectivity check failed (service may be unavailable)" + fi + else + log_warn "Could not determine Rekor URL from trust status" + fi +else + log_step "Step 5: Skipping Rekor test (offline mode)" +fi + +echo "" +echo "================================================" +echo -e "${GREEN} Bootstrap Complete!${NC}" +echo "================================================" +echo "" +log_info "Trust repository initialized at: ~/.local/share/StellaOps/TufCache" +log_info "Signing key (if generated): $KEY_DIR/signing-key.pem" +echo "" +log_info "Next steps:" +echo " 1. Configure your CI/CD to use the signing key" +echo " 2. Set up periodic 'stella trust sync' for metadata freshness" +echo " 3. For air-gap deployments, run 'stella trust export' to create bundles" +echo "" diff --git a/deploy/scripts/disaster-swap-endpoint.sh b/deploy/scripts/disaster-swap-endpoint.sh new file mode 100644 index 000000000..2b7a0b0e4 --- /dev/null +++ b/deploy/scripts/disaster-swap-endpoint.sh @@ -0,0 +1,195 @@ +#!/bin/bash +# ----------------------------------------------------------------------------- +# disaster-swap-endpoint.sh +# Sprint: SPRINT_20260125_003_Attestor_trust_workflows_conformance +# Task: WORKFLOW-003 - Create disaster endpoint swap script +# Description: Emergency endpoint swap via TUF (no client reconfiguration) +# ----------------------------------------------------------------------------- + +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } +log_step() { echo -e "${BLUE}[STEP]${NC} $1"; } + +usage() { + echo "Usage: $0 --repo --new-rekor-url [options]" + echo "" + echo "Emergency endpoint swap via TUF update." + echo "Clients will auto-discover new endpoints without reconfiguration." 
+ echo "" + echo "Options:" + echo " --repo DIR TUF repository directory (required)" + echo " --new-rekor-url URL New Rekor URL (required)" + echo " --new-fulcio-url URL New Fulcio URL (optional)" + echo " --note TEXT Note explaining the change" + echo " --version N New service map version (auto-increment if not specified)" + echo " -h, --help Show this help message" + echo "" + echo "Example:" + echo " $0 --repo /path/to/tuf \\" + echo " --new-rekor-url https://rekor-mirror.internal:8080 \\" + echo " --note 'Emergency: Production Rekor outage'" + echo "" + echo "IMPORTANT: This changes where ALL clients send requests!" + exit 1 +} + +REPO_DIR="" +NEW_REKOR_URL="" +NEW_FULCIO_URL="" +NOTE="" +VERSION="" + +while [[ $# -gt 0 ]]; do + case $1 in + --repo) REPO_DIR="$2"; shift 2 ;; + --new-rekor-url) NEW_REKOR_URL="$2"; shift 2 ;; + --new-fulcio-url) NEW_FULCIO_URL="$2"; shift 2 ;; + --note) NOTE="$2"; shift 2 ;; + --version) VERSION="$2"; shift 2 ;; + -h|--help) usage ;; + *) log_error "Unknown argument: $1"; usage ;; + esac +done + +if [[ -z "$REPO_DIR" ]] || [[ -z "$NEW_REKOR_URL" ]]; then + log_error "--repo and --new-rekor-url are required" + usage +fi + +if [[ ! -d "$REPO_DIR" ]]; then + log_error "TUF repository not found: $REPO_DIR" + exit 1 +fi + +echo "" +echo "================================================" +echo -e "${RED} EMERGENCY ENDPOINT SWAP${NC}" +echo "================================================" +echo "" +log_warn "This will redirect ALL clients to new endpoints!" +echo "" +log_info "TUF Repository: $REPO_DIR" +log_info "New Rekor URL: $NEW_REKOR_URL" +if [[ -n "$NEW_FULCIO_URL" ]]; then + log_info "New Fulcio URL: $NEW_FULCIO_URL" +fi +if [[ -n "$NOTE" ]]; then + log_info "Note: $NOTE" +fi +echo "" + +read -p "Type 'SWAP' to confirm endpoint change: " CONFIRM +if [[ "$CONFIRM" != "SWAP" ]]; then + log_error "Aborted" + exit 1 +fi + +# Find current service map +CURRENT_MAP=$(ls "$REPO_DIR/targets/" 2>/dev/null | grep -E '^sigstore-services-v[0-9]+\.json$' | sort -V | tail -1 || echo "") + +if [[ -z "$CURRENT_MAP" ]]; then + log_error "No service map found in $REPO_DIR/targets/" + exit 1 +fi + +CURRENT_PATH="$REPO_DIR/targets/$CURRENT_MAP" +log_info "Current service map: $CURRENT_MAP" + +# Determine new version +if [[ -z "$VERSION" ]]; then + CURRENT_VERSION=$(echo "$CURRENT_MAP" | grep -oE '[0-9]+' | tail -1) + VERSION=$((CURRENT_VERSION + 1)) +fi + +NEW_MAP="sigstore-services-v${VERSION}.json" +NEW_PATH="$REPO_DIR/targets/$NEW_MAP" + +log_step "Creating new service map: $NEW_MAP" + +# Read current map and update +if command -v python3 &>/dev/null; then + python3 - "$CURRENT_PATH" "$NEW_PATH" "$NEW_REKOR_URL" "$NEW_FULCIO_URL" "$NOTE" "$VERSION" << 'PYTHON_SCRIPT' +import json +import sys +from datetime import datetime + +current_path = sys.argv[1] +new_path = sys.argv[2] +new_rekor_url = sys.argv[3] +new_fulcio_url = sys.argv[4] if len(sys.argv) > 4 and sys.argv[4] else None +note = sys.argv[5] if len(sys.argv) > 5 and sys.argv[5] else None +version = int(sys.argv[6]) if len(sys.argv) > 6 else 1 + +with open(current_path) as f: + data = json.load(f) + +# Update endpoints +data['version'] = version +data['rekor']['url'] = new_rekor_url + +if new_fulcio_url and 'fulcio' in data: + data['fulcio']['url'] = new_fulcio_url + +# Update metadata +if 'metadata' not in data: + data['metadata'] = {} +data['metadata']['updated_at'] = datetime.utcnow().strftime('%Y-%m-%dT%H:%M:%SZ') +if note: + data['metadata']['note'] = note + +with open(new_path, 'w') as f: + json.dump(data, 
f, indent=2) + +print(f"Created: {new_path}") +PYTHON_SCRIPT +else + # Fallback: simple JSON creation + cat > "$NEW_PATH" << EOF +{ + "version": $VERSION, + "rekor": { + "url": "$NEW_REKOR_URL" + }, + "metadata": { + "updated_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)", + "note": "$NOTE" + } +} +EOF +fi + +log_info "New service map created: $NEW_PATH" + +# Add to targets +log_step "Adding new service map to TUF targets..." + +if [[ -x "$REPO_DIR/scripts/add-target.sh" ]]; then + "$REPO_DIR/scripts/add-target.sh" "$NEW_PATH" "$NEW_MAP" --repo "$REPO_DIR" +fi + +echo "" +echo "================================================" +echo -e "${GREEN} Endpoint Swap Prepared${NC}" +echo "================================================" +echo "" +log_warn "NEXT STEPS (REQUIRED):" +echo " 1. Review the new service map: cat $NEW_PATH" +echo " 2. Sign the updated targets.json with targets key" +echo " 3. Update snapshot.json and sign with snapshot key" +echo " 4. Update timestamp.json and sign with timestamp key" +echo " 5. Deploy updated metadata to TUF server" +echo "" +log_info "Clients will auto-discover the new endpoint within their refresh interval." +log_info "For immediate effect, clients can run: stella trust sync --force" +echo "" +log_warn "Monitor client traffic to ensure failover is working!" +echo "" diff --git a/deploy/scripts/init-config.sh b/deploy/scripts/init-config.sh new file mode 100644 index 000000000..d66d8d95d --- /dev/null +++ b/deploy/scripts/init-config.sh @@ -0,0 +1,221 @@ +#!/usr/bin/env bash +# +# Initialize StellaOps configuration from sample files +# +# Usage: +# ./deploy/scripts/init-config.sh [profile] +# +# Profiles: +# dev - Development environment (default) +# stage - Staging environment +# prod - Production environment +# airgap - Air-gapped deployment +# + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +ROOT_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)" +ETC_DIR="${ROOT_DIR}/etc" + +PROFILE="${1:-dev}" + +# Colors for output +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' # No Color + +log_info() { echo -e "${BLUE}[INFO]${NC} $*"; } +log_ok() { echo -e "${GREEN}[OK]${NC} $*"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; } +log_error() { echo -e "${RED}[ERROR]${NC} $*"; } + +# Validate profile +case "${PROFILE}" in + dev|stage|prod|airgap) + log_info "Initializing configuration for profile: ${PROFILE}" + ;; + *) + log_error "Unknown profile: ${PROFILE}" + echo "Valid profiles: dev, stage, prod, airgap" + exit 1 + ;; +esac + +# Create directory structure +create_directories() { + log_info "Creating directory structure..." + + local dirs=( + "etc/authority/plugins" + "etc/certificates/trust-roots" + "etc/certificates/signing" + "etc/concelier/sources" + "etc/crypto/profiles/cn" + "etc/crypto/profiles/eu" + "etc/crypto/profiles/kr" + "etc/crypto/profiles/ru" + "etc/crypto/profiles/us-fips" + "etc/env" + "etc/llm-providers" + "etc/notify/templates" + "etc/plugins/notify" + "etc/plugins/scanner/lang" + "etc/plugins/scanner/os" + "etc/policy/packs" + "etc/policy/schemas" + "etc/router" + "etc/scanner" + "etc/scheduler" + "etc/scm-connectors" + "etc/secrets" + "etc/signals" + "etc/vex" + ) + + for dir in "${dirs[@]}"; do + mkdir -p "${ROOT_DIR}/${dir}" + done + + log_ok "Directory structure created" +} + +# Copy sample files to active configs +copy_sample_files() { + log_info "Copying sample files..."
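+  # Copy every *.sample under etc/ to its active (non-.sample) counterpart, skipping targets that already exist.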
+ + local count=0 + + # Find all .sample files + while IFS= read -r -d '' sample_file; do + # Determine target file (remove .sample extension) + local target_file="${sample_file%.sample}" + + # Skip if target already exists + if [[ -f "${target_file}" ]]; then + log_warn "Skipping (exists): ${target_file#${ROOT_DIR}/}" + continue + fi + + cp "${sample_file}" "${target_file}" + log_ok "Created: ${target_file#${ROOT_DIR}/}" + count=$((count + 1)) + done < <(find "${ETC_DIR}" -name "*.sample" -type f -print0 2>/dev/null) + + log_info "Copied ${count} sample files" +} + +# Copy environment-specific profile +copy_env_profile() { + log_info "Setting up environment profile: ${PROFILE}" + + local env_sample="${ETC_DIR}/env/${PROFILE}.env.sample" + local env_target="${ROOT_DIR}/.env" + + if [[ -f "${env_sample}" ]]; then + if [[ -f "${env_target}" ]]; then + log_warn ".env already exists, not overwriting" + else + cp "${env_sample}" "${env_target}" + log_ok "Created .env from ${PROFILE} profile" + fi + else + log_warn "No environment sample found for profile: ${PROFILE}" + fi +} + +# Create .gitignore entries for active configs +update_gitignore() { + log_info "Updating .gitignore..." + + local gitignore="${ROOT_DIR}/.gitignore" + local entries=( + "# Active configuration files (not samples)" + "etc/**/*.yaml" + "!etc/**/*.yaml.sample" + "etc/**/*.json" + "!etc/**/*.json.sample" + "etc/**/env" + "!etc/**/env.sample" + "etc/secrets/*" + "!etc/secrets/*.sample" + "!etc/secrets/README.md" + ) + + # Check if entries already exist + if grep -q "# Active configuration files" "${gitignore}" 2>/dev/null; then + log_warn ".gitignore already contains config entries" + return + fi + + echo "" >> "${gitignore}" + for entry in "${entries[@]}"; do + echo "${entry}" >> "${gitignore}" + done + + log_ok "Updated .gitignore" +} + +# Validate the configuration +validate_config() { + log_info "Validating configuration..." + + local errors=0 + + # Check for required directories + local required_dirs=( + "etc/scanner" + "etc/authority" + "etc/policy" + ) + + for dir in "${required_dirs[@]}"; do + if [[ ! -d "${ROOT_DIR}/${dir}" ]]; then + log_error "Missing required directory: ${dir}" + errors=$((errors + 1)) + fi + done + + if [[ ${errors} -gt 0 ]]; then + log_error "Validation failed with ${errors} errors" + exit 1 + fi + + log_ok "Configuration validated" +} + +# Print summary +print_summary() { + echo "" + echo "========================================" + echo " Configuration Initialized" + echo "========================================" + echo "" + echo "Profile: ${PROFILE}" + echo "" + echo "Next steps:" + echo " 1. Review and customize configurations in etc/" + echo " 2. Set sensitive values via environment variables" + echo " 3. 
For crypto compliance, set STELLAOPS_CRYPTO_PROFILE" + echo "" + echo "Quick start:" + echo " docker compose up -d" + echo "" + echo "Documentation:" + echo " docs/operations/configuration-guide.md" + echo "" +} + +# Main +main() { + create_directories + copy_sample_files + copy_env_profile + update_gitignore + validate_config + print_summary +} + +main "$@" diff --git a/deploy/scripts/lib/ci-common.sh b/deploy/scripts/lib/ci-common.sh new file mode 100644 index 000000000..4863502ff --- /dev/null +++ b/deploy/scripts/lib/ci-common.sh @@ -0,0 +1,406 @@ +#!/usr/bin/env bash +# ============================================================================= +# CI COMMON FUNCTIONS +# ============================================================================= +# Shared utility functions for local CI testing scripts. +# +# Usage: +# source "$SCRIPT_DIR/lib/ci-common.sh" +# +# ============================================================================= + +# Prevent multiple sourcing +[[ -n "${_CI_COMMON_LOADED:-}" ]] && return +_CI_COMMON_LOADED=1 + +# ============================================================================= +# COLOR DEFINITIONS +# ============================================================================= + +if [[ -t 1 ]] && [[ -n "${TERM:-}" ]] && [[ "${TERM}" != "dumb" ]]; then + RED='\033[0;31m' + GREEN='\033[0;32m' + YELLOW='\033[0;33m' + BLUE='\033[0;34m' + MAGENTA='\033[0;35m' + CYAN='\033[0;36m' + WHITE='\033[0;37m' + BOLD='\033[1m' + DIM='\033[2m' + RESET='\033[0m' +else + RED='' + GREEN='' + YELLOW='' + BLUE='' + MAGENTA='' + CYAN='' + WHITE='' + BOLD='' + DIM='' + RESET='' +fi + +# ============================================================================= +# LOGGING FUNCTIONS +# ============================================================================= + +# Log an info message +log_info() { + echo -e "${BLUE}[INFO]${RESET} $*" +} + +# Log a success message +log_success() { + echo -e "${GREEN}[OK]${RESET} $*" +} + +# Log a warning message +log_warn() { + echo -e "${YELLOW}[WARN]${RESET} $*" >&2 +} + +# Log an error message +log_error() { + echo -e "${RED}[ERROR]${RESET} $*" >&2 +} + +# Log a debug message (only if VERBOSE is true) +log_debug() { + if [[ "${VERBOSE:-false}" == "true" ]]; then + echo -e "${DIM}[DEBUG]${RESET} $*" + fi +} + +# Log a step in a process +log_step() { + local step_num="$1" + local total_steps="$2" + local message="$3" + echo -e "${CYAN}[${step_num}/${total_steps}]${RESET} ${BOLD}${message}${RESET}" +} + +# Log a section header +log_section() { + echo "" + echo -e "${BOLD}${MAGENTA}=== $* ===${RESET}" + echo "" +} + +# Log a subsection header +log_subsection() { + echo -e "${CYAN}--- $* ---${RESET}" +} + +# ============================================================================= +# ERROR HANDLING +# ============================================================================= + +# Exit with error message +die() { + log_error "$@" + exit 1 +} + +# Check if a command exists +require_command() { + local cmd="$1" + local install_hint="${2:-}" + + if ! command -v "$cmd" &>/dev/null; then + log_error "Required command not found: $cmd" + if [[ -n "$install_hint" ]]; then + log_info "Install with: $install_hint" + fi + return 1 + fi + return 0 +} + +# Check if a file exists +require_file() { + local file="$1" + if [[ ! -f "$file" ]]; then + log_error "Required file not found: $file" + return 1 + fi + return 0 +} + +# Check if a directory exists +require_dir() { + local dir="$1" + if [[ ! 
-d "$dir" ]]; then + log_error "Required directory not found: $dir" + return 1 + fi + return 0 +} + +# ============================================================================= +# TIMING FUNCTIONS +# ============================================================================= + +# Get current timestamp in seconds +get_timestamp() { + date +%s +} + +# Format duration in human-readable format +format_duration() { + local seconds="$1" + local minutes=$((seconds / 60)) + local remaining_seconds=$((seconds % 60)) + + if [[ $minutes -gt 0 ]]; then + echo "${minutes}m ${remaining_seconds}s" + else + echo "${remaining_seconds}s" + fi +} + +# Start a timer and return the start time +start_timer() { + get_timestamp +} + +# Stop a timer and print the duration +stop_timer() { + local start_time="$1" + local label="${2:-Operation}" + local end_time + end_time=$(get_timestamp) + local duration=$((end_time - start_time)) + + log_info "$label completed in $(format_duration $duration)" +} + +# ============================================================================= +# STRING FUNCTIONS +# ============================================================================= + +# Convert string to lowercase +to_lower() { + echo "$1" | tr '[:upper:]' '[:lower:]' +} + +# Convert string to uppercase +to_upper() { + echo "$1" | tr '[:lower:]' '[:upper:]' +} + +# Trim whitespace from string +trim() { + local var="$*" + var="${var#"${var%%[![:space:]]*}"}" + var="${var%"${var##*[![:space:]]}"}" + echo -n "$var" +} + +# Join array elements with delimiter +join_by() { + local delimiter="$1" + shift + local first="$1" + shift + printf '%s' "$first" "${@/#/$delimiter}" +} + +# ============================================================================= +# ARRAY FUNCTIONS +# ============================================================================= + +# Check if array contains element +array_contains() { + local needle="$1" + shift + local element + for element in "$@"; do + [[ "$element" == "$needle" ]] && return 0 + done + return 1 +} + +# ============================================================================= +# FILE FUNCTIONS +# ============================================================================= + +# Create directory if it doesn't exist +ensure_dir() { + local dir="$1" + if [[ ! 
-d "$dir" ]]; then + mkdir -p "$dir" + log_debug "Created directory: $dir" + fi +} + +# Get absolute path +get_absolute_path() { + local path="$1" + if [[ -d "$path" ]]; then + (cd "$path" && pwd) + elif [[ -f "$path" ]]; then + local dir + dir=$(dirname "$path") + echo "$(cd "$dir" && pwd)/$(basename "$path")" + else + echo "$path" + fi +} + +# ============================================================================= +# GIT FUNCTIONS +# ============================================================================= + +# Get the repository root directory +get_repo_root() { + git rev-parse --show-toplevel 2>/dev/null +} + +# Get current branch name +get_current_branch() { + git rev-parse --abbrev-ref HEAD 2>/dev/null +} + +# Get current commit SHA +get_current_sha() { + git rev-parse HEAD 2>/dev/null +} + +# Get short commit SHA +get_short_sha() { + git rev-parse --short HEAD 2>/dev/null +} + +# Check if working directory is clean +is_git_clean() { + [[ -z "$(git status --porcelain 2>/dev/null)" ]] +} + +# Get list of changed files compared to main branch +get_changed_files() { + local base_branch="${1:-main}" + git diff --name-only "$base_branch"...HEAD 2>/dev/null +} + +# ============================================================================= +# MODULE DETECTION +# ============================================================================= + +# Map of module names to source paths +declare -A MODULE_PATHS=( + ["Scanner"]="src/Scanner src/BinaryIndex" + ["Concelier"]="src/Concelier src/Excititor" + ["Authority"]="src/Authority" + ["Policy"]="src/Policy src/RiskEngine" + ["Attestor"]="src/Attestor src/Provenance" + ["EvidenceLocker"]="src/EvidenceLocker" + ["ExportCenter"]="src/ExportCenter" + ["Findings"]="src/Findings" + ["SbomService"]="src/SbomService" + ["Notify"]="src/Notify src/Notifier" + ["Router"]="src/Router src/Gateway" + ["Cryptography"]="src/Cryptography" + ["AirGap"]="src/AirGap" + ["Cli"]="src/Cli" + ["AdvisoryAI"]="src/AdvisoryAI" + ["ReachGraph"]="src/ReachGraph" + ["Orchestrator"]="src/Orchestrator" + ["PacksRegistry"]="src/PacksRegistry" + ["Replay"]="src/Replay" + ["Aoc"]="src/Aoc" + ["IssuerDirectory"]="src/IssuerDirectory" + ["Telemetry"]="src/Telemetry" + ["Signals"]="src/Signals" + ["Web"]="src/Web" + ["DevPortal"]="src/DevPortal" +) + +# Modules that use Node.js/npm instead of .NET +declare -a NODE_MODULES=("Web" "DevPortal") + +# Detect which modules have changed based on git diff +detect_changed_modules() { + local base_branch="${1:-main}" + local changed_files + changed_files=$(get_changed_files "$base_branch") + + local changed_modules=() + local module + local paths + + for module in "${!MODULE_PATHS[@]}"; do + paths="${MODULE_PATHS[$module]}" + for path in $paths; do + if echo "$changed_files" | grep -q "^${path}/"; then + if ! 
array_contains "$module" "${changed_modules[@]}"; then + changed_modules+=("$module") + fi + break + fi + done + done + + # Check for infrastructure changes that affect all modules + if echo "$changed_files" | grep -qE "^(Directory\.Build\.props|Directory\.Packages\.props|nuget\.config)"; then + echo "ALL" + return + fi + + # Check for shared library changes + if echo "$changed_files" | grep -q "^src/__Libraries/"; then + echo "ALL" + return + fi + + if [[ ${#changed_modules[@]} -eq 0 ]]; then + echo "NONE" + else + echo "${changed_modules[*]}" + fi +} + +# ============================================================================= +# RESULT REPORTING +# ============================================================================= + +# Print a summary table row +print_table_row() { + local col1="$1" + local col2="$2" + local col3="${3:-}" + + printf " %-30s %-15s %s\n" "$col1" "$col2" "$col3" +} + +# Print pass/fail status +print_status() { + local name="$1" + local passed="$2" + local duration="${3:-}" + + if [[ "$passed" == "true" ]]; then + print_table_row "$name" "${GREEN}PASSED${RESET}" "$duration" + else + print_table_row "$name" "${RED}FAILED${RESET}" "$duration" + fi +} + +# ============================================================================= +# ENVIRONMENT LOADING +# ============================================================================= + +# Load environment file if it exists +load_env_file() { + local env_file="$1" + + if [[ -f "$env_file" ]]; then + log_debug "Loading environment from: $env_file" + set -a + # shellcheck source=/dev/null + source "$env_file" + set +a + return 0 + fi + return 1 +} diff --git a/deploy/scripts/lib/ci-docker.sh b/deploy/scripts/lib/ci-docker.sh new file mode 100644 index 000000000..4f74ee407 --- /dev/null +++ b/deploy/scripts/lib/ci-docker.sh @@ -0,0 +1,342 @@ +#!/usr/bin/env bash +# ============================================================================= +# CI DOCKER UTILITIES +# ============================================================================= +# Docker-related utility functions for local CI testing. +# +# Usage: +# source "$SCRIPT_DIR/lib/ci-docker.sh" +# +# ============================================================================= + +# Prevent multiple sourcing +[[ -n "${_CI_DOCKER_LOADED:-}" ]] && return +_CI_DOCKER_LOADED=1 + +# ============================================================================= +# CONFIGURATION +# ============================================================================= + +CI_COMPOSE_FILE="${CI_COMPOSE_FILE:-devops/compose/docker-compose.testing.yml}" +CI_IMAGE="${CI_IMAGE:-stellaops-ci:local}" +CI_DOCKERFILE="${CI_DOCKERFILE:-devops/docker/Dockerfile.ci}" +CI_PROJECT_NAME="${CI_PROJECT_NAME:-stellaops-ci}" + +# Service names from docker-compose.testing.yml +CI_SERVICES=(postgres-test valkey-test rustfs-test mock-registry) + +# ============================================================================= +# DOCKER CHECK +# ============================================================================= + +# Check if Docker is available and running +check_docker() { + if ! command -v docker &>/dev/null; then + log_error "Docker is not installed or not in PATH" + log_info "Install Docker: https://docs.docker.com/get-docker/" + return 1 + fi + + if ! 
docker info &>/dev/null; then + log_error "Docker daemon is not running" + log_info "Start Docker Desktop or run: sudo systemctl start docker" + return 1 + fi + + log_debug "Docker is available and running" + return 0 +} + +# Check if Docker Compose is available +check_docker_compose() { + if docker compose version &>/dev/null; then + DOCKER_COMPOSE="docker compose" + log_debug "Using Docker Compose plugin" + return 0 + elif command -v docker-compose &>/dev/null; then + DOCKER_COMPOSE="docker-compose" + log_debug "Using standalone docker-compose" + return 0 + else + log_error "Docker Compose is not installed" + log_info "Install with: docker compose plugin or standalone docker-compose" + return 1 + fi +} + +# ============================================================================= +# CI SERVICES MANAGEMENT +# ============================================================================= + +# Start CI services +start_ci_services() { + local services=("$@") + local compose_file="$REPO_ROOT/$CI_COMPOSE_FILE" + + if [[ ! -f "$compose_file" ]]; then + log_error "Compose file not found: $compose_file" + return 1 + fi + + check_docker || return 1 + check_docker_compose || return 1 + + log_section "Starting CI Services" + + if [[ ${#services[@]} -eq 0 ]]; then + # Start all services + log_info "Starting all CI services..." + $DOCKER_COMPOSE -f "$compose_file" -p "$CI_PROJECT_NAME" up -d + else + # Start specific services + log_info "Starting services: ${services[*]}" + $DOCKER_COMPOSE -f "$compose_file" -p "$CI_PROJECT_NAME" up -d "${services[@]}" + fi + + local result=$? + if [[ $result -ne 0 ]]; then + log_error "Failed to start CI services" + return $result + fi + + # Wait for services to be healthy + wait_for_services "${services[@]}" +} + +# Stop CI services +stop_ci_services() { + local compose_file="$REPO_ROOT/$CI_COMPOSE_FILE" + + if [[ ! -f "$compose_file" ]]; then + log_debug "Compose file not found, nothing to stop" + return 0 + fi + + check_docker_compose || return 1 + + log_section "Stopping CI Services" + + $DOCKER_COMPOSE -f "$compose_file" -p "$CI_PROJECT_NAME" down +} + +# Stop CI services and remove volumes +cleanup_ci_services() { + local compose_file="$REPO_ROOT/$CI_COMPOSE_FILE" + + if [[ ! -f "$compose_file" ]]; then + return 0 + fi + + check_docker_compose || return 1 + + log_section "Cleaning Up CI Services" + + $DOCKER_COMPOSE -f "$compose_file" -p "$CI_PROJECT_NAME" down -v --remove-orphans +} + +# Check status of CI services +check_ci_services_status() { + local compose_file="$REPO_ROOT/$CI_COMPOSE_FILE" + + check_docker_compose || return 1 + + log_subsection "CI Services Status" + $DOCKER_COMPOSE -f "$compose_file" -p "$CI_PROJECT_NAME" ps +} + +# ============================================================================= +# HEALTH CHECKS +# ============================================================================= + +# Wait for a specific service to be healthy +wait_for_service() { + local service="$1" + local timeout="${2:-60}" + local interval="${3:-2}" + + log_info "Waiting for $service to be healthy..." 
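+  # Poll the container's health status until it reports healthy or the timeout expires; containers without a healthcheck count as ready once running.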
+ + local elapsed=0 + while [[ $elapsed -lt $timeout ]]; do + local status + status=$(docker inspect --format='{{.State.Health.Status}}' "${CI_PROJECT_NAME}-${service}-1" 2>/dev/null || echo "not found") + + if [[ "$status" == "healthy" ]]; then + log_success "$service is healthy" + return 0 + elif [[ "$status" == "not found" ]]; then + # Container might not have health check, check if running + local running + running=$(docker inspect --format='{{.State.Running}}' "${CI_PROJECT_NAME}-${service}-1" 2>/dev/null || echo "false") + if [[ "$running" == "true" ]]; then + log_success "$service is running (no health check)" + return 0 + fi + fi + + sleep "$interval" + elapsed=$((elapsed + interval)) + done + + log_error "$service did not become healthy within ${timeout}s" + return 1 +} + +# Wait for multiple services to be healthy +wait_for_services() { + local services=("$@") + local failed=0 + + if [[ ${#services[@]} -eq 0 ]]; then + services=("${CI_SERVICES[@]}") + fi + + log_info "Waiting for services to be ready..." + + for service in "${services[@]}"; do + if ! wait_for_service "$service" 60 2; then + failed=1 + fi + done + + return $failed +} + +# Check if PostgreSQL is accepting connections +check_postgres_ready() { + local host="${1:-localhost}" + local port="${2:-5433}" + local user="${3:-stellaops_ci}" + local db="${4:-stellaops_test}" + + if command -v pg_isready &>/dev/null; then + pg_isready -h "$host" -p "$port" -U "$user" -d "$db" &>/dev/null + else + # Fallback to nc if pg_isready not available + nc -z "$host" "$port" &>/dev/null + fi +} + +# Check if Valkey/Redis is accepting connections +check_valkey_ready() { + local host="${1:-localhost}" + local port="${2:-6380}" + + if command -v valkey-cli &>/dev/null; then + valkey-cli -h "$host" -p "$port" ping &>/dev/null + elif command -v redis-cli &>/dev/null; then + redis-cli -h "$host" -p "$port" ping &>/dev/null + else + nc -z "$host" "$port" &>/dev/null + fi +} + +# ============================================================================= +# CI DOCKER IMAGE MANAGEMENT +# ============================================================================= + +# Check if CI image exists +ci_image_exists() { + docker image inspect "$CI_IMAGE" &>/dev/null +} + +# Build CI Docker image +build_ci_image() { + local force_rebuild="${1:-false}" + local dockerfile="$REPO_ROOT/$CI_DOCKERFILE" + + if [[ ! -f "$dockerfile" ]]; then + log_error "Dockerfile not found: $dockerfile" + return 1 + fi + + check_docker || return 1 + + if ci_image_exists && [[ "$force_rebuild" != "true" ]]; then + log_info "CI image already exists: $CI_IMAGE" + log_info "Use --rebuild to force rebuild" + return 0 + fi + + log_section "Building CI Docker Image" + log_info "Dockerfile: $dockerfile" + log_info "Image: $CI_IMAGE" + + docker build -t "$CI_IMAGE" -f "$dockerfile" "$REPO_ROOT" + + if [[ $? -ne 0 ]]; then + log_error "Failed to build CI image" + return 1 + fi + + log_success "CI image built successfully: $CI_IMAGE" +} + +# ============================================================================= +# CONTAINER EXECUTION +# ============================================================================= + +# Run a command inside the CI container +run_in_ci_container() { + local command="$*" + + check_docker || return 1 + + if ! ci_image_exists; then + log_info "CI image not found, building..." 
+ build_ci_image || return 1 + fi + + local docker_args=( + --rm + -v "$REPO_ROOT:/src" + -v "$REPO_ROOT/TestResults:/src/TestResults" + -e DOTNET_NOLOGO=1 + -e DOTNET_CLI_TELEMETRY_OPTOUT=1 + -e DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1 + -e TZ=UTC + -w /src + ) + + # Mount Docker socket for Testcontainers + if [[ -S /var/run/docker.sock ]]; then + docker_args+=(-v /var/run/docker.sock:/var/run/docker.sock) + fi + + # Load environment file if exists + local env_file="$REPO_ROOT/devops/ci-local/.env.local" + if [[ -f "$env_file" ]]; then + docker_args+=(--env-file "$env_file") + fi + + # Connect to CI network if services are running + if docker network inspect stellaops-ci-net &>/dev/null; then + docker_args+=(--network stellaops-ci-net) + fi + + log_debug "Running in CI container: $command" + docker run "${docker_args[@]}" "$CI_IMAGE" bash -c "$command" +} + +# ============================================================================= +# DOCKER NETWORK UTILITIES +# ============================================================================= + +# Get the IP address of a running container +get_container_ip() { + local container="$1" + docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$container" 2>/dev/null +} + +# Check if container is running +is_container_running() { + local container="$1" + [[ "$(docker inspect -f '{{.State.Running}}' "$container" 2>/dev/null)" == "true" ]] +} + +# Get container logs +get_container_logs() { + local container="$1" + local lines="${2:-100}" + docker logs --tail "$lines" "$container" 2>&1 +} diff --git a/deploy/scripts/lib/ci-web.sh b/deploy/scripts/lib/ci-web.sh new file mode 100644 index 000000000..a96c1409c --- /dev/null +++ b/deploy/scripts/lib/ci-web.sh @@ -0,0 +1,475 @@ +#!/usr/bin/env bash +# ============================================================================= +# CI-WEB.SH - Angular Web Testing Utilities +# ============================================================================= +# Functions for running Angular/Web frontend tests locally. +# +# Test Types: +# - Unit Tests (Karma/Jasmine) +# - E2E Tests (Playwright) +# - Accessibility Tests (Axe-core) +# - Lighthouse Audits +# - Storybook Build +# +# ============================================================================= + +# Prevent direct execution +if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then + echo "This script should be sourced, not executed directly." + exit 1 +fi + +# ============================================================================= +# CONSTANTS +# ============================================================================= + +WEB_DIR="${REPO_ROOT:-$(git rev-parse --show-toplevel)}/src/Web/StellaOps.Web" +WEB_NODE_VERSION="20" + +# Test categories for Web +WEB_TEST_CATEGORIES=( + "web:unit" # Karma unit tests + "web:e2e" # Playwright E2E + "web:a11y" # Accessibility + "web:lighthouse" # Performance/a11y audit + "web:build" # Production build + "web:storybook" # Storybook build +) + +# ============================================================================= +# DEPENDENCY CHECKS +# ============================================================================= + +check_node_version() { + if ! command -v node &>/dev/null; then + log_error "Node.js not found" + log_info "Install Node.js $WEB_NODE_VERSION+: https://nodejs.org" + return 1 + fi + + local version + version=$(node --version | sed 's/v//' | cut -d. 
-f1) + if [[ "$version" -lt "$WEB_NODE_VERSION" ]]; then + log_warn "Node.js version $version is below recommended $WEB_NODE_VERSION" + else + log_debug "Node.js version: $(node --version)" + fi + return 0 +} + +check_npm() { + if ! command -v npm &>/dev/null; then + log_error "npm not found" + return 1 + fi + log_debug "npm version: $(npm --version)" + return 0 +} + +check_web_dependencies() { + log_subsection "Checking Web Dependencies" + + check_node_version || return 1 + check_npm || return 1 + + # Check if node_modules exists + if [[ ! -d "$WEB_DIR/node_modules" ]]; then + log_warn "node_modules not found - will install dependencies" + fi + + return 0 +} + +# ============================================================================= +# SETUP +# ============================================================================= + +install_web_dependencies() { + log_subsection "Installing Web Dependencies" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + pushd "$WEB_DIR" > /dev/null || return 1 + + # Check if package-lock.json exists + if [[ -f "package-lock.json" ]]; then + log_info "Running npm ci (clean install)..." + npm ci --prefer-offline --no-audit --no-fund || { + log_error "npm ci failed" + popd > /dev/null + return 1 + } + else + log_info "Running npm install..." + npm install --no-audit --no-fund || { + log_error "npm install failed" + popd > /dev/null + return 1 + } + fi + + popd > /dev/null + log_success "Web dependencies installed" + return 0 +} + +ensure_web_dependencies() { + if [[ ! -d "$WEB_DIR/node_modules" ]]; then + install_web_dependencies || return 1 + fi + return 0 +} + +# ============================================================================= +# TEST RUNNERS +# ============================================================================= + +run_web_unit_tests() { + log_subsection "Running Web Unit Tests (Karma/Jasmine)" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + ensure_web_dependencies || return 1 + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: npm run test:ci" + popd > /dev/null + return 0 + fi + + # Run tests + npm run test:ci + local result=$? + + stop_timer "$start_time" "Web unit tests" + popd > /dev/null + + if [[ $result -eq 0 ]]; then + log_success "Web unit tests passed" + else + log_error "Web unit tests failed" + fi + + return $result +} + +run_web_e2e_tests() { + log_subsection "Running Web E2E Tests (Playwright)" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + ensure_web_dependencies || return 1 + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + # Install Playwright browsers if needed + if [[ ! -d "$HOME/.cache/ms-playwright" ]] && [[ ! -d "node_modules/.cache/ms-playwright" ]]; then + log_info "Installing Playwright browsers..." + npx playwright install --with-deps chromium || { + log_warn "Playwright browser installation failed - E2E tests may fail" + } + fi + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: npm run test:e2e" + popd > /dev/null + return 0 + fi + + # Run E2E tests + npm run test:e2e + local result=$? 
+ + stop_timer "$start_time" "Web E2E tests" + popd > /dev/null + + if [[ $result -eq 0 ]]; then + log_success "Web E2E tests passed" + else + log_error "Web E2E tests failed" + fi + + return $result +} + +run_web_a11y_tests() { + log_subsection "Running Web Accessibility Tests (Axe)" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + ensure_web_dependencies || return 1 + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: npm run test:a11y" + popd > /dev/null + return 0 + fi + + # Run accessibility tests + npm run test:a11y + local result=$? + + stop_timer "$start_time" "Web accessibility tests" + popd > /dev/null + + if [[ $result -eq 0 ]]; then + log_success "Web accessibility tests passed" + else + log_warn "Web accessibility tests had issues (non-blocking)" + fi + + # A11y tests are non-blocking by default + return 0 +} + +run_web_build() { + log_subsection "Building Web Application" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + ensure_web_dependencies || return 1 + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: npm run build -- --configuration production" + popd > /dev/null + return 0 + fi + + # Build production bundle + npm run build -- --configuration production --progress=false + local result=$? + + stop_timer "$start_time" "Web build" + popd > /dev/null + + if [[ $result -eq 0 ]]; then + log_success "Web build completed" + + # Check bundle size + if [[ -d "$WEB_DIR/dist" ]]; then + local size + size=$(du -sh "$WEB_DIR/dist" 2>/dev/null | cut -f1) + log_info "Bundle size: $size" + fi + else + log_error "Web build failed" + fi + + return $result +} + +run_web_storybook_build() { + log_subsection "Building Storybook" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + ensure_web_dependencies || return 1 + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: npm run storybook:build" + popd > /dev/null + return 0 + fi + + # Build Storybook + npm run storybook:build + local result=$? + + stop_timer "$start_time" "Storybook build" + popd > /dev/null + + if [[ $result -eq 0 ]]; then + log_success "Storybook build completed" + else + log_error "Storybook build failed" + fi + + return $result +} + +run_web_lighthouse() { + log_subsection "Running Lighthouse Audit" + + if [[ ! -d "$WEB_DIR" ]]; then + log_error "Web directory not found: $WEB_DIR" + return 1 + fi + + # Check if lighthouse is available + if ! command -v lhci &>/dev/null && ! npx lhci --version &>/dev/null 2>&1; then + log_warn "Lighthouse CI not installed - skipping audit" + log_info "Install with: npm install -g @lhci/cli" + return 0 + fi + + ensure_web_dependencies || return 1 + + # Build first if not already built + if [[ ! 
-d "$WEB_DIR/dist" ]]; then + run_web_build || return 1 + fi + + pushd "$WEB_DIR" > /dev/null || return 1 + + local start_time + start_time=$(start_timer) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would run: lhci autorun" + popd > /dev/null + return 0 + fi + + # Run Lighthouse + npx lhci autorun \ + --collect.staticDistDir=./dist/stellaops-web/browser \ + --collect.numberOfRuns=1 \ + --upload.target=filesystem \ + --upload.outputDir=./lighthouse-results 2>/dev/null || { + log_warn "Lighthouse audit had issues" + } + + stop_timer "$start_time" "Lighthouse audit" + popd > /dev/null + + log_success "Lighthouse audit completed" + return 0 +} + +# ============================================================================= +# COMPOSITE RUNNERS +# ============================================================================= + +run_web_smoke() { + log_section "Web Smoke Tests" + log_info "Running quick web validation" + + local failed=0 + + run_web_build || failed=1 + + if [[ $failed -eq 0 ]]; then + run_web_unit_tests || failed=1 + fi + + return $failed +} + +run_web_pr_gating() { + log_section "Web PR-Gating Tests" + log_info "Running full web PR-gating suite" + + local failed=0 + local results=() + + # Build + run_web_build + results+=("Build:$?") + [[ ${results[-1]##*:} -ne 0 ]] && failed=1 + + # Unit tests + if [[ $failed -eq 0 ]]; then + run_web_unit_tests + results+=("Unit:$?") + [[ ${results[-1]##*:} -ne 0 ]] && failed=1 + fi + + # E2E tests + if [[ $failed -eq 0 ]]; then + run_web_e2e_tests + results+=("E2E:$?") + [[ ${results[-1]##*:} -ne 0 ]] && failed=1 + fi + + # A11y tests (non-blocking) + run_web_a11y_tests + results+=("A11y:$?") + + # Print summary + log_section "Web Test Results" + for result in "${results[@]}"; do + local name="${result%%:*}" + local status="${result##*:}" + if [[ "$status" == "0" ]]; then + print_status "Web $name" "true" + else + print_status "Web $name" "false" + fi + done + + return $failed +} + +run_web_full() { + log_section "Full Web Test Suite" + log_info "Running all web tests including extended categories" + + local failed=0 + + # PR-gating tests + run_web_pr_gating || failed=1 + + # Extended tests + run_web_storybook_build || log_warn "Storybook build failed (non-blocking)" + run_web_lighthouse || log_warn "Lighthouse audit failed (non-blocking)" + + return $failed +} + +# ============================================================================= +# EXPORTS +# ============================================================================= + +export -f check_web_dependencies +export -f install_web_dependencies +export -f ensure_web_dependencies +export -f run_web_unit_tests +export -f run_web_e2e_tests +export -f run_web_a11y_tests +export -f run_web_build +export -f run_web_storybook_build +export -f run_web_lighthouse +export -f run_web_smoke +export -f run_web_pr_gating +export -f run_web_full diff --git a/deploy/scripts/lib/exit-codes.sh b/deploy/scripts/lib/exit-codes.sh new file mode 100644 index 000000000..20cbd5d58 --- /dev/null +++ b/deploy/scripts/lib/exit-codes.sh @@ -0,0 +1,178 @@ +#!/usr/bin/env bash +# Shared Exit Codes Registry +# Sprint: CI/CD Enhancement - Script Consolidation +# +# Purpose: Standard exit codes for all CI/CD scripts +# Usage: source "$(dirname "${BASH_SOURCE[0]}")/lib/exit-codes.sh" +# +# Exit codes follow POSIX conventions (0-125) +# 126-127 reserved for shell errors +# 128+ reserved for signal handling + +# Prevent multiple sourcing +if [[ -n "${__STELLAOPS_EXIT_CODES_LOADED:-}" ]]; then + 
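+  # Guard already set: this file has been sourced in the current shell, so
+  # return (rather than exit) to avoid terminating the script that sourced it.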
return 0 +fi +export __STELLAOPS_EXIT_CODES_LOADED=1 + +# ============================================================================ +# Standard Exit Codes +# ============================================================================ + +# Success +export EXIT_SUCCESS=0 + +# General errors (1-9) +export EXIT_ERROR=1 # Generic error +export EXIT_USAGE=2 # Invalid usage/arguments +export EXIT_CONFIG_ERROR=3 # Configuration error +export EXIT_NOT_FOUND=4 # File/resource not found +export EXIT_PERMISSION=5 # Permission denied +export EXIT_IO_ERROR=6 # I/O error +export EXIT_NETWORK_ERROR=7 # Network error +export EXIT_TIMEOUT=8 # Operation timed out +export EXIT_INTERRUPTED=9 # User interrupted (Ctrl+C) + +# Tool/dependency errors (10-19) +export EXIT_MISSING_TOOL=10 # Required tool not installed +export EXIT_TOOL_ERROR=11 # Tool execution failed +export EXIT_VERSION_MISMATCH=12 # Wrong tool version +export EXIT_DEPENDENCY_ERROR=13 # Dependency resolution failed + +# Build errors (20-29) +export EXIT_BUILD_FAILED=20 # Build compilation failed +export EXIT_RESTORE_FAILED=21 # Package restore failed +export EXIT_PUBLISH_FAILED=22 # Publish failed +export EXIT_PACKAGING_FAILED=23 # Packaging failed + +# Test errors (30-39) +export EXIT_TEST_FAILED=30 # Tests failed +export EXIT_TEST_TIMEOUT=31 # Test timed out +export EXIT_FIXTURE_ERROR=32 # Test fixture error +export EXIT_DETERMINISM_FAIL=33 # Determinism check failed + +# Deployment errors (40-49) +export EXIT_DEPLOY_FAILED=40 # Deployment failed +export EXIT_ROLLBACK_FAILED=41 # Rollback failed +export EXIT_HEALTH_CHECK_FAIL=42 # Health check failed +export EXIT_REGISTRY_ERROR=43 # Container registry error + +# Validation errors (50-59) +export EXIT_VALIDATION_FAILED=50 # General validation failed +export EXIT_SCHEMA_ERROR=51 # Schema validation failed +export EXIT_LINT_ERROR=52 # Lint check failed +export EXIT_FORMAT_ERROR=53 # Format check failed +export EXIT_LICENSE_ERROR=54 # License compliance failed + +# Security errors (60-69) +export EXIT_SECURITY_ERROR=60 # Security check failed +export EXIT_SECRETS_FOUND=61 # Secrets detected in code +export EXIT_VULN_FOUND=62 # Vulnerabilities found +export EXIT_SIGN_FAILED=63 # Signing failed +export EXIT_VERIFY_FAILED=64 # Verification failed + +# Git/VCS errors (70-79) +export EXIT_GIT_ERROR=70 # Git operation failed +export EXIT_DIRTY_WORKTREE=71 # Uncommitted changes +export EXIT_MERGE_CONFLICT=72 # Merge conflict +export EXIT_BRANCH_ERROR=73 # Branch operation failed + +# Reserved for specific tools (80-99) +export EXIT_DOTNET_ERROR=80 # .NET specific error +export EXIT_DOCKER_ERROR=81 # Docker specific error +export EXIT_HELM_ERROR=82 # Helm specific error +export EXIT_KUBECTL_ERROR=83 # kubectl specific error +export EXIT_NPM_ERROR=84 # npm specific error +export EXIT_PYTHON_ERROR=85 # Python specific error + +# Legacy compatibility +export EXIT_TOOLCHAIN=69 # Tool not found (legacy, use EXIT_MISSING_TOOL) + +# ============================================================================ +# Helper Functions +# ============================================================================ + +# Get exit code name from number +exit_code_name() { + local code="${1:-}" + + case "$code" in + 0) echo "SUCCESS" ;; + 1) echo "ERROR" ;; + 2) echo "USAGE" ;; + 3) echo "CONFIG_ERROR" ;; + 4) echo "NOT_FOUND" ;; + 5) echo "PERMISSION" ;; + 6) echo "IO_ERROR" ;; + 7) echo "NETWORK_ERROR" ;; + 8) echo "TIMEOUT" ;; + 9) echo "INTERRUPTED" ;; + 10) echo "MISSING_TOOL" ;; + 11) echo "TOOL_ERROR" ;; + 12) echo 
"VERSION_MISMATCH" ;; + 13) echo "DEPENDENCY_ERROR" ;; + 20) echo "BUILD_FAILED" ;; + 21) echo "RESTORE_FAILED" ;; + 22) echo "PUBLISH_FAILED" ;; + 23) echo "PACKAGING_FAILED" ;; + 30) echo "TEST_FAILED" ;; + 31) echo "TEST_TIMEOUT" ;; + 32) echo "FIXTURE_ERROR" ;; + 33) echo "DETERMINISM_FAIL" ;; + 40) echo "DEPLOY_FAILED" ;; + 41) echo "ROLLBACK_FAILED" ;; + 42) echo "HEALTH_CHECK_FAIL" ;; + 43) echo "REGISTRY_ERROR" ;; + 50) echo "VALIDATION_FAILED" ;; + 51) echo "SCHEMA_ERROR" ;; + 52) echo "LINT_ERROR" ;; + 53) echo "FORMAT_ERROR" ;; + 54) echo "LICENSE_ERROR" ;; + 60) echo "SECURITY_ERROR" ;; + 61) echo "SECRETS_FOUND" ;; + 62) echo "VULN_FOUND" ;; + 63) echo "SIGN_FAILED" ;; + 64) echo "VERIFY_FAILED" ;; + 69) echo "TOOLCHAIN (legacy)" ;; + 70) echo "GIT_ERROR" ;; + 71) echo "DIRTY_WORKTREE" ;; + 72) echo "MERGE_CONFLICT" ;; + 73) echo "BRANCH_ERROR" ;; + 80) echo "DOTNET_ERROR" ;; + 81) echo "DOCKER_ERROR" ;; + 82) echo "HELM_ERROR" ;; + 83) echo "KUBECTL_ERROR" ;; + 84) echo "NPM_ERROR" ;; + 85) echo "PYTHON_ERROR" ;; + 126) echo "COMMAND_NOT_EXECUTABLE" ;; + 127) echo "COMMAND_NOT_FOUND" ;; + *) + if [[ $code -ge 128 ]] && [[ $code -le 255 ]]; then + local signal=$((code - 128)) + echo "SIGNAL_${signal}" + else + echo "UNKNOWN_${code}" + fi + ;; + esac +} + +# Check if exit code indicates success +is_success() { + [[ "${1:-1}" -eq 0 ]] +} + +# Check if exit code indicates error +is_error() { + [[ "${1:-0}" -ne 0 ]] +} + +# Exit with message and code +exit_with() { + local code="${1:-1}" + shift + if [[ $# -gt 0 ]]; then + echo "$@" >&2 + fi + exit "$code" +} diff --git a/deploy/scripts/lib/git-utils.sh b/deploy/scripts/lib/git-utils.sh new file mode 100644 index 000000000..4a2249d03 --- /dev/null +++ b/deploy/scripts/lib/git-utils.sh @@ -0,0 +1,262 @@ +#!/usr/bin/env bash +# Shared Git Utilities +# Sprint: CI/CD Enhancement - Script Consolidation +# +# Purpose: Common git operations for CI/CD scripts +# Usage: source "$(dirname "${BASH_SOURCE[0]}")/lib/git-utils.sh" + +# Prevent multiple sourcing +if [[ -n "${__STELLAOPS_GIT_UTILS_LOADED:-}" ]]; then + return 0 +fi +export __STELLAOPS_GIT_UTILS_LOADED=1 + +# Source dependencies +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +source "${SCRIPT_DIR}/logging.sh" 2>/dev/null || true +source "${SCRIPT_DIR}/exit-codes.sh" 2>/dev/null || true + +# ============================================================================ +# Repository Information +# ============================================================================ + +# Get repository root directory +git_root() { + git rev-parse --show-toplevel 2>/dev/null || echo "." 
+} + +# Check if current directory is a git repository +is_git_repo() { + git rev-parse --git-dir >/dev/null 2>&1 +} + +# Get current commit SHA (full) +git_sha() { + git rev-parse HEAD 2>/dev/null +} + +# Get current commit SHA (short) +git_sha_short() { + git rev-parse --short HEAD 2>/dev/null +} + +# Get current branch name +git_branch() { + git rev-parse --abbrev-ref HEAD 2>/dev/null +} + +# Get current tag (if HEAD is tagged) +git_tag() { + git describe --tags --exact-match HEAD 2>/dev/null || echo "" +} + +# Get latest tag +git_latest_tag() { + git describe --tags --abbrev=0 2>/dev/null || echo "" +} + +# Get remote URL +git_remote_url() { + local remote="${1:-origin}" + git remote get-url "$remote" 2>/dev/null +} + +# Get repository name from remote URL +git_repo_name() { + local url + url=$(git_remote_url "${1:-origin}") + basename "$url" .git +} + +# ============================================================================ +# Commit Information +# ============================================================================ + +# Get commit message +git_commit_message() { + local sha="${1:-HEAD}" + git log -1 --format="%s" "$sha" 2>/dev/null +} + +# Get commit author +git_commit_author() { + local sha="${1:-HEAD}" + git log -1 --format="%an" "$sha" 2>/dev/null +} + +# Get commit author email +git_commit_author_email() { + local sha="${1:-HEAD}" + git log -1 --format="%ae" "$sha" 2>/dev/null +} + +# Get commit timestamp (ISO 8601) +git_commit_timestamp() { + local sha="${1:-HEAD}" + git log -1 --format="%aI" "$sha" 2>/dev/null +} + +# Get commit timestamp (Unix epoch) +git_commit_epoch() { + local sha="${1:-HEAD}" + git log -1 --format="%at" "$sha" 2>/dev/null +} + +# ============================================================================ +# Working Tree State +# ============================================================================ + +# Check if working tree is clean +git_is_clean() { + [[ -z "$(git status --porcelain 2>/dev/null)" ]] +} + +# Check if working tree is dirty +git_is_dirty() { + ! 
git_is_clean +} + +# Get list of changed files +git_changed_files() { + git status --porcelain 2>/dev/null | awk '{print $2}' +} + +# Get list of staged files +git_staged_files() { + git diff --cached --name-only 2>/dev/null +} + +# Get list of untracked files +git_untracked_files() { + git ls-files --others --exclude-standard 2>/dev/null +} + +# ============================================================================ +# Diff and History +# ============================================================================ + +# Get files changed between two refs +git_diff_files() { + local from="${1:-HEAD~1}" + local to="${2:-HEAD}" + git diff --name-only "$from" "$to" 2>/dev/null +} + +# Get files changed in last N commits +git_recent_files() { + local count="${1:-1}" + git diff --name-only "HEAD~${count}" HEAD 2>/dev/null +} + +# Check if file was changed between two refs +git_file_changed() { + local file="$1" + local from="${2:-HEAD~1}" + local to="${3:-HEAD}" + git diff --name-only "$from" "$to" -- "$file" 2>/dev/null | grep -q "$file" +} + +# Get commits between two refs +git_commits_between() { + local from="${1:-HEAD~10}" + local to="${2:-HEAD}" + git log --oneline "$from".."$to" 2>/dev/null +} + +# ============================================================================ +# Tag Operations +# ============================================================================ + +# Create a tag +git_create_tag() { + local tag="$1" + local message="${2:-}" + + if [[ -n "$message" ]]; then + git tag -a "$tag" -m "$message" + else + git tag "$tag" + fi +} + +# Delete a tag +git_delete_tag() { + local tag="$1" + git tag -d "$tag" 2>/dev/null +} + +# Push tag to remote +git_push_tag() { + local tag="$1" + local remote="${2:-origin}" + git push "$remote" "$tag" +} + +# List tags matching pattern +git_list_tags() { + local pattern="${1:-*}" + git tag -l "$pattern" 2>/dev/null +} + +# ============================================================================ +# Branch Operations +# ============================================================================ + +# Check if branch exists +git_branch_exists() { + local branch="$1" + git show-ref --verify --quiet "refs/heads/$branch" 2>/dev/null +} + +# Check if remote branch exists +git_remote_branch_exists() { + local branch="$1" + local remote="${2:-origin}" + git show-ref --verify --quiet "refs/remotes/$remote/$branch" 2>/dev/null +} + +# Get default branch +git_default_branch() { + local remote="${1:-origin}" + git remote show "$remote" 2>/dev/null | grep "HEAD branch" | awk '{print $NF}' +} + +# ============================================================================ +# CI/CD Helpers +# ============================================================================ + +# Get version string for CI builds +git_ci_version() { + local tag + tag=$(git_tag) + + if [[ -n "$tag" ]]; then + echo "$tag" + else + local branch sha + branch=$(git_branch | tr '/' '-') + sha=$(git_sha_short) + echo "${branch}-${sha}" + fi +} + +# Check if current commit is on default branch +git_is_default_branch() { + local current default + current=$(git_branch) + default=$(git_default_branch) + [[ "$current" == "$default" ]] +} + +# Check if running in CI environment +git_is_ci() { + [[ -n "${CI:-}" ]] || [[ -n "${GITHUB_ACTIONS:-}" ]] || [[ -n "${GITLAB_CI:-}" ]] +} + +# Ensure clean worktree or fail +git_require_clean() { + if git_is_dirty; then + log_error "Working tree is dirty. Commit or stash changes first." 
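+        # exit-codes.sh is sourced best-effort (|| true) above, so fall back
+        # to 71 (DIRTY_WORKTREE) when the registry variable is unset.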
+ return "${EXIT_DIRTY_WORKTREE:-71}" + fi +} diff --git a/deploy/scripts/lib/hash-utils.sh b/deploy/scripts/lib/hash-utils.sh new file mode 100644 index 000000000..ade90039b --- /dev/null +++ b/deploy/scripts/lib/hash-utils.sh @@ -0,0 +1,266 @@ +#!/usr/bin/env bash +# Shared Hash/Checksum Utilities +# Sprint: CI/CD Enhancement - Script Consolidation +# +# Purpose: Cryptographic hash and checksum operations for CI/CD scripts +# Usage: source "$(dirname "${BASH_SOURCE[0]}")/lib/hash-utils.sh" + +# Prevent multiple sourcing +if [[ -n "${__STELLAOPS_HASH_UTILS_LOADED:-}" ]]; then + return 0 +fi +export __STELLAOPS_HASH_UTILS_LOADED=1 + +# Source dependencies +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +source "${SCRIPT_DIR}/logging.sh" 2>/dev/null || true +source "${SCRIPT_DIR}/exit-codes.sh" 2>/dev/null || true + +# ============================================================================ +# Hash Computation +# ============================================================================ + +# Compute SHA-256 hash of a file +compute_sha256() { + local file="$1" + + if [[ ! -f "$file" ]]; then + log_error "File not found: $file" + return "${EXIT_NOT_FOUND:-4}" + fi + + if command -v sha256sum >/dev/null 2>&1; then + sha256sum "$file" | awk '{print $1}' + elif command -v shasum >/dev/null 2>&1; then + shasum -a 256 "$file" | awk '{print $1}' + elif command -v openssl >/dev/null 2>&1; then + openssl dgst -sha256 "$file" | awk '{print $NF}' + else + log_error "No SHA-256 tool available" + return "${EXIT_MISSING_TOOL:-10}" + fi +} + +# Compute SHA-512 hash of a file +compute_sha512() { + local file="$1" + + if [[ ! -f "$file" ]]; then + log_error "File not found: $file" + return "${EXIT_NOT_FOUND:-4}" + fi + + if command -v sha512sum >/dev/null 2>&1; then + sha512sum "$file" | awk '{print $1}' + elif command -v shasum >/dev/null 2>&1; then + shasum -a 512 "$file" | awk '{print $1}' + elif command -v openssl >/dev/null 2>&1; then + openssl dgst -sha512 "$file" | awk '{print $NF}' + else + log_error "No SHA-512 tool available" + return "${EXIT_MISSING_TOOL:-10}" + fi +} + +# Compute MD5 hash of a file (for compatibility, not security) +compute_md5() { + local file="$1" + + if [[ ! 
-f "$file" ]]; then + log_error "File not found: $file" + return "${EXIT_NOT_FOUND:-4}" + fi + + if command -v md5sum >/dev/null 2>&1; then + md5sum "$file" | awk '{print $1}' + elif command -v md5 >/dev/null 2>&1; then + md5 -q "$file" + elif command -v openssl >/dev/null 2>&1; then + openssl dgst -md5 "$file" | awk '{print $NF}' + else + log_error "No MD5 tool available" + return "${EXIT_MISSING_TOOL:-10}" + fi +} + +# Compute hash of string +compute_string_hash() { + local string="$1" + local algorithm="${2:-sha256}" + + case "$algorithm" in + sha256) + echo -n "$string" | sha256sum 2>/dev/null | awk '{print $1}' || \ + echo -n "$string" | shasum -a 256 2>/dev/null | awk '{print $1}' + ;; + sha512) + echo -n "$string" | sha512sum 2>/dev/null | awk '{print $1}' || \ + echo -n "$string" | shasum -a 512 2>/dev/null | awk '{print $1}' + ;; + md5) + echo -n "$string" | md5sum 2>/dev/null | awk '{print $1}' || \ + echo -n "$string" | md5 2>/dev/null + ;; + *) + log_error "Unknown algorithm: $algorithm" + return "${EXIT_USAGE:-2}" + ;; + esac +} + +# ============================================================================ +# Checksum Files +# ============================================================================ + +# Write checksum file for a single file +write_checksum() { + local file="$1" + local checksum_file="${2:-${file}.sha256}" + local algorithm="${3:-sha256}" + + local hash + case "$algorithm" in + sha256) hash=$(compute_sha256 "$file") ;; + sha512) hash=$(compute_sha512 "$file") ;; + md5) hash=$(compute_md5 "$file") ;; + *) + log_error "Unknown algorithm: $algorithm" + return "${EXIT_USAGE:-2}" + ;; + esac + + if [[ -z "$hash" ]]; then + return "${EXIT_ERROR:-1}" + fi + + local basename + basename=$(basename "$file") + echo "$hash $basename" > "$checksum_file" + log_debug "Wrote checksum to $checksum_file" +} + +# Write checksums for multiple files +write_checksums() { + local output_file="$1" + shift + local files=("$@") + + : > "$output_file" + + for file in "${files[@]}"; do + if [[ -f "$file" ]]; then + local hash basename + hash=$(compute_sha256 "$file") + basename=$(basename "$file") + echo "$hash $basename" >> "$output_file" + fi + done + + log_debug "Wrote checksums to $output_file" +} + +# ============================================================================ +# Checksum Verification +# ============================================================================ + +# Verify checksum of a file +verify_checksum() { + local file="$1" + local expected_hash="$2" + local algorithm="${3:-sha256}" + + local actual_hash + case "$algorithm" in + sha256) actual_hash=$(compute_sha256 "$file") ;; + sha512) actual_hash=$(compute_sha512 "$file") ;; + md5) actual_hash=$(compute_md5 "$file") ;; + *) + log_error "Unknown algorithm: $algorithm" + return "${EXIT_USAGE:-2}" + ;; + esac + + if [[ "$actual_hash" == "$expected_hash" ]]; then + log_debug "Checksum verified: $file" + return 0 + else + log_error "Checksum mismatch for $file" + log_error " Expected: $expected_hash" + log_error " Actual: $actual_hash" + return "${EXIT_VERIFY_FAILED:-64}" + fi +} + +# Verify checksums from file (sha256sum -c style) +verify_checksums_file() { + local checksum_file="$1" + local base_dir="${2:-.}" + + if [[ ! 
-f "$checksum_file" ]]; then + log_error "Checksum file not found: $checksum_file" + return "${EXIT_NOT_FOUND:-4}" + fi + + local failures=0 + + while IFS= read -r line; do + # Skip empty lines and comments + [[ -z "$line" ]] && continue + [[ "$line" == \#* ]] && continue + + local hash filename + hash=$(echo "$line" | awk '{print $1}') + filename=$(echo "$line" | awk '{print $2}') + + local filepath="${base_dir}/${filename}" + + if [[ ! -f "$filepath" ]]; then + log_error "File not found: $filepath" + ((failures++)) + continue + fi + + if ! verify_checksum "$filepath" "$hash"; then + ((failures++)) + fi + done < "$checksum_file" + + if [[ $failures -gt 0 ]]; then + log_error "$failures checksum verification(s) failed" + return "${EXIT_VERIFY_FAILED:-64}" + fi + + log_info "All checksums verified" + return 0 +} + +# ============================================================================ +# Helpers +# ============================================================================ + +# Check if two files have the same content +files_identical() { + local file1="$1" + local file2="$2" + + [[ -f "$file1" ]] && [[ -f "$file2" ]] || return 1 + + local hash1 hash2 + hash1=$(compute_sha256 "$file1") + hash2=$(compute_sha256 "$file2") + + [[ "$hash1" == "$hash2" ]] +} + +# Get short hash for display +short_hash() { + local hash="$1" + local length="${2:-8}" + echo "${hash:0:$length}" +} + +# Generate deterministic ID from inputs +generate_id() { + local inputs="$*" + compute_string_hash "$inputs" sha256 | head -c 16 +} diff --git a/deploy/scripts/lib/logging.sh b/deploy/scripts/lib/logging.sh new file mode 100644 index 000000000..4e363d6f8 --- /dev/null +++ b/deploy/scripts/lib/logging.sh @@ -0,0 +1,181 @@ +#!/usr/bin/env bash +# Shared Logging Library +# Sprint: CI/CD Enhancement - Script Consolidation +# +# Purpose: Standard logging functions for all CI/CD scripts +# Usage: source "$(dirname "${BASH_SOURCE[0]}")/lib/logging.sh" +# +# Log Levels: DEBUG, INFO, WARN, ERROR +# Set LOG_LEVEL environment variable to control verbosity (default: INFO) + +# Prevent multiple sourcing +if [[ -n "${__STELLAOPS_LOGGING_LOADED:-}" ]]; then + return 0 +fi +export __STELLAOPS_LOGGING_LOADED=1 + +# Colors (disable with NO_COLOR=1) +if [[ -z "${NO_COLOR:-}" ]] && [[ -t 1 ]]; then + export LOG_COLOR_RED='\033[0;31m' + export LOG_COLOR_GREEN='\033[0;32m' + export LOG_COLOR_YELLOW='\033[1;33m' + export LOG_COLOR_BLUE='\033[0;34m' + export LOG_COLOR_MAGENTA='\033[0;35m' + export LOG_COLOR_CYAN='\033[0;36m' + export LOG_COLOR_GRAY='\033[0;90m' + export LOG_COLOR_RESET='\033[0m' +else + export LOG_COLOR_RED='' + export LOG_COLOR_GREEN='' + export LOG_COLOR_YELLOW='' + export LOG_COLOR_BLUE='' + export LOG_COLOR_MAGENTA='' + export LOG_COLOR_CYAN='' + export LOG_COLOR_GRAY='' + export LOG_COLOR_RESET='' +fi + +# Log level configuration +export LOG_LEVEL="${LOG_LEVEL:-INFO}" + +# Convert log level to numeric for comparison +_log_level_to_num() { + case "$1" in + DEBUG) echo 0 ;; + INFO) echo 1 ;; + WARN) echo 2 ;; + ERROR) echo 3 ;; + *) echo 1 ;; + esac +} + +# Check if message should be logged based on level +_should_log() { + local msg_level="$1" + local current_level="${LOG_LEVEL:-INFO}" + + local msg_num current_num + msg_num=$(_log_level_to_num "$msg_level") + current_num=$(_log_level_to_num "$current_level") + + [[ $msg_num -ge $current_num ]] +} + +# Format timestamp +_log_timestamp() { + if [[ "${LOG_TIMESTAMPS:-true}" == "true" ]]; then + date -u +"%Y-%m-%dT%H:%M:%SZ" + fi +} + +# Core logging function +_log() 
{ + local level="$1" + local color="$2" + shift 2 + + if ! _should_log "$level"; then + return 0 + fi + + local timestamp + timestamp=$(_log_timestamp) + + local prefix="" + if [[ -n "$timestamp" ]]; then + prefix="${LOG_COLOR_GRAY}${timestamp}${LOG_COLOR_RESET} " + fi + + echo -e "${prefix}${color}[${level}]${LOG_COLOR_RESET} $*" +} + +# Public logging functions +log_debug() { + _log "DEBUG" "${LOG_COLOR_GRAY}" "$@" +} + +log_info() { + _log "INFO" "${LOG_COLOR_GREEN}" "$@" +} + +log_warn() { + _log "WARN" "${LOG_COLOR_YELLOW}" "$@" +} + +log_error() { + _log "ERROR" "${LOG_COLOR_RED}" "$@" >&2 +} + +# Step logging (for workflow stages) +log_step() { + _log "STEP" "${LOG_COLOR_BLUE}" "$@" +} + +# Success message +log_success() { + _log "OK" "${LOG_COLOR_GREEN}" "$@" +} + +# GitHub Actions annotations +log_gh_notice() { + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::notice::$*" + else + log_info "$@" + fi +} + +log_gh_warning() { + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::warning::$*" + else + log_warn "$@" + fi +} + +log_gh_error() { + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::error::$*" + else + log_error "$@" + fi +} + +# Group logging (for GitHub Actions) +log_group_start() { + local title="$1" + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::group::$title" + else + log_step "=== $title ===" + fi +} + +log_group_end() { + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::endgroup::" + fi +} + +# Masked logging (for secrets) +log_masked() { + local value="$1" + if [[ -n "${GITHUB_ACTIONS:-}" ]]; then + echo "::add-mask::$value" + fi +} + +# Die with error message +die() { + log_error "$@" + exit 1 +} + +# Conditional die +die_if() { + local condition="$1" + shift + if eval "$condition"; then + die "$@" + fi +} diff --git a/deploy/scripts/lib/path-utils.sh b/deploy/scripts/lib/path-utils.sh new file mode 100644 index 000000000..0298073da --- /dev/null +++ b/deploy/scripts/lib/path-utils.sh @@ -0,0 +1,274 @@ +#!/usr/bin/env bash +# Shared Path Utilities +# Sprint: CI/CD Enhancement - Script Consolidation +# +# Purpose: Path manipulation and file operations for CI/CD scripts +# Usage: source "$(dirname "${BASH_SOURCE[0]}")/lib/path-utils.sh" + +# Prevent multiple sourcing +if [[ -n "${__STELLAOPS_PATH_UTILS_LOADED:-}" ]]; then + return 0 +fi +export __STELLAOPS_PATH_UTILS_LOADED=1 + +# Source dependencies +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +source "${SCRIPT_DIR}/logging.sh" 2>/dev/null || true +source "${SCRIPT_DIR}/exit-codes.sh" 2>/dev/null || true + +# ============================================================================ +# Path Normalization +# ============================================================================ + +# Normalize path (resolve .., ., symlinks) +normalize_path() { + local path="$1" + + # Handle empty path + if [[ -z "$path" ]]; then + echo "." 
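+        # An empty argument normalizes to the current directory rather than
+        # raising an error, so callers may pass optional path arguments as-is.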
+ return 0 + fi + + # Try realpath first (most reliable) + if command -v realpath >/dev/null 2>&1; then + realpath -m "$path" 2>/dev/null && return 0 + fi + + # Fallback to Python + if command -v python3 >/dev/null 2>&1; then + python3 -c "import os; print(os.path.normpath('$path'))" 2>/dev/null && return 0 + fi + + # Manual normalization (basic) + echo "$path" | sed 's|/\./|/|g' | sed 's|/[^/]*/\.\./|/|g' | sed 's|//|/|g' +} + +# Get absolute path +absolute_path() { + local path="$1" + + if [[ "$path" == /* ]]; then + normalize_path "$path" + else + normalize_path "$(pwd)/$path" + fi +} + +# Get relative path from one path to another +relative_path() { + local from="$1" + local to="$2" + + if command -v realpath >/dev/null 2>&1; then + realpath --relative-to="$from" "$to" 2>/dev/null && return 0 + fi + + if command -v python3 >/dev/null 2>&1; then + python3 -c "import os.path; print(os.path.relpath('$to', '$from'))" 2>/dev/null && return 0 + fi + + # Fallback: just return absolute path + absolute_path "$to" +} + +# ============================================================================ +# Path Components +# ============================================================================ + +# Get directory name +dir_name() { + dirname "$1" +} + +# Get base name +base_name() { + basename "$1" +} + +# Get file extension +file_extension() { + local path="$1" + local base + base=$(basename "$path") + + if [[ "$base" == *.* ]]; then + echo "${base##*.}" + else + echo "" + fi +} + +# Get file name without extension +file_stem() { + local path="$1" + local base + base=$(basename "$path") + + if [[ "$base" == *.* ]]; then + echo "${base%.*}" + else + echo "$base" + fi +} + +# ============================================================================ +# Directory Operations +# ============================================================================ + +# Ensure directory exists +ensure_directory() { + local dir="$1" + if [[ ! 
-d "$dir" ]]; then + mkdir -p "$dir" + fi +} + +# Create temporary directory +create_temp_dir() { + local prefix="${1:-stellaops}" + mktemp -d "${TMPDIR:-/tmp}/${prefix}.XXXXXX" +} + +# Create temporary file +create_temp_file() { + local prefix="${1:-stellaops}" + local suffix="${2:-}" + mktemp "${TMPDIR:-/tmp}/${prefix}.XXXXXX${suffix}" +} + +# Clean temporary directory +clean_temp() { + local path="$1" + if [[ -d "$path" ]] && [[ "$path" == *stellaops* ]]; then + rm -rf "$path" + fi +} + +# ============================================================================ +# File Existence Checks +# ============================================================================ + +# Check if file exists +file_exists() { + [[ -f "$1" ]] +} + +# Check if directory exists +dir_exists() { + [[ -d "$1" ]] +} + +# Check if path exists (file or directory) +path_exists() { + [[ -e "$1" ]] +} + +# Check if file is readable +file_readable() { + [[ -r "$1" ]] +} + +# Check if file is writable +file_writable() { + [[ -w "$1" ]] +} + +# Check if file is executable +file_executable() { + [[ -x "$1" ]] +} + +# ============================================================================ +# File Discovery +# ============================================================================ + +# Find files by pattern +find_files() { + local dir="${1:-.}" + local pattern="${2:-*}" + find "$dir" -type f -name "$pattern" 2>/dev/null +} + +# Find files by extension +find_by_extension() { + local dir="${1:-.}" + local ext="${2:-}" + find "$dir" -type f -name "*.${ext}" 2>/dev/null +} + +# Find project files (csproj, package.json, etc.) +find_project_files() { + local dir="${1:-.}" + find "$dir" -type f \( \ + -name "*.csproj" -o \ + -name "*.fsproj" -o \ + -name "package.json" -o \ + -name "Cargo.toml" -o \ + -name "go.mod" -o \ + -name "pom.xml" -o \ + -name "build.gradle" \ + \) 2>/dev/null | grep -v node_modules | grep -v bin | grep -v obj +} + +# Find test projects +find_test_projects() { + local dir="${1:-.}" + find "$dir" -type f -name "*.Tests.csproj" 2>/dev/null | grep -v bin | grep -v obj +} + +# ============================================================================ +# Path Validation +# ============================================================================ + +# Check if path is under directory +path_under() { + local path="$1" + local dir="$2" + + local abs_path abs_dir + abs_path=$(absolute_path "$path") + abs_dir=$(absolute_path "$dir") + + [[ "$abs_path" == "$abs_dir"* ]] +} + +# Validate path is safe (no directory traversal) +path_is_safe() { + local path="$1" + local base="${2:-.}" + + # Check for obvious traversal attempts + if [[ "$path" == *".."* ]] || [[ "$path" == "/*" ]]; then + return 1 + fi + + # Verify resolved path is under base + path_under "$path" "$base" +} + +# ============================================================================ +# CI/CD Helpers +# ============================================================================ + +# Get artifact output directory +get_artifact_dir() { + local name="${1:-artifacts}" + local base="${GITHUB_WORKSPACE:-$(pwd)}" + echo "${base}/out/${name}" +} + +# Get test results directory +get_test_results_dir() { + local base="${GITHUB_WORKSPACE:-$(pwd)}" + echo "${base}/TestResults" +} + +# Ensure artifact directory exists and return path +ensure_artifact_dir() { + local name="${1:-artifacts}" + local dir + dir=$(get_artifact_dir "$name") + ensure_directory "$dir" + echo "$dir" +} diff --git a/deploy/scripts/local-ci.sh b/deploy/scripts/local-ci.sh 
new file mode 100644 index 000000000..75cfa981d --- /dev/null +++ b/deploy/scripts/local-ci.sh @@ -0,0 +1,1050 @@ +#!/usr/bin/env bash +# ============================================================================= +# LOCAL CI RUNNER +# ============================================================================= +# Unified local CI/CD testing runner for StellaOps. +# +# Usage: +# ./devops/scripts/local-ci.sh [mode] [options] +# +# Modes: +# smoke - Quick smoke test (unit tests only, ~2 min) +# pr - Full PR-gating suite (all required checks, ~15 min) +# module - Module-specific tests (auto-detect or specified) +# workflow - Simulate specific workflow via act +# release - Release simulation (dry-run) +# full - All tests including extended categories (~45 min) +# +# Options: +# --category Run specific test category +# --workflow Specific workflow to simulate +# --module Specific module to test +# --smoke-step Smoke step: build, unit, unit-split (smoke mode only) +# --test-timeout Per-test timeout (e.g., 5m). Adds --blame-hang timeout. +# --progress-interval Progress interval in seconds for long tests +# --project-start Start index (1-based) for unit-split slicing +# --project-count Limit number of projects for unit-split slicing +# --docker Force Docker execution +# --native Force native execution +# --act Force act execution +# --parallel Parallel test runners (default: CPU count) +# --verbose Verbose output +# --dry-run Show what would run without executing +# --rebuild Force rebuild of CI Docker image +# --no-services Skip starting CI services +# --keep-services Don't stop services after tests +# --help Show this help message +# +# Examples: +# ./local-ci.sh smoke # Quick validation +# ./local-ci.sh pr # Full PR check +# ./local-ci.sh module --module Scanner # Test Scanner module +# ./local-ci.sh workflow --workflow test-matrix +# ./local-ci.sh release --dry-run +# +# ============================================================================= + +set -euo pipefail + +# ============================================================================= +# SCRIPT INITIALIZATION +# ============================================================================= + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." 
&& pwd)" +export REPO_ROOT + +# Source libraries +source "$SCRIPT_DIR/lib/ci-common.sh" +source "$SCRIPT_DIR/lib/ci-docker.sh" +source "$SCRIPT_DIR/lib/ci-web.sh" 2>/dev/null || true # Web testing utilities + +# ============================================================================= +# CONSTANTS +# ============================================================================= + +# Modes +MODE_SMOKE="smoke" +MODE_PR="pr" +MODE_MODULE="module" +MODE_WORKFLOW="workflow" +MODE_RELEASE="release" +MODE_FULL="full" + +# Test categories +PR_GATING_CATEGORIES=(Unit Architecture Contract Integration Security Golden) +EXTENDED_CATEGORIES=(Performance Benchmark AirGap Chaos Determinism Resilience Observability) +ALL_CATEGORIES=("${PR_GATING_CATEGORIES[@]}" "${EXTENDED_CATEGORIES[@]}") + +# Default configuration +RESULTS_DIR="$REPO_ROOT/out/local-ci" +TRX_DIR="$RESULTS_DIR/trx" +LOGS_DIR="$RESULTS_DIR/logs" +ACTIVE_TEST_FILE="$RESULTS_DIR/active-test.txt" + +# ============================================================================= +# CONFIGURATION +# ============================================================================= + +MODE="" +EXECUTION_ENGINE="" # docker, native, act +SPECIFIC_CATEGORY="" +SPECIFIC_MODULE="" +SPECIFIC_WORKFLOW="" +SMOKE_STEP="" +TEST_TIMEOUT="" +PROGRESS_INTERVAL="" +PROJECT_START="" +PROJECT_COUNT="" +PARALLEL_JOBS="" +VERBOSE=false +DRY_RUN=false +REBUILD_IMAGE=false +SKIP_SERVICES=false +KEEP_SERVICES=false + +# ============================================================================= +# USAGE +# ============================================================================= + +usage() { + cat < Run specific test category (${ALL_CATEGORIES[*]}) + --workflow Specific workflow to simulate (for workflow mode) + --module Specific module to test (for module mode) + --smoke-step Smoke step (smoke mode only): build, unit, unit-split + --test-timeout Per-test timeout (e.g., 5m) using --blame-hang + --progress-interval Progress heartbeat in seconds + --project-start Start index (1-based) for unit-split slicing + --project-count Limit number of projects for unit-split slicing + --docker Force Docker execution + --native Force native execution + --act Force act execution + --parallel Parallel test runners (default: auto-detect) + --verbose Verbose output + --dry-run Show what would run without executing + --rebuild Force rebuild of CI Docker image + --no-services Skip starting CI services + --keep-services Don't stop services after tests + --help Show this help message + +Examples: + $(basename "$0") smoke # Quick validation before push + $(basename "$0") smoke --smoke-step build # Build only (smoke) + $(basename "$0") smoke --smoke-step unit # Unit tests only (smoke) + $(basename "$0") smoke --smoke-step unit-split # Unit tests per project + $(basename "$0") smoke --smoke-step unit-split --test-timeout 5m --progress-interval 60 + $(basename "$0") smoke --smoke-step unit-split --project-start 1 --project-count 50 + $(basename "$0") pr # Full PR check + $(basename "$0") pr --category Unit # Only run Unit tests + $(basename "$0") module # Auto-detect changed modules + $(basename "$0") module --module Scanner # Test specific module + $(basename "$0") workflow --workflow test-matrix + $(basename "$0") release --dry-run + $(basename "$0") pr --verbose --docker + +Test Categories: + PR-Gating: ${PR_GATING_CATEGORIES[*]} + Extended: ${EXTENDED_CATEGORIES[*]} +EOF +} + +# ============================================================================= +# ARGUMENT PARSING +# 
============================================================================= + +parse_args() { + while [[ $# -gt 0 ]]; do + case $1 in + smoke|pr|module|workflow|release|full) + MODE="$1" + shift + ;; + --category) + SPECIFIC_CATEGORY="$2" + shift 2 + ;; + --workflow) + SPECIFIC_WORKFLOW="$2" + shift 2 + ;; + --module) + SPECIFIC_MODULE="$2" + shift 2 + ;; + --smoke-step) + SMOKE_STEP="$2" + shift 2 + ;; + --test-timeout) + TEST_TIMEOUT="$2" + shift 2 + ;; + --progress-interval) + PROGRESS_INTERVAL="$2" + shift 2 + ;; + --project-start) + PROJECT_START="$2" + shift 2 + ;; + --project-count) + PROJECT_COUNT="$2" + shift 2 + ;; + --docker) + EXECUTION_ENGINE="docker" + shift + ;; + --native) + EXECUTION_ENGINE="native" + shift + ;; + --act) + EXECUTION_ENGINE="act" + shift + ;; + --parallel) + PARALLEL_JOBS="$2" + shift 2 + ;; + --verbose|-v) + VERBOSE=true + shift + ;; + --dry-run) + DRY_RUN=true + shift + ;; + --rebuild) + REBUILD_IMAGE=true + shift + ;; + --no-services) + SKIP_SERVICES=true + shift + ;; + --keep-services) + KEEP_SERVICES=true + shift + ;; + --help|-h) + usage + exit 0 + ;; + *) + log_error "Unknown option: $1" + usage + exit 1 + ;; + esac + done + + # Default mode is smoke + if [[ -z "$MODE" ]]; then + MODE="$MODE_SMOKE" + fi + + # Default execution engine based on mode + if [[ -z "$EXECUTION_ENGINE" ]]; then + case "$MODE" in + workflow) + EXECUTION_ENGINE="act" + ;; + *) + EXECUTION_ENGINE="native" + ;; + esac + fi + + # Auto-detect parallel jobs + if [[ -z "$PARALLEL_JOBS" ]]; then + PARALLEL_JOBS=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4) + fi + + export VERBOSE +} + +# ============================================================================= +# DEPENDENCY CHECKS +# ============================================================================= + +check_dependencies() { + log_subsection "Checking Dependencies" + + local missing=0 + + # Always required + if ! require_command "dotnet" "https://dot.net/download"; then + missing=1 + else + local dotnet_version + dotnet_version=$(dotnet --version 2>/dev/null || echo "unknown") + log_debug "dotnet version: $dotnet_version" + fi + + if ! require_command "git"; then + missing=1 + fi + + # Docker required for docker mode + if [[ "$EXECUTION_ENGINE" == "docker" ]]; then + if ! check_docker; then + missing=1 + fi + fi + + # Act required for workflow mode + if [[ "$EXECUTION_ENGINE" == "act" ]] || [[ "$MODE" == "$MODE_WORKFLOW" ]]; then + if ! require_command "act" "brew install act (macOS) or https://github.com/nektos/act"; then + log_warn "act not found - workflow simulation will be limited" + fi + fi + + # Check for solution file + if ! 
require_file "$REPO_ROOT/src/StellaOps.sln"; then + missing=1 + fi + + return $missing +} + +# ============================================================================= +# RESULT INITIALIZATION +# ============================================================================= + +init_results() { + ensure_dir "$RESULTS_DIR" + ensure_dir "$TRX_DIR" + ensure_dir "$LOGS_DIR" + : > "$ACTIVE_TEST_FILE" + + # Create run metadata + local run_id + run_id=$(date +%Y%m%d_%H%M%S) + export RUN_ID="$run_id" + + log_debug "Results directory: $RESULTS_DIR" + log_debug "Run ID: $RUN_ID" +} + +# ============================================================================= +# TEST EXECUTION +# ============================================================================= + +run_dotnet_tests() { + local category="$1" + local filter="Category=$category" + + log_subsection "Running $category Tests" + + local trx_file="$TRX_DIR/${category}-${RUN_ID}.trx" + local log_file="$LOGS_DIR/${category}-${RUN_ID}.log" + + local blame_args=() + if [[ -n "$TEST_TIMEOUT" ]]; then + blame_args+=(--blame-hang "--blame-hang-timeout" "$TEST_TIMEOUT") + fi + + local test_cmd=( + dotnet test "$REPO_ROOT/src/StellaOps.sln" + --filter "$filter" + --configuration Release + --no-build + "${blame_args[@]}" + --logger "trx;LogFileName=$trx_file" + --results-directory "$TRX_DIR" + --verbosity minimal + ) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would execute: ${test_cmd[*]}" + return 0 + fi + + local start_time + start_time=$(start_timer) + + if [[ "$VERBOSE" == "true" ]]; then + "${test_cmd[@]}" 2>&1 | tee "$log_file" + else + "${test_cmd[@]}" > "$log_file" 2>&1 + fi + + local result=$? + stop_timer "$start_time" "$category tests" + + if [[ $result -eq 0 ]]; then + log_success "$category tests passed" + else + log_error "$category tests failed (see $log_file)" + fi + + return $result +} + +collect_test_projects() { + if command -v rg &>/dev/null; then + rg --files -g "*Tests.csproj" "$REPO_ROOT/src" | LC_ALL=C sort + else + find "$REPO_ROOT/src" -name "*Tests.csproj" -print | LC_ALL=C sort + fi +} + +run_dotnet_tests_split() { + local category="$1" + local filter="Category=$category" + local progress_interval="$PROGRESS_INTERVAL" + if [[ -z "$progress_interval" ]]; then + progress_interval=60 + fi + + log_subsection "Running $category Tests (per project)" + + local projects=() + mapfile -t projects < <(collect_test_projects) + if [[ ${#projects[@]} -eq 0 ]]; then + log_warn "No test projects found under $REPO_ROOT/src" + return 0 + fi + + local failed=0 + local total_all="${#projects[@]}" + local start_index="${PROJECT_START:-1}" + local count_limit="${PROJECT_COUNT:-0}" + if [[ "$start_index" -lt 1 ]]; then + start_index=1 + fi + if [[ "$count_limit" -lt 0 ]]; then + count_limit=0 + fi + + local total_to_run="$total_all" + if [[ "$count_limit" -gt 0 ]]; then + total_to_run="$count_limit" + else + total_to_run=$((total_all - start_index + 1)) + if [[ "$total_to_run" -lt 0 ]]; then + total_to_run=0 + fi + fi + + local index=0 + local run_index=0 + + for project in "${projects[@]}"; do + index=$((index + 1)) + if [[ "$index" -lt "$start_index" ]]; then + continue + fi + if [[ "$count_limit" -gt 0 && "$run_index" -ge "$count_limit" ]]; then + break + fi + run_index=$((run_index + 1)) + local project_name + project_name="$(basename "${project%.csproj}")" + + log_step "$run_index" "$total_to_run" "Testing $project_name ($category)" + printf '%s %s (%s)\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$project_name" "$category" > 
"$ACTIVE_TEST_FILE" + + local trx_file="$TRX_DIR/${category}-${RUN_ID}-${project_name}.trx" + local log_file="$LOGS_DIR/${category}-${RUN_ID}-${project_name}.log" + + local blame_args=() + if [[ -n "$TEST_TIMEOUT" ]]; then + blame_args+=(--blame-hang "--blame-hang-timeout" "$TEST_TIMEOUT") + fi + + local test_cmd=( + dotnet test "$project" + --filter "$filter" + --configuration Release + --no-build + "${blame_args[@]}" + --logger "trx;LogFileName=$trx_file" + --results-directory "$TRX_DIR" + --verbosity minimal + ) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would execute: ${test_cmd[*]}" + continue + fi + + local start_time + start_time=$(start_timer) + + local ticker_pid="" + if [[ "$progress_interval" -gt 0 ]]; then + ( + while true; do + sleep "$progress_interval" + local_now=$(get_timestamp) + local_elapsed=$((local_now - start_time)) + log_info "$project_name still running after $(format_duration "$local_elapsed")" + done + ) & + ticker_pid=$! + fi + + set +e + if [[ "$VERBOSE" == "true" ]]; then + "${test_cmd[@]}" 2>&1 | tee "$log_file" + else + "${test_cmd[@]}" > "$log_file" 2>&1 + fi + local result=$? + set -e + + if [[ -n "$ticker_pid" ]]; then + kill "$ticker_pid" 2>/dev/null || true + wait "$ticker_pid" 2>/dev/null || true + fi + + stop_timer "$start_time" "$project_name ($category)" + + if [[ $result -ne 0 ]] && grep -q -E "The test source file .* was not found" "$log_file"; then + log_warn "$project_name output missing; retrying with build" + local retry_cmd=( + dotnet test "$project" + --filter "$filter" + --configuration Release + "${blame_args[@]}" + --logger "trx;LogFileName=$trx_file" + --results-directory "$TRX_DIR" + --verbosity minimal + ) + local retry_start + retry_start=$(start_timer) + set +e + if [[ "$VERBOSE" == "true" ]]; then + "${retry_cmd[@]}" 2>&1 | tee -a "$log_file" + else + "${retry_cmd[@]}" >> "$log_file" 2>&1 + fi + result=$? + set -e + stop_timer "$retry_start" "$project_name ($category) rebuild" + fi + + if [[ $result -eq 0 ]]; then + log_success "$project_name $category tests passed" + else + if grep -q -E "No test matches the given testcase filter|No test is available" "$log_file"; then + log_warn "$project_name has no $category tests; skipping" + else + log_error "$project_name $category tests failed (see $log_file)" + failed=1 + fi + fi + done + + return $failed +} + +run_dotnet_build() { + log_subsection "Building Solution" + + local build_cmd=( + dotnet build "$REPO_ROOT/src/StellaOps.sln" + --configuration Release + ) + + if [[ "$DRY_RUN" == "true" ]]; then + log_info "[DRY-RUN] Would execute: ${build_cmd[*]}" + return 0 + fi + + local start_time + start_time=$(start_timer) + + "${build_cmd[@]}" + + local result=$? + stop_timer "$start_time" "Build" + + if [[ $result -eq 0 ]]; then + log_success "Build completed successfully" + else + log_error "Build failed" + fi + + return $result +} + +# ============================================================================= +# MODE IMPLEMENTATIONS +# ============================================================================= + +run_smoke_mode() { + log_section "Smoke Test Mode" + if [[ -n "$SMOKE_STEP" ]]; then + log_info "Running smoke step: $SMOKE_STEP" + else + log_info "Running quick validation (Unit tests only)" + fi + + local start_time + start_time=$(start_timer) + + local result=0 + case "$SMOKE_STEP" in + "" ) + # Build + run_dotnet_build || return 1 + + # Run Unit tests only + run_dotnet_tests "Unit" + result=$? + ;; + build ) + run_dotnet_build + result=$? 
+ ;; + unit ) + run_dotnet_tests "Unit" + result=$? + ;; + unit-split ) + run_dotnet_tests_split "Unit" + result=$? + ;; + * ) + log_error "Unknown smoke step: $SMOKE_STEP" + return 1 + ;; + esac + + stop_timer "$start_time" "Smoke test" + return $result +} + +run_pr_mode() { + log_section "PR-Gating Mode" + log_info "Running full PR-gating suite" + log_info "Categories: ${PR_GATING_CATEGORIES[*]}" + + local start_time + start_time=$(start_timer) + local failed=0 + local results=() + + # Check if Web module has changes + local web_changed=false + local changed_files + changed_files=$(get_changed_files main 2>/dev/null || echo "") + if echo "$changed_files" | grep -q "^src/Web/"; then + web_changed=true + log_info "Web module changes detected - will run Web tests" + fi + + # Start services if needed + if [[ "$SKIP_SERVICES" != "true" ]]; then + start_ci_services postgres-ci valkey-ci || { + log_warn "Failed to start services, continuing anyway..." + } + fi + + # Build .NET solution + run_dotnet_build || return 1 + + # Run each .NET category + if [[ -n "$SPECIFIC_CATEGORY" ]]; then + if [[ "$SPECIFIC_CATEGORY" == "Web" ]] || [[ "$SPECIFIC_CATEGORY" == "web" ]]; then + # Run Web tests only + if type run_web_pr_gating &>/dev/null; then + run_web_pr_gating + results+=("Web:$?") + fi + else + run_dotnet_tests "$SPECIFIC_CATEGORY" + results+=("$SPECIFIC_CATEGORY:$?") + fi + else + for category in "${PR_GATING_CATEGORIES[@]}"; do + run_dotnet_tests "$category" + local cat_result=$? + results+=("$category:$cat_result") + if [[ $cat_result -ne 0 ]]; then + failed=1 + fi + done + + # Run Web tests if Web module changed + if [[ "$web_changed" == "true" ]]; then + log_subsection "Web Module Tests" + if type run_web_pr_gating &>/dev/null; then + run_web_pr_gating + local web_result=$? + results+=("Web:$web_result") + if [[ $web_result -ne 0 ]]; then + failed=1 + fi + else + log_warn "Web testing library not loaded" + fi + fi + fi + + # Stop services + if [[ "$SKIP_SERVICES" != "true" ]] && [[ "$KEEP_SERVICES" != "true" ]]; then + stop_ci_services + fi + + # Print summary + log_section "PR-Gating Results" + for result in "${results[@]}"; do + local name="${result%%:*}" + local status="${result##*:}" + if [[ "$status" == "0" ]]; then + print_status "$name" "true" + else + print_status "$name" "false" + fi + done + + stop_timer "$start_time" "PR-gating suite" + return $failed +} + +run_module_mode() { + log_section "Module-Specific Mode" + + local modules_to_test=() + local has_dotnet_modules=false + local has_node_modules=false + + if [[ -n "$SPECIFIC_MODULE" ]]; then + modules_to_test=("$SPECIFIC_MODULE") + log_info "Testing specified module: $SPECIFIC_MODULE" + else + log_info "Auto-detecting changed modules..." + local detected + detected=$(detect_changed_modules main) + + if [[ "$detected" == "ALL" ]]; then + log_info "Infrastructure changes detected - running all tests" + run_pr_mode + return $? 
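+            # Shared infrastructure changed: per-module selection is not
+            # meaningful here, so delegate to the full PR-gating suite and
+            # propagate its exit status.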
+ elif [[ "$detected" == "NONE" ]]; then + log_info "No module changes detected" + return 0 + else + read -ra modules_to_test <<< "$detected" + log_info "Detected changed modules: ${modules_to_test[*]}" + fi + fi + + # Categorize modules + for module in "${modules_to_test[@]}"; do + if [[ " ${NODE_MODULES[*]} " =~ " ${module} " ]]; then + has_node_modules=true + else + has_dotnet_modules=true + fi + done + + local start_time + start_time=$(start_timer) + local failed=0 + + # Build .NET solution if we have .NET modules + if [[ "$has_dotnet_modules" == "true" ]]; then + run_dotnet_build || return 1 + fi + + for module in "${modules_to_test[@]}"; do + log_subsection "Testing Module: $module" + + # Check if this is a Node.js module (Web, DevPortal) + if [[ " ${NODE_MODULES[*]} " =~ " ${module} " ]]; then + log_info "Running Node.js tests for $module" + + case "$module" in + Web) + if type run_web_pr_gating &>/dev/null; then + run_web_pr_gating || failed=1 + else + log_warn "Web testing library not loaded - running basic npm test" + pushd "$REPO_ROOT/src/Web/StellaOps.Web" > /dev/null 2>&1 || continue + npm ci --prefer-offline --no-audit 2>/dev/null || npm install + npm run test:ci || failed=1 + popd > /dev/null + fi + ;; + DevPortal) + local portal_dir="$REPO_ROOT/src/DevPortal/StellaOps.DevPortal.Site" + if [[ -d "$portal_dir" ]]; then + pushd "$portal_dir" > /dev/null || continue + npm ci --prefer-offline --no-audit 2>/dev/null || npm install + npm test 2>/dev/null || log_warn "DevPortal tests not configured" + popd > /dev/null + fi + ;; + esac + continue + fi + + # .NET module handling + local test_paths="${MODULE_PATHS[$module]:-}" + if [[ -z "$test_paths" ]]; then + log_warn "Unknown module: $module" + continue + fi + + # Run tests for each path + for path in $test_paths; do + local test_dir="$REPO_ROOT/$path/__Tests" + if [[ -d "$test_dir" ]]; then + log_info "Running tests in: $test_dir" + + local test_projects + test_projects=$(find "$test_dir" -name "*.Tests.csproj" -type f 2>/dev/null) + + for project in $test_projects; do + log_debug "Testing: $project" + dotnet test "$project" --configuration Release --no-build --verbosity minimal || { + failed=1 + } + done + fi + done + done + + stop_timer "$start_time" "Module tests" + return $failed +} + +run_workflow_mode() { + log_section "Workflow Simulation Mode" + + if [[ -z "$SPECIFIC_WORKFLOW" ]]; then + log_error "No workflow specified. Use --workflow " + log_info "Example: --workflow test-matrix" + return 1 + fi + + local workflow_file="$REPO_ROOT/.gitea/workflows/${SPECIFIC_WORKFLOW}.yml" + if [[ ! -f "$workflow_file" ]]; then + # Try without .yml extension + workflow_file="$REPO_ROOT/.gitea/workflows/${SPECIFIC_WORKFLOW}" + if [[ ! -f "$workflow_file" ]]; then + log_error "Workflow not found: $SPECIFIC_WORKFLOW" + log_info "Available workflows:" + ls -1 "$REPO_ROOT/.gitea/workflows/"*.yml 2>/dev/null | xargs -n1 basename | head -20 + return 1 + fi + fi + + log_info "Simulating workflow: $SPECIFIC_WORKFLOW" + log_info "Workflow file: $workflow_file" + + if ! command -v act &>/dev/null; then + log_error "act is required for workflow simulation" + log_info "Install with: brew install act (macOS)" + return 1 + fi + + # Build CI image if needed + if [[ "$REBUILD_IMAGE" == "true" ]] || ! 
ci_image_exists; then + build_ci_image "$REBUILD_IMAGE" || return 1 + fi + + local event_file="$REPO_ROOT/devops/ci-local/events/pull-request.json" + local actrc_file="$REPO_ROOT/.actrc" + + local act_args=( + -W "$workflow_file" + --platform "ubuntu-22.04=$CI_IMAGE" + --platform "ubuntu-latest=$CI_IMAGE" + --env "DOTNET_NOLOGO=1" + --env "DOTNET_CLI_TELEMETRY_OPTOUT=1" + --env "TZ=UTC" + --bind + ) + + if [[ -f "$event_file" ]]; then + act_args+=(--eventpath "$event_file") + fi + + if [[ -f "$REPO_ROOT/devops/ci-local/.env.local" ]]; then + act_args+=(--env-file "$REPO_ROOT/devops/ci-local/.env.local") + fi + + if [[ "$DRY_RUN" == "true" ]]; then + act_args+=(-n) + fi + + if [[ "$VERBOSE" == "true" ]]; then + act_args+=(--verbose) + fi + + log_info "Running: act ${act_args[*]}" + act "${act_args[@]}" +} + +run_release_mode() { + log_section "Release Simulation Mode" + log_info "Running release dry-run" + + if [[ "$DRY_RUN" != "true" ]]; then + log_warn "Release mode always runs as dry-run for safety" + DRY_RUN=true + fi + + local start_time + start_time=$(start_timer) + + # Build all modules + log_subsection "Building All Modules" + run_dotnet_build || return 1 + + # Package CLI + log_subsection "Packaging CLI" + local cli_project="$REPO_ROOT/src/Cli/StellaOps.Cli/StellaOps.Cli.csproj" + if [[ -f "$cli_project" ]]; then + log_info "[DRY-RUN] Would build CLI for: linux-x64, linux-arm64, osx-arm64, win-x64" + fi + + # Validate Helm chart + log_subsection "Validating Helm Chart" + if command -v helm &>/dev/null; then + local helm_chart="$REPO_ROOT/devops/helm/stellaops" + if [[ -d "$helm_chart" ]]; then + helm lint "$helm_chart" || log_warn "Helm lint warnings" + fi + else + log_info "helm not found - skipping chart validation" + fi + + # Generate release manifest + log_subsection "Release Manifest" + log_info "[DRY-RUN] Would generate:" + log_info " - Release notes" + log_info " - Changelog" + log_info " - Docker Compose files" + log_info " - SBOM" + log_info " - Checksums" + + stop_timer "$start_time" "Release simulation" + return 0 +} + +run_full_mode() { + log_section "Full Test Mode" + log_info "Running all tests including extended categories" + log_info "Categories: ${ALL_CATEGORIES[*]}" + + local start_time + start_time=$(start_timer) + local failed=0 + + # Start all services + if [[ "$SKIP_SERVICES" != "true" ]]; then + start_ci_services || { + log_warn "Failed to start services, continuing anyway..." + } + fi + + # Build + run_dotnet_build || return 1 + + # Run all categories + for category in "${ALL_CATEGORIES[@]}"; do + run_dotnet_tests "$category" || { + failed=1 + log_warn "Continuing after $category failure..." 
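+            # Full mode keeps iterating after a failure so every category is
+            # exercised; the non-zero status is recorded and returned at the end.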
+ } + done + + # Stop services + if [[ "$SKIP_SERVICES" != "true" ]] && [[ "$KEEP_SERVICES" != "true" ]]; then + stop_ci_services + fi + + stop_timer "$start_time" "Full test suite" + return $failed +} + +# ============================================================================= +# MAIN +# ============================================================================= + +main() { + parse_args "$@" + + log_section "StellaOps Local CI Runner" + log_info "Mode: $MODE" + log_info "Engine: $EXECUTION_ENGINE" + log_info "Parallel: $PARALLEL_JOBS jobs" + log_info "Repository: $REPO_ROOT" + + if [[ "$DRY_RUN" == "true" ]]; then + log_warn "DRY-RUN MODE - No changes will be made" + fi + + # Check dependencies + check_dependencies || exit 1 + + # Initialize results directory + init_results + + # Load environment + load_env_file "$REPO_ROOT/devops/ci-local/.env.local" || true + + # Run selected mode + case "$MODE" in + "$MODE_SMOKE") + run_smoke_mode + ;; + "$MODE_PR") + run_pr_mode + ;; + "$MODE_MODULE") + run_module_mode + ;; + "$MODE_WORKFLOW") + run_workflow_mode + ;; + "$MODE_RELEASE") + run_release_mode + ;; + "$MODE_FULL") + run_full_mode + ;; + *) + log_error "Unknown mode: $MODE" + usage + exit 1 + ;; + esac + + local result=$? + + log_section "Summary" + log_info "Results saved to: $RESULTS_DIR" + + if [[ $result -eq 0 ]]; then + log_success "All tests passed!" + else + log_error "Some tests failed" + fi + + return $result +} + +# Run main if executed directly +if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then + main "$@" +fi diff --git a/deploy/scripts/migrate-config.sh b/deploy/scripts/migrate-config.sh new file mode 100644 index 000000000..35d6668e6 --- /dev/null +++ b/deploy/scripts/migrate-config.sh @@ -0,0 +1,330 @@ +#!/usr/bin/env bash +# +# Migrate legacy configuration structure to consolidated etc/ +# +# This script migrates: +# - certificates/ -> etc/certificates/ +# - config/ -> etc/crypto/ and etc/env/ +# - policies/ -> etc/policy/ +# - etc/rootpack/ -> etc/crypto/profiles/ +# +# Usage: +# ./devops/scripts/migrate-config.sh [--dry-run] +# + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +ROOT_DIR="$(cd "${SCRIPT_DIR}/../.." 
&& pwd)"
+
+DRY_RUN=false
+[[ "${1:-}" == "--dry-run" ]] && DRY_RUN=true
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m'
+
+log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
+log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
+log_warn() { echo -e "${YELLOW}[WARN]${NC} $*"; }
+log_error() { echo -e "${RED}[ERROR]${NC} $*"; }
+log_dry() { echo -e "${YELLOW}[DRY-RUN]${NC} $*"; }
+
+# Execute or log command
+run_cmd() {
+    if [[ "${DRY_RUN}" == true ]]; then
+        log_dry "$*"
+    else
+        "$@"
+    fi
+}
+
+# Create backup
+create_backup() {
+    local backup_file="${ROOT_DIR}/config-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
+
+    log_info "Creating backup: ${backup_file}"
+
+    if [[ "${DRY_RUN}" == true ]]; then
+        log_dry "Would create backup of: certificates/ config/ policies/ etc/"
+        return
+    fi
+
+    local dirs_to_backup=()
+    [[ -d "${ROOT_DIR}/certificates" ]] && dirs_to_backup+=("certificates")
+    [[ -d "${ROOT_DIR}/config" ]] && dirs_to_backup+=("config")
+    [[ -d "${ROOT_DIR}/policies" ]] && dirs_to_backup+=("policies")
+    [[ -d "${ROOT_DIR}/etc" ]] && dirs_to_backup+=("etc")
+
+    if [[ ${#dirs_to_backup[@]} -gt 0 ]]; then
+        cd "${ROOT_DIR}"
+        tar -czvf "${backup_file}" "${dirs_to_backup[@]}"
+        log_ok "Backup created: ${backup_file}"
+    else
+        log_warn "No directories to backup"
+    fi
+}
+
+# Create new directory structure
+create_directories() {
+    log_info "Creating new directory structure..."
+
+    local dirs=(
+        "etc/certificates/trust-roots"
+        "etc/certificates/signing"
+        "etc/crypto/profiles/cn"
+        "etc/crypto/profiles/eu"
+        "etc/crypto/profiles/kr"
+        "etc/crypto/profiles/ru"
+        "etc/crypto/profiles/us-fips"
+        "etc/env"
+        "etc/policy/packs"
+        "etc/policy/schemas"
+    )
+
+    for dir in "${dirs[@]}"; do
+        run_cmd mkdir -p "${ROOT_DIR}/${dir}"
+    done
+
+    log_ok "Directory structure created"
+}
+
+# Migrate certificates/
+migrate_certificates() {
+    local src_dir="${ROOT_DIR}/certificates"
+
+    if [[ ! -d "${src_dir}" ]]; then
+        log_info "No certificates/ directory found, skipping"
+        return
+    fi
+
+    log_info "Migrating certificates/..."
+
+    # Trust roots (CA bundles)
+    for f in "${src_dir}"/*-bundle*.pem "${src_dir}"/*-root*.pem "${src_dir}"/*_bundle*.pem "${src_dir}"/*_root*.pem; do
+        [[ -f "$f" ]] || continue
+        run_cmd mv "$f" "${ROOT_DIR}/etc/certificates/trust-roots/"
+        log_ok "Moved: $(basename "$f") -> etc/certificates/trust-roots/"
+    done
+
+    # Signing keys
+    for f in "${src_dir}"/*-signing-*.pem "${src_dir}"/*_signing_*.pem; do
+        [[ -f "$f" ]] || continue
+        run_cmd mv "$f" "${ROOT_DIR}/etc/certificates/signing/"
+        log_ok "Moved: $(basename "$f") -> etc/certificates/signing/"
+    done
+
+    # Move remaining .pem and .cer files to trust-roots
+    for f in "${src_dir}"/*.pem "${src_dir}"/*.cer; do
+        [[ -f "$f" ]] || continue
+        run_cmd mv "$f" "${ROOT_DIR}/etc/certificates/trust-roots/"
+        log_ok "Moved: $(basename "$f") -> etc/certificates/trust-roots/"
+    done
+
+    # Remove empty directory
+    if [[ -d "${src_dir}" ]] && [[ -z "$(ls -A "${src_dir}")" ]]; then
+        run_cmd rmdir "${src_dir}"
+        log_ok "Removed empty: certificates/"
+    fi
+}
+
+# Migrate config/
+migrate_config_dir() {
+    local src_dir="${ROOT_DIR}/config"
+
+    if [[ ! -d "${src_dir}" ]]; then
+        log_info "No config/ directory found, skipping"
+        return
+    fi
+
+    log_info "Migrating config/..."
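+
+    # Note: only the env samples listed in the mapping below are relocated.
+    # Anything else under config/env/ is left in place for manual review, and
+    # the legacy directories are removed only once they are empty.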
+
+    # Map env files to crypto profiles
+    declare -A env_mapping=(
+        [".env.fips.example"]="us-fips/env.sample"
+        [".env.eidas.example"]="eu/env.sample"
+        [".env.ru-free.example"]="ru/env.sample"
+        [".env.ru-paid.example"]="ru/env-paid.sample"
+        [".env.sm.example"]="cn/env.sample"
+        [".env.kcmvp.example"]="kr/env.sample"
+    )
+
+    for src_name in "${!env_mapping[@]}"; do
+        local src_file="${src_dir}/env/${src_name}"
+        local dst_file="${ROOT_DIR}/etc/crypto/profiles/${env_mapping[$src_name]}"
+
+        if [[ -f "${src_file}" ]]; then
+            run_cmd mkdir -p "$(dirname "${dst_file}")"
+            run_cmd mv "${src_file}" "${dst_file}"
+            log_ok "Moved: ${src_name} -> etc/crypto/profiles/${env_mapping[$src_name]}"
+        fi
+    done
+
+    # Remove crypto-profiles.sample.json (superseded)
+    if [[ -f "${src_dir}/crypto-profiles.sample.json" ]]; then
+        run_cmd rm "${src_dir}/crypto-profiles.sample.json"
+        log_ok "Removed: config/crypto-profiles.sample.json (superseded by etc/crypto/)"
+    fi
+
+    # Remove empty directories
+    [[ -d "${src_dir}/env" ]] && [[ -z "$(ls -A "${src_dir}/env" 2>/dev/null)" ]] && run_cmd rmdir "${src_dir}/env"
+    [[ -d "${src_dir}" ]] && [[ -z "$(ls -A "${src_dir}" 2>/dev/null)" ]] && run_cmd rmdir "${src_dir}"
+}
+
+# Migrate policies/
+migrate_policies() {
+    local src_dir="${ROOT_DIR}/policies"
+
+    if [[ ! -d "${src_dir}" ]]; then
+        log_info "No policies/ directory found, skipping"
+        return
+    fi
+
+    log_info "Migrating policies/..."
+
+    # Move policy packs
+    for f in "${src_dir}"/*.yaml; do
+        [[ -f "$f" ]] || continue
+        run_cmd mv "$f" "${ROOT_DIR}/etc/policy/packs/"
+        log_ok "Moved: $(basename "$f") -> etc/policy/packs/"
+    done
+
+    # Move schemas
+    if [[ -d "${src_dir}/schemas" ]]; then
+        for f in "${src_dir}/schemas"/*.json; do
+            [[ -f "$f" ]] || continue
+            run_cmd mv "$f" "${ROOT_DIR}/etc/policy/schemas/"
+            log_ok "Moved: schemas/$(basename "$f") -> etc/policy/schemas/"
+        done
+        [[ -z "$(ls -A "${src_dir}/schemas" 2>/dev/null)" ]] && run_cmd rmdir "${src_dir}/schemas"
+    fi
+
+    # Move AGENTS.md if present
+    [[ -f "${src_dir}/AGENTS.md" ]] && run_cmd mv "${src_dir}/AGENTS.md" "${ROOT_DIR}/etc/policy/"
+
+    # Remove empty directory
+    [[ -d "${src_dir}" ]] && [[ -z "$(ls -A "${src_dir}" 2>/dev/null)" ]] && run_cmd rmdir "${src_dir}"
+}
+
+# Migrate etc/rootpack/ to etc/crypto/profiles/
+migrate_rootpack() {
+    local src_dir="${ROOT_DIR}/etc/rootpack"
+
+    if [[ ! -d "${src_dir}" ]]; then
+        log_info "No etc/rootpack/ directory found, skipping"
+        return
+    fi
+
+    log_info "Migrating etc/rootpack/ to etc/crypto/profiles/..."
+
+    for region_dir in "${src_dir}"/*; do
+        [[ -d "${region_dir}" ]] || continue
+        local region_name=$(basename "${region_dir}")
+        local target_dir="${ROOT_DIR}/etc/crypto/profiles/${region_name}"
+
+        run_cmd mkdir -p "${target_dir}"
+
+        for f in "${region_dir}"/*; do
+            [[ -f "$f" ]] || continue
+            run_cmd mv "$f" "${target_dir}/"
+            log_ok "Moved: rootpack/${region_name}/$(basename "$f") -> etc/crypto/profiles/${region_name}/"
+        done
+
+        [[ -z "$(ls -A "${region_dir}" 2>/dev/null)" ]] && run_cmd rmdir "${region_dir}"
+    done
+
+    [[ -d "${src_dir}" ]] && [[ -z "$(ls -A "${src_dir}" 2>/dev/null)" ]] && run_cmd rmdir "${src_dir}"
+}
+
+# Validate migration
+validate_migration() {
+    log_info "Validating migration..."
+
+    local errors=0
+
+    # Check new structure exists
+    local required=(
+        "etc/certificates"
+        "etc/crypto/profiles"
+        "etc/policy"
+    )
+
+    for dir in "${required[@]}"; do
+        if [[ !
-d "${ROOT_DIR}/${dir}" ]]; then + log_error "Missing: ${dir}" + ((errors++)) + fi + done + + # Check legacy directories are gone + local legacy=( + "certificates" + "config" + "policies" + "etc/rootpack" + ) + + for dir in "${legacy[@]}"; do + if [[ -d "${ROOT_DIR}/${dir}" ]] && [[ -n "$(ls -A "${ROOT_DIR}/${dir}" 2>/dev/null)" ]]; then + log_warn "Legacy directory still has content: ${dir}" + fi + done + + if [[ ${errors} -gt 0 ]]; then + log_error "Validation failed" + return 1 + fi + + log_ok "Migration validated" +} + +# Print summary +print_summary() { + echo "" + echo "========================================" + if [[ "${DRY_RUN}" == true ]]; then + echo " Migration Dry Run Complete" + else + echo " Migration Complete" + fi + echo "========================================" + echo "" + echo "New structure:" + echo " etc/certificates/ - Trust anchors and signing keys" + echo " etc/crypto/profiles/ - Regional crypto profiles" + echo " etc/policy/ - Policy engine configuration" + echo "" + if [[ "${DRY_RUN}" == true ]]; then + echo "Run without --dry-run to apply changes" + else + echo "Next steps:" + echo " 1. Update Docker Compose volume mounts" + echo " 2. Update any hardcoded paths in scripts" + echo " 3. Restart services and validate" + echo "" + echo "Rollback:" + echo " tar -xzvf config-backup-*.tar.gz" + fi + echo "" +} + +# Main +main() { + if [[ "${DRY_RUN}" == true ]]; then + log_info "DRY RUN - no changes will be made" + fi + + create_backup + create_directories + migrate_certificates + migrate_config_dir + migrate_policies + migrate_rootpack + validate_migration + print_summary +} + +main "$@" diff --git a/deploy/scripts/rotate-rekor-key.sh b/deploy/scripts/rotate-rekor-key.sh new file mode 100644 index 000000000..c9b8e8271 --- /dev/null +++ b/deploy/scripts/rotate-rekor-key.sh @@ -0,0 +1,197 @@ +#!/bin/bash +# ----------------------------------------------------------------------------- +# rotate-rekor-key.sh +# Sprint: SPRINT_20260125_003_Attestor_trust_workflows_conformance +# Task: WORKFLOW-002 - Create key rotation workflow script +# Description: Rotate Rekor public key with grace period +# ----------------------------------------------------------------------------- + +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } +log_step() { echo -e "${BLUE}[STEP]${NC} $1"; } + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Rotate Rekor public key through a dual-key grace period." 
+ echo "" + echo "Phases:" + echo " add-key Add new key to TUF (starts grace period)" + echo " verify Verify both keys are active" + echo " remove-old Remove old key (after grace period)" + echo "" + echo "Options:" + echo " --repo DIR TUF repository directory" + echo " --new-key FILE Path to new Rekor public key" + echo " --new-key-name NAME Target name for new key (default: rekor-key-v{N+1})" + echo " --old-key-name NAME Target name for old key to remove" + echo " --grace-days N Grace period in days (default: 7)" + echo " -h, --help Show this help message" + echo "" + echo "Example (3-phase rotation):" + echo " # Phase 1: Add new key" + echo " $0 add-key --repo /path/to/tuf --new-key rekor-key-v2.pub" + echo "" + echo " # Wait for grace period (clients sync)" + echo " sleep 7d" + echo "" + echo " # Phase 2: Verify" + echo " $0 verify" + echo "" + echo " # Phase 3: Remove old key" + echo " $0 remove-old --repo /path/to/tuf --old-key-name rekor-key-v1" + exit 1 +} + +PHASE="" +REPO_DIR="" +NEW_KEY="" +NEW_KEY_NAME="" +OLD_KEY_NAME="" +GRACE_DAYS=7 + +while [[ $# -gt 0 ]]; do + case $1 in + add-key|verify|remove-old) + PHASE="$1" + shift + ;; + --repo) REPO_DIR="$2"; shift 2 ;; + --new-key) NEW_KEY="$2"; shift 2 ;; + --new-key-name) NEW_KEY_NAME="$2"; shift 2 ;; + --old-key-name) OLD_KEY_NAME="$2"; shift 2 ;; + --grace-days) GRACE_DAYS="$2"; shift 2 ;; + -h|--help) usage ;; + *) log_error "Unknown argument: $1"; usage ;; + esac +done + +if [[ -z "$PHASE" ]]; then + log_error "Phase is required" + usage +fi + +echo "" +echo "================================================" +echo " Rekor Key Rotation - Phase: $PHASE" +echo "================================================" +echo "" + +case "$PHASE" in + add-key) + if [[ -z "$REPO_DIR" ]] || [[ -z "$NEW_KEY" ]]; then + log_error "add-key requires --repo and --new-key" + usage + fi + + if [[ ! -f "$NEW_KEY" ]]; then + log_error "New key file not found: $NEW_KEY" + exit 1 + fi + + if [[ ! -d "$REPO_DIR" ]]; then + log_error "TUF repository not found: $REPO_DIR" + exit 1 + fi + + # Determine new key name if not specified + if [[ -z "$NEW_KEY_NAME" ]]; then + # Find highest version and increment + HIGHEST=$(ls "$REPO_DIR/targets/" 2>/dev/null | grep -E '^rekor-key-v[0-9]+' | \ + sed 's/rekor-key-v//' | sed 's/\.pub$//' | sort -n | tail -1 || echo "0") + NEW_VERSION=$((HIGHEST + 1)) + NEW_KEY_NAME="rekor-key-v${NEW_VERSION}" + fi + + log_step "Adding new Rekor key: $NEW_KEY_NAME" + log_info "Source: $NEW_KEY" + + # Copy key to targets + cp "$NEW_KEY" "$REPO_DIR/targets/${NEW_KEY_NAME}.pub" + + # Add to targets.json + if [[ -x "$REPO_DIR/scripts/add-target.sh" ]]; then + "$REPO_DIR/scripts/add-target.sh" "$REPO_DIR/targets/${NEW_KEY_NAME}.pub" "${NEW_KEY_NAME}.pub" --repo "$REPO_DIR" + else + log_warn "add-target.sh not found, updating targets.json manually required" + fi + + log_info "" + log_info "Key added: $NEW_KEY_NAME" + log_info "" + log_warn "IMPORTANT: Dual-key period has started." + log_warn "Wait at least $GRACE_DAYS days before running 'remove-old' phase." + log_warn "During this time, clients will sync and receive both keys." + log_info "" + log_info "Next steps:" + echo " 1. Sign and publish updated TUF metadata" + echo " 2. Monitor client sync status" + echo " 3. After $GRACE_DAYS days, run: $0 remove-old --repo $REPO_DIR --old-key-name " + ;; + + verify) + log_step "Verifying key rotation status..." + + # Check local trust state + stella trust status --show-keys + + log_info "" + log_info "Verify that:" + echo " 1. 
Both old and new Rekor keys are listed" + echo " 2. Service endpoints are resolving correctly" + echo " 3. Attestations signed with old key still verify" + ;; + + remove-old) + if [[ -z "$REPO_DIR" ]] || [[ -z "$OLD_KEY_NAME" ]]; then + log_error "remove-old requires --repo and --old-key-name" + usage + fi + + if [[ ! -d "$REPO_DIR" ]]; then + log_error "TUF repository not found: $REPO_DIR" + exit 1 + fi + + OLD_KEY_FILE="$REPO_DIR/targets/${OLD_KEY_NAME}.pub" + if [[ ! -f "$OLD_KEY_FILE" ]]; then + OLD_KEY_FILE="$REPO_DIR/targets/${OLD_KEY_NAME}" + fi + + if [[ ! -f "$OLD_KEY_FILE" ]]; then + log_error "Old key not found: $OLD_KEY_NAME" + exit 1 + fi + + log_step "Removing old Rekor key: $OLD_KEY_NAME" + log_warn "This is IRREVERSIBLE. Ensure all clients have synced the new key." + + read -p "Type 'CONFIRM' to proceed: " CONFIRM + if [[ "$CONFIRM" != "CONFIRM" ]]; then + log_error "Aborted" + exit 1 + fi + + # Remove key file + rm -f "$OLD_KEY_FILE" + + # Remove from targets.json (simplified - production should use proper JSON manipulation) + log_warn "Remember to update targets.json to remove the old key entry" + log_warn "Then sign and publish the updated metadata" + + log_info "" + log_info "Old key removed: $OLD_KEY_NAME" + log_info "Key rotation complete!" + ;; +esac + +echo "" diff --git a/deploy/scripts/rotate-signing-key.sh b/deploy/scripts/rotate-signing-key.sh new file mode 100644 index 000000000..4a1da9bd9 --- /dev/null +++ b/deploy/scripts/rotate-signing-key.sh @@ -0,0 +1,265 @@ +#!/bin/bash +# ----------------------------------------------------------------------------- +# rotate-signing-key.sh +# Sprint: SPRINT_20260125_003_Attestor_trust_workflows_conformance +# Task: WORKFLOW-002 - Create key rotation workflow script +# Description: Rotate organization signing key with dual-key grace period +# ----------------------------------------------------------------------------- + +set -euo pipefail + +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +NC='\033[0m' + +log_info() { echo -e "${GREEN}[INFO]${NC} $1"; } +log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; } +log_error() { echo -e "${RED}[ERROR]${NC} $1"; } +log_step() { echo -e "${BLUE}[STEP]${NC} $1"; } + +usage() { + echo "Usage: $0 [options]" + echo "" + echo "Rotate organization signing key through a dual-key grace period." 
+ echo "" + echo "Phases:" + echo " generate Generate new signing key" + echo " activate Activate new key (dual-key period starts)" + echo " verify Verify both keys are functional" + echo " retire Retire old key (after grace period)" + echo "" + echo "Options:" + echo " --key-dir DIR Directory for signing keys (default: /etc/stellaops/keys)" + echo " --key-type TYPE Key type: ecdsa-p256, ecdsa-p384, rsa-4096 (default: ecdsa-p256)" + echo " --new-key NAME Name for new key (default: signing-key-v{N+1})" + echo " --old-key NAME Name of old key to retire" + echo " --grace-days N Grace period in days (default: 14)" + echo " --ci-config FILE CI config file to update" + echo " -h, --help Show this help message" + echo "" + echo "Example (4-phase rotation):" + echo " # Phase 1: Generate new key" + echo " $0 generate --key-dir /etc/stellaops/keys" + echo "" + echo " # Phase 2: Activate (update CI to use both keys)" + echo " $0 activate --ci-config .gitea/workflows/ci.yaml" + echo "" + echo " # Wait for grace period" + echo " sleep 14d" + echo "" + echo " # Phase 3: Verify" + echo " $0 verify" + echo "" + echo " # Phase 4: Retire old key" + echo " $0 retire --old-key signing-key-v1" + exit 1 +} + +PHASE="" +KEY_DIR="/etc/stellaops/keys" +KEY_TYPE="ecdsa-p256" +NEW_KEY_NAME="" +OLD_KEY_NAME="" +GRACE_DAYS=14 +CI_CONFIG="" + +while [[ $# -gt 0 ]]; do + case $1 in + generate|activate|verify|retire) + PHASE="$1" + shift + ;; + --key-dir) KEY_DIR="$2"; shift 2 ;; + --key-type) KEY_TYPE="$2"; shift 2 ;; + --new-key) NEW_KEY_NAME="$2"; shift 2 ;; + --old-key) OLD_KEY_NAME="$2"; shift 2 ;; + --grace-days) GRACE_DAYS="$2"; shift 2 ;; + --ci-config) CI_CONFIG="$2"; shift 2 ;; + -h|--help) usage ;; + *) log_error "Unknown argument: $1"; usage ;; + esac +done + +if [[ -z "$PHASE" ]]; then + log_error "Phase is required" + usage +fi + +echo "" +echo "================================================" +echo " Signing Key Rotation - Phase: $PHASE" +echo "================================================" +echo "" + +case "$PHASE" in + generate) + log_step "Generating new signing key..." 
+ + mkdir -p "$KEY_DIR" + chmod 700 "$KEY_DIR" + + # Determine new key name if not specified + if [[ -z "$NEW_KEY_NAME" ]]; then + HIGHEST=$(ls "$KEY_DIR" 2>/dev/null | grep -E '^signing-key-v[0-9]+' | \ + sed 's/signing-key-v//' | sed 's/\.pem$//' | sort -n | tail -1 || echo "0") + NEW_VERSION=$((HIGHEST + 1)) + NEW_KEY_NAME="signing-key-v${NEW_VERSION}" + fi + + NEW_KEY_PATH="$KEY_DIR/${NEW_KEY_NAME}.pem" + NEW_PUB_PATH="$KEY_DIR/${NEW_KEY_NAME}.pub" + + if [[ -f "$NEW_KEY_PATH" ]]; then + log_error "Key already exists: $NEW_KEY_PATH" + exit 1 + fi + + case "$KEY_TYPE" in + ecdsa-p256) + openssl ecparam -name prime256v1 -genkey -noout -out "$NEW_KEY_PATH" + openssl ec -in "$NEW_KEY_PATH" -pubout -out "$NEW_PUB_PATH" 2>/dev/null + ;; + ecdsa-p384) + openssl ecparam -name secp384r1 -genkey -noout -out "$NEW_KEY_PATH" + openssl ec -in "$NEW_KEY_PATH" -pubout -out "$NEW_PUB_PATH" 2>/dev/null + ;; + rsa-4096) + openssl genrsa -out "$NEW_KEY_PATH" 4096 + openssl rsa -in "$NEW_KEY_PATH" -pubout -out "$NEW_PUB_PATH" 2>/dev/null + ;; + *) + log_error "Unknown key type: $KEY_TYPE" + exit 1 + ;; + esac + + chmod 600 "$NEW_KEY_PATH" + chmod 644 "$NEW_PUB_PATH" + + log_info "" + log_info "New signing key generated:" + log_info " Private key: $NEW_KEY_PATH" + log_info " Public key: $NEW_PUB_PATH" + log_info "" + log_info "Key fingerprint:" + openssl dgst -sha256 -r "$NEW_PUB_PATH" | cut -d' ' -f1 + log_info "" + log_warn "Store the public key securely for distribution." + log_warn "Next: Run '$0 activate' to enable dual-key signing." + ;; + + activate) + log_step "Activating dual-key signing..." + + # List available keys + log_info "Available signing keys in $KEY_DIR:" + ls -la "$KEY_DIR"/*.pem 2>/dev/null || log_warn "No .pem files found" + + if [[ -n "$CI_CONFIG" ]] && [[ -f "$CI_CONFIG" ]]; then + log_info "" + log_info "CI config file: $CI_CONFIG" + log_warn "Manual update required:" + echo " 1. Add the new key path to signing configuration" + echo " 2. Ensure both old and new keys can sign" + echo " 3. Update verification to accept both key signatures" + fi + + log_info "" + log_info "Dual-key activation checklist:" + echo " [ ] New key added to CI/CD pipeline" + echo " [ ] New public key distributed to verifiers" + echo " [ ] Both keys tested for signing" + echo " [ ] Grace period documented: $GRACE_DAYS days" + log_info "" + log_warn "Grace period starts now. Do not retire old key for $GRACE_DAYS days." + log_info "Next: Run '$0 verify' to confirm both keys work." + ;; + + verify) + log_step "Verifying signing key status..." 
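+
+        # The check below only proves each private key can still produce a
+        # signature. A fuller round-trip (sketch only, not wired into this
+        # phase; key names illustrative) would also verify the signature
+        # against the matching public key:
+        #   openssl dgst -sha256 -sign "$KEY_DIR/signing-key-v2.pem" -out /tmp/probe.sig /tmp/probe.txt
+        #   openssl dgst -sha256 -verify "$KEY_DIR/signing-key-v2.pub" -signature /tmp/probe.sig /tmp/probe.txt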
+ + # Test each key + log_info "Testing signing keys in $KEY_DIR:" + + TEST_FILE=$(mktemp) + echo "StellaOps key rotation verification $(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$TEST_FILE" + + for keyfile in "$KEY_DIR"/*.pem; do + if [[ -f "$keyfile" ]]; then + keyname=$(basename "$keyfile" .pem) + TEST_SIG=$(mktemp) + + if openssl dgst -sha256 -sign "$keyfile" -out "$TEST_SIG" "$TEST_FILE" 2>/dev/null; then + log_info " $keyname: OK (signing works)" + else + log_warn " $keyname: FAILED (cannot sign)" + fi + + rm -f "$TEST_SIG" + fi + done + + rm -f "$TEST_FILE" + + log_info "" + log_info "Verification checklist:" + echo " [ ] All active keys can sign successfully" + echo " [ ] Old attestations still verify" + echo " [ ] New attestations verify with new key" + echo " [ ] Verifiers have both public keys" + ;; + + retire) + if [[ -z "$OLD_KEY_NAME" ]]; then + log_error "retire requires --old-key" + usage + fi + + OLD_KEY_PATH="$KEY_DIR/${OLD_KEY_NAME}.pem" + OLD_PUB_PATH="$KEY_DIR/${OLD_KEY_NAME}.pub" + + if [[ ! -f "$OLD_KEY_PATH" ]] && [[ ! -f "$KEY_DIR/${OLD_KEY_NAME}" ]]; then + log_error "Old key not found: $OLD_KEY_NAME" + exit 1 + fi + + log_step "Retiring old signing key: $OLD_KEY_NAME" + log_warn "This is IRREVERSIBLE. Ensure:" + echo " 1. Grace period ($GRACE_DAYS days) has passed" + echo " 2. All systems have been updated to use new key" + echo " 3. Old attestations have been resigned or archived" + + read -p "Type 'RETIRE' to proceed: " CONFIRM + if [[ "$CONFIRM" != "RETIRE" ]]; then + log_error "Aborted" + exit 1 + fi + + # Archive old key (don't delete immediately) + ARCHIVE_DIR="$KEY_DIR/archived" + mkdir -p "$ARCHIVE_DIR" + chmod 700 "$ARCHIVE_DIR" + + TIMESTAMP=$(date -u +%Y%m%d%H%M%S) + if [[ -f "$OLD_KEY_PATH" ]]; then + mv "$OLD_KEY_PATH" "$ARCHIVE_DIR/${OLD_KEY_NAME}-retired-${TIMESTAMP}.pem" + fi + if [[ -f "$OLD_PUB_PATH" ]]; then + mv "$OLD_PUB_PATH" "$ARCHIVE_DIR/${OLD_KEY_NAME}-retired-${TIMESTAMP}.pub" + fi + + log_info "" + log_info "Old key archived to: $ARCHIVE_DIR/" + log_info "Key rotation complete!" + log_warn "" + log_warn "Post-retirement checklist:" + echo " [ ] Remove old key from CI/CD configuration" + echo " [ ] Update documentation" + echo " [ ] Notify stakeholders of completion" + echo " [ ] Delete archived key after retention period" + ;; +esac + +echo "" diff --git a/deploy/scripts/test-local.sh b/deploy/scripts/test-local.sh new file mode 100644 index 000000000..4a1f360ce --- /dev/null +++ b/deploy/scripts/test-local.sh @@ -0,0 +1,183 @@ +#!/bin/bash +# test-local.sh - Run full CI test suite locally using Docker +# Sprint: SPRINT_20251226_006_CICD +# +# Usage: +# ./devops/scripts/test-local.sh # Run all PR-gating tests +# ./devops/scripts/test-local.sh --category Unit # Run specific category +# ./devops/scripts/test-local.sh --build-only # Only build, skip tests +# ./devops/scripts/test-local.sh --no-docker # Run directly without Docker + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." 
&& pwd)" + +# Configuration +CI_IMAGE="stellaops-ci:local" +DOCKERFILE="$REPO_ROOT/devops/docker/Dockerfile.ci" +RESULTS_DIR="$REPO_ROOT/TestResults" + +# Default options +USE_DOCKER=true +BUILD_ONLY=false +SPECIFIC_CATEGORY="" +REBUILD_IMAGE=false + +# PR-gating test categories +PR_GATING_CATEGORIES=(Unit Architecture Contract Integration Security Golden) + +# Parse arguments +while [[ $# -gt 0 ]]; do + case $1 in + --category) + SPECIFIC_CATEGORY="$2" + shift 2 + ;; + --build-only) + BUILD_ONLY=true + shift + ;; + --no-docker) + USE_DOCKER=false + shift + ;; + --rebuild) + REBUILD_IMAGE=true + shift + ;; + --help) + echo "Usage: $0 [OPTIONS]" + echo "" + echo "Options:" + echo " --category CATEGORY Run only specific test category" + echo " --build-only Only build, skip tests" + echo " --no-docker Run directly without Docker container" + echo " --rebuild Force rebuild of CI Docker image" + echo " --help Show this help message" + echo "" + echo "Available categories: ${PR_GATING_CATEGORIES[*]}" + exit 0 + ;; + *) + echo "Unknown option: $1" + exit 1 + ;; + esac +done + +echo "=== StellaOps Local CI Test Runner ===" +echo "Repository: $REPO_ROOT" +echo "Use Docker: $USE_DOCKER" +echo "Build Only: $BUILD_ONLY" +echo "Category: ${SPECIFIC_CATEGORY:-All PR-gating}" + +# Create results directory +mkdir -p "$RESULTS_DIR" + +run_tests() { + local category=$1 + echo "" + echo "=== Running $category tests ===" + + dotnet test "$REPO_ROOT/src/StellaOps.sln" \ + --filter "Category=$category" \ + --configuration Release \ + --no-build \ + --logger "trx;LogFileName=${category}-tests.trx" \ + --results-directory "$RESULTS_DIR/$category" \ + --verbosity minimal || true +} + +run_build() { + echo "" + echo "=== Restoring dependencies ===" + dotnet restore "$REPO_ROOT/src/StellaOps.sln" + + echo "" + echo "=== Building solution ===" + dotnet build "$REPO_ROOT/src/StellaOps.sln" \ + --configuration Release \ + --no-restore +} + +run_all_tests() { + run_build + + if [[ "$BUILD_ONLY" == "true" ]]; then + echo "" + echo "=== Build completed (tests skipped) ===" + return + fi + + if [[ -n "$SPECIFIC_CATEGORY" ]]; then + run_tests "$SPECIFIC_CATEGORY" + else + for category in "${PR_GATING_CATEGORIES[@]}"; do + run_tests "$category" + done + fi + + echo "" + echo "=== Test Summary ===" + find "$RESULTS_DIR" -name "*.trx" -exec echo " Found: {}" \; + + # Convert TRX to JUnit if trx2junit is available + if command -v trx2junit &>/dev/null; then + echo "" + echo "=== Converting TRX to JUnit ===" + find "$RESULTS_DIR" -name "*.trx" -exec trx2junit {} \; 2>/dev/null || true + fi +} + +if [[ "$USE_DOCKER" == "true" ]]; then + # Check if Docker is available + if ! command -v docker &>/dev/null; then + echo "Error: Docker is not installed or not in PATH" + echo "Use --no-docker to run tests directly" + exit 1 + fi + + # Build CI image if needed + if [[ "$REBUILD_IMAGE" == "true" ]] || ! 
docker image inspect "$CI_IMAGE" &>/dev/null; then + echo "" + echo "=== Building CI Docker image ===" + docker build -t "$CI_IMAGE" -f "$DOCKERFILE" "$REPO_ROOT" + fi + + # Run in Docker container + echo "" + echo "=== Running in Docker container ===" + + DOCKER_ARGS=( + --rm + -v "$REPO_ROOT:/src" + -v "$RESULTS_DIR:/src/TestResults" + -e DOTNET_NOLOGO=1 + -e DOTNET_CLI_TELEMETRY_OPTOUT=1 + -w /src + ) + + # Mount Docker socket if available (for Testcontainers) + if [[ -S /var/run/docker.sock ]]; then + DOCKER_ARGS+=(-v /var/run/docker.sock:/var/run/docker.sock) + fi + + # Build test command + TEST_CMD="./devops/scripts/test-local.sh --no-docker" + if [[ -n "$SPECIFIC_CATEGORY" ]]; then + TEST_CMD="$TEST_CMD --category $SPECIFIC_CATEGORY" + fi + if [[ "$BUILD_ONLY" == "true" ]]; then + TEST_CMD="$TEST_CMD --build-only" + fi + + docker run "${DOCKER_ARGS[@]}" "$CI_IMAGE" bash -c "$TEST_CMD" +else + # Run directly + run_all_tests +fi + +echo "" +echo "=== Done ===" +echo "Results saved to: $RESULTS_DIR" diff --git a/deploy/scripts/test-package-publish.sh b/deploy/scripts/test-package-publish.sh new file mode 100644 index 000000000..81a895bb4 --- /dev/null +++ b/deploy/scripts/test-package-publish.sh @@ -0,0 +1,181 @@ +#!/bin/bash +# test-package-publish.sh - Test NuGet package publishing to local Gitea +# Sprint: SPRINT_20251226_004_CICD +# +# Prerequisites: +# - Docker running +# - Gitea test instance running (docker compose -f devops/compose/docker-compose.gitea-test.yaml up -d) +# - GITEA_TEST_TOKEN environment variable set +# - GITEA_TEST_OWNER environment variable set (default: stellaops) +# +# Usage: +# export GITEA_TEST_TOKEN="your-access-token" +# ./test-package-publish.sh # Test with sample package +# ./test-package-publish.sh --module Authority # Test specific module + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)" + +# Configuration +GITEA_URL="${GITEA_TEST_URL:-http://localhost:3000}" +GITEA_OWNER="${GITEA_TEST_OWNER:-stellaops}" +GITEA_TOKEN="${GITEA_TEST_TOKEN:-}" +TEST_MODULE="" +DRY_RUN=false + +# Colors +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[0;33m' +NC='\033[0m' + +# Parse arguments +while [[ $# -gt 0 ]]; do + case $1 in + --module) + TEST_MODULE="$2" + shift 2 + ;; + --dry-run) + DRY_RUN=true + shift + ;; + --help) + echo "Usage: $0 [OPTIONS]" + echo "" + echo "Options:" + echo " --module MODULE Test specific module (e.g., Authority)" + echo " --dry-run Validate without pushing" + echo " --help Show this help message" + echo "" + echo "Environment Variables:" + echo " GITEA_TEST_URL Gitea URL (default: http://localhost:3000)" + echo " GITEA_TEST_OWNER Package owner (default: stellaops)" + echo " GITEA_TEST_TOKEN Access token with package:write scope" + exit 0 + ;; + *) + echo "Unknown option: $1" + exit 1 + ;; + esac +done + +echo "=== Package Publishing Test ===" +echo "Gitea URL: $GITEA_URL" +echo "Owner: $GITEA_OWNER" +echo "Dry Run: $DRY_RUN" + +# Check prerequisites +if [[ -z "$GITEA_TOKEN" && "$DRY_RUN" == "false" ]]; then + echo -e "${RED}ERROR: GITEA_TEST_TOKEN environment variable is required${NC}" + echo "Generate a token at: $GITEA_URL/user/settings/applications" + exit 1 +fi + +# Check if Gitea is running +if ! 
curl -s "$GITEA_URL/api/healthz" >/dev/null 2>&1; then
+    echo -e "${YELLOW}WARNING: Gitea not reachable at $GITEA_URL${NC}"
+    echo "Start it with: docker compose -f devops/compose/docker-compose.gitea-test.yaml up -d"
+    if [[ "$DRY_RUN" == "false" ]]; then
+        exit 1
+    fi
+fi
+
+# NuGet source URL
+NUGET_SOURCE="$GITEA_URL/api/packages/$GITEA_OWNER/nuget/index.json"
+echo "NuGet Source: $NUGET_SOURCE"
+echo ""
+
+# Create a test package
+TEST_DIR="$REPO_ROOT/out/package-test"
+mkdir -p "$TEST_DIR"
+
+# If no module specified, use a simple test
+if [[ -z "$TEST_MODULE" ]]; then
+    echo "=== Creating Test Package ==="
+
+    # Create a minimal test package
+    TEST_PROJ_DIR="$TEST_DIR/StellaOps.PackageTest"
+    mkdir -p "$TEST_PROJ_DIR"
+
+    cat > "$TEST_PROJ_DIR/StellaOps.PackageTest.csproj" <<'EOF'
+<Project Sdk="Microsoft.NET.Sdk">
+  <PropertyGroup>
+    <TargetFramework>net10.0</TargetFramework>
+    <PackageId>StellaOps.PackageTest</PackageId>
+    <Version>0.0.1-test</Version>
+    <Authors>StellaOps</Authors>
+    <Description>Test package for registry validation</Description>
+    <PackageLicenseExpression>BUSL-1.1</PackageLicenseExpression>
+  </PropertyGroup>
+</Project>
+EOF
+
+    cat > "$TEST_PROJ_DIR/Class1.cs" <<'EOF'
+namespace StellaOps.PackageTest;
+public class TestClass { }
+EOF
+
+    echo "Building test package..."
+    dotnet pack "$TEST_PROJ_DIR/StellaOps.PackageTest.csproj" -c Release -o "$TEST_DIR/packages"
+
+    PACKAGE_FILE=$(find "$TEST_DIR/packages" -name "*.nupkg" | head -1)
+else
+    echo "=== Packing Module: $TEST_MODULE ==="
+
+    # Find the module's main project
+    MODULE_PROJ=$(find "$REPO_ROOT/src" -path "*/$TEST_MODULE/*" -name "StellaOps.$TEST_MODULE.csproj" | head -1)
+
+    if [[ -z "$MODULE_PROJ" ]]; then
+        echo -e "${RED}ERROR: Module project not found for $TEST_MODULE${NC}"
+        exit 1
+    fi
+
+    echo "Project: $MODULE_PROJ"
+    dotnet pack "$MODULE_PROJ" -c Release -p:Version=0.0.1-test -o "$TEST_DIR/packages"
+
+    PACKAGE_FILE=$(find "$TEST_DIR/packages" -name "*.nupkg" | head -1)
+fi
+
+if [[ -z "$PACKAGE_FILE" ]]; then
+    echo -e "${RED}ERROR: No package file created${NC}"
+    exit 1
+fi
+
+echo ""
+echo "Package created: $PACKAGE_FILE"
+echo ""
+
+if [[ "$DRY_RUN" == "true" ]]; then
+    echo -e "${YELLOW}=== DRY RUN: Skipping push ===${NC}"
+    echo "Package validated successfully!"
+    echo ""
+    echo "To push manually:"
+    echo "  dotnet nuget push \"$PACKAGE_FILE\" \\"
+    echo "    --source $NUGET_SOURCE \\"
+    echo "    --api-key YOUR_TOKEN"
+else
+    echo "=== Pushing Package ==="
+    if dotnet nuget push "$PACKAGE_FILE" \
+        --source "$NUGET_SOURCE" \
+        --api-key "$GITEA_TOKEN" \
+        --skip-duplicate; then
+        echo ""
+        echo -e "${GREEN}SUCCESS: Package pushed to Gitea registry${NC}"
+        echo "View at: $GITEA_URL/$GITEA_OWNER/-/packages"
+    else
+        echo ""
+        echo -e "${RED}FAILED: Package push failed${NC}"
+        exit 1
+    fi
+fi
+
+echo ""
+echo "=== Cleanup ==="
+rm -rf "$TEST_DIR"
+echo "Test directory cleaned up"
+echo ""
+echo -e "${GREEN}Done!${NC}" diff --git a/deploy/scripts/validate-before-commit.sh b/deploy/scripts/validate-before-commit.sh new file mode 100644 index 000000000..d6cc6b885 --- /dev/null +++ b/deploy/scripts/validate-before-commit.sh @@ -0,0 +1,318 @@ +#!/usr/bin/env bash
+# =============================================================================
+# PRE-COMMIT VALIDATION SCRIPT
+# =============================================================================
+# Run this script before committing to ensure all CI checks will pass.
+# +# Usage: +# ./devops/scripts/validate-before-commit.sh [level] +# +# Levels: +# quick - Smoke test only (~2 min) +# pr - Full PR-gating suite (~15 min) [default] +# full - All tests including extended (~45 min) +# +# Examples: +# ./devops/scripts/validate-before-commit.sh # PR-gating +# ./devops/scripts/validate-before-commit.sh quick # Smoke only +# ./devops/scripts/validate-before-commit.sh full # Everything +# +# ============================================================================= + +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)" + +# Colors +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +BLUE='\033[0;34m' +CYAN='\033[0;36m' +NC='\033[0m' + +# Validation level +LEVEL="${1:-pr}" + +# ============================================================================= +# UTILITIES +# ============================================================================= + +print_header() { + echo "" + echo -e "${CYAN}=============================================${NC}" + echo -e "${CYAN} $1${NC}" + echo -e "${CYAN}=============================================${NC}" + echo "" +} + +print_step() { + echo -e "${BLUE}>>> $1${NC}" +} + +print_success() { + echo -e "${GREEN}[PASS] $1${NC}" +} + +print_fail() { + echo -e "${RED}[FAIL] $1${NC}" +} + +print_warn() { + echo -e "${YELLOW}[WARN] $1${NC}" +} + +print_info() { + echo -e "${CYAN}[INFO] $1${NC}" +} + +# ============================================================================= +# CHECKS +# ============================================================================= + +check_git_status() { + print_step "Checking git status..." + + # Check for uncommitted changes + if ! git diff --quiet 2>/dev/null; then + print_warn "You have unstaged changes" + fi + + # Check for untracked files + local untracked + untracked=$(git ls-files --others --exclude-standard 2>/dev/null | wc -l) + if [[ "$untracked" -gt 0 ]]; then + print_warn "You have $untracked untracked file(s)" + fi + + # Show current branch + local branch + branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null) + print_info "Current branch: $branch" +} + +check_dependencies() { + print_step "Checking dependencies..." + + local missing=0 + + # Check .NET + if ! command -v dotnet &>/dev/null; then + print_fail ".NET SDK not found" + missing=1 + else + local version + version=$(dotnet --version) + print_success ".NET SDK: $version" + fi + + # Check Docker + if ! command -v docker &>/dev/null; then + print_warn "Docker not found (some tests may fail)" + else + if docker info &>/dev/null; then + print_success "Docker: running" + else + print_warn "Docker: not running" + fi + fi + + # Check Git + if ! command -v git &>/dev/null; then + print_fail "Git not found" + missing=1 + else + print_success "Git: installed" + fi + + return $missing +} + +run_smoke_tests() { + print_step "Running smoke tests..." + + if "$SCRIPT_DIR/local-ci.sh" smoke; then + print_success "Smoke tests passed" + return 0 + else + print_fail "Smoke tests failed" + return 1 + fi +} + +run_pr_tests() { + print_step "Running PR-gating suite..." + + if "$SCRIPT_DIR/local-ci.sh" pr; then + print_success "PR-gating suite passed" + return 0 + else + print_fail "PR-gating suite failed" + return 1 + fi +} + +run_full_tests() { + print_step "Running full test suite..." 
+ + if "$SCRIPT_DIR/local-ci.sh" full; then + print_success "Full test suite passed" + return 0 + else + print_fail "Full test suite failed" + return 1 + fi +} + +run_module_tests() { + print_step "Running module tests..." + + if "$SCRIPT_DIR/local-ci.sh" module; then + print_success "Module tests passed" + return 0 + else + print_fail "Module tests failed" + return 1 + fi +} + +validate_helm() { + if command -v helm &>/dev/null; then + print_step "Validating Helm chart..." + local chart="$REPO_ROOT/devops/helm/stellaops" + if [[ -d "$chart" ]]; then + if helm lint "$chart" &>/dev/null; then + print_success "Helm chart valid" + else + print_warn "Helm chart has warnings" + fi + fi + fi +} + +validate_compose() { + print_step "Validating Docker Compose..." + local compose="$REPO_ROOT/devops/compose/docker-compose.ci.yaml" + if [[ -f "$compose" ]]; then + if docker compose -f "$compose" config &>/dev/null; then + print_success "Docker Compose valid" + else + print_warn "Docker Compose has issues" + fi + fi +} + +# ============================================================================= +# MAIN +# ============================================================================= + +main() { + print_header "Pre-Commit Validation" + print_info "Level: $LEVEL" + print_info "Repository: $REPO_ROOT" + + local start_time + start_time=$(date +%s) + local failed=0 + + # Always run these checks + check_git_status + check_dependencies || failed=1 + + if [[ $failed -eq 1 ]]; then + print_fail "Dependency check failed" + exit 1 + fi + + # Run appropriate test level + case "$LEVEL" in + quick|smoke) + run_smoke_tests || failed=1 + ;; + pr|default) + run_smoke_tests || failed=1 + if [[ $failed -eq 0 ]]; then + run_module_tests || failed=1 + fi + if [[ $failed -eq 0 ]]; then + run_pr_tests || failed=1 + fi + validate_helm + validate_compose + ;; + full|all) + run_smoke_tests || failed=1 + if [[ $failed -eq 0 ]]; then + run_full_tests || failed=1 + fi + validate_helm + validate_compose + ;; + *) + print_fail "Unknown level: $LEVEL" + echo "Valid levels: quick, pr, full" + exit 1 + ;; + esac + + # Calculate duration + local end_time + end_time=$(date +%s) + local duration=$((end_time - start_time)) + local minutes=$((duration / 60)) + local seconds=$((duration % 60)) + + # Final summary + print_header "Summary" + print_info "Duration: ${minutes}m ${seconds}s" + + if [[ $failed -eq 0 ]]; then + echo "" + echo -e "${GREEN}=============================================${NC}" + echo -e "${GREEN} ALL CHECKS PASSED - Ready to commit!${NC}" + echo -e "${GREEN}=============================================${NC}" + echo "" + echo "Next steps:" + echo " git add -A" + echo " git commit -m \"Your commit message\"" + echo "" + exit 0 + else + echo "" + echo -e "${RED}=============================================${NC}" + echo -e "${RED} VALIDATION FAILED - Do not commit!${NC}" + echo -e "${RED}=============================================${NC}" + echo "" + echo "Check the logs in: out/local-ci/logs/" + echo "" + exit 1 + fi +} + +# Show usage if --help +if [[ "${1:-}" == "--help" ]] || [[ "${1:-}" == "-h" ]]; then + cat </dev/null; then + echo "Error: Docker is not installed" + exit 1 +fi + +# Check compose directory exists +if [[ ! 
-d "$COMPOSE_DIR" ]]; then + echo "Error: Compose directory not found: $COMPOSE_DIR" + exit 1 +fi + +# Determine profiles to validate +if [[ $# -gt 0 ]]; then + PROFILES=("$@") +else + PROFILES=("${DEFAULT_PROFILES[@]}") +fi + +FAILED=0 +PASSED=0 +SKIPPED=0 + +# Validate base compose file first +BASE_COMPOSE="$COMPOSE_DIR/docker-compose.yml" +if [[ -f "$BASE_COMPOSE" ]]; then + echo "" + echo "=== Validating base: docker-compose.yml ===" + if docker compose -f "$BASE_COMPOSE" config --quiet 2>/dev/null; then + echo " [PASS] docker-compose.yml" + ((PASSED++)) + else + echo " [FAIL] docker-compose.yml" + docker compose -f "$BASE_COMPOSE" config 2>&1 | head -20 + ((FAILED++)) + fi +else + echo "" + echo "Warning: Base compose file not found: $BASE_COMPOSE" +fi + +# Validate each profile +for profile in "${PROFILES[@]}"; do + # Check for both .yml and .yaml extensions + PROFILE_FILE="$COMPOSE_DIR/docker-compose.${profile}.yaml" + if [[ ! -f "$PROFILE_FILE" ]]; then + PROFILE_FILE="$COMPOSE_DIR/docker-compose.${profile}.yml" + fi + + echo "" + echo "=== Validating profile: $profile ===" + + if [[ ! -f "$PROFILE_FILE" ]]; then + echo " [SKIP] Profile file not found: docker-compose.${profile}.yml" + ((SKIPPED++)) + continue + fi + + # Validate profile alone + if docker compose -f "$PROFILE_FILE" config --quiet 2>/dev/null; then + echo " [PASS] docker-compose.${profile}.yml (standalone)" + else + echo " [FAIL] docker-compose.${profile}.yml (standalone)" + docker compose -f "$PROFILE_FILE" config 2>&1 | head -10 + ((FAILED++)) + continue + fi + + # Validate profile with base + if [[ -f "$BASE_COMPOSE" ]]; then + if docker compose -f "$BASE_COMPOSE" -f "$PROFILE_FILE" config --quiet 2>/dev/null; then + echo " [PASS] docker-compose.yml + docker-compose.${profile}.yml (merged)" + ((PASSED++)) + else + echo " [FAIL] Merged validation failed" + docker compose -f "$BASE_COMPOSE" -f "$PROFILE_FILE" config 2>&1 | head -10 + ((FAILED++)) + fi + fi +done + +# Validate Helm chart if present +HELM_DIR="$REPO_ROOT/devops/helm/stellaops" +if [[ -d "$HELM_DIR" ]]; then + echo "" + echo "=== Validating Helm chart ===" + if command -v helm &>/dev/null; then + if helm lint "$HELM_DIR" --quiet 2>/dev/null; then + echo " [PASS] Helm chart: stellaops" + ((PASSED++)) + else + echo " [FAIL] Helm chart: stellaops" + helm lint "$HELM_DIR" 2>&1 | head -20 + ((FAILED++)) + fi + else + echo " [SKIP] Helm not installed" + ((SKIPPED++)) + fi +fi + +# Summary +echo "" +echo "=== Validation Summary ===" +echo " Passed: $PASSED" +echo " Failed: $FAILED" +echo " Skipped: $SKIPPED" + +if [[ $FAILED -gt 0 ]]; then + echo "" + echo "ERROR: $FAILED validation(s) failed" + exit 1 +fi + +echo "" +echo "All validations passed!" diff --git a/deploy/secrets/surface-secrets-provisioning.md b/deploy/secrets/surface-secrets-provisioning.md new file mode 100644 index 000000000..2168f9a3b --- /dev/null +++ b/deploy/secrets/surface-secrets-provisioning.md @@ -0,0 +1,74 @@ +# Surface.Secrets provisioning playbook (OPS-SECRETS-01) + +Audience: DevOps/Ops teams shipping Scanner/Zastava/Orchestrator bundles. +Scope: how to provision secrets for the `StellaOps.Scanner.Surface.Secrets` providers across Kubernetes, Docker Compose, and Offline Kit. 
+ +## Secret types (handles only) +- Registry pull creds (CAS / OCI / private feeds) +- CAS/attestation tokens +- TLS client certs for Surface.FS / RustFS (optional) +- Feature flag/token bundles used by Surface.Validation (non-sensitive payloads still go through handles) + +All values are referenced via `secret://` handles inside service configs; plaintext never enters configs or SBOMs. + +## Provider matrix +| Environment | Provider | Location | Notes | +| --- | --- | --- | --- | +| Kubernetes | `kubernetes` | Namespace-scoped `Secret` objects | Mount-free: providers read via API using service account; RBAC must allow `get/list` on the secret names. | +| Compose (connected) | `file` | Host-mounted path (e.g., `/etc/stellaops/secrets`) | Keep per-tenant subfolders; chmod 700 root; avoid embedding in images. | +| Airgap/Offline Kit | `file` | Unpacked bundle `surface-secrets//...` | Bundled as encrypted payloads; decrypt/unpack to the expected directory before first boot. | +| Tests | `inline` | Environment variables or minimal inline JSON | Only for unit/system tests; disable in prod (`SCANNER_SURFACE_SECRETS_ALLOW_INLINE=false`). | + +## Kubernetes workflow +1) Namespace: choose one per environment (e.g., `stellaops-prod`). +2) Secret layout: one K8s Secret per tenant+component to keep RBAC narrow. +``` +apiVersion: v1 +kind: Secret +metadata: + name: scanner-secrets-default + namespace: stellaops-prod +stringData: + registry.json: | + { "type": "registry", "name": "default", "username": "svc", "password": "********", "scopes": ["stella/*"] } + cas.json: | + { "type": "cas-token", "name": "default", "token": "********" } +``` +3) RBAC: service accounts for Scanner Worker/WebService and Zastava Observer/Webhook need `get/list` on these secrets. +4) Values: set in Helm via `surface.secrets.provider=kubernetes` and `surface.secrets.namespace=` (already templated in `values*.yaml`). + +## Compose workflow +1) Create secrets directory (default `/etc/stellaops/secrets`). +2) Layout per schema (see `docs/modules/scanner/design/surface-secrets-schema.md`): +``` +/etc/stellaops/secrets/ + tenants/default/registry/default.json + tenants/default/cas/default.json +``` +3) Set env in `.env` files: +``` +SCANNER_SURFACE_SECRETS_PROVIDER=file +SCANNER_SURFACE_SECRETS_ROOT=/etc/stellaops/secrets +SCANNER_SURFACE_SECRETS_NAMESPACE= +SCANNER_SURFACE_SECRETS_ALLOW_INLINE=false +ZASTAVA_SURFACE_SECRETS_PROVIDER=${SCANNER_SURFACE_SECRETS_PROVIDER} +ZASTAVA_SURFACE_SECRETS_ROOT=${SCANNER_SURFACE_SECRETS_ROOT} +``` +4) Ensure docker-compose mounts the secrets path read-only to the services that need it. Use `SURFACE_SECRETS_HOST_PATH` to point at the decrypted bundle on the host (defaults to `./offline/surface-secrets` in the Compose profiles). + +## Offline Kit workflow +- The offline kit already ships encrypted `surface-secrets` bundles (see `docs/24_OFFLINE_KIT.md`). +- Operators must: (a) decrypt using the provided key, (b) place contents under `/etc/stellaops/secrets` (or override `*_SURFACE_SECRETS_ROOT`), (c) keep permissions 700/600. +- Set `*_SURFACE_SECRETS_PROVIDER=file` and root path envs as in Compose; Kubernetes provider is not available offline. + +## Validation & observability +- Surface.Validation will fail readiness if required secrets are missing or malformed. +- Metrics/Logs: look for `surface.secrets.*` issue codes; readiness should fail on `Error` severities. 
+- For CI smoke: run service with `SURFACE_SECRETS_ALLOW_INLINE=true` and inject test secrets via env for deterministic integration tests. + +## Quick checklist +- [ ] Provider selected per environment (`kubernetes`/`file`/`inline`) +- [ ] Secrets directory or namespace populated per schema +- [ ] RBAC (K8s) or file permissions (Compose/offline) locked down +- [ ] Env variables set for both Scanner (`SCANNER_*`) and Zastava (`ZASTAVA_*` prefixes) +- [ ] Readiness wired to Surface.Validation so missing secrets block rollout diff --git a/deploy/telemetry/alerts/alerts-slo.yaml b/deploy/telemetry/alerts/alerts-slo.yaml new file mode 100644 index 000000000..5738c1d34 --- /dev/null +++ b/deploy/telemetry/alerts/alerts-slo.yaml @@ -0,0 +1,36 @@ +groups: + - name: slo-burn + rules: + - alert: SLOBurnRateFast + expr: | + (rate(service_request_errors_total[5m]) / rate(service_requests_total[5m])) > + 4 * (1 - 0.99) + for: 5m + labels: + severity: critical + team: devops + annotations: + summary: "Fast burn: 99% SLO breached" + description: "Error budget burn (5m) exceeds fast threshold." + - alert: SLOBurnRateSlow + expr: | + (rate(service_request_errors_total[1h]) / rate(service_requests_total[1h])) > + 1 * (1 - 0.99) + for: 1h + labels: + severity: warning + team: devops + annotations: + summary: "Slow burn: 99% SLO at risk" + description: "Error budget burn (1h) exceeds slow threshold." + - name: slo-webhook + rules: + - alert: SLOWebhookFailures + expr: rate(slo_webhook_failures_total[5m]) > 0 + for: 10m + labels: + severity: warning + team: devops + annotations: + summary: "SLO webhook failures" + description: "Webhook emitter has failures in last 5m." diff --git a/deploy/telemetry/alerts/export-center-alerts.yaml b/deploy/telemetry/alerts/export-center-alerts.yaml new file mode 100644 index 000000000..6d38be9d5 --- /dev/null +++ b/deploy/telemetry/alerts/export-center-alerts.yaml @@ -0,0 +1,164 @@ +# ExportCenter Alert Rules +# SLO Burn-rate alerts for export service reliability + +groups: + - name: export-center-slo + interval: 30s + rules: + # SLO: 99.5% success rate target + # Error budget: 0.5% (432 errors per day at 86400 requests/day) + + # Fast burn - 2% budget consumption in 1 hour (critical) + - alert: ExportCenterHighErrorBurnRate + expr: | + ( + sum(rate(export_runs_failed_total[1h])) + / + sum(rate(export_runs_total[1h])) + ) > (14.4 * 0.005) + for: 2m + labels: + severity: critical + service: export-center + slo: availability + annotations: + summary: "ExportCenter high error burn rate" + description: "Error rate is {{ $value | humanizePercentage }} over the last hour, consuming error budget at 14.4x the sustainable rate." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/high-error-rate" + + # Slow burn - 10% budget consumption in 6 hours (warning) + - alert: ExportCenterElevatedErrorBurnRate + expr: | + ( + sum(rate(export_runs_failed_total[6h])) + / + sum(rate(export_runs_total[6h])) + ) > (6 * 0.005) + for: 5m + labels: + severity: warning + service: export-center + slo: availability + annotations: + summary: "ExportCenter elevated error burn rate" + description: "Error rate is {{ $value | humanizePercentage }} over the last 6 hours, consuming error budget at 6x the sustainable rate." 
+ runbook_url: "https://docs.stellaops.io/runbooks/export-center/elevated-error-rate" + + - name: export-center-latency + interval: 30s + rules: + # SLO: 95% of exports complete within 120s + # Fast burn - p95 latency exceeding threshold + - alert: ExportCenterHighLatency + expr: | + histogram_quantile(0.95, + sum(rate(export_run_duration_seconds_bucket[5m])) by (le) + ) > 120 + for: 5m + labels: + severity: warning + service: export-center + slo: latency + annotations: + summary: "ExportCenter high latency" + description: "95th percentile export duration is {{ $value | humanizeDuration }}, exceeding 120s SLO target." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/high-latency" + + # Critical latency - p99 exceeding 5 minutes + - alert: ExportCenterCriticalLatency + expr: | + histogram_quantile(0.99, + sum(rate(export_run_duration_seconds_bucket[5m])) by (le) + ) > 300 + for: 2m + labels: + severity: critical + service: export-center + slo: latency + annotations: + summary: "ExportCenter critical latency" + description: "99th percentile export duration is {{ $value | humanizeDuration }}, indicating severe performance degradation." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/critical-latency" + + - name: export-center-capacity + interval: 60s + rules: + # Queue buildup warning + - alert: ExportCenterHighConcurrency + expr: sum(export_runs_in_progress) > 50 + for: 5m + labels: + severity: warning + service: export-center + annotations: + summary: "ExportCenter high concurrency" + description: "{{ $value }} exports currently in progress. Consider scaling or investigating slow exports." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/high-concurrency" + + # Stuck exports - exports running longer than 30 minutes + - alert: ExportCenterStuckExports + expr: | + histogram_quantile(0.99, + sum(rate(export_run_duration_seconds_bucket{status!="completed"}[1h])) by (le) + ) > 1800 + for: 10m + labels: + severity: warning + service: export-center + annotations: + summary: "ExportCenter potentially stuck exports" + description: "Some exports may be stuck - 99th percentile duration for incomplete exports exceeds 30 minutes." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/stuck-exports" + + - name: export-center-errors + interval: 30s + rules: + # Specific error code spike + - alert: ExportCenterErrorCodeSpike + expr: | + sum by (error_code) ( + rate(export_runs_failed_total[5m]) + ) > 0.1 + for: 5m + labels: + severity: warning + service: export-center + annotations: + summary: "ExportCenter error code spike: {{ $labels.error_code }}" + description: "Error code {{ $labels.error_code }} is occurring at {{ $value | humanize }}/s rate." + runbook_url: "https://docs.stellaops.io/runbooks/export-center/error-codes" + + # No successful exports in 15 minutes (when there is traffic) + - alert: ExportCenterNoSuccessfulExports + expr: | + ( + sum(rate(export_runs_total[15m])) > 0 + ) + and + ( + sum(rate(export_runs_success_total[15m])) == 0 + ) + for: 10m + labels: + severity: critical + service: export-center + annotations: + summary: "ExportCenter no successful exports" + description: "No exports have completed successfully in the last 15 minutes despite ongoing attempts." 
+ runbook_url: "https://docs.stellaops.io/runbooks/export-center/no-successful-exports" + + - name: export-center-deprecation + interval: 5m + rules: + # Deprecated endpoint usage + - alert: ExportCenterDeprecatedEndpointUsage + expr: | + sum(rate(export_center_deprecated_endpoint_access_total[1h])) > 0 + for: 1h + labels: + severity: info + service: export-center + annotations: + summary: "Deprecated export endpoints still in use" + description: "Legacy /exports endpoints are still being accessed at {{ $value | humanize }}/s. Migration to v1 API recommended." + runbook_url: "https://docs.stellaops.io/api/export-center/migration" diff --git a/deploy/telemetry/alerts/policy-alerts.yaml b/deploy/telemetry/alerts/policy-alerts.yaml new file mode 100644 index 000000000..c614ad003 --- /dev/null +++ b/deploy/telemetry/alerts/policy-alerts.yaml @@ -0,0 +1,52 @@ +groups: + - name: policy-pipeline + rules: + - alert: PolicyCompileLatencyP99High + expr: histogram_quantile(0.99, sum(rate(policy_compile_duration_seconds_bucket[5m])) by (le)) > 5 + for: 10m + labels: + severity: warning + service: policy + annotations: + summary: "Policy compile latency elevated (p99)" + description: "p99 compile duration has been >5s for 10m" + + - alert: PolicySimulationQueueBacklog + expr: sum(policy_simulation_queue_depth) > 100 + for: 10m + labels: + severity: warning + service: policy + annotations: + summary: "Policy simulation backlog" + description: "Simulation queue depth above 100 for 10m" + + - alert: PolicyApprovalLatencyHigh + expr: histogram_quantile(0.95, sum(rate(policy_approval_latency_seconds_bucket[5m])) by (le)) > 30 + for: 15m + labels: + severity: critical + service: policy + annotations: + summary: "Policy approval latency high" + description: "p95 approval latency above 30s for 15m" + + - alert: PolicyPromotionFailureRate + expr: clamp_min(rate(policy_promotion_outcomes_total{outcome="failure"}[15m]), 0) / clamp_min(rate(policy_promotion_outcomes_total[15m]), 1) > 0.2 + for: 10m + labels: + severity: critical + service: policy + annotations: + summary: "Policy promotion failure rate elevated" + description: "Failures exceed 20% of promotions over 15m" + + - alert: PolicyPromotionStall + expr: rate(policy_promotion_outcomes_total{outcome="success"}[10m]) == 0 and sum(policy_simulation_queue_depth) > 0 + for: 10m + labels: + severity: warning + service: policy + annotations: + summary: "Policy promotion stalled" + description: "No successful promotions while work is queued" diff --git a/deploy/telemetry/alerts/scanner-fn-drift-alerts.yaml b/deploy/telemetry/alerts/scanner-fn-drift-alerts.yaml new file mode 100644 index 000000000..5572e5101 --- /dev/null +++ b/deploy/telemetry/alerts/scanner-fn-drift-alerts.yaml @@ -0,0 +1,42 @@ +# Scanner FN-Drift Alert Rules +# SLO alerts for false-negative drift thresholds (30-day rolling window) + +groups: + - name: scanner-fn-drift + interval: 30s + rules: + - alert: ScannerFnDriftWarning + expr: scanner_fn_drift_percent > 1.0 + for: 5m + labels: + severity: warning + service: scanner + slo: fn-drift + annotations: + summary: "Scanner FN-Drift rate above warning threshold" + description: "FN-Drift is {{ $value | humanizePercentage }} (> 1.0%) over the 30-day rolling window." 
+          runbook_url: "https://docs.stellaops.io/runbooks/scanner/fn-drift-warning"
+
+      - alert: ScannerFnDriftCritical
+        expr: scanner_fn_drift_percent > 2.5
+        for: 5m
+        labels:
+          severity: critical
+          service: scanner
+          slo: fn-drift
+        annotations:
+          summary: "Scanner FN-Drift rate above critical threshold"
+          description: "FN-Drift is {{ $value | humanize }}% (> 2.5%) over the 30-day rolling window."
+          runbook_url: "https://docs.stellaops.io/runbooks/scanner/fn-drift-critical"
+
+      - alert: ScannerFnDriftEngineViolation
+        expr: scanner_fn_drift_cause_engine > 0
+        for: 1m
+        labels:
+          severity: page
+          service: scanner
+          slo: determinism
+        annotations:
+          summary: "Engine-caused FN drift detected (determinism violation)"
+          description: "Engine-caused FN drift count is {{ $value }} (> 0). This indicates non-feed, non-policy changes affecting outcomes."
+          runbook_url: "https://docs.stellaops.io/runbooks/scanner/fn-drift-engine-violation"
diff --git a/deploy/telemetry/alerts/signals-alerts.yaml b/deploy/telemetry/alerts/signals-alerts.yaml
new file mode 100644
index 000000000..7e5ca5efb
--- /dev/null
+++ b/deploy/telemetry/alerts/signals-alerts.yaml
@@ -0,0 +1,54 @@
+groups:
+  - name: signals-pipeline
+    rules:
+      - alert: SignalsScoringLatencyP95High
+        expr: histogram_quantile(0.95, sum(rate(signals_reachability_scoring_duration_seconds_bucket[5m])) by (le)) > 2
+        for: 10m
+        labels:
+          severity: warning
+          service: signals
+        annotations:
+          summary: "Signals scoring latency high (p95)"
+          description: "Reachability scoring p95 exceeds 2s for 10m"
+
+      - alert: SignalsCacheMissRateHigh
+        expr: |
+          clamp_min(rate(signals_cache_misses_total[5m]), 0)
+          / clamp_min(rate(signals_cache_hits_total[5m]) + rate(signals_cache_misses_total[5m]), 1) > 0.3
+        for: 10m
+        labels:
+          severity: warning
+          service: signals
+        annotations:
+          summary: "Signals cache miss rate high"
+          description: "Cache miss ratio >30% over 10m; investigate Redis or key churn."
+ + - alert: SignalsCacheDown + expr: signals_cache_available == 0 + for: 2m + labels: + severity: critical + service: signals + annotations: + summary: "Signals cache unavailable" + description: "Redis cache reported unavailable for >2m" + + - alert: SignalsSensorStaleness + expr: time() - max(signals_sensor_last_seen_timestamp_seconds) by (sensor) > 900 + for: 5m + labels: + severity: warning + service: signals + annotations: + summary: "Signals sensor stale" + description: "No updates from sensor for >15 minutes" + + - alert: SignalsIngestionErrorRate + expr: clamp_min(rate(signals_ingestion_failures_total[5m]), 0) / clamp_min(rate(signals_ingestion_total[5m]), 1) > 0.05 + for: 5m + labels: + severity: critical + service: signals + annotations: + summary: "Signals ingestion failures elevated" + description: "Ingestion failure ratio above 5% over 5m" diff --git a/deploy/telemetry/alerts/stella-p0-alerts.yml b/deploy/telemetry/alerts/stella-p0-alerts.yml new file mode 100644 index 000000000..b02a95591 --- /dev/null +++ b/deploy/telemetry/alerts/stella-p0-alerts.yml @@ -0,0 +1,118 @@ +# Sprint: SPRINT_20260117_028_Telemetry_p0_metrics +# Task: P0M-006 - Alerting Rules +# P0 Product Metrics Alert Rules + +groups: + - name: stella-p0-metrics + rules: + # P0M-001: Time to First Verified Release + - alert: StellaTimeToFirstReleaseHigh + expr: | + histogram_quantile(0.90, sum(rate(stella_time_to_first_verified_release_seconds_bucket[24h])) by (le, tenant)) > 14400 + for: 1h + labels: + severity: warning + category: adoption + annotations: + summary: "Time to first verified release is high for tenant {{ $labels.tenant }}" + description: "P90 time to first verified release is {{ $value | humanizeDuration }} (threshold: 4 hours)" + runbook_url: "https://docs.stella-ops.org/runbooks/adoption-onboarding" + + - alert: StellaTimeToFirstReleaseCritical + expr: | + histogram_quantile(0.90, sum(rate(stella_time_to_first_verified_release_seconds_bucket[24h])) by (le, tenant)) > 86400 + for: 1h + labels: + severity: critical + category: adoption + annotations: + summary: "Time to first verified release critically high for tenant {{ $labels.tenant }}" + description: "P90 time to first verified release is {{ $value | humanizeDuration }} (threshold: 24 hours)" + runbook_url: "https://docs.stella-ops.org/runbooks/adoption-onboarding" + + # P0M-002: Why Blocked Latency + - alert: StellaWhyBlockedLatencyHigh + expr: | + histogram_quantile(0.90, sum(rate(stella_why_blocked_latency_seconds_bucket[1h])) by (le, tenant)) > 300 + for: 30m + labels: + severity: warning + category: usability + annotations: + summary: "Why-blocked latency is high for tenant {{ $labels.tenant }}" + description: "P90 time to answer 'why blocked' is {{ $value | humanizeDuration }} (threshold: 5 minutes)" + runbook_url: "https://docs.stella-ops.org/runbooks/usability-explain" + + - alert: StellaWhyBlockedLatencyCritical + expr: | + histogram_quantile(0.90, sum(rate(stella_why_blocked_latency_seconds_bucket[1h])) by (le, tenant)) > 3600 + for: 30m + labels: + severity: critical + category: usability + annotations: + summary: "Why-blocked latency critically high for tenant {{ $labels.tenant }}" + description: "P90 time to answer 'why blocked' is {{ $value | humanizeDuration }} (threshold: 1 hour)" + runbook_url: "https://docs.stella-ops.org/runbooks/usability-explain" + + # P0M-003: Support Burden + - alert: StellaSupportBurdenHigh + expr: | + sum by (tenant, month) (stella_support_burden_minutes_total) > 30 + for: 0m + labels: + severity: 
warning + category: operations + annotations: + summary: "Support burden high for tenant {{ $labels.tenant }}" + description: "Support time for {{ $labels.tenant }} in {{ $labels.month }} is {{ $value }} minutes (threshold: 30 minutes)" + runbook_url: "https://docs.stella-ops.org/runbooks/support-optimization" + + - alert: StellaSupportBurdenCritical + expr: | + sum by (tenant, month) (stella_support_burden_minutes_total) > 60 + for: 0m + labels: + severity: critical + category: operations + annotations: + summary: "Support burden critically high for tenant {{ $labels.tenant }}" + description: "Support time for {{ $labels.tenant }} in {{ $labels.month }} is {{ $value }} minutes (threshold: 60 minutes)" + runbook_url: "https://docs.stella-ops.org/runbooks/support-optimization" + + # P0M-004: Determinism Regressions + - alert: StellaDeterminismRegression + expr: | + increase(stella_determinism_regressions_total{severity="policy"}[5m]) > 0 + for: 0m + labels: + severity: critical + category: reliability + annotations: + summary: "Policy-level determinism regression detected for tenant {{ $labels.tenant }}" + description: "Determinism failure in {{ $labels.component }} component - same inputs produced different policy decisions" + runbook_url: "https://docs.stella-ops.org/runbooks/determinism-failure" + + - alert: StellaDeterminismRegressionSemantic + expr: | + increase(stella_determinism_regressions_total{severity="semantic"}[1h]) > 0 + for: 0m + labels: + severity: warning + category: reliability + annotations: + summary: "Semantic determinism regression detected for tenant {{ $labels.tenant }}" + description: "Semantic-level determinism failure in {{ $labels.component }} - outputs differ but policy decision unchanged" + runbook_url: "https://docs.stella-ops.org/runbooks/determinism-failure" + + - alert: StellaDeterminismRegressionBitwise + expr: | + increase(stella_determinism_regressions_total{severity="bitwise"}[24h]) > 5 + for: 0m + labels: + severity: warning + category: reliability + annotations: + summary: "Multiple bitwise determinism regressions for tenant {{ $labels.tenant }}" + description: "{{ $value }} bitwise-level determinism failures in {{ $labels.component }} in last 24h" + runbook_url: "https://docs.stella-ops.org/runbooks/determinism-failure" diff --git a/deploy/telemetry/alerts/triage-alerts.yaml b/deploy/telemetry/alerts/triage-alerts.yaml new file mode 100644 index 000000000..6507fb912 --- /dev/null +++ b/deploy/telemetry/alerts/triage-alerts.yaml @@ -0,0 +1,62 @@ +groups: + - name: triage-ttfs + rules: + - alert: TriageTtfsFirstEvidenceP95High + expr: histogram_quantile(0.95, sum(rate(stellaops_ttfs_first_evidence_seconds_bucket[5m])) by (le)) > 1.5 + for: 10m + labels: + severity: critical + service: triage + annotations: + summary: "TTFS first evidence p95 high" + description: "TTFS first-evidence p95 exceeds 1.5s for 10m (triage experience degraded)." + + - alert: TriageTtfsSkeletonP95High + expr: histogram_quantile(0.95, sum(rate(stellaops_ttfs_skeleton_seconds_bucket[5m])) by (le)) > 0.2 + for: 10m + labels: + severity: warning + service: triage + annotations: + summary: "TTFS skeleton p95 high" + description: "TTFS skeleton p95 exceeds 200ms for 10m." 
+ + - alert: TriageTtfsFullEvidenceP95High + expr: histogram_quantile(0.95, sum(rate(stellaops_ttfs_full_evidence_seconds_bucket[5m])) by (le)) > 1.5 + for: 10m + labels: + severity: warning + service: triage + annotations: + summary: "TTFS full evidence p95 high" + description: "TTFS full-evidence p95 exceeds 1.5s for 10m." + + - alert: TriageClicksToClosureMedianHigh + expr: histogram_quantile(0.50, sum(rate(stellaops_clicks_to_closure_bucket[5m])) by (le)) > 6 + for: 15m + labels: + severity: warning + service: triage + annotations: + summary: "Clicks-to-closure median high" + description: "Median clicks-to-closure exceeds 6 for 15m." + + - alert: TriageEvidenceCompletenessAvgLow + expr: (sum(rate(stellaops_evidence_completeness_score_sum[15m])) / clamp_min(sum(rate(stellaops_evidence_completeness_score_count[15m])), 1)) < 3.6 + for: 30m + labels: + severity: warning + service: triage + annotations: + summary: "Evidence completeness below target" + description: "Average evidence completeness score below 3.6 (90%) for 30m." + + - alert: TriageBudgetViolationRateHigh + expr: sum(rate(stellaops_performance_budget_violations_total[5m])) by (phase) > 0.05 + for: 10m + labels: + severity: warning + service: triage + annotations: + summary: "Performance budget violations elevated" + description: "Performance budget violation rate exceeds 0.05/s for 10m." diff --git a/deploy/telemetry/collectors/otel-collector-config.yaml b/deploy/telemetry/collectors/otel-collector-config.yaml new file mode 100644 index 000000000..0f96bc69c --- /dev/null +++ b/deploy/telemetry/collectors/otel-collector-config.yaml @@ -0,0 +1,92 @@ +receivers: + otlp: + protocols: + grpc: + endpoint: 0.0.0.0:4317 + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + require_client_certificate: ${STELLAOPS_OTEL_REQUIRE_CLIENT_CERT:true} + http: + endpoint: 0.0.0.0:4318 + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + require_client_certificate: ${STELLAOPS_OTEL_REQUIRE_CLIENT_CERT:true} + +processors: + attributes/tenant-tag: + actions: + - key: tenant.id + action: insert + value: ${STELLAOPS_TENANT_ID:unknown} + batch: + send_batch_size: 1024 + timeout: 5s + +exporters: + logging: + verbosity: normal + prometheus: + endpoint: ${STELLAOPS_OTEL_PROMETHEUS_ENDPOINT:0.0.0.0:9464} + enable_open_metrics: true + metric_expiration: 5m + tls: + cert_file: ${STELLAOPS_OTEL_TLS_CERT:?STELLAOPS_OTEL_TLS_CERT not set} + key_file: ${STELLAOPS_OTEL_TLS_KEY:?STELLAOPS_OTEL_TLS_KEY not set} + client_ca_file: ${STELLAOPS_OTEL_TLS_CA:?STELLAOPS_OTEL_TLS_CA not set} + otlphttp/tempo: + endpoint: ${STELLAOPS_TEMPO_ENDPOINT:https://stellaops-tempo:3200} + compression: gzip + tls: + ca_file: ${STELLAOPS_TEMPO_TLS_CA_FILE:/etc/otel-collector/tls/ca.crt} + cert_file: ${STELLAOPS_TEMPO_TLS_CERT_FILE:/etc/otel-collector/tls/client.crt} + key_file: ${STELLAOPS_TEMPO_TLS_KEY_FILE:/etc/otel-collector/tls/client.key} + insecure_skip_verify: false + headers: + "X-Scope-OrgID": ${STELLAOPS_TENANT_ID:unknown} + loki/tenant: + endpoint: ${STELLAOPS_LOKI_ENDPOINT:https://stellaops-loki:3100/loki/api/v1/push} + tenant_id: ${STELLAOPS_TENANT_ID:unknown} + tls: + ca_file: 
${STELLAOPS_LOKI_TLS_CA_FILE:/etc/otel-collector/tls/ca.crt} + cert_file: ${STELLAOPS_LOKI_TLS_CERT_FILE:/etc/otel-collector/tls/client.crt} + key_file: ${STELLAOPS_LOKI_TLS_KEY_FILE:/etc/otel-collector/tls/client.key} + insecure_skip_verify: false + default_labels_enabled: + exporter: false + job: false + instance: false + format: json + drain_interval: 5s + queue: + enabled: true + queue_size: 1024 + retry_on_failure: true + +extensions: + health_check: + endpoint: ${STELLAOPS_OTEL_HEALTH_ENDPOINT:0.0.0.0:13133} + pprof: + endpoint: ${STELLAOPS_OTEL_PPROF_ENDPOINT:0.0.0.0:1777} + +service: + telemetry: + logs: + level: ${STELLAOPS_OTEL_LOG_LEVEL:info} + extensions: [health_check, pprof] + pipelines: + traces: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging, otlphttp/tempo] + metrics: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging, prometheus] + logs: + receivers: [otlp] + processors: [attributes/tenant-tag, batch] + exporters: [logging, loki/tenant] diff --git a/deploy/telemetry/dashboards/export-center.json b/deploy/telemetry/dashboards/export-center.json new file mode 100644 index 000000000..0ba6d42cc --- /dev/null +++ b/deploy/telemetry/dashboards/export-center.json @@ -0,0 +1,638 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": { "type": "grafana", "uid": "-- Grafana --" }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "description": "ExportCenter service observability dashboard", + "editable": true, + "fiscalYearStartMonth": 0, + "graphTooltip": 0, + "id": null, + "links": [], + "liveNow": false, + "panels": [ + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "panels": [], + "title": "Export Runs Overview", + "type": "row" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 0, "y": 1 }, + "id": 2, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "10.0.0", + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum(increase(export_runs_total{tenant=~\"$tenant\"}[$__range]))", + "legendFormat": "Total Runs", + "range": true, + "refId": "A" + } + ], + "title": "Total Export Runs", + "type": "stat" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 4, "y": 1 }, + "id": 3, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "10.0.0", + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + 
"editorMode": "code", + "expr": "sum(increase(export_runs_success_total{tenant=~\"$tenant\"}[$__range]))", + "legendFormat": "Successful", + "range": true, + "refId": "A" + } + ], + "title": "Successful Runs", + "type": "stat" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 1 }, + { "color": "red", "value": 5 } + ] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 8, "y": 1 }, + "id": 4, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "10.0.0", + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum(increase(export_runs_failed_total{tenant=~\"$tenant\"}[$__range]))", + "legendFormat": "Failed", + "range": true, + "refId": "A" + } + ], + "title": "Failed Runs", + "type": "stat" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 95 }, + { "color": "green", "value": 99 } + ] + }, + "unit": "percent" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 12, "y": 1 }, + "id": 5, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "10.0.0", + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "100 * sum(increase(export_runs_success_total{tenant=~\"$tenant\"}[$__range])) / sum(increase(export_runs_total{tenant=~\"$tenant\"}[$__range]))", + "legendFormat": "Success Rate", + "range": true, + "refId": "A" + } + ], + "title": "Success Rate", + "type": "stat" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 16, "y": 1 }, + "id": 6, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "10.0.0", + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum(export_runs_in_progress{tenant=~\"$tenant\"})", + "legendFormat": "In Progress", + "range": true, + "refId": "A" + } + ], + "title": "Runs In Progress", + "type": "stat" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + 
"fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 5 }, + "id": 7, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom", "showLegend": true }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum by (export_type) (rate(export_runs_total{tenant=~\"$tenant\"}[5m]))", + "legendFormat": "{{export_type}}", + "range": true, + "refId": "A" + } + ], + "title": "Export Runs by Type (rate/5m)", + "type": "timeseries" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 5 }, + "id": 8, + "options": { + "legend": { "calcs": ["mean", "max", "p95"], "displayMode": "table", "placement": "bottom", "showLegend": true }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "histogram_quantile(0.50, sum by (le) (rate(export_run_duration_seconds_bucket{tenant=~\"$tenant\"}[5m])))", + "legendFormat": "p50", + "range": true, + "refId": "A" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "histogram_quantile(0.95, sum by (le) (rate(export_run_duration_seconds_bucket{tenant=~\"$tenant\"}[5m])))", + "legendFormat": "p95", + "range": true, + "refId": "B" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "histogram_quantile(0.99, sum by (le) (rate(export_run_duration_seconds_bucket{tenant=~\"$tenant\"}[5m])))", + "legendFormat": "p99", + "range": true, + "refId": "C" + } + ], + "title": "Export Run Duration (latency percentiles)", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 13 }, + "id": 9, + "panels": [], + "title": "Artifacts & Bundle Sizes", + "type": "row" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, 
+ "drawStyle": "bars", + "fillOpacity": 50, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "normal" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 14 }, + "id": 10, + "options": { + "legend": { "calcs": ["sum"], "displayMode": "table", "placement": "bottom", "showLegend": true }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum by (artifact_type) (increase(export_artifacts_total{tenant=~\"$tenant\"}[1h]))", + "legendFormat": "{{artifact_type}}", + "range": true, + "refId": "A" + } + ], + "title": "Artifacts Exported by Type (per hour)", + "type": "timeseries" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "bytes" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 14 }, + "id": 11, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom", "showLegend": true }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "histogram_quantile(0.50, sum by (le, export_type) (rate(export_bundle_size_bytes_bucket{tenant=~\"$tenant\"}[5m])))", + "legendFormat": "{{export_type}} p50", + "range": true, + "refId": "A" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "histogram_quantile(0.95, sum by (le, export_type) (rate(export_bundle_size_bytes_bucket{tenant=~\"$tenant\"}[5m])))", + "legendFormat": "{{export_type}} p95", + "range": true, + "refId": "B" + } + ], + "title": "Bundle Size Distribution by Type", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 22 }, + "id": 12, + "panels": [], + "title": "Error Analysis", + "type": "row" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "hideFrom": { "legend": false, "tooltip": false, "viz": false } + }, + "mappings": [], + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 8, "x": 0, "y": 23 }, + "id": 13, + "options": { + "legend": { "displayMode": "table", "placement": "right", "showLegend": true }, + "pieType": "pie", + 
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "tooltip": { "mode": "single", "sort": "none" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum by (error_code) (increase(export_runs_failed_total{tenant=~\"$tenant\"}[$__range]))", + "legendFormat": "{{error_code}}", + "range": true, + "refId": "A" + } + ], + "title": "Failures by Error Code", + "type": "piechart" + }, + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "line" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "red", "value": 0.01 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 16, "x": 8, "y": 23 }, + "id": 14, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom", "showLegend": true }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "targets": [ + { + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "editorMode": "code", + "expr": "sum(rate(export_runs_failed_total{tenant=~\"$tenant\"}[5m])) / sum(rate(export_runs_total{tenant=~\"$tenant\"}[5m]))", + "legendFormat": "Error Rate", + "range": true, + "refId": "A" + } + ], + "title": "Error Rate (5m window)", + "type": "timeseries" + } + ], + "refresh": "30s", + "schemaVersion": 38, + "style": "dark", + "tags": ["export-center", "stellaops"], + "templating": { + "list": [ + { + "current": {}, + "hide": 0, + "includeAll": false, + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "allValue": ".*", + "current": {}, + "datasource": { "type": "prometheus", "uid": "${datasource}" }, + "definition": "label_values(export_runs_total, tenant)", + "hide": 0, + "includeAll": true, + "multi": true, + "name": "tenant", + "options": [], + "query": { "query": "label_values(export_runs_total, tenant)", "refId": "StandardVariableQuery" }, + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "type": "query" + } + ] + }, + "time": { "from": "now-6h", "to": "now" }, + "timepicker": {}, + "timezone": "utc", + "title": "ExportCenter Service", + "uid": "export-center-overview", + "version": 1, + "weekStart": "" +} diff --git a/deploy/telemetry/dashboards/stella-ops-error-tracking.json b/deploy/telemetry/dashboards/stella-ops-error-tracking.json new file mode 100644 index 000000000..c4c0e51c0 --- /dev/null +++ b/deploy/telemetry/dashboards/stella-ops-error-tracking.json @@ -0,0 +1,536 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + }, + { + "datasource": "${datasource}", + "enable": true, + 
"expr": "increase(stella_error_total[1m]) > 0", + "iconColor": "red", + "name": "Error Spikes", + "tagKeys": "error_type", + "titleFormat": "Error: {{error_type}}" + } + ] + }, + "description": "Stella Ops Release Orchestrator - Error Tracking", + "editable": true, + "gnetId": null, + "graphTooltip": 1, + "id": null, + "iteration": 1737158400000, + "links": [], + "panels": [ + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "panels": [], + "title": "Error Summary", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 1 }, + { "color": "red", "value": 10 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 0, "y": 1 }, + "id": 2, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["sum"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_error_total[1h]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Errors (1h)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 0.01 }, + { "color": "red", "value": 0.05 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 6, "y": 1 }, + "id": 3, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_error_total[5m])) / sum(rate(stella_api_requests_total[5m]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Error Rate", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 1 }, + { "color": "red", "value": 5 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 12, "y": 1 }, + "id": 4, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["sum"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_release_failed_total[1h]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Failed Releases (1h)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 1 }, + { "color": "red", "value": 3 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 18, "y": 1 }, + "id": 5, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["sum"], "fields": "", "values": false 
}, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_gate_failed_total[1h]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Gate Failures (1h)", + "type": "stat" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 5 }, + "id": 6, + "panels": [], + "title": "Error Trends", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 20, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "normal" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 6 }, + "id": 7, + "options": { + "legend": { "calcs": ["sum"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_error_total[5m])) by (error_type)", + "legendFormat": "{{error_type}}", + "refId": "A" + } + ], + "title": "Errors by Type", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 20, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "normal" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 6 }, + "id": 8, + "options": { + "legend": { "calcs": ["sum"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_error_total{environment=~\"$environment\"}[5m])) by (component)", + "legendFormat": "{{component}}", + "refId": "A" + } + ], + "title": "Errors by Component", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 14 }, + "id": 9, + "panels": [], + "title": "Release Failures", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "fillOpacity": 80, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineWidth": 1, + "scaleDistribution": { "type": "linear" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": 
"absolute", + "steps": [{ "color": "green", "value": null }] + } + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 15 }, + "id": 10, + "options": { + "barRadius": 0.1, + "barWidth": 0.8, + "groupWidth": 0.7, + "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, + "orientation": "horizontal", + "showValue": "auto", + "stacking": "none", + "tooltip": { "mode": "single", "sort": "none" }, + "xTickLabelRotation": 0, + "xTickLabelSpacing": 0 + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "topk(10, sum(increase(stella_release_failed_total[24h])) by (failure_reason))", + "format": "table", + "instant": true, + "legendFormat": "{{failure_reason}}", + "refId": "A" + } + ], + "title": "Top Failure Reasons (24h)", + "transformations": [ + { + "id": "organize", + "options": { + "excludeByName": { "Time": true }, + "indexByName": {}, + "renameByName": { "Value": "Count", "failure_reason": "Reason" } + } + } + ], + "type": "barchart" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "bars", + "fillOpacity": 80, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "normal" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [ + { + "matcher": { "id": "byName", "options": "Failures" }, + "properties": [{ "id": "color", "value": { "fixedColor": "red", "mode": "fixed" } }] + }, + { + "matcher": { "id": "byName", "options": "Rollbacks" }, + "properties": [{ "id": "color", "value": { "fixedColor": "orange", "mode": "fixed" } }] + } + ] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 15 }, + "id": 11, + "options": { + "legend": { "calcs": ["sum"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_release_failed_total{environment=~\"$environment\"}[1h])) by (environment)", + "legendFormat": "{{environment}} Failures", + "refId": "A" + }, + { + "expr": "sum(increase(stella_rollback_total{environment=~\"$environment\"}[1h])) by (environment)", + "legendFormat": "{{environment}} Rollbacks", + "refId": "B" + } + ], + "title": "Failures & Rollbacks by Environment", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 23 }, + "id": 12, + "panels": [], + "title": "Recent Errors", + "type": "row" + }, + { + "datasource": "${loki_datasource}", + "fieldConfig": { + "defaults": {}, + "overrides": [] + }, + "gridPos": { "h": 10, "w": 24, "x": 0, "y": 24 }, + "id": 13, + "options": { + "dedupStrategy": "none", + "enableLogDetails": true, + "prettifyLogMessage": false, + "showCommonLabels": false, + "showLabels": true, + "showTime": true, + "sortOrder": "Descending", + "wrapLogMessage": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "{app=\"stella-ops\"} |= \"error\" | json | level=~\"error|fatal\"", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Error Logs", + 
"type": "logs" + } + ], + "refresh": "30s", + "schemaVersion": 36, + "style": "dark", + "tags": ["stella-ops", "errors"], + "templating": { + "list": [ + { + "current": { "selected": false, "text": "Prometheus", "value": "Prometheus" }, + "hide": 0, + "includeAll": false, + "label": "Metrics", + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "current": { "selected": false, "text": "Loki", "value": "Loki" }, + "hide": 0, + "includeAll": false, + "label": "Logs", + "multi": false, + "name": "loki_datasource", + "options": [], + "query": "loki", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "allValue": ".*", + "current": { "selected": true, "text": "All", "value": "$__all" }, + "datasource": "${datasource}", + "definition": "label_values(stella_error_total, environment)", + "hide": 0, + "includeAll": true, + "label": "Environment", + "multi": true, + "name": "environment", + "options": [], + "query": { "query": "label_values(stella_error_total, environment)", "refId": "StandardVariableQuery" }, + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "type": "query" + } + ] + }, + "time": { "from": "now-6h", "to": "now" }, + "timepicker": {}, + "timezone": "", + "title": "Stella Ops - Error Tracking", + "uid": "stella-ops-errors", + "version": 1, + "weekStart": "" +} diff --git a/deploy/telemetry/dashboards/stella-ops-performance.json b/deploy/telemetry/dashboards/stella-ops-performance.json new file mode 100644 index 000000000..ad32a50b4 --- /dev/null +++ b/deploy/telemetry/dashboards/stella-ops-performance.json @@ -0,0 +1,607 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "description": "Stella Ops Release Orchestrator - Performance Metrics", + "editable": true, + "gnetId": null, + "graphTooltip": 1, + "id": null, + "iteration": 1737158400000, + "links": [], + "panels": [ + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "panels": [], + "title": "System Performance", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 0.7 }, + { "color": "red", "value": 0.9 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 0, "y": 1 }, + "id": 2, + "options": { + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "avg(stella_cpu_usage_ratio{component=\"orchestrator\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "CPU Usage", + "type": "gauge" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 0.7 }, + { "color": "red", "value": 0.9 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 4, 
"w": 6, "x": 6, "y": 1 }, + "id": 3, + "options": { + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "avg(stella_memory_usage_ratio{component=\"orchestrator\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Memory Usage", + "type": "gauge" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 100 }, + { "color": "red", "value": 500 } + ] + }, + "unit": "ms" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 12, "y": 1 }, + "id": 4, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["mean"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.95, sum(rate(stella_api_request_duration_seconds_bucket[5m])) by (le)) * 1000", + "legendFormat": "", + "refId": "A" + } + ], + "title": "API Latency (p95)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + }, + "unit": "reqps" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 6, "x": 18, "y": 1 }, + "id": 5, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_api_requests_total[5m]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Request Rate", + "type": "stat" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 5 }, + "id": 6, + "panels": [], + "title": "Gate Evaluation Performance", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 6 }, + "id": 7, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.99, sum(rate(stella_gate_evaluation_duration_seconds_bucket{gate_type=~\"$gate_type\"}[5m])) by (le, gate_type))", + "legendFormat": "{{gate_type}} p99", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.50, 
sum(rate(stella_gate_evaluation_duration_seconds_bucket{gate_type=~\"$gate_type\"}[5m])) by (le, gate_type))", + "legendFormat": "{{gate_type}} p50", + "refId": "B" + } + ], + "title": "Gate Evaluation Duration by Type", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 6 }, + "id": 8, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_gate_evaluations_total{gate_type=~\"$gate_type\"}[5m])) by (gate_type)", + "legendFormat": "{{gate_type}}", + "refId": "A" + } + ], + "title": "Gate Evaluations per Second", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 14 }, + "id": 9, + "panels": [], + "title": "Cache Performance", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 0.7 }, + { "color": "green", "value": 0.9 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 6, "x": 0, "y": 15 }, + "id": 10, + "options": { + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(stella_cache_hits_total) / (sum(stella_cache_hits_total) + sum(stella_cache_misses_total))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Cache Hit Ratio", + "type": "gauge" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [ + { + "matcher": { "id": "byName", "options": "Hits" }, + "properties": [{ "id": "color", "value": { "fixedColor": "green", "mode": "fixed" } }] + }, + { + "matcher": { "id": "byName", "options": 
"Misses" }, + "properties": [{ "id": "color", "value": { "fixedColor": "red", "mode": "fixed" } }] + } + ] + }, + "gridPos": { "h": 6, "w": 12, "x": 6, "y": 15 }, + "id": 11, + "options": { + "legend": { "calcs": ["sum"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_cache_hits_total[5m])) by (cache_name)", + "legendFormat": "{{cache_name}} Hits", + "refId": "A" + }, + { + "expr": "sum(rate(stella_cache_misses_total[5m])) by (cache_name)", + "legendFormat": "{{cache_name}} Misses", + "refId": "B" + } + ], + "title": "Cache Hits vs Misses", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 0.7 }, + { "color": "red", "value": 0.9 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 6, "x": 18, "y": 15 }, + "id": 12, + "options": { + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "stella_cache_size_bytes / stella_cache_max_size_bytes", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Cache Utilization", + "type": "gauge" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 21 }, + "id": 13, + "panels": [], + "title": "Database Performance", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "ms" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 22 }, + "id": 14, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.95, sum(rate(stella_db_query_duration_seconds_bucket[5m])) by (le, query_type)) * 1000", + "legendFormat": "{{query_type}} p95", + "refId": "A" + } + ], + "title": "Database Query Duration (p95)", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": 
false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 22 }, + "id": 15, + "options": { + "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "stella_db_connections_active", + "legendFormat": "Active", + "refId": "A" + }, + { + "expr": "stella_db_connections_idle", + "legendFormat": "Idle", + "refId": "B" + }, + { + "expr": "stella_db_connections_max", + "legendFormat": "Max", + "refId": "C" + } + ], + "title": "Database Connection Pool", + "type": "timeseries" + } + ], + "refresh": "30s", + "schemaVersion": 36, + "style": "dark", + "tags": ["stella-ops", "performance"], + "templating": { + "list": [ + { + "current": { "selected": false, "text": "Prometheus", "value": "Prometheus" }, + "hide": 0, + "includeAll": false, + "label": "Data Source", + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "allValue": ".*", + "current": { "selected": true, "text": "All", "value": "$__all" }, + "datasource": "${datasource}", + "definition": "label_values(stella_gate_evaluation_duration_seconds_bucket, gate_type)", + "hide": 0, + "includeAll": true, + "label": "Gate Type", + "multi": true, + "name": "gate_type", + "options": [], + "query": { "query": "label_values(stella_gate_evaluation_duration_seconds_bucket, gate_type)", "refId": "StandardVariableQuery" }, + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "type": "query" + } + ] + }, + "time": { "from": "now-6h", "to": "now" }, + "timepicker": {}, + "timezone": "", + "title": "Stella Ops - Performance Metrics", + "uid": "stella-ops-performance", + "version": 1, + "weekStart": "" +} diff --git a/deploy/telemetry/dashboards/stella-ops-release-overview.json b/deploy/telemetry/dashboards/stella-ops-release-overview.json new file mode 100644 index 000000000..8a09b8491 --- /dev/null +++ b/deploy/telemetry/dashboards/stella-ops-release-overview.json @@ -0,0 +1,566 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + }, + { + "datasource": "${datasource}", + "enable": true, + "expr": "stella_release_promotion_completed{environment=~\"$environment\"}", + "iconColor": "green", + "name": "Promotions", + "tagKeys": "version,environment", + "titleFormat": "Promotion to {{environment}}" + } + ] + }, + "description": "Stella Ops Release Orchestrator - Release Overview", + "editable": true, + "gnetId": null, + "graphTooltip": 1, + "id": null, + "iteration": 1737158400000, + "links": [], + "panels": [ + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "panels": [], + "title": "Release Summary", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 0, "y": 1 }, + "id": 2, + 
"options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "count(stella_release_active{environment=~\"$environment\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Active Releases", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 5 }, + { "color": "red", "value": 10 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 4, "y": 1 }, + "id": 3, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "count(stella_release_pending_approval{environment=~\"$environment\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Pending Approvals", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 8, "y": 1 }, + "id": 4, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(stella_release_success_total{environment=~\"$environment\"}) / sum(stella_release_total{environment=~\"$environment\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Success Rate (24h)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 900 }, + { "color": "red", "value": 1800 } + ] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 12, "y": 1 }, + "id": 5, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["mean"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.50, sum(rate(stella_release_duration_seconds_bucket{environment=~\"$environment\"}[24h])) by (le))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Median Release Time", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "green", "value": 1 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 16, "y": 1 }, + "id": 6, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + 
"fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(stella_gate_passed_total{environment=~\"$environment\"}) / sum(stella_gate_evaluated_total{environment=~\"$environment\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Gate Pass Rate", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "red", "value": 1 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 4, "w": 4, "x": 20, "y": 1 }, + "id": 7, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(stella_rollback_total{environment=~\"$environment\"})", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Rollbacks (24h)", + "type": "stat" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 5 }, + "id": 8, + "panels": [], + "title": "Release Activity", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 6 }, + "id": 9, + "options": { + "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(rate(stella_release_total{environment=~\"$environment\"}[5m])) by (environment)", + "legendFormat": "{{environment}}", + "refId": "A" + } + ], + "title": "Releases per Minute", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "bars", + "fillOpacity": 80, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "normal" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "short" + }, + "overrides": [ + { + "matcher": { "id": "byName", "options": "Success" }, + "properties": [{ "id": "color", "value": { "fixedColor": "green", "mode": "fixed" } }] + }, + { + "matcher": { "id": "byName", 
"options": "Failed" }, + "properties": [{ "id": "color", "value": { "fixedColor": "red", "mode": "fixed" } }] + } + ] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 6 }, + "id": 10, + "options": { + "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_release_success_total{environment=~\"$environment\"}[1h]))", + "legendFormat": "Success", + "refId": "A" + }, + { + "expr": "sum(increase(stella_release_failed_total{environment=~\"$environment\"}[1h]))", + "legendFormat": "Failed", + "refId": "B" + } + ], + "title": "Release Outcomes (Hourly)", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 14 }, + "id": 11, + "panels": [], + "title": "Environment Health", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [ + { "options": { "0": { "color": "red", "index": 0, "text": "Down" } }, "type": "value" }, + { "options": { "1": { "color": "green", "index": 1, "text": "Up" } }, "type": "value" } + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "green", "value": 1 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 8, "x": 0, "y": 15 }, + "id": 12, + "options": { + "colorMode": "background", + "graphMode": "none", + "justifyMode": "center", + "orientation": "horizontal", + "reduceOptions": { + "calcs": ["lastNotNull"], + "fields": "", + "values": false + }, + "textMode": "value_and_name" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "stella_environment_health{environment=~\"$environment\"}", + "legendFormat": "{{environment}}", + "refId": "A" + } + ], + "title": "Environment Status", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "off" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [{ "color": "green", "value": null }] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 16, "x": 8, "y": 15 }, + "id": 13, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "right" }, + "tooltip": { "mode": "multi", "sort": "desc" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.95, sum(rate(stella_release_duration_seconds_bucket{environment=~\"$environment\"}[5m])) by (le, environment))", + "legendFormat": "{{environment}} p95", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.50, sum(rate(stella_release_duration_seconds_bucket{environment=~\"$environment\"}[5m])) by (le, environment))", + "legendFormat": "{{environment}} p50", + "refId": "B" + } + ], + "title": "Release Duration by Environment", + "type": "timeseries" + } + ], + "refresh": "30s", + "schemaVersion": 36, + "style": "dark", + "tags": ["stella-ops", 
"releases"], + "templating": { + "list": [ + { + "current": { "selected": false, "text": "Prometheus", "value": "Prometheus" }, + "hide": 0, + "includeAll": false, + "label": "Data Source", + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "allValue": ".*", + "current": { "selected": true, "text": "All", "value": "$__all" }, + "datasource": "${datasource}", + "definition": "label_values(stella_release_total, environment)", + "hide": 0, + "includeAll": true, + "label": "Environment", + "multi": true, + "name": "environment", + "options": [], + "query": { "query": "label_values(stella_release_total, environment)", "refId": "StandardVariableQuery" }, + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "type": "query" + } + ] + }, + "time": { "from": "now-24h", "to": "now" }, + "timepicker": {}, + "timezone": "", + "title": "Stella Ops - Release Overview", + "uid": "stella-ops-releases", + "version": 1, + "weekStart": "" +} diff --git a/deploy/telemetry/dashboards/stella-ops-sla-monitoring.json b/deploy/telemetry/dashboards/stella-ops-sla-monitoring.json new file mode 100644 index 000000000..644f16e32 --- /dev/null +++ b/deploy/telemetry/dashboards/stella-ops-sla-monitoring.json @@ -0,0 +1,541 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + }, + { + "datasource": "${datasource}", + "enable": true, + "expr": "changes(stella_sla_breach_total[1m]) > 0", + "iconColor": "red", + "name": "SLA Breaches", + "tagKeys": "sla_name", + "titleFormat": "SLA Breach: {{sla_name}}" + } + ] + }, + "description": "Stella Ops Release Orchestrator - SLA Monitoring", + "editable": true, + "gnetId": null, + "graphTooltip": 1, + "id": null, + "iteration": 1737158400000, + "links": [], + "panels": [ + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "panels": [], + "title": "SLA Overview", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 0.99 }, + { "color": "green", "value": 0.999 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 5, "w": 6, "x": 0, "y": 1 }, + "id": 2, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "1 - (sum(increase(stella_release_failed_total[30d])) / sum(increase(stella_release_total[30d])))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Release Success Rate (30d SLA)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 0.99 }, + { "color": "green", "value": 0.999 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 5, "w": 6, "x": 6, "y": 1 }, + "id": 3, + "options": { + "colorMode": "value", + 
"graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "avg_over_time(stella_api_availability[30d])", + "legendFormat": "", + "refId": "A" + } + ], + "title": "API Availability (30d SLA)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "yellow", "value": 300 }, + { "color": "red", "value": 600 } + ] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 5, "w": 6, "x": 12, "y": 1 }, + "id": 4, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["mean"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.95, sum(rate(stella_release_duration_seconds_bucket[30d])) by (le))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Release Time p95 (Target: <10m)", + "type": "stat" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "red", "value": 1 } + ] + } + }, + "overrides": [] + }, + "gridPos": { "h": 5, "w": 6, "x": 18, "y": 1 }, + "id": 5, + "options": { + "colorMode": "value", + "graphMode": "none", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { "calcs": ["sum"], "fields": "", "values": false }, + "textMode": "auto" + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "sum(increase(stella_sla_breach_total[30d]))", + "legendFormat": "", + "refId": "A" + } + ], + "title": "SLA Breaches (30d)", + "type": "stat" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 6 }, + "id": 6, + "panels": [], + "title": "Error Budget", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "mappings": [], + "max": 100, + "min": 0, + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 20 }, + { "color": "green", "value": 50 } + ] + }, + "unit": "percent" + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 8, "x": 0, "y": 7 }, + "id": 7, + "options": { + "orientation": "auto", + "reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false }, + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "((0.001 * sum(increase(stella_release_total[30d]))) - sum(increase(stella_release_failed_total[30d]))) / (0.001 * sum(increase(stella_release_total[30d]))) * 100", + "legendFormat": "", + "refId": "A" + } + ], + "title": "Error Budget Remaining (99.9% SLA)", + "type": "gauge" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 10, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + 
"lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "line" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "green", "value": null }, + { "color": "red", "value": 0 } + ] + }, + "unit": "short" + }, + "overrides": [] + }, + "gridPos": { "h": 6, "w": 16, "x": 8, "y": 7 }, + "id": 8, + "options": { + "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "(0.001 * sum(increase(stella_release_total[30d]))) - sum(increase(stella_release_failed_total[30d]))", + "legendFormat": "Remaining Budget (failures allowed)", + "refId": "A" + } + ], + "title": "Error Budget Burn Rate", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 13 }, + "id": 9, + "panels": [], + "title": "SLI Trends", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "line+area" } + }, + "mappings": [], + "max": 1, + "min": 0.99, + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "transparent", "value": 0.999 } + ] + }, + "unit": "percentunit" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 0, "y": 14 }, + "id": 10, + "options": { + "legend": { "calcs": ["mean", "min"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": "none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "1 - (sum(rate(stella_release_failed_total[1h])) / sum(rate(stella_release_total[1h])))", + "legendFormat": "Success Rate", + "refId": "A" + } + ], + "title": "Release Success Rate Over Time", + "type": "timeseries" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "palette-classic" }, + "custom": { + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { "legend": false, "tooltip": false, "viz": false }, + "lineInterpolation": "smooth", + "lineWidth": 2, + "pointSize": 5, + "scaleDistribution": { "type": "linear" }, + "showPoints": "never", + "spanNulls": false, + "stacking": { "group": "A", "mode": "none" }, + "thresholdsStyle": { "mode": "line+area" } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "transparent", "value": null }, + { "color": "red", "value": 600 } + ] + }, + "unit": "s" + }, + "overrides": [] + }, + "gridPos": { "h": 8, "w": 12, "x": 12, "y": 14 }, + "id": 11, + "options": { + "legend": { "calcs": ["mean", "max"], "displayMode": "table", "placement": "bottom" }, + "tooltip": { "mode": "multi", "sort": 
"none" } + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "histogram_quantile(0.95, sum(rate(stella_release_duration_seconds_bucket[1h])) by (le))", + "legendFormat": "p95 Duration", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.99, sum(rate(stella_release_duration_seconds_bucket[1h])) by (le))", + "legendFormat": "p99 Duration", + "refId": "B" + } + ], + "title": "Release Duration SLI", + "type": "timeseries" + }, + { + "collapsed": false, + "gridPos": { "h": 1, "w": 24, "x": 0, "y": 22 }, + "id": 12, + "panels": [], + "title": "SLA by Environment", + "type": "row" + }, + { + "datasource": "${datasource}", + "fieldConfig": { + "defaults": { + "color": { "mode": "thresholds" }, + "custom": { + "align": "auto", + "displayMode": "auto", + "inspect": false + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { "color": "red", "value": null }, + { "color": "yellow", "value": 0.99 }, + { "color": "green", "value": 0.999 } + ] + } + }, + "overrides": [ + { + "matcher": { "id": "byName", "options": "Success Rate" }, + "properties": [ + { "id": "unit", "value": "percentunit" }, + { "id": "custom.displayMode", "value": "color-background-solid" } + ] + }, + { + "matcher": { "id": "byName", "options": "Avg Duration" }, + "properties": [{ "id": "unit", "value": "s" }] + } + ] + }, + "gridPos": { "h": 8, "w": 24, "x": 0, "y": 23 }, + "id": 13, + "options": { + "footer": { "fields": "", "reducer": ["sum"], "show": false }, + "showHeader": true, + "sortBy": [] + }, + "pluginVersion": "9.0.0", + "targets": [ + { + "expr": "1 - (sum(increase(stella_release_failed_total[7d])) by (environment) / sum(increase(stella_release_total[7d])) by (environment))", + "format": "table", + "instant": true, + "legendFormat": "", + "refId": "A" + }, + { + "expr": "sum(increase(stella_release_total[7d])) by (environment)", + "format": "table", + "instant": true, + "legendFormat": "", + "refId": "B" + }, + { + "expr": "avg(rate(stella_release_duration_seconds_sum[7d]) / rate(stella_release_duration_seconds_count[7d])) by (environment)", + "format": "table", + "instant": true, + "legendFormat": "", + "refId": "C" + } + ], + "title": "SLA by Environment (7d)", + "transformations": [ + { + "id": "seriesToColumns", + "options": { "byField": "environment" } + }, + { + "id": "organize", + "options": { + "excludeByName": { "Time 1": true, "Time 2": true, "Time 3": true }, + "indexByName": {}, + "renameByName": { + "Value #A": "Success Rate", + "Value #B": "Total Releases", + "Value #C": "Avg Duration", + "environment": "Environment" + } + } + } + ], + "type": "table" + } + ], + "refresh": "5m", + "schemaVersion": 36, + "style": "dark", + "tags": ["stella-ops", "sla"], + "templating": { + "list": [ + { + "current": { "selected": false, "text": "Prometheus", "value": "Prometheus" }, + "hide": 0, + "includeAll": false, + "label": "Data Source", + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + } + ] + }, + "time": { "from": "now-30d", "to": "now" }, + "timepicker": {}, + "timezone": "", + "title": "Stella Ops - SLA Monitoring", + "uid": "stella-ops-sla", + "version": 1, + "weekStart": "" +} diff --git a/deploy/telemetry/storage/loki.yaml b/deploy/telemetry/storage/loki.yaml new file mode 100644 index 000000000..101b4df35 --- /dev/null +++ b/deploy/telemetry/storage/loki.yaml @@ -0,0 +1,48 @@ +auth_enabled: true + +server: + http_listen_port: 3100 + 
log_level: info + +common: + ring: + instance_addr: 127.0.0.1 + kvstore: + store: inmemory + replication_factor: 1 + path_prefix: /var/loki + +schema_config: + configs: + - from: 2024-01-01 + store: boltdb-shipper + object_store: filesystem + schema: v13 + index: + prefix: loki_index_ + period: 24h + +storage_config: + filesystem: + directory: /var/loki/chunks + boltdb_shipper: + active_index_directory: /var/loki/index + cache_location: /var/loki/index_cache + shared_store: filesystem + +ruler: + storage: + type: local + local: + directory: /var/loki/rules + rule_path: /tmp/loki-rules + enable_api: true + +limits_config: + enforce_metric_name: false + reject_old_samples: true + reject_old_samples_max_age: 168h + max_entries_limit_per_query: 5000 + ingestion_rate_mb: 10 + ingestion_burst_size_mb: 20 + per_tenant_override_config: /etc/telemetry/tenants/loki-overrides.yaml diff --git a/deploy/telemetry/storage/prometheus.yaml b/deploy/telemetry/storage/prometheus.yaml new file mode 100644 index 000000000..e1dcfe4c3 --- /dev/null +++ b/deploy/telemetry/storage/prometheus.yaml @@ -0,0 +1,19 @@ +global: + scrape_interval: 15s + evaluation_interval: 30s + +scrape_configs: + - job_name: "stellaops-otel-collector" + scheme: https + metrics_path: / + tls_config: + ca_file: ${PROMETHEUS_TLS_CA_FILE:-/etc/telemetry/tls/ca.crt} + cert_file: ${PROMETHEUS_TLS_CERT_FILE:-/etc/telemetry/tls/client.crt} + key_file: ${PROMETHEUS_TLS_KEY_FILE:-/etc/telemetry/tls/client.key} + insecure_skip_verify: false + authorization: + type: Bearer + credentials_file: ${PROMETHEUS_BEARER_TOKEN_FILE:-/etc/telemetry/auth/token} + static_configs: + - targets: + - ${PROMETHEUS_COLLECTOR_TARGET:-stellaops-otel-collector:9464} diff --git a/deploy/telemetry/storage/tempo.yaml b/deploy/telemetry/storage/tempo.yaml new file mode 100644 index 000000000..976e517bd --- /dev/null +++ b/deploy/telemetry/storage/tempo.yaml @@ -0,0 +1,56 @@ +multitenancy_enabled: true +usage_report: + reporting_enabled: false + +server: + http_listen_port: 3200 + log_level: info + +distributor: + receivers: + otlp: + protocols: + grpc: + tls: + cert_file: ${TEMPO_TLS_CERT_FILE:-/etc/telemetry/tls/server.crt} + key_file: ${TEMPO_TLS_KEY_FILE:-/etc/telemetry/tls/server.key} + client_ca_file: ${TEMPO_TLS_CA_FILE:-/etc/telemetry/tls/ca.crt} + require_client_cert: true + http: + tls: + cert_file: ${TEMPO_TLS_CERT_FILE:-/etc/telemetry/tls/server.crt} + key_file: ${TEMPO_TLS_KEY_FILE:-/etc/telemetry/tls/server.key} + client_ca_file: ${TEMPO_TLS_CA_FILE:-/etc/telemetry/tls/ca.crt} + require_client_cert: true + +ingester: + lifecycler: + ring: + instance_availability_zone: ${TEMPO_ZONE:-zone-a} + trace_idle_period: 10s + max_block_bytes: 1_048_576 + +compactor: + compaction: + block_retention: 168h + +metrics_generator: + registry: + external_labels: + cluster: stellaops + +storage: + trace: + backend: local + local: + path: /var/tempo/traces + wal: + path: /var/tempo/wal + metrics: + backend: prometheus + +overrides: + defaults: + ingestion_rate_limit_bytes: 1048576 + max_traces_per_user: 200000 + per_tenant_override_config: /etc/telemetry/tenants/tempo-overrides.yaml diff --git a/deploy/tools/ci/determinism/compare-platform-hashes.py b/deploy/tools/ci/determinism/compare-platform-hashes.py new file mode 100644 index 000000000..41c89adf8 --- /dev/null +++ b/deploy/tools/ci/determinism/compare-platform-hashes.py @@ -0,0 +1,160 @@ +#!/usr/bin/env python3 +""" +Cross-platform hash comparison for determinism verification. 
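+
+A typical invocation (illustrative artifact paths; the flags correspond to the
+argparse definition in main() below):
+
+    python deploy/tools/ci/determinism/compare-platform-hashes.py \
+        --linux out/hashes-linux.json \
+        --windows out/hashes-windows.json \
+        --macos out/hashes-macos.json \
+        --output out/determinism-report.json \
+        --markdown out/determinism-report.md
+
+The script exits non-zero when any divergence is found, so it can act as a CI gate.
+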
+Sprint: SPRINT_20251226_007_BE_determinism_gaps +Task: DET-GAP-13 - Cross-platform hash comparison report generation +""" + +import argparse +import json +import sys +from datetime import datetime, timezone +from pathlib import Path +from typing import Any + + +def load_hashes(path: str) -> dict[str, str]: + """Load hash file from path.""" + with open(path) as f: + data = json.load(f) + return data.get("hashes", data) + + +def compare_hashes( + linux: dict[str, str], + windows: dict[str, str], + macos: dict[str, str] +) -> tuple[list[dict], list[str]]: + """ + Compare hashes across platforms. + Returns (divergences, matched_keys). + """ + all_keys = set(linux.keys()) | set(windows.keys()) | set(macos.keys()) + divergences = [] + matched = [] + + for key in sorted(all_keys): + linux_hash = linux.get(key, "MISSING") + windows_hash = windows.get(key, "MISSING") + macos_hash = macos.get(key, "MISSING") + + if linux_hash == windows_hash == macos_hash: + matched.append(key) + else: + divergences.append({ + "key": key, + "linux": linux_hash, + "windows": windows_hash, + "macos": macos_hash + }) + + return divergences, matched + + +def generate_markdown_report( + divergences: list[dict], + matched: list[str], + linux_path: str, + windows_path: str, + macos_path: str +) -> str: + """Generate Markdown report.""" + lines = [ + f"**Generated:** {datetime.now(timezone.utc).isoformat()}", + "", + "### Summary", + "", + f"- ✅ **Matched:** {len(matched)} hashes", + f"- {'❌' if divergences else '✅'} **Divergences:** {len(divergences)} hashes", + "", + ] + + if divergences: + lines.extend([ + "### Divergences", + "", + "| Key | Linux | Windows | macOS |", + "|-----|-------|---------|-------|", + ]) + for d in divergences: + linux_short = d["linux"][:16] + "..." if len(d["linux"]) > 16 else d["linux"] + windows_short = d["windows"][:16] + "..." if len(d["windows"]) > 16 else d["windows"] + macos_short = d["macos"][:16] + "..." if len(d["macos"]) > 16 else d["macos"] + lines.append(f"| `{d['key']}` | `{linux_short}` | `{windows_short}` | `{macos_short}` |") + lines.append("") + + lines.extend([ + "### Matched Hashes", + "", + f"
Show {len(matched)} matched hashes", + "", + ]) + for key in matched[:50]: # Limit display + lines.append(f"- `{key}`") + if len(matched) > 50: + lines.append(f"- ... and {len(matched) - 50} more") + lines.extend(["", "
", ""]) + + return "\n".join(lines) + + +def main(): + parser = argparse.ArgumentParser(description="Compare determinism hashes across platforms") + parser.add_argument("--linux", required=True, help="Path to Linux hashes JSON") + parser.add_argument("--windows", required=True, help="Path to Windows hashes JSON") + parser.add_argument("--macos", required=True, help="Path to macOS hashes JSON") + parser.add_argument("--output", required=True, help="Output JSON report path") + parser.add_argument("--markdown", required=True, help="Output Markdown report path") + args = parser.parse_args() + + # Load hashes + linux_hashes = load_hashes(args.linux) + windows_hashes = load_hashes(args.windows) + macos_hashes = load_hashes(args.macos) + + # Compare + divergences, matched = compare_hashes(linux_hashes, windows_hashes, macos_hashes) + + # Generate reports + report = { + "timestamp": datetime.now(timezone.utc).isoformat(), + "sources": { + "linux": args.linux, + "windows": args.windows, + "macos": args.macos + }, + "summary": { + "matched": len(matched), + "divergences": len(divergences), + "total": len(matched) + len(divergences) + }, + "divergences": divergences, + "matched": matched + } + + # Write JSON report + Path(args.output).parent.mkdir(parents=True, exist_ok=True) + with open(args.output, "w") as f: + json.dump(report, f, indent=2) + + # Write Markdown report + markdown = generate_markdown_report( + divergences, matched, + args.linux, args.windows, args.macos + ) + with open(args.markdown, "w") as f: + f.write(markdown) + + # Print summary + print(f"Comparison complete:") + print(f" Matched: {len(matched)}") + print(f" Divergences: {len(divergences)}") + + # Exit with error if divergences found + if divergences: + print("\nERROR: Hash divergences detected!") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/GlobalUsings.cs b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/GlobalUsings.cs new file mode 100644 index 000000000..8c927eb74 --- /dev/null +++ b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/GlobalUsings.cs @@ -0,0 +1 @@ +global using Xunit; \ No newline at end of file diff --git a/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrime.Tests.csproj b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrime.Tests.csproj new file mode 100644 index 000000000..bbb98faa3 --- /dev/null +++ b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrime.Tests.csproj @@ -0,0 +1,16 @@ + + + net10.0 + true + enable + enable + preview + true + + + + + + + + diff --git a/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrimeTests.cs b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrimeTests.cs new file mode 100644 index 000000000..adf182ce1 --- /dev/null +++ b/deploy/tools/ci/nuget-prime/__Tests/NugetPrime.Tests/NugetPrimeTests.cs @@ -0,0 +1,48 @@ +using System.Xml.Linq; +using FluentAssertions; + +namespace NugetPrime.Tests; + +public sealed class NugetPrimeTests +{ + [Theory] + [InlineData("nuget-prime.csproj")] + [InlineData("nuget-prime-v9.csproj")] + public void PackageDownloads_ArePinned(string projectFile) + { + var repoRoot = FindRepoRoot(); + var path = Path.Combine(repoRoot, "devops", "tools", "nuget-prime", projectFile); + File.Exists(path).Should().BeTrue($"expected {projectFile} under devops/tools/nuget-prime"); + + var doc = XDocument.Load(path); + var packages = doc.Descendants().Where(element => element.Name.LocalName == "PackageDownload").ToList(); + 
packages.Should().NotBeEmpty(); + + foreach (var package in packages) + { + var include = package.Attribute("Include")?.Value; + include.Should().NotBeNullOrWhiteSpace(); + + var version = package.Attribute("Version")?.Value; + version.Should().NotBeNullOrWhiteSpace(); + version.Should().NotContain("*"); + } + } + + private static string FindRepoRoot() + { + var current = new DirectoryInfo(AppContext.BaseDirectory); + for (var i = 0; i < 12 && current is not null; i++) + { + var candidate = Path.Combine(current.FullName, "devops", "tools", "nuget-prime", "nuget-prime.csproj"); + if (File.Exists(candidate)) + { + return current.FullName; + } + + current = current.Parent; + } + + throw new DirectoryNotFoundException("Repo root not found for devops/tools/nuget-prime"); + } +} diff --git a/deploy/tools/ci/nuget-prime/mirror-packages.txt b/deploy/tools/ci/nuget-prime/mirror-packages.txt new file mode 100644 index 000000000..2e3967478 --- /dev/null +++ b/deploy/tools/ci/nuget-prime/mirror-packages.txt @@ -0,0 +1,30 @@ +AWSSDK.S3|3.7.305.6 +CycloneDX.Core|10.0.1 +Google.Protobuf|3.27.2 +Grpc.Net.Client|2.65.0 +Grpc.Tools|2.65.0 +Microsoft.Data.Sqlite|9.0.0-rc.1.24451.1 +Microsoft.Extensions.Configuration.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Configuration.Abstractions|9.0.0 +Microsoft.Extensions.Configuration.Binder|10.0.0-rc.2.25502.107 +Microsoft.Extensions.DependencyInjection.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.DependencyInjection.Abstractions|9.0.0 +Microsoft.Extensions.Diagnostics.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Diagnostics.HealthChecks|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Hosting.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Http.Polly|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Http|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Logging.Abstractions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Logging.Abstractions|9.0.0 +Microsoft.Extensions.Options.ConfigurationExtensions|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Options|10.0.0-rc.2.25502.107 +Microsoft.Extensions.Options|9.0.0 +Npgsql|9.0.3 +Npgsql.EntityFrameworkCore.PostgreSQL|9.0.3 +RoaringBitmap|0.0.9 +Serilog.AspNetCore|8.0.1 +Serilog.Extensions.Hosting|8.0.0 +Serilog.Sinks.Console|5.0.1 +StackExchange.Redis|2.7.33 +System.Text.Json|10.0.0-preview.7.25380.108 diff --git a/deploy/tools/ci/nuget-prime/nuget-prime-v9.csproj b/deploy/tools/ci/nuget-prime/nuget-prime-v9.csproj new file mode 100644 index 000000000..36dbbdb0b --- /dev/null +++ b/deploy/tools/ci/nuget-prime/nuget-prime-v9.csproj @@ -0,0 +1,14 @@ + + + net10.0 + ../../.nuget/packages + true + false + + + + + + + + diff --git a/deploy/tools/ci/nuget-prime/nuget-prime.csproj b/deploy/tools/ci/nuget-prime/nuget-prime.csproj new file mode 100644 index 000000000..1200a8844 --- /dev/null +++ b/deploy/tools/ci/nuget-prime/nuget-prime.csproj @@ -0,0 +1,45 @@ + + + net10.0 + ../../.nuget/packages + true + false + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/deploy/tools/feeds/concelier/backfill-store-aoc-19-005.sh b/deploy/tools/feeds/concelier/backfill-store-aoc-19-005.sh new file mode 100644 index 000000000..03f15f9da --- /dev/null +++ b/deploy/tools/feeds/concelier/backfill-store-aoc-19-005.sh @@ -0,0 +1,87 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Postgres backfill runner for STORE-AOC-19-005-DEV (Link-Not-Merge raw linksets/chunks) +# Usage: +# 
PGURI=postgres://.../concelier ./scripts/concelier/backfill-store-aoc-19-005.sh /path/to/linksets-stage-backfill.tar.zst +# Optional: +# PGSCHEMA=lnm_raw (default), DRY_RUN=1 to stop after extraction +# +# Assumptions: +# - Dataset contains ndjson files: linksets.ndjson, advisory_chunks.ndjson, manifest.json +# - Target staging tables are created by this script if absent: +# .linksets_raw(id text primary key, raw jsonb) +# .advisory_chunks_raw(id text primary key, raw jsonb) + +DATASET_PATH="${1:-}" +if [[ -z "${DATASET_PATH}" || ! -f "${DATASET_PATH}" ]]; then + echo "Dataset tarball not found. Provide path to linksets-stage-backfill.tar.zst" >&2 + exit 1 +fi + +PGURI="${PGURI:-${CONCELIER_PG_URI:-}}" +PGSCHEMA="${PGSCHEMA:-lnm_raw}" +DRY_RUN="${DRY_RUN:-0}" + +if [[ -z "${PGURI}" ]]; then + echo "PGURI (or CONCELIER_PG_URI) must be set" >&2 + exit 1 +fi + +WORKDIR="$(mktemp -d)" +cleanup() { rm -rf "${WORKDIR}"; } +trap cleanup EXIT + +echo "==> Dataset: ${DATASET_PATH}" +sha256sum "${DATASET_PATH}" + +echo "==> Extracting to ${WORKDIR}" +tar -xf "${DATASET_PATH}" -C "${WORKDIR}" + +for required in linksets.ndjson advisory_chunks.ndjson manifest.json; do + if [[ ! -f "${WORKDIR}/${required}" ]]; then + echo "Missing required file in dataset: ${required}" >&2 + exit 1 + fi +done + +echo "==> Ensuring staging schema/tables exist in Postgres" +psql "${PGURI}" < Importing linksets into ${PGSCHEMA}.linksets_raw" +cat >"${WORKDIR}/linksets.tsv" <(jq -rc '[._id, .] | @tsv' "${WORKDIR}/linksets.ndjson") +psql "${PGURI}" < Importing advisory_chunks into ${PGSCHEMA}.advisory_chunks_raw" +cat >"${WORKDIR}/advisory_chunks.tsv" <(jq -rc '[._id, .] | @tsv' "${WORKDIR}/advisory_chunks.ndjson") +psql "${PGURI}" < Post-import counts" +psql -tA "${PGURI}" -c "select 'linksets_raw='||count(*) from ${PGSCHEMA}.linksets_raw;" +psql -tA "${PGURI}" -c "select 'advisory_chunks_raw='||count(*) from ${PGSCHEMA}.advisory_chunks_raw;" + +echo "==> Manifest summary" +cat "${WORKDIR}/manifest.json" + +echo "Backfill complete." diff --git a/deploy/tools/feeds/concelier/build-store-aoc-19-005-dataset.sh b/deploy/tools/feeds/concelier/build-store-aoc-19-005-dataset.sh new file mode 100644 index 000000000..c7b3e5e5a --- /dev/null +++ b/deploy/tools/feeds/concelier/build-store-aoc-19-005-dataset.sh @@ -0,0 +1,74 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Deterministic dataset builder for STORE-AOC-19-005-DEV. +# Generates linksets-stage-backfill.tar.zst from repo seed data. +# Usage: +# ./scripts/concelier/build-store-aoc-19-005-dataset.sh [output_tarball] +# Default output: out/linksets/linksets-stage-backfill.tar.zst + +command -v tar >/dev/null || { echo "tar is required" >&2; exit 1; } +command -v sha256sum >/dev/null || { echo "sha256sum is required" >&2; exit 1; } + +TAR_COMPRESS=() +if command -v zstd >/dev/null 2>&1; then + TAR_COMPRESS=(--zstd) +else + echo "zstd not found; building uncompressed tarball (extension kept for compatibility)" >&2 +fi + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)" +SEED_DIR="${ROOT_DIR}/src/__Tests/__Datasets/seed-data/concelier/store-aoc-19-005" +OUT_DIR="${ROOT_DIR}/out/linksets" +OUT_PATH="${1:-${OUT_DIR}/linksets-stage-backfill.tar.zst}" +GEN_TIME="2025-12-07T00:00:00Z" + +for seed in linksets.ndjson advisory_chunks.ndjson; do + if [[ ! 
-f "${SEED_DIR}/${seed}" ]]; then + echo "Missing seed file: ${SEED_DIR}/${seed}" >&2 + exit 1 + fi +done + +WORKDIR="$(mktemp -d)" +cleanup() { rm -rf "${WORKDIR}"; } +trap cleanup EXIT + +cp "${SEED_DIR}/linksets.ndjson" "${WORKDIR}/linksets.ndjson" +cp "${SEED_DIR}/advisory_chunks.ndjson" "${WORKDIR}/advisory_chunks.ndjson" + +linksets_sha=$(sha256sum "${WORKDIR}/linksets.ndjson" | awk '{print $1}') +advisory_sha=$(sha256sum "${WORKDIR}/advisory_chunks.ndjson" | awk '{print $1}') +linksets_count=$(wc -l < "${WORKDIR}/linksets.ndjson" | tr -d '[:space:]') +advisory_count=$(wc -l < "${WORKDIR}/advisory_chunks.ndjson" | tr -d '[:space:]') + +cat >"${WORKDIR}/manifest.json" < "${OUT_PATH}.sha256" + +echo "Wrote ${OUT_PATH}" +cat "${OUT_PATH}.sha256" diff --git a/deploy/tools/feeds/concelier/export-linksets-tarball.sh b/deploy/tools/feeds/concelier/export-linksets-tarball.sh new file mode 100644 index 000000000..2b05c5336 --- /dev/null +++ b/deploy/tools/feeds/concelier/export-linksets-tarball.sh @@ -0,0 +1,55 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Export Concelier linksets/advisory_chunks from Postgres to a tar.zst bundle. +# Usage: +# PGURI=postgres://user:pass@host:5432/db \ +# ./scripts/concelier/export-linksets-tarball.sh out/linksets/linksets-stage-backfill.tar.zst +# +# Optional env: +# PGSCHEMA=public # schema that owns linksets/advisory_chunks +# LINKSETS_TABLE=linksets # table name for linksets +# CHUNKS_TABLE=advisory_chunks # table name for advisory chunks +# TMPDIR=/tmp/export-linksets # working directory (defaults to mktemp) + +TARGET="${1:-}" +if [[ -z "${TARGET}" ]]; then + echo "Usage: PGURI=... $0 out/linksets/linksets-stage-backfill.tar.zst" >&2 + exit 1 +fi + +if [[ -z "${PGURI:-}" ]]; then + echo "PGURI environment variable is required (postgres://...)" >&2 + exit 1 +fi + +PGSCHEMA="${PGSCHEMA:-public}" +LINKSETS_TABLE="${LINKSETS_TABLE:-linksets}" +CHUNKS_TABLE="${CHUNKS_TABLE:-advisory_chunks}" +WORKDIR="${TMPDIR:-$(mktemp -d)}" + +mkdir -p "${WORKDIR}" +OUTDIR="$(dirname "${TARGET}")" +mkdir -p "${OUTDIR}" + +echo "==> Exporting linksets from ${PGSCHEMA}.${LINKSETS_TABLE}" +psql "${PGURI}" -c "\copy (select row_to_json(t) from ${PGSCHEMA}.${LINKSETS_TABLE} t) to '${WORKDIR}/linksets.ndjson'" + +echo "==> Exporting advisory_chunks from ${PGSCHEMA}.${CHUNKS_TABLE}" +psql "${PGURI}" -c "\copy (select row_to_json(t) from ${PGSCHEMA}.${CHUNKS_TABLE} t) to '${WORKDIR}/advisory_chunks.ndjson'" + +LINKSETS_COUNT="$(wc -l < "${WORKDIR}/linksets.ndjson")" +CHUNKS_COUNT="$(wc -l < "${WORKDIR}/advisory_chunks.ndjson")" + +echo "==> Writing manifest.json" +jq -n --argjson linksets "${LINKSETS_COUNT}" --argjson advisory_chunks "${CHUNKS_COUNT}" \ + '{linksets: $linksets, advisory_chunks: $advisory_chunks}' \ + > "${WORKDIR}/manifest.json" + +echo "==> Building tarball ${TARGET}" +tar -I "zstd -19" -cf "${TARGET}" -C "${WORKDIR}" linksets.ndjson advisory_chunks.ndjson manifest.json + +echo "==> SHA-256" +sha256sum "${TARGET}" + +echo "Done. Workdir: ${WORKDIR}" diff --git a/deploy/tools/feeds/concelier/test-store-aoc-19-005-dataset.sh b/deploy/tools/feeds/concelier/test-store-aoc-19-005-dataset.sh new file mode 100644 index 000000000..04621d0f3 --- /dev/null +++ b/deploy/tools/feeds/concelier/test-store-aoc-19-005-dataset.sh @@ -0,0 +1,90 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Validates the store-aoc-19-005 dataset tarball. 
+# Usage: ./scripts/concelier/test-store-aoc-19-005-dataset.sh [tarball] + +command -v tar >/dev/null || { echo "tar is required" >&2; exit 1; } +command -v sha256sum >/dev/null || { echo "sha256sum is required" >&2; exit 1; } +command -v python >/dev/null || { echo "python is required" >&2; exit 1; } + +DATASET="${1:-out/linksets/linksets-stage-backfill.tar.zst}" + +if [[ ! -f "${DATASET}" ]]; then + echo "Dataset not found: ${DATASET}" >&2 + exit 1 +fi + +WORKDIR="$(mktemp -d)" +cleanup() { rm -rf "${WORKDIR}"; } +trap cleanup EXIT + +tar -xf "${DATASET}" -C "${WORKDIR}" + +for required in linksets.ndjson advisory_chunks.ndjson manifest.json; do + if [[ ! -f "${WORKDIR}/${required}" ]]; then + echo "Missing ${required} in dataset" >&2 + exit 1 + fi +done + +manifest="${WORKDIR}/manifest.json" +expected_linksets=$(python - <<'PY' "${manifest}" +import json, sys +with open(sys.argv[1], "r", encoding="utf-8") as f: + data = json.load(f) +print(data["records"]["linksets"]) +PY +) +expected_chunks=$(python - <<'PY' "${manifest}" +import json, sys +with open(sys.argv[1], "r", encoding="utf-8") as f: + data = json.load(f) +print(data["records"]["advisory_chunks"]) +PY +) +expected_linksets_sha=$(python - <<'PY' "${manifest}" +import json, sys +with open(sys.argv[1], "r", encoding="utf-8") as f: + data = json.load(f) +print(data["sha256"]["linksets.ndjson"]) +PY +) +expected_chunks_sha=$(python - <<'PY' "${manifest}" +import json, sys +with open(sys.argv[1], "r", encoding="utf-8") as f: + data = json.load(f) +print(data["sha256"]["advisory_chunks.ndjson"]) +PY +) + +actual_linksets=$(wc -l < "${WORKDIR}/linksets.ndjson" | tr -d '[:space:]') +actual_chunks=$(wc -l < "${WORKDIR}/advisory_chunks.ndjson" | tr -d '[:space:]') +actual_linksets_sha=$(sha256sum "${WORKDIR}/linksets.ndjson" | awk '{print $1}') +actual_chunks_sha=$(sha256sum "${WORKDIR}/advisory_chunks.ndjson" | awk '{print $1}') + +if [[ "${expected_linksets}" != "${actual_linksets}" ]]; then + echo "linksets count mismatch: expected ${expected_linksets}, got ${actual_linksets}" >&2 + exit 1 +fi + +if [[ "${expected_chunks}" != "${actual_chunks}" ]]; then + echo "advisory_chunks count mismatch: expected ${expected_chunks}, got ${actual_chunks}" >&2 + exit 1 +fi + +if [[ "${expected_linksets_sha}" != "${actual_linksets_sha}" ]]; then + echo "linksets sha mismatch: expected ${expected_linksets_sha}, got ${actual_linksets_sha}" >&2 + exit 1 +fi + +if [[ "${expected_chunks_sha}" != "${actual_chunks_sha}" ]]; then + echo "advisory_chunks sha mismatch: expected ${expected_chunks_sha}, got ${actual_chunks_sha}" >&2 + exit 1 +fi + +echo "Dataset validation succeeded:" +echo " linksets: ${actual_linksets}" +echo " advisory_chunks: ${actual_chunks}" +echo " linksets.sha256=${actual_linksets_sha}" +echo " advisory_chunks.sha256=${actual_chunks_sha}" diff --git a/deploy/tools/feeds/feeds/run_icscisa_kisa_refresh.py b/deploy/tools/feeds/feeds/run_icscisa_kisa_refresh.py new file mode 100644 index 000000000..1813d45f9 --- /dev/null +++ b/deploy/tools/feeds/feeds/run_icscisa_kisa_refresh.py @@ -0,0 +1,467 @@ +#!/usr/bin/env python3 +""" +ICS/KISA feed refresh runner. + +Runs the SOP v0.2 workflow to emit NDJSON advisories, delta, fetch log, and hash +manifest under out/feeds/icscisa-kisa//. + +Defaults to live fetch with offline-safe fallback to baked-in samples. You can +force live/offline via env or CLI flags. 
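+
+Illustrative invocations (example paths; flags and environment variables match
+the argparse/env handling in main() below):
+
+    # Offline run using only the baked-in sample advisories
+    python run_icscisa_kisa_refresh.py --offline --out-dir out/feeds/icscisa-kisa
+
+    # Live fetch routed through the on-prem gateway host
+    LIVE_FETCH=true FEED_GATEWAY_HOST=concelier-webservice \
+        python run_icscisa_kisa_refresh.py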
+""" + +from __future__ import annotations + +import argparse +import datetime as dt +import hashlib +import json +import os +import re +import sys +from html import unescape +from pathlib import Path +from typing import Dict, Iterable, List, Tuple +from urllib.error import URLError, HTTPError +from urllib.parse import urlparse, urlunparse +from urllib.request import Request, urlopen +from xml.etree import ElementTree + + +DEFAULT_OUTPUT_ROOT = Path("out/feeds/icscisa-kisa") +DEFAULT_ICSCISA_URL = "https://www.cisa.gov/news-events/ics-advisories/icsa.xml" +DEFAULT_KISA_URL = "https://knvd.krcert.or.kr/rss/securityInfo.do" +DEFAULT_GATEWAY_HOST = "concelier-webservice" +DEFAULT_GATEWAY_SCHEME = "http" +USER_AGENT = "StellaOpsFeedRefresh/1.0 (+https://stella-ops.org)" + + +def utcnow() -> dt.datetime: + return dt.datetime.utcnow().replace(tzinfo=dt.timezone.utc) + + +def iso(ts: dt.datetime) -> str: + return ts.strftime("%Y-%m-%dT%H:%M:%SZ") + + +def sha256_bytes(data: bytes) -> str: + return hashlib.sha256(data).hexdigest() + + +def strip_html(value: str) -> str: + return re.sub(r"<[^>]+>", "", value or "").strip() + + +def safe_request(url: str) -> bytes: + req = Request(url, headers={"User-Agent": USER_AGENT}) + with urlopen(req, timeout=30) as resp: + return resp.read() + + +def parse_rss_items(xml_bytes: bytes) -> Iterable[Dict[str, str]]: + root = ElementTree.fromstring(xml_bytes) + for item in root.findall(".//item"): + title = (item.findtext("title") or "").strip() + link = (item.findtext("link") or "").strip() + description = strip_html(unescape(item.findtext("description") or "")) + pub_date = (item.findtext("pubDate") or "").strip() + yield { + "title": title, + "link": link, + "description": description, + "pub_date": pub_date, + } + + +def normalize_icscisa_record(item: Dict[str, str], fetched_at: str, run_id: str) -> Dict[str, object]: + advisory_id = item["title"].split(":")[0].strip() or "icsa-unknown" + summary = item["description"] or item["title"] + raw_payload = f"{item['title']}\n{item['link']}\n{item['description']}" + record = { + "advisory_id": advisory_id, + "source": "icscisa", + "source_url": item["link"] or DEFAULT_ICSCISA_URL, + "title": item["title"] or advisory_id, + "summary": summary, + "published": iso(parse_pubdate(item["pub_date"])), + "updated": iso(parse_pubdate(item["pub_date"])), + "severity": "unknown", + "cvss": None, + "cwe": [], + "affected_products": [], + "references": [url for url in (item["link"],) if url], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": fetched_at, + "run_id": run_id, + "payload_sha256": sha256_bytes(raw_payload.encode("utf-8")), + } + return record + + +def normalize_kisa_record(item: Dict[str, str], fetched_at: str, run_id: str) -> Dict[str, object]: + advisory_id = extract_kisa_id(item) + raw_payload = f"{item['title']}\n{item['link']}\n{item['description']}" + record = { + "advisory_id": advisory_id, + "source": "kisa", + "source_url": item["link"] or DEFAULT_KISA_URL, + "title": item["title"] or advisory_id, + "summary": item["description"] or item["title"], + "published": iso(parse_pubdate(item["pub_date"])), + "updated": iso(parse_pubdate(item["pub_date"])), + "severity": "unknown", + "cvss": None, + "cwe": [], + "affected_products": [], + "references": [url for url in (item["link"], DEFAULT_KISA_URL) if url], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": fetched_at, + "run_id": run_id, + "payload_sha256": sha256_bytes(raw_payload.encode("utf-8")), 
+ } + return record + + +def extract_kisa_id(item: Dict[str, str]) -> str: + link = item["link"] + match = re.search(r"IDX=([0-9]+)", link) + if match: + return f"KISA-{match.group(1)}" + return (item["title"].split()[0] if item["title"] else "KISA-unknown").strip() + + +def parse_pubdate(value: str) -> dt.datetime: + if not value: + return utcnow() + try: + # RFC1123-ish + return dt.datetime.strptime(value, "%a, %d %b %Y %H:%M:%S %Z").replace(tzinfo=dt.timezone.utc) + except ValueError: + try: + return dt.datetime.fromisoformat(value.replace("Z", "+00:00")) + except ValueError: + return utcnow() + + +def sample_records() -> List[Dict[str, object]]: + now_iso = iso(utcnow()) + return [ + { + "advisory_id": "ICSA-25-123-01", + "source": "icscisa", + "source_url": "https://www.cisa.gov/news-events/ics-advisories/icsa-25-123-01", + "title": "Example ICS Advisory", + "summary": "Example Corp ControlSuite RCE via exposed management service.", + "published": "2025-10-13T12:00:00Z", + "updated": "2025-11-30T00:00:00Z", + "severity": "High", + "cvss": {"version": "3.1", "vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "score": 9.8}, + "cwe": ["CWE-269"], + "affected_products": [{"vendor": "Example Corp", "product": "ControlSuite", "versions": ["4.2.0", "4.2.1"]}], + "references": [ + "https://example.com/security/icsa-25-123-01.pdf", + "https://www.cisa.gov/news-events/ics-advisories/icsa-25-123-01", + ], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": now_iso, + "run_id": "", + "payload_sha256": sha256_bytes(b"ICSA-25-123-01 Example ControlSuite advisory payload"), + }, + { + "advisory_id": "ICSMA-25-045-01", + "source": "icscisa", + "source_url": "https://www.cisa.gov/news-events/ics-medical-advisories/icsma-25-045-01", + "title": "Example Medical Advisory", + "summary": "HealthTech infusion pump vulnerabilities including two CVEs.", + "published": "2025-10-14T09:30:00Z", + "updated": "2025-12-01T00:00:00Z", + "severity": "Medium", + "cvss": {"version": "3.1", "vector": "CVSS:3.1/AV:N/AC:H/PR:L/UI:R/S:U/C:L/I:L/A:L", "score": 6.3}, + "cwe": ["CWE-319"], + "affected_products": [{"vendor": "HealthTech", "product": "InfusionManager", "versions": ["2.1.0", "2.1.1"]}], + "references": [ + "https://www.cisa.gov/news-events/ics-medical-advisories/icsma-25-045-01", + "https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2025-11111", + ], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": now_iso, + "run_id": "", + "payload_sha256": sha256_bytes(b"ICSMA-25-045-01 Example medical advisory payload"), + }, + { + "advisory_id": "KISA-2025-5859", + "source": "kisa", + "source_url": "https://knvd.krcert.or.kr/detailDos.do?IDX=5859", + "title": "KISA sample advisory 5859", + "summary": "Remote code execution in ControlBoard service (offline HTML snapshot).", + "published": "2025-11-03T22:53:00Z", + "updated": "2025-12-02T00:00:00Z", + "severity": "High", + "cvss": {"version": "3.1", "vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H", "score": 9.8}, + "cwe": ["CWE-787"], + "affected_products": [{"vendor": "ACME", "product": "ControlBoard", "versions": ["1.0.1.0084", "2.0.1.0034"]}], + "references": [ + "https://knvd.krcert.or.kr/rss/securityInfo.do", + "https://knvd.krcert.or.kr/detailDos.do?IDX=5859", + ], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": now_iso, + "run_id": "", + "payload_sha256": sha256_bytes(b"KISA advisory IDX 5859 cached HTML payload"), + }, + { + "advisory_id": 
"KISA-2025-5860", + "source": "kisa", + "source_url": "https://knvd.krcert.or.kr/detailDos.do?IDX=5860", + "title": "KISA sample advisory 5860", + "summary": "Authentication bypass via default credentials in NetGateway appliance.", + "published": "2025-11-03T22:53:00Z", + "updated": "2025-12-02T00:00:00Z", + "severity": "Medium", + "cvss": {"version": "3.1", "vector": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L", "score": 7.3}, + "cwe": ["CWE-798"], + "affected_products": [{"vendor": "NetGateway", "product": "Edge", "versions": ["3.4.2", "3.4.3"]}], + "references": [ + "https://knvd.krcert.or.kr/rss/securityInfo.do", + "https://knvd.krcert.or.kr/detailDos.do?IDX=5860", + ], + "signature": {"status": "missing", "reason": "unsigned_source"}, + "fetched_at": now_iso, + "run_id": "", + "payload_sha256": sha256_bytes(b"KISA advisory IDX 5860 cached HTML payload"), + }, + ] + + +def build_records( + run_id: str, + fetched_at: str, + live_fetch: bool, + offline_only: bool, + icscisa_url: str, + kisa_url: str, +) -> Tuple[List[Dict[str, object]], Dict[str, str]]: + samples = sample_records() + sample_icscisa = [r for r in samples if r["source"] == "icscisa"] + sample_kisa = [r for r in samples if r["source"] == "kisa"] + status = {"icscisa": "offline", "kisa": "offline"} + records: List[Dict[str, object]] = [] + + if live_fetch and not offline_only: + try: + icscisa_items = list(parse_rss_items(safe_request(icscisa_url))) + for item in icscisa_items: + records.append(normalize_icscisa_record(item, fetched_at, run_id)) + status["icscisa"] = f"live:{len(icscisa_items)}" + except (URLError, HTTPError, ElementTree.ParseError, TimeoutError) as exc: + print(f"[warn] ICS CISA fetch failed ({exc}); falling back to samples.", file=sys.stderr) + + try: + kisa_items = list(parse_rss_items(safe_request(kisa_url))) + for item in kisa_items: + records.append(normalize_kisa_record(item, fetched_at, run_id)) + status["kisa"] = f"live:{len(kisa_items)}" + except (URLError, HTTPError, ElementTree.ParseError, TimeoutError) as exc: + print(f"[warn] KISA fetch failed ({exc}); falling back to samples.", file=sys.stderr) + + if not records or status["icscisa"].startswith("live") is False: + records.extend(apply_run_metadata(sample_icscisa, run_id, fetched_at)) + status["icscisa"] = status.get("icscisa") or "offline" + + if not any(r["source"] == "kisa" for r in records): + records.extend(apply_run_metadata(sample_kisa, run_id, fetched_at)) + status["kisa"] = status.get("kisa") or "offline" + + return records, status + + +def apply_run_metadata(records: Iterable[Dict[str, object]], run_id: str, fetched_at: str) -> List[Dict[str, object]]: + updated = [] + for record in records: + copy = dict(record) + copy["run_id"] = run_id + copy["fetched_at"] = fetched_at + copy["payload_sha256"] = record.get("payload_sha256") or sha256_bytes(json.dumps(record, sort_keys=True).encode("utf-8")) + updated.append(copy) + return updated + + +def find_previous_snapshot(base_dir: Path, current_run_date: str) -> Path | None: + if not base_dir.exists(): + return None + candidates = sorted(p for p in base_dir.iterdir() if p.is_dir() and p.name != current_run_date) + if not candidates: + return None + return candidates[-1] / "advisories.ndjson" + + +def load_previous_hash(path: Path | None) -> str | None: + if path and path.exists(): + return sha256_bytes(path.read_bytes()) + return None + + +def compute_delta(new_records: List[Dict[str, object]], previous_path: Path | None) -> Dict[str, object]: + prev_records = {} + if previous_path and 
previous_path.exists(): + with previous_path.open("r", encoding="utf-8") as handle: + for line in handle: + if line.strip(): + rec = json.loads(line) + prev_records[rec["advisory_id"]] = rec + + new_by_id = {r["advisory_id"]: r for r in new_records} + added = [rid for rid in new_by_id if rid not in prev_records] + updated = [ + rid + for rid, rec in new_by_id.items() + if rid in prev_records and rec.get("payload_sha256") != prev_records[rid].get("payload_sha256") + ] + removed = [rid for rid in prev_records if rid not in new_by_id] + + return { + "added": {"icscisa": [rid for rid in added if new_by_id[rid]["source"] == "icscisa"], + "kisa": [rid for rid in added if new_by_id[rid]["source"] == "kisa"]}, + "updated": {"icscisa": [rid for rid in updated if new_by_id[rid]["source"] == "icscisa"], + "kisa": [rid for rid in updated if new_by_id[rid]["source"] == "kisa"]}, + "removed": {"icscisa": [rid for rid in removed if prev_records[rid]["source"] == "icscisa"], + "kisa": [rid for rid in removed if prev_records[rid]["source"] == "kisa"]}, + "totals": { + "icscisa": { + "added": len([rid for rid in added if new_by_id[rid]["source"] == "icscisa"]), + "updated": len([rid for rid in updated if new_by_id[rid]["source"] == "icscisa"]), + "removed": len([rid for rid in removed if prev_records[rid]["source"] == "icscisa"]), + "remaining": len([rid for rid, rec in new_by_id.items() if rec["source"] == "icscisa"]), + }, + "kisa": { + "added": len([rid for rid in added if new_by_id[rid]["source"] == "kisa"]), + "updated": len([rid for rid in updated if new_by_id[rid]["source"] == "kisa"]), + "removed": len([rid for rid in removed if prev_records[rid]["source"] == "kisa"]), + "remaining": len([rid for rid, rec in new_by_id.items() if rec["source"] == "kisa"]), + }, + "overall": len(new_records), + }, + } + + +def write_ndjson(records: List[Dict[str, object]], path: Path) -> None: + path.write_text("\n".join(json.dumps(r, sort_keys=True, separators=(",", ":")) for r in records) + "\n", encoding="utf-8") + + +def write_fetch_log( + path: Path, + run_id: str, + start: str, + end: str, + status: Dict[str, str], + gateway_host: str, + gateway_scheme: str, + icscisa_url: str, + kisa_url: str, + live_fetch: bool, + offline_only: bool, +) -> None: + lines = [ + f"run_id={run_id} start={start} end={end}", + f"sources=icscisa,kisa cadence=weekly backlog_window=60d live_fetch={str(live_fetch).lower()} offline_only={str(offline_only).lower()}", + f"gateway={gateway_scheme}://{gateway_host}", + f"icscisa_url={icscisa_url} status={status.get('icscisa','offline')} retries=0", + f"kisa_url={kisa_url} status={status.get('kisa','offline')} retries=0", + "outputs=advisories.ndjson,delta.json,hashes.sha256", + ] + path.write_text("\n".join(lines) + "\n", encoding="utf-8") + + +def write_hashes(dir_path: Path) -> None: + entries = [] + for name in ["advisories.ndjson", "delta.json", "fetch.log"]: + file_path = dir_path / name + entries.append(f"{sha256_bytes(file_path.read_bytes())} {name}") + (dir_path / "hashes.sha256").write_text("\n".join(entries) + "\n", encoding="utf-8") + + +def main() -> None: + parser = argparse.ArgumentParser(description="Run ICS/KISA feed refresh SOP v0.2") + parser.add_argument("--out-dir", default=str(DEFAULT_OUTPUT_ROOT), help="Base output directory (default: out/feeds/icscisa-kisa)") + parser.add_argument("--run-date", default=None, help="Override run date (YYYYMMDD)") + parser.add_argument("--run-id", default=None, help="Override run id") + parser.add_argument("--live", action="store_true", 
default=False, help="Force live fetch (default: enabled via env LIVE_FETCH=true)") + parser.add_argument("--offline", action="store_true", default=False, help="Force offline samples only") + args = parser.parse_args() + + now = utcnow() + run_date = args.run_date or now.strftime("%Y%m%d") + run_id = args.run_id or f"icscisa-kisa-{now.strftime('%Y%m%dT%H%M%SZ')}" + fetched_at = iso(now) + start = fetched_at + + live_fetch = args.live or os.getenv("LIVE_FETCH", "true").lower() == "true" + offline_only = args.offline or os.getenv("OFFLINE_SNAPSHOT", "false").lower() == "true" + + output_root = Path(args.out_dir) + output_dir = output_root / run_date + output_dir.mkdir(parents=True, exist_ok=True) + + previous_path = find_previous_snapshot(output_root, run_date) + + gateway_host = os.getenv("FEED_GATEWAY_HOST", DEFAULT_GATEWAY_HOST) + gateway_scheme = os.getenv("FEED_GATEWAY_SCHEME", DEFAULT_GATEWAY_SCHEME) + + def resolve_feed(url_env: str, default_url: str) -> str: + if url_env: + return url_env + parsed = urlparse(default_url) + # Replace host/scheme to allow on-prem DNS (docker network) defaults. + rewritten = parsed._replace(netloc=gateway_host, scheme=gateway_scheme) + return urlunparse(rewritten) + + resolved_icscisa_url = resolve_feed(os.getenv("ICSCISA_FEED_URL"), DEFAULT_ICSCISA_URL) + resolved_kisa_url = resolve_feed(os.getenv("KISA_FEED_URL"), DEFAULT_KISA_URL) + + records, status = build_records( + run_id=run_id, + fetched_at=fetched_at, + live_fetch=live_fetch, + offline_only=offline_only, + icscisa_url=resolved_icscisa_url, + kisa_url=resolved_kisa_url, + ) + + write_ndjson(records, output_dir / "advisories.ndjson") + + delta = compute_delta(records, previous_path) + delta_payload = { + "run_id": run_id, + "generated_at": iso(utcnow()), + **delta, + "previous_snapshot_sha256": load_previous_hash(previous_path), + } + (output_dir / "delta.json").write_text(json.dumps(delta_payload, separators=(",", ":")) + "\n", encoding="utf-8") + + end = iso(utcnow()) + write_fetch_log( + output_dir / "fetch.log", + run_id, + start, + end, + status, + gateway_host=gateway_host, + gateway_scheme=gateway_scheme, + icscisa_url=resolved_icscisa_url, + kisa_url=resolved_kisa_url, + live_fetch=live_fetch and not offline_only, + offline_only=offline_only, + ) + write_hashes(output_dir) + + print(f"[ok] wrote {len(records)} advisories to {output_dir}") + print(f" run_id={run_id} live_fetch={live_fetch and not offline_only} offline_only={offline_only}") + print(f" gateway={gateway_scheme}://{gateway_host}") + print(f" icscisa_url={resolved_icscisa_url}") + print(f" kisa_url={resolved_kisa_url}") + print(f" status={status}") + if previous_path: + print(f" previous_snapshot={previous_path}") + + +if __name__ == "__main__": + main() diff --git a/deploy/tools/feeds/vex/requirements.txt b/deploy/tools/feeds/vex/requirements.txt new file mode 100644 index 000000000..b5d4deeb2 --- /dev/null +++ b/deploy/tools/feeds/vex/requirements.txt @@ -0,0 +1,2 @@ +blake3==0.4.1 +jsonschema==4.22.0 diff --git a/deploy/tools/feeds/vex/verify_proof_bundle.py b/deploy/tools/feeds/vex/verify_proof_bundle.py new file mode 100644 index 000000000..dae47518e --- /dev/null +++ b/deploy/tools/feeds/vex/verify_proof_bundle.py @@ -0,0 +1,176 @@ +#!/usr/bin/env python3 +""" +Offline verifier for StellaOps VEX proof bundles. + +- Validates the bundle against `docs/benchmarks/vex-evidence-playbook.schema.json`. +- Checks justification IDs against the signed catalog. 
+- Recomputes hashes for CAS artefacts, OpenVEX payload, and DSSE envelopes. +- Enforces coverage and negative-test requirements per task VEX-GAPS-401-062. +""" + +from __future__ import annotations + +import argparse +import base64 +import json +from pathlib import Path +import sys +from typing import Dict, Any + +import jsonschema +from blake3 import blake3 + + +def load_json(path: Path) -> Any: + return json.loads(path.read_text(encoding="utf-8")) + + +def digest_for(data: bytes, algo: str) -> str: + if algo == "sha256": + import hashlib + + return hashlib.sha256(data).hexdigest() + if algo == "blake3": + return blake3(data).hexdigest() + raise ValueError(f"Unsupported hash algorithm: {algo}") + + +def parse_digest(digest: str) -> tuple[str, str]: + if ":" not in digest: + raise ValueError(f"Digest missing prefix: {digest}") + algo, value = digest.split(":", 1) + return algo, value + + +def verify_digest(path: Path, expected: str) -> None: + algo, value = parse_digest(expected) + actual = digest_for(path.read_bytes(), algo) + if actual.lower() != value.lower(): + raise ValueError(f"Digest mismatch for {path}: expected {value}, got {actual}") + + +def resolve_cas_uri(cas_root: Path, cas_uri: str) -> Path: + if not cas_uri.startswith("cas://"): + raise ValueError(f"CAS URI must start with cas:// — got {cas_uri}") + relative = cas_uri[len("cas://") :] + return cas_root / relative + + +def verify_dsse(dsse_ref: Dict[str, Any]) -> None: + path = Path(dsse_ref["path"]) + verify_digest(path, dsse_ref["sha256"]) + if "payload_sha256" in dsse_ref: + envelope = load_json(path) + payload = base64.b64decode(envelope["payload"]) + verify_digest_from_bytes(payload, dsse_ref["payload_sha256"]) + + +def verify_digest_from_bytes(data: bytes, expected: str) -> None: + algo, value = parse_digest(expected) + actual = digest_for(data, algo) + if actual.lower() != value.lower(): + raise ValueError(f"Digest mismatch for payload: expected {value}, got {actual}") + + +def main() -> int: + parser = argparse.ArgumentParser(description="Verify a StellaOps VEX proof bundle.") + parser.add_argument("--bundle", required=True, type=Path) + parser.add_argument("--schema", required=True, type=Path) + parser.add_argument("--catalog", required=True, type=Path) + parser.add_argument("--cas-root", required=True, type=Path) + parser.add_argument("--min-coverage", type=float, default=95.0) + args = parser.parse_args() + + bundle = load_json(args.bundle) + schema = load_json(args.schema) + catalog = load_json(args.catalog) + + jsonschema.validate(instance=bundle, schema=schema) + + justification_ids = {entry["id"] for entry in catalog.get("entries", [])} + if bundle["justification"]["id"] not in justification_ids: + raise ValueError(f"Justification {bundle['justification']['id']} not found in catalog") + + # Justification DSSE integrity + if "dsse" in bundle["justification"]: + verify_dsse(bundle["justification"]["dsse"]) + + # OpenVEX canonical hashes + openvex_path = Path(bundle["openvex"]["path"]) + openvex_bytes = openvex_path.read_bytes() + verify_digest_from_bytes(openvex_bytes, bundle["openvex"]["canonical_sha256"]) + verify_digest_from_bytes(openvex_bytes, bundle["openvex"]["canonical_blake3"]) + + # CAS evidence + evidence_by_type: Dict[str, Dict[str, Any]] = {} + for ev in bundle["evidence"]: + ev_path = resolve_cas_uri(args.cas_root, ev["cas_uri"]) + verify_digest(ev_path, ev["hash"]) + if "dsse" in ev: + verify_dsse(ev["dsse"]) + evidence_by_type.setdefault(ev["type"], ev) + + # Graph hash alignment + graph = 
bundle["graph"] + graph_evidence = evidence_by_type.get("graph") + if not graph_evidence: + raise ValueError("Graph evidence missing from bundle") + if graph["hash"].lower() != graph_evidence["hash"].lower(): + raise ValueError("Graph hash does not match evidence hash") + if "dsse" in graph: + verify_dsse(graph["dsse"]) + + # Entrypoint coverage + negative tests + config/flags hashes + for ep in bundle["entrypoints"]: + if ep["coverage_percent"] < args.min_coverage: + raise ValueError( + f"Entrypoint {ep['id']} coverage {ep['coverage_percent']} below required {args.min_coverage}" + ) + if not ep["negative_tests"]: + raise ValueError(f"Entrypoint {ep['id']} missing negative test confirmation") + config_ev = evidence_by_type.get("config") + if not config_ev or config_ev["hash"].lower() != ep["config_hash"].lower(): + raise ValueError(f"Entrypoint {ep['id']} config_hash not backed by evidence") + flags_ev = evidence_by_type.get("flags") + if not flags_ev or flags_ev["hash"].lower() != ep["flags_hash"].lower(): + raise ValueError(f"Entrypoint {ep['id']} flags_hash not backed by evidence") + + # RBAC enforcement + rbac = bundle["rbac"] + if rbac["approvals_required"] < 1 or not rbac["roles_allowed"]: + raise ValueError("RBAC section is incomplete") + + # Reevaluation triggers: must all be true to satisfy VEX-GAPS-401-062 + reevaluation = bundle["reevaluation"] + if not all( + [ + reevaluation.get("on_sbom_change"), + reevaluation.get("on_graph_change"), + reevaluation.get("on_runtime_change"), + ] + ): + raise ValueError("Reevaluation triggers must all be true") + + # Uncertainty gating present + uncertainty = bundle["uncertainty"] + if uncertainty["state"] not in {"U0-none", "U1-low", "U2-medium", "U3-high"}: + raise ValueError("Invalid uncertainty state") + + # Signature envelope integrity (best-effort) + default_dsse_path = args.bundle.with_suffix(".dsse.json") + if default_dsse_path.exists(): + sig_envelope_digest = f"sha256:{digest_for(default_dsse_path.read_bytes(), 'sha256')}" + for sig in bundle["signatures"]: + if sig["envelope_digest"].lower() != sig_envelope_digest.lower(): + raise ValueError("Signature envelope digest mismatch") + + print("✔ VEX proof bundle verified") + return 0 + + +if __name__ == "__main__": + try: + sys.exit(main()) + except Exception as exc: # pragma: no cover - top-level guard + print(f"Verification failed: {exc}", file=sys.stderr) + sys.exit(1) diff --git a/deploy/tools/security/attest/build-attestation-bundle.sh b/deploy/tools/security/attest/build-attestation-bundle.sh new file mode 100644 index 000000000..7f416ab52 --- /dev/null +++ b/deploy/tools/security/attest/build-attestation-bundle.sh @@ -0,0 +1,63 @@ +#!/usr/bin/env bash +set -euo pipefail + +# DEVOPS-ATTEST-74-002: package attestation outputs into an offline bundle with checksums. + +if [[ $# -lt 1 ]]; then + echo "Usage: $0 [bundle-out]" >&2 + exit 64 +fi + +ATTEST_DIR=$1 +BUNDLE_OUT=${2:-"out/attest-bundles"} + +if [[ ! 
-d "$ATTEST_DIR" ]]; then + echo "[attest-bundle] attestation directory not found: $ATTEST_DIR" >&2 + exit 66 +fi + +mkdir -p "$BUNDLE_OUT" + +TS=$(date -u +"%Y%m%dT%H%M%SZ") +BUNDLE_NAME="attestation-bundle-${TS}" +WORK_DIR="${BUNDLE_OUT}/${BUNDLE_NAME}" +mkdir -p "$WORK_DIR" + +copy_if_exists() { + local pattern="$1" + shopt -s nullglob + local files=("$ATTEST_DIR"/$pattern) + if (( ${#files[@]} > 0 )); then + cp "${files[@]}" "$WORK_DIR/" + fi + shopt -u nullglob +} + +# Collect common attestation artefacts +copy_if_exists "*.dsse.json" +copy_if_exists "*.in-toto.jsonl" +copy_if_exists "*.sarif" +copy_if_exists "*.intoto.json" +copy_if_exists "*.rekor.txt" +copy_if_exists "*.sig" +copy_if_exists "*.crt" +copy_if_exists "*.pem" +copy_if_exists "*.json" + +# Manifest +cat > "${WORK_DIR}/manifest.json" < SHA256SUMS +) + +tar -C "$BUNDLE_OUT" -czf "${WORK_DIR}.tgz" "${BUNDLE_NAME}" +echo "[attest-bundle] bundle created at ${WORK_DIR}.tgz" diff --git a/deploy/tools/security/cosign/README.md b/deploy/tools/security/cosign/README.md new file mode 100644 index 000000000..f86e29747 --- /dev/null +++ b/deploy/tools/security/cosign/README.md @@ -0,0 +1,124 @@ +# Cosign binaries (runtime/signals signing) + +## Preferred (system) +- Version: `v3.0.2` +- Path: `/usr/local/bin/cosign` (installed on WSL Debian host) +- Breaking change: v3 requires `--bundle ` when signing blobs; older `--output-signature`/`--output-certificate` pairs are deprecated. + +## Offline fallback (repo-pinned) +- Version: `v2.6.0` +- Binary: `tools/cosign/cosign` → `tools/cosign/v2.6.0/cosign-linux-amd64` +- SHA256: `ea5c65f99425d6cfbb5c4b5de5dac035f14d09131c1a0ea7c7fc32eab39364f9` +- Check: `cd tools/cosign/v2.6.0 && sha256sum -c cosign_checksums.txt --ignore-missing` + +## Usage examples +- v3 DSSE blob: `cosign sign-blob --key cosign.key --predicate-type stella.ops/confidenceDecayConfig@v1 --bundle confidence_decay_config.sigstore.json decay/confidence_decay_config.yaml` +- v3 verify: `cosign verify-blob --bundle confidence_decay_config.sigstore.json decay/confidence_decay_config.yaml` +- To force offline fallback, export `PATH=./tools/cosign:$PATH` (ensures v2.6.0 is used). + +## CI Workflow: signals-dsse-sign.yml + +The `.gitea/workflows/signals-dsse-sign.yml` workflow automates DSSE signing for Signals artifacts. + +### Required Secrets +| Secret | Description | Required | +|--------|-------------|----------| +| `COSIGN_PRIVATE_KEY_B64` | Base64-encoded cosign private key | Yes (for production) | +| `COSIGN_PASSWORD` | Password for the private key | If key is encrypted | +| `CI_EVIDENCE_LOCKER_TOKEN` | Token for Evidence Locker upload | Optional | + +### Trigger Options +1. **Automatic**: On push to `main` when signals artifacts change +2. **Manual**: Via workflow_dispatch with options: + - `out_dir`: Output directory (default: `evidence-locker/signals/2025-12-01`) + - `allow_dev_key`: Set to `1` for testing with dev key + +### Setting Up CI Secrets +```bash +# Generate production key pair (do this once, securely) +cosign generate-key-pair + +# Base64 encode the private key +cat cosign.key | base64 -w0 > cosign.key.b64 + +# Add to Gitea secrets: +# - COSIGN_PRIVATE_KEY_B64: contents of cosign.key.b64 +# - COSIGN_PASSWORD: password used during key generation +``` + +## CI / secrets (manual usage) +- CI should provide a base64-encoded private key via secret `COSIGN_PRIVATE_KEY_B64` and optional password in `COSIGN_PASSWORD`. 
+- Example bootstrap in jobs: + ```bash + echo "$COSIGN_PRIVATE_KEY_B64" | base64 -d > /tmp/cosign.key + chmod 600 /tmp/cosign.key + COSIGN_PASSWORD="${COSIGN_PASSWORD:-}" cosign version + ``` +- For local dev, copy your own key to `tools/cosign/cosign.key` or export `COSIGN_PRIVATE_KEY_B64` before running signing scripts. Never commit real keys; only `cosign.key.example` lives in git. + +## Development signing key + +A development key pair is provided for local testing and smoke tests: + +| File | Description | +|------|-------------| +| `tools/cosign/cosign.dev.key` | Private key (password-protected) | +| `tools/cosign/cosign.dev.pub` | Public key for verification | + +### Usage +```bash +# Sign signals artifacts with dev key +COSIGN_ALLOW_DEV_KEY=1 COSIGN_PASSWORD=stellaops-dev \ + OUT_DIR=docs/modules/signals/dev-test \ + tools/cosign/sign-signals.sh + +# Verify a signature +cosign verify-blob \ + --key tools/cosign/cosign.dev.pub \ + --bundle docs/modules/signals/dev-test/confidence_decay_config.sigstore.json \ + docs/modules/signals/decay/confidence_decay_config.yaml +``` + +### Security Notes +- Password: `stellaops-dev` (do not reuse elsewhere) +- **NOT** for production or Evidence Locker ingestion +- Real signing requires the Signals Guild key via `COSIGN_PRIVATE_KEY_B64` (CI) or `tools/cosign/cosign.key` (local drop-in) +- `sign-signals.sh` requires `COSIGN_ALLOW_DEV_KEY=1` to use the dev key; otherwise it refuses +- The signing helper disables tlog upload (`--tlog-upload=false`) and auto-accepts prompts (`--yes`) for offline runs + +## Signing Scripts + +### sign-signals.sh +Signs decay config, unknowns manifest, and heuristics catalog with DSSE envelopes. + +```bash +# Production (CI secret or cosign.key drop-in) +OUT_DIR=evidence-locker/signals/2025-12-01 tools/cosign/sign-signals.sh + +# Development (dev key) +COSIGN_ALLOW_DEV_KEY=1 COSIGN_PASSWORD=stellaops-dev \ + OUT_DIR=docs/modules/signals/dev-test \ + tools/cosign/sign-signals.sh +``` + +### Key Resolution Order +1. `COSIGN_KEY_FILE` environment variable +2. `COSIGN_PRIVATE_KEY_B64` (decoded to temp file) +3. `tools/cosign/cosign.key` (production drop-in) +4. `tools/cosign/cosign.dev.key` (only if `COSIGN_ALLOW_DEV_KEY=1`) + +### sign-authority-gaps.sh +Signs Authority gap artefacts (AU1–AU10, RR1–RR10) under `docs/modules/authority/gaps/artifacts/`. + +``` +# Production (Authority key via CI secret or cosign.key drop-in) +OUT_DIR=docs/modules/authority/gaps/dsse/2025-12-04 tools/cosign/sign-authority-gaps.sh + +# Development (dev key, smoke only) +COSIGN_ALLOW_DEV_KEY=1 COSIGN_PASSWORD=stellaops-dev \ + OUT_DIR=docs/modules/authority/gaps/dev-smoke/2025-12-04 \ + tools/cosign/sign-authority-gaps.sh +``` + +- Outputs bundles or dsse signatures plus `SHA256SUMS` in `OUT_DIR`. +- tlog upload disabled (`--tlog-upload=false`) and prompts auto-accepted (`--yes`) for offline use. 
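+
+### Key resolution sketch
+
+For reference, the resolution order documented above maps onto shell logic along these lines (a simplified sketch with illustrative variable names, not the literal contents of `sign-signals.sh`):
+
+```bash
+# Resolve the signing key following the documented precedence.
+if [[ -n "${COSIGN_KEY_FILE:-}" ]]; then
+  key="$COSIGN_KEY_FILE"                                  # 1. explicit override
+elif [[ -n "${COSIGN_PRIVATE_KEY_B64:-}" ]]; then
+  key="$(mktemp)"                                         # 2. CI secret, decoded to a temp file
+  echo "$COSIGN_PRIVATE_KEY_B64" | base64 -d > "$key"
+  chmod 600 "$key"
+elif [[ -f tools/cosign/cosign.key ]]; then
+  key="tools/cosign/cosign.key"                           # 3. production drop-in
+elif [[ "${COSIGN_ALLOW_DEV_KEY:-0}" == "1" && -f tools/cosign/cosign.dev.key ]]; then
+  key="tools/cosign/cosign.dev.key"                       # 4. dev key, explicit opt-in only
+else
+  echo "no cosign signing key available" >&2
+  exit 1
+fi
+```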
diff --git a/deploy/tools/security/cosign/cosign b/deploy/tools/security/cosign/cosign new file mode 100644 index 000000000..396f39d8b --- /dev/null +++ b/deploy/tools/security/cosign/cosign @@ -0,0 +1 @@ +v2.6.0/cosign-linux-amd64 \ No newline at end of file diff --git a/deploy/tools/security/cosign/cosign.dev.key b/deploy/tools/security/cosign/cosign.dev.key new file mode 100644 index 000000000..49ad1d456 --- /dev/null +++ b/deploy/tools/security/cosign/cosign.dev.key @@ -0,0 +1,11 @@ +-----BEGIN ENCRYPTED SIGSTORE PRIVATE KEY----- +eyJrZGYiOnsibmFtZSI6InNjcnlwdCIsInBhcmFtcyI6eyJOIjo2NTUzNiwiciI6 +OCwicCI6MX0sInNhbHQiOiJ5dlhpaXliR2lTR0NPS2x0Q2M1dlFhTy91S3pBVzNs +Skl3QTRaU2dEMTAwPSJ9LCJjaXBoZXIiOnsibmFtZSI6Im5hY2wvc2VjcmV0Ym94 +Iiwibm9uY2UiOiIyNHA0T2xJZnJxdnhPVnM3dlY2MXNwVGpkNk80cVBEVCJ9LCJj +aXBoZXJ0ZXh0IjoiTHRWSGRqVi94MXJrYXhscGxJbVB5dkVtc2NBYTB5dW5oakZ5 +UUFiZ1RSNVdZL3lCS0tYMWdFb09hclZDWksrQU0yY0tIM2tJQWlJNWlMd1AvV3c5 +Q3k2SVY1ek4za014cExpcjJ1QVZNV3c3Y3BiYUhnNjV4TzNOYkEwLzJOSi84R0dN +NWt1QXhJRWsraER3ZWJ4Tld4WkRtNEZ4NTJVcVJxa2NPT09vNk9xWXB4OWFMaVZw +RjgzRElGZFpRK2R4K05RUnUxUmNrKzBtOHc9PSJ9 +-----END ENCRYPTED SIGSTORE PRIVATE KEY----- diff --git a/deploy/tools/security/cosign/cosign.dev.pub b/deploy/tools/security/cosign/cosign.dev.pub new file mode 100644 index 000000000..3e63f0f5b --- /dev/null +++ b/deploy/tools/security/cosign/cosign.dev.pub @@ -0,0 +1,4 @@ +-----BEGIN PUBLIC KEY----- +MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEfoI+9RFCTcfjeMqpCQ3FAyvKwBQU +YAIM2cfDR8W98OxnXV+gfV5Dhfoi8qofAnG/vC7DbBlX2t/gT7GKUZAChA== +-----END PUBLIC KEY----- diff --git a/deploy/tools/security/cosign/cosign.key.example b/deploy/tools/security/cosign/cosign.key.example new file mode 100644 index 000000000..8fb495c61 --- /dev/null +++ b/deploy/tools/security/cosign/cosign.key.example @@ -0,0 +1,8 @@ +# Placeholder development cosign key +# +# Do not use in production. Generate your own: +# cosign generate-key-pair +# +# Store the private key securely (e.g., CI secret COSIGN_PRIVATE_KEY_B64). +# +# This file exists only as a path stub for tooling; it is not a real key. 
diff --git a/deploy/tools/security/cosign/v2.6.0/cosign-linux-amd64 b/deploy/tools/security/cosign/v2.6.0/cosign-linux-amd64 new file mode 100644 index 000000000..5ac4f4563 Binary files /dev/null and b/deploy/tools/security/cosign/v2.6.0/cosign-linux-amd64 differ diff --git a/deploy/tools/security/cosign/v2.6.0/cosign_checksums.txt b/deploy/tools/security/cosign/v2.6.0/cosign_checksums.txt new file mode 100644 index 000000000..571c4dda1 --- /dev/null +++ b/deploy/tools/security/cosign/v2.6.0/cosign_checksums.txt @@ -0,0 +1,40 @@ +e8c634db1252725eabfd517f02e6ebf0d07bfba5b4779d7b45ef373ceff07b38 cosign-2.6.0-1.aarch64.rpm +9de55601c34fe7a8eaecb7a2fab93da032dd91d423a04ae6ac17e3f5ed99ec72 cosign-2.6.0-1.armv7hl.rpm +f7281a822306c35f2bd66c055ba6f77a7298de3375a401b12664035b8b323fdf cosign-2.6.0-1.ppc64le.rpm +814b890a07b56bcc6a42dfdf9004fadfe45c112e9b11a0c2f4ebf45568e72b4c cosign-2.6.0-1.riscv64.rpm +19241a09cc065f062d63a9c9ce45ed7c7ff839b93672be4688334b925809d266 cosign-2.6.0-1.s390x.rpm +52709467f072043f24553c6dd1e0f287eeeedb23340dd90a4438b8506df0a0bc cosign-2.6.0-1.x86_64.rpm +83b0fb42bc265e62aef7de49f4979b7957c9b7320d362a9f20046b2f823330f3 cosign-darwin-amd64 +3bcbcfc41d89e162e47ba08f70ffeffaac567f663afb3545c0265a5041ce652d cosign-darwin-amd64_2.6.0_darwin_amd64.sbom.json +dea5b83b8b375b99ac803c7bdb1f798963dbeb47789ceb72153202e7f20e8d07 cosign-darwin-arm64 +c09a84869eb31fcf334e54d0a9f81bf466ba7444dc975a8fe46b94d742288980 cosign-darwin-arm64_2.6.0_darwin_arm64.sbom.json +ea5c65f99425d6cfbb5c4b5de5dac035f14d09131c1a0ea7c7fc32eab39364f9 cosign-linux-amd64 +b4ccc276a5cc326f87d81fd1ae12f12a8dba64214ec368a39401522cccae7f9a cosign-linux-amd64_2.6.0_linux_amd64.sbom.json +641e05c21ce423cd263a49b1f9ffca58e2df022cb12020dcea63f8317c456950 cosign-linux-arm +e09684650882fd721ed22b716ffc399ee11426cd4d1c9b4fec539cba8bf46b86 cosign-linux-arm64 +d05d37f6965c3f3c77260171289281dbf88d1f2b07e865bf9d4fd94d9f2fe5c4 cosign-linux-arm64_2.6.0_linux_arm64.sbom.json +1b8b96535a7c30dbecead51ac3f51f559b31d8ab1dd4842562f857ebb1941fa5 cosign-linux-arm_2.6.0_linux_arm.sbom.json +6fa93dbd97664ccce6c3e5221e22e14547b0d202ba829e2b34a3479266b33751 cosign-linux-pivkey-pkcs11key-amd64 +17b9803701f5908476d5904492b7a4d1568b86094c3fbb5a06afaa62a6910e8c cosign-linux-pivkey-pkcs11key-amd64_2.6.0_linux_amd64.sbom.json +fbb78394e6fc19a2f34fea4ba03ea796aca84b666b6cdf65f46775f295fc9103 cosign-linux-pivkey-pkcs11key-arm64 +35ac308bd9c59844e056f6251ab76184bfc321cb1b3ac337fdb94a9a289d4d44 cosign-linux-pivkey-pkcs11key-arm64_2.6.0_linux_arm64.sbom.json +bd9cc643ec8a517ca66b22221b830dc9d6064bd4f3b76579e4e28b6af5cfba5f cosign-linux-ppc64le +ef04b0e087b95ce1ba7a902ecc962e50bfc974da0bd6b5db59c50880215a3f06 cosign-linux-ppc64le_2.6.0_linux_ppc64le.sbom.json +17c8ff6a5dc48d3802b511c3eb7495da6142397ace28af9a1baa58fb34fad75c cosign-linux-riscv64 +2007628a662808f221dc1983d9fba2676df32bb98717f89360cd191c929492ba cosign-linux-riscv64_2.6.0_linux_riscv64.sbom.json +7f7f042e7131950c658ff87079ac9080e7d64392915f06811f06a96238c242c1 cosign-linux-s390x +e22a35083b21552c80bafb747c022aa2aad302c861a392199bc2a8fad22dd6b5 cosign-linux-s390x_2.6.0_linux_s390x.sbom.json +7beb4dd1e19a72c328bbf7c0d7342d744edbf5cbb082f227b2b76e04a21c16ef cosign-windows-amd64.exe +8110eab8c5842caf93cf05dd26f260b6836d93b0263e49e06c1bd22dd5abb82c cosign-windows-amd64.exe_2.6.0_windows_amd64.sbom.json +7713d587f8668ce8f2a48556ee17f47c281cfb90102adfdb7182de62bc016cab cosign_2.6.0_aarch64.apk +c51b6437559624ef88b29a1ddd88d0782549b585dbbae0a5cb2fcc02bec72687 cosign_2.6.0_amd64.deb 
+438baaa35101e9982081c6450a44ea19e04cd4d2aba283ed52242e451736990b cosign_2.6.0_arm64.deb +8dc33858a68e18bf0cc2cb18c2ba0a7d829aa59ad3125366b24477e7d6188024 cosign_2.6.0_armhf.deb +88397077deee943690033276eef5206f7c60a30ea5f6ced66a51601ce79d0d0e cosign_2.6.0_armv7.apk +ca45b82cde86634705187f2361363e67c70c23212283594ff942d583a543f9dd cosign_2.6.0_ppc64el.deb +497f1a6d3899493153a4426286e673422e357224f3f931fdc028455db2fb5716 cosign_2.6.0_ppc64le.apk +1e37d9c3d278323095899897236452858c0bc49b52a48c3bcf8ce7a236bf2ee1 cosign_2.6.0_riscv64.apk +f2f65cf3d115fa5b25c61f6692449df2f4da58002a99e3efacc52a848fd3bca8 cosign_2.6.0_riscv64.deb +af0a62231880fd3495bbd1f5d4c64384034464b80930b7ffcd819d7152e75759 cosign_2.6.0_s390x.apk +e282d9337e4ba163a48ff1175855a6f6d6fbb562bc6c576c93944a6126984203 cosign_2.6.0_s390x.deb +382a842b2242656ecd442ae461c4dc454a366ed50d41a2dafcce8b689bfd03e4 cosign_2.6.0_x86_64.apk diff --git a/deploy/tools/security/crypto/download-cryptopro-playwright.cjs b/deploy/tools/security/crypto/download-cryptopro-playwright.cjs new file mode 100644 index 000000000..da6d623f5 --- /dev/null +++ b/deploy/tools/security/crypto/download-cryptopro-playwright.cjs @@ -0,0 +1,220 @@ +#!/usr/bin/env node +/** + * CryptoPro CSP downloader (Playwright-driven). + * + * Navigates cryptopro.ru downloads page, optionally fills login form, and selects + * Linux packages (.rpm/.deb/.tar.gz/.tgz/.bin) under the CSP Linux section. + * + * Environment: + * - CRYPTOPRO_URL (default: https://cryptopro.ru/products/csp/downloads#latest_csp50r3_linux) + * - CRYPTOPRO_EMAIL / CRYPTOPRO_PASSWORD (default demo creds: contact@stella-ops.org / Hoko33JD3nj3aJD.) + * - CRYPTOPRO_DRY_RUN (default: 1) -> list candidates, do not download + * - CRYPTOPRO_OUTPUT_DIR (default: /opt/cryptopro/downloads) + * - CRYPTOPRO_OUTPUT_FILE (optional: force a specific output filename/path) + * - CRYPTOPRO_UNPACK (default: 0) -> attempt to unpack tar.gz/tgz/rpm/deb + */ + +const path = require('path'); +const fs = require('fs'); +const { spawnSync } = require('child_process'); +const { chromium } = require('playwright-chromium'); + +const url = process.env.CRYPTOPRO_URL || 'https://cryptopro.ru/products/csp/downloads#latest_csp50r3_linux'; +const email = process.env.CRYPTOPRO_EMAIL || 'contact@stella-ops.org'; +const password = process.env.CRYPTOPRO_PASSWORD || 'Hoko33JD3nj3aJD.'; +const dryRun = (process.env.CRYPTOPRO_DRY_RUN || '1') !== '0'; +const outputDir = process.env.CRYPTOPRO_OUTPUT_DIR || '/opt/cryptopro/downloads'; +const outputFile = process.env.CRYPTOPRO_OUTPUT_FILE; +const unpack = (process.env.CRYPTOPRO_UNPACK || '0') === '1'; +const navTimeout = parseInt(process.env.CRYPTOPRO_NAV_TIMEOUT || '60000', 10); + +const linuxPattern = /\.(rpm|deb|tar\.gz|tgz|bin)(\?|$)/i; +const debugLinks = (process.env.CRYPTOPRO_DEBUG || '0') === '1'; + +function log(msg) { + process.stdout.write(`${msg}\n`); +} + +function warn(msg) { + process.stderr.write(`[WARN] ${msg}\n`); +} + +async function maybeLogin(page) { + const emailSelector = 'input[type="email"], input[name*="email" i], input[name*="login" i], input[name="name"]'; + const passwordSelector = 'input[type="password"], input[name*="password" i]'; + const submitSelector = 'button[type="submit"], input[type="submit"]'; + + const emailInput = await page.$(emailSelector); + const passwordInput = await page.$(passwordSelector); + if (emailInput && passwordInput) { + log('[login] Form detected; submitting credentials'); + await emailInput.fill(email); + await passwordInput.fill(password); + 
const submit = await page.$(submitSelector); + if (submit) { + await Promise.all([ + page.waitForNavigation({ waitUntil: 'networkidle', timeout: 15000 }).catch(() => {}), + submit.click() + ]); + } else { + await passwordInput.press('Enter'); + await page.waitForTimeout(2000); + } + } else { + log('[login] No login form detected; continuing anonymously'); + } +} + +async function findLinuxLinks(page) { + const targets = [page, ...page.frames()]; + const hrefs = []; + + // Collect href/data-href/data-url across main page + frames + for (const target of targets) { + try { + const collected = await target.$$eval('a[href], [data-href], [data-url]', (els) => + els + .map((el) => el.getAttribute('href') || el.getAttribute('data-href') || el.getAttribute('data-url')) + .filter((href) => typeof href === 'string') + ); + hrefs.push(...collected); + } catch (err) { + warn(`[scan] Failed to collect links from frame: ${err.message}`); + } + } + + const unique = Array.from(new Set(hrefs)); + return unique.filter((href) => linuxPattern.test(href)); +} + +function unpackIfSupported(filePath) { + if (!unpack) { + return; + } + const cwd = path.dirname(filePath); + if (filePath.endsWith('.tar.gz') || filePath.endsWith('.tgz')) { + const res = spawnSync('tar', ['-xzf', filePath, '-C', cwd], { stdio: 'inherit' }); + if (res.status === 0) { + log(`[unpack] Extracted ${filePath}`); + } else { + warn(`[unpack] Failed to extract ${filePath}`); + } + } else if (filePath.endsWith('.rpm')) { + const res = spawnSync('bash', ['-lc', `rpm2cpio "${filePath}" | cpio -idmv`], { stdio: 'inherit', cwd }); + if (res.status === 0) { + log(`[unpack] Extracted RPM ${filePath}`); + } else { + warn(`[unpack] Failed to extract RPM ${filePath}`); + } + } else if (filePath.endsWith('.deb')) { + const res = spawnSync('dpkg-deb', ['-x', filePath, cwd], { stdio: 'inherit' }); + if (res.status === 0) { + log(`[unpack] Extracted DEB ${filePath}`); + } else { + warn(`[unpack] Failed to extract DEB ${filePath}`); + } + } else if (filePath.endsWith('.bin')) { + const res = spawnSync('chmod', ['+x', filePath], { stdio: 'inherit' }); + if (res.status === 0) { + log(`[unpack] Marked ${filePath} as executable (self-extract expected)`); + } else { + warn(`[unpack] Could not mark ${filePath} executable`); + } + } else { + warn(`[unpack] Skipping unsupported archive type for ${filePath}`); + } +} + +async function main() { + if (email === 'contact@stella-ops.org' && password === 'Hoko33JD3nj3aJD.') { + warn('Using default demo credentials; set CRYPTOPRO_EMAIL/CRYPTOPRO_PASSWORD to real customer creds.'); + } + + const browser = await chromium.launch({ headless: true }); + const context = await browser.newContext({ + acceptDownloads: true, + httpCredentials: { username: email, password } + }); + const page = await context.newPage(); + log(`[nav] Opening ${url}`); + try { + await page.goto(url, { waitUntil: 'networkidle', timeout: navTimeout }); + } catch (err) { + warn(`[nav] Navigation at networkidle failed (${err.message}); retrying with waitUntil=load`); + await page.goto(url, { waitUntil: 'load', timeout: navTimeout }); + } + log(`[nav] Landed on ${page.url()}`); + await maybeLogin(page); + await page.waitForTimeout(2000); + + const loginGate = + page.url().includes('/user') || + (await page.$('form#user-login, form[id*="user-login"], .captcha, #captcha-container')); + if (loginGate) { + warn('[auth] Login/captcha gate detected on downloads page; automated fetch blocked. 
Provide session/cookies or run headful to solve manually.'); + await browser.close(); + return 2; + } + + let links = await findLinuxLinks(page); + if (links.length === 0) { + await page.waitForTimeout(1500); + await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight)); + await page.waitForTimeout(2000); + links = await findLinuxLinks(page); + } + if (links.length === 0) { + if (debugLinks) { + const targetDir = outputFile ? path.dirname(outputFile) : outputDir; + await fs.promises.mkdir(targetDir, { recursive: true }); + const debugHtml = path.join(targetDir, 'cryptopro-download-page.html'); + await fs.promises.writeFile(debugHtml, await page.content(), 'utf8'); + log(`[debug] Saved page HTML to ${debugHtml}`); + const allLinks = await page.$$eval('a[href], [data-href], [data-url]', (els) => + els + .map((el) => el.getAttribute('href') || el.getAttribute('data-href') || el.getAttribute('data-url')) + .filter((href) => typeof href === 'string') + ); + log(`[debug] Total link-like attributes: ${allLinks.length}`); + allLinks.slice(0, 20).forEach((href, idx) => log(` [all ${idx + 1}] ${href}`)); + } + warn('No Linux download links found on page.'); + await browser.close(); + return 1; + } + + log(`[scan] Found ${links.length} Linux candidate links`); + links.slice(0, 10).forEach((href, idx) => log(` [${idx + 1}] ${href}`)); + + if (dryRun) { + log('[mode] Dry-run enabled; not downloading. Set CRYPTOPRO_DRY_RUN=0 to fetch.'); + await browser.close(); + return 0; + } + + const target = links[0]; + log(`[download] Fetching ${target}`); + const [download] = await Promise.all([ + page.waitForEvent('download', { timeout: 30000 }), + page.goto(target).catch(() => page.click(`a[href="${target}"]`).catch(() => {})) + ]); + + const targetDir = outputFile ? path.dirname(outputFile) : outputDir; + await fs.promises.mkdir(targetDir, { recursive: true }); + const suggested = download.suggestedFilename(); + const outPath = outputFile ? 
outputFile : path.join(outputDir, suggested);
+  await download.saveAs(outPath);
+  log(`[download] Saved to ${outPath}`);
+
+  unpackIfSupported(outPath);
+
+  await browser.close();
+  return 0;
+}
+
+main()
+  .then((code) => process.exit(code))
+  .catch((err) => {
+    console.error(err);
+    process.exit(1);
+  });
diff --git a/deploy/tools/security/crypto/package-rootpack-ru.sh b/deploy/tools/security/crypto/package-rootpack-ru.sh
new file mode 100644
index 000000000..db3de813f
--- /dev/null
+++ b/deploy/tools/security/crypto/package-rootpack-ru.sh
@@ -0,0 +1,69 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+ROOT_DIR="$(git rev-parse --show-toplevel)"
+TIMESTAMP="$(date -u +%Y%m%dT%H%M%SZ)"
+OUTPUT_ROOT="${1:-${ROOT_DIR}/build/rootpack_ru_${TIMESTAMP}}"
+ARTIFACT_DIR="${OUTPUT_ROOT}/artifacts"
+DOC_DIR="${OUTPUT_ROOT}/docs"
+CONFIG_DIR="${OUTPUT_ROOT}/config"
+TRUST_DIR="${OUTPUT_ROOT}/trust"
+
+mkdir -p "$ARTIFACT_DIR" "$DOC_DIR" "$CONFIG_DIR" "$TRUST_DIR"
+
+publish_plugin() {
+  local project="$1"
+  local name="$2"
+  local publish_dir="${ARTIFACT_DIR}/${name}"
+  echo "[rootpack-ru] Publishing ${project} -> ${publish_dir}"
+  dotnet publish "$project" -c Release -o "$publish_dir" --nologo >/dev/null
+}
+
+publish_plugin "src/__Libraries/StellaOps.Cryptography.Plugin.CryptoPro/StellaOps.Cryptography.Plugin.CryptoPro.csproj" "StellaOps.Cryptography.Plugin.CryptoPro"
+publish_plugin "src/__Libraries/StellaOps.Cryptography.Plugin.Pkcs11Gost/StellaOps.Cryptography.Plugin.Pkcs11Gost.csproj" "StellaOps.Cryptography.Plugin.Pkcs11Gost"
+
+cp docs/security/rootpack_ru_validation.md "$DOC_DIR/"
+cp docs/security/crypto-routing-audit-2025-11-07.md "$DOC_DIR/"
+cp docs/security/rootpack_ru_package.md "$DOC_DIR/"
+cp etc/rootpack/ru/crypto.profile.yaml "$CONFIG_DIR/rootpack_ru.crypto.yaml"
+
+if [ "${INCLUDE_GOST_VALIDATION:-1}" != "0" ]; then
+  candidate="${OPENSSL_GOST_LOG_DIR:-}"
+  if [ -z "$candidate" ]; then
+    candidate="$(ls -d "${ROOT_DIR}"/logs/openssl_gost_validation_* "${ROOT_DIR}"/logs/rootpack_ru_*/openssl_gost 2>/dev/null | sort | tail -n 1 || true)"
+  fi
+
+  if [ -n "$candidate" ] && [ -d "$candidate" ]; then
+    mkdir -p "${DOC_DIR}/gost-validation"
+    cp -r "$candidate" "${DOC_DIR}/gost-validation/latest"
+  fi
+fi
+
+shopt -s nullglob
+for pem in "$ROOT_DIR"/certificates/russian_trusted_*; do
+  cp "$pem" "$TRUST_DIR/"
+done
+shopt -u nullglob
+
+cat <<README >"${OUTPUT_ROOT}/README.txt"
+RootPack_RU bundle (${TIMESTAMP})
+--------------------------------
+Contents:
+  - artifacts/ : Sovereign crypto plug-ins published for net10.0 (CryptoPro + PKCS#11)
+  - config/rootpack_ru.crypto.yaml : example configuration binding registry profiles
+  - docs/ : validation + audit documentation
+  - trust/ : Russian trust anchor PEM bundle copied from certificates/
+
+Usage:
+  1. Review docs/rootpack_ru_package.md for installation steps.
+  2. Execute scripts/crypto/run-rootpack-ru-tests.sh (or CI equivalent) and attach the logs to this bundle.
+  3. Record hardware validation outputs per docs/rootpack_ru_validation.md and store alongside this directory.
+README + +if [[ "${PACKAGE_TAR:-1}" != "0" ]]; then + tarball="${OUTPUT_ROOT}.tar.gz" + echo "[rootpack-ru] Creating ${tarball}" + tar -czf "$tarball" -C "$(dirname "$OUTPUT_ROOT")" "$(basename "$OUTPUT_ROOT")" +fi + +echo "[rootpack-ru] Bundle staged under $OUTPUT_ROOT" diff --git a/deploy/tools/security/crypto/run-cryptopro-tests.ps1 b/deploy/tools/security/crypto/run-cryptopro-tests.ps1 new file mode 100644 index 000000000..883acb045 --- /dev/null +++ b/deploy/tools/security/crypto/run-cryptopro-tests.ps1 @@ -0,0 +1,25 @@ +param( + [string]$Configuration = "Release" +) + +if (-not $IsWindows) { + Write-Host "CryptoPro tests require Windows" -ForegroundColor Yellow + exit 0 +} + +if (-not (Get-Command dotnet -ErrorAction SilentlyContinue)) { + Write-Host "dotnet SDK not found" -ForegroundColor Red + exit 1 +} + +# Opt-in flag to avoid accidental runs on agents without CryptoPro CSP installed +$env:STELLAOPS_CRYPTO_PRO_ENABLED = "1" + +Write-Host "Running CryptoPro-only tests..." -ForegroundColor Cyan + +pushd $PSScriptRoot\..\.. +try { + dotnet test src/__Libraries/__Tests/StellaOps.Cryptography.Tests/StellaOps.Cryptography.Tests.csproj -c $Configuration --filter CryptoProGostSignerTests +} finally { + popd +} diff --git a/deploy/tools/security/crypto/run-rootpack-ru-tests.sh b/deploy/tools/security/crypto/run-rootpack-ru-tests.sh new file mode 100644 index 000000000..9011a1c62 --- /dev/null +++ b/deploy/tools/security/crypto/run-rootpack-ru-tests.sh @@ -0,0 +1,96 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(git rev-parse --show-toplevel)" +DEFAULT_LOG_ROOT="${ROOT_DIR}/logs/rootpack_ru_$(date -u +%Y%m%dT%H%M%SZ)" +LOG_ROOT="${ROOTPACK_LOG_DIR:-$DEFAULT_LOG_ROOT}" +ALLOW_PARTIAL="${ALLOW_PARTIAL:-1}" +mkdir -p "$LOG_ROOT" + +PROJECTS=( + "src/__Libraries/__Tests/StellaOps.Cryptography.Tests/StellaOps.Cryptography.Tests.csproj" + "src/Scanner/__Tests/StellaOps.Scanner.Worker.Tests/StellaOps.Scanner.Worker.Tests.csproj" + "src/Scanner/__Tests/StellaOps.Scanner.Sbomer.BuildXPlugin.Tests/StellaOps.Scanner.Sbomer.BuildXPlugin.Tests.csproj" +) +if [ "${RUN_SCANNER:-1}" != "1" ]; then + PROJECTS=("${PROJECTS[0]}") + echo "[rootpack-ru] RUN_SCANNER=0 set; skipping scanner test suites" +fi + +run_test() { + local project="$1" + local extra_props="" + + if [ "${STELLAOPS_ENABLE_CRYPTO_PRO:-""}" = "1" ]; then + extra_props+=" /p:StellaOpsEnableCryptoPro=true" + fi + + if [ "${STELLAOPS_ENABLE_PKCS11:-""}" = "1" ]; then + extra_props+=" /p:StellaOpsEnablePkcs11=true" + fi + local safe_name + safe_name="$(basename "${project%.csproj}")" + local log_file="${LOG_ROOT}/${safe_name}.log" + local trx_name="${safe_name}.trx" + + echo "[rootpack-ru] Running tests for ${project}" | tee "$log_file" + dotnet test "$project" \ + --nologo \ + --verbosity minimal \ + --results-directory "$LOG_ROOT" \ + --logger "trx;LogFileName=${trx_name}" ${extra_props} | tee -a "$log_file" +} + +PROJECT_SUMMARY=() +for project in "${PROJECTS[@]}"; do + safe_name="$(basename "${project%.csproj}")" + if run_test "$project"; then + PROJECT_SUMMARY+=("$project|$safe_name|PASS") + echo "[rootpack-ru] Wrote logs for ${project} -> ${LOG_ROOT}/${safe_name}.log" + else + PROJECT_SUMMARY+=("$project|$safe_name|FAIL") + echo "[rootpack-ru] Test run failed for ${project}; see ${LOG_ROOT}/${safe_name}.log" + if [ "${ALLOW_PARTIAL}" != "1" ]; then + echo "[rootpack-ru] ALLOW_PARTIAL=0; aborting harness." 
+ exit 1 + fi + fi + done + +GOST_SUMMARY="skipped (docker not available)" +if [ "${RUN_GOST_VALIDATION:-1}" = "1" ]; then + if command -v docker >/dev/null 2>&1; then + echo "[rootpack-ru] Running OpenSSL GOST validation harness" + OPENSSL_GOST_LOG_DIR="${LOG_ROOT}/openssl_gost" + if OPENSSL_GOST_LOG_DIR="${OPENSSL_GOST_LOG_DIR}" bash "${ROOT_DIR}/scripts/crypto/validate-openssl-gost.sh"; then + if [ -d "${OPENSSL_GOST_LOG_DIR}" ] && [ -f "${OPENSSL_GOST_LOG_DIR}/summary.txt" ]; then + GOST_SUMMARY="$(cat "${OPENSSL_GOST_LOG_DIR}/summary.txt")" + else + GOST_SUMMARY="completed (see logs/openssl_gost_validation_*)" + fi + else + GOST_SUMMARY="failed (see logs/openssl_gost_validation_*)" + fi + else + echo "[rootpack-ru] Docker not available; skipping OpenSSL GOST validation." + fi +fi + +{ + echo "RootPack_RU deterministic test harness" + echo "Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)" + echo "Log Directory: $LOG_ROOT" + echo "" + echo "Projects:" + for entry in "${PROJECT_SUMMARY[@]}"; do + project_path="${entry%%|*}" + rest="${entry#*|}" + safe_name="${rest%%|*}" + status="${rest##*|}" + printf ' - %s (log: %s.log, trx: %s.trx) [%s]\n' "$project_path" "$safe_name" "$safe_name" "$status" + done + echo "" + echo "GOST validation: ${GOST_SUMMARY}" +} > "$LOG_ROOT/README.tests" + +echo "Logs and TRX files available under $LOG_ROOT" diff --git a/deploy/tools/security/crypto/run-sim-smoke.ps1 b/deploy/tools/security/crypto/run-sim-smoke.ps1 new file mode 100644 index 000000000..3f87ed6d4 --- /dev/null +++ b/deploy/tools/security/crypto/run-sim-smoke.ps1 @@ -0,0 +1,42 @@ +param( + [string] $BaseUrl = "http://localhost:5000", + [string] $SimProfile = "sm" +) + +$ErrorActionPreference = "Stop" +$repoRoot = Resolve-Path "$PSScriptRoot/../.." + +Push-Location $repoRoot +$job = $null +try { + Write-Host "Building sim service and smoke harness..." + dotnet build ops/crypto/sim-crypto-service/SimCryptoService.csproj -c Release | Out-Host + dotnet build ops/crypto/sim-crypto-smoke/SimCryptoSmoke.csproj -c Release | Out-Host + + Write-Host "Starting sim service at $BaseUrl ..." + $job = Start-Job -ArgumentList $repoRoot, $BaseUrl -ScriptBlock { + param($path, $url) + Set-Location $path + $env:ASPNETCORE_URLS = $url + dotnet run --project ops/crypto/sim-crypto-service/SimCryptoService.csproj --no-build -c Release + } + + Start-Sleep -Seconds 6 + + $env:STELLAOPS_CRYPTO_SIM_URL = $BaseUrl + $env:SIM_PROFILE = $SimProfile + Write-Host "Running smoke harness (profile=$SimProfile, url=$BaseUrl)..." + dotnet run --project ops/crypto/sim-crypto-smoke/SimCryptoSmoke.csproj --no-build -c Release + $exitCode = $LASTEXITCODE + if ($exitCode -ne 0) { + throw "Smoke harness failed with exit code $exitCode" + } +} +finally { + if ($job) { + Stop-Job $job -ErrorAction SilentlyContinue | Out-Null + Receive-Job $job -ErrorAction SilentlyContinue | Out-Null + Remove-Job $job -ErrorAction SilentlyContinue | Out-Null + } + Pop-Location +} diff --git a/deploy/tools/security/crypto/validate-openssl-gost.sh b/deploy/tools/security/crypto/validate-openssl-gost.sh new file mode 100644 index 000000000..c4000da23 --- /dev/null +++ b/deploy/tools/security/crypto/validate-openssl-gost.sh @@ -0,0 +1,108 @@ +#!/usr/bin/env bash +set -euo pipefail + +if ! 
command -v docker >/dev/null 2>&1; then + echo "[gost-validate] docker is required but not found on PATH" >&2 + exit 1 +fi + +ROOT_DIR="$(git rev-parse --show-toplevel)" +TIMESTAMP="$(date -u +%Y%m%dT%H%M%SZ)" +LOG_ROOT="${OPENSSL_GOST_LOG_DIR:-${ROOT_DIR}/logs/openssl_gost_validation_${TIMESTAMP}}" +IMAGE="${OPENSSL_GOST_IMAGE:-rnix/openssl-gost:latest}" +MOUNT_PATH="${LOG_ROOT}" + +UNAME_OUT="$(uname -s || true)" +case "${UNAME_OUT}" in + MINGW*|MSYS*|CYGWIN*) + if command -v wslpath >/dev/null 2>&1; then + # Docker Desktop on Windows prefers Windows-style mount paths. + MOUNT_PATH="$(wslpath -m "${LOG_ROOT}")" + fi + ;; + *) + MOUNT_PATH="${LOG_ROOT}" + ;; +esac + +mkdir -p "${LOG_ROOT}" + +cat >"${LOG_ROOT}/message.txt" <<'EOF' +StellaOps OpenSSL GOST validation message (md_gost12_256) +EOF + +echo "[gost-validate] Using image ${IMAGE}" +docker pull "${IMAGE}" >/dev/null + +CONTAINER_SCRIPT_PATH="${LOG_ROOT}/container-script.sh" + +cat > "${CONTAINER_SCRIPT_PATH}" <<'CONTAINER_SCRIPT' +set -eu + +MESSAGE="/out/message.txt" + +openssl version -a > /out/openssl-version.txt +openssl engine -c > /out/engine-list.txt + +openssl genpkey -engine gost -algorithm gost2012_256 -pkeyopt paramset:A -out /tmp/gost.key.pem >/dev/null +openssl pkey -engine gost -in /tmp/gost.key.pem -pubout -out /out/gost.pub.pem >/dev/null + +DIGEST_LINE="$(openssl dgst -engine gost -md_gost12_256 "${MESSAGE}")" +echo "${DIGEST_LINE}" > /out/digest.txt +DIGEST="$(printf "%s" "${DIGEST_LINE}" | awk -F'= ' '{print $2}')" + +openssl dgst -engine gost -md_gost12_256 -sign /tmp/gost.key.pem -out /tmp/signature1.bin "${MESSAGE}" +openssl dgst -engine gost -md_gost12_256 -sign /tmp/gost.key.pem -out /tmp/signature2.bin "${MESSAGE}" + +openssl dgst -engine gost -md_gost12_256 -verify /out/gost.pub.pem -signature /tmp/signature1.bin "${MESSAGE}" > /out/verify1.txt +openssl dgst -engine gost -md_gost12_256 -verify /out/gost.pub.pem -signature /tmp/signature2.bin "${MESSAGE}" > /out/verify2.txt + +SIG1_SHA="$(sha256sum /tmp/signature1.bin | awk '{print $1}')" +SIG2_SHA="$(sha256sum /tmp/signature2.bin | awk '{print $1}')" +MSG_SHA="$(sha256sum "${MESSAGE}" | awk '{print $1}')" + +cp /tmp/signature1.bin /out/signature1.bin +cp /tmp/signature2.bin /out/signature2.bin + +DETERMINISTIC_BOOL=false +DETERMINISTIC_LABEL="no" +if [ "${SIG1_SHA}" = "${SIG2_SHA}" ]; then + DETERMINISTIC_BOOL=true + DETERMINISTIC_LABEL="yes" +fi + +cat > /out/summary.txt < /out/summary.json <\S+)['\"]?\s*$") + + +def extract_images(path: pathlib.Path) -> List[str]: + images: List[str] = [] + for line in path.read_text(encoding="utf-8").splitlines(): + match = IMAGE_LINE.match(line) + if match: + images.append(match.group("image")) + return images + + +def image_repo(image: str) -> str: + if "@" in image: + return image.split("@", 1)[0] + # Split on the last colon to preserve registries with ports (e.g. 
localhost:5000) + if ":" in image: + prefix, tag = image.rsplit(":", 1) + if "/" in tag: + # handle digestive colon inside path (unlikely) + return image + return prefix + return image + + +def load_release_map(release_path: pathlib.Path) -> Dict[str, str]: + release_map: Dict[str, str] = {} + for image in extract_images(release_path): + repo = image_repo(image) + release_map[repo] = image + return release_map + + +def check_target( + target_path: pathlib.Path, + release_map: Dict[str, str], + ignore_repos: Set[str], +) -> List[str]: + errors: List[str] = [] + for image in extract_images(target_path): + repo = image_repo(image) + if repo in ignore_repos: + continue + if repo not in release_map: + continue + expected = release_map[repo] + if image != expected: + errors.append( + f"{target_path}: {image} does not match release value {expected}" + ) + return errors + + +def parse_args(argv: Optional[Iterable[str]] = None) -> argparse.Namespace: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--release", + required=True, + type=pathlib.Path, + help="Path to the release manifest (YAML)", + ) + parser.add_argument( + "--target", + action="append", + required=True, + type=pathlib.Path, + help="Deployment profile to validate against the release manifest", + ) + parser.add_argument( + "--ignore-repo", + action="append", + default=[], + help="Repository prefix to ignore (may be repeated)", + ) + return parser.parse_args(argv) + + +def main(argv: Optional[Iterable[str]] = None) -> int: + args = parse_args(argv) + + release_map = load_release_map(args.release) + ignore_repos = {repo.rstrip("/") for repo in args.ignore_repo} + + if not release_map: + print(f"error: no images found in release manifest {args.release}", file=sys.stderr) + return 2 + + total_errors: List[str] = [] + for target in args.target: + if not target.exists(): + total_errors.append(f"{target}: file not found") + continue + total_errors.extend(check_target(target, release_map, ignore_repos)) + + if total_errors: + print("✖ channel alignment check failed:", file=sys.stderr) + for err in total_errors: + print(f" - {err}", file=sys.stderr) + return 1 + + print("✓ deployment profiles reference release images for the inspected repositories.") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/deploy/tools/validation/validate-profiles.sh b/deploy/tools/validation/validate-profiles.sh new file mode 100644 index 000000000..5680f0f5a --- /dev/null +++ b/deploy/tools/validation/validate-profiles.sh @@ -0,0 +1,61 @@ +#!/usr/bin/env bash +set -euo pipefail + +ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" +COMPOSE_DIR="$ROOT_DIR/compose" +HELM_DIR="$ROOT_DIR/helm/stellaops" + +compose_profiles=( + "docker-compose.dev.yaml:env/dev.env.example" + "docker-compose.stage.yaml:env/stage.env.example" + "docker-compose.prod.yaml:env/prod.env.example" + "docker-compose.airgap.yaml:env/airgap.env.example" + "docker-compose.mirror.yaml:env/mirror.env.example" + "docker-compose.telemetry.yaml:" + "docker-compose.telemetry-storage.yaml:" +) + +docker_ready=false +if command -v docker >/dev/null 2>&1; then + if docker compose version >/dev/null 2>&1; then + docker_ready=true + else + echo "⚠️ docker CLI present but Compose plugin unavailable; skipping compose validation" >&2 + fi +else + echo "⚠️ docker CLI not found; skipping compose validation" >&2 +fi + +if [[ "$docker_ready" == "true" ]]; then + for entry in "${compose_profiles[@]}"; do + IFS=":" read -r compose_file env_file <<<"$entry" + printf '→ validating %s with %s\n' "$compose_file" "$env_file" + if [[ -n "$env_file" ]]; then + docker compose \ + --env-file "$COMPOSE_DIR/$env_file" \ + -f "$COMPOSE_DIR/$compose_file" config >/dev/null + else + docker compose -f "$COMPOSE_DIR/$compose_file" config >/dev/null + fi + done +fi + +helm_values=( + "$HELM_DIR/values-dev.yaml" + "$HELM_DIR/values-stage.yaml" + "$HELM_DIR/values-prod.yaml" + "$HELM_DIR/values-airgap.yaml" + "$HELM_DIR/values-mirror.yaml" +) + +if command -v helm >/dev/null 2>&1; then + for values in "${helm_values[@]}"; do + printf '→ linting Helm chart with %s\n' "$(basename "$values")" + helm lint "$HELM_DIR" -f "$values" + helm template test-release "$HELM_DIR" -f "$values" >/dev/null + done +else + echo "⚠️ helm CLI not found; skipping Helm lint/template" >&2 +fi + +printf 'Profiles validated (where tooling was available).\n' diff --git a/deploy/tools/validation/validate_restore_sources.py b/deploy/tools/validation/validate_restore_sources.py new file mode 100644 index 000000000..06bb2bc52 --- /dev/null +++ b/deploy/tools/validation/validate_restore_sources.py @@ -0,0 +1,183 @@ +#!/usr/bin/env python3 + +""" +Validate NuGet source ordering for StellaOps. + +Ensures `local-nuget` is the highest priority feed in both NuGet.config and the +Directory.Build.props restore configuration. Fails fast with actionable errors +so CI/offline kit workflows can assert deterministic restore ordering. 
+""" + +from __future__ import annotations + +import argparse +import subprocess +import sys +import xml.etree.ElementTree as ET +from pathlib import Path + + +REPO_ROOT = Path(__file__).resolve().parents[2] +NUGET_CONFIG = REPO_ROOT / "NuGet.config" +ROOT_PROPS = REPO_ROOT / "Directory.Build.props" +EXPECTED_SOURCE_KEYS = ["local", "dotnet-public", "nuget.org"] + + +class ValidationError(Exception): + """Raised when validation fails.""" + + +def _fail(message: str) -> None: + raise ValidationError(message) + + +def _parse_xml(path: Path) -> ET.ElementTree: + try: + return ET.parse(path) + except FileNotFoundError as exc: + _fail(f"Missing required file: {path}") + except ET.ParseError as exc: + _fail(f"Could not parse XML for {path}: {exc}") + + +def validate_nuget_config() -> None: + tree = _parse_xml(NUGET_CONFIG) + root = tree.getroot() + + package_sources = root.find("packageSources") + if package_sources is None: + _fail("NuGet.config must declare a section.") + + children = list(package_sources) + if not children or children[0].tag != "clear": + _fail("NuGet.config packageSources must begin with a element.") + + adds = [child for child in children if child.tag == "add"] + if not adds: + _fail("NuGet.config packageSources must define at least one entry.") + + keys = [add.attrib.get("key") for add in adds] + if keys[: len(EXPECTED_SOURCE_KEYS)] != EXPECTED_SOURCE_KEYS: + formatted = ", ".join(keys) or "" + _fail( + "NuGet.config packageSources must list feeds in the order " + f"{EXPECTED_SOURCE_KEYS}. Found: {formatted}" + ) + + local_value = adds[0].attrib.get("value", "") + if Path(local_value).name != "local-nuget": + _fail( + "NuGet.config local feed should point at the repo-local mirror " + f"'local-nuget', found value '{local_value}'." + ) + + clear = package_sources.find("clear") + if clear is None: + _fail("NuGet.config packageSources must start with to avoid inherited feeds.") + + +def validate_directory_build_props() -> None: + tree = _parse_xml(ROOT_PROPS) + root = tree.getroot() + defaults = None + for element in root.findall(".//_StellaOpsDefaultRestoreSources"): + defaults = [fragment.strip() for fragment in element.text.split(";") if fragment.strip()] + break + + if defaults is None: + _fail("Directory.Build.props must define _StellaOpsDefaultRestoreSources.") + + expected_props = [ + "$(StellaOpsLocalNuGetSource)", + "$(StellaOpsDotNetPublicSource)", + "$(StellaOpsNuGetOrgSource)", + ] + if defaults != expected_props: + _fail( + "Directory.Build.props _StellaOpsDefaultRestoreSources must list feeds " + f"in the order {expected_props}. Found: {defaults}" + ) + + restore_nodes = root.findall(".//RestoreSources") + if not restore_nodes: + _fail("Directory.Build.props must override RestoreSources to force deterministic ordering.") + + uses_default_first = any( + node.text + and node.text.strip().startswith("$(_StellaOpsDefaultRestoreSources)") + for node in restore_nodes + ) + if not uses_default_first: + _fail( + "Directory.Build.props RestoreSources override must place " + "$(_StellaOpsDefaultRestoreSources) at the beginning." 
+ ) + + +def assert_single_nuget_config() -> None: + extra_configs: list[Path] = [] + configs: set[Path] = set() + for glob in ("NuGet.config", "nuget.config"): + try: + result = subprocess.run( + ["rg", "--files", f"-g{glob}"], + check=False, + capture_output=True, + text=True, + cwd=REPO_ROOT, + ) + except FileNotFoundError as exc: + _fail("ripgrep (rg) is required for validation but was not found on PATH.") + if result.returncode not in (0, 1): + _fail( + f"ripgrep failed while searching for {glob}: {result.stderr.strip() or result.returncode}" + ) + for line in result.stdout.splitlines(): + configs.add((REPO_ROOT / line).resolve()) + + configs.discard(NUGET_CONFIG.resolve()) + extra_configs.extend(sorted(configs)) + if extra_configs: + formatted = "\n ".join(str(path.relative_to(REPO_ROOT)) for path in extra_configs) + _fail( + "Unexpected additional NuGet.config files detected. " + "Consolidate feed configuration in the repo root:\n " + f"{formatted}" + ) + + +def parse_args(argv: list[str]) -> argparse.Namespace: + parser = argparse.ArgumentParser( + description="Verify StellaOps NuGet feeds prioritise the local mirror." + ) + parser.add_argument( + "--skip-rg", + action="store_true", + help="Skip ripgrep discovery of extra NuGet.config files (useful for focused runs).", + ) + return parser.parse_args(argv) + + +def main(argv: list[str]) -> int: + args = parse_args(argv) + validations = [ + ("NuGet.config ordering", validate_nuget_config), + ("Directory.Build.props restore override", validate_directory_build_props), + ] + if not args.skip_rg: + validations.append(("single NuGet.config", assert_single_nuget_config)) + + for label, check in validations: + try: + check() + except ValidationError as exc: + sys.stderr.write(f"[FAIL] {label}: {exc}\n") + return 1 + else: + sys.stdout.write(f"[OK] {label}\n") + + return 0 + + +if __name__ == "__main__": + sys.exit(main(sys.argv[1:])) diff --git a/devops/services/export/seed-rustfs.sh b/devops/services/export/seed-rustfs.sh new file mode 100644 index 000000000..9c0f41798 --- /dev/null +++ b/devops/services/export/seed-rustfs.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash +set -euo pipefail +RUSTFS_ENDPOINT=${RUSTFS_ENDPOINT:-http://localhost:8080} +BUCKET=${BUCKET:-export-ci} +TMP=$(mktemp) +cleanup(){ rm -f "$TMP"; } +trap cleanup EXIT + +cat > "$TMP" <<'DATA' +{"id":"exp-001","object":"s3://export-ci/sample-export.ndjson","status":"ready"} +DATA + +# RustFS uses S3-compatible API +export AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID:-exportci}" +export AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY:-exportci123}" +export AWS_EC2_METADATA_DISABLED=true + +if ! aws --endpoint-url "$RUSTFS_ENDPOINT" s3 ls "s3://$BUCKET" >/dev/null 2>&1; then + aws --endpoint-url "$RUSTFS_ENDPOINT" s3 mb "s3://$BUCKET" +fi +aws --endpoint-url "$RUSTFS_ENDPOINT" s3 cp "$TMP" "s3://$BUCKET/sample-export.ndjson" +echo "Seeded $BUCKET/sample-export.ndjson" diff --git a/devops/tools/ops-scripts/check-advisory-raw-duplicates.sql b/devops/tools/ops-scripts/check-advisory-raw-duplicates.sql new file mode 100644 index 000000000..0c5ffb9aa --- /dev/null +++ b/devops/tools/ops-scripts/check-advisory-raw-duplicates.sql @@ -0,0 +1,46 @@ +-- Advisory raw duplicate detection query +-- Surfaces advisory_raw duplicate candidates prior to enabling the idempotency unique index. +-- Intended for staging/offline snapshots. 
+--
+-- Usage:
+--   psql -d concelier -f devops/tools/ops-scripts/check-advisory-raw-duplicates.sql
+--
+-- Variables (pass with psql -v, e.g. -v LIMIT=100):
+--   LIMIT - optional cap on the number of duplicate groups to print (default 50).
+
+-- Default LIMIT to an empty string when it is not supplied, so the COALESCE below applies 50.
+\if :{?LIMIT}
+\else
+    \set LIMIT ''
+\endif
+
+\echo '== advisory_raw duplicate audit =='
+\conninfo
+
+WITH duplicates AS (
+    SELECT
+        source_vendor,
+        upstream_id,
+        content_hash,
+        tenant,
+        COUNT(*) AS count,
+        ARRAY_AGG(id) AS ids
+    FROM advisory_raw
+    GROUP BY source_vendor, upstream_id, content_hash, tenant
+    HAVING COUNT(*) > 1
+    ORDER BY COUNT(*) DESC, source_vendor, upstream_id
+    LIMIT COALESCE(NULLIF(:'LIMIT', '')::INT, 50)
+)
+SELECT
+    'vendor: ' || source_vendor || E'\n' ||
+    'upstream_id: ' || upstream_id || E'\n' ||
+    'tenant: ' || COALESCE(tenant, 'NULL') || E'\n' ||
+    'content_hash: ' || content_hash || E'\n' ||
+    'count: ' || count || E'\n' ||
+    'ids: ' || ARRAY_TO_STRING(ids, ', ') AS duplicate_info
+FROM duplicates;
+
+SELECT CASE WHEN COUNT(*) = 0
+    THEN 'No duplicate advisory_raw documents detected.'
+    ELSE 'Found ' || COUNT(*) || ' duplicate groups.'
+END AS status
+FROM (
+    SELECT 1 FROM advisory_raw
+    GROUP BY source_vendor, upstream_id, content_hash, tenant
+    HAVING COUNT(*) > 1
+) t;
diff --git a/devops/tools/ops-scripts/rollback-lnm-backfill.sql b/devops/tools/ops-scripts/rollback-lnm-backfill.sql
new file mode 100644
index 000000000..be20752e6
--- /dev/null
+++ b/devops/tools/ops-scripts/rollback-lnm-backfill.sql
@@ -0,0 +1,60 @@
+-- Rollback script for the LNM-21-102-DEV legacy advisory backfill migration.
+-- Removes the backfilled observations and linksets identified by the backfill_marker field,
+-- then clears the tombstone markers from advisory_raw.
+--
+-- Usage:
+--   psql -d concelier -f devops/tools/ops-scripts/rollback-lnm-backfill.sql
+--
+-- Variables (pass with psql -v):
+--   DRY_RUN - if defined (any value, e.g. -v DRY_RUN=1), only report what would be deleted
+--             without making changes; omit it to execute the rollback.
+--
+-- After running this script, delete the migration record:
+--   DELETE FROM schema_migrations WHERE id = '20251127_lnm_legacy_backfill';
+--
+-- Then restart the Concelier service.
+
+\echo ''
+\echo '== LNM-21-102-DEV Backfill Rollback =='
+\conninfo
+
+-- Count backfilled observations
+SELECT 'Found ' || COUNT(*) || ' backfilled observations to remove.' AS status
+FROM advisory_observations
+WHERE backfill_marker = 'lnm_21_102_dev';
+
+-- Count backfilled linksets
+SELECT 'Found ' || COUNT(*) || ' backfilled linksets to remove.' AS status
+FROM advisory_linksets
+WHERE backfill_marker = 'lnm_21_102_dev';
+
+-- Count advisory_raw tombstone markers
+SELECT 'Found ' || COUNT(*) || ' advisory_raw documents with tombstone markers to clear.' AS status
+FROM advisory_raw
+WHERE backfill_marker = 'lnm_21_102_dev';
+
+-- Execute only when DRY_RUN is not defined
+\if :{?DRY_RUN}
+    \echo 'DRY RUN mode - no changes made'
+    \echo 'Re-run without -v DRY_RUN to execute the rollback'
+\else
+    -- Step 1: Delete backfilled observations
+    DELETE FROM advisory_observations WHERE backfill_marker = 'lnm_21_102_dev';
+    \echo 'Deleted observations'
+
+    -- Step 2: Delete backfilled linksets
+    DELETE FROM advisory_linksets WHERE backfill_marker = 'lnm_21_102_dev';
+    \echo 'Deleted linksets'
+
+    -- Step 3: Clear tombstone markers from advisory_raw
+    UPDATE advisory_raw SET backfill_marker = NULL WHERE backfill_marker = 'lnm_21_102_dev';
+    \echo 'Cleared tombstone markers'
+\endif
+
+\echo ''
+\echo '== Rollback Summary =='
+\echo ''
+\echo 'Next steps:'
+\echo '1. Delete the migration record:'
+\echo '   DELETE FROM schema_migrations WHERE id = ''20251127_lnm_legacy_backfill'';'
+\echo '2. Restart the Concelier service.'
+\echo ''
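+
+-- Optional post-rollback check (illustrative): after a non-dry-run execution, all three
+-- counts should be zero.
+--   SELECT COUNT(*) FROM advisory_observations WHERE backfill_marker = 'lnm_21_102_dev';
+--   SELECT COUNT(*) FROM advisory_linksets WHERE backfill_marker = 'lnm_21_102_dev';
+--   SELECT COUNT(*) FROM advisory_raw WHERE backfill_marker = 'lnm_21_102_dev';
+--
+-- Example invocations (illustrative; adjust the database name to your environment):
+--   Dry run:  psql -d concelier -v DRY_RUN=1 -f devops/tools/ops-scripts/rollback-lnm-backfill.sql
+--   Execute:  psql -d concelier -f devops/tools/ops-scripts/rollback-lnm-backfill.sql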