CD/CD consolidation
This commit is contained in:
22
devops/docs/AGENTS.md
Normal file
22
devops/docs/AGENTS.md
Normal file
@@ -0,0 +1,22 @@
|
||||
# DevOps & Release — Agent Charter
|
||||
|
||||
## Mission
|
||||
Execute deterministic build/release pipeline per `docs/modules/devops/ARCHITECTURE.md`:
|
||||
- Reproducible builds with SBOM/provenance, cosign signing, transparency logging.
|
||||
- Channel manifests (LTS/Stable/Edge) with digests, Helm/Compose profiles.
|
||||
- Performance guard jobs ensuring budgets.
|
||||
|
||||
## Expectations
|
||||
- Coordinate with Scanner/Scheduler/Notify teams for artifact availability.
|
||||
- Maintain CI reliability; update the owning sprint entries as states change.
|
||||
|
||||
## Required Reading
|
||||
- `docs/modules/platform/architecture-overview.md`
|
||||
- `docs/modules/airgap/airgap-mode.md`
|
||||
|
||||
## Working Agreement
|
||||
- 1. Update task status to `DOING`/`DONE` inside the corresponding `docs/implplan/SPRINT_*.md` entry when you start or finish work.
|
||||
- 2. Review this charter and the Required Reading documents before coding; confirm prerequisites are met.
|
||||
- 3. Keep changes deterministic (stable ordering, timestamps, hashes) and align with offline/air-gap expectations.
|
||||
- 4. Coordinate doc updates, tests, and cross-guild communication whenever contracts or workflows change.
|
||||
- 5. Revert to `TODO` if you pause the task without shipping changes; leave notes in commit/PR descriptions for context.
|
||||
28
devops/docs/README-space.md
Normal file
28
devops/docs/README-space.md
Normal file
@@ -0,0 +1,28 @@
|
||||
# Freeing Disk Space Quickly
|
||||
|
||||
If PTY allocation or builds fail with “No space left on device”, try these steps (safe order):
|
||||
|
||||
1) Remove build/test artefacts:
|
||||
```
|
||||
DRY_RUN=1 scripts/devops/cleanup-workspace.sh # preview
|
||||
SAFE_ONLY=0 scripts/devops/cleanup-workspace.sh # include bin/obj if needed
|
||||
```
|
||||
|
||||
2) Prune Docker cache (if allowed):
|
||||
```
|
||||
docker system prune -af
|
||||
docker volume prune -f
|
||||
```
|
||||
|
||||
3) Clear NuGet cache (local user):
|
||||
```
|
||||
rm -rf ~/.nuget/packages
|
||||
```
|
||||
|
||||
4) Remove old CI artefacts:
|
||||
- `ops/devops/artifacts/`
|
||||
- `ops/devops/ci-110-runner/artifacts/`
|
||||
- `ops/devops/sealed-mode-ci/artifacts/`
|
||||
- `out/` directories
|
||||
|
||||
5) Re-run the blocked workflow.
|
||||
27
devops/docs/TASKS.completed.md
Normal file
27
devops/docs/TASKS.completed.md
Normal file
@@ -0,0 +1,27 @@
|
||||
# Completed Tasks
|
||||
|
||||
| ID | Status | Owner(s) | Depends on | Description | Exit Criteria |
|
||||
|----|--------|----------|------------|-------------|---------------|
|
||||
| DEVOPS-HELM-09-001 | DONE | DevOps Guild | SCANNER-WEB-09-101 | Create Helm/Compose environment profiles (dev, staging, airgap) with deterministic digests. | Profiles committed under `deploy/`; docs updated; CI smoke deploy passes. |
|
||||
| DEVOPS-SCANNER-09-204 | DONE (2025-10-21) | DevOps Guild, Scanner WebService Guild | SCANNER-EVENTS-15-201 | Surface `SCANNER__EVENTS__*` environment variables across docker-compose (dev/stage/airgap) and Helm values, defaulting to share the Redis queue DSN. | Compose/Helm configs ship enabled Redis event publishing with documented overrides; lint jobs updated; docs cross-link to new knobs. |
|
||||
| DEVOPS-SCANNER-09-205 | DONE (2025-10-21) | DevOps Guild, Notify Guild | DEVOPS-SCANNER-09-204 | Add Notify smoke stage that tails the Redis stream and asserts `scanner.report.ready`/`scanner.scan.completed` reach Notify WebService in staging. | CI job reads Redis stream during scanner smoke deploy, confirms Notify ingestion via API, alerts on failure. |
|
||||
| DEVOPS-PERF-10-001 | DONE | DevOps Guild | BENCH-SCANNER-10-001 | Add perf smoke job (SBOM compose <5 s target) to CI. | CI job runs sample build verifying <5 s; alerts configured. |
|
||||
| DEVOPS-PERF-10-002 | DONE (2025-10-23) | DevOps Guild | BENCH-SCANNER-10-002 | Publish analyzer bench metrics to Grafana/perf workbook and alarm on ≥20 % regressions. | CI exports JSON for dashboards; Grafana panel wired; Ops on-call doc updated with alert hook. |
|
||||
| DEVOPS-REL-14-001 | DONE (2025-10-26) | DevOps Guild | SIGNER-API-11-101, ATTESTOR-API-11-201 | Deterministic build/release pipeline with SBOM/provenance, signing, manifest generation. | CI pipeline produces signed images + SBOM/attestations, manifests published with verified hashes, docs updated. |
|
||||
| DEVOPS-REL-14-004 | DONE (2025-10-26) | DevOps Guild, Scanner Guild | DEVOPS-REL-14-001, SCANNER-ANALYZERS-LANG-10-309P | Extend release/offline smoke jobs to exercise the Python analyzer plug-in (warm/cold scans, determinism, signature checks). | Release/Offline pipelines run Python analyzer smoke suite; alerts hooked; docs updated with new coverage matrix. |
|
||||
| DEVOPS-REL-17-002 | DONE (2025-10-26) | DevOps Guild | DEVOPS-REL-14-001, SCANNER-EMIT-17-701 | Persist stripped-debug artifacts organised by GNU build-id and bundle them into release/offline kits with checksum manifests. | CI job writes `.debug` files under `artifacts/debug/.build-id/`, manifest + checksums published, offline kit includes cache, smoke job proves symbol lookup via build-id. |
|
||||
| DEVOPS-MIRROR-08-001 | DONE (2025-10-19) | DevOps Guild | DEVOPS-REL-14-001 | Stand up managed mirror profiles for `*.stella-ops.org` (Concelier/Excititor), including Helm/Compose overlays, multi-tenant secrets, CDN caching, and sync documentation. | Infra overlays committed, CI smoke deploy hits mirror endpoints, runbooks published for downstream sync and quota management. |
|
||||
| DEVOPS-POLICY-20-001 | DONE (2025-10-26) | DevOps Guild, Policy Guild | POLICY-ENGINE-20-001 | Integrate DSL linting in CI (parser/compile) to block invalid policies; add pipeline step compiling sample policies. | CI fails on syntax errors; lint logs surfaced; docs updated with pipeline instructions. |
|
||||
| DEVOPS-POLICY-20-003 | DONE (2025-10-26) | DevOps Guild, QA Guild | DEVOPS-POLICY-20-001, POLICY-ENGINE-20-005 | Determinism CI: run Policy Engine twice with identical inputs and diff outputs to guard non-determinism. | CI job compares outputs, fails on differences, logs stored; documentation updated. |
|
||||
| DEVOPS-POLICY-20-004 | DONE (2025-10-27) | DevOps Guild, Scheduler Guild, CLI Guild | SCHED-MODELS-20-001, CLI-POLICY-20-002 | Automate policy schema exports: generate JSON Schema from `PolicyRun*` DTOs during CI, publish artefacts, and emit change alerts for CLI consumers (Slack + changelog). | CI stage outputs versioned schema files, uploads artefacts, notifies #policy-engine channel on change; docs/CLI references updated. |
|
||||
| DEVOPS-OBS-50-001 | DONE (2025-10-26) | DevOps Guild, Observability Guild | TELEMETRY-OBS-50-001 | Deliver default OpenTelemetry collector deployment (Compose/Helm manifests), OTLP ingestion endpoints, and secure pipeline (authN, mTLS, tenant partitioning). Provide smoke test verifying traces/logs/metrics ingestion. | Collector manifests committed; smoke test green; docs updated; imposed rule banner reminder noted. |
|
||||
| DEVOPS-OBS-50-003 | DONE (2025-10-26) | DevOps Guild, Offline Kit Guild | DEVOPS-OBS-50-001 | Package telemetry stack configs for air-gapped installs (Offline Kit bundle, documented overrides, sample values) and automate checksum/signature generation. | Offline bundle includes collector+storage configs; checksums published; docs cross-linked; imposed rule annotation recorded. |
|
||||
| DEVOPS-LAUNCH-18-100 | DONE (2025-10-26) | DevOps Guild | - | Finalise production environment footprint (clusters, secrets, network overlays) for full-platform go-live. | IaC/compose overlays committed, secrets placeholders documented, dry-run deploy succeeds in staging. |
|
||||
|
||||
| DEVOPS-CONSOLE-23-002 | TODO | DevOps Guild, Console Guild | DEVOPS-CONSOLE-23-001, CONSOLE-REL-23-301 | Produce `stella-console` container build + Helm chart overlays with deterministic digests, SBOM/provenance artefacts, and offline bundle packaging scripts. | Container published to registry mirror, Helm values committed, SBOM/attestations generated, offline kit job passes smoke test, docs updated. |
|
||||
| DEVOPS-LAUNCH-18-100 | DONE (2025-10-26) | DevOps Guild | - | Finalise production environment footprint (clusters, secrets, network overlays) for full-platform go-live. | IaC/compose overlays committed, secrets placeholders documented, dry-run deploy succeeds in staging. |
|
||||
| DEVOPS-LAUNCH-18-900 | DONE (2025-10-26) | DevOps Guild, Module Leads | Wave 0 completion | Collect “full implementation” sign-off from module owners and consolidate launch readiness checklist. | Sign-off record stored under `docs/modules/devops/runbooks/launch-readiness.md`; outstanding gaps triaged; checklist approved. |
|
||||
| DEVOPS-LAUNCH-18-001 | DONE (2025-10-26) | DevOps Guild | DEVOPS-LAUNCH-18-100, DEVOPS-LAUNCH-18-900 | Production launch cutover rehearsal and runbook publication. | `docs/modules/devops/runbooks/launch-cutover.md` drafted, rehearsal executed with rollback drill, approvals captured. |
|
||||
| DEVOPS-NUGET-13-001 | DONE (2025-10-25) | DevOps Guild, Platform Leads | DEVOPS-REL-14-001 | Add .NET 10 preview feeds / local mirrors so `Microsoft.Extensions.*` 10.0 preview packages restore offline; refresh restore docs. | NuGet.config maps preview feeds (or local mirrored packages), `dotnet restore` succeeds for Excititor/Concelier solutions without ad-hoc feed edits, docs updated for offline bootstrap. |
|
||||
| DEVOPS-NUGET-13-002 | DONE (2025-10-26) | DevOps Guild | DEVOPS-NUGET-13-001 | Ensure all solutions/projects prefer `local-nuget` before public sources and document restore order validation. | `NuGet.config` and solution-level configs resolve from `local-nuget` first; automated check verifies priority; docs updated for restore ordering. |
|
||||
| DEVOPS-NUGET-13-003 | DONE (2025-10-26) | DevOps Guild, Platform Leads | DEVOPS-NUGET-13-002 | Sweep `Microsoft.*` NuGet dependencies pinned to 8.* and upgrade to latest .NET 10 equivalents (or .NET 9 when 10 unavailable), updating restore guidance. | Dependency audit shows no 8.* `Microsoft.*` packages remaining; CI builds green; changelog/doc sections capture upgrade rationale. |
|
||||
74
devops/docs/deploy-readme.md
Normal file
74
devops/docs/deploy-readme.md
Normal file
@@ -0,0 +1,74 @@
|
||||
# Deployment Profiles
|
||||
|
||||
This directory contains deterministic deployment bundles for the core Stella Ops stack. All manifests reference immutable image digests and map 1:1 to the release manifests stored under `deploy/releases/`.
|
||||
|
||||
## Structure
|
||||
|
||||
- `releases/` – canonical release manifests (edge, stable, airgap) used to source image digests.
|
||||
- `compose/` – Docker Compose bundles for dev/stage/airgap targets plus `.env` seed files.
|
||||
- `compose/docker-compose.mirror.yaml` – managed mirror bundle for `*.stella-ops.org` with gateway cache and multi-tenant auth.
|
||||
- `compose/docker-compose.telemetry.yaml` – optional OpenTelemetry collector overlay (mutual TLS, OTLP pipelines).
|
||||
- `compose/docker-compose.telemetry-storage.yaml` – optional Prometheus/Tempo/Loki stack for observability backends.
|
||||
- `helm/stellaops/` – multi-profile Helm chart with values files for dev/stage/airgap.
|
||||
- `helm/stellaops/INSTALL.md` – install/runbook for prod and airgap profiles with digest pins.
|
||||
- `telemetry/` – shared OpenTelemetry collector configuration and certificate artefacts (generated via tooling).
|
||||
- `tools/validate-profiles.sh` – helper that runs `docker compose config` and `helm lint/template` for every profile.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. Update or add a release manifest under `releases/` with the new digests.
|
||||
2. Mirror the digests into the Compose and Helm profiles that correspond to that channel.
|
||||
3. Run `deploy/tools/validate-profiles.sh` (requires Docker CLI and Helm) to ensure the bundles lint and template cleanly.
|
||||
4. If telemetry ingest is required for the release, generate development certificates using
|
||||
`./ops/devops/telemetry/generate_dev_tls.sh` and run the collector smoke test with
|
||||
`python ./ops/devops/telemetry/smoke_otel_collector.py` to verify the OTLP endpoints.
|
||||
5. Commit the change alongside any documentation updates (e.g. install guide cross-links).
|
||||
|
||||
Maintaining the digest linkage keeps offline/air-gapped installs reproducible and avoids tag drift between environments.
|
||||
|
||||
### Surface.Env rollout warnings
|
||||
|
||||
- Compose (`deploy/compose/env/*.env.example`) and Helm (`deploy/helm/stellaops/values-*.yaml`) now seed `SCANNER_SURFACE_*` _and_ `ZASTAVA_SURFACE_*` variables so Scanner Worker/WebService and Zastava Observer/Webhook resolve cache roots, Surface.FS endpoints, and secrets providers through `StellaOps.Scanner.Surface.Env`.
|
||||
- During rollout, watch for structured log messages (and readiness output) prefixed with `surface.env.`—for example, `surface.env.cache_root_missing`, `surface.env.endpoint_unreachable`, or `surface.env.secrets_provider_invalid`.
|
||||
- Treat these warnings as deployment blockers: update the endpoint/cache/secrets values or permissions before promoting the environment, otherwise workers will fail fast at startup.
|
||||
- Air-gapped bundles default the secrets provider to `file` with `/etc/stellaops/secrets`; connected clusters default to `kubernetes`. Adjust the provider/root pair if your secrets manager differs.
|
||||
- Secret provisioning workflows for Kubernetes/Compose/Offline Kit are documented in `ops/devops/secrets/surface-secrets-provisioning.md`; follow that for `Surface.Secrets` handles and RBAC/permissions.
|
||||
|
||||
### Mongo2Go OpenSSL prerequisites
|
||||
|
||||
- Linux runners that execute Mongo2Go-backed suites (Excititor, Scheduler, Graph, etc.) must expose OpenSSL 1.1 (`libcrypto.so.1.1`, `libssl.so.1.1`). The canonical copies live under `tests/native/openssl-1.1/linux-x64`.
|
||||
- Export `LD_LIBRARY_PATH="$(git rev-parse --show-toplevel)/tests/native/openssl-1.1/linux-x64:${LD_LIBRARY_PATH:-}"` before invoking `dotnet test`. Example:\
|
||||
`LD_LIBRARY_PATH="$(pwd)/tests/native/openssl-1.1/linux-x64" dotnet test src/Excititor/__Tests/StellaOps.Excititor.WebService.Tests/StellaOps.Excititor.WebService.Tests.csproj --nologo`.
|
||||
- CI agents or Dockerfiles that host these tests should either mount the directory into the container or copy the two `.so` files into a directory that is already on the runtime library path.
|
||||
|
||||
### Additional tooling
|
||||
|
||||
- `deploy/tools/check-channel-alignment.py` – verifies that Helm/Compose profiles reference the exact images listed in a release manifest. Run it for each channel before promoting a release.
|
||||
- `ops/devops/telemetry/generate_dev_tls.sh` – produces local CA/server/client certificates for Compose-based collector testing.
|
||||
- `ops/devops/telemetry/smoke_otel_collector.py` – sends OTLP traffic and asserts the collector accepted traces, metrics, and logs.
|
||||
- `ops/devops/telemetry/package_offline_bundle.py` – packages telemetry assets (config/Helm/Compose) into a signed tarball for air-gapped installs.
|
||||
- `docs/modules/devops/runbooks/deployment-upgrade.md` – end-to-end instructions for upgrade, rollback, and channel promotion workflows (Helm + Compose).
|
||||
|
||||
### Tenancy observability & chaos (DEVOPS-TEN-49-001)
|
||||
|
||||
- Import `ops/devops/tenant/recording-rules.yaml` and `ops/devops/tenant/alerts.yaml` into your Prometheus rule groups.
|
||||
- Add Grafana dashboard `ops/devops/tenant/dashboards/tenant-audit.json` (folder `StellaOps / Tenancy`) to watch latency/error/auth cache ratios per tenant/service.
|
||||
- Run the multi-tenant k6 harness `ops/devops/tenant/k6-tenant-load.js` to hit 5k concurrent tenant-labelled requests (defaults to read/write 90/10, header `X-StellaOps-Tenant`).
|
||||
- Execute JWKS outage chaos via `ops/devops/tenant/jwks-chaos.sh` on an isolated agent with sudo/iptables; watch alerts `jwks_cache_miss_spike` and `tenant_auth_failures_spike` while load is active.
|
||||
|
||||
## CI smoke checks
|
||||
|
||||
The `.gitea/workflows/build-test-deploy.yml` pipeline includes a `notify-smoke` stage that validates scanner event propagation after staging deployments. Configure the following repository secrets (or environment-level secrets) so the job can connect to Redis and the Notify API:
|
||||
|
||||
- `NOTIFY_SMOKE_REDIS_DSN` – Redis connection string (`redis://user:pass@host:port/db`).
|
||||
- `NOTIFY_SMOKE_NOTIFY_BASEURL` – Base URL for the staging Notify WebService (e.g. `https://notify.stage.stella-ops.internal`).
|
||||
- `NOTIFY_SMOKE_NOTIFY_TOKEN` – OAuth bearer token (service account) with permission to read deliveries.
|
||||
- `NOTIFY_SMOKE_NOTIFY_TENANT` – Tenant identifier used for the smoke validation requests.
|
||||
- *(Optional)* `NOTIFY_SMOKE_NOTIFY_TENANT_HEADER` – Override for the tenant header name (defaults to `X-StellaOps-Tenant`).
|
||||
|
||||
Define the following repository variables (or secrets) to drive the assertions performed by the smoke check:
|
||||
|
||||
- `NOTIFY_SMOKE_EXPECT_KINDS` – Comma-separated event kinds the checker must observe (for example `scanner.report.ready,scanner.scan.completed`).
|
||||
- `NOTIFY_SMOKE_LOOKBACK_MINUTES` – Time window (in minutes) used when scanning the Redis stream for recent events (for example `30`).
|
||||
|
||||
All of the above values are required—the workflow fails fast with a descriptive error if any are missing or empty. Provide the variables at the organisation or repository scope before enabling the smoke stage.
|
||||
30
devops/docs/nuget-preview-packages.csv
Normal file
30
devops/docs/nuget-preview-packages.csv
Normal file
@@ -0,0 +1,30 @@
|
||||
# Package,Version,SHA256,SourceBase(optional)
|
||||
# DotNetPublicFlat=https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.AspNetCore.Authentication.JwtBearer,10.0.0-rc.2.25502.107,3223f447bde9a3620477305a89520e8becafe23b481a0b423552af572439f8c2,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.AspNetCore.Mvc.Testing,10.0.0-rc.2.25502.107,b6b53c62e0abefdca30e6ca08ab8357e395177dd9f368ab3ad4bbbd07e517229,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.AspNetCore.OpenApi,10.0.0-rc.2.25502.107,f64de1fe870306053346a31263e53e29f2fdfe0eae432a3156f8d7d705c81d85,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Data.Sqlite,9.0.0-rc.1.24451.1,770b637317e1e924f1b13587b31af0787c8c668b1d9f53f2fccae8ee8704e167,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Caching.Memory,10.0.0-rc.2.25502.107,6ec6d156ed06b07cbee9fa1c0803b8d54a5f904a0bf0183172f87b63c4044426,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration,10.0.0-rc.2.25502.107,0716f72cdc99b03946c98c418c39d42208fc65f20301bd1f26a6c174646870f6,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.Abstractions,10.0.0-rc.2.25502.107,db6e2cd37c40b5ac5ca7a4f40f5edafda2b6a8690f95a8c64b54c777a1d757c0,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.Binder,10.0.0-rc.2.25502.107,80f04da6beef001d3c357584485c2ddc6fdbf3776cfd10f0d7b40dfe8a79ee43,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.CommandLine,10.0.0-rc.2.25502.107,91974a95ae35bcfcd5e977427f3d0e6d3416e78678a159f5ec9e55f33a2e19af,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.EnvironmentVariables,10.0.0-rc.2.25502.107,74d65a20e2764d5f42863f5f203b216533fc51b22fb02a8491036feb98ae5fef,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.FileExtensions,10.0.0-rc.2.25502.107,5f97b56ea2ba3a1b252022504060351ce457f78ac9055d5fdd1311678721c1a1,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Configuration.Json,10.0.0-rc.2.25502.107,0ba362c479213eb3425f8e14d8a8495250dbaf2d5dad7c0a4ca8d3239b03c392,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.DependencyInjection,10.0.0-rc.2.25502.107,2e1b51b4fa196f0819adf69a15ad8c3432b64c3b196f2ed3d14b65136a6a8709,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.DependencyInjection.Abstractions,10.0.0-rc.2.25502.107,d6787ccf69e09428b3424974896c09fdabb8040bae06ed318212871817933352,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Diagnostics.Abstractions,10.0.0-rc.2.25502.107,b4bc47b4b4ded4ab2f134d318179537cbe16aed511bb3672553ea197929dc7d8,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Diagnostics.HealthChecks,10.0.0-rc.2.25502.107,855fd4da26b955b6b1d036390b1af10564986067b5cc6356cffa081c83eec158,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions,10.0.0-rc.2.25502.107,59f4724daed68a067a661e208f0a934f253b91ec5d52310d008e185bc2c9294c,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Hosting,10.0.0-rc.2.25502.107,ea9b1fa8e50acae720294671e6c36d4c58e20cfc9720335ab4f5ad4eba92cf62,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Hosting.Abstractions,10.0.0-rc.2.25502.107,98fa23ac82e19be221a598fc6f4b469e8b00c4ca2b7a42ad0bfea8b63bbaa9a2,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Http,10.0.0-rc.2.25502.107,c63c8bf4ca637137a561ca487b674859c2408918c4838a871bb26eb0c809a665,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Http.Polly,10.0.0-rc.2.25502.107,0b436196bcedd484796795f6a795d7a191294f1190f7a477f1a4937ef7f78110,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Logging.Abstractions,10.0.0-rc.2.25502.107,92b9a5ed62fe945ee88983af43c347429ec15691c9acb207872c548241cef961,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Logging.Console,10.0.0-rc.2.25502.107,fa1e10b5d6261675d9d2e97b9584ff9aaea2a2276eac584dfa77a1e35dcc58f5,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Options,10.0.0-rc.2.25502.107,d208acec60bec3350989694fd443e2d2f0ab583ad5f2c53a2879ade16908e5b4,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.Options.ConfigurationExtensions,10.0.0-rc.2.25502.107,c2863bb28c36fd67f308dd4af486897b512d62ecff2d96613ef954f5bef443e2,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.Extensions.TimeProvider.Testing,9.10.0,919a47156fc13f756202702cacc6e853123c84f1b696970445d89f16dfa45829,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.IdentityModel.Tokens,8.14.0,00b78c7b7023132e1d6b31d305e47524732dce6faca92dd16eb8d05a835bba7a,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
Microsoft.SourceLink.GitLab,8.0.0,a7efb9c177888f952ea8c88bc5714fc83c64af32b70fb080a1323b8d32233973,https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public/nuget/v3/flat2
|
||||
|
99
devops/docs/ops-devops-readme.md
Normal file
99
devops/docs/ops-devops-readme.md
Normal file
@@ -0,0 +1,99 @@
|
||||
# DevOps Release Automation
|
||||
|
||||
The **release** workflow builds and signs the StellaOps service containers,
|
||||
generates SBOM + provenance attestations, and emits a canonical
|
||||
`release.yaml`. The logic lives under `ops/devops/release/` and is invoked
|
||||
by the new `.gitea/workflows/release.yml` pipeline.
|
||||
|
||||
## Local dry run
|
||||
|
||||
```bash
|
||||
./ops/devops/release/build_release.py \
|
||||
--version 2025.10.0-edge \
|
||||
--channel edge \
|
||||
--dry-run
|
||||
```
|
||||
|
||||
Outputs land under `out/release/`. Use `--no-push` to run full builds without
|
||||
pushing to the registry.
|
||||
|
||||
After the build completes, run the verifier to validate recorded hashes and artefact
|
||||
presence:
|
||||
|
||||
```bash
|
||||
python ops/devops/release/verify_release.py --release-dir out/release
|
||||
```
|
||||
|
||||
## Python analyzer smoke & signing
|
||||
|
||||
`dotnet run --project src/Tools/LanguageAnalyzerSmoke` exercises the Python language
|
||||
analyzer plug-in against the golden fixtures (cold/warm timings, determinism). The
|
||||
release workflow runs this harness automatically and then produces Cosign
|
||||
signatures + SHA-256 sidecars for `StellaOps.Scanner.Analyzers.Lang.Python.dll`
|
||||
and its `manifest.json`. Keep `COSIGN_KEY_REF`/`COSIGN_IDENTITY_TOKEN` populated so
|
||||
the step can sign the artefacts; the generated `.sig`/`.sha256` files ship with the
|
||||
Offline Kit bundle.
|
||||
|
||||
## Required tooling
|
||||
|
||||
- Docker 25+ with Buildx
|
||||
- .NET 10 preview SDK (builds container stages and the SBOM generator)
|
||||
- Node.js 20 (Angular UI build)
|
||||
- Helm 3.16+
|
||||
- Cosign 2.2+
|
||||
|
||||
Supply signing material via environment variables:
|
||||
|
||||
- `COSIGN_KEY_REF` – e.g. `file:./keys/cosign.key` or `azurekms://…`
|
||||
- `COSIGN_PASSWORD` – password protecting the above key
|
||||
|
||||
The workflow defaults to multi-arch (`linux/amd64,linux/arm64`), SBOM in
|
||||
CycloneDX, and SLSA provenance (`https://slsa.dev/provenance/v1`).
|
||||
|
||||
## Debug store extraction
|
||||
|
||||
`build_release.py` now exports stripped debug artefacts for every ELF discovered in the published images. The files land under `out/release/debug/.build-id/<aa>/<rest>.debug`, with metadata captured in `debug/debug-manifest.json` (and a `.sha256` sidecar). Use `jq` to inspect the manifest or `readelf -n` to spot-check a build-id. Offline Kit packaging should reuse the `debug/` directory as-is.
|
||||
|
||||
## UI auth smoke (Playwright)
|
||||
|
||||
As part of **DEVOPS-UI-13-006** the pipelines will execute the UI auth smoke
|
||||
tests (`npm run test:e2e`) after building the Angular bundle. See
|
||||
`docs/modules/ui/operations/auth-smoke.md` for the job design, environment stubs, and
|
||||
offline runner considerations.
|
||||
|
||||
## NuGet preview bootstrap
|
||||
|
||||
`.NET 10` preview packages (Microsoft.Extensions.*, JwtBearer 10.0 RC, Sqlite 9 RC)
|
||||
ship from the public `dotnet-public` Azure DevOps feed. We mirror them into
|
||||
`./local-nuget` so restores succeed inside Offline Kit.
|
||||
|
||||
1. Run `./ops/devops/sync-preview-nuget.sh` whenever you update the manifest.
|
||||
2. The script now understands the optional `SourceBase` column (V3 flat container)
|
||||
and writes packages alongside their SHA-256 checks.
|
||||
3. `NuGet.config` registers the mirror (`local`), dotnet-public, and nuget.org.
|
||||
|
||||
Use `python3 ops/devops/validate_restore_sources.py` to prove the repo still
|
||||
prefers the local mirror and that `Directory.Build.props` enforces the same order.
|
||||
The validator now runs automatically in the `build-test-deploy` and `release`
|
||||
workflows so CI fails fast when a feed priority regression slips in.
|
||||
|
||||
Detailed operator instructions live in `docs/modules/devops/runbooks/nuget-preview-bootstrap.md`.
|
||||
|
||||
## CI harnesses (offline-friendly)
|
||||
|
||||
- **Concelier**: `ops/devops/concelier-ci-runner/run-concelier-ci.sh` builds `concelier-webservice.slnf` and runs WebService + Storage Mongo tests. Outputs binlog + TRX + summary under `ops/devops/artifacts/concelier-ci/<ts>/`.
|
||||
- **Advisory AI**: `ops/devops/advisoryai-ci-runner/run-advisoryai-ci.sh` builds `src/AdvisoryAI/StellaOps.AdvisoryAI.sln`, runs `StellaOps.AdvisoryAI.Tests`, and emits binlog + TRX + summary under `ops/devops/artifacts/advisoryai-ci/<ts>/`. For offline parity, configure a local NuGet feed in `nuget.config`.
|
||||
- **Scanner**: `ops/devops/scanner-ci-runner/run-scanner-ci.sh` builds `src/Scanner/StellaOps.Scanner.sln` and runs core/analyzer/web/worker test buckets with binlog + TRX outputs under `ops/devops/artifacts/scanner-ci/<ts>/`.
|
||||
|
||||
## Telemetry collector tooling (DEVOPS-OBS-50-001)
|
||||
|
||||
- `ops/devops/telemetry/generate_dev_tls.sh` – generates a development CA and
|
||||
client/server certificates for the OpenTelemetry collector overlay (mutual TLS).
|
||||
- `ops/devops/telemetry/smoke_otel_collector.py` – sends OTLP traces/metrics/logs
|
||||
over TLS and validates that the collector increments its receiver counters.
|
||||
- `ops/devops/telemetry/package_offline_bundle.py` – re-packages collector assets for the Offline Kit.
|
||||
- `ops/devops/telemetry/tenant_isolation_smoke.py` – verifies Tempo/Loki tenant isolation with mTLS and scoped headers.
|
||||
- `deploy/compose/docker-compose.telemetry-storage.yaml` – Prometheus/Tempo/Loki stack for staging validation.
|
||||
|
||||
Combine these helpers with `deploy/compose/docker-compose.telemetry.yaml` to run
|
||||
a secured collector locally before rolling out the Helm-based deployment.
|
||||
46
devops/docs/policy-signing.md
Normal file
46
devops/docs/policy-signing.md
Normal file
@@ -0,0 +1,46 @@
|
||||
# Policy Signing & Attestation (DevOps)
|
||||
|
||||
## Purpose
|
||||
- Keep policy artefacts (DSL files, bundles) signed with a short‑lived cosign key (or OIDC workload identity) so promotion is verifiable offline.
|
||||
- Provide deterministic, reproducible signing/attestation flows that runners can execute without external registries.
|
||||
- Make key rotation and verification one-liners for on-call and CI.
|
||||
|
||||
## Scripts
|
||||
- `scripts/policy/rotate-key.sh` – generate cosign keypair, emit base64 values for CI secrets in `out/policy-sign/keys/`.
|
||||
- `scripts/policy/sign-policy.sh` – sign a policy blob with `COSIGN_KEY_B64` and verify the signature; emits signature + public key to `out/policy-sign/`.
|
||||
- `scripts/policy/attest-verify.sh` – create a DSSE attestation for a policy blob and verify it against the generated bundle/public key.
|
||||
|
||||
## Local / CI workflow
|
||||
1. **Generate key (ephemeral or rotated):**
|
||||
```bash
|
||||
OUT_DIR=out/policy-sign/keys PREFIX=ci-policy COSIGN_PASSWORD= scripts/policy/rotate-key.sh
|
||||
```
|
||||
Copy the base64 strings from `out/policy-sign/keys/README.txt` into `POLICY_COSIGN_KEY_B64` / `POLICY_COSIGN_PUB_B64` secrets.
|
||||
2. **Sign a policy:**
|
||||
```bash
|
||||
export COSIGN_KEY_B64=$(base64 -w0 out/policy-sign/keys/ci-policy-cosign.key)
|
||||
COSIGN_PASSWORD= scripts/policy/sign-policy.sh --file docs/examples/policies/baseline.stella --out-dir out/policy-sign
|
||||
```
|
||||
Outputs: `baseline.stella.sig`, `cosign.pub`.
|
||||
3. **Attest + verify:**
|
||||
```bash
|
||||
export COSIGN_KEY_B64=$(base64 -w0 out/policy-sign/keys/ci-policy-cosign.key)
|
||||
COSIGN_PASSWORD= scripts/policy/attest-verify.sh --file docs/examples/policies/baseline.stella --out-dir out/policy-sign
|
||||
```
|
||||
Outputs: DSSE bundle `.attestation.sigstore` and re-verifies it with the public key.
|
||||
4. **CI stage:** `.gitea/workflows/policy-simulate.yml` now installs cosign, runs the three steps above, and publishes `out/policy-sign/` as an artifact alongside simulation outputs.
|
||||
|
||||
## OIDC / workload identity
|
||||
- Runners with keyless cosign enabled can skip `COSIGN_KEY_B64` and rely on `COSIGN_EXPERIMENTAL=1` + `COSIGN_FULCIO_URL`/`COSIGN_REKOR_URL`; keep offline jobs on key mode.
|
||||
- Rotate keys per environment; keep prod keys in Gitea secrets and staging keys in repo‑local `out/` for reproducibility.
|
||||
|
||||
## Verification quick check
|
||||
- To verify a policy blob from artifacts:
|
||||
```bash
|
||||
cosign verify-blob --key out/policy-sign/cosign.pub --signature out/policy-sign/baseline.stella.sig docs/examples/policies/baseline.stella
|
||||
cosign verify-blob-attestation --key out/policy-sign/cosign.pub --type stella.policy --bundle out/policy-sign/baseline.stella.attestation.sigstore docs/examples/policies/baseline.stella
|
||||
```
|
||||
|
||||
## Notes
|
||||
- All outputs are deterministic (UTC timestamps, fixed file names) to stay audit-friendly and offline-ready.
|
||||
- Attestation predicate captures filename + SHA256 + timestamp for traceability. Update predicate schema if promotion metadata expands.
|
||||
Reference in New Issue
Block a user