Speed up scratch image builds with publish-first contexts

This commit is contained in:
master
2026-03-09 07:37:24 +02:00
parent c9686edf07
commit f218ec82ec
8 changed files with 358 additions and 38 deletions

@@ -55,11 +55,19 @@ The scripts will:
3. Copy `env/stellaops.env.example` to `.env` if needed (works out of the box)
4. Start infrastructure and wait for healthy containers
5. Create or reuse the external frontdoor Docker network from `.env` (`FRONTDOOR_NETWORK`, default `stellaops_frontdoor`)
-6. Build .NET solutions and Docker images
+6. Build repo-owned .NET solutions, then publish backend services locally into small Docker contexts before building hardened runtime images (vendored dependency trees such as `node_modules` are excluded)
7. Launch the full platform with health checks
Open **https://stella-ops.local** when setup completes.
For targeted backend rebuilds after a scoped code change on Windows:
```powershell
.\devops\docker\build-all.ps1 -Services notify-web,orchestrator
```
This path avoids re-sending the full monorepo to Docker for every .NET service image.
## Manual path (step by step)
### 1. Environment file

@@ -31,6 +31,8 @@ Setup scripts validate prerequisites, build solutions and Docker images, and lau
The scripts will check for required tools (dotnet 10.x, node 20+, npm 10+, docker, git), warn about missing hosts file entries, and copy `.env` from the example if needed. See the manual steps below for details on each stage.
On Windows and Linux, the backend image builder now publishes each selected .NET service locally and builds the hardened runtime image from a small temporary context. That avoids repeatedly streaming the whole monorepo into Docker during scratch setup.
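
The publish-first flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `build-all.sh`: the function name `build_from_publish`, the `DRY_RUN` switch, and the idea of a runtime-only Dockerfile that simply copies `app/` into the image are all assumptions made for the example.

```bash
# Hypothetical sketch of a publish-first image build; not the real build-all.sh.
# DRY_RUN=1 prints the commands instead of running them.
build_from_publish() {
  local project="$1" image="$2" dockerfile="$3"
  local ctx rc
  ctx="$(mktemp -d)"

  # Publish compiled output into a small temporary directory...
  ${DRY_RUN:+echo} dotnet publish "$project" -c Release -o "$ctx/app" \
    || { rm -rf "$ctx"; return 1; }

  # ...and hand only that directory to Docker as the build context,
  # so the monorepo itself is never streamed to the daemon.
  ${DRY_RUN:+echo} docker build -f "$dockerfile" -t "$image" "$ctx"
  rc=$?

  rm -rf "$ctx"
  return "$rc"
}
```

Running `DRY_RUN=1 build_from_publish <project> <image> <dockerfile>` prints the two commands without executing them, which is a cheap way to check what would be sent to Docker.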
### Quick validation + demo seed (first-run path)
```powershell
@@ -197,6 +199,18 @@ dotnet test src\Scanner\StellaOps.Scanner.sln
./scripts/build-all-solutions.sh --test
```
### Targeted backend image rebuilds
```powershell
.\devops\docker\build-all.ps1 -Services notify-web,advisory-ai-web
```
```bash
SERVICES=notify-web,advisory-ai-web ./devops/docker/build-all.sh
```
Use this after scoped backend changes instead of re-running the full image matrix.
### Module solution index
See [`docs/dev/SOLUTION_BUILD_GUIDE.md`](SOLUTION_BUILD_GUIDE.md) for the authoritative list. Current modules (39):

@@ -5,7 +5,7 @@
- Treat the setup script itself as production surface: a clean repo plus docs must be enough to bootstrap the platform without manual script surgery.
- Re-run the clean setup path after the fix, then continue into Playwright-backed live verification on the rebuilt stack.
- Working directory: `devops/docker`.
-- Allowed coordination edits: `scripts/setup.ps1`, `scripts/setup.sh`, `devops/compose/docker-compose.stella-ops.yml`, `docs/quickstart.md`, `docs/INSTALL_GUIDE.md`, `devops/README.md`, `devops/compose/README.md`, `src/Web/StellaOps.Web/scripts/chrome-path.js`, `src/Web/StellaOps.Web/scripts/verify-chromium.js`, `docs/implplan/SPRINT_20260309_001_Platform_scratch_setup_bootstrap_restore.md`.
+- Allowed coordination edits: `scripts/setup.ps1`, `scripts/setup.sh`, `scripts/build-all-solutions.ps1`, `devops/compose/docker-compose.stella-ops.yml`, `docs/quickstart.md`, `docs/INSTALL_GUIDE.md`, `devops/README.md`, `devops/compose/README.md`, `src/Web/StellaOps.Web/scripts/chrome-path.js`, `src/Web/StellaOps.Web/scripts/verify-chromium.js`, `src/Authority/StellaOps.Authority.sln`, `src/Cli/StellaOps.Cli.sln`, `src/EvidenceLocker/StellaOps.EvidenceLocker.sln`, `src/Signals/StellaOps.Signals.sln`, `src/Tools/StellaOps.Tools.sln`, `src/Policy/StellaOps.Policy.engine.slnf`, `src/Policy/StellaOps.Policy.min.slnf`, `src/Policy/StellaOps.Policy.tests.slnf`, `src/Telemetry/StellaOps.Telemetry.Core/telemetry-tests.slnf`, `docs/implplan/SPRINT_20260309_001_Platform_scratch_setup_bootstrap_restore.md`.
- Expected evidence: clean setup invocation output, successful image-builder startup, rebuilt compose stack, and downstream Playwright verification artifacts.
## Dependencies & Concurrency
@@ -35,7 +35,7 @@ Completion criteria:
- [x] The fix preserves `REGISTRY`, `TAG_SUFFIX`, `SDK_IMAGE`, and `RUNTIME_IMAGE` overrides.
### PLATFORM-SETUP-002 - Re-run clean platform bootstrap and continue QA
-Status: DONE
+Status: DOING
Dependency: PLATFORM-SETUP-001
Owners: QA, Developer
Task description:
@@ -44,8 +44,21 @@ Task description:
Completion criteria:
- [x] The clean setup path is rerun from the repo script after the fix.
-- [x] The stack is reachable through `https://stella-ops.local`.
-- [x] The next live verification findings are captured for follow-on iterations.
+- [ ] The stack is reachable through `https://stella-ops.local`.
+- [ ] The next live verification findings are captured for follow-on iterations.
### PLATFORM-SETUP-003 - Repair scratch-bootstrap solution graph blockers
Status: DOING
Dependency: PLATFORM-SETUP-002
Owners: Developer
Task description:
- Fix the repo-level build graph defects exposed only by the documented full setup path after a complete Docker wipe. The fixes must preserve the canonical bootstrap workflow instead of bypassing it with `-SkipBuild`.
- Keep the repair limited to stale/corrupted solution metadata and bootstrap helper logic that prevents `scripts/setup.ps1` from completing from a clean repo state.
Completion criteria:
- [ ] `scripts/build-all-solutions.ps1` runs on this Windows host without PowerShell API compatibility errors.
- [ ] Broken solution entries discovered during the documented full setup are corrected in place.
- [ ] `scripts/setup.ps1` advances past the solution-build phase on an empty Docker state.
## Execution Log
| Date (UTC) | Update | Owner |
@@ -58,6 +71,8 @@ Completion criteria:
| 2026-03-09 | The next clean-start blocker was the external `FRONTDOOR_NETWORK` contract: a full Docker wipe removed `stellaops_frontdoor`, but neither setup script recreated it before `docker compose -f docker-compose.stella-ops.yml up -d`. Wired network creation into both setup scripts and updated the install docs to document the same manual prerequisite. | Developer |
| 2026-03-09 | Re-ran `scripts/setup.ps1 -SkipBuild -SkipImages` after the setup fixes and confirmed the stack came up cleanly on `https://stella-ops.local`; live Playwright auth also succeeded, proving the scratch bootstrap now reaches real browser-verifiable UI state. | Developer |
| 2026-03-09 | Demo seeding still exposed module migration debt (`no migration resources to consolidate` across several modules plus a duplicate `Unknowns` migration name). I did not treat that as a setup pass condition because the live frontdoor remained operable, but it remains a follow-on platform quality gap. | Developer |
| 2026-03-09 | Performed a full Docker wipe and reran the documented scratch bootstrap from zero state. Fixed additional repo bootstrap blockers exposed by the clean build matrix: stale `Authority`/`Cli`/`EvidenceLocker`/`Signals`/`Tools` solution references, `Tools` verifier project/test boundary drift, broken `Policy` and `Telemetry` solution filters, and unbounded solution discovery that recursed into frontend `node_modules` vendor samples. | Developer |
| 2026-03-09 | Investigated the next Windows bootstrap bottleneck: `devops/docker/build-all.ps1` still rebuilt every .NET service image from repo root, so Docker repeatedly transferred the monorepo into BuildKit during scratch setup. Reworked the builder to publish backend services locally into small temp contexts, kept the Angular console on its dedicated Dockerfile path, and threaded `--no-restore` through setup when the solution build already ran. | Developer |
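
The frontdoor network fix recorded in the log above amounts to a create-or-reuse check before `docker compose up`. A minimal sketch, assuming the network name comes from `FRONTDOOR_NETWORK` with the documented default; the helper name `ensure_network` is hypothetical and the real setup scripts may do this differently:

```bash
# Hypothetical helper: create the external frontdoor network only if it is missing.
ensure_network() {
  local name="${1:-${FRONTDOOR_NETWORK:-stellaops_frontdoor}}"

  # `docker network inspect` exits non-zero when the network does not exist.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "reusing $name"
  else
    docker network create "$name" >/dev/null && echo "created $name"
  fi
}
```

Because the check is idempotent, it is safe to call on every setup run, including the first one after a full Docker wipe.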
## Decisions & Risks
- Decision: repair the documented setup path first instead of working around it with ad hoc manual builds, because scratch bootstrap is part of the product surface for this mission.
@@ -66,6 +81,8 @@ Completion criteria:
- Decision: treat browser-binary discovery as part of the scratch-bootstrap contract because a clean rebuild is not complete until Playwright can attach to a browser for live verification.
- Decision: preserve the `jobengine` compose service name and `jobengine.stella-ops.local` alias for compatibility, but map it to the canonical `orchestrator` image names emitted by the Docker build matrix so scratch setup uses the images it just produced.
- Decision: the automated setup path now owns creation of the external frontdoor Docker network because that network is part of the documented default compose topology, and a scratch bootstrap should not depend on an undocumented pre-existing Docker artifact.
- Decision: `scripts/build-all-solutions.ps1` must build only repo-owned solution surfaces under `src/`; vendored dependency trees such as frontend `node_modules` are excluded because they are not Stella bootstrap contracts and can contain native/Visual Studio samples that are invalid under `dotnet build`.
- Decision: the canonical .NET image builder now uses local `dotnet publish` plus a runtime-only Docker context by default, because repo-root `docker build` repeated monorepo context transfer for every service and made scratch setup unreasonably slow on Windows.
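
The bounded solution discovery in the `build-all-solutions` decision above can be illustrated with a pruned `find`. This is a sketch under the assumption that solutions are matched by the `*.sln` extension; the function name `find_repo_solutions` is hypothetical, and the PowerShell script in the repo may implement the exclusion differently.

```bash
# Hypothetical sketch: list solution files under a root while never
# descending into vendored node_modules directories.
find_repo_solutions() {
  local root="$1"

  # -prune stops find from entering any directory named node_modules;
  # everything else is searched for *.sln files.
  find "$root" -type d -name node_modules -prune -o -type f -name '*.sln' -print | sort
}
```

For example, `find_repo_solutions src` would list `src/Scanner/StellaOps.Scanner.sln` but skip any sample solution that ships inside a frontend `node_modules` tree.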
## Next Checkpoints
- 2026-03-09: rerun `scripts/setup.ps1 -SkipBuild` after the parser fix.