move permanent sprints

StellaOps Bot
2026-01-05 19:17:32 +02:00
parent d7bdca6d97
commit 94d68bee8b
4 changed files with 0 additions and 0 deletions


@@ -0,0 +1,809 @@
# Sprint 20251229_006_CICD_full_pipeline_validation - Local CI Validation
## Topic & Scope
- Provide a deterministic, offline-friendly local CI validation runbook before commits land.
- Define pre-flight checks, tooling expectations, and pass criteria for full pipeline validation.
- Capture evidence and log locations for local CI runs.
- **Working directory:** devops/docs. Evidence: runbook updates and local CI execution logs.
## Dependencies & Concurrency
- Requires Docker and local CI compose services to be available.
- Can run in parallel with other sprints; only documentation updates required.
## Documentation Prerequisites
- docs/cicd/README.md
- docs/cicd/test-strategy.md
- docs/cicd/workflow-triggers.md
## Delivery Tracker
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | CICD-VAL-001 | TODO | Tooling inventory | DevOps · Docs | Publish required tool versions and install guidance. |
| 2 | CICD-VAL-002 | TODO | Compose setup | DevOps · Docs | Document local CI service bootstrap and health checks. |
| 3 | CICD-VAL-003 | TODO | Pass criteria | DevOps · Docs | Define pass/fail criteria and artifact collection paths. |
| 4 | CICD-VAL-004 | TODO | Offline guidance | DevOps · Docs | Add offline-safe steps and cache warmup notes. |
| 5 | CICD-VAL-005 | TODO | Verification | DevOps · Docs | Add validation checklist for PR readiness. |
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2025-12-29 | Sprint normalized to standard template; legacy content retained in appendix. | Planning |
| 2025-12-29 | REVERTED: Tasks incorrectly marked as DONE without verification; restored to TODO. | Implementer |
## Decisions & Risks
- Risk: local CI steps drift from pipeline definitions; mitigate with scheduled doc sync.
- Risk: offline constraints cause false negatives; mitigate with explicit cache priming steps.
## Next Checkpoints
- TBD: CI runbook review with DevOps owners.
## Appendix: Legacy Content
# Sprint 20251229-006 - Full Pipeline Validation Before Commit
## Topic & Scope
- Validate local CI/CD pipelines end-to-end before commit to keep remote CI green and reduce rework.
- Provide the local runbook for smoke, PR-gating, module-specific, workflow simulation, and extended validation.
- Capture pass criteria and tooling expectations for deterministic, offline-friendly validation.
- **Working directory:** Repository root (`.`). Evidence: local CI logs under `out/local-ci/` and `docker compose` health for CI services.
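The evidence capture mentioned above could be scripted roughly as follows. This is a minimal sketch: the `out/local-ci` source path comes from this runbook, while the function name `archive_evidence` and the `out/evidence` destination are hypothetical names chosen for illustration.

```bash
# Sketch only: snapshot the local CI output directory into a timestamped tarball.
# archive_evidence [source_dir] [dest_dir]; both defaults are illustrative.
archive_evidence() {
  local src="${1:-out/local-ci}" dest="${2:-out/evidence}" stamp archive
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  stamp=$(date -u +%Y%m%dT%H%M%SZ)            # UTC stamp keeps names sortable
  archive="$dest/local-ci-$stamp.tar.gz"
  mkdir -p "$dest"
  # -C keeps paths inside the archive relative (local-ci/logs/...).
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
  echo "$archive"
}
```

Calling `archive_evidence` from the repository root after a run prints the path of the archive it created.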
## Dependencies & Concurrency
- Requires Docker running and CI services from `devops/compose/docker-compose.ci.yaml`.
- No upstream sprints or shared artifacts; runs against local tooling only.
- CC decade: CI/CD validation only; safe to run in parallel with other sprints.
## Documentation Prerequisites
- [Local CI Guide](../testing/LOCAL_CI_GUIDE.md)
- [CI/CD Overview](../cicd/README.md)
- [Test Strategy](../cicd/test-strategy.md)
- [Workflow Triggers](../cicd/workflow-triggers.md)
- [Path Filters](../cicd/path-filters.md)
## Execution Runbook
### Pre-Flight Checklist
#### Required Tools
| Tool | Version | Check Command | Install |
|------|---------|---------------|---------|
| **.NET SDK** | 10.0+ | `dotnet --version` | https://dot.net/download |
| **Docker** | 24.0+ | `docker --version` | https://docker.com |
| **Git** | 2.40+ | `git --version` | https://git-scm.com |
| **Bash** | 4.0+ | `bash --version` | Native (Linux/macOS) or Git Bash (Windows) |
| **act** (optional) | 0.2.50+ | `act --version` | `brew install act` or https://github.com/nektos/act |
| **Helm** (optional) | 3.14+ | `helm version` | https://helm.sh |
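The version checks in the table above can be automated with a small helper. The sketch below is an illustration, not part of `local-ci.sh`: the function names (`extract_version`, `version_ge`, `check_tool`) are hypothetical, and it assumes each tool prints a dotted numeric version somewhere in its output.

```bash
# Sketch only: compare installed tool versions against the minimums in the table.

# Pull the first dotted number (e.g. 24.0.7) out of whatever arrives on stdin.
extract_version() {
  grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# True when dotted version $1 >= $2; fields are compared numerically, left to right.
version_ge() {
  local have="$1" want="$2" h w
  while [ -n "$have" ] || [ -n "$want" ]; do
    h=${have%%.*}; w=${want%%.*}
    [ "${h:-0}" -gt "${w:-0}" ] && return 0
    [ "${h:-0}" -lt "${w:-0}" ] && return 1
    # Drop the field just compared; an exhausted version becomes empty.
    case $have in *.*) have=${have#*.} ;; *) have= ;; esac
    case $want in *.*) want=${want#*.} ;; *) want= ;; esac
  done
  return 0
}

# check_tool <name> <min_version> <version command...>
check_tool() {
  local name="$1" min="$2" found; shift 2
  found=$("$@" 2>/dev/null | extract_version) || true
  if [ -z "$found" ]; then
    echo "missing: $name (need $min+)"; return 1
  elif version_ge "$found" "$min"; then
    echo "ok: $name $found"
  else
    echo "too old: $name $found (need $min+)"; return 1
  fi
}
```

For example, `check_tool docker 24.0 docker --version` and `check_tool git 2.40 git --version` cover two rows of the table. The comparison is numeric per field, so `1.10.0` counts as newer than `1.2`.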
#### Optional Tooling: act installation
`act` runs CI workflows locally using Docker. Install it once, then ensure your shell can find it.
**Windows 11 (PowerShell):**
```powershell
winget install --id nektos.act -e
# Restart PowerShell, then verify:
act --version
```
If `act` is still not found, confirm PATH resolution:
```powershell
where.exe act
Get-Command act
```
**WSL (Ubuntu):**
```bash
curl -L https://github.com/nektos/act/releases/download/v0.2.61/act_Linux_x86_64.tar.gz | tar -xz
sudo mv act /usr/local/bin/act
act --version
```
#### Environment Setup
```bash
# 1. Copy environment template (first time only)
cp devops/ci-local/.env.local.sample devops/ci-local/.env.local
# 2. Verify Docker is running
docker info
# 3. Start CI services
docker compose -f devops/compose/docker-compose.ci.yaml up -d
# 4. Check service status; re-run until every service reports healthy
docker compose -f devops/compose/docker-compose.ci.yaml ps
```
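`docker compose ps` only reports the current state, so step 4 may need repeating. A polling helper along the following lines is one option. It is a sketch under assumptions: `wait_until` and `postgres_healthy` are hypothetical names, the container name `postgres-ci` is taken from the troubleshooting commands in this runbook, and the check assumes that container defines a Docker health check.

```bash
# Sketch only: poll a command until it succeeds or a timeout elapses.
# wait_until <timeout_seconds> <interval_seconds> <command...>
wait_until() {
  local timeout="$1" interval="$2" waited=0; shift 2
  until "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Assumption: the postgres-ci container has a HEALTHCHECK configured.
postgres_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' postgres-ci 2>/dev/null)" = "healthy" ]
}
```

A typical call would be `wait_until 120 5 postgres_healthy`. Recent Docker Compose v2 releases also accept `up -d --wait`, which blocks until services are running or healthy and may make the loop unnecessary.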
### Execution Plan
#### Phase 1: Quick Validation (~5 min)
```bash
# Run smoke test - catches basic compilation and unit test failures
./devops/scripts/local-ci.sh smoke
```
If smoke hangs, split it into smaller steps:
```bash
# Build only
./devops/scripts/local-ci.sh smoke --smoke-step build
# Unit tests only (single solution run)
./devops/scripts/local-ci.sh smoke --smoke-step unit
# Unit tests per project (pinpoint hangs)
./devops/scripts/local-ci.sh smoke --smoke-step unit-split
# Unit tests per project with hang detection + heartbeat
./devops/scripts/local-ci.sh smoke --smoke-step unit-split --test-timeout 5m --progress-interval 60
# Unit tests per project in slices
./devops/scripts/local-ci.sh smoke --smoke-step unit-split --project-start 1 --project-count 50
```
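The slicing flags above lend themselves to a loop that walks the whole project list in fixed-size windows. The following is a sketch with hypothetical helper names (`slice_starts`, `run_unit_slices`). The default of 302 projects is the count recorded in this sprint's execution log and will drift as test projects are added.

```bash
# Sketch only: iterate unit-split slices of a fixed size.

# Print the 1-based start index of each slice covering <total> projects.
slice_starts() {
  local total="$1" size="$2" start=1
  while [ "$start" -le "$total" ]; do
    echo "$start"
    start=$((start + size))
  done
}

# Run every slice in order and stop at the first failing one.
run_unit_slices() {
  local total="${1:-302}" size="${2:-50}" start
  for start in $(slice_starts "$total" "$size"); do
    echo "== slice starting at project $start =="
    ./devops/scripts/local-ci.sh smoke --smoke-step unit-split \
      --project-start "$start" --project-count "$size" || return 1
  done
}
```

Stopping at the first failing slice keeps the hang or failure window small; re-running that slice with a smaller `--project-count` narrows it further.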
**What this validates:**
- [x] Solution compiles
- [x] Unit tests pass
- [x] No breaking syntax errors
**Pass Criteria:** Exit code 0
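Because the pass criterion is simply the exit code, a thin wrapper can record the result and keep the output for later review. This is a minimal sketch: `run_gate`, the `GATE_LOG_DIR` variable, and the `out/local-ci/gates` directory are hypothetical and separate from whatever logs `local-ci.sh` writes itself.

```bash
# Sketch only: run a command, save its output, and report PASS/FAIL from the exit code.
# run_gate <name> <command...>
run_gate() {
  local name="$1" log_dir="${GATE_LOG_DIR:-out/local-ci/gates}" status=0; shift
  mkdir -p "$log_dir"
  "$@" >"$log_dir/$name.log" 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "PASS $name"
  else
    echo "FAIL $name (exit $status, see $log_dir/$name.log)"
  fi
  return "$status"   # propagate the original exit code to the caller
}
```

For example, `run_gate smoke ./devops/scripts/local-ci.sh smoke && git commit` only commits when the smoke run exits with 0.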
---
#### Phase 2: Full PR-Gating Suite (~15 min)
```bash
# Run complete PR-gating validation
./devops/scripts/local-ci.sh pr
```
**Test Categories Executed:**
| Category | Description | Duration |
|----------|-------------|----------|
| **Unit** | Component isolation tests | ~3 min |
| **Architecture** | Dependency and layering rules | ~2 min |
| **Contract** | API compatibility validation | ~2 min |
| **Integration** | Database and service tests | ~8 min |
| **Security** | Security assertion tests | ~3 min |
| **Golden** | Corpus-based regression tests | ~3 min |
**Pass Criteria:** All 6 categories green
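To see every failing category in one pass rather than stopping at the first, the six categories can be run individually. The sketch below assumes `--category` accepts each name in the table; this runbook shows it with Unit, Integration, Security, Contract, and Architecture, so `Golden` is an assumption. `run_categories` and `PR_CATEGORIES` are hypothetical names.

```bash
# Sketch only: run each PR-gating category separately and list the ones that fail.
PR_CATEGORIES="Unit Architecture Contract Integration Security Golden"

# run_categories <command prefix...>; "--category <name>" is appended per run.
run_categories() {
  local failed="" c
  for c in $PR_CATEGORIES; do
    "$@" --category "$c" || failed="$failed $c"
  done
  if [ -n "$failed" ]; then
    echo "FAILED:$failed"
    return 1
  fi
  echo "all categories green"
}
```

Usage: `run_categories ./devops/scripts/local-ci.sh pr`. A plain `local-ci.sh pr` remains the reference check; this wrapper only changes how failures are reported.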
---
#### Phase 3: Module-Specific Validation
If you've modified specific modules, run targeted tests:
```bash
# Auto-detect changed modules (compares with main branch)
./devops/scripts/local-ci.sh module
# Or test specific module
./devops/scripts/local-ci.sh module --module Scanner
./devops/scripts/local-ci.sh module --module Concelier
./devops/scripts/local-ci.sh module --module Authority
./devops/scripts/local-ci.sh module --module Policy
```
**Available Modules:**
| Module Group | Modules |
|--------------|---------|
| **Core Platform** | Authority, Gateway, Router |
| **Data Ingestion** | Concelier, Excititor, Feedser, Mirror, IssuerDirectory |
| **Scanning** | Scanner, BinaryIndex, AdvisoryAI, ReachGraph, Symbols |
| **Artifacts** | Attestor, Signer, SbomService, EvidenceLocker, ExportCenter, Provenance |
| **Policy & Risk** | Policy, RiskEngine, VulnExplorer, Unknowns |
| **Operations** | Scheduler, Orchestrator, TaskRunner, Notify, Notifier, PacksRegistry |
| **Infrastructure** | Cryptography, Telemetry, Graph, Signals, AirGap, Aoc |
| **Integration** | CLI, Zastava, Web, API |
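The auto-detection logic inside `local-ci.sh module` is not shown in this runbook. As a rough illustration of the idea, changed modules can be derived from changed paths, assuming each module lives under `src/<Module>/` (directory and module names may differ in case, for example `src/Cli` versus `CLI`). The function names here are hypothetical.

```bash
# Sketch only: map changed file paths to top-level module directories.

# Read paths on stdin and print the unique first directory under src/.
modules_from_paths() {
  sed -n 's#^src/\([^/][^/]*\)/.*#\1#p' | sort -u
}

# List modules touched on the current branch relative to a base (default: main).
changed_modules() {
  git diff --name-only "${1:-main}...HEAD" | modules_from_paths
}
```

Each name printed by `changed_modules` could then be passed to `./devops/scripts/local-ci.sh module --module <name>`.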
---
#### Phase 4: Workflow Simulation
Simulate specific CI workflows using `act`:
```bash
# Simulate test-matrix workflow
./devops/scripts/local-ci.sh workflow --workflow test-matrix
# Simulate build-test-deploy workflow
./devops/scripts/local-ci.sh workflow --workflow build-test-deploy
# Simulate determinism-gate workflow
./devops/scripts/local-ci.sh workflow --workflow determinism-gate
```
**Note:** Requires `act` to be installed and the CI Docker image to be built.
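Both prerequisites can be checked before starting a simulation. This is a sketch with hypothetical helper names; the image tag `stellaops-ci:local` is the one used by the build command in the Quick Command Reference.

```bash
# Sketch only: verify workflow-simulation prerequisites.

# Print each command that is not on PATH; return non-zero if any is missing.
missing_commands() {
  local cmd missing=0
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || { echo "$cmd"; missing=1; }
  done
  return "$missing"
}

# True when the CI image exists in the local Docker image store.
ci_image_present() {
  docker image inspect "${1:-stellaops-ci:local}" >/dev/null 2>&1
}
```

For example, `missing_commands act docker && ci_image_present || echo "prerequisites not met"` gives a quick go/no-go before running `local-ci.sh workflow`.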
---
#### Phase 5: Web/Angular UI Testing (~10 min)
If you've modified the Angular web application (`src/Web/**`):
```bash
# Run Web module tests
./devops/scripts/local-ci.sh module --module Web
# Or run Web tests as part of PR check
./devops/scripts/local-ci.sh pr --category Web
```
**Web Test Types:**
| Test Type | Command | Duration | Description |
|-----------|---------|----------|-------------|
| **Unit** | `npm run test:ci` | ~3 min | Karma/Jasmine component tests |
| **E2E** | `npm run test:e2e` | ~5 min | Playwright end-to-end tests |
| **A11y** | `npm run test:a11y` | ~2 min | Axe accessibility checks |
| **Build** | `npm run build` | ~2 min | Production bundle build |
| **Storybook** | `npm run storybook:build` | ~3 min | Component library build |
**Direct npm commands (from `src/Web/StellaOps.Web/`):**
```bash
cd src/Web/StellaOps.Web
# Install dependencies
npm ci
# Unit tests
npm run test:ci
# E2E tests (requires Playwright browsers)
npx playwright install --with-deps chromium
npm run test:e2e
# Accessibility tests
npm run test:a11y
# Production build
npm run build -- --configuration production
```
---
#### Phase 6: Extended Validation (Optional, ~45 min)
For comprehensive validation before major releases:
```bash
# Run full test suite including extended categories
./devops/scripts/local-ci.sh full
```
**Extended Categories:**
| Category | Purpose | Duration |
|----------|---------|----------|
| **Performance** | Latency and throughput | ~20 min |
| **Benchmark** | BenchmarkDotNet runs | ~30 min |
| **AirGap** | Offline operation | ~15 min |
| **Chaos** | Resilience testing | ~20 min |
| **Determinism** | Reproducibility | ~15 min |
| **Resilience** | Fault tolerance | ~10 min |
| **Observability** | Metrics and traces | ~10 min |
| **Web-Lighthouse** | Performance/a11y audit | ~5 min |
---
### Workflow Classification Matrix
#### Tier 1: PR-Gating (Always Run Before Commit)
These workflows run on every PR and MUST pass:
| Workflow | Purpose | Local Command |
|----------|---------|---------------|
| `test-matrix.yml` | Unified test execution | `./local-ci.sh pr` |
| `build-test-deploy.yml` | Main build pipeline | `./local-ci.sh pr` |
| `console-ci.yml` | Web UI lint/test/build | `./local-ci.sh module --module Web` |
| `determinism-gate.yml` | Reproducibility gate | `./local-ci.sh pr --category Determinism` |
| `policy-lint.yml` | Policy validation | `dotnet test --filter "Category=Policy"` |
| `sast-scan.yml` | Static analysis | `./local-ci.sh pr --category Security` |
| `secrets-scan.yml` | Secrets detection | `./local-ci.sh pr --category Security` |
| `schema-validation.yml` | Schema checks | `./local-ci.sh pr --category Contract` |
| `integration-tests-gate.yml` | Integration gate | `./local-ci.sh pr --category Integration` |
| `aoc-guard.yml` | Append-only contract | `./local-ci.sh pr --category Architecture` |
| `license-audit.yml` | License compliance | Manual check |
| `dependency-license-gate.yml` | License gate | Manual check |
| `dependency-security-scan.yml` | Dependency security | Manual check |
| `container-scan.yml` | Container security | `docker scout cves <image>` (`docker scan` on older Docker releases) |
#### Tier 2: Module-Specific
Run when modifying specific modules:
| Workflow | Module | Local Command |
|----------|--------|---------------|
| `scanner-analyzers.yml` | Scanner | `./local-ci.sh module --module Scanner` |
| `scanner-determinism.yml` | Scanner | `./local-ci.sh module --module Scanner` |
| `scanner-analyzers-release.yml` | Scanner | `./local-ci.sh release --dry-run` |
| `concelier-attestation-tests.yml` | Concelier | `./local-ci.sh module --module Concelier` |
| `concelier-store-aoc-19-005.yml` | Concelier | `./local-ci.sh module --module Concelier` |
| `authority-key-rotation.yml` | Authority | `./local-ci.sh module --module Authority` |
| `signals-ci.yml` | Signals | `./local-ci.sh module --module Signals` |
| `signals-dsse-sign.yml` | Signals | `./local-ci.sh module --module Signals` |
| `signals-evidence-locker.yml` | Signals | `./local-ci.sh module --module Signals` |
| `signals-reachability.yml` | Signals | `./local-ci.sh module --module Signals` |
| `symbols-ci.yml` | Symbols | `./local-ci.sh module --module Symbols` |
| `symbols-release.yml` | Symbols | `./local-ci.sh release --dry-run` |
| `cli-build.yml` | CLI | `dotnet publish src/Cli/StellaOps.Cli` |
| `cli-chaos-parity.yml` | CLI | `./local-ci.sh module --module CLI` |
| `findings-ledger-ci.yml` | Findings | `./local-ci.sh module --module Findings` |
| `ledger-packs-ci.yml` | Findings | `./local-ci.sh module --module Findings` |
| `ledger-oas-ci.yml` | Findings | `./local-ci.sh module --module Findings` |
| `console-ci.yml` | Console | `./local-ci.sh module --module Console` |
| `export-ci.yml` | ExportCenter | `./local-ci.sh module --module ExportCenter` |
| `export-compat.yml` | ExportCenter | `./local-ci.sh module --module ExportCenter` |
| `exporter-ci.yml` | ExportCenter | `./local-ci.sh module --module ExportCenter` |
| `notify-smoke-test.yml` | Notify | `./local-ci.sh module --module Notify` |
| `policy-simulate.yml` | Policy | `./local-ci.sh module --module Policy` |
| `risk-bundle-ci.yml` | RiskEngine | `./local-ci.sh module --module RiskEngine` |
| `graph-load.yml` | Graph | `./local-ci.sh module --module Graph` |
| `graph-ui-sim.yml` | Graph | `./local-ci.sh module --module Graph` |
| `router-chaos.yml` | Router | `./local-ci.sh module --module Router` |
| `obs-stream.yml` | Observability | `./local-ci.sh full --category Observability` |
| `obs-slo.yml` | Observability | `./local-ci.sh full --category Observability` |
| `lighthouse-ci.yml` | Web Performance/A11y | `cd src/Web/StellaOps.Web && npm run build` |
#### Tier 3: Extended Validation
Run for comprehensive testing:
| Workflow | Purpose | Local Command |
|----------|---------|---------------|
| `benchmark-vs-competitors.yml` | Performance comparison | `./local-ci.sh full --category Benchmark` |
| `bench-determinism.yml` | Determinism benchmarks | `./local-ci.sh full --category Determinism` |
| `cross-platform-determinism.yml` | Cross-OS determinism | Requires multi-platform |
| `e2e-reproducibility.yml` | End-to-end reproducibility | `./local-ci.sh full` |
| `parity-tests.yml` | Parity validation | `./local-ci.sh full` |
| `epss-ingest-perf.yml` | EPSS performance | `./local-ci.sh full --category Performance` |
| `reachability-corpus-ci.yml` | Reachability corpus | `./local-ci.sh full` |
| `offline-e2e.yml` | Offline end-to-end | `./local-ci.sh full --category AirGap` |
| `airgap-sealed-ci.yml` | Air-gap sealed tests | `./local-ci.sh full --category AirGap` |
| `interop-e2e.yml` | Interoperability | `./local-ci.sh full` |
| `nightly-regression.yml` | Nightly regression | `./local-ci.sh full` |
| `migration-test.yml` | Database migrations | `./local-ci.sh pr --category Integration` |
#### Tier 4: Release Pipelines (Dry-Run Only)
```bash
# Always use --dry-run for release pipelines
./devops/scripts/local-ci.sh release --dry-run
```
| Workflow | Purpose |
|----------|---------|
| `release-suite.yml` | Full suite release |
| `release.yml` | Release automation |
| `release-keyless-sign.yml` | Keyless signing |
| `release-manifest-verify.yml` | Manifest verification |
| `release-validation.yml` | Release validation |
| `service-release.yml` | Service release |
| `module-publish.yml` | Module publishing |
| `sdk-publish.yml` | SDK publishing |
| `sdk-generator.yml` | SDK generation |
| `promote.yml` | Promotion pipeline |
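The dry-run rule for release pipelines can be enforced with a small guard so that a real release is never started locally by accident. This is a sketch: `release_local` and the `LOCAL_CI` override variable are hypothetical and exist only for illustration.

```bash
# Sketch only: refuse to invoke the release command unless --dry-run is present.
release_local() {
  local arg
  for arg in "$@"; do
    if [ "$arg" = "--dry-run" ]; then
      "${LOCAL_CI:-./devops/scripts/local-ci.sh}" release "$@"
      return   # propagate the exit code of the release command
    fi
  done
  echo "refusing to run 'release' without --dry-run" >&2
  return 2
}
```

`release_local --dry-run` forwards to the script unchanged, while `release_local` on its own exits with status 2.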
#### Tier 5: Infrastructure & DevOps
| Workflow | Purpose | When to Run |
|----------|---------|-------------|
| `docs.yml` | Documentation | Changes in `docs/` |
| `api-governance.yml` | API governance | Changes in `src/Api/` |
| `oas-ci.yml` | OpenAPI validation | Changes in `src/Api/` |
| `containers-multiarch.yml` | Multi-arch builds | Dockerfile changes |
| `docker-regional-builds.yml` | Regional builds | Dockerfile changes |
| `console-runner-image.yml` | Runner image | Runner changes |
| `crypto-compliance.yml` | Crypto compliance | Crypto module changes |
| `crypto-sim-smoke.yml` | Crypto smoke | Crypto module changes |
| `cryptopro-linux-csp.yml` | CryptoPro tests | CryptoPro changes |
| `cryptopro-optin.yml` | CryptoPro opt-in | CryptoPro changes |
| `sm-remote-ci.yml` | SM crypto | SM changes |
| `lighthouse-ci.yml` | Frontend performance | Web changes |
| `devportal-offline.yml` | DevPortal offline | Portal changes |
| `renovate.yml` | Dependency updates | Automated |
| `rollback.yml` | Rollback automation | Emergency only |
#### Tier 6: Specialized Pipelines
| Workflow | Purpose | Notes |
|----------|---------|-------|
| `artifact-signing.yml` | Artifact signing | Requires signing keys |
| `attestation-bundle.yml` | Attestation bundles | Requires keys |
| `connector-fixture-drift.yml` | Connector drift | External data |
| `deploy-keyless-verify.yml` | Deploy verification | Production only |
| `evidence-locker.yml` | Evidence locker | Full E2E |
| `icscisa-kisa-refresh.yml` | ICS/KISA refresh | External feeds |
| `lnm-backfill.yml` | LNM backfill | Data migration |
| `lnm-migration-ci.yml` | LNM migration | Data migration |
| `lnm-vex-backfill.yml` | VEX backfill | Data migration |
| `manifest-integrity.yml` | Manifest integrity | Release gate |
| `mirror-sign.yml` | Mirror signing | Requires keys |
| `mock-dev-release.yml` | Mock release | Development |
| `provenance-check.yml` | Provenance | Release gate |
| `replay-verification.yml` | Replay verify | Determinism |
| `test-lanes.yml` | Test lanes | Matrix tests |
| `vex-proof-bundles.yml` | VEX bundles | VEX tests |
| `aoc-backfill-release.yml` | AOC backfill | Data migration |
| `unknowns-budget-gate.yml` | Unknowns budget | Policy gate |
---
### Validation Checklist
#### Before Every Commit
- [ ] **Smoke test passes:** `./devops/scripts/local-ci.sh smoke`
- [ ] **No uncommitted changes after build:** `git status` shows clean (except intended changes)
- [ ] **Linting passes:** No warnings-as-errors violations
#### Before Opening PR
- [ ] **PR-gating suite passes:** `./devops/scripts/local-ci.sh pr`
- [ ] **Module tests pass:** `./devops/scripts/local-ci.sh module`
- [ ] **No merge conflicts:** Branch is rebased on main
- [ ] **Commit messages follow convention:** Brief, imperative mood
#### Before Major Changes
- [ ] **Full test suite passes:** `./devops/scripts/local-ci.sh full`
- [ ] **Determinism tests pass:** `./devops/scripts/local-ci.sh pr --category Determinism`
- [ ] **Integration tests pass:** `./devops/scripts/local-ci.sh pr --category Integration`
- [ ] **Security tests pass:** `./devops/scripts/local-ci.sh pr --category Security`
#### Before Release
- [ ] **Release dry-run succeeds:** `./devops/scripts/local-ci.sh release --dry-run`
- [ ] **All workflows simulated:** Critical workflows tested via act
- [ ] **Helm chart validates:** `helm lint devops/helm/stellaops`
- [ ] **Docker Compose validates:** `./devops/scripts/validate-compose.sh`
---
### Quick Command Reference
#### Essential Commands
```bash
# Quick validation (always run before commit)
./devops/scripts/local-ci.sh smoke
# Full PR check (run before opening PR)
./devops/scripts/local-ci.sh pr
# Test only what you changed
./devops/scripts/local-ci.sh module
# Verbose output for debugging
./devops/scripts/local-ci.sh pr --verbose
# Dry-run to see what would happen
./devops/scripts/local-ci.sh pr --dry-run
# Single category
./devops/scripts/local-ci.sh pr --category Unit
./devops/scripts/local-ci.sh pr --category Integration
./devops/scripts/local-ci.sh pr --category Security
```
#### Windows (PowerShell)
```powershell
# Quick validation
.\devops\scripts\local-ci.ps1 smoke
# Smoke steps (isolate hangs)
.\devops\scripts\local-ci.ps1 smoke -SmokeStep build
.\devops\scripts\local-ci.ps1 smoke -SmokeStep unit
.\devops\scripts\local-ci.ps1 smoke -SmokeStep unit-split
.\devops\scripts\local-ci.ps1 smoke -SmokeStep unit-split -TestTimeout 5m -ProgressInterval 60
.\devops\scripts\local-ci.ps1 smoke -SmokeStep unit-split -ProjectStart 1 -ProjectCount 50
# Full PR check
.\devops\scripts\local-ci.ps1 pr
# With options
.\devops\scripts\local-ci.ps1 pr -Verbose -Docker
```
#### Service Management
```bash
# Start CI services
docker compose -f devops/compose/docker-compose.ci.yaml up -d
# Check service health
docker compose -f devops/compose/docker-compose.ci.yaml ps
# View logs
docker compose -f devops/compose/docker-compose.ci.yaml logs -f
# Stop services
docker compose -f devops/compose/docker-compose.ci.yaml down
# Full reset (remove volumes)
docker compose -f devops/compose/docker-compose.ci.yaml down -v
```
#### Workflow Simulation
```bash
# Build CI image for act
docker build -t stellaops-ci:local -f devops/docker/Dockerfile.ci .
# List available workflows
ls .gitea/workflows/*.yml | xargs -n1 basename
# Simulate workflow
./devops/scripts/local-ci.sh workflow --workflow test-matrix
# Dry-run workflow
act -n -W .gitea/workflows/test-matrix.yml
```
---
### Troubleshooting
#### Build Failures
```bash
# Clean and rebuild
dotnet clean src/StellaOps.sln
dotnet build src/StellaOps.sln
# Check .NET SDK
dotnet --info
# Restore packages
dotnet restore src/StellaOps.sln
```
If you hit NuGet 429 rate limiting from `git.stella-ops.org`, slow client requests:
```powershell
# PowerShell (before running local CI)
$env:NUGET_MAX_HTTP_REQUESTS = "4"
dotnet restore --disable-parallel
```
```bash
# Bash/WSL
export NUGET_MAX_HTTP_REQUESTS=4
dotnet restore --disable-parallel
```
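If throttling alone is not enough, the restore can be retried with an increasing delay. The helper below is a generic sketch; `retry_with_backoff` is a hypothetical name, and the attempt count and delays are illustrative rather than tuned for the `git.stella-ops.org` feed.

```bash
# Sketch only: retry a command with exponential backoff.
# retry_with_backoff <attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local attempts="$1" delay="$2" try=1; shift 2
  until "$@"; do
    [ "$try" -ge "$attempts" ] && return 1
    echo "attempt $try failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))   # double the wait after each failure
    try=$((try + 1))
  done
}
```

For example, after `export NUGET_MAX_HTTP_REQUESTS=4`, running `retry_with_backoff 4 15 dotnet restore src/StellaOps.sln --disable-parallel` waits 15, 30, then 45 seconds less than a minute apart at most, and gives up after the fourth failed attempt.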
#### Test Failures
```bash
# Run with verbose output
./devops/scripts/local-ci.sh pr --verbose
# Run single category
./devops/scripts/local-ci.sh pr --category Unit
# Split Unit tests to isolate hangs
./devops/scripts/local-ci.sh smoke --smoke-step unit-split
# Check which project is currently running
cat out/local-ci/active-test.txt
# View test log
cat out/local-ci/logs/Unit-*.log
# Run specific test
dotnet test --filter "FullyQualifiedName~TestMethodName" --verbosity detailed
```
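When many projects run, it helps to list only the logs that contain a failure marker. The sketch below assumes logs sit under `out/local-ci/logs` as shown above and that failing runs contain the text `Failed!`, which is how the classic `dotnet test` summary line begins; other runners may print something different, so the pattern is a parameter. `failed_logs` is a hypothetical name.

```bash
# Sketch only: list log files that contain a failure marker.
# failed_logs [log_dir] [fixed-string pattern]
failed_logs() {
  local dir="${1:-out/local-ci/logs}" pattern="${2:-Failed!}"
  # -l prints matching file names only; -F treats the pattern as a literal string.
  grep -l -F -- "$pattern" "$dir"/*.log 2>/dev/null | sort
}
```

Running `failed_logs` from the repository root prints one path per failing log, or nothing when no log matches.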
#### Docker Issues
```bash
# Check Docker
docker info
# Reset CI services
docker compose -f devops/compose/docker-compose.ci.yaml down -v
# Rebuild CI image
docker build --no-cache -t stellaops-ci:local -f devops/docker/Dockerfile.ci .
# Check container logs
docker compose -f devops/compose/docker-compose.ci.yaml logs postgres-ci
```
#### Act Issues
```bash
# Check act installation
act --version
# List available workflows
act -l
# Dry-run workflow
act -n pull_request -W .gitea/workflows/test-matrix.yml
# Debug mode
act --verbose pull_request
```
#### Windows-Specific
```powershell
# Check WSL
wsl --status
# Install WSL if needed
wsl --install
# Use Git Bash
& "C:\Program Files\Git\bin\bash.exe" devops/scripts/local-ci.sh smoke
```
#### Database Connection
```bash
# Check PostgreSQL is running
docker compose -f devops/compose/docker-compose.ci.yaml ps postgres-ci
# Test connection
docker exec -it postgres-ci psql -U stellaops_ci -d stellaops_test -c "SELECT 1"
# View PostgreSQL logs
docker compose -f devops/compose/docker-compose.ci.yaml logs postgres-ci
```
---
## Delivery Tracker
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
|---|---------|--------|----------------------------|--------|-----------------|
| 1 | VAL-SMOKE-001 | DOING | Unit-split slices 1-302 complete; AirGap bundle/persistence fixes applied; re-run smoke pending (see Execution Log + `out/local-ci/logs`) | Developer | Run smoke tests |
| 2 | VAL-PR-001 | BLOCKED | Smoke unit-split still in progress; start CI services once smoke completes | Developer | Run PR-gating suite |
| 3 | VAL-MODULE-001 | BLOCKED | Smoke/PR pending; run module tests after PR-gating or targeted failures | Developer | Run module-specific tests |
| 4 | VAL-WORKFLOW-001 | BLOCKED | `act` installed (WSL ok); build CI image | Developer | Simulate critical workflows |
| 5 | VAL-RELEASE-001 | BLOCKED | Build succeeds; release config present | Developer | Run release dry-run |
| 6 | VAL-FULL-001 | BLOCKED | Build succeeds; allocate extended time | Developer | Run full test suite (if major changes) |
## Execution Log
| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-29 | Created sprint for full pipeline validation before commit | DevOps |
| 2025-12-29 | Renamed sprint to SPRINT_20251229_006_CICD_full_pipeline_validation.md and normalized to standard template; no semantic changes. | Planning |
| 2025-12-29 | Started VAL-SMOKE-001; running pre-flight tool checks. | DevOps |
| 2025-12-29 | Smoke run failed during build: NuGet restore returned 429 (Too Many Requests) from git.stella-ops.org feeds. | DevOps |
| 2025-12-29 | Docker Desktop service stopped; Start-Service failed (permission), blocking service-backed tests. | DevOps |
| 2025-12-29 | `act` not installed; workflow simulation blocked. | DevOps |
| 2025-12-29 | Docker Desktop running; `docker info` succeeded. | DevOps |
| 2025-12-29 | `act` installed in WSL; Windows install requires shell restart to pick up PATH. | DevOps |
| 2025-12-29 | Retrying smoke with throttled NuGet restore (`NUGET_MAX_HTTP_REQUESTS`, `--disable-parallel`). | DevOps |
| 2025-12-29 | NuGet restore succeeded with throttling; smoke build failed on Router transport plugin types and Verdict API compile errors. | DevOps |
| 2025-12-29 | `act` resolves in both Windows and WSL; run from repo root and point to `.gitea/workflows`. | DevOps |
| 2025-12-29 | Smoke run stalled >1h; Unit log shows failures in Scheduler stream SSE test and Signer canonical payload test; run still active in `dotnet test`. | DevOps |
| 2025-12-29 | Stopped hung smoke run to unblock targeted fixes/tests. | DevOps |
| 2025-12-29 | Implemented fixes: Scheduler stream test avoids overlapping reads; canonical JSON writer uses relaxed escaping for DSSE payloadType. Smoke re-run pending. | DevOps |
| 2025-12-29 | Targeted tests passed: `RunEndpointTests.StreamRunEmitsInitialEvent` and `CanonicalPayloadDeterminismTests.DsseEnvelope_CanonicalBytes_PayloadTypePreserved`. | DevOps |
| 2025-12-29 | Added smoke step support (`--smoke-step`) and updated runbook/guide to split smoke runs for hang isolation. | DevOps |
| 2025-12-29 | Added per-test timeout + progress heartbeat for unit-split; active test marker added to pinpoint hang location. | DevOps |
| 2025-12-29 | Smoke build step completed successfully (~2m49s); NU1507 warnings observed. | DevOps |
| 2025-12-29 | Unit-split first project (AdvisoryAI) failed 2 tests; subsequent unit-split run progressed but remained slow; user aborted after ~13 min. | DevOps |
| 2025-12-29 | Added unit-split slicing (`--project-start`, `--project-count`) to narrow hang windows faster. | DevOps |
| 2025-12-29 | Fixed AdvisoryAI unit tests (authority + verdict stubs) and re-ran `StellaOps.AdvisoryAI.Tests` (Category=Unit) successfully. | DevOps |
| 2025-12-29 | Added xUnit v3 test SDK + VS runner via `src/Directory.Build.props` to prevent testhost/test discovery failures; `StellaOps.Aoc.AspNetCore.Tests` now passes. | DevOps |
| 2025-12-29 | Unit-split slice 1-10: initial failure in `StellaOps.Aoc.AspNetCore.Tests` resolved; slice 11-20 passed. | DevOps |
| 2025-12-29 | `dotnet build src/StellaOps.sln` initially failed due to locked `testhost` processes; stopped `testhost` and rebuild succeeded (warnings only). | DevOps |
| 2025-12-29 | Unit-split slice 21-30 failed in `StellaOps.Attestor.Types.Tests` due to SchemaRegistry overwrite. | DevOps |
| 2025-12-29 | Fixed SmartDiff schema tests to reuse cached schema; `StellaOps.Attestor.Types.Tests` (Category=Unit) passed. | DevOps |
| 2025-12-29 | Unit-split slices 21-40 passed; Authority Standard/Authority tests required rebuild retry but succeeded. | DevOps |
| 2025-12-29 | Unit-split slices 41-50 passed; `StellaOps.Cartographer.Tests` required rebuild retry but succeeded. | DevOps |
| 2025-12-29 | Unit-split slices 51-60 passed. | DevOps |
| 2025-12-29 | Fixed Concelier advisory reconstruction to derive normalized versions/language from persisted ranges; updated Postgres test fixture truncation to include non-system schemas. | DevOps |
| 2025-12-29 | `StellaOps.Concelier.Connector.Kisa.Tests` (Category=Unit) passed after truncation fix. | DevOps |
| 2025-12-29 | Unit-split slices 61-70 passed. | DevOps |
| 2025-12-29 | Unit-split slices 71-80 passed. | DevOps |
| 2025-12-29 | Unit-split slice 81-90 failed on missing testhost for `StellaOps.Concelier.Interest.Tests`; rebuilt project and reran slice. | DevOps |
| 2025-12-29 | Unit-split slices 81-90 passed. | DevOps |
| 2025-12-29 | Unit-split slice 91-100 failed: `StellaOps.EvidenceLocker.Tests` build error from SbomService (`IRegistrySourceService` missing). | DevOps |
| 2025-12-29 | Unit-split slice 101-110 failed: `StellaOps.Excititor.Connectors.OCI.OpenVEX.Attest.Tests` fixture/predicate failures. | DevOps |
| 2025-12-29 | Unit-split slice 111-120 failed: `StellaOps.ExportCenter.Client.Tests` testhost missing; `StellaOps.ExportCenter.Tests` failed due to SbomService compile errors. | DevOps |
| 2025-12-29 | Unit-split slice 121-130 failed: `StellaOps.Findings.Ledger.Tests` no tests discovered; `StellaOps.Graph.Api.Tests` contract failure (missing cursor). | DevOps |
| 2025-12-29 | Unit-split slice 131-140 failed: Notify connector/core/engine tests missing testhost; `StellaOps.Notify.Queue.Tests` NATS JetStream no response. | DevOps |
| 2025-12-29 | Unit-split slice 141-150 failed: `StellaOps.Notify.WebService.Tests` rejected memory storage; `StellaOps.Notify.Worker.Tests`, `StellaOps.Orchestrator.Tests`, `StellaOps.PacksRegistry.Tests` testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 151-160 passed. | DevOps |
| 2025-12-29 | Unit-split slice 161-170 failed: `StellaOps.Router.Common.Tests` routing expectations; `StellaOps.Router.Transport.InMemory.Tests` TaskCanceled vs OperationCanceled. | DevOps |
| 2025-12-29 | Unit-split slice 171-180 failed: `StellaOps.Router.Transport.Tcp.Tests` testhost missing; `StellaOps.Scanner.Analyzers.Lang.Bun.Tests`/`Deno.Tests` testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 181-190 failed: `StellaOps.Scanner.Analyzers.Lang.DotNet.Tests` testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 191-200 failed: Scanner OS analyzer tests (Homebrew/MacOS/Pkgutil/Windows) testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 201-210 passed. | DevOps |
| 2025-12-29 | Unit-split slice 211-220 failed: `StellaOps.Scanner.ReachabilityDrift.Tests` testhost missing; `StellaOps.Scanner.Sources.Tests` compile error (`SbomSourceRunTrigger.Push`); `StellaOps.Scanner.Surface.Env.Tests`/`FS.Tests` testhost/CoreUtilities missing. | DevOps |
| 2025-12-29 | Unit-split slice 221-230 failed: `StellaOps.Scanner.Surface.Secrets.Tests` testhost CoreUtilities missing; `StellaOps.Scanner.Surface.Validation.Tests` testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 231-240 failed: `StellaOps.Scheduler.Queue.Tests` Testcontainers Redis method missing; `StellaOps.Scheduler.Worker.Tests` ordering assertions; `StellaOps.Signals.Persistence.Tests` migrations failed (`signals.unknowns`). | DevOps |
| 2025-12-29 | Unit-split slice 241-250 failed: `StellaOps.TimelineIndexer.Tests` testhost missing. | DevOps |
| 2025-12-29 | Unit-split slice 251-260 failed: `StellaOps.Determinism.Analyzers.Tests` testhost missing; `GostCryptography.Tests` restore failures (net40/452); `StellaOps.Cryptography.Tests` aborted (testhost crash). | DevOps |
| 2025-12-29 | Unit-split slice 261-270 failed: `StellaOps.Cryptography.Kms.Tests` non-exportable key expectation; `StellaOps.Evidence.Persistence.Tests` unexpected row counts. | DevOps |
| 2025-12-29 | Unit-split slice 271-280 passed. | DevOps |
| 2025-12-29 | Unit-split slice 281-290 failed: `FixtureHarvester.Tests` CPM package version error + missing project path. | DevOps |
| 2025-12-29 | Unit-split slice 291-300 failed: `StellaOps.Reachability.FixtureTests` missing fixture data; `StellaOps.ScannerSignals.IntegrationTests` missing reachability variants. | DevOps |
| 2025-12-29 | Unit-split slice 301-310 passed. | DevOps |
| 2025-12-29 | Direct `dotnet test` re-run: `StellaOps.Notify.Core.Tests` passed (suggests local-ci testhost errors may be transient). | DevOps |
| 2025-12-29 | Direct `dotnet test` re-run: `StellaOps.TimelineIndexer.Tests` failed due to missing EvidenceLocker golden bundle fixtures (`tests/EvidenceLocker/Bundles/Golden`). | DevOps |
| 2025-12-29 | Direct `dotnet test` re-run: `StellaOps.Findings.Ledger.Tests` reports no tests discovered (likely missing xUnit runner reference). | DevOps |
| 2025-12-29 | Direct `dotnet test` re-run: `StellaOps.Notify.Connectors.Email.Tests` failed (fixtures missing under `bin/Release/net10.0/Fixtures/email` + error code expectation mismatches). | DevOps |
| 2025-12-29 | Added xUnit v2 VS runner in `src/Directory.Build.props`; fixed Notify email tests (timeout classification, invalid recipient path) and copied fixtures to output. | DevOps |
| 2025-12-29 | Re-run: `StellaOps.Findings.Ledger.Tests` now discovers tests but failures/timeouts remain; `StellaOps.Notify.Connectors.Email.Tests` passed. | DevOps |
| 2025-12-29 | Converted tests and shared test infra to xUnit v3 (CPM + project refs), aligned `IAsyncLifetime` signatures, and added `xunit.abstractions` for global usings. | DevOps |
| 2025-12-29 | `dotnet test` (Category=Unit) passes for `StellaOps.Findings.Ledger.Tests` after xUnit v3 conversion. | DevOps |
| 2025-12-29 | Smoke unit-split slice 311-320 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Smoke unit-split slice 321-330 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Smoke unit-split slice 331-400 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Smoke unit-split slice 401-470 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Smoke unit-split slice 471-720 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Smoke unit-split slice 721-1000 passed via `local-ci.ps1` (unit-split). | DevOps |
| 2025-12-29 | Verified unit-split project count is 302 (`rg --files -g "*Tests.csproj" src`); slices beyond 302 are no-ops and do not execute tests. | DevOps |
| 2025-12-30 | Fixed AirGap bundle copy lock by closing output before hashing; `StellaOps.AirGap.Bundle.Tests` (Category=Unit) passed. | DevOps |
| 2025-12-30 | Added AirGap persistence migrations + schema alignment and updated tests/fixture; `StellaOps.AirGap.Persistence.Tests` (Category=Unit) passed. | DevOps |
| 2026-01-02 | Fixed smoke build failures (AirGap DSSE PAE ambiguity, Attestor.Oci span mismatch) and resumed unit-split slice 1-100; failures isolated to AirGap Importer + Attestor tests. | DevOps |
| 2026-01-02 | Adjusted AirGap/Attestor tests and in-memory pagination; verified `StellaOps.AirGap.Importer.Tests`, `StellaOps.Attestor.Envelope.Tests`, `StellaOps.Attestor.Infrastructure.Tests`, and `StellaOps.Attestor.ProofChain.Tests` (Category=Unit) pass. | DevOps |
| 2026-01-03 | Fixed RunManifest schema validation to use an isolated schema registry (prevents JsonSchema overwrite errors). | DevOps |
| 2026-01-03 | Ensured Scanner scan manifest idempotency tests insert scan rows before saving manifests (avoid FK failures). | DevOps |
| 2026-01-03 | Re-ran smoke (`local-ci.ps1 smoke`) with full unit span; run in progress after build. | DevOps |
| 2026-01-03 | Stopped the smoke `dotnet test` process, which hung after the tests had completed; unit failures captured from TRX for follow-up fixes. | DevOps |
| 2026-01-03 | Adjusted Scanner WebService test fixture lookup to resolve repo root correctly and run triage migrations from filesystem. | DevOps |
| 2026-01-03 | Made Scanner storage job_state enum creation idempotent to avoid migration rerun failures in WebService tests. | DevOps |
| 2026-01-03 | Expanded triage schema migration to align with EF models (scan/policy/attestation tables + triage_finding columns). | DevOps |
| 2026-01-03 | Mapped triage enums for Npgsql and annotated enum labels to match PostgreSQL values. | DevOps |
## Decisions & Risks
- **Risk:** Extended tests (~45 min) may be skipped due to time constraints
- **Mitigation:** Always run smoke + PR-gating; run full suite for major changes
- **Risk:** Act workflow simulation requires CI Docker image
- **Mitigation:** Build image once with `docker build -t stellaops-ci:local -f devops/docker/Dockerfile.ci .`
- **Risk:** Some workflows require external resources (signing keys, feeds)
- **Mitigation:** These are dry-run only locally; full validation happens in CI
- **Risk:** NuGet feed rate limiting (429) from git.stella-ops.org blocks restore/build
- **Mitigation:** Retry off-peak, warm the NuGet cache, or reduce restore concurrency (`NUGET_MAX_HTTP_REQUESTS`, `--disable-parallel`)
- **Risk:** Docker Desktop service cannot be started without elevated permissions
- **Mitigation:** Start Docker Desktop manually or run service with appropriate privileges
- **Risk:** `act` is not installed locally
- **Mitigation:** Install `act` before attempting workflow simulation
- **Risk:** Build breaks in Router transport plugins and Verdict API types, blocking smoke/pr runs
- **Mitigation:** Resolve missing plugin interfaces/namespaces and file-scoped namespace errors before re-running validation
- **Risk:** `dotnet test` in smoke mode can hang on long-running Unit tests (e.g., cryptography suite), stretching smoke beyond target duration
- **Mitigation:** Split smoke with `--smoke-step unit-split`, use `out/local-ci/active-test.txt` for the current project, and add `--test-timeout`/`--progress-interval` or slice runs via `--project-start/--project-count`
- **Risk:** Cross-module change for test isolation touches shared Postgres fixture
- **Mitigation:** Monitor other module fixtures for unexpected truncation; scope is non-system schemas only (`src/__Libraries/StellaOps.Infrastructure.Postgres/Testing/PostgresFixture.cs`).
- **Risk:** Widespread testhost/TestPlatform dependency failures (`testhost.dll`/`Microsoft.TestPlatform.CoreUtilities`) abort unit tests
- **Mitigation:** Align `Microsoft.NET.Test.Sdk`/xUnit runner versions with CPM, confirm restore outputs include testhost assets across projects.
- **Risk:** SbomService registry source work-in-progress breaks build (`IRegistrySourceService`, model/property mismatches)
- **Mitigation:** Sync with SPRINT_20251229_012 changes or gate validation until API/DTOs settle.
- **Risk:** Reachability fixtures missing under `src/tests/reachability/**`, blocking fixture/integration tests
- **Mitigation:** Pull required fixture pack or document prerequisites in local CI runbook.
- **Risk:** EvidenceLocker golden bundle fixtures missing under `tests/EvidenceLocker/Bundles/Golden`, blocking TimelineIndexer integration tests
- **Mitigation:** Include fixture pack in offline bundle or document fetch step for local CI.
- **Risk:** Notify connector snapshot fixtures are not copied to output (`Fixtures/email/*.json`), and error code expectations diverge
- **Mitigation:** Ensure fixtures are marked `CopyToOutputDirectory` and align expected error codes with current behavior.
- **Risk:** Queue tests depend on external services (NATS/Redis/Testcontainers) and version alignment
- **Mitigation:** Ensure Docker services are up and Testcontainers packages are compatible.
## Next Checkpoints
| Step | Action | Command | Pass Criteria |
|------|--------|---------|---------------|
| 1 | Smoke test | `./devops/scripts/local-ci.sh smoke` | Exit code 0 |
| 2 | PR-gating | `./devops/scripts/local-ci.sh pr` | All categories green |
| 3 | Module tests | `./devops/scripts/local-ci.sh module` | All modules pass |
| 4 | Ready to commit | `git status` | Only intended changes |
| 5 | Commit | `git commit -m "..."` | Commit created |
| 6 | Push | `git push` | CI passes remotely |

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,164 @@
# Sprint 20260104_001_BE · Determinism: TimeProvider/IGuidProvider Injection
## Topic & Scope
- Systematically replace direct `DateTimeOffset.UtcNow`, `DateTime.UtcNow`, `Guid.NewGuid()`, and `Random.Shared` calls with injectable abstractions.
- Inject `TimeProvider` (the `System.TimeProvider` abstraction built into .NET 8 and later) for time-related operations.
- Inject `IGuidProvider` (project-local abstraction) for GUID generation.
- Ensure deterministic, testable code across all production projects.
- **Working directory:** `src/`. Evidence: updated source files, test coverage for injected services.
## Dependencies & Concurrency
- Depends on: SPRINT_20251229_049_BE (TreatWarningsAsErrors applied to all production projects).
- No upstream blocking dependencies; each module can be refactored independently.
- Parallel execution is safe across modules with per-project ownership.
## Documentation Prerequisites
- docs/README.md
- docs/ARCHITECTURE_OVERVIEW.md
- AGENTS.md § 8.2 (Deterministic Time & ID Generation)
- Module dossier for each project under refactoring.
## Scope Analysis
**Total determinism issues in production code:** ~1526 instances of `DateTimeOffset.UtcNow` alone.
### Issue Breakdown by Pattern
| Pattern | Estimated Count | Priority |
| --- | --- | --- |
| `DateTimeOffset.UtcNow` | ~1526 | High |
| `DateTime.UtcNow` | 181 (DET-001 audit) | High |
| `Guid.NewGuid()` | 687 (DET-001 audit) | Medium |
| `Random.Shared` | 16 (DET-001 audit) | Low |
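
The per-pattern counts in the table above can be reproduced with a small scan over the source tree. The following is a minimal sketch, not the audit tool used in DET-001: the `DeterminismAudit` class and its method names are hypothetical, and plain regex matching will also count occurrences inside comments and string literals.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical audit helper (not part of the repository): counts the
// non-deterministic call patterns listed in the table above.
public static class DeterminismAudit
{
    private static readonly (string Name, Regex Pattern)[] Patterns =
    {
        ("DateTimeOffset.UtcNow", new Regex(@"\bDateTimeOffset\.UtcNow\b")),
        ("DateTime.UtcNow",       new Regex(@"\bDateTime\.UtcNow\b")),
        ("Guid.NewGuid()",        new Regex(@"\bGuid\.NewGuid\(\)")),
        ("Random.Shared",         new Regex(@"\bRandom\.Shared\b")),
    };

    // Counts each pattern in a single source text.
    public static Dictionary<string, int> CountInText(string source) =>
        Patterns.ToDictionary(p => p.Name, p => p.Pattern.Matches(source).Count);

    // Sums the counts over every *.cs file under the given root directory.
    public static Dictionary<string, int> CountInDirectory(string root)
    {
        var totals = Patterns.ToDictionary(p => p.Name, _ => 0);
        foreach (var file in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories))
        {
            foreach (var (name, count) in CountInText(File.ReadAllText(file)))
            {
                totals[name] += count;
            }
        }
        return totals;
    }
}
```

Calling `DeterminismAudit.CountInDirectory("src")` would give one total per pattern; grouping by project directory is left out to keep the sketch short.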
### Modules with Known Issues (from audit)
| Module | Project | Issues | Status |
| --- | --- | --- | --- |
| Policy | StellaOps.Policy.Unknowns | 8+ | TODO |
| Provcache | StellaOps.Provcache.* | TBD | TODO |
| Provenance | StellaOps.Provenance.* | TBD | TODO |
| ReachGraph | StellaOps.ReachGraph.* | TBD | TODO |
| Registry | StellaOps.Registry.TokenService | TBD | TODO |
| Replay | StellaOps.Replay.* | TBD | TODO |
| RiskEngine | StellaOps.RiskEngine.* | TBD | TODO |
| Scanner | StellaOps.Scanner.* | TBD | TODO |
| Scheduler | StellaOps.Scheduler.* | TBD | TODO |
| Signer | StellaOps.Signer.* | TBD | TODO |
| Unknowns | StellaOps.Unknowns.* | TBD | TODO |
| VexLens | StellaOps.VexLens.* | TBD | TODO |
| VulnExplorer | StellaOps.VulnExplorer.* | TBD | TODO |
| Zastava | StellaOps.Zastava.* | TBD | TODO |
## Delivery Tracker
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | DET-001 | DONE | Audit complete | Guild | Full audit: count all DateTimeOffset.UtcNow/DateTime.UtcNow/Guid.NewGuid/Random.Shared by project |
| 2 | DET-002 | DONE | DET-001 | Guild | Ensure IGuidProvider abstraction exists in StellaOps.Determinism.Abstractions |
| 3 | DET-003 | DONE | DET-001 | Guild | Ensure TimeProvider registration pattern documented |
| 4 | DET-004 | DONE | DET-002, DET-003 | Guild | Refactor Policy module (Policy library complete, 14 files) |
| 5 | DET-005 | DONE | DET-002, DET-003 | Guild | Refactor Provcache module (8 files: EvidenceChunker, LazyFetchOrchestrator, MinimalProofExporter, FeedEpochAdvancedEvent, SignerRevokedEvent, PostgresProvcacheRepository, PostgresEvidenceChunkRepository, ValkeyProvcacheStore) |
| 6 | DET-006 | DONE | DET-002, DET-003 | Guild | Refactor Provenance module (skipped - already uses TimeProvider in production code) |
| 7 | DET-007 | DONE | DET-002, DET-003 | Guild | Refactor ReachGraph module (1 file: PostgresReachGraphRepository) |
| 8 | DET-008 | DONE | DET-002, DET-003 | Guild | Refactor Registry module (1 file: RegistryTokenIssuer) |
| 9 | DET-009 | DONE | DET-002, DET-003 | Guild | Refactor Replay module (6 files: ReplayEngine, ReplayModels, ReplayExportModels, ReplayManifestExporter, FeedSnapshotCoordinatorService, PolicySimulationInputLock) |
| 10 | DET-010 | DONE | DET-002, DET-003 | Guild | Refactor RiskEngine module (skipped - no determinism issues found) |
| 11 | DET-011 | DONE | DET-002, DET-003 | Guild | Refactor Scanner module - Explainability (2 files: RiskReport, FalsifiabilityGenerator), Sources (5 files: ConnectionTesters, SourceConnectionTester, SourceTriggerDispatcher), VulnSurfaces (1 file: PostgresVulnSurfaceRepository), Storage (5 files: PostgresProofSpineRepository, PostgresScanMetricsRepository, RuntimeEventRepository, PostgresFuncProofRepository, PostgresIdempotencyKeyRepository), Storage.Oci (1 file: SlicePullService), Binary analysis (6 files), Language analyzers (4 files), Benchmark (2 files), Core/Emit/SmartDiff services (10+ files) |
| 12 | DET-012 | DONE | DET-002, DET-003 | Guild | Refactor Scheduler module (WebService, Persistence, Worker projects - 30+ files updated, tests migrated to FakeTimeProvider) |
| 13 | DET-013 | DONE | DET-002, DET-003 | Guild | Refactor Signer module (16 production files refactored: AmbientOidcTokenProvider, EphemeralKeyPair, IOidcTokenProvider, IFulcioClient, TrustAnchorManager, KeyRotationService, DefaultSigningKeyResolver, SigstoreSigningService, InMemorySignerAuditSink, KeyRotationEndpoints, Program.cs) |
| 14 | DET-014 | DONE | DET-002, DET-003 | Guild | Refactor Unknowns module (skipped - no determinism issues found) |
| 15 | DET-015 | DONE | DET-002, DET-003 | Guild | Refactor VexLens module (production files: IConsensusRationaleCache, InMemorySourceTrustScoreCache, ISourceTrustScoreCalculator, InMemoryIssuerDirectory, InMemoryConsensusProjectionStore, OpenVexNormalizer, CycloneDxVexNormalizer, CsafVexNormalizer, IConsensusJobService, VexProofBuilder, IConsensusExportService, IVexLensApiService, TrustScorecardApiModels, OrchestratorLedgerEventEmitter, PostgresConsensusProjectionStore, PostgresConsensusProjectionStoreProxy, ProvenanceChainValidator, VexConsensusEngine, IConsensusRationaleService, VexLensEndpointExtensions) |
| 16 | DET-016 | DONE | DET-002, DET-003 | Guild | Refactor VulnExplorer module (1 file: VexDecisionStore) |
| 17 | DET-017 | DONE | DET-002, DET-003 | Guild | Refactor Zastava module (~48 matches at assessment; Agent, Observer, and Webhook projects refactored) |
| 18 | DET-018 | DONE | DET-004 to DET-017 | Guild | Final audit: verify sprint-scoped modules (Libraries only) have deterministic TimeProvider injection. Remaining scope documented below. |
| 19 | DET-019 | DONE | DET-018 | Guild | Follow-up: Scanner.WebService determinism refactoring (~40 DateTimeOffset.UtcNow usages) - 12 endpoint/service files + 2 dependency library files fixed |
| 20 | DET-020 | DONE | DET-018 | Guild | Follow-up: Scanner.Analyzers.Native determinism refactoring - hardening extractors (ELF/MachO/PE), OfflineBuildIdIndex, and RuntimeCapture adapters (eBPF/DYLD/ETW) complete. |
| 21 | DET-021 | DOING | DET-018 | Guild | Follow-up: Other modules (AdvisoryAI, Authority, AirGap, Attestor, Cli, Concelier, Excititor, etc.) - full codebase determinism sweep. Sub-tasks: (a) AirGap DONE, (b) EvidenceLocker DONE, (c) IssuerDirectory DONE, (d) Remaining modules pending |
## Implementation Pattern
### Before (Non-deterministic)
```csharp
public class BadService
{
    public Record CreateRecord() => new Record
    {
        Id = Guid.NewGuid(),
        CreatedAt = DateTimeOffset.UtcNow
    };
}
```
### After (Deterministic, Testable)
```csharp
public class GoodService(TimeProvider timeProvider, IGuidProvider guidProvider)
{
    public Record CreateRecord() => new Record
    {
        Id = guidProvider.NewGuid(),
        CreatedAt = timeProvider.GetUtcNow()
    };
}
```
### DI Registration
```csharp
services.AddSingleton(TimeProvider.System);
services.AddSingleton<IGuidProvider, SystemGuidProvider>();
```
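
To illustrate why the injected version is testable, here is a minimal sketch with hand-rolled test doubles. It assumes `IGuidProvider` exposes a single `NewGuid()` method, as the pattern above suggests. `FixedTimeProvider` and `CountingGuidProvider` are hypothetical stand-ins written for this example; the execution log mentions `FakeTimeProvider` and `SequentialGuidProvider`, which the real tests would more likely use.

```csharp
using System;
using System.Threading;

// Assumed shape of the project-local abstraction.
public interface IGuidProvider
{
    Guid NewGuid();
}

// Hypothetical test double: always reports the same instant.
public sealed class FixedTimeProvider(DateTimeOffset now) : TimeProvider
{
    public override DateTimeOffset GetUtcNow() => now;
}

// Hypothetical test double: derives GUIDs from an incrementing counter.
public sealed class CountingGuidProvider : IGuidProvider
{
    private int _next;

    public Guid NewGuid() => new Guid(Interlocked.Increment(ref _next), 0, 0, new byte[8]);
}

// Illustrative stand-in for the Record type used in the pattern above.
public sealed class Record
{
    public Guid Id { get; init; }
    public DateTimeOffset CreatedAt { get; init; }
}

public class GoodService(TimeProvider timeProvider, IGuidProvider guidProvider)
{
    public Record CreateRecord() => new Record
    {
        Id = guidProvider.NewGuid(),
        CreatedAt = timeProvider.GetUtcNow()
    };
}
```

With these doubles, two runs of the same test produce identical IDs and timestamps, so assertions can compare against fixed expected values instead of tolerating clock drift.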
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-04 | Sprint created; deferred from SPRINT_20251229_049_BE MAINT tasks | Planning |
| 2026-01-04 | DET-001: Audit complete. Found 1526 DateTimeOffset.UtcNow, 181 DateTime.UtcNow, 687 Guid.NewGuid, 16 Random.Shared | Agent |
| 2026-01-04 | DET-002: Created IGuidProvider, SystemGuidProvider, SequentialGuidProvider in StellaOps.Determinism.Abstractions | Agent |
| 2026-01-04 | DET-003: Created DeterminismServiceCollectionExtensions with AddDeterminismDefaults() | Agent |
| 2026-01-04 | DET-004: Policy.Unknowns refactored - UnknownsRepository, BudgetExceededEventFactory, ServiceCollectionExtensions | Agent |
| 2026-01-04 | Fixed Policy.Exceptions csproj - added ImplicitUsings, Nullable, PackageReferences | Agent |
| 2026-01-04 | DET-004: Policy refactored - BudgetLedger, EarnedCapacityEvaluator, BudgetThresholdNotifier, BudgetConstraintEnforcer, EvidenceFreshnessGate | Agent |
| 2026-01-04 | Scope note: 100+ files in Policy module alone need determinism refactoring. Multi-session effort. | Agent |
| 2026-01-04 | DET-004: Policy Replay/Deltas refactored - ReplayEngine, DeltaComputer, DeltaVerdictBuilder, ReplayReportBuilder, ReplayResult | Agent |
| 2026-01-04 | DET-004: Policy Gates, Snapshots, TrustLattice, Scoring, Explanation refactored - 14 files total | Agent |
| 2026-01-04 | DET-004 complete: Policy library now has deterministic TimeProvider/IGuidProvider injection | Agent |
| 2026-01-05 | DET-005: Provcache module refactored - 8 files (EvidenceChunker, LazyFetchOrchestrator, MinimalProofExporter, FeedEpochAdvancedEvent, SignerRevokedEvent, Postgres repos, ValkeyProvcacheStore) | Agent |
| 2026-01-05 | DET-006 to DET-010: Batch completed - ReachGraph (1 file), Registry (1 file), Replay (6 files); Provenance, RiskEngine, Unknowns already clean | Agent |
| 2026-01-05 | Remaining modules assessed: Scanner (~45), Scheduler (~20), Signer (~89), VexLens (~76), VulnExplorer (3), Zastava (~48) matches | Agent |
| 2026-01-05 | DET-012 complete: Scheduler module refactored - WebService, Persistence, Worker projects (30+ files) | Agent |
| 2026-01-05 | DET-013 complete: Signer module refactored - Keyless (4 files: AmbientOidcTokenProvider, EphemeralKeyPair, IOidcTokenProvider, IFulcioClient with IsExpiredAt/IsValidAt methods), KeyManagement (2 files: TrustAnchorManager, KeyRotationService), Infrastructure (3 files: DefaultSigningKeyResolver, SigstoreSigningService, InMemorySignerAuditSink), WebService (2 files: Program.cs, KeyRotationEndpoints) | Agent |
| 2026-01-05 | DET-015 complete: VexLens module refactored - 20 production files (caching, storage, normalization, orchestration, API, consensus, trust, persistence) with TimeProvider and IGuidProvider injection. Note: Pre-existing build errors in NoiseGateService.cs and NoiseGatingApiModels.cs unrelated to determinism changes. | Agent |
| 2026-01-05 | DET-017 complete: Zastava module refactored - Agent (RuntimeEventsClient, HealthCheckHostedService, RuntimeEventDispatchService, RuntimeEventBuffer), Observer (RuntimeEventDispatchService, RuntimeEventBuffer, ProcSnapshotCollector, EbpfProbeManager), Webhook (WebhookCertificateHealthCheck) with TimeProvider and IGuidProvider injection. | Agent |
| 2026-01-05 | DET-011 in progress: Scanner module refactoring - 14 production files refactored (RiskReport.cs, FalsifiabilityGenerator.cs, SourceConnectionTester.cs, SourceTriggerDispatcher.cs, DockerConnectionTester.cs, ZastavaConnectionTester.cs, GitConnectionTester.cs, PostgresVulnSurfaceRepository.cs, PostgresProofSpineRepository.cs, PostgresScanMetricsRepository.cs, RuntimeEventRepository.cs, PostgresFuncProofRepository.cs, PostgresIdempotencyKeyRepository.cs, SlicePullService.cs). Added Determinism.Abstractions references to 4 Scanner sub-projects. | Agent |
| 2026-01-06 | DET-011 continued: Source handlers refactored - DockerSourceHandler.cs, GitSourceHandler.cs, ZastavaSourceHandler.cs, CliSourceHandler.cs (all DateTimeOffset.UtcNow calls now use TimeProvider). Service layer: SbomSourceService.cs, SbomSourceRepository.cs, SbomSourceRunRepository.cs. Worker files: ScanMetricsCollector.cs (TimeProvider+IGuidProvider), BinaryFindingMapper.cs, PoEOrchestrator.cs, FidelityMetricsService.cs. Also fixed pre-existing build errors in Reachability and CallGraph modules. | Agent |
| 2026-01-06 | DET-011 continued: Scanner Storage refactored - PostgresWitnessRepository.cs (3 usages), FnDriftCalculator.cs (2 usages), S3ArtifactObjectStore.cs (2 usages), EpssReplayService.cs (2 usages), VulnSurfaceBuilder.cs (1 usage). Scanner Services refactored - ProofAwareVexGenerator.cs (2 usages), SurfaceAnalyzer.cs (1 usage), SurfaceEnvironmentBuilder.cs (1 usage), VexCandidateEmitter.cs (5 usages), FuncProofBuilder.cs (1 usage), EtwTraceCollector.cs (1 usage), EbpfTraceCollector.cs (1 usage), TraceIngestionService.cs (1 usage), IncrementalReachabilityService.cs (2 usages). All modified libraries verified to build successfully. | Agent |
| 2026-01-06 | DET-011 continued: Scanner domain/service refactoring - SbomSource.cs (rich domain entity with 13 methods refactored to accept TimeProvider parameter), SbomSourceRun.cs (6 methods refactored, DurationMs property converted to GetDurationMs method), SbomSourceService.cs (all callers updated), SbomSourceTests.cs (FakeTimeProvider added, all tests updated), SourceContracts.cs (ConnectionTestResult factory methods updated), CliConnectionTester.cs (TimeProvider injection added), ZeroDayWindowTracking.cs (ZeroDayWindowCalculator now has TimeProvider constructor), ObservedSliceGenerator.cs (TimeProvider injection added). 50+ usages remain in Triage entities and other Scanner libraries requiring entity-level pattern decisions. | Agent |
| 2026-01-06 | DET-011 continued: Scanner Triage entities refactored (10 files) - TriageFinding, TriageDecision, TriageScan, TriageAttestation, TriageEffectiveVex, TriageEvidenceArtifact, TriagePolicyDecision, TriageReachabilityResult, TriageRiskResult, TriageSnapshot - removed DateTimeOffset.UtcNow and Guid.NewGuid() defaults, made properties `required`. Reachability module - SliceCache.cs (TimeProvider injection), EdgeBundle.cs (Build method), MiniMapExtractor.cs (Extract method + CreateNotFoundMap), ReachabilityStackEvaluator.cs (Evaluate method). EntryTrace Risk module - RiskScore.cs (Zero/Critical/High/Medium/Low factory methods), CompositeRiskScorer.cs (TimeProvider constructor, 5 usages), RiskAssessment.Empty, FleetRiskSummary.CreateEmpty. EntryTrace Semantic - SemanticEntryTraceAnalyzer.cs (TimeProvider constructor). Scanner Core - ScanManifest.cs (CreateBuilder), ProofBundleWriter.cs (TimeProvider constructor), ScanManifestSigner.cs (ManifestVerificationResult factories). Storage/Emit/Diff models - ClassificationChangeModels.cs, ScanMetricsModels.cs, ComponentDiffModels.cs, BomIndexBuilder.cs, ISourceTypeHandler.cs, SurfaceEnvironmentSettings.cs, PathExplanationModels.cs, BoundaryExtractionContext.cs - all converted from default initializers to `required` properties. | Agent |
| 2026-01-06 | DET-011 continued: Additional Scanner production files refactored - IAssumptionCollector.cs/AssumptionCollector (TimeProvider constructor), FalsificationConditions.cs/DefaultFalsificationConditionGenerator (TimeProvider constructor), SbomDiffEngine.cs (TimeProvider constructor), ReachabilityUnionWriter.cs (TimeProvider constructor, WriteMetaAsync), PostgresReachabilityCache.cs (TimeProvider constructor, GetAsync TTL calculation, SetAsync expiry calculation). Scanner __Libraries reduced from 61 to 35 DateTimeOffset.UtcNow matches. Remaining are in: Binary analysis (6 files), Language analyzers (Java/DotNet/Deno/Native - 5 files), Benchmark/Claims (2 files), SmartDiff VexEvidence.IsValid property comparison, and test files. | Agent |
| 2026-01-06 | DET-011 continued: Binary analysis module refactored (IFingerprintIndex.cs - InMemoryFingerprintIndex with TimeProvider constructor + _lastUpdated, VulnerableFingerprintIndex with TimeProvider, BinaryIntelligenceAnalyzer.cs, VulnerableFunctionMatcher.cs, BinaryAnalysisResult.cs/BinaryAnalysisResultBuilder, FingerprintCorpusBuilder.cs, BaselineAnalyzer.cs, EpssEvidence.cs). Language analyzers refactored (DotNetCallgraphBuilder.cs, JavaCallgraphBuilder.cs, NativeCallgraphBuilder.cs, DenoRuntimeTraceRecorder.cs, JavaEntrypointAocWriter.cs). Core services refactored (CbomAggregationService.cs, SecretDetectionSettings.cs factory methods). Benchmark/Claims refactored (MetricsCalculator.cs, BattlecardGenerator.cs). SmartDiff VexEvidence.cs - added IsValidAt(DateTimeOffset) method, IsValid property uses TimeProvider. Risk module fixed (RiskExplainer, RiskAggregator constructors). BoundaryExtractionContext.cs - restored deprecated Empty property, added CreateEmpty factory. All Scanner __Libraries now build successfully with 3 acceptable remaining usages (test file, parsing fallback, existing TimeProvider fallback). DET-011 COMPLETE. | Agent |
| 2026-01-06 | DET-018 Final audit complete. Sprint scope was __Libraries modules. Remaining in codebase: Scanner.WebService (~40 usages), Scanner.Analyzers.Native (~4 usages), plus other modules (AdvisoryAI 30+, Authority 40+, AirGap 12+, Attestor 25+, Cli 80+, Concelier 15+, etc.) requiring follow-up sprints. DET-019/020/021 created for follow-up work. | Agent |
| 2026-01-04 | DET-019 complete: Scanner.WebService refactored - 12 endpoint/service files (EpssEndpoints, EvidenceEndpoints, SmartDiffEndpoints, UnknownsEndpoints, WitnessEndpoints, TriageInboxEndpoints, ProofBundleEndpoints, ReportSigner, ScoreReplayService, TestManifestRepository, SliceQueryService, UnifiedEvidenceService) plus dependency fixes in Scanner.Sources (SourceTriggerDispatcher, SourceContracts) and Scanner.WebService (EvidenceBundleExporter, GatingReasonService). All builds verified. | Agent |
| 2026-01-04 | DET-020 in progress: Scanner.Analyzers.Native hardening extractors refactored - ElfHardeningExtractor, MachoHardeningExtractor, PeHardeningExtractor with TimeProvider injection. OfflineBuildIdIndex refactored. Build verified. RuntimeCapture adapters (LinuxEbpfCaptureAdapter, MacOsDyldCaptureAdapter, WindowsEtwCaptureAdapter) pending - require TimeProvider and IGuidProvider injection for 18+ usages across eBPF/DYLD/ETW tracing. | Agent |
| 2026-01-04 | DET-020 complete: RuntimeCapture adapters refactored - LinuxEbpfCaptureAdapter, MacOsDyldCaptureAdapter, WindowsEtwCaptureAdapter with TimeProvider and IGuidProvider injection (SessionId, StartTime, EndTime, Timestamp fields). RuntimeEvidenceAggregator.MergeWithStaticAnalysis updated with optional TimeProvider parameter. StackTraceCapture.CollapsedStack.Parse updated with optional TimeProvider parameter. Added StellaOps.Determinism.Abstractions reference to project. All builds verified. | Agent |
| 2026-01-06 | DET-021(d) continued: Cryptography.Kms module refactored - AwsKmsClient, GcpKmsClient, FileKmsClient (6 usages), Pkcs11KmsClient, Pkcs11Facade, GcpKmsFacade, AwsKmsFacade, Fido2KmsClient, Fido2Options with TimeProvider injection. Removed unnecessary TimeProvider.Abstractions package (built into .NET 10). All builds verified. | Agent |
| 2026-01-06 | DET-021 continued: SbomService module refactored - Clock.cs (SystemClock delegates to TimeProvider), LineageGraphService, SbomLineageEdgeRepository, PostgresOrchestratorRepository, InMemoryOrchestratorRepository, ReplayVerificationService, LineageCompareService, LineageExportService, LineageHoverCache, RegistrySourceService, OrchestratorControlService, WatermarkService. DTOs changed from default timestamps to required fields. All builds verified. | Agent |
| 2026-01-06 | DET-021 continued: Findings module refactored - LedgerEventMapping (TimeProvider parameter), Program.cs (TimeProvider injection), EvidenceGraphBuilder (TimeProvider constructor). Fixed pre-existing null reference issue in FindingWorkflowService.cs. All builds verified. | Agent |
| 2026-01-06 | DET-021 continued: Notify module refactored - InMemoryRepositories.cs (15 repository adapters: Channel, Rule, Template, Delivery, Digest, Lock, EscalationPolicy, EscalationState, OnCallSchedule, QuietHours, MaintenanceWindow, Inbox with TimeProvider constructors). All builds verified. | Agent |
| 2026-01-06 | DET-021 continued: ExportCenter module refactored - LineageEvidencePackService (12 usages), ExportRetentionService (1 usage), InMemorySchedulingStores (1 usage), ExportVerificationModels (VerifiedAt made required), ExportVerificationService (TimeProvider constructor + Failed factory calls), ExceptionReportGenerator (4 usages). All builds verified. | Agent |
| 2026-01-07 | DET-021 continued: Orchestrator module refactored - Infrastructure/Postgres repositories (PostgresPackRunRepository, PostgresPackRegistryRepository, PostgresQuotaRepository, PostgresRunRepository, PostgresSourceRepository, PostgresThrottleRepository, PostgresWatermarkRepository with TimeProvider constructors and usage updates). WebService/Endpoints (HealthEndpoints, KpiEndpoints with TimeProvider injection via [FromServices]). Domain records (IBackfillRepository/BackfillCheckpoint.Create/Complete/Fail methods now accept timestamps). All DateTimeOffset.UtcNow usages in production Postgres/Endpoint code eliminated. Remaining: CLI module (~100 usages), Policy.Gateway module (~50 usages). | Agent |
| 2026-01-07 | DET-021 continued: CLI module critical verifiers refactored - ForensicVerifier.cs (TimeProvider constructor, 2 usages updated), ImageAttestationVerifier.cs (TimeProvider constructor, 7 usages updated for verification timestamps and max age checks). Note: Pre-existing build errors in Policy.Tools and Scanner.Analyzers.Lang.Python unrelated to determinism changes. Further CLI refactoring deferred - large scope (~90+ remaining usages across 30+ files in short-lived CLI process). | Agent |
| 2026-01-07 | DET-021 continued: Policy.Gateway module refactored - ExceptionEndpoints.cs (10 DateTimeOffset.UtcNow usages across 6 endpoints: POST, PUT, approve, activate, extend, revoke), GateEndpoints.cs (3 usages: evaluate endpoint + health check), GovernanceEndpoints.cs (9 usages across sealed mode + risk profile handlers, plus RecordAudit helper), RegistryWebhookEndpoints.cs (3 usages: Docker, Harbor, generic webhook handlers), ExceptionApprovalEndpoints.cs (2 usages: CreateApprovalRequestAsync), InMemoryGateEvaluationQueue.cs (constructor + 2 usages). All handlers now use TimeProvider via [FromServices] or constructor injection. Note: InitializeDefaultProfiles() static initializer retained DateTimeOffset.UtcNow for bootstrap/seed data - acceptable for one-time startup code. | Agent |
| 2026-01-07 | DET-021 continued: Policy.Registry module refactored - InMemoryPolicyPackStore.cs (TimeProvider constructor, 4 usages: CreateAsync, UpdateAsync, UpdateStatusAsync, AddHistoryEntry), InMemorySnapshotStore.cs (TimeProvider constructor, 1 usage), InMemoryVerificationPolicyStore.cs (TimeProvider constructor, 2 usages: CreateAsync, UpdateAsync), InMemoryOverrideStore.cs (TimeProvider constructor, 2 usages: CreateAsync, ApproveAsync), InMemoryViolationStore.cs (TimeProvider constructor, 2 usages: AppendAsync, AppendBatchAsync). All builds verified. | Agent |
| 2026-01-07 | DET-021 continued: Policy.Engine module refactored - InMemoryExceptionRepository.cs (TimeProvider constructor, 2 usages: RevokeAsync, ExpireAsync), InMemoryPolicyPackRepository.cs (TimeProvider constructor, 6 usages across CreateAsync, UpsertRevisionAsync, StoreBundleAsync). Remaining Policy.Engine usages in domain models (TenantContextModels, EvidenceBundle, ExceptionMapper), telemetry services (MigrationTelemetryService, EwsTelemetryService), and complex services (PoEValidationService, PolicyMergePreviewService, VerdictLinkService, RiskProfileConfigurationService) require additional pattern decisions - some are default property initializers requiring schema-level changes. All modified files build verified. | Agent |
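
Several log entries describe adding `IsValidAt`/`IsExpiredAt` methods or passing timestamps into domain methods instead of reading the clock inside them. A minimal sketch of that shape follows; `EvidenceWindow` and its fields are hypothetical and do not mirror the real `VexEvidence` type.

```csharp
using System;

// Hypothetical entity used only to illustrate the pattern.
public sealed record EvidenceWindow(DateTimeOffset NotBefore, DateTimeOffset NotAfter)
{
    // Pure check: the caller supplies the instant, so the result is reproducible.
    public bool IsValidAt(DateTimeOffset instant) =>
        instant >= NotBefore && instant <= NotAfter;

    // Convenience overload for callers that hold an injected TimeProvider.
    public bool IsValid(TimeProvider timeProvider) =>
        IsValidAt(timeProvider.GetUtcNow());
}
```

Keeping the entity free of any clock access means it needs no constructor injection; only the calling service depends on `TimeProvider`.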
## Decisions & Risks
- **Decision:** Defer determinism refactoring from MAINT audit to dedicated sprint for focused, systematic approach.
- **Risk:** Large scope (~1526+ changes). Mitigate by module-by-module refactoring with incremental commits.
- **Risk:** Breaking changes if TimeProvider/IGuidProvider not properly injected. Mitigate with test coverage.
- **Risk (DET-011):** Scanner Triage entities have default property initializers (e.g., `CreatedAt = DateTimeOffset.UtcNow`). Removing defaults requires caller-side changes across all entity instantiation sites. Decision needed: remove defaults vs. leave as documentation debt for later phase. Resolved 2026-01-06: the defaults were removed and the properties made `required` (see Execution Log).
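
The default-initializer risk above can be pictured with a small before/after sketch. `FindingEntity` and `FindingFactory` are hypothetical names, not the real Triage types; the point is only that `required` moves the choice of ID and timestamp to the call site, where an injected provider is available.

```csharp
using System;

// Hypothetical entity illustrating the change from defaults to `required`.
public sealed class FindingEntity
{
    // Before: public Guid Id { get; init; } = Guid.NewGuid();
    public required Guid Id { get; init; }

    // Before: public DateTimeOffset CreatedAt { get; init; } = DateTimeOffset.UtcNow;
    public required DateTimeOffset CreatedAt { get; init; }
}

// Hypothetical call site: the clock value comes from the injected TimeProvider.
public static class FindingFactory
{
    public static FindingEntity Create(TimeProvider timeProvider, Guid id) => new()
    {
        Id = id,
        CreatedAt = timeProvider.GetUtcNow(),
    };
}
```

The compiler rejects any object initializer that omits a `required` property, which is what forces the caller-side changes mentioned in the risk.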
## Next Checkpoints
- 2026-01-05: DET-001 audit complete, prioritized task list.
- 2026-01-10: First module refactoring complete (Policy).