# Testing Strategy Sprint Execution Playbook

> **Purpose:** Practical guide for executing testing sprints - coordination, Definition of Done, sign-off criteria, ceremonies, and troubleshooting.

---

## Table of Contents

1. [Sprint Lifecycle](#sprint-lifecycle)
2. [Definition of Done (DoD)](#definition-of-done-dod)
3. [Wave-Based Execution](#wave-based-execution)
4. [Sign-Off Criteria](#sign-off-criteria)
5. [Cross-Guild Coordination](#cross-guild-coordination)
6. [Common Failure Patterns](#common-failure-patterns)
7. [Troubleshooting Guide](#troubleshooting-guide)
8. [Sprint Templates](#sprint-templates)

---

## Sprint Lifecycle

### Sprint States

```
TODO → DOING → BLOCKED/IN_REVIEW → DONE
 │       │            │              │
 │       │            │              └─ All waves complete + sign-off
 │       │            └─ Waiting on dependency or approval
 │       └─ Active development (1+ waves in progress)
 └─ Not yet started
```

### Standard Sprint Duration

- **Foundation Epics (5100.0007.*):** 2 weeks per sprint
- **Module Tests (5100.0009.*):** 2 weeks per sprint
- **Infrastructure Tests (5100.0010.*):** 2 weeks per sprint
- **Competitor Parity (5100.0008.0001):** Initial setup 2 weeks; then ongoing (nightly/weekly)

### Ceremonies

#### Sprint Kickoff (Day 1)

**Who:** Sprint owner + guild members + dependencies
**Duration:** 60 min
**Agenda:**
1. Review sprint scope and deliverables (10 min)
2. Review wave structure and task breakdown (15 min)
3. Identify dependencies and blockers (15 min)
4. Assign tasks to engineers (10 min)
5. Schedule wave reviews (5 min)
6. Q&A (5 min)

#### Wave Review (End of each wave)

**Who:** Sprint owner + guild members
**Duration:** 30 min
**Agenda:**
1. Demo completed tasks (10 min)
2. Review DoD checklist for wave (10 min)
3. Identify blockers for next wave (5 min)
4. Update sprint status in `Delivery Tracker` (5 min)

#### Sprint Sign-Off (Final day)

**Who:** Sprint owner + guild lead + architect (for critical sprints)
**Duration:** 30 min
**Agenda:**
1. Review completion of all waves (10 min)
2. Verify sign-off criteria (10 min)
3. Demo integration (if applicable) (5 min)
4. Sign execution log (5 min)

#### Weekly Sync (Every Friday)

**Who:** All active sprint owners + project manager
**Duration:** 30 min
**Agenda:**
1. Sprint status updates (15 min)
2. Blocker escalation (10 min)
3. Next week preview (5 min)

---

## Definition of Done (DoD)

### Universal DoD (Applies to ALL sprints)

✅ **Code:**
- [ ] All tasks in `Delivery Tracker` marked as `DONE`
- [ ] Code reviewed by at least 1 other engineer
- [ ] No pending TODOs or FIXMEs in committed code
- [ ] Code follows StellaOps coding standards (SOLID, DRY, KISS)

✅ **Tests:**
- [ ] All tests passing locally
- [ ] All tests passing in CI (appropriate lane)
- [ ] Code coverage increase ≥ target (see module-specific DoD)
- [ ] No flaky tests (deterministic pass rate 100%)

✅ **Documentation:**
- [ ] Sprint `Execution Log` updated with completion date
- [ ] Module-specific `AGENTS.md` updated (if new patterns introduced)
- [ ] API documentation updated (if endpoints changed)

✅ **Integration:**
- [ ] Changes merged to `main` branch
- [ ] CI lanes passing (Unit, Contract, Integration, Security as applicable)
- [ ] No regressions introduced (existing tests still passing)

---

### Model-Specific DoD

#### L0 (Library/Core)

- [ ] Unit tests covering all public methods
- [ ] Property tests for key invariants (where applicable)
- [ ] Snapshot tests for canonical outputs (SBOM, VEX, verdicts, etc.)
- [ ] Code coverage: ≥80% for core libraries

#### S1 (Storage/Postgres)

- [ ] Migration tests (apply from scratch, apply from N-1) passing
- [ ] Idempotency tests passing (same operation twice → no duplicates)
- [ ] Query determinism tests passing (explicit ORDER BY checks)
- [ ] Testcontainers Postgres fixture operational (see the sketch below)
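The Postgres fixture recurs throughout this playbook (S1 DoD above, Epic C, Pattern 1 below). A minimal sketch of the idea, assuming xUnit and the `Testcontainers.PostgreSql` package; the class name and surface here are illustrative, not the actual `StellaOps.TestKit` PostgresFixture:

```csharp
using System.Threading.Tasks;
using Testcontainers.PostgreSql;
using Xunit;

// Throwaway Postgres instance per test class; xUnit starts/stops it via IAsyncLifetime.
public sealed class PostgresFixtureSketch : IAsyncLifetime
{
    private readonly PostgreSqlContainer _container = new PostgreSqlBuilder()
        .WithImage("postgres:16-alpine") // pin the image so runs stay reproducible
        .Build();

    // Valid only after InitializeAsync has completed.
    public string ConnectionString => _container.GetConnectionString();

    public Task InitializeAsync() => _container.StartAsync();

    public Task DisposeAsync() => _container.DisposeAsync().AsTask();
}

// Usage: tests take the fixture via IClassFixture<PostgresFixtureSketch> and open
// connections against fixture.ConnectionString.
```

The real fixture also applies migrations automatically and isolates schemas per test (see Epic C), which this sketch omits.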
#### T1 (Transport/Queue)

- [ ] Protocol roundtrip tests passing
- [ ] Fuzz tests for invalid input passing
- [ ] Delivery semantics tests passing (at-least-once, idempotency)
- [ ] Backpressure tests passing

#### C1 (Connector/External)

- [ ] Fixture folders created (`Fixtures//.json`, `Expected/.canonical.json`)
- [ ] Parser tests passing (fixture → parse → snapshot)
- [ ] Resilience tests passing (missing fields, invalid enums, etc.)
- [ ] Security tests passing (URL allowlist, redirect handling, payload limits)

#### W1 (WebService/API)

- [ ] Contract tests passing (OpenAPI snapshot validation; see the sketch below)
- [ ] Auth/authz tests passing (deny-by-default, token expiry, scope enforcement)
- [ ] OTel trace assertions passing (spans emitted, tags present)
- [ ] Negative tests passing (malformed requests, size limits, method mismatch)
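As an illustration of the W1 contract check, a hedged sketch using `Microsoft.AspNetCore.Mvc.Testing`: boot the service in-process and diff the served OpenAPI document against a committed snapshot. The `Program` entry point, the Swagger route, and the snapshot path are assumptions; a production version would canonicalize the JSON (e.g., via the TestKit's `CanonicalJsonAssert`) rather than compare raw strings:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class OpenApiContractTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly WebApplicationFactory<Program> _factory;

    public OpenApiContractTests(WebApplicationFactory<Program> factory) => _factory = factory;

    [Fact]
    public async Task OpenApiDocument_MatchesCommittedSnapshot()
    {
        using HttpClient client = _factory.CreateClient();

        // Fetch the schema the running service actually serves
        // (route assumed; Swashbuckle's default).
        string actual = await client.GetStringAsync("/swagger/v1/swagger.json");

        // The snapshot is committed alongside the tests and only updated deliberately.
        string expected = await System.IO.File.ReadAllTextAsync("Snapshots/openapi.v1.json");

        Assert.Equal(expected, actual);
    }
}
```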
#### WK1 (Worker/Indexer)

- [ ] End-to-end tests passing (enqueue → worker → stored → events emitted)
- [ ] Retry tests passing (transient failure → backoff; permanent → poison)
- [ ] Idempotency tests passing (same job twice → single execution)
- [ ] OTel correlation tests passing (trace spans across lifecycle)

#### AN1 (Analyzer/SourceGen)

- [ ] Roslyn compilation tests passing (expected diagnostics, no false positives)
- [ ] Golden generated code tests passing (if applicable)

#### CLI1 (Tool/CLI)

- [ ] Exit code tests passing (0=success, 1=user error, 2=system error, etc.)
- [ ] Golden output tests passing (stdout/stderr snapshots)
- [ ] Determinism tests passing (same inputs → same outputs)

#### PERF (Benchmarks)

- [ ] Benchmark tests operational
- [ ] Perf smoke tests in CI (2× regression gate)
- [ ] Baseline results documented

---

### Sprint-Specific DoD

#### Foundation Epic Sprints (5100.0007.*)

**Epic A (TestKit):**
- [ ] `StellaOps.TestKit` NuGet package published internally
- [ ] DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, ValkeyFixture, OtelCapture, HttpFixtureServer all operational
- [ ] Pilot adoption in 2+ modules (e.g., Scanner, Concelier)

**Epic B (Determinism):**
- [ ] Determinism manifest JSON schema defined
- [ ] `tests/integration/StellaOps.Integration.Determinism` expanded for SBOM, VEX, policy verdicts, evidence bundles, AirGap exports
- [ ] Determinism tests in CI (merge gate)
- [ ] Determinism artifacts stored in CI artifact repo

**Epic C (Storage):**
- [ ] PostgresFixture operational (Testcontainers, automatic migrations, schema isolation)
- [ ] ValkeyFixture operational
- [ ] Pilot adoption in 2+ modules with S1 model (e.g., Scanner, Policy)

**Epic D (Connectors):**
- [ ] Connector fixture discipline documented in `docs/testing/connector-fixture-discipline.md`
- [ ] FixtureUpdater tool operational (with `UPDATE_CONNECTOR_FIXTURES=1` env var guard)
- [ ] Pilot adoption in Concelier.Connector.NVD

**Epic E (WebService):**
- [ ] WebServiceFixture operational (Microsoft.AspNetCore.Mvc.Testing)
- [ ] Contract test pattern documented
- [ ] Pilot adoption in Scanner.WebService

**Epic F (Architecture):**
- [ ] `tests/architecture/StellaOps.Architecture.Tests` project operational
- [ ] Lattice placement rules enforced (Concelier/Excititor must NOT reference Scanner lattice; see the sketch below)
- [ ] Architecture tests in CI (PR gate, Unit lane)
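For the Epic F lattice rule, a sketch of what an enforcement test could look like, assuming NetArchTest.Rules as the library (this playbook does not name one) and illustrative namespace names:

```csharp
using NetArchTest.Rules;
using Xunit;

public class LatticePlacementTests
{
    [Fact]
    public void Concelier_MustNotReference_ScannerLattice()
    {
        // Namespace names are placeholders for the real StellaOps assemblies.
        var result = Types
            .InCurrentDomain()
            .That()
            .ResideInNamespace("StellaOps.Concelier")
            .ShouldNot()
            .HaveDependencyOn("StellaOps.Scanner")
            .GetResult();

        Assert.True(result.IsSuccessful, "Concelier must not reference the Scanner lattice.");
    }
}
```

Because the rule runs as an ordinary unit test, it slots into the Unit lane PR gate without extra CI plumbing.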
#### Module Test Sprints (5100.0009.*)

**Per Module:**
- [ ] All model requirements from TEST_CATALOG.yml satisfied
- [ ] Module-specific quality gates passing (see TEST_COVERAGE_MATRIX.md)
- [ ] Code coverage increase: ≥30% from baseline
- [ ] All wave deliverables complete

#### Infrastructure Test Sprints (5100.0010.*)

**Per Infrastructure Module:**
- [ ] All integration tests passing
- [ ] Cross-module dependencies validated (e.g., EvidenceLocker ↔ Scanner)

---

## Wave-Based Execution

### Wave Structure

Most sprints use a 3-4 wave structure:

- **Wave 1:** Foundation / Core logic
- **Wave 2:** Integration / Storage / Connectors
- **Wave 3:** WebService / Workers / End-to-end
- **Wave 4:** (Optional) Polish / Documentation / Edge cases

### Wave Execution Pattern

```
Week 1:
  Day 1-2: Wave 1 development
  Day 3:   Wave 1 review → APPROVED → proceed to Wave 2
  Day 4-5: Wave 2 development

Week 2:
  Day 1:   Wave 2 review → APPROVED → proceed to Wave 3
  Day 2-4: Wave 3 development
  Day 5:   Wave 3 review + Sprint Sign-Off
```

### Wave Review Checklist

✅ **Before Wave Review:**
- [ ] All tasks in the wave moved from `DOING` to `DONE` in `Delivery Tracker`
- [ ] All tests for the wave passing in CI
- [ ] Code reviewed

✅ **During Wave Review:**
- [ ] Demo completed functionality
- [ ] Review wave DoD checklist
- [ ] Identify blockers for next wave
- [ ] **Sign-off decision:** APPROVED / CHANGES_REQUIRED / BLOCKED

✅ **After Wave Review:**
- [ ] Update sprint `Execution Log` with wave completion
- [ ] Update task status in `Delivery Tracker`
- [ ] If BLOCKED: escalate to project manager

---

## Sign-Off Criteria

### Sprint Sign-Off Levels

#### Level 1: Self-Sign-Off (Guild Lead)

**Applies to:** Routine module test sprints without architectural changes
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval

#### Level 2: Architect Sign-Off

**Applies to:** Foundation epics, architectural changes, cross-cutting concerns
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval
- **Architect review and approval**

#### Level 3: Project Manager + Architect Sign-Off

**Applies to:** Critical path sprints (TestKit, Determinism, Storage)
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval
- Architect approval
- **Project manager approval (validates dependencies unblocked)**

### Sign-Off Process

1. **Engineer completes final wave** → marks all tasks `DONE`
2. **Guild lead reviews** → verifies DoD checklist
3. **Sprint owner schedules sign-off meeting** (if Level 2/3)
4. **Sign-off meeting** (30 min):
   - Demo final deliverables
   - Review DoD checklist
   - Verify integration (if applicable)
   - **Decision:** APPROVED / CHANGES_REQUIRED
5. **Update Execution Log:**
   ```markdown
   | 2026-XX-XX | Sprint signed off by [Guild Lead / Architect / PM]. | [Owner] |
   ```

---

## Cross-Guild Coordination

### Dependency Handoffs

When Sprint A depends on Sprint B:

**Sprint B (Provider):**
1. **Week before completion:** Notify Sprint A owner of expected completion date
2. **Wave 2-3 complete:** Provide preview build / early access to Sprint A
3. **Sprint complete:** Formally notify Sprint A owner; provide integration guide

**Sprint A (Consumer):**
1. **Sprint B Wave 2:** Begin integration planning; identify integration risks
2. **Sprint B Wave 3:** Start integration development (against preview build)
3. **Sprint B complete:** Complete integration; validate against final build

### Coordination Meetings

#### Epic → Module Handoff (Week 5)

**Who:** Epic sprint owners + all module sprint owners
**Duration:** 60 min
**Agenda:**
1. Epic deliverables review (TestKit, Storage, etc.) (20 min)
2. Integration guide walkthrough (15 min)
3. Module sprint kickoff previews (15 min)
4. Q&A (10 min)

#### Module Integration Sync (Bi-weekly, Weeks 7-10)

**Who:** Module sprint owners with cross-dependencies (e.g., Signer ↔ Attestor)
**Duration:** 30 min
**Agenda:**
1. Integration status updates (10 min)
2. Blocker resolution (15 min)
3. Next steps (5 min)

### Blocked Sprint Protocol

If a sprint is BLOCKED:

1. **Sprint owner:** Update sprint status to `BLOCKED` in `Delivery Tracker`
2. **Sprint owner:** Add blocker note to `Decisions & Risks` table
3. **Sprint owner:** Notify project manager immediately (Slack + email)
4. **Project manager:** Schedule blocker resolution meeting within 24 hours
5. **Resolution meeting:** Decide:
   - **Workaround:** Continue with mock/stub dependency
   - **Re-sequence:** Defer sprint; start alternative sprint
   - **Escalate:** Assign additional resources to unblock dependency

---

## Common Failure Patterns

### Pattern 1: Testcontainers Failure (Storage Harness)

**Symptom:** Tests fail with "Docker not running" or "Container startup timeout"

**Root Cause:** Docker daemon not running, Docker Desktop not installed, or Testcontainers compatibility issue

**Fix:**
1. Verify Docker Desktop is installed and running
2. Verify the Testcontainers.Postgres version is compatible with .NET 10
3. Add explicit timeout: `new PostgreSqlBuilder().WithStartupTimeout(TimeSpan.FromMinutes(5))`
4. For CI: ensure Docker is available in the CI runner environment

### Pattern 2: Determinism Test Drift

**Symptom:** Determinism tests fail with "Expected hash X, got hash Y"

**Root Cause:** Non-deterministic timestamps, GUIDs, or ordering

**Fix:**
1. Use `DeterministicTime` instead of `DateTime.UtcNow` (see the sketch below)
2. Use `DeterministicRandom` for random data
3. Explicit `ORDER BY` clauses in all queries
4. Strip timestamps from snapshots or use placeholders
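The clock-injection idea behind fix 1 can be illustrated with the BCL's `TimeProvider` (.NET 8+) and `FakeTimeProvider` from the `Microsoft.Extensions.TimeProvider.Testing` package. The TestKit's `DeterministicTime` is expected to play the same role, though its exact API may differ; `ReportStamper` is a made-up example type:

```csharp
using System;
using Microsoft.Extensions.Time.Testing;
using Xunit;

public sealed class ReportStamper
{
    private readonly TimeProvider _clock;

    // Inject the clock; never call DateTime.UtcNow inside production code under test.
    public ReportStamper(TimeProvider clock) => _clock = clock;

    public string Stamp(string id) => $"{id}@{_clock.GetUtcNow():O}";
}

public class ReportStamperTests
{
    [Fact]
    public void Stamp_IsStable_ForAFixedClock()
    {
        // Pin the clock so snapshot hashes never drift between runs.
        var clock = new FakeTimeProvider(new DateTimeOffset(2026, 1, 1, 0, 0, 0, TimeSpan.Zero));
        var stamper = new ReportStamper(clock);

        Assert.Equal("report-1@2026-01-01T00:00:00.0000000+00:00", stamper.Stamp("report-1"));
    }
}
```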
### Pattern 3: Fixture Update Breaks Tests

**Symptom:** Connector tests fail after updating fixtures

**Root Cause:** Upstream schema drift (NVD, OSV, etc.)

**Fix:**
1. Review schema changes in upstream source
2. Update connector parser logic if needed
3. Regenerate expected snapshots with `UPDATE_CONNECTOR_FIXTURES=1`
4. Document schema version in fixture filename (e.g., `nvd_v1.1.json`)

### Pattern 4: WebService Contract Drift

**Symptom:** Contract tests fail with "OpenAPI schema mismatch"

**Root Cause:** Backend API schema changed (breaking change)

**Fix:**
1. Review API changes in backend PR
2. **If breaking:** Version API (e.g., `/api/v2/...`)
3. **If non-breaking:** Update contract snapshot
4. Coordinate with frontend/consumer teams

### Pattern 5: Circular Dependency (Attestor ↔ Signer)

**Symptom:** Integration tests blocked waiting for both Attestor and Signer

**Root Cause:** Attestor needs Signer for signing; Signer integration tests need Attestor

**Fix:**
1. **Signer Sprint (5100.0009.0006):** Use mock signing for initial tests; defer Attestor integration
2. **Attestor Sprint (5100.0009.0007):** Coordinate with Signer guild; run integration tests in Week 2
3. **Integration Sprint (post-module):** Full Attestor ↔ Signer integration validation

### Pattern 6: Flaky Tests (Timing Issues)

**Symptom:** Tests pass locally but fail intermittently in CI

**Root Cause:** Race conditions, sleeps, non-deterministic timing

**Fix:**
1. Use `DeterministicTime` instead of `Thread.Sleep` or `Task.Delay`
2. Use explicit waits (e.g., `await condition.UntilAsync(...)`) instead of fixed delays (see the sketch below)
3. Avoid hard-coded timeouts; use configurable timeouts
4. Run tests 10× locally to verify determinism
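A minimal sketch of the explicit-wait idea from fix 2: poll a condition up to a deadline instead of sleeping a fixed amount. This standalone `Wait.UntilAsync` is an assumption about the helper's shape, not the TestKit's actual implementation; note the timeout stays configurable, per fix 3:

```csharp
using System;
using System.Threading.Tasks;

public static class Wait
{
    // Polls the condition until it returns true or the deadline passes.
    public static async Task UntilAsync(
        Func<Task<bool>> condition,
        TimeSpan timeout,
        TimeSpan? pollInterval = null)
    {
        TimeSpan interval = pollInterval ?? TimeSpan.FromMilliseconds(50);
        DateTime deadline = DateTime.UtcNow + timeout;

        while (!await condition())
        {
            if (DateTime.UtcNow >= deadline)
                throw new TimeoutException($"Condition not met within {timeout}.");
            await Task.Delay(interval);
        }
    }
}

// Usage (the queue object is illustrative):
// await Wait.UntilAsync(async () => await queue.CountAsync() == 0, TimeSpan.FromSeconds(10));
```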
---

## Troubleshooting Guide

### Issue: "My sprint depends on Epic X, but Epic X is delayed"

**Solution:**
1. Check whether partial Epic X deliverables are available (e.g., TestKit Waves 1-2 complete → can start L0 tests)
2. If not, use a mock/stub implementation
3. Coordinate with the Epic X owner for a preview build
4. If critically blocked: escalate to project manager for re-sequencing

### Issue: "Tests passing locally but failing in CI"

**Checklist:**
- [ ] Docker running in CI? (for Testcontainers)
- [ ] Environment variables set? (e.g., `STELLAOPS_TEST_POSTGRES_CONNECTION`)
- [ ] Correct .NET SDK version? (net10.0)
- [ ] Test isolation? (each test resets state)
- [ ] Deterministic? (run tests 10× locally)

### Issue: "Code coverage below target (80%)"

**Solution:**
1. Identify uncovered lines: `dotnet test --collect:"XPlat Code Coverage"`
2. Add unit tests for uncovered public methods
3. Add property tests for invariants
4. If coverage is still low: review with guild lead (some boilerplate is excluded from coverage)

### Issue: "Architecture tests failing (lattice boundary violation)"

**Solution:**
1. Review the failing types: which assembly is referencing the Scanner lattice?
2. **If legitimate:** Refactor to remove the dependency (move logic to Scanner.WebService)
3. **If test project:** Add to the allowlist in the architecture test

### Issue: "Snapshot test failing after refactor"

**Solution:**
1. Review the snapshot diff: is the change intentional?
2. **If intentional:** Update the snapshot (re-run the test with the snapshot update flag)
3. **If unintentional:** Revert the refactor; investigate why the output changed

---

## Sprint Templates

### Template: Task Status Update

```markdown
## Delivery Tracker

| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | MODULE-5100-001 | DONE | None | John Doe | Add unit tests for... |
| 2 | MODULE-5100-002 | DOING | Task 1 | Jane Smith | Add property tests for... |
| 3 | MODULE-5100-003 | TODO | Task 2 | - | Add snapshot tests for... |
```

### Template: Execution Log Entry

```markdown
## Execution Log

| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-20 | Sprint created. | Project Mgmt |
| 2026-01-27 | Wave 1 complete (Tasks 1-5). | Guild Lead |
| 2026-02-03 | Wave 2 complete (Tasks 6-10). | Guild Lead |
| 2026-02-10 | Sprint signed off by Architect. | Project Mgmt |
```

### Template: Blocker Note

```markdown
## Decisions & Risks

| Risk | Impact | Mitigation | Owner |
| --- | --- | --- | --- |
| [BLOCKER] TestKit delayed by 1 week | Cannot start module tests | Using mock TestKit for initial development; switch to real TestKit Week 5 | Module Guild |
```

---

## Next Steps

1. **Week 1:** All guild leads review this playbook
2. **Week 1:** Project manager schedules kickoff meetings for Foundation Epics (Week 3)
3. **Week 2:** Epic sprint owners prepare kickoff materials (scope, wave breakdown, task assignments)
4. **Week 3:** Foundation Epic sprints begin (5100.0007.0002-0007)

---

**Prepared by:** Project Management
**Date:** 2025-12-23
**Next Review:** 2026-01-06 (Week 1 kickoff)