# StellaOps Testing Strategy Master Plan (2026 H1)

> **Executive Summary:** A comprehensive 14-week testing initiative to establish model-driven test coverage across 15 modules, deliver 6 foundation epics, and lift platform-wide code coverage by at least 30 percentage points.

---

## Document Control

| Attribute | Value |
|-----------|-------|
| **Program Name** | Testing Strategy Implementation 2026 H1 |
| **Program ID** | SPRINT-5100 |
| **Owner** | Project Management |
| **Status** | PLANNING |
| **Start Date** | 2026-01-06 (Week 1) |
| **Target End Date** | 2026-04-14 (Week 14) |
| **Budget** | TBD (resource model below) |
| **Last Updated** | 2025-12-23 |

---

## Table of Contents

1. [Program Objectives](#program-objectives)
2. [Scope & Deliverables](#scope--deliverables)
3. [Timeline & Phases](#timeline--phases)
4. [Resource Model](#resource-model)
5. [Risk Register](#risk-register)
6. [Success Metrics](#success-metrics)
7. [Governance](#governance)
8. [Communication Plan](#communication-plan)

---

## Program Objectives

### Primary Objectives

1. **Establish Model-Driven Testing:** Implement 9 test models (L0, S1, T1, C1, W1, WK1, AN1, CLI1, PERF) across 15 modules
2. **Increase Code Coverage:** Raise code coverage by ≥30 percentage points from baseline (current ~40% → target 70%+)
3. **Enforce Quality Gates:** Implement determinism, architecture, and module-specific quality gates
4. **Build Test Infrastructure:** Deliver 6 foundation epics (TestKit, Determinism, Storage, Connectors, WebService, Architecture)
5. **Enable CI/CD Confidence:** Establish PR-gating and merge-gating test lanes

### Secondary Objectives

1. **Reduce Test Flakiness:** Achieve a 100% deterministic test pass rate by eliminating timing-based failures (see the clock sketch after this list)
2. **Improve Developer Experience:** Standardize test patterns and reduce test-authoring friction
3. **Establish Parity Monitoring:** Continuously validate against competitor tools (Syft, Grype, Trivy, Anchore)
4. **Document Test Strategy:** Create comprehensive testing guides and playbooks
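The flakiness objective depends on the TestKit primitives named in Primary Objective 4, notably `DeterministicTime`. The epic has not fixed an API yet, so the C# sketch below only illustrates the idea: tests inject a pinned clock instead of reading `DateTimeOffset.UtcNow`, making time-dependent output reproducible. All type and member names here are assumptions, not the final TestKit surface.

```csharp
using System;
using Xunit;

// Hypothetical TestKit-style clock abstraction (names are illustrative, not the final API).
public interface ITimeSource
{
    DateTimeOffset UtcNow { get; }
}

// Production code reads the ambient clock...
public sealed class SystemTimeSource : ITimeSource
{
    public DateTimeOffset UtcNow => DateTimeOffset.UtcNow;
}

// ...while tests pin it to a constant, removing the timing dependency entirely.
public sealed class DeterministicTime : ITimeSource
{
    public DeterministicTime(DateTimeOffset fixedNow) => UtcNow = fixedNow;
    public DateTimeOffset UtcNow { get; }
}

// Example component under test: stamps a report with the current time.
public sealed class ReportStamper
{
    private readonly ITimeSource _time;
    public ReportStamper(ITimeSource time) => _time = time;
    public string Stamp(string reportId) => $"{reportId}@{_time.UtcNow:O}";
}

public class ReportStamperTests
{
    [Fact]
    public void Stamp_IsReproducible_WhenClockIsPinned()
    {
        var clock = new DeterministicTime(new DateTimeOffset(2026, 1, 6, 0, 0, 0, TimeSpan.Zero));
        var stamper = new ReportStamper(clock);

        // Two runs produce byte-identical output because no wall-clock read occurs.
        Assert.Equal(stamper.Stamp("sbom-001"), stamper.Stamp("sbom-001"));
        Assert.Equal("sbom-001@2026-01-06T00:00:00.0000000+00:00", stamper.Stamp("sbom-001"));
    }
}
```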
---

## Scope & Deliverables

### In-Scope

#### Foundation Epics (Batch 5100.0007, 90 tasks)

| Sprint ID | Epic | Deliverables |
|-----------|------|--------------|
| 5100.0007.0001 | Master Testing Strategy | Strategy docs, test runner scripts, trait standardization, Epic sprint creation |
| 5100.0007.0002 | TestKit Foundations | DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, ValkeyFixture, OtelCapture, HttpFixtureServer |
| 5100.0007.0003 | Determinism Gate | Determinism manifest format, expanded integration tests, CI artifact storage, drift detection |
| 5100.0007.0004 | Storage Harness | PostgresFixture (Testcontainers), ValkeyFixture, automatic migrations, schema isolation |
| 5100.0007.0005 | Connector Fixtures | Fixture discipline, FixtureUpdater tool, pilot adoption in Concelier.Connector.NVD |
| 5100.0007.0006 | WebService Contract | WebServiceFixture, contract test pattern, pilot adoption in Scanner.WebService |
| 5100.0007.0007 | Architecture Tests | NetArchTest.Rules, lattice placement enforcement, PR-gating architecture tests |

#### Module Test Implementations (Batch 5100.0009, 185 tasks)

| Sprint ID | Module | Test Models | Deliverables |
|-----------|--------|-------------|--------------|
| 5100.0009.0001 | Scanner | L0, AN1, S1, T1, W1, WK1, PERF | 25 tasks: property tests, SBOM/reachability/verdict snapshots, determinism, WebService contract, Worker e2e, perf smoke |
| 5100.0009.0002 | Concelier | C1, L0, S1, W1, AN1 | 18 tasks: connector fixtures (NVD/OSV/GHSA/CSAF), merge property tests, WebService contract, architecture enforcement |
| 5100.0009.0003 | Excititor | C1, L0, S1, W1, WK1 | 21 tasks: connector fixtures (CSAF/OpenVEX), format export snapshots, preserve-prune tests, Worker e2e, architecture enforcement |
| 5100.0009.0004 | Policy | L0, S1, W1 | 15 tasks: policy engine property tests, DSL roundtrip tests, verdict snapshots, unknown budget enforcement |
| 5100.0009.0005 | Authority | L0, W1, C1 | 17 tasks: auth logic tests, connector fixtures (OIDC/SAML/LDAP), WebService contract, sign/verify integration |
| 5100.0009.0006 | Signer | L0, W1, C1 | 17 tasks: canonical payload tests, crypto plugin tests (BouncyCastle/CryptoPro/eIDAS/SimRemote), WebService contract |
| 5100.0009.0007 | Attestor | L0, W1 | 14 tasks: DSSE envelope tests, Rekor integration tests, attestation statement snapshots, WebService contract |
| 5100.0009.0008 | Scheduler | L0, S1, W1, WK1 | 14 tasks: scheduling invariant property tests, storage idempotency, WebService contract, Worker e2e |
| 5100.0009.0009 | Notify | L0, C1, S1, W1, WK1 | 18 tasks: connector fixtures (email/Slack/Teams/webhook), WebService contract, Worker e2e |
| 5100.0009.0010 | CLI | CLI1 | 13 tasks: exit code tests, golden output tests, determinism tests |
| 5100.0009.0011 | UI | W1 | 13 tasks: API contract tests, E2E smoke tests, accessibility tests |
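Several of the module rows above call for property tests (Scanner, Concelier merge, Policy engine, Scheduler invariants). Pending the real generators and domain types, the following FsCheck.Xunit sketch only illustrates the intended style; the `AdvisoryMerge` stand-in and its properties are hypothetical, and `MaxTest = 50` reflects the iteration cap proposed in risk R8.

```csharp
using System.Collections.Generic;
using System.Linq;
using FsCheck.Xunit;

// Illustrative stand-in for a merge/normalisation step: order-insensitive de-duplication.
// The real Concelier/Policy merge logic is not specified in this plan; only the property style is.
public static class AdvisoryMerge
{
    public static List<int> Normalize(IEnumerable<int> advisoryIds) =>
        advisoryIds.Distinct().OrderBy(id => id).ToList();
}

public class AdvisoryMergeProperties
{
    // MaxTest = 50 mirrors the R8 mitigation: cap iterations so property suites stay fast.
    [Property(MaxTest = 50)]
    public bool Normalize_IsIdempotent(int[] advisoryIds)
    {
        advisoryIds ??= System.Array.Empty<int>();   // defensive: generated input may be null
        var once = AdvisoryMerge.Normalize(advisoryIds);
        var twice = AdvisoryMerge.Normalize(once);
        return once.SequenceEqual(twice);
    }

    [Property(MaxTest = 50)]
    public bool Normalize_IsOrderInsensitive(int[] advisoryIds)
    {
        advisoryIds ??= System.Array.Empty<int>();
        var forward = AdvisoryMerge.Normalize(advisoryIds);
        var reversed = AdvisoryMerge.Normalize(advisoryIds.Reverse());
        return forward.SequenceEqual(reversed);
    }
}
```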
#### Infrastructure Test Implementations (Batch 5100.0010, 62 tasks)

| Sprint ID | Module Family | Deliverables |
|-----------|---------------|--------------|
| 5100.0010.0001 | EvidenceLocker + Findings + Replay | Immutability tests, ledger determinism, replay token security, WebService contract |
| 5100.0010.0002 | Graph + TimelineIndexer | Graph construction/traversal tests, indexer e2e, query determinism, WebService contract |
| 5100.0010.0003 | Router + Messaging | Transport compliance suite (in-memory/TCP/TLS/Valkey/RabbitMQ), routing determinism, fuzz tests |
| 5100.0010.0004 | AirGap | Bundle export/import determinism, policy analyzer tests, WebService contract, CLI tool tests |

#### Quality Gates (Batch 5100.0008, 11 tasks)

| Sprint ID | Purpose | Deliverables |
|-----------|---------|--------------|
| 5100.0008.0001 | Competitor Parity Testing | Parity test harness, fixture set (10-15 container images), comparison logic (SBOM/vuln/latency/errors), time-series storage, drift detection (>5% threshold) |

### Out-of-Scope

- ❌ **Performance optimization** (beyond PERF smoke tests for Scanner)
- ❌ **UI/UX testing** (beyond W1 contract tests and E2E smoke tests)
- ❌ **Load testing** (deferred to a future sprint)
- ❌ **Chaos engineering** (deferred to a future sprint)
- ❌ **Mobile/responsive testing** (not applicable - server-side platform)
- ❌ **Penetration testing** (separate security initiative)

---

## Timeline & Phases

### Master Timeline (14 Weeks, 2026-01-06 to 2026-04-14)

```
PHASE 1: FOUNDATION (Weeks 1-4)
┌─────────────────────────────────────────────────────────────┐
│ Week 1-2: Master Strategy (5100.0007.0001)                  │
│   - Documentation sync                                      │
│   - Test runner scripts                                     │
│   - Trait standardization                                   │
│   - Epic sprint creation                                    │
│                                                             │
│ Week 3-4: TestKit Foundations (5100.0007.0002) ← CRITICAL   │
│   - DeterministicTime, DeterministicRandom                  │
│   - CanonicalJsonAssert, SnapshotAssert                     │
│   - PostgresFixture, ValkeyFixture, OtelCapture             │
└─────────────────────────────────────────────────────────────┘

PHASE 2: EPIC IMPLEMENTATION (Weeks 5-6)
┌─────────────────────────────────────────────────────────────┐
│ Week 5-6: 5 Epic Sprints (PARALLEL)                         │
│   - 5100.0007.0003 (Determinism Gate)                       │
│   - 5100.0007.0004 (Storage Harness)                        │
│   - 5100.0007.0005 (Connector Fixtures)                     │
│   - 5100.0007.0006 (WebService Contract)                    │
│   - 5100.0007.0007 (Architecture Tests)                     │
└─────────────────────────────────────────────────────────────┘

PHASE 3: MODULE TESTS - TIER 1 (Weeks 7-8)
┌─────────────────────────────────────────────────────────────┐
│ Week 7-8: 6 Module Sprints (PARALLEL)                       │
│   - Scanner, Concelier, Excititor (core platform)           │
│   - Policy, Authority, Signer (security/compliance)         │
└─────────────────────────────────────────────────────────────┘

PHASE 4: MODULE TESTS - TIER 2 (Weeks 9-10)
┌─────────────────────────────────────────────────────────────┐
│ Week 9-10: 5 Module Sprints (PARALLEL)                      │
│   - Attestor, Scheduler, Notify (platform services)         │
│   - CLI, UI (client interfaces)                             │
└─────────────────────────────────────────────────────────────┘

PHASE 5: INFRASTRUCTURE TESTS (Weeks 11-14)
┌─────────────────────────────────────────────────────────────┐
│ Week 11-14: 4 Infrastructure Sprints (PARALLEL)             │
│   - EvidenceLocker, Graph, Router/Messaging, AirGap         │
└─────────────────────────────────────────────────────────────┘

ONGOING: QUALITY GATES (Weeks 3-14+)
┌─────────────────────────────────────────────────────────────┐
│ Week 3: Competitor Parity harness setup                     │
│ Week 4+: Nightly/weekly parity tests                        │
└─────────────────────────────────────────────────────────────┘
```

### Critical Path (14 Weeks)

**Week 1-2:** Master Strategy → **Week 3-4:** TestKit (**BOTTLENECK**) → **Week 5-6:** Epic Implementation → **Week 7-10:** Module Tests → **Week 11-14:** Infrastructure Tests

**Critical Path Risks:**

- TestKit delay → ALL downstream sprints blocked (+2-4 weeks)
- Storage harness delay → 10 sprints blocked (+2-3 weeks); see the probe sketch below
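The storage-harness risk above (and R2 in the risk register) hinges on Testcontainers working against the platform's .NET 10 toolchain, which is why the plan schedules a Week 1 compatibility check. A minimal probe, assuming the `Testcontainers.PostgreSql` and `Npgsql` packages and an illustrative `postgres:16-alpine` image tag, could simply start a container and open a connection:

```csharp
using System.Threading.Tasks;
using Npgsql;
using Testcontainers.PostgreSql;
using Xunit;

public class PostgresHarnessProbe
{
    // Smoke test: if this passes on the CI runners, the Storage Harness epic can build on Testcontainers.
    [Fact]
    public async Task Postgres_container_starts_and_accepts_connections()
    {
        await using var postgres = new PostgreSqlBuilder()
            .WithImage("postgres:16-alpine")   // image tag is illustrative, not a plan decision
            .Build();

        await postgres.StartAsync();

        await using var connection = new NpgsqlConnection(postgres.GetConnectionString());
        await connection.OpenAsync();

        await using var command = new NpgsqlCommand("SELECT 1", connection);
        var result = await command.ExecuteScalarAsync();
        Assert.Equal(1, (int)result!);
    }
}
```

If this probe fails on the Week 1 runners, the R2 fallback (manual Postgres setup) kicks in before any downstream sprint takes a dependency on the fixture.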
### Milestones

| Milestone | Week | Deliverables | Sign-Off Criteria |
|-----------|------|--------------|-------------------|
| **M1: Foundation Ready** | Week 4 | TestKit operational | DeterministicTime, SnapshotAssert, PostgresFixture, OtelCapture available; pilot adoption in 2+ modules |
| **M2: Epics Complete** | Week 6 | All 6 foundation epics complete | Determinism gate in CI; Storage harness operational; WebService contract tests in Scanner; Architecture tests PR-gating |
| **M3: Core Modules Tested** | Week 8 | Scanner, Concelier, Excititor, Policy, Authority, Signer complete | Code coverage increase ≥30 percentage points; quality gates passing |
| **M4: All Modules Tested** | Week 10 | All 11 module test sprints complete | All module-specific quality gates passing |
| **M5: Program Complete** | Week 14 | All infrastructure tests complete; program retrospective | All sprints signed off; final metrics review |

---

## Resource Model

### Guild Allocation

| Guild | Assigned Sprints | Peak Staffing (Weeks 7-10) | Avg Sprint Ownership |
|-------|------------------|----------------------------|----------------------|
| **Platform Guild** | TestKit, Storage, Architecture, EvidenceLocker, Graph, Router | 10 engineers | 6 sprints |
| **Scanner Guild** | Scanner | 3 engineers | 1 sprint |
| **Concelier Guild** | Concelier | 2 engineers | 1 sprint |
| **Excititor Guild** | Excititor | 2 engineers | 1 sprint |
| **Policy Guild** | Policy, AirGap (analyzers) | 2-4 engineers | 2 sprints |
| **Authority Guild** | Authority | 2 engineers | 1 sprint |
| **Crypto Guild** | Signer, Attestor | 4 engineers | 2 sprints |
| **Scheduler Guild** | Scheduler | 2 engineers | 1 sprint |
| **Notify Guild** | Notify | 2 engineers | 1 sprint |
| **CLI Guild** | CLI | 1 engineer | 1 sprint |
| **UI Guild** | UI | 2 engineers | 1 sprint |
| **AirGap Guild** | AirGap (core) | 2 engineers | 1 sprint |
| **QA Guild** | Competitor Parity | 2 engineers | 1 sprint |

### Staffing Profile

- **Peak staffing (Weeks 7-10):** 22-26 engineers
- **Average staffing (Weeks 1-14):** 12-16 engineers
- **Critical path sprints (TestKit, Storage):** 3-4 senior engineers each

### Resource Constraints

| Constraint | Impact | Mitigation |
|------------|--------|------------|
| Platform Guild oversubscribed (10 engineers, 6 sprints) | Burnout, delays | Stagger Epic sprints (Storage Week 5, Connectors Week 6); hire contractors for Weeks 5-10 |
| Senior engineers limited (5-6 available) | TestKit/Storage quality risk | Assign 2 senior engineers to TestKit (critical path); 1 senior to Storage; rotate for reviews |
| UI Guild availability (Angular expertise scarce) | UI sprint delayed | Start UI sprint Week 10 (after Tier 1/2 modules); hire an Angular contractor if needed |

---

## Risk Register

### High-Impact Risks (Severity: CRITICAL)

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R1 | TestKit delayed by 2+ weeks | MEDIUM | Blocks ALL 15 module/infra sprints; +4-6 weeks program delay | Staff with 2 senior engineers; daily standups; incremental releases (partial TestKit unblocks some modules) | Platform Guild | OPEN |
| R2 | Storage harness (Testcontainers) incompatible with .NET 10 | LOW | Blocks 10 sprints; +3-4 weeks delay | Validate Testcontainers compatibility in Week 1; fall back to manual Postgres setup | Platform Guild | OPEN |
| R3 | Determinism tests fail due to non-deterministic crypto signatures | MEDIUM | Scanner, Signer, Attestor blocked; compliance issues | Focus determinism tests on the payload hash, not the signature bytes (see sketch below); document non-deterministic algorithms | Crypto Guild | OPEN |
| R4 | Concurrent module tests overwhelm CI infrastructure | HIGH | Test suite timeouts, flaky tests, developer friction | Stagger module test starts (Tier 1 Weeks 7-8, Tier 2 Weeks 9-10); use dedicated CI runners; implement CI parallelization | Platform Guild | OPEN |
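R3's mitigation separates what must be byte-stable (the canonical payload) from what legitimately varies (signature bytes produced by randomized signing schemes). A sketch of that assertion style, assuming SHA-256 over a canonical JSON payload; the helpers below are illustrative and may differ from the real CanonicalJsonAssert delivered by the TestKit epic:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;
using Xunit;

public class AttestationDeterminismTests
{
    // Canonicalisation stand-in: fixed property order (declaration order) and no insignificant whitespace.
    private static byte[] CanonicalPayload(object statement) =>
        Encoding.UTF8.GetBytes(JsonSerializer.Serialize(statement,
            new JsonSerializerOptions { WriteIndented = false }));

    private static string Sha256Hex(byte[] payload) =>
        Convert.ToHexString(SHA256.HashData(payload));

    [Fact]
    public void Payload_hash_is_stable_even_if_signature_bytes_are_not()
    {
        var statement = new { subject = "sha256:abc123", predicateType = "https://example.invalid/verdict" };

        // The determinism gate compares the payload digest across runs...
        var firstDigest = Sha256Hex(CanonicalPayload(statement));
        var secondDigest = Sha256Hex(CanonicalPayload(statement));
        Assert.Equal(firstDigest, secondDigest);

        // ...while signature bytes (e.g. ECDSA with a random nonce) are allowed to differ,
        // so they are cryptographically verified rather than byte-compared (out of scope here).
    }
}
```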
### Medium-Impact Risks

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R5 | Attestor-Signer circular dependency blocks integration tests | MEDIUM | Integration tests delayed 1-2 weeks | Signer uses mock attestation initially; coordinate integration in Week 9 | Crypto Guild | OPEN |
| R6 | Upstream schema drift (NVD, OSV) breaks connector fixtures | MEDIUM | Connector tests fail; manual fixture regeneration required | FixtureUpdater tool automates regeneration; weekly live smoke tests detect drift early | Concelier Guild | OPEN |
| R7 | WebService contract tests too brittle (fail on every API change) | MEDIUM | Developer friction; contract tests disabled | Version APIs explicitly; allow non-breaking changes; review contract test strategy in Week 6 | Platform Guild | OPEN |

### Low-Impact Risks

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R8 | Property test generation too slow (FsCheck iterations high) | LOW | Test suite timeouts | Limit property test iterations (default 100 → 50); profile and optimize generators | Scanner Guild | OPEN |
| R9 | Architecture tests false-positive (allowlist too restrictive) | LOW | Valid code blocked | Review architecture rules in Week 5; explicit allowlist for test projects and benchmarks | Platform Guild | OPEN |
| R10 | Competitor parity tests require paid Trivy/Anchore licenses | LOW | Parity testing incomplete | Use Trivy free tier; defer Anchore to a future sprint; focus on Syft/Grype (OSS) | QA Guild | OPEN |

### Risk Burn-Down Plan

- **Week 1:** Validate Testcontainers .NET 10 compatibility (R2)
- **Week 2:** TestKit API design review (R1)
- **Week 4:** Determinism test strategy review (R3)
- **Week 6:** CI infrastructure capacity review (R4)
- **Week 8:** Signer-Attestor integration coordination (R5)

---

## Success Metrics

### Quantitative Metrics

| Metric | Baseline | Target | Measurement Method | Cadence |
|--------|----------|--------|--------------------|---------|
| **Code Coverage** | ~40% | ≥70% | `dotnet test --collect:"XPlat Code Coverage"` | Weekly (Fridays) |
| **Test Count** | ~200 tests | ≥500 tests | Test suite execution count | Weekly |
| **Determinism Pass Rate** | N/A (not tracked) | 100% (no flaky tests) | Determinism gate CI job | Daily (CI) |
| **Contract Test Coverage** | 0 WebServices | 13 WebServices (100%) | Contract lane CI job | Weekly |
| **Architecture Violations** | Unknown | 0 violations | Architecture test failures (see NetArchTest sketch below) | Daily (CI, PR gate) |
| **Sprint On-Time Completion** | N/A | ≥80% | Tasks complete by wave deadline | Weekly |

### Qualitative Metrics

| Metric | Success Criteria | Measurement Method | Tracked By |
|--------|------------------|--------------------|------------|
| **Developer Experience** | ≥80% of developers rate test authoring as "easy" or "very easy" | Post-sprint developer survey (Week 14) | Project Manager |
| **Test Maintainability** | ≥75% of test failures are due to actual bugs (not test brittleness) | Monthly test failure classification | QA Guild |
| **Integration Confidence** | ≥90% of PRs pass CI on first attempt (no test fixes required) | CI metrics (PR pass rate) | Platform Guild |
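The Architecture Violations metric and the PR-gating rules from sprint 5100.0007.0007 are written with NetArchTest.Rules. A minimal example of the style follows; the namespaces are purely illustrative placeholders for the real lattice placement rules, and in practice the rule would target the production assembly under test rather than the test assembly.

```csharp
using NetArchTest.Rules;
using Xunit;

public class LatticePlacementTests
{
    // Illustrative rule: web-service hosts must not reference storage internals directly.
    // Namespace names are placeholders; the real lattice rules belong to 5100.0007.0007.
    [Fact]
    public void WebService_layer_does_not_depend_on_storage_internals()
    {
        var result = Types.InAssembly(typeof(LatticePlacementTests).Assembly)
            .That().ResideInNamespace("StellaOps.Scanner.WebService")
            .ShouldNot().HaveDependencyOn("StellaOps.Scanner.Storage.Internal")
            .GetResult();

        // The result also exposes the failing type names, which keeps PR-gate failures actionable.
        Assert.True(result.IsSuccessful);
    }
}
```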
### Program Success Criteria

✅ **Program Successful If:**

- All 23 sprints signed off (5100.0007.*, 5100.0008.0001, 5100.0009.*, 5100.0010.*)
- Code coverage ≥70% platform-wide
- Determinism tests passing 100% in CI (no flaky tests)
- Contract tests enforced for all 13 WebServices
- Architecture tests PR-gating (lattice boundary violations blocked)

❌ **Program Failed If:**

- Fewer than 19 sprints signed off (<80% completion)
- Code coverage increase <20 percentage points (baseline ~40% → <60%)
- Critical quality gates missing (Determinism, Architecture, Contract)
- TestKit not operational (blocking all module tests)

---

## Governance

### Steering Committee

| Role | Name | Responsibility |
|------|------|----------------|
| **Program Sponsor** | CTO | Final escalation; budget approval |
| **Program Manager** | Project Management | Overall program coordination; risk management |
| **Technical Lead** | Platform Guild Lead | Architecture decisions; technical escalation |
| **QA Lead** | QA Guild Lead | Quality gate oversight; test strategy validation |

### Decision-Making Authority

| Decision Type | Authority | Escalation Path |
|---------------|-----------|-----------------|
| **Sprint scope changes** | Sprint owner + Guild lead | Program Manager → Steering Committee |
| **Architecture changes** | Platform Guild Lead | Steering Committee |
| **Resource allocation** | Program Manager | CTO (if >10% budget impact) |
| **Schedule changes (>1 week)** | Program Manager | Steering Committee |
| **Risk acceptance** | Program Manager | Steering Committee (for HIGH/CRITICAL risks) |

### Status Reporting

**Weekly Status Report (Fridays):**

- Sprint completion status (% tasks complete)
- Blockers and risks (RED/YELLOW/GREEN)
- Resource allocation (current vs. planned)
- Next week preview

**Monthly Executive Summary:**

- Program health (on-track / at-risk / off-track)
- Milestone completion (M1-M5)
- Budget vs. actuals
- Key risks and mitigations

### Change Control

**Change Request Process:**

1. **Requester submits change request** (scope, schedule, or resource change)
2. **Program Manager reviews** (impact analysis: cost, schedule, quality)
3. **Steering Committee approves/rejects** (for changes >1 week or >10% budget)
4. **Program Manager updates plan** (timeline, resource model, risk register)

---

## Communication Plan

### Stakeholders

| Stakeholder Group | Interest | Communication Frequency | Method |
|-------------------|----------|-------------------------|--------|
| **Engineering Teams (Guilds)** | Sprint execution, dependencies | Daily/Weekly | Slack #testing-strategy, guild standups |
| **Guild Leads** | Sprint status, blockers | Weekly | Friday status sync (30 min) |
| **Product Management** | Quality gates, feature readiness | Bi-weekly | Sprint demos, monthly exec summary |
| **CTO / Executives** | Program health, budget | Monthly | Executive summary (email) |

### Meetings

#### Weekly Sync (Every Friday, 30 min)

**Attendees:** All active sprint owners + program manager

**Agenda:**

1. Sprint status updates (green/yellow/red) (15 min)
2. Blocker escalation (10 min)
3. Next week preview (5 min)

#### Monthly Steering Committee (First Monday, 60 min)

**Attendees:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)

**Agenda:**

1. Program health review (on-track / at-risk / off-track) (20 min)
2. Milestone completion (M1-M5) (15 min)
3. Budget vs. actuals (10 min)
4. Risk review (top 3 risks) (10 min)
5. Decisions required (5 min)
#### Retrospective (Week 14, 90 min)

**Attendees:** All guild leads + program manager + steering committee

**Agenda:**

1. Program retrospective: what went well, what didn't, lessons learned (60 min)
2. Metrics review: code coverage, test count, determinism, etc. (20 min)
3. Future improvements: next testing initiatives (10 min)

---

## Appendices

### Appendix A: Sprint Inventory

**Total Sprints:** 23

- Foundation Epics: 7 (5100.0007.0001-0007)
- Quality Gates: 1 (5100.0008.0001)
- Module Tests: 11 (5100.0009.0001-0011)
- Infrastructure Tests: 4 (5100.0010.0001-0004)

**Total Tasks:** ~370

**Total Estimated Effort:** ~450 engineer-days (assuming an average of 1.2 days/task)

### Appendix B: Reference Documents

1. **Advisory:** `docs/product-advisories/22-Dec-2026 - Better testing strategy.md`
2. **Test Catalog:** `docs/testing/TEST_CATALOG.yml`
3. **Test Models:** `docs/testing/testing-strategy-models.md`
4. **Dependency Graph:** `docs/testing/SPRINT_DEPENDENCY_GRAPH.md`
5. **Coverage Matrix:** `docs/testing/TEST_COVERAGE_MATRIX.md`
6. **Execution Playbook:** `docs/testing/SPRINT_EXECUTION_PLAYBOOK.md`

### Appendix C: Budget Estimate (Preliminary)

**Assumptions:**

- Average engineer cost: $150/hour (fully loaded)
- Average sprint duration: 80 hours (2 weeks × 40 hours)
- Peak staffing: 22 engineers (Weeks 7-10)

**Budget Estimate:**

- Foundation Phase (Weeks 1-6): 12 engineers × 240 hours × $150 = $432,000
- Module Tests Phase (Weeks 7-10): 22 engineers × 160 hours × $150 = $528,000
- Infrastructure Phase (Weeks 11-14): 8 engineers × 160 hours × $150 = $192,000
- **Total Estimated Cost:** $1,152,000

**Note:** The final budget requires approval from CTO/Finance. Contractor costs may reduce the total if used strategically for peak staffing (Weeks 7-10).

---

**Prepared by:** Project Management

**Approval Required From:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)

**Date:** 2025-12-23

**Next Review:** 2026-01-06 (Week 1 kickoff)