# StellaOps Testing Strategy Master Plan (2026 H1)

> **Executive Summary:** A comprehensive 14-week testing initiative to establish model-driven test coverage across 15 modules, deliver 6 foundation epics, and lift platform-wide code coverage by at least 30 percentage points.

---

## Document Control

| Attribute | Value |
|-----------|-------|
| **Program Name** | Testing Strategy Implementation 2026 H1 |
| **Program ID** | SPRINT-5100 |
| **Owner** | Project Management |
| **Status** | PLANNING |
| **Start Date** | 2026-01-06 (Week 1) |
| **Target End Date** | 2026-04-14 (Week 14) |
| **Budget** | TBD (resource model below) |
| **Last Updated** | 2025-12-23 |

---

## Table of Contents

1. [Program Objectives](#program-objectives)
2. [Scope & Deliverables](#scope--deliverables)
3. [Timeline & Phases](#timeline--phases)
4. [Resource Model](#resource-model)
5. [Risk Register](#risk-register)
6. [Success Metrics](#success-metrics)
7. [Governance](#governance)
8. [Communication Plan](#communication-plan)

---

## Program Objectives

### Primary Objectives

1. **Establish Model-Driven Testing:** Implement 9 test models (L0, S1, T1, C1, W1, WK1, AN1, CLI1, PERF) across 15 modules
2. **Increase Code Coverage:** Raise code coverage by ≥30 percentage points from baseline (current ~40% → target 70%+)
3. **Enforce Quality Gates:** Implement determinism, architecture, and module-specific quality gates
4. **Build Test Infrastructure:** Deliver 6 foundation epics (TestKit, Determinism, Storage, Connectors, WebService, Architecture)
5. **Enable CI/CD Confidence:** Establish PR-gating and merge-gating test lanes

### Secondary Objectives

1. **Reduce Test Flakiness:** Achieve a 100% deterministic test pass rate by eliminating timing-based failures (see the clock sketch after this list)
2. **Improve Developer Experience:** Standardize test patterns and reduce test-authoring friction
3. **Establish Parity Monitoring:** Continuously validate against competitor tools (Syft, Grype, Trivy, Anchore)
4. **Document Test Strategy:** Create comprehensive testing guides and playbooks
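The flakiness objective depends on the TestKit primitives named in Primary Objective 4, notably `DeterministicTime`. The epic has not fixed an API yet, so the C# sketch below only illustrates the idea: tests inject a pinned clock instead of reading `DateTimeOffset.UtcNow`, making time-dependent output reproducible. All type and member names here are assumptions, not the final TestKit surface.

```csharp
using System;
using Xunit;

// Hypothetical TestKit-style clock abstraction (names are illustrative, not the final API).
public interface ITimeSource
{
    DateTimeOffset UtcNow { get; }
}

// Production code reads the ambient clock...
public sealed class SystemTimeSource : ITimeSource
{
    public DateTimeOffset UtcNow => DateTimeOffset.UtcNow;
}

// ...while tests pin it to a constant, removing the timing dependency entirely.
public sealed class DeterministicTime : ITimeSource
{
    public DeterministicTime(DateTimeOffset fixedNow) => UtcNow = fixedNow;
    public DateTimeOffset UtcNow { get; }
}

// Example component under test: stamps a report with the current time.
public sealed class ReportStamper
{
    private readonly ITimeSource _time;
    public ReportStamper(ITimeSource time) => _time = time;
    public string Stamp(string reportId) => $"{reportId}@{_time.UtcNow:O}";
}

public class ReportStamperTests
{
    [Fact]
    public void Stamp_IsReproducible_WhenClockIsPinned()
    {
        var clock = new DeterministicTime(new DateTimeOffset(2026, 1, 6, 0, 0, 0, TimeSpan.Zero));
        var stamper = new ReportStamper(clock);

        // Two runs produce byte-identical output because no wall-clock read occurs.
        Assert.Equal(stamper.Stamp("sbom-001"), stamper.Stamp("sbom-001"));
        Assert.Equal("sbom-001@2026-01-06T00:00:00.0000000+00:00", stamper.Stamp("sbom-001"));
    }
}
```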
---

## Scope & Deliverables

### In-Scope

#### Foundation Epics (Batch 5100.0007, 90 tasks)

| Sprint ID | Epic | Deliverables |
|-----------|------|--------------|
| 5100.0007.0001 | Master Testing Strategy | Strategy docs, test runner scripts, trait standardization, Epic sprint creation |
| 5100.0007.0002 | TestKit Foundations | DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, ValkeyFixture, OtelCapture, HttpFixtureServer |
| 5100.0007.0003 | Determinism Gate | Determinism manifest format, expanded integration tests, CI artifact storage, drift detection |
| 5100.0007.0004 | Storage Harness | PostgresFixture (Testcontainers), ValkeyFixture, automatic migrations, schema isolation |
| 5100.0007.0005 | Connector Fixtures | Fixture discipline, FixtureUpdater tool, pilot adoption in Concelier.Connector.NVD |
| 5100.0007.0006 | WebService Contract | WebServiceFixture, contract test pattern, pilot adoption in Scanner.WebService |
| 5100.0007.0007 | Architecture Tests | NetArchTest.Rules, lattice placement enforcement, PR-gating architecture tests |

#### Module Test Implementations (Batch 5100.0009, 185 tasks)

| Sprint ID | Module | Test Models | Deliverables |
|-----------|--------|-------------|--------------|
| 5100.0009.0001 | Scanner | L0, AN1, S1, T1, W1, WK1, PERF | 25 tasks: property tests, SBOM/reachability/verdict snapshots, determinism, WebService contract, Worker e2e, perf smoke |
| 5100.0009.0002 | Concelier | C1, L0, S1, W1, AN1 | 18 tasks: connector fixtures (NVD/OSV/GHSA/CSAF), merge property tests, WebService contract, architecture enforcement |
| 5100.0009.0003 | Excititor | C1, L0, S1, W1, WK1 | 21 tasks: connector fixtures (CSAF/OpenVEX), format export snapshots, preserve-prune tests, Worker e2e, architecture enforcement |
| 5100.0009.0004 | Policy | L0, S1, W1 | 15 tasks: policy engine property tests, DSL roundtrip tests, verdict snapshots, unknown budget enforcement |
| 5100.0009.0005 | Authority | L0, W1, C1 | 17 tasks: auth logic tests, connector fixtures (OIDC/SAML/LDAP), WebService contract, sign/verify integration |
| 5100.0009.0006 | Signer | L0, W1, C1 | 17 tasks: canonical payload tests, crypto plugin tests (BouncyCastle/CryptoPro/eIDAS/SimRemote), WebService contract |
| 5100.0009.0007 | Attestor | L0, W1 | 14 tasks: DSSE envelope tests, Rekor integration tests, attestation statement snapshots, WebService contract |
| 5100.0009.0008 | Scheduler | L0, S1, W1, WK1 | 14 tasks: scheduling invariant property tests, storage idempotency, WebService contract, Worker e2e |
| 5100.0009.0009 | Notify | L0, C1, S1, W1, WK1 | 18 tasks: connector fixtures (email/Slack/Teams/webhook), WebService contract, Worker e2e |
| 5100.0009.0010 | CLI | CLI1 | 13 tasks: exit code tests, golden output tests, determinism tests |
| 5100.0009.0011 | UI | W1 | 13 tasks: API contract tests, E2E smoke tests, accessibility tests |
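Several of the module rows above call for property tests (Scanner, Concelier merge, Policy engine, Scheduler invariants). Pending the real generators and domain types, the following FsCheck.Xunit sketch only illustrates the intended style; the `AdvisoryMerge` stand-in and its properties are hypothetical, and `MaxTest = 50` reflects the iteration cap proposed in risk R8.

```csharp
using System.Collections.Generic;
using System.Linq;
using FsCheck.Xunit;

// Illustrative stand-in for a merge/normalisation step: order-insensitive de-duplication.
// The real Concelier/Policy merge logic is not specified in this plan; only the property style is.
public static class AdvisoryMerge
{
    public static List<int> Normalize(IEnumerable<int> advisoryIds) =>
        advisoryIds.Distinct().OrderBy(id => id).ToList();
}

public class AdvisoryMergeProperties
{
    // MaxTest = 50 mirrors the R8 mitigation: cap iterations so property suites stay fast.
    [Property(MaxTest = 50)]
    public bool Normalize_IsIdempotent(int[] advisoryIds)
    {
        advisoryIds ??= System.Array.Empty<int>();   // defensive: generated input may be null
        var once = AdvisoryMerge.Normalize(advisoryIds);
        var twice = AdvisoryMerge.Normalize(once);
        return once.SequenceEqual(twice);
    }

    [Property(MaxTest = 50)]
    public bool Normalize_IsOrderInsensitive(int[] advisoryIds)
    {
        advisoryIds ??= System.Array.Empty<int>();
        var forward = AdvisoryMerge.Normalize(advisoryIds);
        var reversed = AdvisoryMerge.Normalize(advisoryIds.Reverse());
        return forward.SequenceEqual(reversed);
    }
}
```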
#### Infrastructure Test Implementations (Batch 5100.0010, 62 tasks)

| Sprint ID | Module Family | Deliverables |
|-----------|---------------|--------------|
| 5100.0010.0001 | EvidenceLocker + Findings + Replay | Immutability tests, ledger determinism, replay token security, WebService contract |
| 5100.0010.0002 | Graph + TimelineIndexer | Graph construction/traversal tests, indexer e2e, query determinism, WebService contract |
| 5100.0010.0003 | Router + Messaging | Transport compliance suite (in-memory/TCP/TLS/Valkey/RabbitMQ), routing determinism, fuzz tests |
| 5100.0010.0004 | AirGap | Bundle export/import determinism, policy analyzer tests, WebService contract, CLI tool tests |

#### Quality Gates (Batch 5100.0008, 11 tasks)

| Sprint ID | Purpose | Deliverables |
|-----------|---------|--------------|
| 5100.0008.0001 | Competitor Parity Testing | Parity test harness, fixture set (10-15 container images), comparison logic (SBOM/vuln/latency/errors), time-series storage, drift detection (>5% threshold) |

### Out-of-Scope

- ❌ **Performance optimization** (beyond PERF smoke tests for Scanner)
- ❌ **UI/UX testing** (beyond W1 contract tests and E2E smoke tests)
- ❌ **Load testing** (deferred to a future sprint)
- ❌ **Chaos engineering** (deferred to a future sprint)
- ❌ **Mobile/responsive testing** (not applicable - server-side platform)
- ❌ **Penetration testing** (separate security initiative)

---

## Timeline & Phases

### Master Timeline (14 Weeks, 2026-01-06 to 2026-04-14)

```
PHASE 1: FOUNDATION (Weeks 1-4)
┌─────────────────────────────────────────────────────────────┐
│ Week 1-2: Master Strategy (5100.0007.0001)                  │
│   - Documentation sync                                      │
│   - Test runner scripts                                     │
│   - Trait standardization                                   │
│   - Epic sprint creation                                    │
│                                                             │
│ Week 3-4: TestKit Foundations (5100.0007.0002) ← CRITICAL   │
│   - DeterministicTime, DeterministicRandom                  │
│   - CanonicalJsonAssert, SnapshotAssert                     │
│   - PostgresFixture, ValkeyFixture, OtelCapture             │
└─────────────────────────────────────────────────────────────┘

PHASE 2: EPIC IMPLEMENTATION (Weeks 5-6)
┌─────────────────────────────────────────────────────────────┐
│ Week 5-6: 5 Epic Sprints (PARALLEL)                         │
│   - 5100.0007.0003 (Determinism Gate)                       │
│   - 5100.0007.0004 (Storage Harness)                        │
│   - 5100.0007.0005 (Connector Fixtures)                     │
│   - 5100.0007.0006 (WebService Contract)                    │
│   - 5100.0007.0007 (Architecture Tests)                     │
└─────────────────────────────────────────────────────────────┘

PHASE 3: MODULE TESTS - TIER 1 (Weeks 7-8)
┌─────────────────────────────────────────────────────────────┐
│ Week 7-8: 6 Module Sprints (PARALLEL)                       │
│   - Scanner, Concelier, Excititor (core platform)           │
│   - Policy, Authority, Signer (security/compliance)         │
└─────────────────────────────────────────────────────────────┘

PHASE 4: MODULE TESTS - TIER 2 (Weeks 9-10)
┌─────────────────────────────────────────────────────────────┐
│ Week 9-10: 5 Module Sprints (PARALLEL)                      │
│   - Attestor, Scheduler, Notify (platform services)         │
│   - CLI, UI (client interfaces)                             │
└─────────────────────────────────────────────────────────────┘

PHASE 5: INFRASTRUCTURE TESTS (Weeks 11-14)
┌─────────────────────────────────────────────────────────────┐
│ Week 11-14: 4 Infrastructure Sprints (PARALLEL)             │
│   - EvidenceLocker, Graph, Router/Messaging, AirGap         │
└─────────────────────────────────────────────────────────────┘

ONGOING: QUALITY GATES (Weeks 3-14+)
┌─────────────────────────────────────────────────────────────┐
│ Week 3: Competitor Parity harness setup                     │
│ Week 4+: Nightly/weekly parity tests                        │
└─────────────────────────────────────────────────────────────┘
```

### Critical Path (14 Weeks)

**Week 1-2:** Master Strategy → **Week 3-4:** TestKit (**BOTTLENECK**) → **Week 5-6:** Epic Implementation → **Week 7-10:** Module Tests → **Week 11-14:** Infrastructure Tests

**Critical Path Risks:**

- TestKit delay → ALL downstream sprints blocked (+2-4 weeks)
- Storage harness delay → 10 sprints blocked (+2-3 weeks); see the probe sketch below
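The storage-harness risk above (and R2 in the risk register) hinges on Testcontainers working against the platform's .NET 10 toolchain, which is why the plan schedules a Week 1 compatibility check. A minimal probe, assuming the `Testcontainers.PostgreSql` and `Npgsql` packages and an illustrative `postgres:16-alpine` image tag, could simply start a container and open a connection:

```csharp
using System.Threading.Tasks;
using Npgsql;
using Testcontainers.PostgreSql;
using Xunit;

public class PostgresHarnessProbe
{
    // Smoke test: if this passes on the CI runners, the Storage Harness epic can build on Testcontainers.
    [Fact]
    public async Task Postgres_container_starts_and_accepts_connections()
    {
        await using var postgres = new PostgreSqlBuilder()
            .WithImage("postgres:16-alpine")   // image tag is illustrative, not a plan decision
            .Build();

        await postgres.StartAsync();

        await using var connection = new NpgsqlConnection(postgres.GetConnectionString());
        await connection.OpenAsync();

        await using var command = new NpgsqlCommand("SELECT 1", connection);
        var result = await command.ExecuteScalarAsync();
        Assert.Equal(1, (int)result!);
    }
}
```

If this probe fails on the Week 1 runners, the R2 fallback (manual Postgres setup) kicks in before any downstream sprint takes a dependency on the fixture.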
### Milestones

| Milestone | Week | Deliverables | Sign-Off Criteria |
|-----------|------|--------------|-------------------|
| **M1: Foundation Ready** | Week 4 | TestKit operational | DeterministicTime, SnapshotAssert, PostgresFixture, OtelCapture available; pilot adoption in 2+ modules |
| **M2: Epics Complete** | Week 6 | All 6 foundation epics complete | Determinism gate in CI; Storage harness operational; WebService contract tests in Scanner; Architecture tests PR-gating |
| **M3: Core Modules Tested** | Week 8 | Scanner, Concelier, Excititor, Policy, Authority, Signer complete | Code coverage increase ≥30 percentage points; quality gates passing |
| **M4: All Modules Tested** | Week 10 | All 11 module test sprints complete | All module-specific quality gates passing |
| **M5: Program Complete** | Week 14 | All infrastructure tests complete; program retrospective | All sprints signed off; final metrics review |

---

## Resource Model

### Guild Allocation

| Guild | Assigned Sprints | Peak Staffing (Weeks 7-10) | Avg Sprint Ownership |
|-------|------------------|----------------------------|----------------------|
| **Platform Guild** | TestKit, Storage, Architecture, EvidenceLocker, Graph, Router | 10 engineers | 6 sprints |
| **Scanner Guild** | Scanner | 3 engineers | 1 sprint |
| **Concelier Guild** | Concelier | 2 engineers | 1 sprint |
| **Excititor Guild** | Excititor | 2 engineers | 1 sprint |
| **Policy Guild** | Policy, AirGap (analyzers) | 2-4 engineers | 2 sprints |
| **Authority Guild** | Authority | 2 engineers | 1 sprint |
| **Crypto Guild** | Signer, Attestor | 4 engineers | 2 sprints |
| **Scheduler Guild** | Scheduler | 2 engineers | 1 sprint |
| **Notify Guild** | Notify | 2 engineers | 1 sprint |
| **CLI Guild** | CLI | 1 engineer | 1 sprint |
| **UI Guild** | UI | 2 engineers | 1 sprint |
| **AirGap Guild** | AirGap (core) | 2 engineers | 1 sprint |
| **QA Guild** | Competitor Parity | 2 engineers | 1 sprint |

### Staffing Profile

- **Peak staffing (Weeks 7-10):** 22-26 engineers
- **Average staffing (Weeks 1-14):** 12-16 engineers
- **Critical path sprints (TestKit, Storage):** 3-4 senior engineers each

### Resource Constraints

| Constraint | Impact | Mitigation |
|------------|--------|------------|
| Platform Guild oversubscribed (10 engineers, 6 sprints) | Burnout, delays | Stagger Epic sprints (Storage Week 5, Connectors Week 6); hire contractors for Weeks 5-10 |
| Senior engineers limited (5-6 available) | TestKit/Storage quality risk | Assign 2 senior engineers to TestKit (critical path); 1 senior to Storage; rotate for reviews |
| UI Guild availability (Angular expertise scarce) | UI sprint delayed | Start UI sprint Week 10 (after Tier 1/2 modules); hire an Angular contractor if needed |

---

## Risk Register

### High-Impact Risks (Severity: CRITICAL)

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R1 | TestKit delayed by 2+ weeks | MEDIUM | Blocks ALL 15 module/infra sprints; +4-6 weeks program delay | Staff with 2 senior engineers; daily standups; incremental releases (partial TestKit unblocks some modules) | Platform Guild | OPEN |
| R2 | Storage harness (Testcontainers) incompatible with .NET 10 | LOW | Blocks 10 sprints; +3-4 weeks delay | Validate Testcontainers compatibility in Week 1; fall back to manual Postgres setup | Platform Guild | OPEN |
| R3 | Determinism tests fail due to non-deterministic crypto signatures | MEDIUM | Scanner, Signer, Attestor blocked; compliance issues | Focus determinism tests on the payload hash, not the signature bytes (see sketch below); document non-deterministic algorithms | Crypto Guild | OPEN |
| R4 | Concurrent module tests overwhelm CI infrastructure | HIGH | Test suite timeouts, flaky tests, developer friction | Stagger module test starts (Tier 1 Weeks 7-8, Tier 2 Weeks 9-10); use dedicated CI runners; implement CI parallelization | Platform Guild | OPEN |
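R3's mitigation separates what must be byte-stable (the canonical payload) from what legitimately varies (signature bytes produced by randomized signing schemes). A sketch of that assertion style, assuming SHA-256 over a canonical JSON payload; the helpers below are illustrative and may differ from the real CanonicalJsonAssert delivered by the TestKit epic:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;
using Xunit;

public class AttestationDeterminismTests
{
    // Canonicalisation stand-in: fixed property order (declaration order) and no insignificant whitespace.
    private static byte[] CanonicalPayload(object statement) =>
        Encoding.UTF8.GetBytes(JsonSerializer.Serialize(statement,
            new JsonSerializerOptions { WriteIndented = false }));

    private static string Sha256Hex(byte[] payload) =>
        Convert.ToHexString(SHA256.HashData(payload));

    [Fact]
    public void Payload_hash_is_stable_even_if_signature_bytes_are_not()
    {
        var statement = new { subject = "sha256:abc123", predicateType = "https://example.invalid/verdict" };

        // The determinism gate compares the payload digest across runs...
        var firstDigest = Sha256Hex(CanonicalPayload(statement));
        var secondDigest = Sha256Hex(CanonicalPayload(statement));
        Assert.Equal(firstDigest, secondDigest);

        // ...while signature bytes (e.g. ECDSA with a random nonce) are allowed to differ,
        // so they are cryptographically verified rather than byte-compared (out of scope here).
    }
}
```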
### Medium-Impact Risks

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R5 | Attestor-Signer circular dependency blocks integration tests | MEDIUM | Integration tests delayed 1-2 weeks | Signer uses mock attestation initially; coordinate integration in Week 9 | Crypto Guild | OPEN |
| R6 | Upstream schema drift (NVD, OSV) breaks connector fixtures | MEDIUM | Connector tests fail; manual fixture regeneration required | FixtureUpdater tool automates regeneration; weekly live smoke tests detect drift early | Concelier Guild | OPEN |
| R7 | WebService contract tests too brittle (fail on every API change) | MEDIUM | Developer friction; contract tests disabled | Version APIs explicitly; allow non-breaking changes; review contract test strategy in Week 6 | Platform Guild | OPEN |

### Low-Impact Risks

| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R8 | Property test generation too slow (FsCheck iterations high) | LOW | Test suite timeouts | Limit property test iterations (default 100 → 50); profile and optimize generators | Scanner Guild | OPEN |
| R9 | Architecture tests false-positive (allowlist too restrictive) | LOW | Valid code blocked | Review architecture rules in Week 5; explicit allowlist for test projects and benchmarks | Platform Guild | OPEN |
| R10 | Competitor parity tests require paid Trivy/Anchore licenses | LOW | Parity testing incomplete | Use Trivy free tier; defer Anchore to a future sprint; focus on Syft/Grype (OSS) | QA Guild | OPEN |

### Risk Burn-Down Plan

- **Week 1:** Validate Testcontainers .NET 10 compatibility (R2)
- **Week 2:** TestKit API design review (R1)
- **Week 4:** Determinism test strategy review (R3)
- **Week 6:** CI infrastructure capacity review (R4)
- **Week 8:** Signer-Attestor integration coordination (R5)

---

## Success Metrics

### Quantitative Metrics

| Metric | Baseline | Target | Measurement Method | Cadence |
|--------|----------|--------|--------------------|---------|
| **Code Coverage** | ~40% | ≥70% | `dotnet test --collect:"XPlat Code Coverage"` | Weekly (Fridays) |
| **Test Count** | ~200 tests | ≥500 tests | Test suite execution count | Weekly |
| **Determinism Pass Rate** | N/A (not tracked) | 100% (no flaky tests) | Determinism gate CI job | Daily (CI) |
| **Contract Test Coverage** | 0 WebServices | 13 WebServices (100%) | Contract lane CI job | Weekly |
| **Architecture Violations** | Unknown | 0 violations | Architecture test failures (see NetArchTest sketch below) | Daily (CI, PR gate) |
| **Sprint On-Time Completion** | N/A | ≥80% | Tasks complete by wave deadline | Weekly |

### Qualitative Metrics

| Metric | Success Criteria | Measurement Method | Tracked By |
|--------|------------------|--------------------|------------|
| **Developer Experience** | ≥80% of developers rate test authoring as "easy" or "very easy" | Post-sprint developer survey (Week 14) | Project Manager |
| **Test Maintainability** | ≥75% of test failures are due to actual bugs (not test brittleness) | Monthly test failure classification | QA Guild |
| **Integration Confidence** | ≥90% of PRs pass CI on first attempt (no test fixes required) | CI metrics (PR pass rate) | Platform Guild |
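The Architecture Violations metric and the PR-gating rules from sprint 5100.0007.0007 are written with NetArchTest.Rules. A minimal example of the style follows; the namespaces are purely illustrative placeholders for the real lattice placement rules, and in practice the rule would target the production assembly under test rather than the test assembly.

```csharp
using NetArchTest.Rules;
using Xunit;

public class LatticePlacementTests
{
    // Illustrative rule: web-service hosts must not reference storage internals directly.
    // Namespace names are placeholders; the real lattice rules belong to 5100.0007.0007.
    [Fact]
    public void WebService_layer_does_not_depend_on_storage_internals()
    {
        var result = Types.InAssembly(typeof(LatticePlacementTests).Assembly)
            .That().ResideInNamespace("StellaOps.Scanner.WebService")
            .ShouldNot().HaveDependencyOn("StellaOps.Scanner.Storage.Internal")
            .GetResult();

        // The result also exposes the failing type names, which keeps PR-gate failures actionable.
        Assert.True(result.IsSuccessful);
    }
}
```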
### Program Success Criteria

✅ **Program Successful If:**

- All 23 sprints signed off (5100.0007.*, 5100.0008.0001, 5100.0009.*, 5100.0010.*)
- Code coverage ≥70% platform-wide
- Determinism tests passing 100% in CI (no flaky tests)
- Contract tests enforced for all 13 WebServices
- Architecture tests PR-gating (lattice boundary violations blocked)

❌ **Program Failed If:**

- Fewer than 19 sprints signed off (<80% completion)
- Code coverage increase <20 percentage points (baseline ~40% → <60%)
- Critical quality gates missing (Determinism, Architecture, Contract)
- TestKit not operational (blocking all module tests)

---

## Governance

### Steering Committee

| Role | Name | Responsibility |
|------|------|----------------|
| **Program Sponsor** | CTO | Final escalation; budget approval |
| **Program Manager** | Project Management | Overall program coordination; risk management |
| **Technical Lead** | Platform Guild Lead | Architecture decisions; technical escalation |
| **QA Lead** | QA Guild Lead | Quality gate oversight; test strategy validation |

### Decision-Making Authority

| Decision Type | Authority | Escalation Path |
|---------------|-----------|-----------------|
| **Sprint scope changes** | Sprint owner + Guild lead | Program Manager → Steering Committee |
| **Architecture changes** | Platform Guild Lead | Steering Committee |
| **Resource allocation** | Program Manager | CTO (if >10% budget impact) |
| **Schedule changes (>1 week)** | Program Manager | Steering Committee |
| **Risk acceptance** | Program Manager | Steering Committee (for HIGH/CRITICAL risks) |

### Status Reporting

**Weekly Status Report (Fridays):**

- Sprint completion status (% tasks complete)
- Blockers and risks (RED/YELLOW/GREEN)
- Resource allocation (current vs. planned)
- Next week preview

**Monthly Executive Summary:**

- Program health (on-track / at-risk / off-track)
- Milestone completion (M1-M5)
- Budget vs. actuals
- Key risks and mitigations

### Change Control

**Change Request Process:**

1. **Requester submits change request** (scope, schedule, or resource change)
2. **Program Manager reviews** (impact analysis: cost, schedule, quality)
3. **Steering Committee approves/rejects** (for changes >1 week or >10% budget)
4. **Program Manager updates plan** (timeline, resource model, risk register)

---

## Communication Plan

### Stakeholders

| Stakeholder Group | Interest | Communication Frequency | Method |
|-------------------|----------|-------------------------|--------|
| **Engineering Teams (Guilds)** | Sprint execution, dependencies | Daily/Weekly | Slack #testing-strategy, guild standups |
| **Guild Leads** | Sprint status, blockers | Weekly | Friday status sync (30 min) |
| **Product Management** | Quality gates, feature readiness | Bi-weekly | Sprint demos, monthly exec summary |
| **CTO / Executives** | Program health, budget | Monthly | Executive summary (email) |

### Meetings

#### Weekly Sync (Every Friday, 30 min)

**Attendees:** All active sprint owners + program manager

**Agenda:**

1. Sprint status updates (green/yellow/red) (15 min)
2. Blocker escalation (10 min)
3. Next week preview (5 min)

#### Monthly Steering Committee (First Monday, 60 min)

**Attendees:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)

**Agenda:**

1. Program health review (on-track / at-risk / off-track) (20 min)
2. Milestone completion (M1-M5) (15 min)
3. Budget vs. actuals (10 min)
4. Risk review (top 3 risks) (10 min)
5. Decisions required (5 min)
#### Retrospective (Week 14, 90 min)

**Attendees:** All guild leads + program manager + steering committee

**Agenda:**

1. Program retrospective: what went well, what didn't, lessons learned (60 min)
2. Metrics review: code coverage, test count, determinism, etc. (20 min)
3. Future improvements: next testing initiatives (10 min)

---

## Appendices

### Appendix A: Sprint Inventory

**Total Sprints:** 23

- Foundation Epics: 7 (5100.0007.0001-0007)
- Quality Gates: 1 (5100.0008.0001)
- Module Tests: 11 (5100.0009.0001-0011)
- Infrastructure Tests: 4 (5100.0010.0001-0004)

**Total Tasks:** ~370

**Total Estimated Effort:** ~450 engineer-days (assuming an average of 1.2 days/task)

### Appendix B: Reference Documents

1. **Advisory:** `docs/product-advisories/22-Dec-2026 - Better testing strategy.md`
2. **Test Catalog:** `docs/testing/TEST_CATALOG.yml`
3. **Test Models:** `docs/testing/testing-strategy-models.md`
4. **Dependency Graph:** `docs/testing/SPRINT_DEPENDENCY_GRAPH.md`
5. **Coverage Matrix:** `docs/testing/TEST_COVERAGE_MATRIX.md`
6. **Execution Playbook:** `docs/testing/SPRINT_EXECUTION_PLAYBOOK.md`

### Appendix C: Budget Estimate (Preliminary)

**Assumptions:**

- Average engineer cost: $150/hour (fully loaded)
- Average sprint duration: 80 hours (2 weeks × 40 hours)
- Peak staffing: 22 engineers (Weeks 7-10)

**Budget Estimate:**

- Foundation Phase (Weeks 1-6): 12 engineers × 240 hours × $150 = $432,000
- Module Tests Phase (Weeks 7-10): 22 engineers × 160 hours × $150 = $528,000
- Infrastructure Phase (Weeks 11-14): 8 engineers × 160 hours × $150 = $192,000
- **Total Estimated Cost:** $1,152,000

**Note:** The final budget requires approval from CTO/Finance. Contractor costs may reduce the total if used strategically for peak staffing (Weeks 7-10).

---

**Prepared by:** Project Management

**Approval Required From:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)

**Date:** 2025-12-23

**Next Review:** 2026-01-06 (Week 1 kickoff)