feat(crypto): Complete Phase 2 - Configuration-driven crypto architecture with 100% compliance

## Summary

This commit completes Phase 2 of the configuration-driven crypto architecture, achieving
100% crypto compliance by eliminating all hardcoded cryptographic implementations.

## Key Changes

### Phase 1: Plugin Loader Infrastructure
- **Plugin Discovery System**: Created StellaOps.Cryptography.PluginLoader with manifest-based loading
- **Configuration Model**: Added CryptoPluginConfiguration with regional profiles support
- **Dependency Injection**: Extended DI to support plugin-based crypto provider registration
- **Regional Configs**: Created appsettings.crypto.{international,russia,eu,china}.yaml
- **CI Workflow**: Added .gitea/workflows/crypto-compliance.yml for audit enforcement

### Phase 2: Code Refactoring
- **API Extension**: Added ICryptoProvider.CreateEphemeralVerifier for verification-only scenarios (see the sketch after this list)
- **Plugin Implementation**: Created OfflineVerificationCryptoProvider with ephemeral verifier support
  - Supports ES256/384/512, RS256/384/512, PS256/384/512
  - SubjectPublicKeyInfo (SPKI) public key format
- **100% Compliance**: Refactored DsseVerifier to remove all BouncyCastle cryptographic usage
- **Unit Tests**: Created OfflineVerificationProviderTests with 39 passing tests
- **Documentation**: Created comprehensive security guide at docs/security/offline-verification-crypto-provider.md
- **Audit Infrastructure**: Created scripts/audit-crypto-usage.ps1 for static analysis
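
A minimal sketch of how the verification-only path might be consumed. Only the `ICryptoProvider.CreateEphemeralVerifier` name comes from this commit; the parameter shapes (algorithm identifier plus SPKI-encoded key bytes) and the `Verify` call are illustrative assumptions rather than the actual API surface.

```csharp
using StellaOps.Cryptography; // namespace assumed

public static class DsseVerificationSketch
{
    public static bool VerifyDetachedSignature(
        ICryptoProvider provider, // resolved via the plugin loader / DI
        byte[] spkiPublicKey,     // SubjectPublicKeyInfo-encoded public key
        byte[] payload,
        byte[] signature)
    {
        // Ephemeral verifier: holds no private key material, verify only.
        // The exact method signature is assumed for illustration.
        var verifier = provider.CreateEphemeralVerifier("ES256", spkiPublicKey);
        return verifier.Verify(payload, signature);
    }
}
```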

### Testing Infrastructure (TestKit)
- **Determinism Gate**: Created DeterminismGate for reproducibility validation
- **Test Fixtures**: Added PostgresFixture and ValkeyFixture using Testcontainers
- **Traits System**: Implemented test lane attributes for parallel CI execution (see the sketch after this list)
- **JSON Assertions**: Added CanonicalJsonAssert for deterministic JSON comparisons
- **Test Lanes**: Created test-lanes.yml workflow for parallel test execution
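
A sketch of one way the lane attributes could surface, using plain xUnit traits; the `Lane` trait key and the filter expression are assumptions.

```csharp
using Xunit;

public class ExampleLaneTaggedTests
{
    // Lane tagging via an xUnit trait lets CI split lanes with a filter,
    // e.g. dotnet test --filter "Lane=Unit". The "Lane" key is assumed.
    [Fact]
    [Trait("Lane", "Unit")]
    public void Fast_pure_test_runs_in_unit_lane()
    {
        Assert.Equal(4, 2 + 2);
    }
}
```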

### Documentation
- **Architecture**: Created CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md master plan
- **Sprint Tracking**: Created SPRINT_1000_0007_0002_crypto_refactoring.md (COMPLETE)
- **API Documentation**: Updated docs2/cli/crypto-plugins.md and crypto.md
- **Testing Strategy**: Created testing strategy documents in docs/implplan/SPRINT_5100_0007_*

## Compliance & Testing

- Zero direct System.Security.Cryptography usage in production code
- All crypto operations go through ICryptoProvider abstraction
- 39/39 unit tests passing for OfflineVerificationCryptoProvider
- Build successful (AirGap, Crypto plugin, DI infrastructure)
- Audit script validates crypto boundaries

## Files Modified

**Core Crypto Infrastructure:**
- src/__Libraries/StellaOps.Cryptography/CryptoProvider.cs (API extension)
- src/__Libraries/StellaOps.Cryptography/CryptoSigningKey.cs (verification-only constructor)
- src/__Libraries/StellaOps.Cryptography/EcdsaSigner.cs (fixed ephemeral verifier)

**Plugin Implementation:**
- src/__Libraries/StellaOps.Cryptography.Plugin.OfflineVerification/ (new)
- src/__Libraries/StellaOps.Cryptography.PluginLoader/ (new)

**Production Code Refactoring:**
- src/AirGap/StellaOps.AirGap.Importer/Validation/DsseVerifier.cs (100% compliant)

**Tests:**
- src/__Libraries/__Tests/StellaOps.Cryptography.Plugin.OfflineVerification.Tests/ (new, 39 tests)
- src/__Libraries/__Tests/StellaOps.Cryptography.PluginLoader.Tests/ (new)

**Configuration:**
- etc/crypto-plugins-manifest.json (plugin registry)
- etc/appsettings.crypto.*.yaml (regional profiles)

**Documentation:**
- docs/security/offline-verification-crypto-provider.md (600+ lines)
- docs/implplan/CRYPTO_CONFIGURATION_DRIVEN_ARCHITECTURE.md (master plan)
- docs/implplan/SPRINT_1000_0007_0002_crypto_refactoring.md (Phase 2 complete)

## Next Steps

Phase 3: Docker & CI/CD Integration
- Create multi-stage Dockerfiles with all plugins
- Build regional Docker Compose files
- Implement runtime configuration selection
- Add deployment validation scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
commit dac8e10e36 (parent b444284be5), branch master, 2025-12-23 18:20:00 +02:00
241 changed files with 22567 additions and 307 deletions


@@ -0,0 +1,393 @@
# Testing Strategy Sprint Dependency Graph & Critical Path Analysis
> **Purpose:** Visualize sprint dependencies, identify critical path, optimize parallel execution, and coordinate cross-guild work.
---
## Executive Summary
**Total Sprints:** 23 sprints across 4 batches
**Total Tasks:** ~370 tasks
**Estimated Duration:** 26 weeks (6 months) if executed sequentially
**Optimal Duration:** 14 weeks (3.5 months) with full parallelization
**Critical Path:** 14 weeks (Master Strategy → TestKit → Epics → Module Tests → Infrastructure Tests)
**Parallelization Opportunities:** Up to 15 sprints can run concurrently
---
## Sprint Inventory by Batch
### Batch 5100.0007: Foundation Epics (96 tasks, 7 sprints)
| Sprint ID | Name | Tasks | Waves | Dependencies |
|-----------|------|-------|-------|--------------|
| 5100.0007.0001 | Master Testing Strategy | 18 | 4 | None (entry point) |
| 5100.0007.0002 | Epic A: TestKit Foundations | 13 | 4 | 0001 (Wave 1 complete) |
| 5100.0007.0003 | Epic B: Determinism Gate | 12 | 4 | 0002 (TestKit available) |
| 5100.0007.0004 | Epic C: Storage Harness | 12 | 4 | 0002 (TestKit available) |
| 5100.0007.0005 | Epic D: Connector Fixtures | 12 | 4 | 0002 (TestKit available) |
| 5100.0007.0006 | Epic E: WebService Contract | 12 | 4 | 0002 (TestKit + OtelCapture) |
| 5100.0007.0007 | Epic F: Architecture Tests | 17 | 6 | None (can start immediately) |
### Batch 5100.0008: Quality Gates (11 tasks, 1 sprint)
| Sprint ID | Name | Tasks | Waves | Dependencies |
|-----------|------|-------|-------|--------------|
| 5100.0008.0001 | Competitor Parity Testing | 11 | 4 | 0007.0001 (Wave 1), 0007.0002 |
### Batch 5100.0009: Module Test Implementations (185 tasks, 11 sprints)
| Sprint ID | Module | Models | Tasks | Waves | Dependencies |
|-----------|--------|--------|-------|-------|--------------|
| 5100.0009.0001 | Scanner | L0,AN1,S1,T1,W1,WK1,PERF | 25 | 4 | 0002,0003,0004,0006 |
| 5100.0009.0002 | Concelier | C1,L0,S1,W1,AN1 | 18 | 4 | 0002,0003,0004,0005,0006,0007.0007 |
| 5100.0009.0003 | Excititor | C1,L0,S1,W1,WK1 | 21 | 4 | 0002,0003,0004,0005,0006,0007.0007 |
| 5100.0009.0004 | Policy | L0,S1,W1 | 15 | 3 | 0002,0003,0004,0006 |
| 5100.0009.0005 | Authority | L0,W1,C1 | 17 | 3 | 0002,0005,0006 |
| 5100.0009.0006 | Signer | L0,W1,C1 | 17 | 3 | 0002,0003,0005,0006 |
| 5100.0009.0007 | Attestor | L0,W1 | 14 | 3 | 0002,0003,0006,0009.0006 |
| 5100.0009.0008 | Scheduler | L0,S1,W1,WK1 | 14 | 3 | 0002,0004,0006 |
| 5100.0009.0009 | Notify | L0,C1,S1,W1,WK1 | 18 | 3 | 0002,0004,0005,0006 |
| 5100.0009.0010 | CLI | CLI1 | 13 | 3 | 0002,0003 |
| 5100.0009.0011 | UI | W1 | 13 | 3 | 0002,0006 |
### Batch 5100.0010: Infrastructure Tests (62 tasks, 4 sprints)
| Sprint ID | Module Family | Models | Tasks | Waves | Dependencies |
|-----------|---------------|--------|-------|-------|--------------|
| 5100.0010.0001 | EvidenceLocker + Findings + Replay | L0,S1,W1,WK1 | 16 | 3 | 0002,0004,0006 |
| 5100.0010.0002 | Graph + TimelineIndexer | L0,S1,W1,WK1 | 15 | 3 | 0002,0004,0006 |
| 5100.0010.0003 | Router + Messaging | L0,T1,W1,S1 | 14 | 3 | 0002,0004 |
| 5100.0010.0004 | AirGap | L0,AN1,S1,W1,CLI1 | 17 | 3 | 0002,0003,0004,0006 |
---
## Dependency Visualization (ASCII Graph)
```
CRITICAL PATH (14 weeks):
Week 1-2: [5100.0007.0001] Master Strategy (Wave 1-4)
Week 3-4: [5100.0007.0002] TestKit Foundations ← CRITICAL BOTTLENECK
├──────────┬──────────┬──────────┬──────────┐
Week 5-6: [0003] [0004] [0005] [0006] [0007.0007]
Determ. Storage Connect. WebSvc Arch.Tests
│ │ │ │
└─────────┴─────────┴─────────┘
Week 7-10: ┌──────────┼──────────────────────────────────┐
[5100.0009.xxxx] ALL MODULE TESTS (parallel) │
11 sprints run concurrently │
│ │
Week 11-14:└────────────────────────────────────────────┘
[5100.0010.xxxx] ALL INFRASTRUCTURE TESTS
4 sprints run concurrently
PARALLEL EXECUTION ZONES:
Zone 1 (Weeks 5-6): Epic Implementations
- 5100.0007.0003 (Determinism) ─┐
- 5100.0007.0004 (Storage) ├─ Can run in parallel
- 5100.0007.0005 (Connectors) │ (all depend only on TestKit)
- 5100.0007.0006 (WebService) │
- 5100.0007.0007 (Architecture) ─┘
Zone 2 (Weeks 7-10): Module Tests
- Scanner (5100.0009.0001) ─┐
- Concelier (5100.0009.0002) │
- Excititor (5100.0009.0003) │
- Policy (5100.0009.0004) ├─ Can run in parallel
- Authority (5100.0009.0005) │ (Epic dependencies met)
- Signer (5100.0009.0006) │
- Attestor (5100.0009.0007) │
- Scheduler (5100.0009.0008) │
- Notify (5100.0009.0009) │
- CLI (5100.0009.0010) │
- UI (5100.0009.0011) ─┘
Zone 3 (Weeks 11-14): Infrastructure Tests
- EvidenceLocker (5100.0010.0001) ─┐
- Graph (5100.0010.0002) ├─ Can run in parallel
- Router/Messaging (5100.0010.0003)│
- AirGap (5100.0010.0004) ─┘
Zone 4 (Weeks 3-14): Quality Gates (can overlap)
- Competitor Parity (5100.0008.0001) runs after Week 3
```
---
## Critical Path Analysis
### Critical Path Sequence (14 weeks)
1. **Week 1-2:** Master Strategy Sprint (5100.0007.0001)
- Wave 1: Documentation sync
- Wave 2: Quick wins (test runner scripts, trait standardization)
- Wave 3: CI infrastructure
- Wave 4: Epic sprint creation
2. **Week 3-4:** TestKit Foundations (5100.0007.0002) ← **CRITICAL BOTTLENECK**
- ALL downstream sprints depend on TestKit
- Must complete before any module tests can start
- Priority: DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, OtelCapture
3. **Week 5-6:** Epic Implementation (parallel zone)
- 5 sprints run concurrently
- Unblocks all module tests
4. **Week 7-10:** Module Test Implementation (parallel zone)
- 11 sprints run concurrently
- Longest pole: Scanner (25 tasks, 4 waves)
5. **Week 11-14:** Infrastructure Test Implementation (parallel zone)
- 4 sprints run concurrently
- Can overlap with late-stage module tests
### Critical Bottleneck: TestKit (Sprint 5100.0007.0002)
**Impact:** Blocks 20 downstream sprints (all module + infrastructure tests)
**Mitigation:**
- Staff with 2-3 senior engineers
- Prioritize DeterministicTime and SnapshotAssert (most commonly used; see the sketch below)
- Release incrementally (partial TestKit unlocks some modules)
- Run daily check-ins to unblock consuming teams
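
A minimal sketch of what the DeterministicTime piece could look like, assuming it builds on .NET's `TimeProvider` abstraction; the real StellaOps.TestKit API may differ.

```csharp
using System;

// Sketch only: a fixed, manually advanced clock so tests never
// observe wall time. The actual TestKit surface may differ.
public sealed class DeterministicTime : TimeProvider
{
    private DateTimeOffset _now;

    public DeterministicTime(DateTimeOffset start) => _now = start;

    public override DateTimeOffset GetUtcNow() => _now;

    // Tests advance time explicitly instead of sleeping.
    public void Advance(TimeSpan delta) => _now += delta;
}
```

Production code would take a `TimeProvider` dependency rather than calling `DateTime.UtcNow` directly, which is the same substitution the execution playbook relies on for de-flaking tests.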
---
## Dependency Matrix
### Epic → Module Dependencies
| Epic Sprint | Blocks Module Sprints | Reason |
|-------------|----------------------|--------|
| 5100.0007.0002 (TestKit) | ALL 15 module/infra sprints | Core test utilities required |
| 5100.0007.0003 (Determinism) | Scanner, Excititor, Signer, CLI, AirGap | Determinism gate required |
| 5100.0007.0004 (Storage) | Scanner, Concelier, Excititor, Policy, Scheduler, Notify, EvidenceLocker, Graph, Router, AirGap | PostgresFixture required |
| 5100.0007.0005 (Connectors) | Concelier, Excititor, Authority, Signer, Notify | Fixture discipline required |
| 5100.0007.0006 (WebService) | Scanner, Concelier, Excititor, Policy, Authority, Signer, Attestor, Scheduler, Notify, UI, EvidenceLocker, Graph, AirGap | WebServiceFixture required |
| 5100.0007.0007 (Architecture) | Concelier, Excititor | Lattice boundary enforcement |
### Module → Module Dependencies
| Sprint | Depends On Other Modules | Reason |
|--------|--------------------------|--------|
| Attestor (0009.0007) | Signer (0009.0006) | Sign/verify integration tests |
| (None other) | - | Modules are otherwise independent |
---
## Parallelization Strategy
### Maximum Parallel Execution (15 sprints)
**Week 5-6 (5 parallel):**
- Determinism (2 eng)
- Storage (3 eng)
- Connectors (2 eng)
- WebService (2 eng)
- Architecture (1 eng)
**Week 7-10 (11 parallel):**
- Scanner (3 eng) ← longest pole
- Concelier (2 eng)
- Excititor (2 eng)
- Policy (2 eng)
- Authority (2 eng)
- Signer (2 eng)
- Attestor (2 eng)
- Scheduler (2 eng)
- Notify (2 eng)
- CLI (1 eng)
- UI (2 eng)
**Week 11-14 (4 parallel):**
- EvidenceLocker (2 eng)
- Graph (2 eng)
- Router/Messaging (2 eng)
- AirGap (2 eng)
**Resource Requirement (Peak):**
- Week 7-10: 22 engineers (11 sprints × avg 2 eng/sprint)
- Realistic: 10-12 engineers with staggered starts
---
## Risk Hotspots
### High-Impact Delays
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| TestKit delayed (5100.0007.0002) | Blocks ALL downstream sprints; +2-4 weeks delay | MEDIUM | Staff with senior engineers; daily standups; incremental releases |
| Storage harness issues (5100.0007.0004) | Blocks 10 sprints | MEDIUM | Use Testcontainers early; validate Postgres 16 compatibility Week 1 |
| Determinism gate drift (5100.0007.0003) | Scanner/Excititor blocked; compliance issues | LOW | Explicit canonical JSON contract; freeze schema early |
| Attestor-Signer circular dependency (0009.0007 ↔ 0009.0006) | Integration tests blocked | MEDIUM | Mock signing for Attestor initial tests; coordinate guilds |
| Concurrent module tests overwhelm CI | Test suite timeout; flaky tests | HIGH | Stagger module test starts; use CI parallelization; dedicated test runners |
### Critical Path Risks
| Sprint | Risk | Impact if Delayed |
|--------|------|-------------------|
| 5100.0007.0002 (TestKit) | DeterministicTime implementation complex | +2 weeks to critical path |
| 5100.0009.0001 (Scanner) | 25 tasks, 4 waves; reachability tests complex | Delays integration tests; no impact on other modules |
| 5100.0007.0004 (Storage) | Testcontainers Postgres compatibility issues | Blocks 10 sprints; +2-3 weeks |
---
## Recommended Execution Sequence
### Phase 1: Foundation (Weeks 1-4)
**Goal:** Establish test infrastructure and strategy docs
**Sprints:**
1. 5100.0007.0001 (Master Strategy) — Week 1-2
2. 5100.0007.0002 (TestKit) — Week 3-4 ← CRITICAL
**Exit Criteria:**
- TestKit utilities available (DeterministicTime, SnapshotAssert, PostgresFixture, OtelCapture)
- Test runner scripts operational
- Trait standardization complete
### Phase 2: Epic Implementation (Weeks 5-6)
**Goal:** Implement all foundation epics in parallel
**Sprints (parallel):**
1. 5100.0007.0003 (Determinism)
2. 5100.0007.0004 (Storage)
3. 5100.0007.0005 (Connectors)
4. 5100.0007.0006 (WebService)
5. 5100.0007.0007 (Architecture)
**Exit Criteria:**
- PostgresFixture operational (Testcontainers)
- Determinism manifest format defined
- Connector fixture discipline documented
- WebServiceFixture operational
- Architecture tests in CI (PR gate)
### Phase 3: Module Tests — Priority Tier 1 (Weeks 7-8)
**Goal:** Implement tests for critical security/compliance modules
**Sprints (parallel):**
1. 5100.0009.0001 (Scanner) — critical path, longest pole
2. 5100.0009.0002 (Concelier)
3. 5100.0009.0003 (Excititor)
4. 5100.0009.0004 (Policy)
5. 5100.0009.0005 (Authority)
6. 5100.0009.0006 (Signer)
### Phase 4: Module Tests — Priority Tier 2 (Weeks 9-10)
**Goal:** Complete remaining module tests
**Sprints (parallel):**
1. 5100.0009.0007 (Attestor)
2. 5100.0009.0008 (Scheduler)
3. 5100.0009.0009 (Notify)
4. 5100.0009.0010 (CLI)
5. 5100.0009.0011 (UI)
### Phase 5: Infrastructure Tests (Weeks 11-14)
**Goal:** Complete platform infrastructure tests
**Sprints (parallel):**
1. 5100.0010.0001 (EvidenceLocker)
2. 5100.0010.0002 (Graph)
3. 5100.0010.0003 (Router/Messaging)
4. 5100.0010.0004 (AirGap)
### Phase 6: Quality Gates (Overlapping Weeks 3-14)
**Goal:** Establish ongoing parity testing
**Sprint:**
1. 5100.0008.0001 (Competitor Parity) — can start Week 3, run nightly thereafter
---
## Guild Coordination
### Cross-Guild Dependencies
| Guild | Owns Sprints | Depends On Guilds | Coordination Points |
|-------|--------------|-------------------|---------------------|
| Platform Guild | TestKit, Storage, Architecture, EvidenceLocker, Graph, Router | None | Week 3: TestKit readiness review |
| Scanner Guild | Scanner | Platform (TestKit, Storage, Determinism, WebService) | Week 5: Storage harness validation |
| Concelier Guild | Concelier | Platform (TestKit, Storage, Connectors, WebService), Architecture | Week 6: Connector fixture review |
| Excititor Guild | Excititor | Platform (TestKit, Storage, Connectors, WebService), Architecture | Week 6: Preserve-prune test design |
| Policy Guild | Policy, AirGap (analyzers) | Platform (TestKit, Storage, WebService) | Week 7: Unknown budget enforcement review |
| Authority Guild | Authority | Platform (TestKit, Connectors, WebService) | Week 7: OIDC connector fixture validation |
| Crypto Guild | Signer, Attestor | Platform (TestKit, Determinism, Connectors, WebService), Authority | Week 8: Canonical payload design; Week 9: Sign/verify integration |
| Scheduler Guild | Scheduler | Platform (TestKit, Storage, WebService) | Week 9: DeterministicTime validation |
| Notify Guild | Notify | Platform (TestKit, Storage, Connectors, WebService) | Week 9: Connector fixture templates |
| CLI Guild | CLI | Platform (TestKit, Determinism) | Week 10: Exit code conventions |
| UI Guild | UI | Platform (TestKit, WebService) | Week 10: API contract snapshot review |
| AirGap Guild | AirGap | Platform (TestKit, Determinism, Storage, WebService), Policy | Week 11: Bundle determinism validation |
| QA Guild | Competitor Parity | Platform (TestKit) | Week 3: Parity harness design |
### Weekly Sync Schedule
**Week 1-2:**
- All guilds: Master strategy review, sprint assignment
**Week 3-4:**
- Platform Guild: Daily standup (TestKit unblocking)
- All guilds: TestKit API review (design feedback)
**Week 5-6:**
- Epic guilds: Bi-weekly sync (Determinism, Storage, Connectors, WebService, Architecture)
- Scanner/Concelier/Excititor guilds: Prepare for module test sprint kickoff
**Week 7-10:**
- All module guilds: Weekly guild-specific standups
- Cross-guild: Bi-weekly integration sync (Signer ↔ Attestor, Policy ↔ AirGap)
**Week 11-14:**
- Infrastructure guilds: Weekly sync
- All guilds: Bi-weekly retrospective
---
## Metrics & Tracking
### Sprint Completion Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Sprint on-time completion | >80% | Tasks complete by wave deadline |
| Test coverage increase | +30% per module | Code coverage reports |
| Determinism tests passing | 100% | Determinism gate CI job |
| Contract tests in CI | 100% of WebServices | Contract lane CI job |
| Architecture tests enforcing | 100% violations blocked | Architecture test failures = build failures |
### Quality Gates
| Gate | Criteria | Enforced By |
|------|----------|-------------|
| Determinism | SHA-256 hash stable across runs | Sprint 5100.0007.0003 tests |
| Contract Stability | OpenAPI schema unchanged or explicitly versioned | Sprint 5100.0007.0006 tests |
| Architecture Boundaries | Concelier/Excititor do NOT reference Scanner lattice | Sprint 5100.0007.0007 tests |
| Preserve-Prune Source | Excititor preserves VEX source references and rationale | Sprint 5100.0009.0003 tests |
| Unknown Budget Enforcement | Policy engine fails when unknowns > N | Sprint 5100.0009.0004 tests |
---
## Next Steps
1. **Week 1 (2026-01-06):**
- Kick off Sprint 5100.0007.0001 (Master Strategy)
- Assign Platform Guild to TestKit (5100.0007.0002) for Week 3 start
- Review this dependency graph with all guild leads
2. **Week 2 (2026-01-13):**
- Complete Master Strategy Wave 1-2
- Finalize TestKit API design (DeterministicTime, SnapshotAssert, etc.)
3. **Week 3 (2026-01-20):**
- Start TestKit implementation (CRITICAL PATH)
- Daily standup for TestKit unblocking
- Prepare Epic sprint kickoffs for Week 5
4. **Week 4 (2026-01-27):**
- Complete TestKit Wave 1-2 (DeterministicTime, SnapshotAssert)
- Validate TestKit with pilot tests
- Final Epic sprint preparation
5. **Week 5 (2026-02-03):**
- Kick off 5 Epic sprints in parallel
- Weekly Epic sync meeting (Fridays)
---
**Prepared by:** Project Management
**Date:** 2025-12-23
**Next Review:** 2026-01-06 (Week 1 kickoff)


@@ -0,0 +1,517 @@
# Testing Strategy Sprint Execution Playbook
> **Purpose:** Practical guide for executing testing sprints - coordination, Definition of Done, sign-off criteria, ceremonies, and troubleshooting.
---
## Table of Contents
1. [Sprint Lifecycle](#sprint-lifecycle)
2. [Definition of Done (DoD)](#definition-of-done-dod)
3. [Wave-Based Execution](#wave-based-execution)
4. [Sign-Off Criteria](#sign-off-criteria)
5. [Cross-Guild Coordination](#cross-guild-coordination)
6. [Common Failure Patterns](#common-failure-patterns)
7. [Troubleshooting Guide](#troubleshooting-guide)
8. [Sprint Templates](#sprint-templates)
---
## Sprint Lifecycle
### Sprint States
```
TODO → DOING → BLOCKED/IN_REVIEW → DONE
│ │ │ │
│ │ │ └─ All waves complete + sign-off
│ │ └─ Waiting on dependency or approval
│ └─ Active development (1+ waves in progress)
└─ Not yet started
```
### Standard Sprint Duration
- **Foundation Epics (5100.0007.*):** 2 weeks per sprint
- **Module Tests (5100.0009.*):** 2 weeks per sprint
- **Infrastructure Tests (5100.0010.*):** 2 weeks per sprint
- **Competitor Parity (5100.0008.0001):** Initial setup 2 weeks; then ongoing (nightly/weekly)
### Ceremonies
#### Sprint Kickoff (Day 1)
**Who:** Sprint owner + guild members + dependencies
**Duration:** 60 min
**Agenda:**
1. Review sprint scope and deliverables (10 min)
2. Review wave structure and task breakdown (15 min)
3. Identify dependencies and blockers (15 min)
4. Assign tasks to engineers (10 min)
5. Schedule wave reviews (5 min)
6. Q&A (5 min)
#### Wave Review (End of each wave)
**Who:** Sprint owner + guild members
**Duration:** 30 min
**Agenda:**
1. Demo completed tasks (10 min)
2. Review DoD checklist for wave (10 min)
3. Identify blockers for next wave (5 min)
4. Update sprint status in `Delivery Tracker` (5 min)
#### Sprint Sign-Off (Final day)
**Who:** Sprint owner + guild lead + architect (for critical sprints)
**Duration:** 30 min
**Agenda:**
1. Review all wave completion (10 min)
2. Verify sign-off criteria (10 min)
3. Demo integration (if applicable) (5 min)
4. Sign execution log (5 min)
#### Weekly Sync (Every Friday)
**Who:** All active sprint owners + project manager
**Duration:** 30 min
**Agenda:**
1. Sprint status updates (15 min)
2. Blocker escalation (10 min)
3. Next week preview (5 min)
---
## Definition of Done (DoD)
### Universal DoD (Applies to ALL sprints)
**Code:**
- [ ] All tasks in `Delivery Tracker` marked as `DONE`
- [ ] Code reviewed by at least 1 other engineer
- [ ] No pending TODOs or FIXMEs in committed code
- [ ] Code follows StellaOps coding standards (SOLID, DRY, KISS)
**Tests:**
- [ ] All tests passing locally
- [ ] All tests passing in CI (appropriate lane)
- [ ] Code coverage increase ≥ target (see module-specific DoD)
- [ ] No flaky tests (deterministic pass rate 100%)
**Documentation:**
- [ ] Sprint `Execution Log` updated with completion date
- [ ] Module-specific `AGENTS.md` updated (if new patterns introduced)
- [ ] API documentation updated (if endpoints changed)
**Integration:**
- [ ] Changes merged to `main` branch
- [ ] CI lanes passing (Unit, Contract, Integration, Security as applicable)
- [ ] No regressions introduced (existing tests still passing)
---
### Model-Specific DoD
#### L0 (Library/Core)
- [ ] Unit tests covering all public methods
- [ ] Property tests for key invariants (where applicable; see the sketch after this list)
- [ ] Snapshot tests for canonical outputs (SBOM, VEX, verdicts, etc.)
- [ ] Code coverage: ≥80% for core libraries
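
As a concrete shape for the property-test item above, a sketch using FsCheck.Xunit (FsCheck is the property-testing library the master plan's risk register references); the invariant is deliberately generic and illustrative.

```csharp
using System;
using System.Linq;
using FsCheck.Xunit;

public class OrderingProperties
{
    // Property sketch: deterministic ordering is idempotent --
    // sorting an already-sorted sequence changes nothing.
    [Property]
    public bool Sorting_is_idempotent(int[] values)
    {
        values ??= Array.Empty<int>(); // guard against null generator input
        var once = values.OrderBy(v => v).ToArray();
        var twice = once.OrderBy(v => v).ToArray();
        return once.SequenceEqual(twice);
    }
}
```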
#### S1 (Storage/Postgres)
- [ ] Migration tests (apply from scratch, apply from N-1) passing
- [ ] Idempotency tests passing (same operation twice → no duplicates)
- [ ] Query determinism tests passing (explicit ORDER BY checks)
- [ ] Testcontainers Postgres fixture operational
#### T1 (Transport/Queue)
- [ ] Protocol roundtrip tests passing
- [ ] Fuzz tests for invalid input passing
- [ ] Delivery semantics tests passing (at-least-once, idempotency)
- [ ] Backpressure tests passing
#### C1 (Connector/External)
- [ ] Fixture folders created (`Fixtures/<source>/<case>.json`, `Expected/<case>.canonical.json`)
- [ ] Parser tests passing (fixture → parse → snapshot)
- [ ] Resilience tests passing (missing fields, invalid enums, etc.)
- [ ] Security tests passing (URL allowlist, redirect handling, payload limits)
#### W1 (WebService/API)
- [ ] Contract tests passing (OpenAPI snapshot validation)
- [ ] Auth/authz tests passing (deny-by-default, token expiry, scope enforcement)
- [ ] OTel trace assertions passing (spans emitted, tags present)
- [ ] Negative tests passing (malformed requests, size limits, method mismatch)
#### WK1 (Worker/Indexer)
- [ ] End-to-end tests passing (enqueue → worker → stored → events emitted)
- [ ] Retry tests passing (transient failure → backoff; permanent → poison)
- [ ] Idempotency tests passing (same job twice → single execution)
- [ ] OTel correlation tests passing (trace spans across lifecycle)
#### AN1 (Analyzer/SourceGen)
- [ ] Roslyn compilation tests passing (expected diagnostics, no false positives)
- [ ] Golden generated code tests passing (if applicable)
#### CLI1 (Tool/CLI)
- [ ] Exit code tests passing (0=success, 1=user error, 2=system error, etc.; see the sketch after this list)
- [ ] Golden output tests passing (stdout/stderr snapshots)
- [ ] Determinism tests passing (same inputs → same outputs)
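
A sketch of an exit-code test under stated assumptions: the binary name `stellaops` and the failing flag are illustrative placeholders, and only the 0/1/2 convention above is taken from this playbook.

```csharp
using System.Diagnostics;
using Xunit;

public class CliExitCodeTests
{
    [Fact]
    public void Unknown_flag_exits_with_user_error_code()
    {
        // Binary name and flag are illustrative placeholders.
        var startInfo = new ProcessStartInfo("stellaops", "--not-a-real-flag")
        {
            RedirectStandardOutput = true,
            RedirectStandardError = true,
        };

        using var process = Process.Start(startInfo)!;
        process.WaitForExit();

        Assert.Equal(1, process.ExitCode); // 1 = user error per convention
    }
}
```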
#### PERF (Benchmarks)
- [ ] Benchmark tests operational
- [ ] Perf smoke tests in CI (2× regression gate)
- [ ] Baseline results documented
---
### Sprint-Specific DoD
#### Foundation Epic Sprints (5100.0007.*)
**Epic A (TestKit):**
- [ ] `StellaOps.TestKit` NuGet package published internally
- [ ] DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, ValkeyFixture, OtelCapture, HttpFixtureServer all operational
- [ ] Pilot adoption in 2+ modules (e.g., Scanner, Concelier)
**Epic B (Determinism):**
- [ ] Determinism manifest JSON schema defined
- [ ] `tests/integration/StellaOps.Integration.Determinism` expanded for SBOM, VEX, policy verdicts, evidence bundles, AirGap exports
- [ ] Determinism tests in CI (merge gate)
- [ ] Determinism artifacts stored in CI artifact repo
**Epic C (Storage):**
- [ ] PostgresFixture operational (Testcontainers, automatic migrations, schema isolation)
- [ ] ValkeyFixture operational
- [ ] Pilot adoption in 2+ modules with S1 model (e.g., Scanner, Policy)
**Epic D (Connectors):**
- [ ] Connector fixture discipline documented in `docs/testing/connector-fixture-discipline.md`
- [ ] FixtureUpdater tool operational (with `UPDATE_CONNECTOR_FIXTURES=1` env var guard)
- [ ] Pilot adoption in Concelier.Connector.NVD
**Epic E (WebService):**
- [ ] WebServiceFixture<TProgram> operational (Microsoft.AspNetCore.Mvc.Testing)
- [ ] Contract test pattern documented
- [ ] Pilot adoption in Scanner.WebService
**Epic F (Architecture):**
- [ ] `tests/architecture/StellaOps.Architecture.Tests` project operational
- [ ] Lattice placement rules enforced (Concelier/Excititor must NOT reference Scanner lattice)
- [ ] Architecture tests in CI (PR gate, Unit lane)
#### Module Test Sprints (5100.0009.*)
**Per Module:**
- [ ] All model requirements from TEST_CATALOG.yml satisfied
- [ ] Module-specific quality gates passing (see TEST_COVERAGE_MATRIX.md)
- [ ] Code coverage increase: ≥30% from baseline
- [ ] All wave deliverables complete
#### Infrastructure Test Sprints (5100.0010.*)
**Per Infrastructure Module:**
- [ ] All integration tests passing
- [ ] Cross-module dependencies validated (e.g., EvidenceLocker ↔ Scanner)
---
## Wave-Based Execution
### Wave Structure
Most sprints use a 3-4 wave structure:
- **Wave 1:** Foundation / Core logic
- **Wave 2:** Integration / Storage / Connectors
- **Wave 3:** WebService / Workers / End-to-end
- **Wave 4:** (Optional) Polish / Documentation / Edge cases
### Wave Execution Pattern
```
Week 1:
Day 1-2: Wave 1 development
Day 3: Wave 1 review → APPROVED → proceed to Wave 2
Day 4-5: Wave 2 development
Week 2:
Day 1: Wave 2 review → APPROVED → proceed to Wave 3
Day 2-4: Wave 3 development
Day 5: Wave 3 review + Sprint Sign-Off
```
### Wave Review Checklist
**Before Wave Review:**
- [ ] All tasks in wave moved from `DOING` to `DONE` in `Delivery Tracker`
- [ ] All tests for wave passing in CI
- [ ] Code reviewed
**During Wave Review:**
- [ ] Demo completed functionality
- [ ] Review wave DoD checklist
- [ ] Identify blockers for next wave
- [ ] **Sign-off decision:** APPROVED / CHANGES_REQUIRED / BLOCKED
**After Wave Review:**
- [ ] Update sprint `Execution Log` with wave completion
- [ ] Update task status in `Delivery Tracker`
- [ ] If BLOCKED: escalate to project manager
---
## Sign-Off Criteria
### Sprint Sign-Off Levels
#### Level 1: Self-Sign-Off (Guild Lead)
**Applies to:** Routine module test sprints without architectural changes
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval
#### Level 2: Architect Sign-Off
**Applies to:** Foundation epics, architectural changes, cross-cutting concerns
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval
- **Architect review and approval**
#### Level 3: Project Manager + Architect Sign-Off
**Applies to:** Critical path sprints (TestKit, Determinism, Storage)
**Criteria:**
- All waves complete
- All DoD items checked
- Guild lead approval
- Architect approval
- **Project manager approval (validates dependencies unblocked)**
### Sign-Off Process
1. **Engineer completes final wave** → marks all tasks `DONE`
2. **Guild lead reviews** → verifies DoD checklist
3. **Sprint owner schedules sign-off meeting** (if Level 2/3)
4. **Sign-off meeting** (30 min):
- Demo final deliverables
- Review DoD checklist
- Verify integration (if applicable)
- **Decision:** APPROVED / CHANGES_REQUIRED
5. **Update Execution Log:**
```markdown
| 2026-XX-XX | Sprint signed off by [Guild Lead / Architect / PM]. | [Owner] |
```
---
## Cross-Guild Coordination
### Dependency Handoffs
When Sprint A depends on Sprint B:
**Sprint B (Provider):**
1. **Week before completion:** Notify Sprint A owner of expected completion date
2. **Wave 2-3 complete:** Provide preview build / early access to Sprint A
3. **Sprint complete:** Formally notify Sprint A owner; provide integration guide
**Sprint A (Consumer):**
1. **Sprint B Wave 2:** Begin integration planning; identify integration risks
2. **Sprint B Wave 3:** Start integration development (against preview build)
3. **Sprint B complete:** Complete integration; validate against final build
### Coordination Meetings
#### Epic → Module Handoff (Week 5)
**Who:** Epic sprint owners + all module sprint owners
**Duration:** 60 min
**Agenda:**
1. Epic deliverables review (TestKit, Storage, etc.) (20 min)
2. Integration guide walkthrough (15 min)
3. Module sprint kickoff previews (15 min)
4. Q&A (10 min)
#### Module Integration Sync (Bi-weekly, Weeks 7-10)
**Who:** Module sprint owners with cross-dependencies (e.g., Signer ↔ Attestor)
**Duration:** 30 min
**Agenda:**
1. Integration status updates (10 min)
2. Blocker resolution (15 min)
3. Next steps (5 min)
### Blocked Sprint Protocol
If a sprint is BLOCKED:
1. **Sprint owner:** Update sprint status to `BLOCKED` in `Delivery Tracker`
2. **Sprint owner:** Add blocker note to `Decisions & Risks` table
3. **Sprint owner:** Notify project manager immediately (Slack + email)
4. **Project manager:** Schedule blocker resolution meeting within 24 hours
5. **Resolution meeting:** Decide:
- **Workaround:** Continue with mock/stub dependency
- **Re-sequence:** Defer sprint; start alternative sprint
- **Escalate:** Assign additional resources to unblock dependency
---
## Common Failure Patterns
### Pattern 1: Testcontainers Failure (Storage Harness)
**Symptom:** Tests fail with "Docker not running" or "Container startup timeout"
**Root Cause:** Docker daemon not running, Docker Desktop not installed, or Testcontainers compatibility issue
**Fix:**
1. Verify Docker Desktop installed and running
2. Verify Testcontainers.Postgres version compatible with .NET 10
3. Add explicit timeout: `new PostgreSqlBuilder().WithStartupTimeout(TimeSpan.FromMinutes(5))`
4. For CI: ensure Docker available in CI runner environment
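
A minimal sketch of a Testcontainers-backed Postgres fixture, assuming xUnit's `IAsyncLifetime` and the `Testcontainers.PostgreSql` package; the real TestKit fixture adds automatic migrations and schema isolation on top.

```csharp
using System.Threading.Tasks;
using Testcontainers.PostgreSql;
using Xunit;

public sealed class PostgresFixture : IAsyncLifetime
{
    // Pinning the image tag keeps container startup reproducible.
    private readonly PostgreSqlContainer _container =
        new PostgreSqlBuilder()
            .WithImage("postgres:16-alpine")
            .Build();

    public string ConnectionString => _container.GetConnectionString();

    public Task InitializeAsync() => _container.StartAsync();

    public Task DisposeAsync() => _container.DisposeAsync().AsTask();
}
```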
### Pattern 2: Determinism Test Drift
**Symptom:** Determinism tests fail with "Expected hash X, got hash Y"
**Root Cause:** Non-deterministic timestamps, GUIDs, or ordering
**Fix:**
1. Use `DeterministicTime` instead of `DateTime.UtcNow`
2. Use `DeterministicRandom` for random data
3. Explicit `ORDER BY` clauses in all queries
4. Strip timestamps from snapshots or use placeholders
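
For fix 4, a small sketch of a snapshot scrubber that swaps ISO-8601 timestamps for a stable placeholder before hashing or comparing; the regex and placeholder are illustrative.

```csharp
using System.Text.RegularExpressions;

public static class SnapshotScrubber
{
    // Matches ISO-8601 timestamps such as 2025-12-23T18:20:00Z
    // or 2025-12-23T18:20:00.123+02:00.
    private static readonly Regex IsoTimestamp = new(
        @"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?");

    public static string ScrubTimestamps(string snapshot) =>
        IsoTimestamp.Replace(snapshot, "<timestamp>");
}
```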
### Pattern 3: Fixture Update Breaks Tests
**Symptom:** Connector tests fail after updating fixtures
**Root Cause:** Upstream schema drift (NVD, OSV, etc.)
**Fix:**
1. Review schema changes in upstream source
2. Update connector parser logic if needed
3. Regenerate expected snapshots with `UPDATE_CONNECTOR_FIXTURES=1`
4. Document schema version in fixture filename (e.g., `nvd_v1.1.json`)
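
A sketch of the `UPDATE_CONNECTOR_FIXTURES=1` guard from fix 3: regenerate the expected snapshot only when explicitly opted in, otherwise compare strictly. The helper name is illustrative.

```csharp
using System;
using System.IO;
using Xunit;

public static class FixtureGuard
{
    public static void AssertOrUpdate(string expectedPath, string actualCanonicalJson)
    {
        // Opt-in regeneration: never rewrite fixtures during a normal run.
        if (Environment.GetEnvironmentVariable("UPDATE_CONNECTOR_FIXTURES") == "1")
        {
            File.WriteAllText(expectedPath, actualCanonicalJson);
            return;
        }

        Assert.Equal(File.ReadAllText(expectedPath), actualCanonicalJson);
    }
}
```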
### Pattern 4: WebService Contract Drift
**Symptom:** Contract tests fail with "OpenAPI schema mismatch"
**Root Cause:** Backend API schema changed (breaking change)
**Fix:**
1. Review API changes in backend PR
2. **If breaking:** Version API (e.g., `/api/v2/...`)
3. **If non-breaking:** Update contract snapshot
4. Coordinate with frontend/consumer teams
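
A sketch of the contract-snapshot pattern on top of the stock `WebApplicationFactory<TEntryPoint>` from Microsoft.AspNetCore.Mvc.Testing, which the planned `WebServiceFixture<TProgram>` wraps in spirit; the OpenAPI endpoint and snapshot paths are assumptions.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class OpenApiContractTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public OpenApiContractTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient();

    [Fact]
    public async Task OpenApi_document_matches_committed_snapshot()
    {
        // Endpoint and snapshot location are illustrative.
        var actual = await _client.GetStringAsync("/openapi/v1.json");
        var expected = await File.ReadAllTextAsync("Contracts/openapi.v1.json");

        // Any drift fails; update the snapshot only as part of an
        // intentional, explicitly versioned API change.
        Assert.Equal(expected, actual);
    }
}
```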
### Pattern 5: Circular Dependency (Attestor ↔ Signer)
**Symptom:** Integration tests blocked waiting for both Attestor and Signer
**Root Cause:** Attestor needs Signer for signing; Signer integration tests need Attestor
**Fix:**
1. **Signer Sprint (5100.0009.0006):** Use mock signing for initial tests; defer Attestor integration
2. **Attestor Sprint (5100.0009.0007):** Coordinate with Signer guild; run integration tests in Week 2
3. **Integration Sprint (post-module):** Full Attestor ↔ Signer integration validation
### Pattern 6: Flaky Tests (Timing Issues)
**Symptom:** Tests pass locally but fail intermittently in CI
**Root Cause:** Race conditions, sleeps, non-deterministic timing
**Fix:**
1. Use `DeterministicTime` instead of `Thread.Sleep` or `Task.Delay`
2. Use explicit waits (e.g., `await condition.UntilAsync(...)`) instead of fixed delays (one possible shape is sketched below)
3. Avoid hard-coded timeouts; use configurable timeouts
4. Run tests 10× locally to verify determinism
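
One possible shape for the explicit-wait helper hinted at in fix 2 (`condition.UntilAsync(...)` is shorthand in the text, not an existing API, so this sketches an equivalent): poll against a bounded deadline instead of sleeping for a fixed interval.

```csharp
using System;
using System.Threading.Tasks;

public static class Poll
{
    public static async Task UntilAsync(
        Func<bool> condition, TimeSpan timeout, TimeSpan? interval = null)
    {
        var step = interval ?? TimeSpan.FromMilliseconds(50);
        var deadline = DateTime.UtcNow + timeout;

        while (!condition())
        {
            if (DateTime.UtcNow >= deadline)
                throw new TimeoutException("Condition not met before deadline.");
            await Task.Delay(step);
        }
    }
}
```

Usage: `await Poll.UntilAsync(() => store.Count > 0, TimeSpan.FromSeconds(5));`.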
---
## Troubleshooting Guide
### Issue: "My sprint depends on Epic X, but Epic X is delayed"
**Solution:**
1. Check if partial Epic X deliverables available (e.g., TestKit Wave 1-2 complete → can start L0 tests)
2. If not, use mock/stub implementation
3. Coordinate with Epic X owner for preview build
4. If critically blocked: escalate to project manager for re-sequencing
### Issue: "Tests passing locally but failing in CI"
**Checklist:**
- [ ] Docker running in CI? (for Testcontainers)
- [ ] Environment variables set? (e.g., `STELLAOPS_TEST_POSTGRES_CONNECTION`)
- [ ] Correct .NET SDK version? (net10.0)
- [ ] Test isolation? (each test resets state)
- [ ] Deterministic? (run tests 10× locally)
### Issue: "Code coverage below target (80%)"
**Solution:**
1. Identify uncovered lines: `dotnet test --collect:"XPlat Code Coverage"`
2. Add unit tests for uncovered public methods
3. Add property tests for invariants
4. If coverage still low: review with guild lead (some boilerplate excluded from coverage)
### Issue: "Architecture tests failing (lattice boundary violation)"
**Solution:**
1. Review failing types: which assembly is referencing Scanner lattice?
2. **If legitimate:** Refactor to remove dependency (move logic to Scanner.WebService)
3. **If test project:** Add to allowlist in architecture test
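
A sketch of the lattice-boundary rule using NetArchTest.Rules (the library the master plan names for Epic F); the anchor type and the lattice namespace string are assumptions.

```csharp
using NetArchTest.Rules;
using Xunit;

public class LatticeBoundaryTests
{
    [Fact]
    public void Concelier_does_not_reference_scanner_lattice()
    {
        var result = Types
            .InAssembly(typeof(ConcelierAnchorType).Assembly) // anchor type illustrative
            .ShouldNot()
            .HaveDependencyOn("StellaOps.Scanner.Lattice")    // namespace assumed
            .GetResult();

        Assert.True(result.IsSuccessful, "Lattice boundary violated");
    }
}
```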
### Issue: "Snapshot test failing after refactor"
**Solution:**
1. Review snapshot diff: is it intentional?
2. **If intentional:** Update snapshot (re-run test with snapshot update flag)
3. **If unintentional:** Revert refactor; investigate why output changed
---
## Sprint Templates
### Template: Task Status Update
```markdown
## Delivery Tracker
| # | Task ID | Status | Key dependency / next step | Owners | Task Definition |
| --- | --- | --- | --- | --- | --- |
| 1 | MODULE-5100-001 | DONE | None | John Doe | Add unit tests for... |
| 2 | MODULE-5100-002 | DOING | Task 1 | Jane Smith | Add property tests for... |
| 3 | MODULE-5100-003 | TODO | Task 2 | - | Add snapshot tests for... |
```
### Template: Execution Log Entry
```markdown
## Execution Log
| Date (UTC) | Update | Owner |
| --- | --- | --- |
| 2026-01-20 | Sprint created. | Project Mgmt |
| 2026-01-27 | Wave 1 complete (Tasks 1-5). | Guild Lead |
| 2026-02-03 | Wave 2 complete (Tasks 6-10). | Guild Lead |
| 2026-02-10 | Sprint signed off by Architect. | Project Mgmt |
```
### Template: Blocker Note
```markdown
## Decisions & Risks
| Risk | Impact | Mitigation | Owner |
| --- | --- | --- | --- |
| [BLOCKER] TestKit delayed by 1 week | Cannot start module tests | Using mock TestKit for initial development; switch to real TestKit Week 5 | Module Guild |
```
---
## Next Steps
1. **Week 1:** All guild leads review this playbook
2. **Week 1:** Project manager schedules kickoff meetings for Foundation Epics (Week 3)
3. **Week 2:** Epic sprint owners prepare kickoff materials (scope, wave breakdown, task assignments)
4. **Week 3:** Foundation Epic sprints begin (5100.0007.0002-0007)
---
**Prepared by:** Project Management
**Date:** 2025-12-23
**Next Review:** 2026-01-06 (Week 1 kickoff)


@@ -0,0 +1,419 @@
# StellaOps Testing Strategy Master Plan (2026 H1)
> **Executive Summary:** Comprehensive 14-week testing initiative to establish model-driven test coverage across 15 modules, implement 6 foundation epics, and achieve 30%+ code coverage increase platform-wide.
---
## Document Control
| Attribute | Value |
|-----------|-------|
| **Program Name** | Testing Strategy Implementation 2026 H1 |
| **Program ID** | SPRINT-5100 |
| **Owner** | Project Management |
| **Status** | PLANNING |
| **Start Date** | 2026-01-06 (Week 1) |
| **Target End Date** | 2026-04-14 (Week 14) |
| **Budget** | TBD (resource model below) |
| **Last Updated** | 2025-12-23 |
---
## Table of Contents
1. [Program Objectives](#program-objectives)
2. [Scope & Deliverables](#scope--deliverables)
3. [Timeline & Phases](#timeline--phases)
4. [Resource Model](#resource-model)
5. [Risk Register](#risk-register)
6. [Success Metrics](#success-metrics)
7. [Governance](#governance)
8. [Communication Plan](#communication-plan)
---
## Program Objectives
### Primary Objectives
1. **Establish Model-Driven Testing:** Implement 9 test models (L0, S1, T1, C1, W1, WK1, AN1, CLI1, PERF) across 15 modules
2. **Increase Code Coverage:** Achieve ≥30% code coverage increase from baseline (current ~40% → target 70%+)
3. **Enforce Quality Gates:** Implement determinism, architecture, and module-specific quality gates
4. **Build Test Infrastructure:** Deliver 6 foundation epics (TestKit, Determinism, Storage, Connectors, WebService, Architecture)
5. **Enable CI/CD Confidence:** Establish PR-gating and merge-gating test lanes
### Secondary Objectives
1. **Reduce Test Flakiness:** Achieve 100% deterministic test pass rate (eliminate timing-based failures)
2. **Improve Developer Experience:** Standardize test patterns, reduce test authoring friction
3. **Establish Parity Monitoring:** Continuous validation against competitor tools (Syft, Grype, Trivy, Anchore)
4. **Document Test Strategy:** Create comprehensive testing guides and playbooks
---
## Scope & Deliverables
### In-Scope
#### Foundation Epics (Batch 5100.0007, 96 tasks)
| Sprint ID | Epic | Deliverables |
|-----------|------|--------------|
| 5100.0007.0001 | Master Testing Strategy | Strategy docs, test runner scripts, trait standardization, Epic sprint creation |
| 5100.0007.0002 | TestKit Foundations | DeterministicTime, DeterministicRandom, CanonicalJsonAssert, SnapshotAssert, PostgresFixture, ValkeyFixture, OtelCapture, HttpFixtureServer |
| 5100.0007.0003 | Determinism Gate | Determinism manifest format, expanded integration tests, CI artifact storage, drift detection |
| 5100.0007.0004 | Storage Harness | PostgresFixture (Testcontainers), ValkeyFixture, automatic migrations, schema isolation |
| 5100.0007.0005 | Connector Fixtures | Fixture discipline, FixtureUpdater tool, pilot adoption in Concelier.Connector.NVD |
| 5100.0007.0006 | WebService Contract | WebServiceFixture<TProgram>, contract test pattern, pilot adoption in Scanner.WebService |
| 5100.0007.0007 | Architecture Tests | NetArchTest.Rules, lattice placement enforcement, PR-gating architecture tests |
#### Module Test Implementations (Batch 5100.0009, 185 tasks)
| Sprint ID | Module | Test Models | Deliverables |
|-----------|--------|-------------|--------------|
| 5100.0009.0001 | Scanner | L0, AN1, S1, T1, W1, WK1, PERF | 25 tasks: property tests, SBOM/reachability/verdict snapshots, determinism, WebService contract, Worker e2e, perf smoke |
| 5100.0009.0002 | Concelier | C1, L0, S1, W1, AN1 | 18 tasks: connector fixtures (NVD/OSV/GHSA/CSAF), merge property tests, WebService contract, architecture enforcement |
| 5100.0009.0003 | Excititor | C1, L0, S1, W1, WK1 | 21 tasks: connector fixtures (CSAF/OpenVEX), format export snapshots, preserve-prune tests, Worker e2e, architecture enforcement |
| 5100.0009.0004 | Policy | L0, S1, W1 | 15 tasks: policy engine property tests, DSL roundtrip tests, verdict snapshots, unknown budget enforcement |
| 5100.0009.0005 | Authority | L0, W1, C1 | 17 tasks: auth logic tests, connector fixtures (OIDC/SAML/LDAP), WebService contract, sign/verify integration |
| 5100.0009.0006 | Signer | L0, W1, C1 | 17 tasks: canonical payload tests, crypto plugin tests (BouncyCastle/CryptoPro/eIDAS/SimRemote), WebService contract |
| 5100.0009.0007 | Attestor | L0, W1 | 14 tasks: DSSE envelope tests, Rekor integration tests, attestation statement snapshots, WebService contract |
| 5100.0009.0008 | Scheduler | L0, S1, W1, WK1 | 14 tasks: scheduling invariant property tests, storage idempotency, WebService contract, Worker e2e |
| 5100.0009.0009 | Notify | L0, C1, S1, W1, WK1 | 18 tasks: connector fixtures (email/Slack/Teams/webhook), WebService contract, Worker e2e |
| 5100.0009.0010 | CLI | CLI1 | 13 tasks: exit code tests, golden output tests, determinism tests |
| 5100.0009.0011 | UI | W1 | 13 tasks: API contract tests, E2E smoke tests, accessibility tests |
#### Infrastructure Test Implementations (Batch 5100.0010, 62 tasks)
| Sprint ID | Module Family | Deliverables |
|-----------|---------------|--------------|
| 5100.0010.0001 | EvidenceLocker + Findings + Replay | Immutability tests, ledger determinism, replay token security, WebService contract |
| 5100.0010.0002 | Graph + TimelineIndexer | Graph construction/traversal tests, indexer e2e, query determinism, WebService contract |
| 5100.0010.0003 | Router + Messaging | Transport compliance suite (in-memory/TCP/TLS/Valkey/RabbitMQ), routing determinism, fuzz tests |
| 5100.0010.0004 | AirGap | Bundle export/import determinism, policy analyzer tests, WebService contract, CLI tool tests |
#### Quality Gates (Batch 5100.0008, 11 tasks)
| Sprint ID | Purpose | Deliverables |
|-----------|---------|--------------|
| 5100.0008.0001 | Competitor Parity Testing | Parity test harness, fixture set (10-15 container images), comparison logic (SBOM/vuln/latency/errors), time-series storage, drift detection (>5% threshold) |
### Out-of-Scope
- **Performance optimization** (beyond PERF smoke tests for Scanner)
- **UI/UX testing** (beyond W1 contract tests and E2E smoke tests)
- **Load testing** (deferred to future sprint)
- **Chaos engineering** (deferred to future sprint)
- **Mobile/responsive testing** (not applicable - server-side platform)
- **Penetration testing** (separate security initiative)
---
## Timeline & Phases
### Master Timeline (14 Weeks, 2026-01-06 to 2026-04-14)
```
PHASE 1: FOUNDATION (Weeks 1-4)
┌─────────────────────────────────────────────────────────────┐
│ Week 1-2: Master Strategy (5100.0007.0001) │
│ - Documentation sync │
│ - Test runner scripts │
│ - Trait standardization │
│ - Epic sprint creation │
│ │
│ Week 3-4: TestKit Foundations (5100.0007.0002) ← CRITICAL │
│ - DeterministicTime, DeterministicRandom │
│ - CanonicalJsonAssert, SnapshotAssert │
│ - PostgresFixture, ValkeyFixture, OtelCapture │
└─────────────────────────────────────────────────────────────┘
PHASE 2: EPIC IMPLEMENTATION (Weeks 5-6)
┌─────────────────────────────────────────────────────────────┐
│ Week 5-6: 5 Epic Sprints (PARALLEL) │
│ - 5100.0007.0003 (Determinism Gate) │
│ - 5100.0007.0004 (Storage Harness) │
│ - 5100.0007.0005 (Connector Fixtures) │
│ - 5100.0007.0006 (WebService Contract) │
│ - 5100.0007.0007 (Architecture Tests) │
└─────────────────────────────────────────────────────────────┘
PHASE 3: MODULE TESTS - TIER 1 (Weeks 7-8)
┌─────────────────────────────────────────────────────────────┐
│ Week 7-8: 6 Module Sprints (PARALLEL) │
│ - Scanner, Concelier, Excititor (core platform) │
│ - Policy, Authority, Signer (security/compliance) │
└─────────────────────────────────────────────────────────────┘
PHASE 4: MODULE TESTS - TIER 2 (Weeks 9-10)
┌─────────────────────────────────────────────────────────────┐
│ Week 9-10: 5 Module Sprints (PARALLEL) │
│ - Attestor, Scheduler, Notify (platform services) │
│ - CLI, UI (client interfaces) │
└─────────────────────────────────────────────────────────────┘
PHASE 5: INFRASTRUCTURE TESTS (Weeks 11-14)
┌─────────────────────────────────────────────────────────────┐
│ Week 11-14: 4 Infrastructure Sprints (PARALLEL) │
│ - EvidenceLocker, Graph, Router/Messaging, AirGap │
└─────────────────────────────────────────────────────────────┘
ONGOING: QUALITY GATES (Weeks 3-14+)
┌─────────────────────────────────────────────────────────────┐
│ Week 3: Competitor Parity harness setup │
│ Week 4+: Nightly/weekly parity tests │
└─────────────────────────────────────────────────────────────┘
```
### Critical Path (14 Weeks)
**Week 1-2:** Master Strategy → **Week 3-4:** TestKit (**BOTTLENECK**) → **Week 5-6:** Epic Implementation → **Week 7-10:** Module Tests → **Week 11-14:** Infrastructure Tests
**Critical Path Risks:**
- TestKit delay → ALL downstream sprints blocked (+2-4 weeks)
- Storage harness delay → 10 sprints blocked (+2-3 weeks)
### Milestones
| Milestone | Week | Deliverables | Sign-Off Criteria |
|-----------|------|--------------|-------------------|
| **M1: Foundation Ready** | Week 4 | TestKit operational | DeterministicTime, SnapshotAssert, PostgresFixture, OtelCapture available; pilot adoption in 2+ modules |
| **M2: Epic Complete** | Week 6 | All 6 foundation epics complete | Determinism gate in CI; Storage harness operational; WebService contract tests in Scanner; Architecture tests PR-gating |
| **M3: Core Modules Tested** | Week 8 | Scanner, Concelier, Excititor, Policy, Authority, Signer complete | Code coverage increase ≥30%; quality gates passing |
| **M4: All Modules Tested** | Week 10 | All 11 module test sprints complete | All module-specific quality gates passing |
| **M5: Program Complete** | Week 14 | All infrastructure tests complete; program retrospective | All sprints signed off; final metrics review |
---
## Resource Model
### Guild Allocation
| Guild | Assigned Sprints | Peak Staffing (Weeks 7-10) | Avg Sprint Ownership |
|-------|------------------|----------------------------|----------------------|
| **Platform Guild** | TestKit, Storage, Architecture, EvidenceLocker, Graph, Router | 10 engineers | 6 sprints |
| **Scanner Guild** | Scanner | 3 engineers | 1 sprint |
| **Concelier Guild** | Concelier | 2 engineers | 1 sprint |
| **Excititor Guild** | Excititor | 2 engineers | 1 sprint |
| **Policy Guild** | Policy, AirGap (analyzers) | 2-4 engineers | 2 sprints |
| **Authority Guild** | Authority | 2 engineers | 1 sprint |
| **Crypto Guild** | Signer, Attestor | 4 engineers | 2 sprints |
| **Scheduler Guild** | Scheduler | 2 engineers | 1 sprint |
| **Notify Guild** | Notify | 2 engineers | 1 sprint |
| **CLI Guild** | CLI | 1 engineer | 1 sprint |
| **UI Guild** | UI | 2 engineers | 1 sprint |
| **AirGap Guild** | AirGap (core) | 2 engineers | 1 sprint |
| **QA Guild** | Competitor Parity | 2 engineers | 1 sprint |
### Staffing Profile
**Peak Staffing (Weeks 7-10):** 22-26 engineers
**Average Staffing (Weeks 1-14):** 12-16 engineers
**Critical Path Sprints (TestKit, Storage):** 3-4 senior engineers each
### Resource Constraints
| Constraint | Impact | Mitigation |
|------------|--------|------------|
| Platform Guild oversubscribed (10 engineers, 6 sprints) | Burnout, delays | Stagger Epic sprints (Storage Week 5, Connectors Week 6); hire contractors for Weeks 5-10 |
| Senior engineers limited (5-6 available) | TestKit/Storage quality risk | Assign 2 senior engineers to TestKit (critical path); 1 senior to Storage; rotate for reviews |
| UI Guild availability (Angular expertise scarce) | UI sprint delayed | Start UI sprint Week 10 (after Tier 1/2 modules); hire Angular contractor if needed |
---
## Risk Register
### High-Impact Risks (Severity: CRITICAL)
| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R1 | TestKit delayed by 2+ weeks | MEDIUM | Blocks ALL 15 module/infra sprints; +4-6 weeks program delay | Staff with 2 senior engineers; daily standups; incremental releases (partial TestKit unblocks some modules) | Platform Guild | OPEN |
| R2 | Storage harness (Testcontainers) incompatible with .NET 10 | LOW | Blocks 10 sprints; +3-4 weeks delay | Validate Testcontainers compatibility Week 1; fallback to manual Postgres setup | Platform Guild | OPEN |
| R3 | Determinism tests fail due to non-deterministic crypto signatures | MEDIUM | Scanner, Signer, Attestor blocked; compliance issues | Focus determinism tests on payload hash (not signature bytes); document non-deterministic algorithms | Crypto Guild | OPEN |
| R4 | Concurrent module tests overwhelm CI infrastructure | HIGH | Test suite timeout, flaky tests, developer friction | Stagger module test starts (Tier 1 Week 7-8, Tier 2 Week 9-10); use dedicated CI runners; implement CI parallelization | Platform Guild | OPEN |
### Medium-Impact Risks
| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R5 | Attestor-Signer circular dependency blocks integration tests | MEDIUM | Integration tests delayed 1-2 weeks | Signer uses mock attestation initially; coordinate integration in Week 9 | Crypto Guild | OPEN |
| R6 | Upstream schema drift (NVD, OSV) breaks connector fixtures | MEDIUM | Connector tests fail; manual fixture regeneration required | FixtureUpdater tool automates regeneration; weekly live smoke tests detect drift early | Concelier Guild | OPEN |
| R7 | WebService contract tests too brittle (fail on every API change) | MEDIUM | Developer friction, contract tests disabled | Version APIs explicitly; allow non-breaking changes; review contract test strategy Week 6 | Platform Guild | OPEN |
### Low-Impact Risks
| ID | Risk | Probability | Impact | Mitigation | Owner | Status |
|----|------|-------------|--------|------------|-------|--------|
| R8 | Property test generation too slow (FsCheck iterations high) | LOW | Test suite timeout | Limit property test iterations (default 100 → 50); profile and optimize generators | Scanner Guild | OPEN |
| R9 | Architecture tests false positive (allowlist too restrictive) | LOW | Valid code blocked | Review architecture rules Week 5; explicit allowlist for test projects, benchmarks | Platform Guild | OPEN |
| R10 | Competitor parity tests require paid Trivy/Anchore licenses | LOW | Parity testing incomplete | Use Trivy free tier; defer Anchore to future sprint; focus on Syft/Grype (OSS) | QA Guild | OPEN |
### Risk Burn-Down Plan
**Week 1:** Validate Testcontainers .NET 10 compatibility (R2)
**Week 2:** TestKit API design review (R1)
**Week 4:** Determinism test strategy review (R3)
**Week 6:** CI infrastructure capacity review (R4)
**Week 8:** Signer-Attestor integration coordination (R5)
---
## Success Metrics
### Quantitative Metrics
| Metric | Baseline | Target | Measurement Method | Tracked By |
|--------|----------|--------|-------------------|------------|
| **Code Coverage** | ~40% | ≥70% | `dotnet test --collect:"XPlat Code Coverage"` | Weekly (Fridays) |
| **Test Count** | ~200 tests | ≥500 tests | Test suite execution count | Weekly |
| **Determinism Pass Rate** | N/A (not tracked) | 100% (no flaky tests) | Determinism gate CI job | Daily (CI) |
| **Contract Test Coverage** | 0 WebServices | 13 WebServices (100%) | Contract lane CI job | Weekly |
| **Architecture Violations** | Unknown | 0 violations | Architecture test failures | Daily (CI, PR gate) |
| **Sprint On-Time Completion** | N/A | ≥80% | Tasks complete by wave deadline | Weekly |
### Qualitative Metrics
| Metric | Success Criteria | Measurement Method | Tracked By |
|--------|------------------|-------------------|------------|
| **Developer Experience** | ≥80% of developers rate test authoring as "easy" or "very easy" | Post-sprint developer survey (Week 14) | Project Manager |
| **Test Maintainability** | ≥75% of test failures are due to actual bugs (not test brittleness) | Monthly test failure classification | QA Guild |
| **Integration Confidence** | ≥90% of PRs pass CI on first attempt (no test fixes required) | CI metrics (PR pass rate) | Platform Guild |
### Program Success Criteria
**Program Successful If:**
- All 23 sprints signed off (5100.0007.* + 5100.0008.0001 + 5100.0009.* + 5100.0010.*)
- Code coverage ≥70% platform-wide
- Determinism tests passing 100% in CI (no flaky tests)
- Contract tests enforced for all 13 WebServices
- Architecture tests PR-gating (lattice boundary violations blocked)
**Program Failed If:**
- <18 sprints signed off (<80% completion)
- Code coverage increase <20% (baseline ~40% → <60%)
- Critical quality gates missing (Determinism, Architecture, Contract)
- TestKit not operational (blocking all module tests)
---
## Governance
### Steering Committee
| Role | Name | Responsibility |
|------|------|----------------|
| **Program Sponsor** | CTO | Final escalation; budget approval |
| **Program Manager** | Project Management | Overall program coordination; risk management |
| **Technical Lead** | Platform Guild Lead | Architecture decisions; technical escalation |
| **QA Lead** | QA Guild Lead | Quality gate oversight; test strategy validation |
### Decision-Making Authority
| Decision Type | Authority | Escalation Path |
|---------------|-----------|----------------|
| **Sprint scope changes** | Sprint owner + Guild lead | Program Manager → Steering Committee |
| **Architecture changes** | Platform Guild Lead | Steering Committee |
| **Resource allocation** | Program Manager | CTO (if >10% budget impact) |
| **Schedule changes (>1 week)** | Program Manager | Steering Committee |
| **Risk acceptance** | Program Manager | Steering Committee (for HIGH/CRITICAL risks) |
### Status Reporting
**Weekly Status Report (Fridays):**
- Sprint completion status (% tasks complete)
- Blockers and risks (RED/YELLOW/GREEN)
- Resource allocation (current vs. planned)
- Next week preview
**Monthly Executive Summary:**
- Program health (on-track / at-risk / off-track)
- Milestone completion (M1-M5)
- Budget vs. actuals
- Key risks and mitigations
### Change Control
**Change Request Process:**
1. **Requester submits change request** (scope, schedule, or resource change)
2. **Program Manager reviews** (impact analysis: cost, schedule, quality)
3. **Steering Committee approves/rejects** (for changes >1 week or >10% budget)
4. **Program Manager updates plan** (timeline, resource model, risk register)
---
## Communication Plan
### Stakeholders
| Stakeholder Group | Interest | Communication Frequency | Method |
|-------------------|----------|------------------------|--------|
| **Engineering Teams (Guilds)** | Sprint execution, dependencies | Daily/Weekly | Slack #testing-strategy, guild standups |
| **Guild Leads** | Sprint status, blockers | Weekly | Friday status sync (30 min) |
| **Product Management** | Quality gates, feature readiness | Bi-weekly | Sprint demos, monthly exec summary |
| **CTO / Executives** | Program health, budget | Monthly | Executive summary (email) |
### Meetings
#### Weekly Sync (Every Friday, 30 min)
**Attendees:** All active sprint owners + program manager
**Agenda:**
1. Sprint status updates (green/yellow/red) (15 min)
2. Blocker escalation (10 min)
3. Next week preview (5 min)
#### Monthly Steering Committee (First Monday, 60 min)
**Attendees:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)
**Agenda:**
1. Program health review (on-track / at-risk / off-track) (20 min)
2. Milestone completion (M1-M5) (15 min)
3. Budget vs. actuals (10 min)
4. Risk review (top 3 risks) (10 min)
5. Decisions required (5 min)
#### Retrospective (Week 14, 90 min)
**Attendees:** All guild leads + program manager + steering committee
**Agenda:**
1. Program retrospective (what went well, what didn't, lessons learned) (60 min)
2. Metrics review (code coverage, test count, determinism, etc.) (20 min)
3. Future improvements (next testing initiatives) (10 min)
---
## Appendices
### Appendix A: Sprint Inventory
**Total Sprints:** 22
- Foundation Epics: 7 (5100.0007.0001-0007)
- Quality Gates: 1 (5100.0008.0001)
- Module Tests: 11 (5100.0009.0001-0011)
- Infrastructure Tests: 4 (5100.0010.0001-0004)
**Total Tasks:** ~370 tasks
**Total Estimated Effort:** ~450 engineer-days (assuming avg 1.2 days/task)
### Appendix B: Reference Documents
1. **Advisory:** `docs/product-advisories/22-Dec-2026 - Better testing strategy.md`
2. **Test Catalog:** `docs/testing/TEST_CATALOG.yml`
3. **Test Models:** `docs/testing/testing-strategy-models.md`
4. **Dependency Graph:** `docs/testing/SPRINT_DEPENDENCY_GRAPH.md`
5. **Coverage Matrix:** `docs/testing/TEST_COVERAGE_MATRIX.md`
6. **Execution Playbook:** `docs/testing/SPRINT_EXECUTION_PLAYBOOK.md`
### Appendix C: Budget Estimate (Preliminary)
**Assumptions:**
- Average engineer cost: $150/hour (fully loaded)
- Average sprint duration: 80 hours (2 weeks × 40 hours)
- Peak staffing: 22 engineers (Weeks 7-10)
**Budget Estimate:**
- Foundation Phase (Weeks 1-6): 12 engineers × 240 hours × $150 = $432,000
- Module Tests Phase (Weeks 7-10): 22 engineers × 160 hours × $150 = $528,000
- Infrastructure Phase (Weeks 11-14): 8 engineers × 160 hours × $150 = $192,000
- **Total Estimated Cost:** $1,152,000
**Note:** Final budget requires approval from CTO/Finance. Contractor costs may reduce total if used strategically for peak staffing (Weeks 7-10).
---
**Prepared by:** Project Management
**Approval Required From:** Steering Committee (CTO, Program Manager, Platform Guild Lead, QA Lead)
**Date:** 2025-12-23
**Next Review:** 2026-01-06 (Week 1 kickoff)


@@ -0,0 +1,75 @@
version: 1
source_advisory: "docs/product-advisories/22-Dec-2026 - Better testing strategy.md"
models:
  L0:
    description: "Library/Core"
    required: [unit, property, snapshot, determinism]
  S1:
    description: "Storage/Postgres"
    required: [integration_postgres, migrations, idempotency, concurrency, query_ordering]
  T1:
    description: "Transport/Queue"
    required: [protocol_roundtrip, fuzz_invalid, delivery_semantics, backpressure]
  C1:
    description: "Connector/External"
    required: [fixtures, snapshot, resilience, security]
    optional: [live_smoke]
  W1:
    description: "WebService/API"
    required: [contract, authz, otel, negative]
  WK1:
    description: "Worker/Indexer"
    required: [end_to_end, retries, idempotency, otel]
  AN1:
    description: "Analyzer/SourceGen"
    required: [diagnostics, codefixes, golden_generated]
  CLI1:
    description: "Tool/CLI"
    required: [exit_codes, golden_output, determinism]
  PERF:
    description: "Benchmarks"
    required: [benchmark, perf_smoke, regression_thresholds]
lanes:
  Unit: [unit, property, snapshot, determinism]
  Contract: [contract, schema]
  Integration: [integration_postgres, integration_services, end_to_end]
  Security: [security, authz, negative]
  Performance: [benchmark, perf_smoke]
  Live: [live_smoke]
modules:
  Scanner:
    models: [L0, AN1, S1, T1, W1, WK1, PERF]
    gates: [determinism, reachability_evidence, proof_spine]
  Concelier:
    models: [C1, L0, S1, W1, AN1]
    gates: [fixture_coverage, normalization_determinism, no_lattice_dependency]
  Excititor:
    models: [C1, L0, S1, W1, WK1]
    gates: [preserve_prune_source, format_snapshots, no_lattice_dependency]
  Policy:
    models: [L0, S1, W1]
    gates: [unknown_budget, verdict_snapshot]
  Authority:
    models: [L0, W1, C1]
    gates: [scope_enforcement, sign_verify]
  Signer:
    models: [L0, W1, C1]
    gates: [canonical_payloads, sign_verify]
  Attestor:
    models: [L0, W1]
    gates: [rekor_receipts, dsse_verify]
  Scheduler:
    models: [L0, S1, W1, WK1]
    gates: [idempotent_jobs, retry_backoff]
  Notify:
    models: [L0, C1, S1, W1, WK1]
    gates: [connector_snapshots, retry_semantics]
  CLI:
    models: [CLI1]
    gates: [exit_codes, stdout_snapshots]
  UI:
    models: [W1]
    gates: [contract_snapshots, e2e_smoke]


@@ -0,0 +1,262 @@
# Testing Strategy Coverage Matrix
> **Purpose:** Visual map of test model requirements per module, quality gates, and sprint-to-model relationships.
---
## Module-to-Model Coverage Map
### Legend
- ✅ **Required** (from TEST_CATALOG.yml)
- 🟡 **Optional** (recommended but not mandatory)
- ⬜ **Not Applicable**
### Model Definitions (Quick Reference)
| Model | Description | Key Tests |
|-------|-------------|-----------|
| **L0** | Library/Core | Unit, property, snapshot, determinism |
| **S1** | Storage/Postgres | Integration, migrations, idempotency, query ordering |
| **T1** | Transport/Queue | Protocol roundtrip, fuzz invalid, delivery semantics, backpressure |
| **C1** | Connector/External | Fixtures, snapshot, resilience, security |
| **W1** | WebService/API | Contract, authz, OTel, negative |
| **WK1** | Worker/Indexer | End-to-end, retries, idempotency, OTel |
| **AN1** | Analyzer/SourceGen | Diagnostics, codefixes, golden generated |
| **CLI1** | Tool/CLI | Exit codes, golden output, determinism |
| **PERF** | Benchmarks | Benchmark, perf smoke, regression thresholds |
---
## Coverage Matrix
### Core Modules
| Module | L0 | S1 | T1 | C1 | W1 | WK1 | AN1 | CLI1 | PERF | Sprint | Tasks |
|--------|----|----|----|----|----|----|-----|------|------|--------|-------|
| **Scanner** | ✅ | ✅ | ✅ | ⬜ | ✅ | ✅ | ✅ | ⬜ | ✅ | 5100.0009.0001 | 25 |
| **Concelier** | ✅ | ✅ | ⬜ | ✅ | ✅ | ⬜ | ✅ | ⬜ | ⬜ | 5100.0009.0002 | 18 |
| **Excititor** | ✅ | ✅ | ⬜ | ✅ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | 5100.0009.0003 | 21 |
| **Policy** | ✅ | ✅ | ⬜ | ⬜ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0009.0004 | 15 |
### Security & Compliance Modules
| Module | L0 | S1 | T1 | C1 | W1 | WK1 | AN1 | CLI1 | PERF | Sprint | Tasks |
|--------|----|----|----|----|----|----|-----|------|------|--------|-------|
| **Authority** | ✅ | ⬜ | ⬜ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0009.0005 | 17 |
| **Signer** | ✅ | ⬜ | ⬜ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0009.0006 | 17 |
| **Attestor** | ✅ | ⬜ | ⬜ | ⬜ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0009.0007 | 14 |
### Platform Services
| Module | L0 | S1 | T1 | C1 | W1 | WK1 | AN1 | CLI1 | PERF | Sprint | Tasks |
|--------|----|----|----|----|----|----|-----|------|------|--------|-------|
| **Scheduler** | ✅ | ✅ | ⬜ | ⬜ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | 5100.0009.0008 | 14 |
| **Notify** | ✅ | ✅ | ⬜ | ✅ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | 5100.0009.0009 | 18 |
### Client Interfaces
| Module | L0 | S1 | T1 | C1 | W1 | WK1 | AN1 | CLI1 | PERF | Sprint | Tasks |
|--------|----|----|----|----|----|----|-----|------|------|--------|-------|
| **CLI** | ⬜ | ⬜ | ⬜ | ⬜ | ⬜ | ⬜ | ⬜ | ✅ | ⬜ | 5100.0009.0010 | 13 |
| **UI** | ⬜ | ⬜ | ⬜ | ⬜ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0009.0011 | 13 |
### Infrastructure & Platform
| Module | L0 | S1 | T1 | C1 | W1 | WK1 | AN1 | CLI1 | PERF | Sprint | Tasks |
|--------|----|----|----|----|----|----|-----|------|------|--------|-------|
| **EvidenceLocker** | ✅ | ✅ | ⬜ | ⬜ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0010.0001 | 16 |
| **Graph/Timeline** | ✅ | ✅ | ⬜ | ⬜ | ✅ | ✅ | ⬜ | ⬜ | ⬜ | 5100.0010.0002 | 15 |
| **Router/Messaging** | ✅ | ✅ | ✅ | ⬜ | ✅ | ⬜ | ⬜ | ⬜ | ⬜ | 5100.0010.0003 | 14 |
| **AirGap** | ✅ | ✅ | ⬜ | ⬜ | ✅ | ⬜ | ✅ | ✅ | ⬜ | 5100.0010.0004 | 17 |
---
## Model Distribution Analysis
### Models by Usage Frequency
| Model | Modules Using | Percentage | Complexity |
|-------|---------------|------------|------------|
| **L0** (Library/Core) | 13/15 modules | 87% | HIGH (property tests, snapshots) |
| **W1** (WebService) | 13/15 modules | 87% | MEDIUM (contract tests, auth) |
| **S1** (Storage) | 10/15 modules | 67% | HIGH (migrations, idempotency) |
| **C1** (Connectors) | 5/15 modules | 33% | MEDIUM (fixtures, resilience) |
| **WK1** (Workers) | 5/15 modules | 33% | MEDIUM (end-to-end, retries) |
| **AN1** (Analyzers) | 3/15 modules | 20% | HIGH (Roslyn, diagnostics) |
| **T1** (Transport) | 2/15 modules | 13% | HIGH (protocol compliance) |
| **CLI1** (CLI Tools) | 2/15 modules | 13% | LOW (exit codes, snapshots) |
| **PERF** (Performance) | 1/15 modules | 7% | MEDIUM (benchmarks, regression) |
### Complexity Heatmap
**High Complexity (≥17 tasks per sprint):**
- Scanner (25 tasks: L0+AN1+S1+T1+W1+WK1+PERF)
- Excititor (21 tasks: C1+L0+S1+W1+WK1)
- Concelier (18 tasks: C1+L0+S1+W1+AN1)
- Notify (18 tasks: L0+C1+S1+W1+WK1)
- Authority (17 tasks: L0+W1+C1)
- Signer (17 tasks: L0+W1+C1)
- AirGap (17 tasks: L0+AN1+S1+W1+CLI1)
**Medium Complexity (13-16 tasks):**
- Policy (15 tasks: L0+S1+W1)
- EvidenceLocker (16 tasks: L0+S1+W1)
- Graph/Timeline (15 tasks: L0+S1+W1+WK1)
- Scheduler (14 tasks: L0+S1+W1+WK1)
- Attestor (14 tasks: L0+W1)
- Router/Messaging (14 tasks: L0+T1+W1+S1)
- CLI (13 tasks: CLI1)
- UI (13 tasks: W1)
---
## Quality Gate Coverage
### Module-Specific Quality Gates (from TEST_CATALOG.yml)
| Module | Quality Gates | Enforced By |
|--------|---------------|-------------|
| **Scanner** | determinism, reachability_evidence, proof_spine | Sprint 5100.0009.0001 Tasks 7-10, 23-25 |
| **Concelier** | fixture_coverage, normalization_determinism, no_lattice_dependency | Sprint 5100.0009.0002 Tasks 1-7, 8-10, 18 |
| **Excititor** | preserve_prune_source, format_snapshots, no_lattice_dependency | Sprint 5100.0009.0003 Tasks 6-11, 21 |
| **Policy** | unknown_budget, verdict_snapshot | Sprint 5100.0009.0004 Tasks 2, 4, 14-15 |
| **Authority** | scope_enforcement, sign_verify | Sprint 5100.0009.0005 Tasks 3-5, 16-17 |
| **Signer** | canonical_payloads, sign_verify | Sprint 5100.0009.0006 Tasks 1-3, 15-17 |
| **Attestor** | rekor_receipts, dsse_verify | Sprint 5100.0009.0007 Tasks 6-8, 2 |
| **Scheduler** | idempotent_jobs, retry_backoff | Sprint 5100.0009.0008 Tasks 4, 3, 12 |
| **Notify** | connector_snapshots, retry_semantics | Sprint 5100.0009.0009 Tasks 1-6, 16 |
| **CLI** | exit_codes, stdout_snapshots | Sprint 5100.0009.0010 Tasks 1-4, 5-8 |
| **UI** | contract_snapshots, e2e_smoke | Sprint 5100.0009.0011 Tasks 1-2, 7-10 |
### Cross-Cutting Quality Gates
| Gate | Applies To | Enforced By |
|------|-----------|-------------|
| **Determinism Contract** | Scanner, Excititor, Signer, CLI, AirGap, Concelier | Sprint 5100.0007.0003 (Determinism Gate) |
| **Architecture Boundaries** | Concelier, Excititor (must NOT reference Scanner lattice) | Sprint 5100.0007.0007 (Architecture Tests) |
| **Contract Stability** | All WebServices (13 modules) | Sprint 5100.0007.0006 (WebService Contract) |
| **Storage Idempotency** | All S1 modules (10 modules) | Sprint 5100.0007.0004 (Storage Harness) |
| **Connector Resilience** | All C1 modules (5 modules) | Sprint 5100.0007.0005 (Connector Fixtures) |
---
## CI Lane Coverage
### Test Distribution Across CI Lanes
| CI Lane | Models | Modules | Sprint Tasks | Est. Runtime |
|---------|--------|---------|--------------|--------------|
| **Unit** | L0, AN1, CLI1 | All 15 modules | ~120 tasks | <5 min |
| **Contract** | W1 | 13 modules | ~50 tasks | <2 min |
| **Integration** | S1, WK1, T1 | 12 modules | ~100 tasks | 10-15 min |
| **Security** | C1 (security tests), W1 (auth tests) | 5 connectors + 13 WebServices | ~60 tasks | 5-10 min |
| **Performance** | PERF | Scanner only | ~3 tasks | 3-5 min |
| **Live** | C1 (live smoke tests) | Concelier, Excititor, Notify, Authority, Signer | ~5 tasks (opt-in) | 5-10 min (nightly) |
### CI Lane Dependencies
```
PR Gate (Must Pass):
├─ Unit Lane (L0, AN1, CLI1) ← Fast feedback
├─ Contract Lane (W1) ← API stability
├─ Architecture Lane (Sprint 5100.0007.0007) ← Boundary enforcement
└─ Integration Lane (S1, WK1, T1) ← Testcontainers
Merge Gate (Must Pass):
├─ All PR Gate lanes
├─ Security Lane (C1 security, W1 auth)
└─ Determinism Lane (Sprint 5100.0007.0003)
Nightly (Optional):
├─ Performance Lane (PERF)
└─ Live Lane (C1 live smoke)
Weekly (Optional):
└─ Competitor Parity (Sprint 5100.0008.0001)
```
---
## Epic-to-Model Coverage
### Epic Sprints Support Multiple Models
| Epic Sprint | Models Enabled | Consuming Modules | Tasks |
|-------------|----------------|-------------------|-------|
| **5100.0007.0002 (TestKit)** | ALL (L0, S1, T1, C1, W1, WK1, AN1, CLI1, PERF) | ALL 15 modules | 13 |
| **5100.0007.0003 (Determinism)** | L0 (determinism), CLI1 (determinism) | Scanner, Excititor, Signer, CLI, AirGap, Concelier | 12 |
| **5100.0007.0004 (Storage)** | S1 | 10 modules | 12 |
| **5100.0007.0005 (Connectors)** | C1 | Concelier, Excititor, Authority, Signer, Notify | 12 |
| **5100.0007.0006 (WebService)** | W1 | 13 modules | 12 |
| **5100.0007.0007 (Architecture)** | (Cross-cutting) | Concelier, Excititor | 17 |
---
## Test Type Distribution
### By Test Category (Trait)
| Test Category | Model Coverage | Estimated Test Count | CI Lane |
|---------------|----------------|----------------------|---------|
| **Unit** | L0, AN1 | ~150 tests across 13 modules | Unit |
| **Property** | L0 (subset) | ~40 tests (Scanner, Policy, Scheduler, Router) | Unit |
| **Snapshot** | L0, C1, CLI1 | ~80 tests (all modules with canonical outputs) | Unit/Contract |
| **Integration** | S1, WK1, T1 | ~120 tests across 12 modules | Integration |
| **Contract** | W1 | ~50 tests (13 WebServices × avg 4 endpoints) | Contract |
| **Security** | C1 (security), W1 (auth) | ~60 tests | Security |
| **Performance** | PERF | ~3 tests (Scanner only) | Performance |
| **Live** | C1 (live smoke) | ~5 tests (opt-in, nightly) | Live |
---
## Coverage Gaps & Recommendations
### Current Gaps
1. **Performance Testing:** Only Scanner has PERF model
- **Recommendation:** Add PERF to Policy (policy evaluation latency), Concelier (merge performance), Scheduler (scheduling overhead)
2. **Transport Testing:** Only Scanner and Router/Messaging carry the T1 model
- **Recommendation:** Validate that Scanner's T1 coverage actually exercises the Valkey transport used for job queues
3. **Live Connector Tests:** Only 5 modules have C1 live smoke tests (opt-in)
- **Recommendation:** Run weekly, not nightly; treat as early warning system for schema drift
### Recommended Additions (Future Sprints)
| Module | Missing Model | Justification | Priority |
|--------|---------------|---------------|----------|
| Policy | PERF | Policy evaluation latency critical for real-time decisioning | HIGH |
| Concelier | PERF | Merge performance affects ingestion throughput | MEDIUM |
| Scheduler | PERF | Scheduling overhead affects job execution latency | MEDIUM |
| Scanner | T1 (validate) | Job queue transport (Valkey) should have compliance tests | HIGH |
| Authority | S1 | Token storage/revocation should have migration tests | MEDIUM |
---
## Summary Statistics
**Total Test Models:** 9
**Total Modules Covered:** 15
**Total Module Test Sprints:** 15 (11 module + 4 infrastructure)
**Total Epic Sprints:** 6
**Total Quality Gate Sprints:** 1 (Competitor Parity)
**Model Usage:**
- L0: 13 modules (87%)
- W1: 13 modules (87%)
- S1: 10 modules (67%)
- C1: 5 modules (33%)
- WK1: 5 modules (33%)
- AN1: 3 modules (20%)
- T1: 2 modules (13%)
- CLI1: 2 modules (13%)
- PERF: 1 module (7%)
**Estimated Total Tests:** ~500 tests across all modules and models
---
**Prepared by:** Project Management
**Date:** 2025-12-23
**Next Review:** 2026-01-06 (Week 1 kickoff)
**Source:** `docs/testing/TEST_CATALOG.yml`, Sprint files 5100.0009.* and 5100.0010.*


@@ -0,0 +1,245 @@
# CI Lane Filters and Test Traits
This document describes how to categorize tests by lane and test type for CI filtering.
## Test Lanes
StellaOps uses standardized test lanes based on `docs/testing/TEST_CATALOG.yml`:
| Lane | Purpose | Characteristics | PR Gating |
|------|---------|-----------------|-----------|
| **Unit** | Fast, isolated tests | No I/O, deterministic, offline | ✅ Yes |
| **Contract** | API contract stability | Schema/OpenAPI validation | ✅ Yes |
| **Integration** | Service and storage tests | Uses Testcontainers (Postgres/Valkey) | ✅ Yes |
| **Security** | Security regression tests | authz, negative cases, injection tests | ✅ Yes |
| **Performance** | Benchmark and perf smoke | Regression thresholds | ⚠️ Optional |
| **Live** | External connector smoke tests | Requires network, upstream deps | ❌ No (opt-in only) |
## Using Test Traits
Add `StellaOps.TestKit` to your test project:
```xml
<ItemGroup>
  <ProjectReference Include="..\..\__Libraries\StellaOps.TestKit\StellaOps.TestKit.csproj" />
</ItemGroup>
```
### Lane Attributes
Mark tests with lane attributes:
```csharp
using StellaOps.TestKit.Traits;
using Xunit;
public class MyTests
{
    [Fact]
    [UnitTest] // Runs in Unit lane
    public void FastIsolatedTest()
    {
        // No I/O, deterministic
    }

    [Fact]
    [IntegrationTest] // Runs in Integration lane
    public async Task DatabaseTest()
    {
        // Uses Testcontainers PostgreSQL
    }

    [Fact]
    [SecurityTest] // Runs in Security lane
    public void AuthorizationTest()
    {
        // Tests RBAC, scope enforcement
    }

    [Fact]
    [LiveTest] // Runs in Live lane (opt-in only)
    public async Task ExternalApiTest()
    {
        // Calls external service
    }
}
```
### Test Type Attributes
Mark tests with specific test types:
```csharp
[Fact]
[UnitTest]
[DeterminismTest] // Verifies deterministic output
public void SbomGenerationIsDeterministic()
{
    var sbom1 = GenerateSbom();
    var sbom2 = GenerateSbom();
    Assert.Equal(sbom1, sbom2); // Must be identical
}

[Fact]
[IntegrationTest]
[SnapshotTest] // Compares against golden file
public void ApiResponseMatchesSnapshot()
{
    var response = CallApi();
    SnapshotHelper.VerifySnapshot(response, "api_response.json");
}

[Fact]
[SecurityTest]
[AuthzTest] // Tests authorization
public void UnauthorizedRequestIsRejected()
{
    // Test RBAC enforcement
}
```
## Running Tests by Lane
### Command Line
```bash
# Run Unit tests only
dotnet test --filter "Lane=Unit"
# Run Integration tests only
dotnet test --filter "Lane=Integration"
# Run Security tests only
dotnet test --filter "Lane=Security"
# Using the helper script
./scripts/test-lane.sh Unit
./scripts/test-lane.sh Integration --results-directory ./test-results
```
### CI Workflows
Example job for Unit lane:
```yaml
unit-tests:
  runs-on: ubuntu-22.04
  steps:
    - uses: actions/checkout@v4
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: '10.0.x'
    - name: Run Unit tests
      run: |
        dotnet test \
          --filter "Lane=Unit" \
          --configuration Release \
          --logger "trx;LogFileName=unit-tests.trx" \
          --results-directory ./test-results
```
Example job for Integration lane:
```yaml
integration-tests:
  runs-on: ubuntu-22.04
  services:
    postgres:
      image: postgres:16-alpine
      # ...
  steps:
    - name: Run Integration tests
      run: |
        dotnet test \
          --filter "Lane=Integration" \
          --configuration Release \
          --logger "trx;LogFileName=integration-tests.trx" \
          --results-directory ./test-results
```
## Lane Filtering Best Practices
### Unit Lane
- ✅ Pure functions, logic tests
- ✅ In-memory operations
- ✅ Deterministic time/random (use `StellaOps.TestKit.Time.DeterministicClock`)
- ❌ No file I/O (except snapshots in `__snapshots__/`)
- ❌ No network calls
- ❌ No databases
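For example, a minimal Unit-lane test using the deterministic clock. The `TokenExpiryCalculator` helper is an illustrative placeholder, not a real API:
```csharp
using StellaOps.TestKit.Time;
using StellaOps.TestKit.Traits;
using Xunit;

public class TokenExpiryTests
{
    [Fact]
    [UnitTest]
    public void ExpiryIsComputedFromInjectedClock()
    {
        // No I/O and no system time: the clock is injected and deterministic.
        var clock = new DeterministicClock();
        var expiry = TokenExpiryCalculator.Compute(clock.UtcNow, days: 30); // hypothetical helper
        Assert.Equal(clock.UtcNow.AddDays(30), expiry);
    }
}
```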
### Contract Lane
- ✅ OpenAPI schema validation
- ✅ API response envelope checks
- ✅ Contract stability tests
- ❌ No external dependencies
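A hedged Contract-lane sketch, assuming a `ContractTest` trait analogous to the lane attributes above; `GetOpenApiJsonAsync` is a placeholder for fetching the service's OpenAPI document:
```csharp
[Fact]
[ContractTest] // assumed trait, mirroring [UnitTest]/[IntegrationTest]
public async Task OpenApiDocumentIsStable()
{
    // Any change to the OpenAPI document must be an intentional snapshot update.
    var openApiJson = await GetOpenApiJsonAsync(); // placeholder helper
    SnapshotHelper.VerifySnapshot(openApiJson, "openapi.json");
}
```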
### Integration Lane
- ✅ Testcontainers for Postgres/Valkey
- ✅ End-to-end service flows
- ✅ Multi-component interactions
- ❌ No external/live services
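A sketch of an Integration-lane test driving Postgres through the Testcontainers PostgreSQL module directly; in-repo fixtures may wrap this, and `MigrationRunner` is a placeholder:
```csharp
using StellaOps.TestKit.Traits;
using Testcontainers.PostgreSql;
using Xunit;

public class MigrationTests : IAsyncLifetime
{
    private readonly PostgreSqlContainer _postgres = new PostgreSqlBuilder()
        .WithImage("postgres:16-alpine")
        .Build();

    public Task InitializeAsync() => _postgres.StartAsync();
    public Task DisposeAsync() => _postgres.DisposeAsync().AsTask();

    [Fact]
    [IntegrationTest]
    public async Task MigrationsApplyCleanly()
    {
        // Placeholder: run the module's migrations against a throwaway database.
        await MigrationRunner.ApplyAllAsync(_postgres.GetConnectionString());
    }
}
```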
### Security Lane
- ✅ Authorization/authentication tests
- ✅ Input validation (SQL injection, XSS prevention)
- ✅ RBAC scope enforcement
- ✅ Negative test cases
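A Security-lane sketch of a negative authorization case; the client factory and endpoint are illustrative:
```csharp
[Fact]
[SecurityTest]
[AuthzTest]
public async Task RequestWithoutScopeIsForbidden()
{
    // Negative case: a token with no scopes must never reach the handler.
    var client = CreateClientWithScopes(); // placeholder: issues a token with no scopes
    var response = await client.GetAsync("/api/v1/reports");
    Assert.Equal(HttpStatusCode.Forbidden, response.StatusCode);
}
```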
### Performance Lane
- ✅ Benchmarks with `[Benchmark]` attribute
- ✅ Performance smoke tests
- ✅ Regression threshold checks
- ⚠️ Not PR-gating by default (runs on schedule)
### Live Lane
- ✅ External API smoke tests
- ✅ Upstream connector validation
- ❌ **Never PR-gating**
- ⚠️ Opt-in only (schedule or manual trigger)
## Combining Traits
You can combine multiple traits:
```csharp
[Fact]
[IntegrationTest]
[TestType("idempotency")]
[TestType("postgres")]
public async Task JobExecutionIsIdempotent()
{
    // Test uses Postgres fixture and verifies idempotency
}
```
## Migration Guide
If you have existing tests without lane attributes:
1. **Identify test characteristics**:
- Does it use I/O? → `IntegrationTest`
- Is it fast and isolated? → `UnitTest`
- Does it test auth/security? → `SecurityTest`
- Does it call external APIs? → `LiveTest`
2. **Add appropriate attributes**:
```csharp
[Fact]
[UnitTest] // Add this
public void ExistingTest() { ... }
```
3. **Verify in CI**:
```bash
# Should only run newly tagged tests
dotnet test --filter "Lane=Unit"
```
## Related Documentation
- Test Catalog: `docs/testing/TEST_CATALOG.yml`
- Testing Strategy: `docs/testing/testing-strategy-models.md`
- TestKit README: `src/__Libraries/StellaOps.TestKit/README.md`


@@ -0,0 +1,310 @@
# CI Lane Integration Guide
This guide explains how to integrate the standardized test lane filtering into CI workflows.
## Overview
StellaOps uses a lane-based test categorization system with six standardized lanes:
- **Unit**: Fast, isolated, deterministic tests (PR-gating)
- **Contract**: API contract stability tests (PR-gating)
- **Integration**: Service and storage tests with Testcontainers (PR-gating)
- **Security**: AuthZ, input validation, negative tests (PR-gating)
- **Performance**: Benchmarks and regression thresholds (optional/scheduled)
- **Live**: External API smoke tests (opt-in only, never PR-gating)
## Using Lane Filters in CI
### Using the Test Runner Script
The recommended approach is to use `scripts/test-lane.sh`:
```yaml
- name: Run Unit lane tests
  run: |
    chmod +x scripts/test-lane.sh
    ./scripts/test-lane.sh Unit \
      --logger "trx;LogFileName=unit-tests.trx" \
      --results-directory ./test-results \
      --verbosity normal
```
### Direct dotnet test Filtering
Alternatively, use `dotnet test` with lane filters directly:
```yaml
- name: Run Integration lane tests
  run: |
    dotnet test \
      --filter "Lane=Integration" \
      --configuration Release \
      --logger "trx;LogFileName=integration-tests.trx" \
      --results-directory ./test-results
```
## Lane-Based Workflow Pattern
### Full Workflow Example
See `.gitea/workflows/test-lanes.yml` for a complete reference implementation.
Key features:
- **Separate jobs per lane** for parallel execution
- **PR-gating lanes** run on all PRs (Unit, Contract, Integration, Security)
- **Optional lanes** run on schedule or manual trigger (Performance, Live)
- **Test results summary** aggregates all lane results
### Job Structure
```yaml
unit-tests:
  name: Unit Tests
  runs-on: ubuntu-22.04
  timeout-minutes: 15
  steps:
    - uses: actions/checkout@v4
    - name: Setup .NET
      uses: actions/setup-dotnet@v4
      with:
        dotnet-version: '10.0.100'
    - name: Build
      run: dotnet build src/StellaOps.sln --configuration Release
    - name: Run Unit lane
      run: ./scripts/test-lane.sh Unit --results-directory ./test-results
    - name: Upload results
      uses: actions/upload-artifact@v4
      with:
        name: unit-test-results
        path: ./test-results
```
## Lane Execution Guidelines
### Unit Lane
- **Timeout**: 10-15 minutes
- **Dependencies**: None (no I/O, no network, no databases)
- **PR gating**: ✅ Required
- **Characteristics**: Deterministic, fast, offline
### Contract Lane
- **Timeout**: 5-10 minutes
- **Dependencies**: None (schema validation only)
- **PR gating**: ✅ Required
- **Characteristics**: OpenAPI/schema validation, no external calls
### Integration Lane
- **Timeout**: 20-30 minutes
- **Dependencies**: Testcontainers (Postgres, Valkey)
- **PR gating**: ✅ Required
- **Characteristics**: End-to-end service flows, database tests
### Security Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: Testcontainers (if needed for auth tests)
- **PR gating**: ✅ Required
- **Characteristics**: RBAC, injection prevention, negative tests
### Performance Lane
- **Timeout**: 30-45 minutes
- **Dependencies**: Baseline data, historical metrics
- **PR gating**: ❌ Optional (scheduled/manual)
- **Characteristics**: Benchmarks, regression thresholds
### Live Lane
- **Timeout**: 15-20 minutes
- **Dependencies**: External APIs, upstream services
- **PR gating**: ❌ Never (opt-in only)
- **Characteristics**: Smoke tests, connector validation
## Migration from Per-Project to Lane-Based
### Before (Per-Project)
```yaml
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln
- name: Run Authority tests
  run: dotnet test src/Authority/StellaOps.Authority.sln
- name: Run Scanner tests
  run: dotnet test src/Scanner/StellaOps.Scanner.sln
```
### After (Lane-Based)
```yaml
- name: Run Unit lane
  run: ./scripts/test-lane.sh Unit
- name: Run Integration lane
  run: ./scripts/test-lane.sh Integration
- name: Run Security lane
  run: ./scripts/test-lane.sh Security
```
**Benefits**:
- Run all unit tests across all modules in parallel
- Clear separation of concerns by test type
- Faster feedback (fast tests run first)
- Better resource utilization (no Testcontainers for Unit tests)
## Best Practices
### 1. Parallel Execution
Run PR-gating lanes in parallel for faster feedback:
```yaml
jobs:
  unit-tests:
    # ...
  integration-tests:
    # ...
  security-tests:
    # ...
```
### 2. Conditional Execution
Use workflow inputs for optional lanes:
```yaml
on:
  workflow_dispatch:
    inputs:
      run_performance:
        type: boolean
        default: false

jobs:
  performance-tests:
    if: github.event.inputs.run_performance == 'true'
    # ...
```
### 3. Test Result Aggregation
Create a summary job that depends on all lane jobs:
```yaml
test-summary:
  needs: [unit-tests, contract-tests, integration-tests, security-tests]
  if: always()
  steps:
    - name: Download all results
      uses: actions/download-artifact@v4
    - name: Generate summary
      run: ./scripts/ci/aggregate-test-results.sh
```
### 4. Timeout Configuration
Set appropriate timeouts per lane:
```yaml
unit-tests:
  timeout-minutes: 15 # Fast
integration-tests:
  timeout-minutes: 30 # Testcontainers startup
performance-tests:
  timeout-minutes: 45 # Benchmark execution
```
### 5. Environment Isolation
Use Testcontainers for Integration lane, not GitHub Actions services:
```yaml
integration-tests:
  steps:
    - name: Run Integration tests
      env:
        POSTGRES_TEST_IMAGE: postgres:16-alpine
      run: ./scripts/test-lane.sh Integration
```
Testcontainers provides:
- Per-test isolation
- Automatic cleanup
- Consistent behavior across environments
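A small sketch of how a fixture might honour the `POSTGRES_TEST_IMAGE` override shown above (an assumption about the fixture internals; adjust to the real fixture code):
```csharp
// Resolve the Postgres image from the CI-provided override, falling back to a default.
var image = Environment.GetEnvironmentVariable("POSTGRES_TEST_IMAGE") ?? "postgres:16-alpine";
var container = new PostgreSqlBuilder()
    .WithImage(image)
    .Build();
```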
## Troubleshooting
### Tests Not Found
**Problem**: `dotnet test --filter "Lane=Unit"` finds no tests
**Solution**: Ensure tests have lane attributes:
```csharp
[Fact]
[UnitTest] // This attribute is required
public void MyTest() { }
```
### Wrong Lane Assignment
**Problem**: Integration test running in Unit lane
**Solution**: Check test attributes:
```csharp
// Bad: No database in Unit lane
[Fact]
[UnitTest]
public async Task DatabaseTest() { /* uses Postgres */ }
// Good: Use Integration lane for database tests
[Fact]
[IntegrationTest]
public async Task DatabaseTest() { /* uses Testcontainers */ }
```
### Testcontainers Timeout
**Problem**: Integration tests timeout waiting for containers
**Solution**: Increase job timeout and ensure Docker is available:
```yaml
integration-tests:
  timeout-minutes: 30 # Increased from 15
  steps:
    - name: Verify Docker
      run: docker info
```
### Live Tests in PR
**Problem**: Live lane tests failing in PRs
**Solution**: Never run Live tests in PRs:
```yaml
live-tests:
  if: github.event_name == 'workflow_dispatch' && github.event.inputs.run_live == 'true'
  # Never runs automatically on PR
```
## Integration with Existing Workflows
### Adding Lane-Based Testing to build-test-deploy.yml
Replace per-module test execution with lane-based execution:
```yaml
# Old approach
- name: Run Concelier tests
  run: dotnet test src/Concelier/StellaOps.Concelier.sln

# New approach (recommended)
- name: Run all Unit tests
  run: ./scripts/test-lane.sh Unit
- name: Run all Integration tests
  run: ./scripts/test-lane.sh Integration
```
### Gradual Migration Strategy
1. **Phase 1**: Add lane attributes to existing tests
2. **Phase 2**: Add lane-based jobs alongside existing per-project jobs
3. **Phase 3**: Monitor lane-based jobs for stability
4. **Phase 4**: Remove per-project jobs once lane-based jobs proven stable
## Related Documentation
- Test Lane Filters: `docs/testing/ci-lane-filters.md`
- Testing Strategy: `docs/testing/testing-strategy-models.md`
- Test Catalog: `docs/testing/TEST_CATALOG.yml`
- TestKit README: `src/__Libraries/StellaOps.TestKit/README.md`
- Example Workflow: `.gitea/workflows/test-lanes.yml`


@@ -150,6 +150,8 @@ If baselines become stale:
## Related Documentation
- [Test Suite Overview](../19_TEST_SUITE_OVERVIEW.md)
- [Testing Strategy Models](./testing-strategy-models.md)
- [Test Catalog](./TEST_CATALOG.yml)
- [Reachability Corpus Plan](../reachability/corpus-plan.md)
- [Performance Workbook](../12_PERFORMANCE_WORKBOOK.md)
- [Testing Quality Guardrails](./testing-quality-guardrails-implementation.md)


@@ -0,0 +1,291 @@
# Determinism Gates
Determinism is a core principle of StellaOps - all artifact generation (SBOM, VEX, attestations) must be reproducible. This document describes how to test for determinism.
## Why Determinism Matters
- **Reproducible builds**: Same input → same output, always
- **Cryptographic verification**: Hash-based integrity depends on byte-for-byte reproducibility
- **Audit trails**: Deterministic timestamps and ordering for compliance
- **Offline operation**: No reliance on external randomness or timestamps
## Using Determinism Gates
Add `StellaOps.TestKit` to your test project and use the `DeterminismGate` class:
```csharp
using StellaOps.TestKit.Determinism;
using StellaOps.TestKit.Traits;
using Xunit;
public class SbomGeneratorTests
{
    [Fact]
    [UnitTest]
    [DeterminismTest]
    public void SbomGenerationIsDeterministic()
    {
        // Verify that calling the function 3 times produces identical output
        DeterminismGate.AssertDeterministic(() =>
        {
            return GenerateSbom();
        }, iterations: 3);
    }

    [Fact]
    [UnitTest]
    [DeterminismTest]
    public void SbomBinaryIsDeterministic()
    {
        // Verify binary reproducibility
        DeterminismGate.AssertDeterministic(() =>
        {
            return GenerateSbomBytes();
        }, iterations: 3);
    }
}
```
## JSON Determinism
JSON output must have:
- Stable property ordering (alphabetical or schema-defined)
- Consistent whitespace/formatting
- No random IDs or timestamps (unless explicitly from deterministic clock)
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void VexDocumentJsonIsDeterministic()
{
    // Verifies JSON canonicalization and property ordering
    DeterminismGate.AssertJsonDeterministic(() =>
    {
        var vex = GenerateVexDocument();
        return JsonSerializer.Serialize(vex);
    });
}

[Fact]
[UnitTest]
[DeterminismTest]
public void VerdictObjectIsDeterministic()
{
    // Verifies object serialization is deterministic
    DeterminismGate.AssertJsonDeterministic(() =>
    {
        return GenerateVerdict();
    });
}
```
## Canonical Equality
Compare two objects for canonical equivalence:
```csharp
[Fact]
[UnitTest]
public void VerdictFromDifferentPathsAreCanonicallyEqual()
{
    var verdict1 = GenerateVerdictFromSbom();
    var verdict2 = GenerateVerdictFromCache();
    // Asserts that both produce identical canonical JSON
    DeterminismGate.AssertCanonicallyEqual(verdict1, verdict2);
}
```
## Hash-Based Regression Testing
Compute stable hashes for regression detection:
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void SbomHashMatchesBaseline()
{
    var sbom = GenerateSbom();
    var hash = DeterminismGate.ComputeHash(sbom);
    // This hash should NEVER change unless SBOM format changes intentionally
    const string expectedHash = "abc123...";
    Assert.Equal(expectedHash, hash);
}
```
## Path Ordering
File paths in manifests must be sorted:
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void SbomFilePathsAreSorted()
{
    var sbom = GenerateSbom();
    var filePaths = ExtractFilePaths(sbom);
    // Asserts paths are in deterministic (lexicographic) order
    DeterminismGate.AssertSortedPaths(filePaths);
}
```
## Timestamp Validation
All timestamps must be UTC ISO 8601:
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void AttestationTimestampIsUtcIso8601()
{
    var attestation = GenerateAttestation();
    // Asserts timestamp is UTC with 'Z' suffix
    DeterminismGate.AssertUtcIso8601(attestation.Timestamp);
}
```
## Deterministic Time in Tests
Use `DeterministicClock` for reproducible timestamps:
```csharp
using StellaOps.TestKit.Time;

[Fact]
[UnitTest]
[DeterminismTest]
public void AttestationWithDeterministicTime()
{
    var clock = new DeterministicClock();
    // All operations using this clock will get the same time
    var attestation1 = GenerateAttestation(clock);
    var attestation2 = GenerateAttestation(clock);
    Assert.Equal(attestation1.Timestamp, attestation2.Timestamp);
}
```
## Deterministic Random in Tests
Use `DeterministicRandom` for reproducible randomness:
```csharp
using StellaOps.TestKit.Random;

[Fact]
[UnitTest]
[DeterminismTest]
public void GeneratedIdsAreReproducible()
{
    var rng1 = DeterministicRandomExtensions.WithTestSeed();
    var id1 = GenerateId(rng1);

    var rng2 = DeterministicRandomExtensions.WithTestSeed();
    var id2 = GenerateId(rng2);

    // Same seed → same output
    Assert.Equal(id1, id2);
}
```
## Module-Specific Gates
### Scanner Determinism
- SBOM file path ordering
- Component hash stability
- Dependency graph ordering
### Concelier Determinism
- Advisory normalization (same advisory → same canonical form)
- Vulnerability merge determinism
- No lattice ordering dependencies
### Excititor Determinism
- VEX document format stability
- Preserve/prune decision ordering
- No lattice dependencies
### Policy Determinism
- Verdict reproducibility (same inputs → same verdict)
- Policy evaluation ordering
- Unknown budget calculations
### Attestor Determinism
- DSSE envelope canonical bytes
- Signature ordering (multiple signers)
- Rekor receipt stability
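As a concrete instance of a module gate, a hedged sketch of Concelier's normalization determinism check; `NormalizeAdvisory` and `LoadFixture` are placeholders:
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void AdvisoryNormalizationIsDeterministic()
{
    // Same raw advisory must always normalize to one canonical form.
    DeterminismGate.AssertJsonDeterministic(() =>
        NormalizeAdvisory(LoadFixture("ghsa-sample.json")));
}
```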
## Common Determinism Violations
**Timestamps from system clock**
```csharp
// Bad: Uses system time
var timestamp = DateTimeOffset.UtcNow.ToString("o");
// Good: Uses injected clock
var timestamp = clock.UtcNow.ToString("o");
```
**Random GUIDs**
```csharp
// Bad: Non-deterministic
var id = Guid.NewGuid().ToString();
// Good: Deterministic or content-addressed
var id = ComputeContentHash(data);
```
**Unordered collections**
```csharp
// Bad: Dictionary iteration order is undefined
foreach (var (key, value) in dict) { ... }
// Good: Explicit ordering
foreach (var (key, value) in dict.OrderBy(x => x.Key)) { ... }
```
**Floating-point comparisons**
```csharp
// Bad: Floating-point can differ across platforms
var score = 0.1 + 0.2; // Might not equal 0.3 exactly
// Good: Use fixed-point or integers
var scoreInt = (int)((0.1 + 0.2) * 1000);
```
**Non-UTC timestamps**
```csharp
// Bad: Timezone-dependent
var timestamp = DateTime.Now.ToString();
// Good: Always UTC with 'Z' suffix (via the injected clock)
var timestamp = clock.UtcNow.ToString("o");
```
## Determinism Test Checklist
When writing determinism tests, verify:
- [ ] Multiple invocations produce identical output
- [ ] JSON has stable property ordering
- [ ] File paths are sorted lexicographically
- [ ] Timestamps are UTC ISO 8601 with 'Z' suffix
- [ ] No random GUIDs (use content-addressing)
- [ ] Collections are explicitly ordered
- [ ] No system time/random usage (use DeterministicClock/DeterministicRandom)
## Related Documentation
- TestKit README: `src/__Libraries/StellaOps.TestKit/README.md`
- Testing Strategy: `docs/testing/testing-strategy-models.md`
- Test Catalog: `docs/testing/TEST_CATALOG.yml`


@@ -0,0 +1,52 @@
# Testing Strategy Models and Lanes (2026)
Source advisory: `docs/product-advisories/22-Dec-2026 - Better testing strategy.md`
Supersedes/extends: `docs/product-advisories/archived/2025-12-21-testing-strategy/20-Dec-2025 - Testing strategy.md`
## Purpose
- Define a single testing taxonomy for all StellaOps project types.
- Make determinism, offline readiness, and evidence integrity testable by default.
- Align CI lanes with a shared catalog so coverage is visible and enforceable.
## Strategy in brief
- Use test models (L0, S1, C1, W1, WK1, T1, AN1, CLI1, PERF) to encode required test types.
- Map every module to one or more models in `docs/testing/TEST_CATALOG.yml`.
- Run tests through standardized CI lanes (Unit, Contract, Integration, Security, Performance, Live).
## Test models (requirements)
- L0 (Library/Core): unit + property + snapshot + determinism.
- S1 (Storage/Postgres): migrations + idempotency + concurrency + query ordering.
- T1 (Transport/Queue): protocol roundtrip + fuzz invalid + delivery semantics.
- C1 (Connector/External): fixtures + snapshot + resilience + security; optional Live smoke.
- W1 (WebService/API): contract + authz + OTel trace assertions + negative cases.
- WK1 (Worker/Indexer): end-to-end job flow + retries + idempotency + telemetry.
- AN1 (Analyzer/SourceGen): Roslyn harness + diagnostics + golden generated output.
- CLI1 (Tool/CLI): exit codes + golden output + deterministic formatting.
- PERF (Benchmark): perf smoke subset + regression thresholds (relative).
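For illustration, a minimal sketch of the L0 determinism requirement using the TestKit gate; `CanonicalizeConfig` is a placeholder:
```csharp
[Fact]
[UnitTest]
[DeterminismTest]
public void CanonicalizationIsDeterministic()
{
    // L0 minimum: repeated runs of a pure transformation yield identical output.
    DeterminismGate.AssertDeterministic(() => CanonicalizeConfig("fixtures/sample.yaml"), iterations: 3);
}
```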
## Repository foundations
- TestKit primitives: deterministic time/random, canonical JSON asserts, snapshot helpers, Postgres/Valkey fixtures, OTel capture.
- Determinism gate: canonical bytes and stable hashes for SBOM/VEX/verdict artifacts.
- Hybrid reachability posture: graph DSSE mandatory; edge-bundle DSSE optional/targeted with deterministic ordering.
- Architecture guards: enforce cross-module dependency boundaries (no lattice in Concelier/Excititor).
- Offline defaults: no network access unless explicitly tagged `Live`.
## CI lanes (standard filters)
- Unit: fast, offline; includes property and snapshot sub-traits.
- Contract: schema/OpenAPI stability and response envelopes.
- Integration: Testcontainers-backed service and storage tests.
- Security: authz/negative tests and security regressions.
- Performance: perf smoke and benchmark guards.
- Live: opt-in upstream connector checks (never PR gating by default).
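These lanes map directly onto `dotnet test` trait filters, for example:
```bash
# PR-gating lanes
dotnet test --filter "Lane=Unit"
dotnet test --filter "Lane=Contract"
dotnet test --filter "Lane=Integration"
dotnet test --filter "Lane=Security"
# Opt-in lanes (scheduled or manual triggers only)
dotnet test --filter "Lane=Performance"
dotnet test --filter "Lane=Live"
```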
## Documentation moments (when to update)
- New model or required test type: update `docs/testing/TEST_CATALOG.yml`.
- New lane or gate: update `docs/19_TEST_SUITE_OVERVIEW.md` and `docs/testing/ci-quality-gates.md`.
- Module-specific test policy change: update the module dossier under `docs/modules/<module>/`.
- New fixtures or runnable harnesses: place under `docs/benchmarks/**` or `tests/**` and link here.
## Related artifacts
- Test catalog (source of truth): `docs/testing/TEST_CATALOG.yml`
- Test suite overview: `docs/19_TEST_SUITE_OVERVIEW.md`
- Quality guardrails: `docs/testing/testing-quality-guardrails-implementation.md`
- Code samples from the advisory: `docs/benchmarks/testing/better-testing-strategy-samples.md`