# Testing Quality Guardrails Implementation

## Overview

This document provides the master implementation plan for the Testing Quality Guardrails system derived from the `14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md` product advisory.

**Source Advisory:** `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

**Implementation Status:** Planning Complete, Execution Pending

---

## Executive Summary

The Testing Quality Guardrails implementation addresses gaps between the product advisory and the current StellaOps codebase. After analysis, we identified 6 high-value items to implement across 4 focused sprints.

### What We're Implementing

| Item | Sprint | Value | Effort |
|------|--------|-------|--------|
| Reachability quality gates in CI | 0350 | HIGH | LOW |
| TTFS regression tracking | 0350 | HIGH | LOW |
| Performance SLO enforcement | 0350 | HIGH | LOW |
| SCA Failure Catalogue (FC6-FC10) | 0351 | HIGH | MEDIUM |
| Security testing (OWASP Top 10) | 0352 | HIGH | MEDIUM |
| Mutation testing (Stryker.NET) | 0353 | MEDIUM | MEDIUM |

### What We're NOT Implementing (and Why)

| Item | Reason |
|------|--------|
| `toys/svc-XX/` directory restructure | Equivalent already exists in `tests/reachability/corpus/` |
| `labels.yaml` per-service format | `reachgraph.truth.json` already provides the same semantics |
| Canonical TAG format | Can be adopted incrementally; not blocking |
| Fix validation 100% pass rate | Too rigid; changed to 90% for fixable cases |
| Automated reviewer rejection | Over-engineering; human judgment is needed |

---

## Sprint Roadmap

```
┌─────────────────────────────────────────────────────────────────┐
│                   TESTING QUALITY GUARDRAILS                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Sprint 0350                Sprint 0351                         │
│  CI Quality Gates           SCA Failure Catalogue               │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ Reachability│            │ FC6: Java   │                     │
│  │ TTFS        │            │ FC7: .NET   │                     │
│  │ Performance │            │ FC8: Docker │                     │
│  └─────────────┘            │ FC9: PURL   │                     │
│         │                   │ FC10: CVE   │                     │
│         │                   └─────────────┘                     │
│         │                          │                            │
│         └──────────┬───────────────┘                            │
│                    │                                            │
│                    ▼                                            │
│  Sprint 0352                Sprint 0353                         │
│  Security Testing           Mutation Testing                    │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ OWASP Top 10│            │ Stryker.NET │                     │
│  │ A01-A10     │            │ Scanner     │                     │
│  │ 50+ tests   │            │ Policy      │                     │
│  └─────────────┘            │ Authority   │                     │
│                             └─────────────┘                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### Execution Order

1. **Sprint 0350** and **Sprint 0351** can run in parallel (no dependencies)
2. **Sprint 0352** can run in parallel with 0350/0351
3. **Sprint 0353** should start after 0352 (security tests should be stable first)

### Estimated Duration

| Sprint | Tasks | Estimated Effort |
|--------|-------|------------------|
| 0350 | 10 | 2-3 developer-days |
| 0351 | 10 | 3-4 developer-days |
| 0352 | 10 | 4-5 developer-days |
| 0353 | 10 | 3-4 developer-days |
| **Total** | **40** | **12-16 developer-days** |

---

## Sprint Details

### Sprint 0350: CI Quality Gates Foundation

**File:** `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`

**Objective:** Connect existing test infrastructure to CI enforcement

**Key Deliverables:**
- `scripts/ci/compute-reachability-metrics.sh` - Compute recall/precision from corpus
- `scripts/ci/reachability-thresholds.yaml` - Enforcement thresholds
- `scripts/ci/compute-ttfs-metrics.sh` - TTFS extraction from test runs
- `bench/baselines/ttfs-baseline.json` - TTFS targets
- `scripts/ci/enforce-performance-slos.sh` - Performance SLO checks
- CI workflow modifications for quality gates

**Quality Thresholds:**
```yaml
# Values are quoted so that the comparison expressions parse as plain strings.
thresholds:
  runtime_dependency_recall: ">= 0.95"
  unreachable_false_positives: "<= 0.05"
  reachability_underreport: "<= 0.10"
  ttfs_regression: "<= +10% vs main"
```

---

### Sprint 0351: SCA Failure Catalogue Completion

**File:** `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`

**Objective:** Complete FC6-FC10 test cases for scanner regression testing

**New Failure Cases:**

| ID | Name | Failure Mode |
|----|------|--------------|
| FC6 | Java Shadow JAR | Shaded dependencies not detected |
| FC7 | .NET Transitive Pinning | CPM pins to vulnerable version |
| FC8 | Docker Multi-Stage Leakage | Build-time deps in runtime analysis |
| FC9 | PURL Namespace Collision | npm vs pypi same package name |
| FC10 | CVE Split/Merge | Single vuln with multiple CVE IDs |

**Key Deliverables:**
- 5 new fixture directories under `tests/fixtures/sca/catalogue/`
- DSSE manifests for integrity verification
- xUnit test project for failure catalogue
- Updated documentation

---

### Sprint 0352: Security Testing Framework

**File:** `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`

**Objective:** Systematic OWASP Top 10 coverage for StellaOps

**Coverage Matrix:**

| OWASP | Category | Test Count |
|-------|----------|------------|
| A01 | Broken Access Control | 8+ |
| A02 | Cryptographic Failures | 6+ |
| A03 | Injection | 10+ |
| A05 | Security Misconfiguration | 6+ |
| A07 | Authentication Failures | 8+ |
| A08 | Integrity Failures | 5+ |
| A10 | SSRF | 8+ |
| **Total** | **7 categories** | **50+ tests** |

**Key Deliverables:**
- `tests/security/StellaOps.Security.Tests/` - Security test project
- `MaliciousPayloads.cs` - Common attack patterns
- `SecurityTestBase.cs` - Test infrastructure
- `.gitea/workflows/security-tests.yml` - Dedicated CI workflow
- `docs/testing/security-testing-guide.md` - Documentation

---

### Sprint 0353: Mutation Testing Integration

**File:** `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

**Objective:** Measure test suite effectiveness with Stryker.NET

**Target Modules:**

| Module | Threshold (Break) | Threshold (High) |
|--------|-------------------|------------------|
| Scanner.Core | 60% | 85% |
| Policy.Engine | 60% | 85% |
| Authority.Core | 65% | 90% |

**Key Deliverables:**
- Stryker.NET configuration for each target module
- `scripts/ci/mutation-thresholds.yaml` - Threshold configuration
- `.gitea/workflows/mutation-testing.yml` - Weekly mutation runs
- `bench/baselines/mutation-baselines.json` - Baseline scores
- `docs/testing/mutation-testing-guide.md` - Developer guide

---

## Existing Infrastructure Mapping

The advisory describes structures that already exist under different names:

| Advisory Structure | Existing Equivalent | Notes |
|-------------------|---------------------|-------|
| `toys/svc-XX/` | `tests/reachability/corpus/` | Same purpose, different path |
| `labels.yaml` | `reachgraph.truth.json` | Different schema, same semantics |
| `evidence/trace.json` | Evidence in attestor module | Already implemented |
| `PostgresFixture` | `PostgresIntegrationFixture` | Already implemented |
| `FakeTimeProvider` | Authority tests, ConnectorTestHarness | Already used |
| `inputs.lock` | Exists in acceptance/guardrails | Already implemented |

---

## Quality Gate Summary

After implementation, CI will enforce:

### Reachability Gates
- Runtime dependency recall ≥ 95%
- Unreachable false positives ≤ 5%
- Reachability underreport ≤ 10%

### Performance Gates
- Medium service scan < 2 minutes
- Reachability compute < 30 seconds
- SBOM ingestion < 5 seconds

### TTFS Gates
- p50 < 2 seconds
- p95 < 5 seconds
- Regression ≤ +10% vs main

### Coverage Gates
- Line coverage ≥ 70% (existing)
- Branch coverage ≥ 60% (existing)
- Mutation score ≥ 60% (65% for Authority.Core), the break threshold

### Security Gates
- All security tests pass
- No OWASP Top 10 violations

---

## File Changes Summary

### New Files

```
scripts/ci/
├── compute-reachability-metrics.sh
├── compute-ttfs-metrics.sh
├── enforce-performance-slos.sh
├── enforce-thresholds.sh
├── enforce-mutation-thresholds.sh
├── extract-mutation-score.sh
├── reachability-thresholds.yaml
└── mutation-thresholds.yaml

bench/baselines/
├── ttfs-baseline.json
└── mutation-baselines.json

tests/
├── fixtures/sca/catalogue/
│   ├── fc6-java-shadow-jar/
│   ├── fc7-dotnet-transitive-pinning/
│   ├── fc8-docker-multistage-leakage/
│   ├── fc9-purl-namespace-collision/
│   └── fc10-cve-split-merge/
└── security/
    └── StellaOps.Security.Tests/

.gitea/workflows/
├── security-tests.yml
└── mutation-testing.yml

.config/
└── dotnet-tools.json (stryker)

stryker-config.json (root)
src/Scanner/__Libraries/StellaOps.Scanner.Core/stryker-config.json
src/Policy/StellaOps.Policy.Engine/stryker-config.json
src/Authority/StellaOps.Authority.Core/stryker-config.json

docs/testing/
├── ci-quality-gates.md
├── security-testing-guide.md
└── mutation-testing-guide.md
```

### Modified Files

```
.gitea/workflows/build-test-deploy.yml
tests/fixtures/sca/catalogue/inputs.lock
tests/fixtures/sca/catalogue/README.md
README.md (badges)
```

---

## Rollback Strategy

If quality gates cause CI instability:

1. **Immediate:** Set `failure_mode: warn` in threshold configs
2. **Short-term:** Remove `needs:` dependencies to unblock other jobs
3. **Investigation:** Create an issue naming the specific threshold that failed
4. **Resolution:** Either fix the underlying issue or adjust the threshold
5. **Re-enable:** Set `failure_mode: block` after verification

---

## Success Metrics

| Metric | Current | Target | Measurement |
|--------|---------|--------|-------------|
| FC Catalogue coverage | 5 cases | 10 cases | Count of fixtures |
| Security test coverage | Partial | 50+ tests | OWASP categories |
| Mutation score (Scanner) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Policy) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Authority) | Unknown | ≥ 80% | Stryker weekly |
| Quality gate pass rate | N/A | ≥ 95% | CI runs |

---

## References

### Sprint Files
- `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`
- `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`
- `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`
- `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

### Source Advisory
- `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

### Existing Documentation
- `docs/19_TEST_SUITE_OVERVIEW.md`
- `docs/reachability/ground-truth-schema.md`
- `docs/reachability/corpus-plan.md`
- `tests/reachability/README.md`

---

## Document Version

| Field | Value |
|-------|-------|
| Version | 1.0 |
| Created | 2025-12-14 |
| Author | Platform Team |
| Status | Planning Complete |
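
## Appendix: Illustrative Gate Check

The reachability gates and the `failure_mode` switch from the Rollback Strategy can be pictured as a small check. The sketch below is an illustration only and does not describe the contents of `scripts/ci/enforce-thresholds.sh`: the function name `evaluate_gates`, the layout of the `THRESHOLDS` dictionary, and the decision to treat a missing metric as a violation are all assumptions. Only the three threshold values and the `warn`/`block` modes come from this plan.

```python
import operator

# Threshold values taken from the Quality Gate Summary.
# The dictionary layout itself is hypothetical.
THRESHOLDS = {
    "runtime_dependency_recall": (">=", 0.95),
    "unreachable_false_positives": ("<=", 0.05),
    "reachability_underreport": ("<=", 0.10),
}

_OPS = {">=": operator.ge, "<=": operator.le}


def evaluate_gates(metrics, thresholds=THRESHOLDS, failure_mode="block"):
    """Compare measured metrics against thresholds.

    Returns (exit_code, violations). In "block" mode any violation yields
    exit code 1; in "warn" mode violations are reported but the exit code
    stays 0, so other CI jobs are not blocked.
    """
    if failure_mode not in ("warn", "block"):
        raise ValueError(f"unknown failure_mode: {failure_mode!r}")

    violations = []
    for name, (op, limit) in thresholds.items():
        if name not in metrics:
            # Assumption: a metric that was never computed counts as a failure.
            violations.append(f"{name}: metric missing")
            continue
        value = metrics[name]
        if not _OPS[op](value, limit):
            violations.append(f"{name}: {value} violates {op} {limit}")

    exit_code = 1 if violations and failure_mode == "block" else 0
    return exit_code, violations


if __name__ == "__main__":
    sample = {
        "runtime_dependency_recall": 0.96,
        "unreachable_false_positives": 0.04,
        "reachability_underreport": 0.12,  # exceeds the 0.10 limit
    }
    code, problems = evaluate_gates(sample)
    for line in problems:
        print(line)
    print("exit code:", code)
```

Calling `evaluate_gates(sample, failure_mode="warn")` reports the same violation but returns exit code 0, which corresponds to the first rollback step.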