# Testing Quality Guardrails Implementation

## Overview

This document is the master implementation plan for the Testing Quality Guardrails system. It is derived from the `14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md` product advisory.

**Source Advisory:** `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

**Implementation Status:** Planning Complete, Execution Pending

---

## Executive Summary

The Testing Quality Guardrails implementation closes gaps between the product advisory and the current StellaOps codebase. Analysis identified six high-value items, to be implemented across four focused sprints.

### What We're Implementing

| Item | Sprint | Value | Effort |
|------|--------|-------|--------|
| Reachability quality gates in CI | 0350 | HIGH | LOW |
| TTFS regression tracking | 0350 | HIGH | LOW |
| Performance SLO enforcement | 0350 | HIGH | LOW |
| SCA Failure Catalogue (FC6-FC10) | 0351 | HIGH | MEDIUM |
| Security testing (OWASP Top 10) | 0352 | HIGH | MEDIUM |
| Mutation testing (Stryker.NET) | 0353 | MEDIUM | MEDIUM |

### What We're NOT Implementing (and Why)

| Item | Reason |
|------|--------|
| `toys/svc-XX/` directory restructure | Already have equivalent in `tests/reachability/corpus/` |
| `labels.yaml` per-service format | Already have `reachgraph.truth.json` with same semantics |
| Canonical TAG format | Can adopt incrementally, not blocking |
| Fix validation 100% pass rate | Too rigid; changed to 90% for fixable cases |
| Automated reviewer rejection | Over-engineering; human judgment needed |

---

## Sprint Roadmap

```
┌─────────────────────────────────────────────────────────────────┐
│                    TESTING QUALITY GUARDRAILS                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Sprint 0350                Sprint 0351                          │
│  CI Quality Gates           SCA Failure Catalogue                │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ Reachability│            │ FC6: Java   │                     │
│  │ TTFS        │            │ FC7: .NET   │                     │
│  │ Performance │            │ FC8: Docker │                     │
│  └─────────────┘            │ FC9: PURL   │                     │
│       │                     │ FC10: CVE   │                     │
│       │                     └─────────────┘                     │
│       │                          │                              │
│       └──────────┬───────────────┘                              │
│                  │                                               │
│                  ▼                                               │
│  Sprint 0352                Sprint 0353                          │
│  Security Testing           Mutation Testing                     │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ OWASP Top 10│            │ Stryker.NET │                     │
│  │ A01-A10     │            │ Scanner     │                     │
│  │ 50+ tests   │            │ Policy      │                     │
│  └─────────────┘            │ Authority   │                     │
│                             └─────────────┘                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```

### Execution Order

1. **Sprint 0350** and **Sprint 0351** do not depend on each other and can run in parallel.
2. **Sprint 0352** does not depend on 0350 or 0351 and can run in parallel with them.
3. **Sprint 0353** should start after 0352, so that the security tests are stable before mutation testing begins.

### Estimated Duration

| Sprint | Tasks | Estimated Effort |
|--------|-------|------------------|
| 0350 | 10 | 2-3 developer-days |
| 0351 | 10 | 3-4 developer-days |
| 0352 | 10 | 4-5 developer-days |
| 0353 | 10 | 3-4 developer-days |
| **Total** | **40** | **12-16 developer-days** |

---

## Sprint Details

### Sprint 0350: CI Quality Gates Foundation

**File:** `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`

**Objective:** Connect existing test infrastructure to CI enforcement

**Key Deliverables:**
- `scripts/ci/compute-reachability-metrics.sh` - Compute recall/precision from corpus
- `scripts/ci/reachability-thresholds.yaml` - Enforcement thresholds
- `scripts/ci/compute-ttfs-metrics.sh` - TTFS extraction from test runs
- `bench/baselines/ttfs-baseline.json` - TTFS targets
- `scripts/ci/enforce-performance-slos.sh` - Performance SLO checks
- CI workflow modifications for quality gates

**Quality Thresholds:**
```yaml
# Values are comparison expressions, quoted so that YAML parses them as strings.
thresholds:
  runtime_dependency_recall: ">= 0.95"
  unreachable_false_positives: "<= 0.05"
  reachability_underreport: "<= 0.10"
  ttfs_regression: "<= +10% vs main"
```
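As a rough illustration of how an enforcement step might consume these thresholds, the following Python sketch parses each comparison expression and checks it against a measured value. The function names, the `"<op> <number>"` expression format, and the sample measurements are assumptions made for this example; they are not taken from the sprint scripts. The `ttfs_regression` entry is left out because it is relative to `main` rather than an absolute limit.

```python
import operator

# Comparison operators accepted in a threshold expression.
# Assumed format: "<op> <number>", e.g. ">= 0.95".
_OPS = {">=": operator.ge, "<=": operator.le}


def parse_threshold(expr):
    """Split an expression such as '>= 0.95' into (comparison function, limit)."""
    op, value = expr.split()
    return _OPS[op], float(value)


def evaluate_gates(thresholds, measured):
    """Return a list of violations; an empty list means every gate passed."""
    violations = []
    for name, expr in thresholds.items():
        compare, limit = parse_threshold(expr)
        if name not in measured:
            # A missing metric is treated as a violation rather than a silent pass.
            violations.append(f"{name}: no measurement")
        elif not compare(measured[name], limit):
            violations.append(f"{name}: {measured[name]} violates {expr}")
    return violations


THRESHOLDS = {
    "runtime_dependency_recall": ">= 0.95",
    "unreachable_false_positives": "<= 0.05",
    "reachability_underreport": "<= 0.10",
}

# Hypothetical measurements, for illustration only.
sample = {
    "runtime_dependency_recall": 0.97,
    "unreachable_false_positives": 0.08,
    "reachability_underreport": 0.04,
}

for violation in evaluate_gates(THRESHOLDS, sample):
    print(violation)
```

Treating a missing measurement as a violation is a design choice in this sketch: it keeps a broken metrics step from passing the gate unnoticed.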

---

### Sprint 0351: SCA Failure Catalogue Completion

**File:** `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`

**Objective:** Complete FC6-FC10 test cases for scanner regression testing

**New Failure Cases:**
| ID | Name | Failure Mode |
|----|------|--------------|
| FC6 | Java Shadow JAR | Shaded dependencies not detected |
| FC7 | .NET Transitive Pinning | CPM pins to vulnerable version |
| FC8 | Docker Multi-Stage Leakage | Build-time deps in runtime analysis |
| FC9 | PURL Namespace Collision | npm vs pypi same package name |
| FC10 | CVE Split/Merge | Single vuln with multiple CVE IDs |

**Key Deliverables:**
- 5 new fixture directories under `tests/fixtures/sca/catalogue/`
- DSSE manifests for integrity verification
- xUnit test project for failure catalogue
- Updated documentation
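
To make the FC9 failure mode concrete, the sketch below shows one way a name-only lookup can be avoided: advisories are matched on the package URL's type, namespace, and name together, so an npm package and a PyPI package with the same name stay distinct. The parser is deliberately minimal and is not a complete implementation of the purl specification (it skips percent-decoding and per-ecosystem name normalization). The advisory ID and package versions are hypothetical.

```python
def purl_key(purl):
    """Return (type, namespace, name), ignoring version, qualifiers, and subpath."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    # Drop the subpath (#...) and qualifiers (?...), then the version (@...).
    body = purl[4:].split("#", 1)[0].split("?", 1)[0]
    path = body.rsplit("@", 1)[0]
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"purl needs a type and a name: {purl!r}")
    ptype = parts[0].lower()           # the type identifies the ecosystem
    namespace = "/".join(parts[1:-1])  # empty when the purl has no namespace
    name = parts[-1]
    return ptype, namespace, name


def matching_advisories(component_purl, advisories):
    """Return IDs of advisories whose affected package has the same purl key.

    `advisories` maps an advisory ID to the purl of the affected package.
    """
    key = purl_key(component_purl)
    return sorted(
        adv_id for adv_id, affected in advisories.items()
        if purl_key(affected) == key
    )


# Hypothetical advisory that affects only the PyPI package.
ADVISORIES = {"ADV-0001": "pkg:pypi/requests"}

print(matching_advisories("pkg:pypi/requests@2.31.0", ADVISORIES))
print(matching_advisories("pkg:npm/requests@2.31.0", ADVISORIES))
```

A fixture for FC9 would pair two such components and assert that the advisory is reported for only one of them.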

---

### Sprint 0352: Security Testing Framework

**File:** `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`

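**Objective:** Systematic OWASP Top 10 coverage for StellaOps. The coverage matrix lists seven of the ten categories; A04, A06, and A09 are not included in this sprint.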

**Coverage Matrix:**
| OWASP | Category | Test Count |
|-------|----------|------------|
| A01 | Broken Access Control | 8+ |
| A02 | Cryptographic Failures | 6+ |
| A03 | Injection | 10+ |
| A05 | Security Misconfiguration | 6+ |
| A07 | Authentication Failures | 8+ |
| A08 | Integrity Failures | 5+ |
| A10 | SSRF | 8+ |
| **Total** | **7 categories** | **50+ tests** |

**Key Deliverables:**
- `tests/security/StellaOps.Security.Tests/` - Security test project
- `MaliciousPayloads.cs` - Common attack patterns
- `SecurityTestBase.cs` - Test infrastructure
- `.gitea/workflows/security-tests.yml` - Dedicated CI workflow
- `docs/testing/security-testing-guide.md` - Documentation
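
As an illustration of the kind of check the A10 (SSRF) tests would exercise, the Python sketch below classifies outbound URLs and rejects those that point at loopback, private, or link-local addresses. It is a sketch under stated assumptions, not the planned `MaliciousPayloads.cs` content: the function name and payload list are hypothetical, and the check only handles IP literals and `localhost`. A production check would also need to resolve hostnames and validate the resolved addresses.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_blocked_target(url):
    """Return True when the URL must not be fetched by a server-side client."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return True  # e.g. file:// or gopher://
    host = parsed.hostname
    if not host:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch stops here; a real check would
        # resolve the name and test every resolved address.
        return False
    return (
        addr.is_loopback
        or addr.is_private
        or addr.is_link_local
        or addr.is_unspecified
    )


# Hypothetical payloads of the kind a shared payload catalogue might hold.
SSRF_PAYLOADS = [
    "http://127.0.0.1/admin",
    "http://[::1]/",
    "http://10.0.0.5:8080/",
    "http://169.254.169.254/latest/meta-data/",
    "file:///etc/passwd",
    "http://localhost/",
]

for payload in SSRF_PAYLOADS:
    print(payload, is_blocked_target(payload))
```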

---

### Sprint 0353: Mutation Testing Integration

**File:** `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

**Objective:** Measure test suite effectiveness with Stryker.NET

**Target Modules:**
| Module | Threshold (Break) | Threshold (High) |
|--------|-------------------|------------------|
| Scanner.Core | 60% | 85% |
| Policy.Engine | 60% | 85% |
| Authority.Core | 65% | 90% |

**Key Deliverables:**
- Stryker.NET configuration for each target module
- `scripts/ci/mutation-thresholds.yaml` - Threshold configuration
- `.gitea/workflows/mutation-testing.yml` - Weekly mutation runs
- `bench/baselines/mutation-baselines.json` - Baseline scores
- `docs/testing/mutation-testing-guide.md` - Developer guide
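
The break and high thresholds in the table can be read as follows: a score below the break threshold fails the run, and a score at or above the high threshold meets the target band. The Python sketch below shows that classification. It assumes the commonly used definition of mutation score (detected mutants divided by valid mutants); the exact counting rules should be confirmed against the Stryker.NET report format. The function names and mutant counts are hypothetical.

```python
# (break, high) thresholds in percent, taken from the Target Modules table.
THRESHOLDS = {
    "Scanner.Core": (60, 85),
    "Policy.Engine": (60, 85),
    "Authority.Core": (65, 90),
}


def mutation_score(killed, timeout, survived, no_coverage):
    """Percentage of valid mutants that the test suite detected.

    Assumed definition: detected = killed + timeout,
    valid = detected + survived + no_coverage.
    """
    detected = killed + timeout
    valid = detected + survived + no_coverage
    if valid == 0:
        raise ValueError("no valid mutants to score")
    return 100.0 * detected / valid


def classify(module, score):
    """Return 'fail' below break, 'high' at or above high, otherwise 'pass'."""
    break_at, high_at = THRESHOLDS[module]
    if score < break_at:
        return "fail"
    if score >= high_at:
        return "high"
    return "pass"


# Hypothetical counts: 130 detected out of 200 valid mutants.
score = mutation_score(killed=120, timeout=10, survived=60, no_coverage=10)
print(score, classify("Authority.Core", score))
```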

---

## Existing Infrastructure Mapping

The advisory describes structures that already exist under different names:

| Advisory Structure | Existing Equivalent | Notes |
|-------------------|---------------------|-------|
| `toys/svc-XX/` | `tests/reachability/corpus/` | Same purpose, different path |
| `labels.yaml` | `reachgraph.truth.json` | Different schema, same semantics |
| `evidence/trace.json` | Evidence in attestor module | Already implemented |
| `PostgresFixture` | `PostgresIntegrationFixture` | Already implemented |
| `FakeTimeProvider` | Authority tests, ConnectorTestHarness | Already used |
| `inputs.lock` | Exists in acceptance/guardrails | Already implemented |

---

## Quality Gate Summary

After implementation, CI will enforce:

### Reachability Gates
- Runtime dependency recall ≥ 95%
- Unreachable false positives ≤ 5%
- Reachability underreport ≤ 10%

### Performance Gates
- Medium service scan < 2 minutes
- Reachability compute < 30 seconds
- SBOM ingestion < 5 seconds

### TTFS Gates
- p50 < 2 seconds
- p95 < 5 seconds
- Regression ≤ +10% vs main
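
The TTFS gates combine two absolute limits with one relative limit. The Python sketch below computes p50 and p95 with the nearest-rank method and applies all three checks. Two points are assumptions rather than statements from the plan: the regression check is applied to p95 (the plan does not say which statistic is compared against `main`), and the baseline is passed in as a plain number rather than read from `ttfs-baseline.json`. The sample timings are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def ttfs_gate(samples, baseline_p95, p50_limit=2.0, p95_limit=5.0,
              max_regression=0.10):
    """Return a list of failed TTFS checks (empty when the gate passes)."""
    p50 = percentile(samples, 50)
    p95 = percentile(samples, 95)
    failures = []
    if p50 >= p50_limit:
        failures.append(f"p50 {p50}s is not below {p50_limit}s")
    if p95 >= p95_limit:
        failures.append(f"p95 {p95}s is not below {p95_limit}s")
    # Assumption: regression is measured on p95 against the main-branch baseline.
    if p95 > baseline_p95 * (1 + max_regression):
        failures.append(f"p95 {p95}s regressed more than "
                        f"{max_regression:.0%} vs baseline {baseline_p95}s")
    return failures


# Hypothetical TTFS samples in seconds.
samples = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 1.9, 2.4, 4.6]
print(ttfs_gate(samples, baseline_p95=4.0))
```

With these samples the absolute limits pass, and only the regression check fails against a 4.0-second baseline.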

### Coverage Gates
- Line coverage ≥ 70% (existing)
- Branch coverage ≥ 60% (existing)
- Mutation score ≥ 60% for Scanner.Core and Policy.Engine, ≥ 65% for Authority.Core (break thresholds)

### Security Gates
- All security tests pass
- No violations in the covered OWASP Top 10 categories

---

## File Changes Summary

### New Files

```
scripts/ci/
├── compute-reachability-metrics.sh
├── compute-ttfs-metrics.sh
├── enforce-performance-slos.sh
├── enforce-thresholds.sh
├── enforce-mutation-thresholds.sh
├── extract-mutation-score.sh
├── reachability-thresholds.yaml
└── mutation-thresholds.yaml

bench/baselines/
├── ttfs-baseline.json
└── mutation-baselines.json

tests/
├── fixtures/sca/catalogue/
│   ├── fc6-java-shadow-jar/
│   ├── fc7-dotnet-transitive-pinning/
│   ├── fc8-docker-multistage-leakage/
│   ├── fc9-purl-namespace-collision/
│   └── fc10-cve-split-merge/
└── security/
    └── StellaOps.Security.Tests/

.gitea/workflows/
├── security-tests.yml
└── mutation-testing.yml

.config/
└── dotnet-tools.json (stryker)

stryker-config.json (root)

src/Scanner/__Libraries/StellaOps.Scanner.Core/stryker-config.json
src/Policy/StellaOps.Policy.Engine/stryker-config.json
src/Authority/StellaOps.Authority.Core/stryker-config.json

docs/testing/
├── ci-quality-gates.md
├── security-testing-guide.md
└── mutation-testing-guide.md
```

### Modified Files

```
.gitea/workflows/build-test-deploy.yml
tests/fixtures/sca/catalogue/inputs.lock
tests/fixtures/sca/catalogue/README.md
README.md (badges)
```

---

## Rollback Strategy

If quality gates cause CI instability:

1. **Immediate:** Set `failure_mode: warn` in threshold configs
2. **Short-term:** Remove `needs:` dependencies to unblock other jobs
3. **Investigation:** Create issue with specific threshold that failed
4. **Resolution:** Either fix underlying issue or adjust threshold
5. **Re-enable:** Set `failure_mode: block` after verification
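
The `warn` and `block` modes in steps 1 and 5 could map to a gate's exit status along the lines of the Python sketch below. The `failure_mode` key comes from this plan; the function name and the way the enforcement scripts would read the setting are assumptions made for illustration.

```python
def gate_exit_code(violations, failure_mode):
    """Map gate violations to a process exit code.

    'warn'  reports violations but never fails the job.
    'block' fails the job when at least one violation exists.
    """
    if failure_mode not in ("warn", "block"):
        raise ValueError(f"unknown failure_mode: {failure_mode!r}")
    label = "WARNING" if failure_mode == "warn" else "ERROR"
    for violation in violations:
        print(f"[{label}] {violation}")
    return 1 if violations and failure_mode == "block" else 0


# Hypothetical violation message.
found = ["unreachable_false_positives: 0.08 violates <= 0.05"]
print(gate_exit_code(found, "warn"))
print(gate_exit_code(found, "block"))
```

Rejecting unknown modes outright keeps a typo in a threshold config from silently disabling a gate.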

---

## Success Metrics

| Metric | Current | Target | Measurement |
|--------|---------|--------|-------------|
| FC Catalogue coverage | 5 cases | 10 cases | Count of fixtures |
| Security test coverage | Partial | 50+ tests | Test count across OWASP categories |
| Mutation score (Scanner) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Policy) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Authority) | Unknown | ≥ 80% | Stryker weekly |
| Quality gate pass rate | N/A | ≥ 95% | CI runs |

---

## References

### Sprint Files
- `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`
- `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`
- `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`
- `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

### Source Advisory
- `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

### Existing Documentation
- `docs/19_TEST_SUITE_OVERVIEW.md`
- `docs/reachability/ground-truth-schema.md`
- `docs/reachability/corpus-plan.md`
- `tests/reachability/README.md`

---

## Document Version

| Field | Value |
|-------|-------|
| Version | 1.0 |
| Created | 2025-12-14 |
| Author | Platform Team |
| Status | Planning Complete |