# Testing Quality Guardrails Implementation

## Overview

This document provides the master implementation plan for the Testing Quality Guardrails system derived from the `14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md` product advisory.

**Source Advisory:** `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

**Implementation Status:** Planning Complete, Execution Pending

---

## Executive Summary

The Testing Quality Guardrails implementation addresses gaps between the product advisory and the current StellaOps codebase. After analysis, we identified 6 high-value items to implement across 4 focused sprints.

### What We're Implementing

| Item | Sprint | Value | Effort |
|------|--------|-------|--------|
| Reachability quality gates in CI | 0350 | HIGH | LOW |
| TTFS regression tracking | 0350 | HIGH | LOW |
| Performance SLO enforcement | 0350 | HIGH | LOW |
| SCA Failure Catalogue (FC6-FC10) | 0351 | HIGH | MEDIUM |
| Security testing (OWASP Top 10) | 0352 | HIGH | MEDIUM |
| Mutation testing (Stryker.NET) | 0353 | MEDIUM | MEDIUM |

### What We're NOT Implementing (and Why)

| Item | Reason |
|------|--------|
| `toys/svc-XX/` directory restructure | Already have equivalent in `tests/reachability/corpus/` |
| `labels.yaml` per-service format | Already have `reachgraph.truth.json` with same semantics |
| Canonical TAG format | Can adopt incrementally, not blocking |
| Fix validation 100% pass rate | Too rigid; changed to 90% for fixable cases |
| Automated reviewer rejection | Over-engineering; human judgment needed |

---

## Sprint Roadmap

```
┌─────────────────────────────────────────────────────────────────┐
│                   TESTING QUALITY GUARDRAILS                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Sprint 0350                   Sprint 0351                      │
│  CI Quality Gates              SCA Failure Catalogue            │
│  ┌─────────────┐               ┌─────────────┐                  │
│  │ Reachability│               │ FC6: Java   │                  │
│  │ TTFS        │               │ FC7: .NET   │                  │
│  │ Performance │               │ FC8: Docker │                  │
│  └─────────────┘               │ FC9: PURL   │                  │
│         │                      │ FC10: CVE   │                  │
│         │                      └─────────────┘                  │
│         │                             │                         │
│         └──────────┬──────────────────┘                         │
│                    │                                            │
│                    ▼                                            │
│  Sprint 0352                   Sprint 0353                      │
│  Security Testing              Mutation Testing                 │
│  ┌─────────────┐               ┌─────────────┐                  │
│  │ OWASP Top 10│               │ Stryker.NET │                  │
│  │ A01-A10     │               │ Scanner     │                  │
│  │ 50+ tests   │               │ Policy      │                  │
│  └─────────────┘               │ Authority   │                  │
│                                └─────────────┘                  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

### Execution Order

1. **Sprint 0350** and **Sprint 0351** can run in parallel (no dependencies)
2. **Sprint 0352** can run in parallel with 0350/0351
3. **Sprint 0353** should start after 0352 (security tests should be stable first)

### Estimated Duration

| Sprint | Tasks | Estimated Effort |
|--------|-------|------------------|
| 0350 | 10 | 2-3 developer-days |
| 0351 | 10 | 3-4 developer-days |
| 0352 | 10 | 4-5 developer-days |
| 0353 | 10 | 3-4 developer-days |
| **Total** | **40** | **12-16 developer-days** |

---

## Sprint Details

### Sprint 0350: CI Quality Gates Foundation

**File:** `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`

**Objective:** Connect existing test infrastructure to CI enforcement

**Key Deliverables:**
- `scripts/ci/compute-reachability-metrics.sh` - Compute recall/precision from corpus
- `scripts/ci/reachability-thresholds.yaml` - Enforcement thresholds
- `scripts/ci/compute-ttfs-metrics.sh` - TTFS extraction from test runs
- `bench/baselines/ttfs-baseline.json` - TTFS targets
- `scripts/ci/enforce-performance-slos.sh` - Performance SLO checks
- CI workflow modifications for quality gates

**Quality Thresholds:**
```yaml
thresholds:
  # Quoted so that a leading ">" is not read as a YAML folded-scalar indicator
  runtime_dependency_recall: ">= 0.95"
  unreachable_false_positives: "<= 0.05"
  reachability_underreport: "<= 0.10"
  ttfs_regression: "<= +10% vs main"
```

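
As a rough illustration of how a gate script might evaluate rules like these, the sketch below checks measured values against comparator strings. It is written in Python for brevity, whereas the planned scripts are shell; the `check` function, the `THRESHOLDS` dictionary, and the sample metrics are hypothetical and not taken from the repository. The relative `ttfs_regression` rule is left out because it needs a baseline from main.

```python
import operator

# Hypothetical in-memory copy of the absolute thresholds; a real script would
# load them from scripts/ci/reachability-thresholds.yaml.
THRESHOLDS = {
    "runtime_dependency_recall": ">= 0.95",
    "unreachable_false_positives": "<= 0.05",
    "reachability_underreport": "<= 0.10",
}

_OPS = {">=": operator.ge, "<=": operator.le}


def check(metrics, thresholds=THRESHOLDS):
    """Return (name, measured, rule) for every threshold that is violated."""
    violations = []
    for name, rule in thresholds.items():
        op, limit = rule.split()  # e.g. ">= 0.95" -> (">=", "0.95")
        if not _OPS[op](metrics[name], float(limit)):
            violations.append((name, metrics[name], rule))
    return violations


# Made-up measurements for illustration only.
sample = {
    "runtime_dependency_recall": 0.97,
    "unreachable_false_positives": 0.08,
    "reachability_underreport": 0.10,
}

for name, measured, rule in check(sample):
    print(f"FAIL {name}: measured {measured}, required {rule}")
```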

---

### Sprint 0351: SCA Failure Catalogue Completion

**File:** `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`

**Objective:** Complete FC6-FC10 test cases for scanner regression testing

**New Failure Cases:**
| ID | Name | Failure Mode |
|----|------|--------------|
| FC6 | Java Shadow JAR | Shaded dependencies not detected |
| FC7 | .NET Transitive Pinning | CPM (Central Package Management) pins to a vulnerable version |
| FC8 | Docker Multi-Stage Leakage | Build-time deps in runtime analysis |
| FC9 | PURL Namespace Collision | Same package name in npm and PyPI |
| FC10 | CVE Split/Merge | Single vuln with multiple CVE IDs |

**Key Deliverables:**
- 5 new fixture directories under `tests/fixtures/sca/catalogue/`
- DSSE manifests for integrity verification
- xUnit test project for failure catalogue
- Updated documentation

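
To make the FC9 failure mode concrete, the sketch below shows why matching on package name alone collides across ecosystems, while keying on the package URL type keeps them apart. It is a simplified illustration in Python, not the scanner's matching logic: `purl_key`, `match`, the advisory identifier, and the version numbers are all made up for the example, and the parsing ignores percent-decoding and other details of the purl format.

```python
def purl_key(purl):
    """Return (type, namespace, name) from a package URL (simplified parsing)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl}")
    # Drop subpath, qualifiers, and version, in that order.
    body = purl[4:].split("#", 1)[0].split("?", 1)[0]
    body = body.rsplit("@", 1)[0]
    pkg_type, _, path = body.partition("/")
    namespace, _, name = path.rpartition("/")
    return (pkg_type.lower(), namespace, name)


# Hypothetical advisory index keyed by ecosystem as well as name.
ADVISORIES = {("pypi", "", "requests"): ["EXAMPLE-ADVISORY-1"]}


def match(purl):
    """Look up advisories for a purl without crossing ecosystems."""
    return ADVISORIES.get(purl_key(purl), [])


print(match("pkg:pypi/requests@2.31.0"))  # the PyPI package matches
print(match("pkg:npm/requests@2.31.0"))   # same name in npm must not match
```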

---

### Sprint 0352: Security Testing Framework

**File:** `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`

**Objective:** Systematic OWASP Top 10 coverage for StellaOps

**Coverage Matrix:**
| OWASP | Category | Test Count |
|-------|----------|------------|
| A01 | Broken Access Control | 8+ |
| A02 | Cryptographic Failures | 6+ |
| A03 | Injection | 10+ |
| A05 | Security Misconfiguration | 6+ |
| A07 | Authentication Failures | 8+ |
| A08 | Integrity Failures | 5+ |
| A10 | SSRF | 8+ |
| **Total** | **7 categories** | **50+ tests** |

**Key Deliverables:**
- `tests/security/StellaOps.Security.Tests/` - Security test project
- `MaliciousPayloads.cs` - Common attack patterns
- `SecurityTestBase.cs` - Test infrastructure
- `.gitea/workflows/security-tests.yml` - Dedicated CI workflow
- `docs/testing/security-testing-guide.md` - Documentation

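
The payload-driven style implied by `MaliciousPayloads.cs` can be illustrated with a small A10 (SSRF) example. This is only a sketch in Python under stated assumptions: the planned suite is an xUnit project whose contents are not shown in this plan, and the payload list and the `is_internal_target` helper below are hypothetical. The helper only inspects literal IP addresses; hostnames that resolve to internal addresses are out of scope here.

```python
import ipaddress
from urllib.parse import urlsplit

# Illustrative SSRF payloads; not taken from the planned MaliciousPayloads.cs.
SSRF_PAYLOADS = [
    "http://127.0.0.1/admin",
    "http://169.254.169.254/latest/meta-data/",
    "http://10.0.0.5:8080/",
    "http://[::1]/",
]


def is_internal_target(url):
    """True if the URL host is a literal IP in a loopback, private, or link-local range."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no parseable host: reject rather than guess
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname, not a literal IP; DNS checks are not modelled
    return addr.is_loopback or addr.is_private or addr.is_link_local


# Each payload should be rejected by a guard built on this check.
for payload in SSRF_PAYLOADS:
    print(payload, "->", "blocked" if is_internal_target(payload) else "allowed")
```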

---

### Sprint 0353: Mutation Testing Integration

**File:** `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

**Objective:** Measure test suite effectiveness with Stryker.NET

**Target Modules:**
| Module | Threshold (Break) | Threshold (High) |
|--------|-------------------|------------------|
| Scanner.Core | 60% | 85% |
| Policy.Engine | 60% | 85% |
| Authority.Core | 65% | 90% |

**Key Deliverables:**
- Stryker.NET configuration for each target module
- `scripts/ci/mutation-thresholds.yaml` - Threshold configuration
- `.gitea/workflows/mutation-testing.yml` - Weekly mutation runs
- `bench/baselines/mutation-baselines.json` - Baseline scores
- `docs/testing/mutation-testing-guide.md` - Developer guide

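
A minimal sketch of how the per-module thresholds could be applied to a reported mutation score follows. It assumes the common reading that a score below the break threshold fails the run and a score at or above the high threshold counts as healthy; the `classify` function and its return labels are hypothetical, and the planned `enforce-mutation-thresholds.sh` may behave differently.

```python
# Break/high thresholds (percent) copied from the Target Modules table.
MODULE_THRESHOLDS = {
    "Scanner.Core": {"break": 60, "high": 85},
    "Policy.Engine": {"break": 60, "high": 85},
    "Authority.Core": {"break": 65, "high": 90},
}


def classify(module, score):
    """Classify a mutation score (percent) as 'fail', 'pass', or 'high'."""
    limits = MODULE_THRESHOLDS[module]
    if score < limits["break"]:
        return "fail"  # below the break threshold: the gate should fail
    if score >= limits["high"]:
        return "high"  # at or above the high threshold
    return "pass"


print(classify("Authority.Core", 64.9))
print(classify("Scanner.Core", 72.0))
```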

---

## Existing Infrastructure Mapping

The advisory describes structures that already exist under different names:

| Advisory Structure | Existing Equivalent | Notes |
|-------------------|---------------------|-------|
| `toys/svc-XX/` | `tests/reachability/corpus/` | Same purpose, different path |
| `labels.yaml` | `reachgraph.truth.json` | Different schema, same semantics |
| `evidence/trace.json` | Evidence in attestor module | Already implemented |
| `PostgresFixture` | `PostgresIntegrationFixture` | Already implemented |
| `FakeTimeProvider` | Authority tests, ConnectorTestHarness | Already used |
| `inputs.lock` | Exists in acceptance/guardrails | Already implemented |

---

## Quality Gate Summary

After implementation, CI will enforce:

### Reachability Gates
- Runtime dependency recall ≥ 95%
- Unreachable false positives ≤ 5%
- Reachability underreport ≤ 10%

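
The plan does not spell out how these ratios are derived from the corpus, so the sketch below only shows one plausible computation: recall over findings labelled reachable and a false-positive rate over findings labelled unreachable. The metric definitions, the `reachability_metrics` function, and the finding IDs are assumptions for illustration, not the schema of `reachgraph.truth.json`.

```python
def reachability_metrics(truth, predicted):
    """Compare ground truth with scanner output.

    Both arguments map a finding ID to True (reachable) or False (unreachable).
    A finding missing from `predicted` is treated as reported unreachable.
    """
    tp = fn = fp = tn = 0
    for finding, reachable in truth.items():
        reported = predicted.get(finding, False)
        if reachable and reported:
            tp += 1
        elif reachable:
            fn += 1
        elif reported:
            fp += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if tp + fn else 1.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return {"recall": recall, "false_positive_rate": false_positive_rate}


# Made-up labels: four reachable findings, two unreachable ones.
truth = {"a": True, "b": True, "c": True, "d": True, "e": False, "f": False}
predicted = {"a": True, "b": True, "c": True, "d": False, "e": True, "f": False}
print(reachability_metrics(truth, predicted))
```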
### Performance Gates
- Medium service scan < 2 minutes
- Reachability compute < 30 seconds
- SBOM ingestion < 5 seconds

### TTFS Gates
- p50 < 2 seconds
- p95 < 5 seconds
- Regression ≤ +10% vs main

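
The TTFS gates combine two absolute limits with one relative limit, which can be sketched as follows. This is an illustrative Python version, not the planned `compute-ttfs-metrics.sh`: the nearest-rank percentile method, the choice to measure regression on p50, and the sample timings are all assumptions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def ttfs_gate(samples, baseline_p50, p50_limit=2.0, p95_limit=5.0, max_regression=0.10):
    """Evaluate TTFS samples (seconds) against absolute and relative limits."""
    p50 = percentile(samples, 50)
    p95 = percentile(samples, 95)
    regression = (p50 - baseline_p50) / baseline_p50  # assumed: regression on p50
    ok = p50 < p50_limit and p95 < p95_limit and regression <= max_regression
    return {"p50": p50, "p95": p95, "regression": regression, "ok": ok}


# Made-up timings in seconds.
samples = [1.0, 1.2, 1.1, 1.5, 4.0, 1.3, 0.9, 1.4, 1.6, 1.2]
print(ttfs_gate(samples, baseline_p50=1.2))
print(ttfs_gate(samples, baseline_p50=1.0))
```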
### Coverage Gates
- Line coverage ≥ 70% (existing)
- Branch coverage ≥ 60% (existing)
- Mutation score ≥ 60% for Scanner.Core and Policy.Engine, ≥ 65% for Authority.Core (break thresholds)

### Security Gates
- All security tests pass
- No violations in the covered OWASP Top 10 categories (A01, A02, A03, A05, A07, A08, A10)

---

## File Changes Summary

### New Files

```
scripts/ci/
├── compute-reachability-metrics.sh
├── compute-ttfs-metrics.sh
├── enforce-performance-slos.sh
├── enforce-thresholds.sh
├── enforce-mutation-thresholds.sh
├── extract-mutation-score.sh
├── reachability-thresholds.yaml
└── mutation-thresholds.yaml

bench/baselines/
├── ttfs-baseline.json
└── mutation-baselines.json

tests/
├── fixtures/sca/catalogue/
│   ├── fc6-java-shadow-jar/
│   ├── fc7-dotnet-transitive-pinning/
│   ├── fc8-docker-multistage-leakage/
│   ├── fc9-purl-namespace-collision/
│   └── fc10-cve-split-merge/
└── security/
    └── StellaOps.Security.Tests/

.gitea/workflows/
├── security-tests.yml
└── mutation-testing.yml

.config/
└── dotnet-tools.json (stryker)

stryker-config.json (root)

src/Scanner/__Libraries/StellaOps.Scanner.Core/stryker-config.json
src/Policy/StellaOps.Policy.Engine/stryker-config.json
src/Authority/StellaOps.Authority.Core/stryker-config.json

docs/testing/
├── ci-quality-gates.md
├── security-testing-guide.md
└── mutation-testing-guide.md
```

### Modified Files

```
.gitea/workflows/build-test-deploy.yml
tests/fixtures/sca/catalogue/inputs.lock
tests/fixtures/sca/catalogue/README.md
README.md (badges)
```

---

## Rollback Strategy

If quality gates cause CI instability:

1. **Immediate:** Set `failure_mode: warn` in threshold configs
2. **Short-term:** Remove `needs:` dependencies to unblock other jobs
3. **Investigation:** Create an issue naming the specific threshold that failed
4. **Resolution:** Either fix the underlying issue or adjust the threshold
5. **Re-enable:** Set `failure_mode: block` after verification

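
The `failure_mode` switch used in steps 1 and 5 can be pictured as a small mapping from gate results to an exit code. The sketch below is a hypothetical Python rendering of that idea; only the `warn` and `block` values come from this plan, while the `gate_exit_code` function and the message format are assumptions.

```python
def gate_exit_code(violations, failure_mode):
    """Return the process exit code for a gate run.

    In 'warn' mode violations are reported but never fail the job;
    in 'block' mode any violation yields a non-zero exit code.
    """
    if failure_mode not in ("warn", "block"):
        raise ValueError(f"unknown failure_mode: {failure_mode}")
    label = "WARNING" if failure_mode == "warn" else "ERROR"
    for violation in violations:
        print(f"{label}: {violation}")
    return 1 if violations and failure_mode == "block" else 0


# The same violation passes the job in warn mode and fails it in block mode.
print(gate_exit_code(["ttfs_regression exceeded"], "warn"))
print(gate_exit_code(["ttfs_regression exceeded"], "block"))
```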

---

## Success Metrics

| Metric | Current | Target | Measurement |
|--------|---------|--------|-------------|
| FC Catalogue coverage | 5 cases | 10 cases | Count of fixtures |
| Security test coverage | Partial | 50+ tests | OWASP categories |
| Mutation score (Scanner) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Policy) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Authority) | Unknown | ≥ 80% | Stryker weekly |
| Quality gate pass rate | N/A | ≥ 95% | CI runs |

---

## References

### Sprint Files
- `docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md`
- `docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md`
- `docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md`
- `docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md`

### Source Advisory
- `docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md`

### Existing Documentation
- `docs/19_TEST_SUITE_OVERVIEW.md`
- `docs/reachability/ground-truth-schema.md`
- `docs/reachability/corpus-plan.md`
- `tests/reachability/README.md`

---

## Document Version

| Field | Value |
|-------|-------|
| Version | 1.0 |
| Created | 2025-12-14 |
| Author | Platform Team |
| Status | Planning Complete |