Testing Quality Guardrails Implementation

Overview

This document is the master implementation plan for the Testing Quality Guardrails system. It is derived from the product advisory "14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md".

Source Advisory: docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md

Implementation Status: Planning Complete, Execution Pending


Executive Summary

The Testing Quality Guardrails implementation addresses gaps between the product advisory and the current StellaOps codebase. After analysis, we identified 6 high-value items to implement across 4 focused sprints.

What We're Implementing

| Item | Sprint | Value | Effort |
|------|--------|-------|--------|
| Reachability quality gates in CI | 0350 | HIGH | LOW |
| TTFS regression tracking | 0350 | HIGH | LOW |
| Performance SLO enforcement | 0350 | HIGH | LOW |
| SCA Failure Catalogue (FC6-FC10) | 0351 | HIGH | MEDIUM |
| Security testing (OWASP Top 10) | 0352 | HIGH | MEDIUM |
| Mutation testing (Stryker.NET) | 0353 | MEDIUM | MEDIUM |

What We're NOT Implementing (and Why)

| Item | Reason |
|------|--------|
| toys/svc-XX/ directory restructure | Already have equivalent in tests/reachability/corpus/ |
| labels.yaml per-service format | Already have reachgraph.truth.json with same semantics |
| Canonical TAG format | Can adopt incrementally, not blocking |
| Fix validation 100% pass rate | Too rigid; changed to 90% for fixable cases |
| Automated reviewer rejection | Over-engineering; human judgment needed |

Sprint Roadmap

┌─────────────────────────────────────────────────────────────────┐
│                    TESTING QUALITY GUARDRAILS                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Sprint 0350                Sprint 0351                          │
│  CI Quality Gates           SCA Failure Catalogue                │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ Reachability│            │ FC6: Java   │                     │
│  │ TTFS        │            │ FC7: .NET   │                     │
│  │ Performance │            │ FC8: Docker │                     │
│  └─────────────┘            │ FC9: PURL   │                     │
│       │                     │ FC10: CVE   │                     │
│       │                     └─────────────┘                     │
│       │                          │                              │
│       └──────────┬───────────────┘                              │
│                  │                                               │
│                  ▼                                               │
│  Sprint 0352                Sprint 0353                          │
│  Security Testing           Mutation Testing                     │
│  ┌─────────────┐            ┌─────────────┐                     │
│  │ OWASP Top 10│            │ Stryker.NET │                     │
│  │ A01-A10     │            │ Scanner     │                     │
│  │ 50+ tests   │            │ Policy      │                     │
│  └─────────────┘            │ Authority   │                     │
│                             └─────────────┘                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Execution Order

  1. Sprint 0350 and Sprint 0351 can run in parallel (no dependencies)
  2. Sprint 0352 can run in parallel with 0350/0351
  3. Sprint 0353 should start after 0352 (security tests should be stable first)

Estimated Duration

| Sprint | Tasks | Estimated Effort |
|--------|-------|------------------|
| 0350 | 10 | 2-3 developer-days |
| 0351 | 10 | 3-4 developer-days |
| 0352 | 10 | 4-5 developer-days |
| 0353 | 10 | 3-4 developer-days |
| Total | 40 | 12-16 developer-days |

Sprint Details

Sprint 0350: CI Quality Gates Foundation

File: docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md

Objective: Connect existing test infrastructure to CI enforcement

Key Deliverables:

  • scripts/ci/compute-reachability-metrics.sh - Compute recall/precision from corpus
  • scripts/ci/reachability-thresholds.yaml - Enforcement thresholds
  • scripts/ci/compute-ttfs-metrics.sh - TTFS extraction from test runs
  • bench/baselines/ttfs-baseline.json - TTFS targets
  • scripts/ci/enforce-performance-slos.sh - Performance SLO checks
  • CI workflow modifications for quality gates
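
The recall/precision computation behind the reachability metrics can be pictured as a set comparison between ground truth and scanner output. The following is a minimal sketch in Python, chosen for readability; the planned deliverable is a shell script, and the function name, the flat set-of-identifiers data shape, and the sample identifiers are all assumptions rather than the reachgraph.truth.json schema.

```python
def recall_precision(truth, reported):
    """Return (recall, precision) of `reported` against ground-truth `truth`.

    Both arguments are sets of identifiers. An empty denominator is treated
    as a perfect score so an empty corpus entry does not fail the gate.
    """
    true_positives = len(truth & reported)
    recall = true_positives / len(truth) if truth else 1.0
    precision = true_positives / len(reported) if reported else 1.0
    return recall, precision


# Hypothetical identifiers, for illustration only.
truth = {"svc.a", "svc.b", "svc.c", "svc.d"}
reported = {"svc.a", "svc.b", "svc.c", "svc.x"}

recall, precision = recall_precision(truth, reported)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

In this example one truly reachable item is missed and one reported item is spurious, so both ratios come out at 3/4.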

Quality Thresholds:

# Values are quoted so they parse as YAML strings (a plain scalar cannot start with ">").
thresholds:
  runtime_dependency_recall: ">= 0.95"
  unreachable_false_positives: "<= 0.05"
  reachability_underreport: "<= 0.10"
  ttfs_regression: "<= +10% vs main"
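
As an illustration of how these thresholds might be enforced, here is a hedged Python sketch. The threshold values come from the list above; the (operator, limit) encoding, the `failed_gates` helper, and the measured values are assumptions made for the example, not the behavior of the planned enforcement scripts.

```python
import operator

# Threshold values from this plan; the (comparison, limit) encoding is assumed.
THRESHOLDS = {
    "runtime_dependency_recall": (operator.ge, 0.95),
    "unreachable_false_positives": (operator.le, 0.05),
    "reachability_underreport": (operator.le, 0.10),
    "ttfs_regression": (operator.le, 0.10),  # +10% vs main, as a fraction
}


def failed_gates(metrics):
    """Return the sorted names of metrics that are missing or out of bounds."""
    failures = []
    for name, (compare, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        # A missing metric counts as a failure rather than a silent pass.
        if value is None or not compare(value, limit):
            failures.append(name)
    return sorted(failures)


# Hypothetical measurements for one CI run.
measured = {
    "runtime_dependency_recall": 0.97,
    "unreachable_false_positives": 0.08,
    "reachability_underreport": 0.04,
    "ttfs_regression": 0.02,
}
print(failed_gates(measured))
```

Treating a missing metric as a failure is a design choice in this sketch: it keeps a broken metrics step from passing the gate unnoticed.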

Sprint 0351: SCA Failure Catalogue Completion

File: docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md

Objective: Complete FC6-FC10 test cases for scanner regression testing

New Failure Cases:

| ID | Name | Failure Mode |
|----|------|--------------|
| FC6 | Java Shadow JAR | Shaded dependencies not detected |
| FC7 | .NET Transitive Pinning | CPM pins to vulnerable version |
| FC8 | Docker Multi-Stage Leakage | Build-time deps in runtime analysis |
| FC9 | PURL Namespace Collision | npm vs pypi same package name |
| FC10 | CVE Split/Merge | Single vuln with multiple CVE IDs |
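
To make the FC9 failure mode concrete: two packages from different ecosystems can share a bare name, so matching on name alone conflates them, while matching on the purl type plus name does not. The sketch below is a simplified Python illustration; `purl_key` is a hypothetical helper that ignores percent-decoding and other purl normalization rules, and `example-lib` is a made-up package name.

```python
def purl_key(purl):
    """Split a simple purl such as 'pkg:npm/foo@1.0.0' into (type, name).

    Simplified: strips subpath, qualifiers, and version, and keeps any
    namespace as part of the name. No percent-decoding is performed.
    """
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    body = purl[len("pkg:"):]
    body = body.split("#", 1)[0].split("?", 1)[0]
    pkg_type, _, rest = body.partition("/")
    name = rest.rsplit("@", 1)[0]
    return pkg_type.lower(), name


npm = "pkg:npm/example-lib@1.2.3"
pypi = "pkg:pypi/example-lib@1.2.3"

same_bare_name = purl_key(npm)[1] == purl_key(pypi)[1]
same_package = purl_key(npm) == purl_key(pypi)
print(f"same bare name: {same_bare_name}, same package: {same_package}")
```

A fixture for this case would presumably assert that a vulnerability recorded against one ecosystem is not reported against the other.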

Key Deliverables:

  • 5 new fixture directories under tests/fixtures/sca/catalogue/
  • DSSE manifests for integrity verification
  • xUnit test project for failure catalogue
  • Updated documentation

Sprint 0352: Security Testing Framework

File: docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md

Objective: Systematic OWASP Top 10 coverage for StellaOps (the matrix below lists 7 of the 10 categories; A04, A06, and A09 are not listed)

Coverage Matrix:

| OWASP Category | Test Count |
|----------------|------------|
| A01 Broken Access Control | 8+ |
| A02 Cryptographic Failures | 6+ |
| A03 Injection | 10+ |
| A05 Security Misconfiguration | 6+ |
| A07 Authentication Failures | 8+ |
| A08 Integrity Failures | 5+ |
| A10 SSRF | 8+ |
| Total (7 categories) | 50+ tests |

Key Deliverables:

  • tests/security/StellaOps.Security.Tests/ - Security test project
  • MaliciousPayloads.cs - Common attack patterns
  • SecurityTestBase.cs - Test infrastructure
  • .gitea/workflows/security-tests.yml - Dedicated CI workflow
  • docs/testing/security-testing-guide.md - Documentation

Sprint 0353: Mutation Testing Integration

File: docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md

Objective: Measure test suite effectiveness with Stryker.NET

Target Modules:

| Module | Threshold (Break) | Threshold (High) |
|--------|-------------------|------------------|
| Scanner.Core | 60% | 85% |
| Policy.Engine | 60% | 85% |
| Authority.Core | 65% | 90% |
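
The two thresholds per module can be read as three bands: below the break threshold the run fails, at or above the high threshold the score is considered strong, and anything in between passes. A minimal Python sketch of that reading follows; the percentages come from the table above, while the `classify` helper and its band labels are assumptions for illustration and not Stryker.NET output.

```python
# Break/high thresholds from this plan, in percent.
MODULE_THRESHOLDS = {
    "Scanner.Core": {"break": 60, "high": 85},
    "Policy.Engine": {"break": 60, "high": 85},
    "Authority.Core": {"break": 65, "high": 90},
}


def classify(module, score):
    """Return 'fail', 'pass', or 'high' for a mutation score in percent."""
    limits = MODULE_THRESHOLDS[module]
    if score < limits["break"]:
        return "fail"
    if score >= limits["high"]:
        return "high"
    return "pass"


# The same score can pass one module and fail another.
for module in MODULE_THRESHOLDS:
    print(module, classify(module, 62.0))
```

The loop shows why per-module configuration matters: a score of 62% clears the 60% break threshold for Scanner.Core and Policy.Engine but falls below the 65% threshold for Authority.Core.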

Key Deliverables:

  • Stryker.NET configuration for each target module
  • scripts/ci/mutation-thresholds.yaml - Threshold configuration
  • .gitea/workflows/mutation-testing.yml - Weekly mutation runs
  • bench/baselines/mutation-baselines.json - Baseline scores
  • docs/testing/mutation-testing-guide.md - Developer guide

Existing Infrastructure Mapping

The advisory describes structures that already exist under different names:

| Advisory Structure | Existing Equivalent | Notes |
|--------------------|---------------------|-------|
| toys/svc-XX/ | tests/reachability/corpus/ | Same purpose, different path |
| labels.yaml | reachgraph.truth.json | Different schema, same semantics |
| evidence/trace.json | Evidence in attestor module | Already implemented |
| PostgresFixture | PostgresIntegrationFixture | Already implemented |
| FakeTimeProvider | Authority tests, ConnectorTestHarness | Already used |
| inputs.lock | Exists in acceptance/guardrails | Already implemented |

Quality Gate Summary

After implementation, CI will enforce:

Reachability Gates

  • Runtime dependency recall ≥ 95%
  • Unreachable false positives ≤ 5%
  • Reachability underreport ≤ 10%

Performance Gates

  • Medium service scan < 2 minutes
  • Reachability compute < 30 seconds
  • SBOM ingestion < 5 seconds

TTFS Gates

  • p50 < 2 seconds
  • p95 < 5 seconds
  • Regression ≤ +10% vs main
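
One way to evaluate the three TTFS gates together is sketched below in Python. The limits (2 s, 5 s, +10%) come from the list above; the nearest-rank percentile method, the choice to measure regression on p50, the `ttfs_gate` helper, and the sample timings are assumptions, since this plan does not specify them.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def ttfs_gate(samples, baseline_p50):
    """Evaluate p50, p95, and regression vs a baseline p50 (all in seconds)."""
    p50 = percentile(samples, 50)
    p95 = percentile(samples, 95)
    regression = (p50 - baseline_p50) / baseline_p50
    passed = p50 < 2.0 and p95 < 5.0 and regression <= 0.10
    return {"p50": p50, "p95": p95, "regression": regression, "passed": passed}


# Hypothetical TTFS samples in seconds from one test run.
samples = [1.1, 1.3, 1.2, 1.8, 1.4, 1.6, 1.5, 1.7, 1.9, 4.2]
result = ttfs_gate(samples, baseline_p50=1.4)
print(result)
```

A run can satisfy the absolute p50 and p95 limits and still fail on regression, which is why the baseline in bench/baselines/ttfs-baseline.json is tracked separately from the fixed limits.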

Coverage Gates

  • Line coverage ≥ 70% (existing)
  • Branch coverage ≥ 60% (existing)
  • Mutation score ≥ 60% for Scanner.Core and Policy.Engine, ≥ 65% for Authority.Core (break thresholds)

Security Gates

  • All security tests pass
  • No OWASP Top 10 violations

File Changes Summary

New Files

scripts/ci/
├── compute-reachability-metrics.sh
├── compute-ttfs-metrics.sh
├── enforce-performance-slos.sh
├── enforce-thresholds.sh
├── enforce-mutation-thresholds.sh
├── extract-mutation-score.sh
├── reachability-thresholds.yaml
└── mutation-thresholds.yaml

bench/baselines/
├── ttfs-baseline.json
└── mutation-baselines.json

tests/
├── fixtures/sca/catalogue/
│   ├── fc6-java-shadow-jar/
│   ├── fc7-dotnet-transitive-pinning/
│   ├── fc8-docker-multistage-leakage/
│   ├── fc9-purl-namespace-collision/
│   └── fc10-cve-split-merge/
└── security/
    └── StellaOps.Security.Tests/

.gitea/workflows/
├── security-tests.yml
└── mutation-testing.yml

.config/
└── dotnet-tools.json (stryker)

stryker-config.json (root)

src/Scanner/__Libraries/StellaOps.Scanner.Core/stryker-config.json
src/Policy/StellaOps.Policy.Engine/stryker-config.json
src/Authority/StellaOps.Authority.Core/stryker-config.json

docs/testing/
├── ci-quality-gates.md
├── security-testing-guide.md
└── mutation-testing-guide.md

Modified Files

.gitea/workflows/build-test-deploy.yml
tests/fixtures/sca/catalogue/inputs.lock
tests/fixtures/sca/catalogue/README.md
README.md (badges)

Rollback Strategy

If quality gates cause CI instability:

  1. Immediate: Set failure_mode: warn in threshold configs
  2. Short-term: Remove the needs: dependencies on the quality gate jobs so other jobs are not blocked
  3. Investigation: Create issue with specific threshold that failed
  4. Resolution: Either fix underlying issue or adjust threshold
  5. Re-enable: Set failure_mode: block after verification
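
The difference between the warn and block settings of failure_mode can be sketched as a mapping from gate failures to a process exit code. This is a hypothetical Python illustration: only the failure_mode key and its two values appear in this plan, and the `gate_exit_code` helper and its message format are assumptions.

```python
def gate_exit_code(failures, failure_mode):
    """Return the exit code a gate step would use for the given failures.

    In 'warn' mode failures are reported but the step still succeeds;
    in 'block' mode any failure makes the step fail.
    """
    if failure_mode not in ("warn", "block"):
        raise ValueError(f"unknown failure_mode: {failure_mode!r}")
    label = "WARNING" if failure_mode == "warn" else "ERROR"
    for name in failures:
        print(f"{label}: quality gate failed: {name}")
    return 1 if failures and failure_mode == "block" else 0


# The same failure is non-fatal under 'warn' and fatal under 'block'.
print(gate_exit_code(["ttfs_regression"], "warn"))
print(gate_exit_code(["ttfs_regression"], "block"))
```

Because warn mode still prints each failed gate, the investigation step has the specific threshold on record even while CI is unblocked.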

Success Metrics

| Metric | Current | Target | Measurement |
|--------|---------|--------|-------------|
| FC Catalogue coverage | 5 cases | 10 cases | Count of fixtures |
| Security test coverage | Partial | 50+ tests | OWASP categories |
| Mutation score (Scanner) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Policy) | Unknown | ≥ 70% | Stryker weekly |
| Mutation score (Authority) | Unknown | ≥ 80% | Stryker weekly |
| Quality gate pass rate | N/A | ≥ 95% | CI runs |

References

Sprint Files

  • docs/implplan/SPRINT_0350_0001_0001_ci_quality_gates_foundation.md
  • docs/implplan/SPRINT_0351_0001_0001_sca_failure_catalogue_completion.md
  • docs/implplan/SPRINT_0352_0001_0001_security_testing_framework.md
  • docs/implplan/SPRINT_0353_0001_0001_mutation_testing_integration.md

Source Advisory

  • docs/product-advisories/14-Dec-2025 - Testing and Quality Guardrails Technical Reference.md

Existing Documentation

  • docs/19_TEST_SUITE_OVERVIEW.md
  • docs/reachability/ground-truth-schema.md
  • docs/reachability/corpus-plan.md
  • tests/reachability/README.md

Document Version

| Field | Value |
|-------|-------|
| Version | 1.0 |
| Created | 2025-12-14 |
| Author | Platform Team |
| Status | Planning Complete |