# Testing and Quality Guardrails Technical Reference

**Source Advisories**:
- 29-Nov-2025 - Acceptance Tests Pack and Guardrails
- 29-Nov-2025 - SCA Failure Catalogue for StellaOps Tests
- 30-Nov-2025 - Ecosystem Reality Test Cases for StellaOps
- 14-Dec-2025 - Create a small ground‑truth corpus

**Last Updated**: 2025-12-14

---

## 1. ACCEPTANCE TEST PACK SCHEMA

### 1.1 Required Artifacts (MVP for DONE)

- Advisory summary under `docs/process/`
- Checklist stub referencing AT1–AT10
- Fixture pack path: `tests/acceptance/packs/guardrails/` (no network)
- Links into sprint tracker (`SPRINT_0300_0001_0001_documentation_process.md`)

### 1.2 Determinism & Offline

- Freeze scanner/db versions; record in `inputs.lock`
- All fixtures reproducible from seeds
- Include DSSE envelopes for pack manifests

## 2. SCA FAILURE CATALOGUE (FC1-FC10)

### 2.1 Required Artifacts

- Catalogue plus fixture pack root: `tests/fixtures/sca/catalogue/`
- Sprint Execution Log entry when published

### 2.2 Fixture Requirements

- Pin scanner versions and feeds
- Include `inputs.lock` and DSSE manifest per case
- Normalize results (ordering, casing) for stable comparisons

## 3. ECOSYSTEM REALITY TEST CASES (ET1-ET10)

**Fixture Path**: `tests/fixtures/sca/catalogue/`

**Requirements**:
- Map each incident to acceptance tests and fixture paths
- Pin tool versions and feeds; no live network
- Populate fixtures and acceptance specs

## 4. GROUND-TRUTH CORPUS SCHEMA

### 4.1 Service Structure

Each service under `/toys/svc-XX-/`:

```
app/
infra/        # Dockerfile, compose, network policy
tests/        # positive + negative reachability tests
labels.yaml   # ground truth
evidence/     # generated by tests (trace, tags, manifests)
fix/          # minimal patch proving remediation
```

### 4.2 labels.yaml Schema

```yaml
service: svc-01-password-reset
vulns:
  - id: V1
    cve: CVE-2022-XXXXX
    type: dep_runtime|dep_build|code|config|os_pkg|supply_chain
    package: string
    version: string
    reachable: true|false
    reachability_level: R0|R1|R2|R3|R4
    entrypoint: string        # route:/reset, topic:jobs, cli:command
    preconditions: [string]   # flags/env/auth
    path_tags: [string]
    proof:
      artifacts: [string]
      tags: [string]
    fix:
      type: upgrade|config|code
      patch_path: string
      expected_delta: string
    negative_proof: string    # if unreachable
```

### 4.3 Reachability Tiers

- **R0 Present**: component exists in SBOM, not imported/loaded
- **R1 Loaded**: imported/linked/initialized, no executed path
- **R2 Executed**: vulnerable function executed (deterministic trace)
- **R3 Tainted execution**: execution with externally influenced input
- **R4 Exploitable**: controlled, non-harmful PoC (optional)

### 4.4 Evidence Requirements per Tier

- **R0**: SBOM + file hash/package metadata
- **R1**: runtime startup logs or module load trace tag
- **R2**: callsite tag + stack trace snippet
- **R3**: R2 + taint marker showing external data reached call
- **R4**: only if safe/necessary; non-weaponized, sandboxed

### 4.5 Canonical Tag Format

```
TAG:route:
TAG:topic:
TAG:call:
TAG:taint:
TAG:flag:=
```

### 4.6 Evidence Artifact Schema

**evidence/trace.json**:

```json
{
  "ts": "UTC ISO-8601",
  "corr": "correlation-id",
  "tags": ["TAG:route:POST /reset", "TAG:taint:http.body.email", "TAG:call:Crypto.MD5"]
}
```

### 4.7 Evidence Manifest

**evidence/manifest.json**:

```json
{
  "git_sha": "string",
  "image_digest": "string",
  "tool_versions": {"scanner": "string", "db": "string"},
  "timestamps": {"started_at": "UTC ISO-8601", "completed_at": "UTC ISO-8601"},
  "evidence_hashes": {"trace.json": "sha256:...", "tags.log": "sha256:..."}
}
```

## 5. CORE TEST METRICS

| Metric | Definition |
|--------|------------|
| Recall (by class) | % of labeled vulns detected (runtime deps, OS pkgs, code, config) |
| Precision | TP / (TP + FP), i.e. 1 - false discovery rate |
| Reachability accuracy | % correct R0/R1/R2/R3 classifications |
| Overreach | Predicted reachable but labeled R0/R1 |
| Underreach | Labeled R2/R3 but predicted non-reachable |
| TTFS | Time-to-first-signal (first evidence-backed blocking issue) |
| Fix validation | % of applied fixes producing expected delta |

## 6. TEST QUALITY GATES (CI ENFORCEMENT THRESHOLDS)

```yaml
# Values are quoted so they parse as plain strings
# (an unquoted leading ">" would start a YAML folded block scalar).
thresholds:
  runtime_dependency_recall: ">= 0.95"
  unreachable_false_positives: "<= 0.05"
  reachability_underreport: "<= 0.10"
  ttfs_regression: "<= +10% vs main"
  fix_validation_pass_rate: "100%"
```

## 7. SERVICE DEFINITION OF DONE

A service PR is DONE only if it includes:

- [ ] `labels.yaml` validated by `schemas/labels.schema.json`
- [ ] Docker build reproducible (digest pinned, lockfiles committed)
- [ ] Positive tests generating evidence proving reachability tiers
- [ ] Negative tests proving "unreachable" claims
- [ ] `fix/` patch removing/mitigating weakness with measurable delta
- [ ] `evidence/manifest.json` capturing tool versions, git sha, image digest, timestamps, evidence hashes

## 8. REVIEWER REJECTION CRITERIA

Reject the PR if any of these checks fail:

- [ ] Labels complete, schema-valid, stable IDs preserved
- [ ] Proof artifacts deterministic and generated by tests
- [ ] Reachability tier justified and matches evidence
- [ ] Unreachable claims have negative proofs
- [ ] Docker build uses pinned digests + committed lockfiles
- [ ] `fix/` produces measurable delta without new unlabeled issues
- [ ] No network egress required; tests hermetic

## 9. TEST HARNESS PATTERNS

### 9.1 xUnit Test Template

```csharp
using System.Linq;
using System.Threading.Tasks;
using Xunit;

// Template only: `_scanner`, `LoadLabels`, and `ReachabilityLevel` come from the
// project under test and are not defined in this snippet.
public class ReachabilityAcceptanceTests : IClassFixture<PostgresFixture>
{
    private readonly PostgresFixture _db;

    public ReachabilityAcceptanceTests(PostgresFixture db)
    {
        _db = db;
    }

    [Theory]
    [InlineData("svc-01-password-reset", "V1", ReachabilityLevel.R2)]
    [InlineData("svc-02-file-upload", "V1", ReachabilityLevel.R0)]
    public async Task VerifyReachabilityClassification(
        string serviceId, string vulnId, ReachabilityLevel expectedLevel)
    {
        // Arrange: load the ground-truth entry for this vulnerability
        var labels = await LoadLabels($"toys/{serviceId}/labels.yaml");
        var expectedVuln = labels.Vulns.First(v => v.Id == vulnId);

        // Act: scan the service and pick the matching finding
        var result = await _scanner.ScanAsync(serviceId);
        var actualVuln = result.Findings.First(f => f.VulnId == vulnId);

        // Assert: tier matches and the finding carries evidence
        Assert.Equal(expectedLevel, actualVuln.ReachabilityLevel);
        Assert.NotEmpty(actualVuln.Evidence);
    }
}
```

### 9.2 Testcontainers Pattern

```csharp
using System.Threading.Tasks;
using Testcontainers.PostgreSql;
using Xunit;

public class PostgresFixture : IAsyncLifetime
{
    private PostgreSqlContainer? _container;

    public string ConnectionString { get; private set; } = null!;

    public async Task InitializeAsync()
    {
        _container = new PostgreSqlBuilder()
            .WithImage("postgres:16-alpine")
            .WithDatabase("stellaops_test")
            .WithUsername("test")
            .WithPassword("test")
            .Build();

        await _container.StartAsync();
        ConnectionString = _container.GetConnectionString();

        // Run migrations (RunMigrations is project-specific and not shown here)
        await RunMigrations(ConnectionString);
    }

    public async Task DisposeAsync()
    {
        if (_container != null)
            await _container.DisposeAsync();
    }
}
```

## 10. FIXTURE ORGANIZATION

```
tests/
  fixtures/
    sca/
      catalogue/
        FC001_openssl_version_range/
          inputs.lock
          sbom.cdx.json
          expected_findings.json
          dsse_manifest.json
  acceptance/
    packs/
      guardrails/
        AT001_reachability_present/
        AT002_reachability_loaded/
        AT003_reachability_executed/
  micro/
    motion/
    error/
    offline/
toys/
  svc-01-password-reset/
    app/
    infra/
    tests/
    labels.yaml
    evidence/
    fix/
```

## 11. DETERMINISTIC TEST REQUIREMENTS

### 11.1 Time Handling

- Freeze timers to `2025-12-04T12:00:00Z` in stories/e2e
- Use `FakeTimeProvider` in .NET tests
- Playwright: `useFakeTimers`

### 11.2 Random Number Generation

- Seed RNG with `0x5EED2025` unless scenario-specific
- Never use `Random()` without explicit seed

### 11.3 Network Isolation

- No network calls in test execution
- Offline assets bundled
- Testcontainers for external dependencies
- Mock external APIs

### 11.4 Snapshot Testing

- All fixtures stored under `tests/fixtures/`
- Golden outputs checked into git
- Stable ordering for arrays/objects
- Strip volatile fields (timestamps, UUIDs) unless semantic

## 12. COVERAGE REQUIREMENTS

### 12.1 Unit Tests

- **Target**: ≥85% line coverage for core modules
- **Critical paths**: 100% coverage required
- **Exceptions**: UI glue code, generated code

### 12.2 Integration Tests

- **Database operations**: All repositories tested with Testcontainers
- **API endpoints**: All endpoints tested with WebApplicationFactory
- **External integrations**: Mocked or stubbed

### 12.3 End-to-End Tests

- **Critical workflows**: User registration → scan → triage → decision
- **Happy paths**: All major features
- **Error paths**: Authentication failures, network errors, data validation

## 13. PERFORMANCE TESTING

### 13.1 Benchmark Tests

```csharp
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

// `_scanner`, `_reachability`, and `_testGraph` are project-specific and would be
// initialized in a [GlobalSetup] method, omitted here.
[MemoryDiagnoser]
public class ScannerBenchmarks
{
    [Benchmark]
    public async Task ScanMediumImage()
    {
        // 100k LOC .NET service
        await _scanner.ScanAsync("medium-service");
    }

    [Benchmark]
    public async Task ComputeReachability()
    {
        await _reachability.ComputeAsync(_testGraph);
    }
}
```

### 13.2 Performance Targets

| Operation | Target |
|-----------|--------|
| Medium service scan | < 2 minutes |
| Reachability compute | < 30 seconds |
| Query GET finding | < 200ms p95 |
| SBOM ingestion | < 5 seconds |

## 14. MUTATION TESTING

### 14.1 Stryker Configuration

```json
{
  "stryker-config": {
    "mutate": [
      "src/**/*.cs",
      "!src/**/*.Designer.cs",
      "!src/**/Migrations/**"
    ],
    "test-runner": "dotnet",
    "threshold-high": 90,
    "threshold-low": 70,
    "threshold-break": 60
  }
}
```

### 14.2 Mutation Score Targets

- **Critical modules**: ≥90%
- **Standard modules**: ≥70%
- **Break build**: <60%

## 15. SECURITY TESTING

### 15.1 OWASP Top 10 Coverage

- [ ] SQL Injection
- [ ] XSS (Cross-Site Scripting)
- [ ] CSRF (Cross-Site Request Forgery)
- [ ] Authentication bypasses
- [ ] Authorization bypasses
- [ ] Sensitive data exposure
- [ ] XML External Entities (XXE)
- [ ] Broken Access Control
- [ ] Security Misconfiguration
- [ ] Insecure Deserialization

### 15.2 Dependency Scanning

```bash
# SBOM generation
dotnet sbom-tool generate -b ./bin -bc ./src -pn StellaOps -pv 1.0.0

# Vulnerability scanning
dotnet list package --vulnerable --include-transitive
```

## 16. CI/CD INTEGRATION

### 16.1 GitHub Actions Workflow

```yaml
name: Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'
      - name: Restore dependencies
        run: dotnet restore
      - name: Build
        run: dotnet build --no-restore
      - name: Test
        run: dotnet test --no-build --verbosity normal --collect:"XPlat Code Coverage"
      - name: Upload coverage
        uses: codecov/codecov-action@v4
```

### 16.2 Quality Gates

- All tests pass
- Coverage ≥85%
- No high/critical vulnerabilities
- Mutation score ≥70%
- Performance regressions <10%

---

**Document Version**: 1.0
**Target Platform**: .NET 10, PostgreSQL ≥16, Angular v17
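
The canonical tags in section 4.5 all share a `TAG:<kind>:<value>` shape, as the `trace.json` sample in section 4.6 shows (`TAG:route:POST /reset`, `TAG:call:Crypto.MD5`). A minimal sketch of a parser for that shape follows. The `EvidenceTag` and `EvidenceTagParser` names are hypothetical and not taken from the StellaOps codebase, and the rule that the value must be non-empty is an assumption.

```csharp
using System;

// Hypothetical types for illustration only.
public readonly record struct EvidenceTag(string Kind, string Value);

public static class EvidenceTagParser
{
    // The five tag kinds listed in section 4.5.
    private static readonly string[] Kinds = { "route", "topic", "call", "taint", "flag" };

    // Splits "TAG:<kind>:<value>" into its parts; returns false for anything else.
    public static bool TryParse(string raw, out EvidenceTag tag)
    {
        tag = default;
        const string prefix = "TAG:";
        if (!raw.StartsWith(prefix, StringComparison.Ordinal)) return false;

        // The kind ends at the first ':' after the prefix; the value may itself contain ':'.
        int sep = raw.IndexOf(':', prefix.Length);
        if (sep < 0) return false;

        string kind = raw[prefix.Length..sep];
        string value = raw[(sep + 1)..];

        // Assumption: unknown kinds and empty values are rejected.
        if (Array.IndexOf(Kinds, kind) < 0 || value.Length == 0) return false;

        tag = new EvidenceTag(kind, value);
        return true;
    }
}
```

A test could run every entry of `tags` in `evidence/trace.json` through `TryParse` and fail on the first entry that does not parse, which keeps malformed tags out of golden outputs.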
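
The `evidence_hashes` map in `evidence/manifest.json` (section 4.7) pairs each evidence file name with a `sha256:` digest. One way to produce it is sketched below. The `EvidenceHasher` name is hypothetical, and two details are assumptions rather than documented requirements: digests are lowercase hex, and `manifest.json` is excluded from its own hash list.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper for illustration only.
public static class EvidenceHasher
{
    // Returns file name -> "sha256:<hex>" for every file directly under evidenceDir.
    // A sorted dictionary keeps the key order stable across runs.
    public static SortedDictionary<string, string> HashDirectory(string evidenceDir)
    {
        var hashes = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (string path in Directory.EnumerateFiles(evidenceDir))
        {
            string name = Path.GetFileName(path);

            // Assumption: the manifest does not list a hash of itself.
            if (name == "manifest.json") continue;

            byte[] digest = SHA256.HashData(File.ReadAllBytes(path));

            // Assumption: lowercase hex after the "sha256:" prefix.
            hashes[name] = "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
        }
        return hashes;
    }
}
```

Because the result depends only on file contents, re-running the tests against unchanged evidence yields the same map, which is what makes the manifest usable as a determinism check.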
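
The recall, precision, overreach, and underreach rows in section 5 can be computed directly from `labels.yaml` entries and scanner findings. The sketch below uses hypothetical `Label`, `Prediction`, and `CorpusMetrics` types. It assumes vulnerability IDs are unique within a service, treats R2 and above as "reachable", and counts underreach only for findings the scanner actually reported. All three are simplifications, not documented rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public enum Tier { R0, R1, R2, R3, R4 }
public record Label(string VulnId, Tier Level);
public record Prediction(string VulnId, Tier Level);

public static class CorpusMetrics
{
    // Assumption: R2 and above count as reachable.
    private static bool IsReachable(Tier tier) => tier >= Tier.R2;

    public static (double Recall, double Precision, int Overreach, int Underreach) Compute(
        IReadOnlyCollection<Label> labels, IReadOnlyCollection<Prediction> predictions)
    {
        var labelById = labels.ToDictionary(label => label.VulnId);

        // A prediction is a true positive when its ID appears in the ground truth.
        int truePositives = predictions.Count(p => labelById.ContainsKey(p.VulnId));

        double recall = labels.Count == 0 ? 1.0 : (double)truePositives / labels.Count;
        double precision = predictions.Count == 0 ? 1.0 : (double)truePositives / predictions.Count;

        // Overreach: predicted reachable, labeled R0/R1.
        int overreach = predictions.Count(p =>
            labelById.TryGetValue(p.VulnId, out var truth)
            && IsReachable(p.Level) && !IsReachable(truth.Level));

        // Underreach: labeled reachable, predicted R0/R1.
        int underreach = predictions.Count(p =>
            labelById.TryGetValue(p.VulnId, out var truth)
            && !IsReachable(p.Level) && IsReachable(truth.Level));

        return (recall, precision, overreach, underreach);
    }
}
```

Dividing `Overreach` and `Underreach` by the number of matched findings gives rates that can be compared against the `unreachable_false_positives` and `reachability_underreport` thresholds in section 6.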
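
Section 11.4 asks for stable ordering and for volatile fields to be stripped before snapshots are compared. A minimal sketch of that normalization with `System.Text.Json.Nodes` follows. The `SnapshotNormalizer` name and the list of volatile field names are assumptions for illustration. Array order is left untouched here because it may be semantic, so producers would still need to emit arrays in a stable order. `JsonNode.DeepClone` requires .NET 8 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json.Nodes;

// Hypothetical helper for illustration only.
public static class SnapshotNormalizer
{
    // Assumption: these field names are volatile; adjust per fixture.
    private static readonly HashSet<string> VolatileFields =
        new(StringComparer.Ordinal) { "ts", "corr", "started_at", "completed_at" };

    // Returns compact JSON with volatile fields removed and object keys sorted.
    public static string Normalize(string json)
    {
        JsonNode? root = JsonNode.Parse(json);
        return Canonicalize(root)?.ToJsonString() ?? "null";
    }

    private static JsonNode? Canonicalize(JsonNode? node) => node switch
    {
        // Rebuild objects with ordinal-sorted keys, dropping volatile ones.
        JsonObject obj => new JsonObject(obj
            .Where(kv => !VolatileFields.Contains(kv.Key))
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => KeyValuePair.Create(kv.Key, Canonicalize(kv.Value)))),

        // Keep array order, but normalize each element.
        JsonArray arr => new JsonArray(arr.Select(Canonicalize).ToArray()),

        null => null,

        // Leaf values are cloned because a JsonNode can only have one parent.
        _ => node.DeepClone()
    };
}
```

Comparing `Normalize(actual)` with `Normalize(golden)` as plain strings then ignores key order and timestamps while still catching any change in the remaining content.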