Scanner Deterministic Regression Test Framework

Module

Scanner

Status

PARTIALLY_IMPLEMENTED

Description

A structured regression test framework for the scanner with a standardized case layout, golden fixture comparison, and a dedicated CI job. Each regression case is identified by an ID of the form SCN-XXXX-slug, contains frozen inputs and expected outputs, and is checked by byte-level comparison to detect scanner output drift.

What's Implemented

Existing Determinism Tests:
- src/Scanner/__Tests/StellaOps.Scanner.SmartDiff.Tests/ - Golden fixture tests for SmartDiff comparing actual vs. expected SBOM deltas with frozen inputs
- src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Node.Tests/ - Deterministic language analyzer tests with frozen package.json/lock files
- src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Ruby.Tests/ - Deterministic Ruby analyzer tests with frozen Gemfile.lock fixtures
- src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Java.Tests/ - Deterministic Java analyzer tests with frozen pom.xml/build.gradle fixtures
Reachability Tests:
- src/Scanner/__Tests/StellaOps.Scanner.Reachability.Tests/ - Reachability analysis tests with frozen call-graph fixtures and expected classification outputs
Test Infrastructure:
- Existing test projects demonstrate the golden fixture pattern (frozen input -> run analyzer -> compare against expected output), but each project uses its own ad-hoc fixture layout
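The golden fixture pattern described above (frozen input, run the analyzer, compare against expected output) can be sketched in a few lines. This is a minimal, language-neutral illustration in Python rather than code from the existing test projects; `first_mismatch`, `run_golden_case`, and the `analyzer` callable are hypothetical names introduced only for this example.

```python
from typing import Callable, Optional


def first_mismatch(actual: bytes, expected: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    if actual == expected:
        return None
    # Walk the overlapping prefix byte by byte.
    for offset, (a, b) in enumerate(zip(actual, expected)):
        if a != b:
            return offset
    # One buffer is a strict prefix of the other: they diverge where the shorter one ends.
    return min(len(actual), len(expected))


def run_golden_case(
    analyzer: Callable[[bytes], bytes], frozen_input: bytes, expected: bytes
) -> None:
    """Run a (hypothetical) analyzer on frozen input and fail on any byte-level drift."""
    actual = analyzer(frozen_input)
    offset = first_mismatch(actual, expected)
    if offset is not None:
        raise AssertionError(f"output drift at byte offset {offset}")
```

Comparing raw bytes rather than parsed structures means that differences in key ordering, whitespace, or line endings also count as drift, which is the strictness the description asks for.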

What's Missing

Standardized Case Layout: No Regression/ directory with SCN-XXXX-slug/ subdirectories containing:
- case.metadata.json (case ID, description, scanner version that introduced the regression, severity)
- case.md (human-readable regression description with root cause analysis)
- input/ (frozen input fixtures: container layers, SBOMs, lock files)
- expected/ (expected output fixtures: SBOMs, reachability results, verdict payloads)
Regression Test Runner: No unified test runner that discovers all SCN-XXXX-slug/ cases, runs each through the scanner pipeline, and performs byte-level output comparison
Dedicated CI Job: No scanner-regression CI job that runs regression tests separately from unit tests with clear pass/fail reporting per case
Regression Case Generator: No tooling to capture a failing scanner scenario and automatically generate a new SCN-XXXX-slug/ case from it
Drift Detection: No tooling to detect when scanner output changes (intentionally or unintentionally) and prompt for expected-output updates with review
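As a rough illustration of the missing case layout and discovery step, the sketch below scans a root directory for SCN-XXXX-slug/ subdirectories and checks that each one contains the four entries listed above. It is written in Python purely as a sketch. The assumption that XXXX is four decimal digits and that slugs are lowercase kebab-case is not stated in this document, and `discover_cases` is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import List

# Assumption: XXXX is four decimal digits and the slug is lowercase kebab-case.
CASE_DIR_PATTERN = re.compile(r"^SCN-\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*$")

# Entries each case directory is expected to contain, per the layout above.
REQUIRED_FILES = ("case.metadata.json", "case.md")
REQUIRED_DIRS = ("input", "expected")


def discover_cases(root: Path) -> List[Path]:
    """Return well-formed case directories under root, sorted by name for a stable run order."""
    cases = []
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        # Ignore anything that is not a directory named like a regression case.
        if not entry.is_dir() or not CASE_DIR_PATTERN.match(entry.name):
            continue
        missing = [name for name in REQUIRED_FILES if not (entry / name).is_file()]
        missing += [name for name in REQUIRED_DIRS if not (entry / name).is_dir()]
        if missing:
            raise ValueError(f"{entry.name}: missing {', '.join(missing)}")
        cases.append(entry)
    return cases
```

Sorting by directory name keeps the run order independent of filesystem enumeration order, which helps keep the per-case report itself deterministic.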

Implementation Plan

- Create the src/Scanner/__Tests/StellaOps.Scanner.Regression.Tests/ project with case discovery infrastructure
- Define the case.metadata.json schema with fields: caseId, slug, description, introducedInVersion, severity, tags
- Create initial regression cases from existing golden fixture tests (migrate 5-10 representative cases)
- Implement a RegressionTestRunner that discovers cases, runs the scanner pipeline on their inputs, and compares outputs byte by byte
- Add a case-capture CLI tool that takes a scanner invocation and generates a new case directory with frozen inputs and current outputs
- Add a scanner-regression CI job in .gitea/workflows/ that runs the regression tests and reports per-case pass/fail
- Add drift detection that generates a diff report when expected output changes
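The drift detection step in the plan above could produce its diff report along the following lines. This is a sketch using Python's standard `difflib` module, not a description of planned tooling; `drift_report` and the `<case>/expected` and `<case>/actual` labels are hypothetical. Pass/fail is still decided on raw bytes, and the text diff exists only to make a failure readable.

```python
import difflib
from typing import List


def drift_report(case_id: str, expected: bytes, actual: bytes) -> List[str]:
    """Return unified-diff lines for a case, or an empty list when the bytes match."""
    if expected == actual:
        return []
    # Decode leniently so that non-UTF-8 bytes still yield a readable report.
    expected_lines = expected.decode("utf-8", errors="replace").splitlines(keepends=True)
    actual_lines = actual.decode("utf-8", errors="replace").splitlines(keepends=True)
    diff = list(
        difflib.unified_diff(
            expected_lines,
            actual_lines,
            fromfile=f"{case_id}/expected",
            tofile=f"{case_id}/actual",
        )
    )
    if not diff:
        # The bytes differ but the lenient decoding hides it; still report drift.
        diff = [f"{case_id}: binary difference not visible in text diff\n"]
    return diff
```

A reviewer updating expected/ fixtures after an intentional scanner change would read this report to confirm that only the intended lines moved.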

E2E Test Plan

- Run the regression test runner and verify that all SCN-XXXX-slug/ cases produce output that byte-matches their expected/ fixtures
- Add a new regression case using the case-capture tool and verify that the test runner discovers it automatically on the next run
- Introduce an intentional scanner change that modifies output for one case, and verify that the regression test runner detects the drift and fails the case
- Update the expected output for the changed case and verify that the test runner passes again
- Verify that case.metadata.json is validated on test startup (missing required fields cause a clear error)
- Verify that the CI job produces a per-case pass/fail report with case ID, slug, and failure diff for any failing cases
- Verify that the regression tests run in under 5 minutes for the initial 10-case corpus
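The metadata validation check in the plan above (missing required fields cause a clear error) might look like the sketch below. The field names are the ones listed in the implementation plan; the field types, the `MetadataError` class, and the `load_metadata` helper are assumptions made for illustration, since the schema itself is not yet defined.

```python
import json
from typing import Any, Dict

# Field names come from the implementation plan; the types are assumptions.
REQUIRED_FIELDS = {
    "caseId": str,
    "slug": str,
    "description": str,
    "introducedInVersion": str,
    "severity": str,
    "tags": list,
}


class MetadataError(ValueError):
    """Raised when case.metadata.json is malformed or incomplete."""


def load_metadata(raw: str, source: str = "case.metadata.json") -> Dict[str, Any]:
    """Parse and validate metadata text, naming the file and field in any error."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise MetadataError(f"{source}: invalid JSON ({exc.msg})") from exc
    if not isinstance(data, dict):
        raise MetadataError(f"{source}: top-level value must be an object")
    # Report every missing field at once so a single fix-up pass is enough.
    missing = sorted(name for name in REQUIRED_FIELDS if name not in data)
    if missing:
        raise MetadataError(f"{source}: missing required fields: {', '.join(missing)}")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data[name], expected_type):
            raise MetadataError(f"{source}: field '{name}' must be {expected_type.__name__}")
    return data
```

Running this for every discovered case before any scanner invocation keeps a malformed case from surfacing later as a confusing comparison failure.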

Source: See feature catalog
Architecture: docs/modules/scanner/architecture.md
