# Scanner Deterministic Regression Test Framework

## Module
Scanner

## Status
PARTIALLY_IMPLEMENTED

## Description
A structured regression test framework for the scanner, with a standardized case layout, golden fixture comparison, and a dedicated CI job. Each regression case is identified by `SCN-XXXX-slug`, contains frozen inputs and expected outputs, and is checked by byte-level comparison so that any drift in scanner output is detected.
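
As a rough illustration of what byte-level comparison means here, the following minimal sketch reports the offset of the first differing byte between an expected fixture and the actual output. It is written in Python purely for illustration (the test projects listed below are .NET), and the function name `first_difference` is hypothetical.

```python
from typing import Optional


def first_difference(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    if expected == actual:
        return None
    limit = min(len(expected), len(actual))
    for offset in range(limit):
        if expected[offset] != actual[offset]:
            return offset
    # One buffer is a strict prefix of the other: they diverge where the shorter ends.
    return limit
```

A comparison this strict only works if scanner output is fully deterministic, so any timestamps, ordering, or whitespace in the output would need to be stable across runs.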

## What's Implemented
- **Existing Determinism Tests**:
  - `src/Scanner/__Tests/StellaOps.Scanner.SmartDiff.Tests/` - Golden fixture tests for SmartDiff comparing actual vs. expected SBOM deltas with frozen inputs
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Node.Tests/` - Deterministic language analyzer tests with frozen package.json/lock files
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Ruby.Tests/` - Deterministic Ruby analyzer tests with frozen Gemfile.lock fixtures
  - `src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Java.Tests/` - Deterministic Java analyzer tests with frozen pom.xml/build.gradle fixtures
- **Reachability Tests**:
  - `src/Scanner/__Tests/StellaOps.Scanner.Reachability.Tests/` - Reachability analysis tests with frozen call-graph fixtures and expected classification outputs
- **Test Infrastructure**:
  - Existing test projects demonstrate the golden fixture pattern (frozen input -> run analyzer -> compare against expected output), but each project uses its own ad-hoc fixture layout

## What's Missing
- **Standardized Case Layout**: No `Regression/` directory with `SCN-XXXX-slug/` subdirectories containing:
  - `case.metadata.json` (case ID, description, scanner version that introduced the regression, severity)
  - `case.md` (human-readable regression description with root cause analysis)
  - `input/` (frozen input fixtures: container layers, SBOMs, lock files)
  - `expected/` (expected output fixtures: SBOMs, reachability results, verdict payloads)
- **Regression Test Runner**: No unified test runner that discovers all `SCN-XXXX-slug/` cases, runs each through the scanner pipeline, and performs byte-level output comparison
- **Dedicated CI Job**: No `scanner-regression` CI job that runs regression tests separately from unit tests with clear pass/fail reporting per case
- **Regression Case Generator**: No tooling to capture a failing scanner scenario and automatically generate a new `SCN-XXXX-slug/` case from it
- **Drift Detection**: No tooling to detect when scanner output changes (intentionally or unintentionally) and to prompt for a reviewed update of the expected output
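
The case layout described above could be checked with something like the sketch below. It is an illustration in Python, not the project's implementation, and it rests on several assumptions: that `XXXX` is four digits, that slugs are lowercase and hyphen-separated, and that all six fields named in the implementation plan are required. `validate_case` and the constants are hypothetical names.

```python
import json
import re
from pathlib import Path
from typing import List

# Assumption: every field listed in the implementation plan is required.
REQUIRED_FIELDS = ("caseId", "slug", "description", "introducedInVersion", "severity", "tags")
# Assumption: XXXX is four digits and the slug is lowercase, hyphen-separated.
CASE_DIR_PATTERN = re.compile(r"^SCN-\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*$")


def validate_case(case_dir: Path) -> List[str]:
    """Return a list of human-readable problems; an empty list means the case is valid."""
    problems: List[str] = []
    if not CASE_DIR_PATTERN.match(case_dir.name):
        problems.append(f"directory name {case_dir.name!r} does not match SCN-XXXX-slug")
    for sub in ("input", "expected"):
        if not (case_dir / sub).is_dir():
            problems.append(f"missing directory: {sub}/")
    if not (case_dir / "case.md").is_file():
        problems.append("missing file: case.md")

    metadata_path = case_dir / "case.metadata.json"
    if not metadata_path.is_file():
        problems.append("missing file: case.metadata.json")
        return problems
    try:
        metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        problems.append(f"case.metadata.json is not valid JSON: {exc}")
        return problems
    if not isinstance(metadata, dict):
        problems.append("case.metadata.json must contain a JSON object")
        return problems
    for field in REQUIRED_FIELDS:
        if field not in metadata:
            problems.append(f"case.metadata.json missing required field: {field}")
    return problems
```

Returning all problems at once, rather than failing on the first, gives a case author one complete error message per run.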

## Implementation Plan
1. Create `src/Scanner/__Tests/StellaOps.Scanner.Regression.Tests/` project with case discovery infrastructure
2. Define `case.metadata.json` schema with fields: `caseId`, `slug`, `description`, `introducedInVersion`, `severity`, `tags`
3. Create initial regression cases from existing golden fixture tests (migrate 5-10 representative cases)
4. Implement `RegressionTestRunner` that discovers cases, runs the scanner pipeline on their inputs, and compares outputs byte by byte
5. Add `case-capture` CLI tool that takes a scanner invocation and generates a new case directory with frozen inputs and current outputs
6. Add `scanner-regression` CI job in `.gitea/workflows/` that runs regression tests and reports per-case pass/fail
7. Add drift detection that generates a diff report when scanner output diverges from the expected output
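
The discovery and comparison loop in step 4 might look roughly like the sketch below. It is a Python illustration under stated assumptions, not the planned `RegressionTestRunner`: the scanner pipeline is represented by a hypothetical callable `run_scanner(input_dir, output_dir)`, cases are assumed to sit directly under one root directory, and all function names are invented for this example.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical stand-in for the scanner pipeline: reads input_dir, writes to output_dir.
ScannerFn = Callable[[Path, Path], None]


def discover_cases(root: Path) -> List[Path]:
    """Find case directories directly under root, in a stable order."""
    return sorted(p for p in root.iterdir() if p.is_dir() and p.name.startswith("SCN-"))


def relative_files(base: Path) -> Dict[str, Path]:
    """Map each file's path relative to base onto its full path."""
    return {p.relative_to(base).as_posix(): p for p in base.rglob("*") if p.is_file()}


def run_case(case_dir: Path, run_scanner: ScannerFn) -> List[str]:
    """Run one case and return its failures; an empty list means a byte-exact match."""
    expected = relative_files(case_dir / "expected")
    failures: List[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        actual_dir = Path(tmp)
        run_scanner(case_dir / "input", actual_dir)
        actual = relative_files(actual_dir)
        for name in sorted(expected.keys() | actual.keys()):
            if name not in actual:
                failures.append(f"{name}: missing from scanner output")
            elif name not in expected:
                failures.append(f"{name}: unexpected output file")
            elif expected[name].read_bytes() != actual[name].read_bytes():
                failures.append(f"{name}: bytes differ from expected")
    return failures


def run_all(root: Path, run_scanner: ScannerFn) -> Dict[str, List[str]]:
    """Run every discovered case and collect per-case failures."""
    return {case.name: run_case(case, run_scanner) for case in discover_cases(root)}
```

Treating unexpected output files as failures is a design choice in this sketch. It keeps the `expected/` directory an exact description of the output, at the cost of requiring a fixture update whenever the scanner starts emitting a new file.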

## E2E Test Plan
- [ ] Run the regression test runner and verify all `SCN-XXXX-slug/` cases produce output that byte-matches their `expected/` fixtures
- [ ] Add a new regression case using the `case-capture` tool and verify it is automatically discovered by the test runner on the next run
- [ ] Introduce an intentional scanner change that modifies output for one case and verify the regression test runner detects the drift and fails the case
- [ ] Update the expected output for the changed case and verify the test runner passes again
- [ ] Verify `case.metadata.json` is validated on test startup (missing required fields cause a clear error)
- [ ] Verify the CI job produces a per-case pass/fail report with case ID, slug, and failure diff for any failing cases
- [ ] Verify regression tests run in under 5 minutes for the initial 10-case corpus
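
The per-case failure diff mentioned in the checklist could be produced along the lines of the sketch below. It uses Python's standard `difflib` purely as an illustration. The function name `drift_report` and the `expected/` versus `actual/` labels in the diff header are assumptions, and the text diff is meant only as a reviewer aid: the byte comparison remains the pass/fail criterion.

```python
import difflib


def drift_report(case_name: str, file_name: str, expected: bytes, actual: bytes) -> str:
    """Return a unified diff for one output file, or an empty string if the bytes match."""
    if expected == actual:
        return ""
    # Decode leniently so that a report can still be produced for non-UTF-8 output.
    expected_lines = expected.decode("utf-8", errors="replace").splitlines(keepends=True)
    actual_lines = actual.decode("utf-8", errors="replace").splitlines(keepends=True)
    diff = difflib.unified_diff(
        expected_lines,
        actual_lines,
        fromfile=f"{case_name}/expected/{file_name}",
        tofile=f"{case_name}/actual/{file_name}",
    )
    text = "".join(diff)
    # Different bytes can decode to the same replacement text; still report the drift.
    return text or f"{file_name}: bytes differ but decode to identical text\n"
```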

## Related Documentation
- Source: See feature catalog
- Architecture: `docs/modules/scanner/architecture.md`