Scanner Deterministic Regression Test Framework

Module

Scanner

Status

PARTIALLY_IMPLEMENTED

Description

A structured regression test framework with a standardized case layout, golden fixture comparison, and a dedicated CI job. Each regression case is identified by an SCN-XXXX-slug ID and contains frozen inputs and expected outputs; the runner compares actual and expected output byte for byte to detect scanner output drift.

What's Implemented

  • Existing Determinism Tests:
    • src/Scanner/__Tests/StellaOps.Scanner.SmartDiff.Tests/ - Golden fixture tests for SmartDiff comparing actual vs. expected SBOM deltas with frozen inputs
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Node.Tests/ - Deterministic language analyzer tests with frozen package.json/lock files
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Ruby.Tests/ - Deterministic Ruby analyzer tests with frozen Gemfile.lock fixtures
    • src/Scanner/__Tests/StellaOps.Scanner.Analyzers.Lang.Java.Tests/ - Deterministic Java analyzer tests with frozen pom.xml/build.gradle fixtures
  • Reachability Tests:
    • src/Scanner/__Tests/StellaOps.Scanner.Reachability.Tests/ - Reachability analysis tests with frozen call-graph fixtures and expected classification outputs
  • Test Infrastructure:
    • Existing test projects demonstrate the golden fixture pattern (frozen input -> run analyzer -> compare against expected output) but each project uses its own ad-hoc fixture layout
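The golden fixture pattern described above (frozen input -> run analyzer -> compare against expected output) can be sketched in a few lines. This is a minimal illustration in Python, not code from the existing test projects: `analyze` is a hypothetical stand-in for a language analyzer, and the input and golden values are made up. It shows the two things the pattern relies on: output that serializes to the same bytes on every run, and a byte-level comparison that reports where the first difference occurs.

```python
import json
from typing import Optional


def analyze(lockfile_text: str) -> bytes:
    """Hypothetical stand-in for an analyzer: emits canonical JSON bytes."""
    packages = json.loads(lockfile_text)["packages"]
    components = sorted(
        ({"name": name, "version": version} for name, version in packages.items()),
        key=lambda c: (c["name"], c["version"]),
    )
    # sort_keys and fixed separators keep the serialized bytes stable across runs.
    text = json.dumps({"components": components}, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def first_difference(actual: bytes, expected: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None when identical."""
    if actual == expected:
        return None
    for offset, (a, b) in enumerate(zip(actual, expected)):
        if a != b:
            return offset
    # One buffer is a strict prefix of the other.
    return min(len(actual), len(expected))


# Illustrative frozen input and golden output (not real fixtures).
frozen_input = '{"packages": {"lodash": "4.17.21", "express": "4.18.2"}}'
golden = (
    b'{"components":[{"name":"express","version":"4.18.2"},'
    b'{"name":"lodash","version":"4.17.21"}]}'
)

if __name__ == "__main__":
    offset = first_difference(analyze(frozen_input), golden)
    print("match" if offset is None else f"drift at byte {offset}")
```

Sorting the components and fixing the JSON separators is what makes a byte-level comparison meaningful here; without a canonical serialization, harmless reordering would show up as drift.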

What's Missing

  • Standardized Case Layout: No Regression/ directory with SCN-XXXX-slug/ subdirectories containing:
    • case.metadata.json (case ID, description, scanner version that introduced the regression, severity)
    • case.md (human-readable regression description with root cause analysis)
    • input/ (frozen input fixtures: container layers, SBOMs, lock files)
    • expected/ (expected output fixtures: SBOMs, reachability results, verdict payloads)
  • Regression Test Runner: No unified test runner that discovers all SCN-XXXX-slug/ cases, runs each through the scanner pipeline, and performs byte-level output comparison
  • Dedicated CI Job: No scanner-regression CI job that runs regression tests separately from unit tests with clear pass/fail reporting per case
  • Regression Case Generator: No tooling to capture a failing scanner scenario and automatically generate a new SCN-XXXX-slug/ case from it
  • Drift Detection: No tooling to detect when scanner output changes (intentionally or unintentionally) and to prompt for a reviewed update of the expected outputs
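To make the proposed case layout concrete, the following sketch checks that a case directory contains the four entries listed above and that its name follows the SCN-XXXX-slug convention. It is illustrative Python under stated assumptions: the document does not define the ID format precisely, so the pattern below assumes that XXXX is a four-digit number and that the slug is lowercase kebab-case. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: XXXX is four digits and the slug is lowercase kebab-case.
CASE_DIR_PATTERN = re.compile(r"^SCN-\d{4}-[a-z0-9]+(?:-[a-z0-9]+)*$")
REQUIRED_FILES = ("case.metadata.json", "case.md")
REQUIRED_DIRS = ("input", "expected")


def check_case_layout(case_dir: Path) -> list[str]:
    """Return layout problems for one case directory (empty list when valid)."""
    problems = []
    if not CASE_DIR_PATTERN.match(case_dir.name):
        problems.append(f"{case_dir.name}: name does not match SCN-XXXX-slug")
    for name in REQUIRED_FILES:
        if not (case_dir / name).is_file():
            problems.append(f"{case_dir.name}: missing file {name}")
    for name in REQUIRED_DIRS:
        if not (case_dir / name).is_dir():
            problems.append(f"{case_dir.name}: missing directory {name}/")
    return problems


def discover_cases(regression_root: Path) -> list[Path]:
    """List case directories in sorted order so discovery is deterministic."""
    return sorted(p for p in regression_root.iterdir() if p.is_dir())
```

A runner could call `discover_cases` on the Regression/ directory and refuse to start if any case reports a layout problem, which keeps malformed cases from silently being skipped.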

Implementation Plan

  1. Create src/Scanner/__Tests/StellaOps.Scanner.Regression.Tests/ project with case discovery infrastructure
  2. Define case.metadata.json schema with fields: caseId, slug, description, introducedInVersion, severity, tags
  3. Create initial regression cases from existing golden fixture tests (migrate 5-10 representative cases)
  4. Implement a RegressionTestRunner that discovers cases, runs the scanner pipeline on each case's inputs, and compares the outputs byte by byte against expected/
  5. Add case-capture CLI tool that takes a scanner invocation and generates a new case directory with frozen inputs and current outputs
  6. Add scanner-regression CI job in .gitea/workflows/ that runs regression tests and reports per-case pass/fail
  7. Add drift detection that generates a diff report when expected output changes
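Steps 4 and 7 of the plan above (a runner that compares outputs byte by byte, plus a diff report when output drifts) might look roughly like the sketch below. It is written in Python for illustration only and is not the planned RegressionTestRunner. The `Pipeline` callable is a hypothetical stand-in for the scanner pipeline: it is assumed to take a case's input/ directory and return the produced artifacts keyed by their path relative to expected/.

```python
import difflib
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical pipeline signature: input/ directory -> {relative path: bytes}.
Pipeline = Callable[[Path], Dict[str, bytes]]


@dataclass
class CaseResult:
    case_id: str
    passed: bool
    diffs: List[str] = field(default_factory=list)


def run_case(case_dir: Path, pipeline: Pipeline) -> CaseResult:
    """Run one case and compare every output file byte for byte."""
    actual = pipeline(case_dir / "input")
    expected_root = case_dir / "expected"
    expected = {
        p.relative_to(expected_root).as_posix(): p.read_bytes()
        for p in sorted(expected_root.rglob("*"))
        if p.is_file()
    }
    diffs: List[str] = []
    for name in sorted(set(expected) | set(actual)):
        want, got = expected.get(name), actual.get(name)
        if want == got:
            continue
        if want is None or got is None:
            kind = "unexpected" if want is None else "missing"
            diffs.append(f"{name}: {kind} output")
            continue
        # Pass/fail is decided by the byte comparison above; the text diff is
        # only a review aid and replaces bytes that are not valid UTF-8.
        text = "\n".join(
            difflib.unified_diff(
                want.decode("utf-8", "replace").splitlines(),
                got.decode("utf-8", "replace").splitlines(),
                fromfile=f"expected/{name}",
                tofile=f"actual/{name}",
                lineterm="",
            )
        )
        diffs.append(text or f"{name}: bytes differ but decoded lines are identical")
    return CaseResult(case_dir.name, passed=not diffs, diffs=diffs)


def run_all(regression_root: Path, pipeline: Pipeline) -> List[CaseResult]:
    """Discover SCN-* case directories in sorted order and run each one."""
    cases = sorted(p for p in regression_root.glob("SCN-*") if p.is_dir())
    return [run_case(case, pipeline) for case in cases]
```

Keeping the byte comparison as the only pass/fail criterion and treating the unified diff as a reporting aid avoids a subtle trap: differences in line endings or trailing newlines still fail the case even though a line-based diff would show nothing.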

E2E Test Plan

  • Run the regression test runner and verify all SCN-XXXX-slug/ cases produce output that byte-matches their expected/ fixtures
  • Add a new regression case using the case-capture tool and verify it is automatically discovered by the test runner on the next run
  • Introduce an intentional scanner change that modifies output for one case and verify the regression test runner detects the drift and fails the case
  • Update the expected output for the changed case and verify the test runner passes again
  • Verify case.metadata.json is validated on test startup (missing required fields cause a clear error)
  • Verify the CI job produces a per-case pass/fail report with case ID, slug, and failure diff for any failing cases
  • Verify regression tests run in under 5 minutes for the initial 10-case corpus
  • Source: See feature catalog
  • Architecture: docs/modules/scanner/architecture.md
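The E2E item on validating case.metadata.json at test startup could be exercised with a check along these lines. This is a hedged Python sketch: the field names come from the implementation plan, but the document does not say which fields are required or what their types are, so treating all six as required and the types shown here are assumptions. `CaseMetadataError` and `validate_metadata` are hypothetical names.

```python
import json

# Field names are from the implementation plan; requiring all of them and
# the types below are assumptions made for this sketch.
REQUIRED_FIELDS = {
    "caseId": str,
    "slug": str,
    "description": str,
    "introducedInVersion": str,
    "severity": str,
    "tags": list,
}


class CaseMetadataError(ValueError):
    """Raised when case.metadata.json is malformed or incomplete."""


def validate_metadata(raw: str, source: str = "case.metadata.json") -> dict:
    """Parse metadata JSON and fail with a message naming every bad field."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise CaseMetadataError(
            f"{source}: invalid JSON ({exc.msg} at line {exc.lineno})"
        ) from exc
    if not isinstance(data, dict):
        raise CaseMetadataError(f"{source}: top-level value must be an object")
    missing = sorted(name for name in REQUIRED_FIELDS if name not in data)
    if missing:
        raise CaseMetadataError(
            f"{source}: missing required field(s): {', '.join(missing)}"
        )
    wrong = sorted(
        name for name, kind in REQUIRED_FIELDS.items()
        if not isinstance(data[name], kind)
    )
    if wrong:
        raise CaseMetadataError(f"{source}: wrong type for field(s): {', '.join(wrong)}")
    return data
```

Reporting all missing fields in one message, rather than stopping at the first, is a small design choice that makes the "clear error" requirement easier to satisfy when a new case is being authored.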