# Competitor Ingest Anomaly Regression Tests (CM4)

Status: Draft · Date: 2025-12-04

Scope: Define anomaly regression test suite for ingest pipeline covering schema drift, nullables, encoding, and ordering anomalies.

## Objectives

- Detect schema drift in upstream tool outputs.
- Validate handling of nullable/missing fields.
- Ensure proper encoding handling (UTF-8, escaping).
- Verify deterministic ordering is maintained.
- Provide golden fixtures with expected hashes.

## Test Categories

### 1. Schema Drift Tests

Detect when upstream tools change their output schema.

```
tests/anomaly/schema-drift/
├── syft/
│   ├── v1.0.0-baseline.json         # Known good output
│   ├── v1.5.0-new-fields.json       # Added fields
│   ├── v1.5.0-removed-fields.json   # Removed fields
│   ├── v1.5.0-type-change.json      # Field type changed
│   └── expected-results.json
├── trivy/
│   └── ... (same structure)
└── clair/
    └── ... (same structure)
```

#### Test Cases

| Test | Input | Expected Behavior |
|------|-------|-------------------|
| `new_optional_field` | Output with new field | Accept, ignore new field |
| `new_required_field` | Output with new required field | Warn, map if possible |
| `removed_optional_field` | Output missing optional field | Accept |
| `removed_required_field` | Output missing required field | Reject |
| `field_type_change` | Field type differs from schema | Reject or coerce |
| `field_rename` | Field renamed without mapping | Warn, check mapping |

#### Schema Drift Fixture

```json
{
  "test": "new_optional_field",
  "tool": "syft",
  "inputVersion": "1.5.0",
  "baselineVersion": "1.0.0",
  "input": {
    "artifacts": [
      {
        "name": "lib-a",
        "version": "1.0.0",
        "purl": "pkg:npm/lib-a@1.0.0",
        "newField": "unexpected value"
      }
    ]
  },
  "expected": {
    "status": "accepted",
    "warnings": ["unknown_field:newField"],
    "normalizedHash": "b3:..."
  }
}
```

### 2. Nullable/Missing Field Tests

Validate handling of null, empty, and missing values.
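
The rules this category exercises can be pictured as a small normalizer that rejects null or absent required fields and drops null optional ones. The sketch below is illustrative Python, not the adapter implementation: the `REQUIRED` set, the `normalize_artifact` name, and the `missing_required:` message are all assumptions, since this document does not state which fields are required. For empty strings, where either preserving or omitting is allowed, the sketch preserves.

```python
from typing import Any

# Hypothetical required set; the real schema is defined by the adapters.
REQUIRED = {"name", "version"}


class RejectedInput(ValueError):
    """Raised when a required field is null or absent."""


def normalize_artifact(artifact: dict[str, Any]) -> dict[str, Any]:
    """Apply the nullable rules to a single artifact."""
    # A required field that is null or absent rejects the whole artifact.
    for field in sorted(REQUIRED):
        if artifact.get(field) is None:
            raise RejectedInput(f"missing_required:{field}")
    # Null optional fields are omitted; empty strings and arrays are preserved.
    return {key: value for key, value in artifact.items() if value is not None}
```

Checking `artifact.get(field) is None` treats a null value and a missing key the same way, which matches the `null_required` and `missing_required` cases both expecting rejection.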

```
tests/anomaly/nullables/
├── null-values.json
├── empty-strings.json
├── empty-arrays.json
├── missing-optional.json
├── missing-required.json
└── expected-results.json
```

#### Test Cases

| Test | Input | Expected Behavior |
|------|-------|-------------------|
| `null_optional` | Optional field is null | Accept, omit from output |
| `null_required` | Required field is null | Reject |
| `empty_string` | String field is "" | Accept, preserve or omit |
| `empty_array` | Array field is [] | Accept, preserve |
| `missing_optional` | Optional field absent | Accept |
| `missing_required` | Required field absent | Reject |

#### Nullable Fixture

```json
{
  "test": "null_optional",
  "tool": "syft",
  "input": {
    "artifacts": [
      {
        "name": "lib-a",
        "version": "1.0.0",
        "purl": "pkg:npm/lib-a@1.0.0",
        "licenses": null
      }
    ]
  },
  "expected": {
    "status": "accepted",
    "output": {
      "components": [
        {
          "name": "lib-a",
          "version": "1.0.0",
          "purl": "pkg:npm/lib-a@1.0.0"
        }
      ]
    },
    "normalizedHash": "b3:..."
  }
}
```

### 3. Encoding Tests

Validate proper handling of character encoding and escaping.
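
As a rough illustration of the decoding path these tests target, the following Python sketch strips a UTF-8 BOM, requires valid UTF-8, and lets the JSON parser decode `\uXXXX` escapes. The function names are hypothetical, and the control-character check assumes that tab and newline are the only permitted control characters, following the `special_chars` case; the adapters may choose to sanitize rather than reject.

```python
import codecs
import json


def decode_tool_output(raw: bytes):
    """Strip a leading UTF-8 BOM, decode strictly as UTF-8, and parse JSON."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes
    return json.loads(text)     # decodes \uXXXX escapes, pairing surrogates


def has_disallowed_control_chars(value: str) -> bool:
    """True if value holds a control character (< 0x20) other than tab or newline."""
    return any(ord(ch) < 0x20 and ch not in "\t\n" for ch in value)
```

A caller would run `has_disallowed_control_chars` over decoded string fields and then reject or sanitize according to the adapter's policy.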

```
tests/anomaly/encoding/
├── utf8-valid.json
├── utf8-bom.json
├── latin1-fallback.json
├── unicode-escapes.json
├── special-chars.json
├── json-escaping.json
└── expected-results.json
```

#### Test Cases

| Test | Input | Expected Behavior |
|------|-------|-------------------|
| `utf8_valid` | Standard UTF-8 | Accept |
| `utf8_bom` | UTF-8 with BOM | Accept, strip BOM |
| `unicode_escapes` | `\u0041` style escapes | Accept, decode |
| `special_chars` | Tabs, newlines in strings | Accept, preserve or escape |
| `control_chars` | Control characters (0x00-0x1F) other than tab and newline | Reject or sanitize |
| `surrogate_pairs` | Emoji and supplementary chars | Accept |

#### Encoding Fixture

```json
{
  "test": "special_chars",
  "tool": "syft",
  "input": {
    "artifacts": [
      {
        "name": "lib-with-tab\ttab",
        "version": "1.0.0",
        "description": "Line1\nLine2"
      }
    ]
  },
  "expected": {
    "status": "accepted",
    "output": {
      "components": [
        {
          "name": "lib-with-tab\ttab",
          "version": "1.0.0"
        }
      ]
    },
    "normalizedHash": "b3:..."
  }
}
```

### 4. Ordering Tests

Verify deterministic ordering is maintained across inputs.
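
One way to obtain an order that is deterministic, locale-invariant, and case-insensitive is to sort on a composite key. The Python sketch below is an assumption about how such a comparator could look, not the adapters' actual sort: the NFC normalization step, the use of `casefold()`, and the tie-breakers (exact name, then version, then purl) are all choices made for illustration.

```python
import unicodedata


def sort_key(component: dict) -> tuple:
    """Build a locale-independent, case-insensitive key with tie-breakers."""
    name = unicodedata.normalize("NFC", component.get("name") or "")
    # casefold() does not depend on the process locale; the exact name,
    # version, and purl break ties so that the resulting order is total.
    return (
        name.casefold(),
        name,
        component.get("version") or "",
        component.get("purl") or "",
    )


def sort_components(components: list[dict]) -> list[dict]:
    """Return components in deterministic order regardless of input order."""
    return sorted(components, key=sort_key)
```

Without the tie-breakers, names that differ only by case would compare equal and their relative order would depend on the input order, which is exactly what the `same_after_sort` case is meant to rule out.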

```
tests/anomaly/ordering/
├── unsorted-components.json
├── reversed-components.json
├── random-order.json
├── unicode-sort.json
├── case-sensitivity.json
└── expected-results.json
```

#### Test Cases

| Test | Input | Expected Behavior |
|------|-------|-------------------|
| `unsorted_input` | Components in random order | Sort deterministically |
| `reversed_input` | Components in reverse order | Sort deterministically |
| `same_after_sort` | Pre-sorted input | Same output as unsorted |
| `unicode_sort` | Unicode component names | Locale-invariant sort |
| `case_sensitivity` | Mixed case names | Case-insensitive sort |

#### Ordering Fixture

```json
{
  "test": "unsorted_input",
  "tool": "syft",
  "input": {
    "artifacts": [
      {"name": "zebra", "version": "1.0.0", "purl": "pkg:npm/zebra@1.0.0"},
      {"name": "apple", "version": "1.0.0", "purl": "pkg:npm/apple@1.0.0"},
      {"name": "mango", "version": "1.0.0", "purl": "pkg:npm/mango@1.0.0"}
    ]
  },
  "expected": {
    "status": "accepted",
    "output": {
      "components": [
        {"name": "apple", "version": "1.0.0", "purl": "pkg:npm/apple@1.0.0"},
        {"name": "mango", "version": "1.0.0", "purl": "pkg:npm/mango@1.0.0"},
        {"name": "zebra", "version": "1.0.0", "purl": "pkg:npm/zebra@1.0.0"}
      ]
    },
    "normalizedHash": "b3:..."
  }
}
```

## Golden Fixtures

### Hash File Format

```
# tests/anomaly/hashes.txt
schema-drift/syft/v1.0.0-baseline.json: BLAKE3=... SHA256=...
schema-drift/syft/expected-results.json: BLAKE3=... SHA256=...
nullables/null-values.json: BLAKE3=... SHA256=...
nullables/expected-results.json: BLAKE3=... SHA256=...
encoding/utf8-valid.json: BLAKE3=... SHA256=...
encoding/expected-results.json: BLAKE3=... SHA256=...
ordering/unsorted-components.json: BLAKE3=... SHA256=...
ordering/expected-results.json: BLAKE3=... SHA256=...
```

## CI Integration

### Test Workflow

```yaml
# .gitea/workflows/anomaly-tests.yml
name: Anomaly Regression Tests

on:
  push:
    paths:
      - 'src/Scanner/Adapters/**'
      - 'tests/anomaly/**'
  pull_request:

jobs:
  anomaly-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'

      - name: Verify fixture hashes
        run: scripts/scanner/verify-anomaly-fixtures.sh

      - name: Run schema drift tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Anomaly.Tests \
            --filter "Category=SchemaDrift"

      - name: Run nullable tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Anomaly.Tests \
            --filter "Category=Nullable"

      - name: Run encoding tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Anomaly.Tests \
            --filter "Category=Encoding"

      - name: Run ordering tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Anomaly.Tests \
            --filter "Category=Ordering"
```

### Test Runner

```csharp
// src/Scanner/__Tests/StellaOps.Scanner.Anomaly.Tests/AnomalyTestRunner.cs
using System.Text;
using System.Text.Json;
using Xunit;

// xUnit expresses categories as traits; `--filter "Category=SchemaDrift"`
// in the workflow selects tests carrying this trait.
[Trait("Category", "SchemaDrift")]
[Theory]
[MemberData(nameof(GetSchemaDriftTestCases))]
public async Task SchemaDrift_HandledCorrectly(AnomalyTestCase testCase)
{
    // Arrange
    var adapter = _adapterFactory.Create(testCase.Tool);

    // Act
    var result = await adapter.NormalizeAsync(testCase.Input);

    // Assert
    Assert.Equal(testCase.Expected.Status, result.Status);
    Assert.Equal(testCase.Expected.Warnings, result.Warnings);

    // Compare the hash of the serialized output only when the fixture provides one.
    if (testCase.Expected.NormalizedHash != null)
    {
        var hash = Blake3.HashData(Encoding.UTF8.GetBytes(
            JsonSerializer.Serialize(result.Output)));
        Assert.Equal(testCase.Expected.NormalizedHash,
            $"b3:{Convert.ToHexString(hash).ToLowerInvariant()}");
    }
}
```

## Failure Handling

### On Test Failure

1. **Schema Drift**: Create an issue and update the adapter mapping
2. **Nullable Handling**: Fix the normalization logic
3. **Encoding Error**: Fix encoding detection/conversion
4. **Ordering Violation**: Fix the sort comparator

### Failure Report

```json
{
  "failure": {
    "category": "schema_drift",
    "test": "new_required_field",
    "tool": "syft",
    "input": {...},
    "expected": {...},
    "actual": {...},
    "diff": [
      {"path": "/status", "expected": "accepted", "actual": "rejected"}
    ],
    "timestamp": "2025-12-04T12:00:00Z"
  }
}
```

## Links

- Sprint: `docs/implplan/SPRINT_0186_0001_0001_record_deterministic_execution.md` (CM4)
- Normalization: `docs/modules/scanner/design/competitor-ingest-normalization.md` (CM1)
- Fixtures: `docs/modules/scanner/fixtures/competitor-adapters/`
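
The `Verify fixture hashes` CI step calls `scripts/scanner/verify-anomaly-fixtures.sh`, whose contents are not shown in this document. As a rough illustration of what such a check could do, the Python sketch below parses lines in the `path: BLAKE3=<hex> SHA256=<hex>` form used by `tests/anomaly/hashes.txt` and compares SHA256 digests. The function names are hypothetical, and only SHA256 is verified because BLAKE3 is not part of the Python standard library.

```python
import hashlib
from pathlib import Path


def parse_hash_line(line: str):
    """Parse 'path: BLAKE3=<hex> SHA256=<hex>' into (path, {algorithm: digest})."""
    path, _, rest = line.partition(": ")
    digests = dict(item.split("=", 1) for item in rest.split())
    return path, digests


def verify_fixtures(root: Path, manifest: str) -> list[str]:
    """Return the relative paths whose SHA256 differs from the manifest."""
    mismatches = []
    for line in manifest.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        rel_path, digests = parse_hash_line(line)
        actual = hashlib.sha256((root / rel_path).read_bytes()).hexdigest()
        if actual != digests["SHA256"].lower():
            mismatches.append(rel_path)
    return mismatches
```

A CI wrapper could call `verify_fixtures(Path("tests/anomaly"), manifest_text)` and fail the job when the returned list is non-empty.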