git.stella-ops.org/docs/modules/scanner/design/determinism-ci-harness.md

Determinism CI Harness for New Formats (SC5)

Status: Draft · Date: 2025-12-04

Scope: Define the determinism CI harness for validating stable ordering, hash checks, golden fixtures, and RNG seeds for CVSS v4, CycloneDX 1.7/CBOM, and SLSA 1.2 outputs.

Objectives

  • Ensure Scanner outputs are reproducible across builds, platforms, and time.
  • Validate that serialized SBOM/VEX/attestation outputs have deterministic ordering.
  • Anchor CI validation to golden fixtures with pre-computed hashes.
  • Enable offline verification without network dependencies.

CI Pipeline Integration

Environment Setup

# .gitea/workflows/scanner-determinism.yml additions
env:
  DOTNET_DISABLE_BUILTIN_GRAPH: "1"
  TZ: "UTC"
  LC_ALL: "C"
  STELLAOPS_DETERMINISM_SEED: "42"
  STELLAOPS_DETERMINISM_TIMESTAMP: "2025-01-01T00:00:00Z"

Required Environment Variables

| Variable | Purpose | Default |
|----------|---------|---------|
| `TZ` | Force UTC timezone | `UTC` |
| `LC_ALL` | Force locale-invariant sorting | `C` |
| `STELLAOPS_DETERMINISM_SEED` | Fixed RNG seed for reproducibility | `42` |
| `STELLAOPS_DETERMINISM_TIMESTAMP` | Fixed timestamp for output | `2025-01-01T00:00:00Z` |
| `DOTNET_DISABLE_BUILTIN_GRAPH` | Disable non-deterministic graph features | `1` |
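A pre-flight check can catch a missing or overridden variable before any hashes are computed. The following is a minimal sketch, not part of the existing harness: `check_determinism_env` is a hypothetical helper, and it assumes the defaults in the table above are the only acceptable values in CI.

```shell
#!/bin/bash
# Hypothetical pre-flight check (assumption: CI must use the documented defaults).
# Prints one MISMATCH line per variable whose exported value differs from the
# expected value, and returns non-zero if any mismatch was found.
check_determinism_env() {
    local status=0
    local name expected actual
    while IFS='=' read -r name expected; do
        # printenv only sees exported variables, which is what child
        # processes such as `dotnet test` will see as well.
        actual=$(printenv "${name}" || true)
        if [ "${actual}" != "${expected}" ]; then
            echo "MISMATCH: ${name} expected='${expected}' actual='${actual}'"
            status=1
        fi
    done <<'EOF'
TZ=UTC
LC_ALL=C
STELLAOPS_DETERMINISM_SEED=42
STELLAOPS_DETERMINISM_TIMESTAMP=2025-01-01T00:00:00Z
DOTNET_DISABLE_BUILTIN_GRAPH=1
EOF
    return "${status}"
}
```

Calling `check_determinism_env` as the first command of a CI step would fail the step early, with a message naming the offending variable, instead of surfacing later as an unexplained hash mismatch.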

Hash Validation Steps

1. Golden Fixture Verification

#!/bin/bash
# scripts/scanner/verify-determinism.sh

set -euo pipefail

FIXTURE_DIR="docs/modules/scanner/fixtures/cdx17-cbom"
HASH_FILE="${FIXTURE_DIR}/hashes.txt"

verify_fixture() {
    local file="$1"
    local expected_blake3="$2"
    local expected_sha256="$3"

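    # Keep the computed digests local so they do not leak between calls.
    local actual_blake3 actual_sha256
    actual_blake3=$(b3sum "${file}" | cut -d' ' -f1)
    actual_sha256=$(sha256sum "${file}" | cut -d' ' -f1)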

    if [[ "${actual_blake3}" != "${expected_blake3}" ]]; then
        echo "FAIL: ${file} BLAKE3 mismatch"
        echo "  expected: ${expected_blake3}"
        echo "  actual:   ${actual_blake3}"
        return 1
    fi

    if [[ "${actual_sha256}" != "${expected_sha256}" ]]; then
        echo "FAIL: ${file} SHA256 mismatch"
        echo "  expected: ${expected_sha256}"
        echo "  actual:   ${actual_sha256}"
        return 1
    fi

    echo "PASS: ${file}"
    return 0
}

# Parse hashes.txt and verify each fixture, skipping comments and blank lines.
# Each line is expected to look like: <filename>: BLAKE3=<hex> SHA256=<hex>
while IFS=': ' read -r filename hashes; do
    blake3=$(echo "${hashes}" | grep -oP 'BLAKE3=\K[a-f0-9]+')
    sha256=$(echo "${hashes}" | grep -oP 'SHA256=\K[a-f0-9]+')
    verify_fixture "${FIXTURE_DIR}/${filename}" "${blake3}" "${sha256}"
done < <(grep -Ev '^[[:space:]]*(#|$)' "${HASH_FILE}")
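The parser above relies on `grep -oP`, which requires a grep built with PCRE support (GNU grep on most Linux runners). Where that is not available, the same extraction can be done with `sed`. This is a minimal sketch under the assumption that each `hashes.txt` line has the form `<filename>: BLAKE3=<hex> SHA256=<hex>`, which is inferred from the script rather than from a documented format; `extract_digest` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical portable replacement for the `grep -oP` calls.
# $1 = digest label (BLAKE3 or SHA256)
# $2 = one line from hashes.txt, assumed to look like
#      "<filename>: BLAKE3=<hex> SHA256=<hex>"
# Prints the lowercase hex digest that follows the label, or nothing if the
# label is not present on the line.
extract_digest() {
    printf '%s\n' "$2" | sed -n "s/.*$1=\([a-f0-9][a-f0-9]*\).*/\1/p"
}
```

Inside the loop this would be used as `blake3=$(extract_digest BLAKE3 "${hashes}")`. Unlike `grep`, `sed -n` exits with status 0 when nothing matches, so under `set -e` a missing label shows up as an empty expected digest and a reported mismatch rather than a silent script exit.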

2. Deterministic Serialization Test

// src/Scanner/__Tests/StellaOps.Scanner.Determinism.Tests/CdxDeterminismTests.cs
[Fact]
public async Task Cdx17_Serialization_Is_Deterministic()
{
    // Arrange
    var options = new DeterminismOptions
    {
        Seed = 42,
        Timestamp = new DateTimeOffset(2025, 1, 1, 0, 0, 0, TimeSpan.Zero),
        CultureInvariant = true
    };

    var sbom = CreateTestSbom();

    // Act - serialize twice
    var json1 = await _serializer.SerializeAsync(sbom, options);
    var json2 = await _serializer.SerializeAsync(sbom, options);

    // Assert - must be identical
    Assert.Equal(json1, json2);

    // Compute and verify hash
    var hash = Blake3.HashData(Encoding.UTF8.GetBytes(json1));
    Assert.Equal(ExpectedHash, Convert.ToHexString(hash).ToLowerInvariant());
}
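The serialize-twice comparison can also be applied at the command level, for example to a CLI that writes a serialized document to stdout. The sketch below is illustrative only: `assert_repeatable` is a hypothetical helper, and it compares SHA-256 digests of two runs instead of checking against a golden hash.

```shell
#!/bin/bash
# Hypothetical helper: run the given command twice and compare the SHA-256
# digests of its standard output. Returns non-zero when the two runs differ.
assert_repeatable() {
    local first second
    first=$("$@" | sha256sum | cut -d' ' -f1)
    second=$("$@" | sha256sum | cut -d' ' -f1)
    if [ "${first}" != "${second}" ]; then
        echo "FAIL: output differs between runs (${first} vs ${second})"
        return 1
    fi
    echo "PASS: ${first}"
}
```

Any command that emits the document on stdout can be passed as arguments, for example `assert_repeatable cat docs/modules/scanner/fixtures/cdx17-cbom/sample-cdx17-cbom.json` as a trivial smoke test. Two runs in the same process environment are a weaker guarantee than the golden-hash comparison, so this complements the fixture check and does not replace it.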

3. Downgrade Adapter Verification

[Fact]
public async Task Cdx17_To_Cdx16_Downgrade_Is_Deterministic()
{
    // Arrange
    var cdx17 = await LoadFixture("sample-cdx17-cbom.json");

    // Act
    var cdx16_1 = await _adapter.Downgrade(cdx17);
    var cdx16_2 = await _adapter.Downgrade(cdx17);

    // Assert
    var json1 = await _serializer.SerializeAsync(cdx16_1);
    var json2 = await _serializer.SerializeAsync(cdx16_2);
    Assert.Equal(json1, json2);

    // Verify matches golden fixture hash
    var hash = Blake3.HashData(Encoding.UTF8.GetBytes(json1));
    var expectedHash = LoadExpectedHash("sample-cdx16.json");
    Assert.Equal(expectedHash, Convert.ToHexString(hash).ToLowerInvariant());
}

Ordering Rules

Components (CycloneDX)

  1. Sort by purl (case-insensitive, locale-invariant)
  2. Ties: sort by name (case-insensitive)
  3. Ties: sort by version (semantic version comparison)

Vulnerabilities

  1. Sort by id (lexicographic)
  2. Ties: sort by source.name (lexicographic)
  3. Ties: sort by highest severity rating score (descending)

Properties

  1. Sort by name (lexicographic, locale-invariant)

Hashes

  1. Sort by alg in the fixed order BLAKE3-256, SHA-256, SHA-512

Ratings (CVSS)

  1. CVSSv4 first
  2. CVSSv31 second
  3. CVSSv30 third
  4. Others alphabetically by method

Fixture Requirements (SC8 Cross-Reference)

Each golden fixture must include:

| Format | Fixture File | Contents |
|--------|--------------|----------|
| CDX 1.7 + CBOM | `sample-cdx17-cbom.json` | Full SBOM with CVSS v4/v3.1, CBOM properties, SLSA Source Track, evidence |
| CDX 1.6 (downgraded) | `sample-cdx16.json` | Downgraded version with CVSS v4 removed, CBOM dropped, audit markers |
| SLSA Source Track | `source-track.sample.json` | Standalone source provenance block |

CI Workflow Steps

# Add to .gitea/workflows/scanner-determinism.yml
jobs:
  determinism-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'

      - name: Set determinism environment
        run: |
          echo "TZ=UTC" >> "$GITHUB_ENV"
          echo "LC_ALL=C" >> "$GITHUB_ENV"
          echo "DOTNET_DISABLE_BUILTIN_GRAPH=1" >> "$GITHUB_ENV"
          echo "STELLAOPS_DETERMINISM_SEED=42" >> "$GITHUB_ENV"
          echo "STELLAOPS_DETERMINISM_TIMESTAMP=2025-01-01T00:00:00Z" >> "$GITHUB_ENV"

      - name: Verify golden fixtures
        run: scripts/scanner/verify-determinism.sh

      - name: Run determinism tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Determinism.Tests \
            --configuration Release \
            --verbosity normal

      - name: Run adapter determinism tests
        run: |
          dotnet test src/Scanner/__Tests/StellaOps.Scanner.Adapters.Tests \
            --filter "Category=Determinism" \
            --configuration Release

Failure Handling

Hash Mismatch Protocol

  1. Do not auto-update hashes; manual review is required
  2. Log diff between expected and actual output
  3. Capture both BLAKE3 and SHA256 for audit trail
  4. Block merge until resolved

Acceptable Reasons for Hash Update

  • Schema version bump (documented in change log)
  • Intentional ordering rule change (documented in adapter CSV)
  • Bug fix that corrects previously non-deterministic output
  • Never acceptable: cosmetic changes, timestamp updates, random salts

Offline Verification

The harness must work completely offline:

  • No network calls during serialization
  • No external schema validation endpoints
  • Trust roots and schemas bundled in repository
  • All RNG seeded from environment variable

Integration with SC8 Fixtures

The fixtures defined in SC8 serve as golden sources for this harness:

docs/modules/scanner/fixtures/
├── cdx17-cbom/
│   ├── sample-cdx17-cbom.json    # CVSS v4 + v3.1, CBOM, evidence
│   ├── sample-cdx16.json         # Downgraded, CVSS v3.1 only
│   ├── source-track.sample.json  # SLSA Source Track
│   └── hashes.txt                # BLAKE3 + SHA256 for all fixtures
├── adapters/
│   ├── mapping-cvss4-to-cvss3.csv
│   ├── mapping-cdx17-to-cdx16.csv
│   ├── mapping-slsa12-to-slsa10.csv
│   └── hashes.txt
└── competitor-adapters/
    └── fixtures/
        ├── normalized-syft.json
        ├── normalized-trivy.json
        └── normalized-clair.json
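
Since each `hashes.txt` is meant to cover all fixtures in its directory, a coverage check can flag files that were added without a digest entry. The following is a minimal sketch: `find_unlisted_fixtures` is a hypothetical helper, and it assumes entries have the form `<filename>: ...` with the file name before the first `: `.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every regular file in directory $1
# (other than hashes.txt itself) that has no entry in $1/hashes.txt.
# Assumes each entry starts with "<filename>: ".
find_unlisted_fixtures() {
    local dir="$1"
    local path name
    for path in "${dir}"/*; do
        [ -f "${path}" ] || continue
        name=$(basename "${path}")
        if [ "${name}" = "hashes.txt" ]; then
            continue
        fi
        # Exact string comparison on the first field avoids treating dots
        # in file names as regular-expression wildcards.
        if ! awk -F': ' -v n="${name}" \
                '$1 == n { found = 1 } END { exit !found }' \
                "${dir}/hashes.txt"; then
            echo "${name}"
        fi
    done
}
```

Running `find_unlisted_fixtures docs/modules/scanner/fixtures/cdx17-cbom` and failing the step when the output is non-empty would keep the fixture set and its digests in step.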

Related Documents

  • Sprint: docs/implplan/SPRINT_0186_0001_0001_record_deterministic_execution.md (SC5)
  • Roadmap: docs/modules/scanner/design/standards-convergence-roadmap.md (SC1)
  • Contract: docs/modules/scanner/design/cdx17-cbom-contract.md (SC2)