Determinism Verification Guide

Sprint: 5100.0007.0003 (Epic B)
Last Updated: 2025-12-23

Overview

StellaOps enforces deterministic artifact generation across all exported formats. This ensures:

  1. Reproducibility: Given the same inputs, outputs are byte-for-byte identical
  2. Auditability: Hash verification proves artifact integrity
  3. Compliance: Regulated environments can replay and verify builds
  4. CI Gating: Drift detection prevents unintended changes

Supported Artifact Types

| Type | Format(s) | Test File |
|------|-----------|-----------|
| SBOM | SPDX 3.0.1, CycloneDX 1.6, CycloneDX 1.7 | SbomDeterminismTests.cs |
| VEX | OpenVEX, CSAF 2.0 | VexDeterminismTests.cs |
| Policy Verdicts | JSON | PolicyDeterminismTests.cs |
| Evidence Bundles | JSON, DSSE, in-toto | EvidenceBundleDeterminismTests.cs |
| AirGap Bundles | NDJSON | AirGapBundleDeterminismTests.cs |
| Advisory Normalization | Canonical JSON | IngestionDeterminismTests.cs |

Determinism Manifest Format

Every deterministic artifact can produce a manifest describing its content hash and generation context.

Schema (v1.0)

{
  "schemaVersion": "1.0",
  "artifact": {
    "type": "sbom | vex | policy-verdict | evidence-bundle | airgap-bundle",
    "name": "artifact-identifier",
    "version": "1.0.0",
    "format": "SPDX 3.0.1 | CycloneDX 1.6 | OpenVEX | CSAF 2.0 | ..."
  },
  "canonicalHash": {
    "algorithm": "SHA-256",
    "value": "abc123..."
  },
  "toolchain": {
    "platform": ".NET 10.0",
    "components": [
      { "name": "StellaOps.Scanner", "version": "1.0.0" }
    ]
  },
  "inputs": {
    "feedSnapshotHash": "def456...",
    "policyManifestHash": "ghi789...",
    "configHash": "jkl012..."
  },
  "generatedAt": "2025-12-23T18:00:00Z"
}

Field Descriptions

| Field | Description |
|-------|-------------|
| schemaVersion | Manifest schema version (currently 1.0) |
| artifact.type | Category of the artifact |
| artifact.name | Identifier for the artifact |
| artifact.version | Version of the artifact (if applicable) |
| artifact.format | Specific format/spec version |
| canonicalHash.algorithm | Hash algorithm (always SHA-256) |
| canonicalHash.value | Lowercase hex hash of canonical bytes |
| toolchain.platform | Runtime platform |
| toolchain.components | List of generating components with versions |
| inputs | Hashes of input artifacts (feed snapshots, policies, etc.) |
| generatedAt | ISO-8601 UTC timestamp of generation |
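
As a rough illustration of the canonicalHash fields, the value can be reproduced from an artifact's canonical bytes with the standard .NET hashing APIs. The helper name ComputeCanonicalHash is hypothetical and is not part of StellaOps; the library may compute the value differently.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not a StellaOps API): lowercase hex SHA-256 of the given bytes.
static string ComputeCanonicalHash(byte[] canonicalBytes) =>
    Convert.ToHexString(SHA256.HashData(canonicalBytes)).ToLowerInvariant();

var sampleBytes = Encoding.UTF8.GetBytes("{\"a\":1}");
Console.WriteLine(ComputeCanonicalHash(sampleBytes)); // 64 lowercase hex characters
```

Hashing bytes rather than a string keeps the result independent of how the text is later re-encoded.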

Creating a Determinism Manifest

Use DeterminismManifestWriter from StellaOps.Testing.Determinism:

using StellaOps.Testing.Determinism;

// Generate artifact bytes
var sbomBytes = GenerateSbom(input, frozenTime);

// Create artifact info
var artifactInfo = new ArtifactInfo
{
    Type = "sbom",
    Name = "my-container-sbom",
    Version = "1.0.0",
    Format = "CycloneDX 1.6"
};

// Create toolchain info
var toolchain = new ToolchainInfo
{
    Platform = ".NET 10.0",
    Components = new[]
    {
        new ComponentInfo { Name = "StellaOps.Scanner", Version = "1.0.0" }
    }
};

// Create manifest
var manifest = DeterminismManifestWriter.CreateManifest(
    sbomBytes,
    artifactInfo,
    toolchain);

// Save manifest
DeterminismManifestWriter.Save(manifest, "determinism.json");

Reading and Verifying Manifests

// Load manifest
var manifest = DeterminismManifestReader.Load("determinism.json");

// Verify artifact bytes match manifest hash
var currentBytes = File.ReadAllBytes("artifact.json");
var isValid = DeterminismManifestReader.Verify(manifest, currentBytes);

if (!isValid)
{
    throw new DeterminismDriftException(
        $"Artifact hash mismatch. Expected: {manifest.CanonicalHash.Value}");
}
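
For readers without access to the library, the check above can be approximated with System.Text.Json alone. This is a minimal sketch that assumes the manifest follows the v1.0 schema shown earlier; MatchesManifest is a hypothetical name, and the real DeterminismManifestReader.Verify may perform additional checks.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical stand-in for manifest verification (not the StellaOps implementation).
static bool MatchesManifest(string manifestJson, byte[] artifactBytes)
{
    using var manifest = JsonDocument.Parse(manifestJson);

    // Read canonicalHash.value as defined by the v1.0 schema.
    var expected = manifest.RootElement
        .GetProperty("canonicalHash")
        .GetProperty("value")
        .GetString();

    var actual = Convert.ToHexString(SHA256.HashData(artifactBytes)).ToLowerInvariant();
    return string.Equals(expected, actual, StringComparison.Ordinal);
}

var artifact = Encoding.UTF8.GetBytes("abc");
var manifestJson =
    "{ \"canonicalHash\": { \"algorithm\": \"SHA-256\", \"value\": " +
    "\"ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad\" } }";

Console.WriteLine(MatchesManifest(manifestJson, artifact)); // True
```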

Determinism Rules

1. Canonical JSON Serialization

All JSON output must use canonical serialization via StellaOps.Canonical.Json:

using System.Text;
using StellaOps.Canonical.Json;

var json = CanonJson.Serialize(myObject);
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(json));

Rules:

  • Keys sorted lexicographically
  • No trailing whitespace
  • Unix line endings (\n)
  • No BOM
  • UTF-8 encoding
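
To make the rules concrete, here is a minimal sketch of key-sorted, whitespace-free JSON output built on System.Text.Json. It is not the StellaOps.Canonical.Json implementation; Canonicalize and WriteCanonical are hypothetical names, and details such as number formatting and string escaping may differ from what CanonJson produces.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.Json;

// Recursively re-emit a JSON value with object keys in ordinal order.
static void WriteCanonical(JsonElement element, Utf8JsonWriter writer)
{
    switch (element.ValueKind)
    {
        case JsonValueKind.Object:
            writer.WriteStartObject();
            foreach (var property in element.EnumerateObject()
                         .OrderBy(p => p.Name, StringComparer.Ordinal))
            {
                writer.WritePropertyName(property.Name);
                WriteCanonical(property.Value, writer);
            }
            writer.WriteEndObject();
            break;
        case JsonValueKind.Array:
            // Array order is data, so it is preserved as-is.
            writer.WriteStartArray();
            foreach (var item in element.EnumerateArray())
                WriteCanonical(item, writer);
            writer.WriteEndArray();
            break;
        default:
            element.WriteTo(writer);
            break;
    }
}

// Returns UTF-8 bytes with no BOM and no indentation (the writer's defaults).
static byte[] Canonicalize(string json)
{
    using var document = JsonDocument.Parse(json);
    using var buffer = new MemoryStream();
    using (var writer = new Utf8JsonWriter(buffer))
    {
        WriteCanonical(document.RootElement, writer);
    }
    return buffer.ToArray();
}

var first = Canonicalize("{\"b\":1,\"a\":{\"d\":2,\"c\":3}}");
var second = Canonicalize("{ \"a\": { \"c\": 3, \"d\": 2 }, \"b\": 1 }");
Console.WriteLine(Encoding.UTF8.GetString(first)); // {"a":{"c":3,"d":2},"b":1}
Console.WriteLine(first.SequenceEqual(second));    // True
```

Two inputs that differ only in key order and whitespace yield the same bytes, which is the property the hash relies on.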

2. Frozen Timestamps

All timestamps must be provided externally or use DeterministicTime:

// ❌ BAD - Non-deterministic
var timestamp = DateTimeOffset.UtcNow;

// ✅ GOOD - Deterministic
var timestamp = frozenTime; // Passed as parameter
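
A frozen instant also needs to be rendered the same way on every machine. The sketch below formats a caller-supplied timestamp in the ISO-8601 UTC form used by generatedAt; FormatTimestamp is a hypothetical helper, not the DeterministicTime API.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: render a caller-supplied instant as ISO-8601 UTC,
// independent of the machine's culture and local time zone.
static string FormatTimestamp(DateTimeOffset frozenTime) =>
    frozenTime.UtcDateTime.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

// 20:00 at UTC+02:00 is the same instant as 18:00 UTC.
var sampleTime = new DateTimeOffset(2025, 12, 23, 20, 0, 0, TimeSpan.FromHours(2));
Console.WriteLine(FormatTimestamp(sampleTime)); // 2025-12-23T18:00:00Z
```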

3. Deterministic IDs

UUIDs and IDs must be derived from content, not random:

// ❌ BAD - Random UUID
var id = Guid.NewGuid();

// ✅ GOOD - Content-derived ID
var seed = $"{input.Name}:{input.Version}:{timestamp:O}";
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(seed));
var id = new Guid(Convert.FromHexString(hash[..32]));

4. Stable Ordering

Collections must be sorted before serialization:

// ❌ BAD - Non-deterministic order
var items = dictionary.Values;

// ✅ GOOD - Sorted by key with an ordinal comparer
var items = dictionary
    .OrderBy(kvp => kvp.Key, StringComparer.Ordinal)
    .Select(kvp => kvp.Value);
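
The choice of StringComparer.Ordinal matters because culture-sensitive comparers can order the same strings differently depending on machine settings. A small sketch with made-up package names:

```csharp
using System;
using System.Linq;

// Illustrative names only; any string keys behave the same way.
var names = new[] { "zlib", "Newtonsoft.Json", "alpine", "OpenSSL" };

// Ordinal comparison sorts by UTF-16 code unit, so the result does not
// depend on the current culture or OS collation settings.
var ordered = names.OrderBy(n => n, StringComparer.Ordinal).ToArray();

Console.WriteLine(string.Join(",", ordered)); // Newtonsoft.Json,OpenSSL,alpine,zlib
```

Uppercase letters sort before lowercase ones under ordinal rules, which may look odd to a human reader but is stable across environments.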

5. Parallel Safety

Determinism must hold under parallel execution:

var tasks = Enumerable.Range(0, 20)
    .Select(_ => Task.Run(() => GenerateArtifact(input, frozenTime)))
    .ToArray();

var results = await Task.WhenAll(tasks);
results.Should().AllBeEquivalentTo(results[0]); // All identical

CI Integration

PR Merge Gate

The determinism gate runs on pull requests, before merge:

# .gitea/workflows/determinism-gate.yaml
name: Determinism Gate
on:
  pull_request:
    types: [opened, reopened, synchronize, ready_for_review]
jobs:
  determinism:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'
      - name: Run Determinism Tests
        run: |
          dotnet test tests/integration/StellaOps.Integration.Determinism \
            --logger "trx;LogFileName=determinism.trx"
      - name: Generate Determinism Manifest
        run: |
          dotnet run --project tools/DeterminismManifestGenerator \
            --output determinism.json
      - name: Upload Determinism Artifact
        uses: actions/upload-artifact@v4
        with:
          name: determinism-manifest
          path: determinism.json

Baseline Storage

Determinism baselines are committed to the repository under ci-artifacts/determinism/baseline/:

ci-artifacts/
  determinism/
    baseline/
      sbom-spdx-3.0.1.json
      sbom-cyclonedx-1.6.json
      sbom-cyclonedx-1.7.json
      vex-openvex.json
      vex-csaf.json
      policy-verdict.json
      evidence-bundle.json
      airgap-bundle.json

Drift Detection

When a PR changes artifact output:

  1. CI compares new manifest hash against baseline
  2. If different, CI fails with diff report
  3. Developer must either:
    • Fix the regression (restore determinism)
    • Update the baseline (if change is intentional)
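
The comparison in step 1 can be sketched as follows, assuming each manifest has already been reduced to an artifact name and its canonicalHash.value. FindDrift is a hypothetical helper, not the actual CI tooling, and the hash values are placeholders.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical drift check: both dictionaries map artifact name -> canonical hash.
static List<string> FindDrift(
    IReadOnlyDictionary<string, string> baseline,
    IReadOnlyDictionary<string, string> current)
{
    var report = new List<string>();

    // Visit names in ordinal order so the report itself is deterministic.
    foreach (var name in baseline.Keys.Union(current.Keys).OrderBy(k => k, StringComparer.Ordinal))
    {
        var inBaseline = baseline.TryGetValue(name, out var expected);
        var inCurrent = current.TryGetValue(name, out var actual);

        if (!inBaseline)
            report.Add($"{name}: new artifact, no baseline");
        else if (!inCurrent)
            report.Add($"{name}: missing from current run");
        else if (!string.Equals(expected, actual, StringComparison.Ordinal))
            report.Add($"{name}: expected {expected}, actual {actual}");
    }
    return report;
}

var baselineHashes = new Dictionary<string, string> { ["sbom-spdx-3.0.1"] = "aaa", ["vex-openvex"] = "bbb" };
var currentHashes = new Dictionary<string, string> { ["sbom-spdx-3.0.1"] = "aaa", ["vex-openvex"] = "ccc" };

foreach (var line in FindDrift(baselineHashes, currentHashes))
    Console.WriteLine(line); // vex-openvex: expected bbb, actual ccc
```

A non-empty report would correspond to the failing case in step 2.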

Baseline Update Process

To intentionally update a baseline:

# 1. Run determinism tests to generate new manifests
dotnet test tests/integration/StellaOps.Integration.Determinism

# 2. Update baseline files
cp determinism/*.json ci-artifacts/determinism/baseline/

# 3. Commit with explicit message
git add ci-artifacts/determinism/baseline/
git commit -m "chore(determinism): update baselines for [reason]

Breaking: [explain what changed]
Justification: [explain why this is correct]"

Replay Verification

To verify an artifact was produced deterministically:

# 1. Get the manifest
curl -O https://releases.stellaops.io/v1.0.0/sbom.determinism.json

# 2. Get the artifact
curl -O https://releases.stellaops.io/v1.0.0/sbom.cdx.json

# 3. Verify
dotnet run --project tools/DeterminismVerifier \
  --manifest sbom.determinism.json \
  --artifact sbom.cdx.json

Output:

Determinism Verification
========================
Artifact: sbom.cdx.json
Manifest: sbom.determinism.json
Expected Hash: abc123...
Actual Hash:   abc123...
Status: ✅ VERIFIED

Test Files Reference

All determinism tests are in tests/integration/StellaOps.Integration.Determinism/:

| File | Tests | Description |
|------|-------|-------------|
| DeterminismValidationTests.cs | 16 | Manifest format and reader/writer |
| SbomDeterminismTests.cs | 14 | SPDX 3.0.1, CycloneDX 1.6/1.7 |
| VexDeterminismTests.cs | 17 | OpenVEX, CSAF 2.0 |
| PolicyDeterminismTests.cs | 18 | Policy verdict artifacts |
| EvidenceBundleDeterminismTests.cs | 15 | DSSE, in-toto attestations |
| AirGapBundleDeterminismTests.cs | 14 | NDJSON bundles, manifests |
| IngestionDeterminismTests.cs | 17 | NVD/OSV/GHSA/CSAF normalization |

Troubleshooting

Hash Mismatch

If you see a hash mismatch:

  1. Check timestamps: Ensure frozen time is used
  2. Check ordering: Ensure all collections are sorted
  3. Check IDs: Ensure IDs are content-derived
  4. Check encoding: Ensure UTF-8 without BOM
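
The encoding check in step 4 can be partly automated. The sketch below inspects raw artifact bytes for a UTF-8 BOM, carriage returns, and invalid UTF-8; FindEncodingIssues is a hypothetical diagnostic helper, not part of the StellaOps tooling.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical diagnostic: list byte-level problems that commonly break hashes.
static List<string> FindEncodingIssues(byte[] artifactBytes)
{
    var issues = new List<string>();

    if (artifactBytes.Length >= 3 &&
        artifactBytes[0] == 0xEF && artifactBytes[1] == 0xBB && artifactBytes[2] == 0xBF)
        issues.Add("UTF-8 BOM at start of file");

    if (Array.IndexOf(artifactBytes, (byte)'\r') >= 0)
        issues.Add("carriage return found (expected \\n line endings)");

    try
    {
        // throwOnInvalidBytes: true makes malformed sequences raise instead of being replaced.
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true)
            .GetString(artifactBytes);
    }
    catch (DecoderFallbackException)
    {
        issues.Add("invalid UTF-8 byte sequence");
    }

    return issues;
}

var suspect = new byte[] { 0xEF, 0xBB, 0xBF, (byte)'{', (byte)'}', (byte)'\r', (byte)'\n' };
foreach (var issue in FindEncodingIssues(suspect))
    Console.WriteLine(issue);
```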

Flaky Tests

If determinism tests are flaky:

  1. Check parallelism: Ensure no shared mutable state
  2. Check time zones: Use UTC explicitly
  3. Check random sources: Remove all random number generation
  4. Check hash inputs: Ensure all inputs are captured

CI Failures

If the CI determinism gate fails:

  1. Compare the diff between expected and actual
  2. Identify which field changed
  3. Track back to the code change that caused it
  4. Either fix the regression or update baseline with justification
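
When the diff report is not at hand, locating the first differing byte is often enough to identify the changed field in step 2. FirstDifference is a hypothetical helper for that, and the JSON fragments are illustrative.

```csharp
using System;
using System.Text;

// Hypothetical helper: offset of the first differing byte, or -1 if identical.
// If one input is a prefix of the other, the shorter length is returned.
static int FirstDifference(ReadOnlySpan<byte> expected, ReadOnlySpan<byte> actual)
{
    var shared = Math.Min(expected.Length, actual.Length);
    for (var i = 0; i < shared; i++)
    {
        if (expected[i] != actual[i]) return i;
    }
    return expected.Length == actual.Length ? -1 : shared;
}

var expectedBytes = Encoding.UTF8.GetBytes("{\"name\":\"a\",\"version\":\"1.0.0\"}");
var actualBytes = Encoding.UTF8.GetBytes("{\"name\":\"a\",\"version\":\"1.0.1\"}");

var offset = FirstDifference(expectedBytes, actualBytes);
if (offset >= 0)
{
    // Show a little context before the mismatch to reveal the field name.
    var start = Math.Max(0, offset - 12);
    Console.WriteLine($"First difference at byte {offset}, near: " +
        Encoding.UTF8.GetString(actualBytes, start, Math.Min(16, actualBytes.Length - start)));
}
```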