# Determinism Verification Guide

**Sprint:** 5100.0007.0003 (Epic B)
**Last Updated:** 2025-12-23

## Overview

StellaOps enforces deterministic artifact generation across all exported formats. This ensures:

1. **Reproducibility**: Given the same inputs, outputs are byte-for-byte identical
2. **Auditability**: Hash verification proves artifact integrity
3. **Compliance**: Regulated environments can replay and verify builds
4. **CI Gating**: Drift detection prevents unintended changes

## Supported Artifact Types

| Type | Format(s) | Test File |
|------|-----------|-----------|
| SBOM | SPDX 3.0.1, CycloneDX 1.6, CycloneDX 1.7 | `SbomDeterminismTests.cs` |
| VEX | OpenVEX, CSAF 2.0 | `VexDeterminismTests.cs` |
| Policy Verdicts | JSON | `PolicyDeterminismTests.cs` |
| Evidence Bundles | JSON, DSSE, in-toto | `EvidenceBundleDeterminismTests.cs` |
| AirGap Bundles | NDJSON | `AirGapBundleDeterminismTests.cs` |
| Advisory Normalization | Canonical JSON | `IngestionDeterminismTests.cs` |

## Determinism Manifest Format

Every deterministic artifact can produce a manifest describing its content hash and generation context.

### Schema (v1.0)

```json
{
  "schemaVersion": "1.0",
  "artifact": {
    "type": "sbom | vex | policy-verdict | evidence-bundle | airgap-bundle",
    "name": "artifact-identifier",
    "version": "1.0.0",
    "format": "SPDX 3.0.1 | CycloneDX 1.6 | OpenVEX | CSAF 2.0 | ..."
  },
  "canonicalHash": {
    "algorithm": "SHA-256",
    "value": "abc123..."
  },
  "toolchain": {
    "platform": ".NET 10.0",
    "components": [
      { "name": "StellaOps.Scanner", "version": "1.0.0" }
    ]
  },
  "inputs": {
    "feedSnapshotHash": "def456...",
    "policyManifestHash": "ghi789...",
    "configHash": "jkl012..."
  },
  "generatedAt": "2025-12-23T18:00:00Z"
}
```

### Field Descriptions

| Field | Description |
|-------|-------------|
| `schemaVersion` | Manifest schema version (currently `1.0`) |
| `artifact.type` | Category of the artifact |
| `artifact.name` | Identifier for the artifact |
| `artifact.version` | Version of the artifact (if applicable) |
| `artifact.format` | Specific format/spec version |
| `canonicalHash.algorithm` | Hash algorithm (always `SHA-256`) |
| `canonicalHash.value` | Lowercase hex hash of canonical bytes |
| `toolchain.platform` | Runtime platform |
| `toolchain.components` | List of generating components with versions |
| `inputs` | Hashes of input artifacts (feed snapshots, policies, etc.) |
| `generatedAt` | ISO-8601 UTC timestamp of generation |

## Creating a Determinism Manifest

Use `DeterminismManifestWriter` from `StellaOps.Testing.Determinism`:

```csharp
using StellaOps.Testing.Determinism;

// Generate artifact bytes
var sbomBytes = GenerateSbom(input, frozenTime);

// Create artifact info
var artifactInfo = new ArtifactInfo
{
    Type = "sbom",
    Name = "my-container-sbom",
    Version = "1.0.0",
    Format = "CycloneDX 1.6"
};

// Create toolchain info
var toolchain = new ToolchainInfo
{
    Platform = ".NET 10.0",
    Components = new[]
    {
        new ComponentInfo { Name = "StellaOps.Scanner", Version = "1.0.0" }
    }
};

// Create manifest
var manifest = DeterminismManifestWriter.CreateManifest(
    sbomBytes,
    artifactInfo,
    toolchain);

// Save manifest
DeterminismManifestWriter.Save(manifest, "determinism.json");
```

## Reading and Verifying Manifests

```csharp
// Load manifest
var manifest = DeterminismManifestReader.Load("determinism.json");

// Verify artifact bytes match manifest hash
var currentBytes = File.ReadAllBytes("artifact.json");
var isValid = DeterminismManifestReader.Verify(manifest, currentBytes);

if (!isValid)
{
    throw new DeterminismDriftException(
        $"Artifact hash mismatch. Expected: {manifest.CanonicalHash.Value}");
}
```

## Determinism Rules

### 1. Canonical JSON Serialization

All JSON output must use canonical serialization via `StellaOps.Canonical.Json`:

```csharp
using System.Text; // Encoding
using StellaOps.Canonical.Json;

var json = CanonJson.Serialize(myObject);
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(json));
```

Rules:

- Keys sorted lexicographically
- No trailing whitespace
- Unix line endings (`\n`)
- No BOM
- UTF-8 encoding

### 2. Frozen Timestamps

All timestamps must be provided externally or use `DeterministicTime`:

```csharp
// ❌ BAD - Non-deterministic
var timestamp = DateTimeOffset.UtcNow;

// ✅ GOOD - Deterministic
var timestamp = frozenTime; // Passed as parameter
```

### 3. Deterministic IDs

UUIDs and IDs must be derived from content, not generated randomly:

```csharp
// ❌ BAD - Random UUID
var id = Guid.NewGuid();

// ✅ GOOD - Content-derived ID
var seed = $"{input.Name}:{input.Version}:{timestamp:O}";
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(seed));
// The first 32 hex characters decode to the 16 bytes a Guid requires
var id = new Guid(Convert.FromHexString(hash[..32]));
```

### 4. Stable Ordering

Collections must be sorted before serialization:

```csharp
// ❌ BAD - Non-deterministic order
var items = dictionary.Values;

// ✅ GOOD - Sorted order
var items = dictionary.Values
    .OrderBy(v => v.Key, StringComparer.Ordinal);
```

### 5. Parallel Safety

Determinism must hold under parallel execution:

```csharp
// Generate the same artifact 20 times concurrently
var tasks = Enumerable.Range(0, 20)
    .Select(_ => Task.Run(() => GenerateArtifact(input, frozenTime)))
    .ToArray();

var results = await Task.WhenAll(tasks);
results.Should().AllBe(results[0]); // All identical
```

## CI Integration

### PR Merge Gate

The determinism gate runs when a pull request is updated or marked ready for review:

```yaml
# .gitea/workflows/determinism-gate.yaml
name: Determinism Gate

on:
  pull_request:
    types: [synchronize, ready_for_review]

jobs:
  determinism:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'

      - name: Run Determinism Tests
        run: |
          dotnet test tests/integration/StellaOps.Integration.Determinism \
            --logger "trx;LogFileName=determinism.trx"

      - name: Generate Determinism Manifest
        run: |
          dotnet run --project tools/DeterminismManifestGenerator \
            --output determinism.json

      - name: Upload Determinism Artifact
        uses: actions/upload-artifact@v4
        with:
          name: determinism-manifest
          path: determinism.json
```

### Baseline Storage

Determinism baselines are stored as CI artifacts:

```
ci-artifacts/
  determinism/
    baseline/
      sbom-spdx-3.0.1.json
      sbom-cyclonedx-1.6.json
      sbom-cyclonedx-1.7.json
      vex-openvex.json
      vex-csaf.json
      policy-verdict.json
      evidence-bundle.json
      airgap-bundle.json
```

### Drift Detection

When a PR changes artifact output:

1. CI compares the new manifest hash against the baseline
2. If they differ, CI fails with a diff report
3. The developer must either:
   - Fix the regression (restore determinism)
   - Update the baseline (if the change is intentional)

### Baseline Update Process

To intentionally update a baseline:

```bash
# 1. Run determinism tests to generate new manifests
dotnet test tests/integration/StellaOps.Integration.Determinism

# 2. Update baseline files
cp determinism/*.json ci-artifacts/determinism/baseline/

# 3. Commit with explicit message
git add ci-artifacts/determinism/baseline/
git commit -m "chore(determinism): update baselines for [reason]

Breaking: [explain what changed]
Justification: [explain why this is correct]"
```

## Replay Verification

To verify an artifact was produced deterministically:

```bash
# 1. Get the manifest
curl -O https://releases.stellaops.io/v1.0.0/sbom.determinism.json

# 2. Get the artifact
curl -O https://releases.stellaops.io/v1.0.0/sbom.cdx.json

# 3. Verify
dotnet run --project tools/DeterminismVerifier \
  --manifest sbom.determinism.json \
  --artifact sbom.cdx.json
```

Output:

```
Determinism Verification
========================
Artifact: sbom.cdx.json
Manifest: sbom.determinism.json
Expected Hash: abc123...
Actual Hash: abc123...
Status: ✅ VERIFIED
```

## Test Files Reference

All determinism tests are in `tests/integration/StellaOps.Integration.Determinism/`:

| File | Tests | Description |
|------|-------|-------------|
| `DeterminismValidationTests.cs` | 16 | Manifest format and reader/writer |
| `SbomDeterminismTests.cs` | 14 | SPDX 3.0.1, CycloneDX 1.6/1.7 |
| `VexDeterminismTests.cs` | 17 | OpenVEX, CSAF 2.0 |
| `PolicyDeterminismTests.cs` | 18 | Policy verdict artifacts |
| `EvidenceBundleDeterminismTests.cs` | 15 | DSSE, in-toto attestations |
| `AirGapBundleDeterminismTests.cs` | 14 | NDJSON bundles, manifests |
| `IngestionDeterminismTests.cs` | 17 | NVD/OSV/GHSA/CSAF normalization |

## Troubleshooting

### Hash Mismatch

If you see a hash mismatch:

1. **Check timestamps**: Ensure frozen time is used
2. **Check ordering**: Ensure all collections are sorted
3. **Check IDs**: Ensure IDs are content-derived
4. **Check encoding**: Ensure UTF-8 without BOM

### Flaky Tests

If determinism tests are flaky:

1. **Check parallelism**: Ensure no shared mutable state
2. **Check time zones**: Use UTC explicitly
3. **Check random sources**: Remove all random number generation
4. **Check hash inputs**: Ensure all inputs are captured

### CI Failures

If the CI determinism gate fails:

1. Compare the diff between expected and actual output
2. Identify which field changed
3. Trace back to the code change that caused it
4. Either fix the regression or update the baseline with justification

## Related Documentation

- [Testing Strategy Models](testing-strategy-models.md) - Overview of testing models
- [Canonical JSON Specification](../11_DATA_SCHEMAS.md#canonical-json) - JSON serialization rules
- [CI/CD Workflows](../modules/devops/architecture.md) - CI pipeline details
- [Evidence Bundle Schema](../modules/evidence-locker/architecture.md) - Bundle format reference
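
When working through the hash-mismatch checks in Troubleshooting, it can help to inspect an artifact's raw bytes before reaching for the .NET tooling. The following is a minimal sketch, not part of StellaOps: the function names `sha256_hex`, `verify_hash`, and `check_canonical_bytes` are hypothetical, GNU coreutils and GNU grep are assumed (as on the `ubuntu-latest` runner), and it assumes the manifest's `canonicalHash.value` covers the artifact file's bytes exactly as stored.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for manual drift triage; not part of StellaOps tooling.

# Print the lowercase hex SHA-256 of a file's raw bytes.
sha256_hex() {
  sha256sum "$1" | cut -d' ' -f1
}

# Compare a file's hash with an expected lowercase hex value.
# Returns 0 on match, non-zero on mismatch.
verify_hash() {
  local expected="$1" file="$2" actual
  actual="$(sha256_hex "$file")"
  echo "Expected Hash: $expected"
  echo "Actual Hash:   $actual"
  [ "$actual" = "$expected" ]
}

# Report byte-level violations of the canonical JSON rules:
# no BOM, Unix line endings, no trailing whitespace.
# Returns 0 when none are found, 1 otherwise.
check_canonical_bytes() {
  local file="$1" problems=0
  # A UTF-8 BOM is the byte sequence EF BB BF at offset 0.
  if [ "$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]; then
    echo "BOM present"
    problems=1
  fi
  # Any carriage return indicates non-Unix line endings.
  if grep -q $'\r' "$file"; then
    echo "CR line endings present"
    problems=1
  fi
  # Spaces or tabs directly before a line end are trailing whitespace.
  if grep -qE $'[ \t]+$' "$file"; then
    echo "Trailing whitespace present"
    problems=1
  fi
  return "$problems"
}
```

For example, `check_canonical_bytes sbom.cdx.json` followed by `verify_hash "<canonicalHash.value from the manifest>" sbom.cdx.json` narrows a mismatch down to either an encoding problem or a genuine content change. These checks do not cover key ordering, timestamps, or ID derivation, which still need to be reviewed in the generating code.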