Determinism Developer Guide
Overview
This guide helps developers add new determinism tests to StellaOps. Deterministic behavior is critical for:
- Reproducible verdicts
- Auditable evidence chains
- Cryptographic verification
- Cross-platform consistency
Table of Contents
- Core Principles
- Test Structure
- Common Patterns
- Anti-Patterns to Avoid
- Adding New Tests
- Cross-Platform Considerations
- Performance Guidelines
- Troubleshooting
Core Principles
1. Determinism Guarantee
Definition: Same inputs always produce identical outputs, regardless of:
- Platform (Windows, macOS, Linux, Alpine, Debian)
- Runtime (.NET version, JIT compiler)
- Execution order (parallel vs sequential)
- Time of day
- System locale
2. Golden File Philosophy
Golden files are baseline reference values that lock in correct behavior:
- Established after careful verification
- Never changed without an ADR and a migration plan
- Verified on all platforms before acceptance
3. Test Independence
Each test must:
- Not depend on other tests' execution or order
- Clean up resources after completion
- Use isolated data (no shared state)
Test Structure
Standard Test Template
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public async Task Feature_Behavior_ExpectedOutcome()
{
    // Arrange - Create deterministic inputs
    var input = CreateDeterministicInput();
    // Act - Execute feature
    var output1 = await ExecuteFeature(input);
    var output2 = await ExecuteFeature(input);
    // Assert - Verify determinism
    output1.Should().Be(output2, "same input should produce identical output");
}
Test Organization
src/__Tests/Determinism/
├── CgsDeterminismTests.cs        # CGS hash tests
├── LineageDeterminismTests.cs    # SBOM lineage tests
├── VexDeterminismTests.cs        # VEX consensus tests (future)
├── README.md                     # Test documentation
└── Fixtures/                     # Test data
    ├── known-evidence-pack.json
    ├── known-policy-lock.json
    └── golden-hashes/
        └── cgs-v1.txt
Common Patterns
Pattern 1: 10-Iteration Stability Test
Purpose: Verify that executing the same operation 10 times produces identical results.
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_SameInput_ProducesIdenticalOutput_Across10Iterations()
{
    // Arrange
    var input = CreateDeterministicInput();
    var service = CreateService();
    var outputs = new List<string>();
    // Act - Execute 10 times
    for (int i = 0; i < 10; i++)
    {
        var result = await service.ProcessAsync(input, CancellationToken.None);
        outputs.Add(result.Hash);
        _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
    }
    // Assert - All hashes should be identical
    outputs.Distinct().Should().HaveCount(1,
        "same input should produce identical output across all iterations");
}
Why 10 iterations?
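- Catches in-process non-determinism (e.g., GUID generation, random values, current-time reads)
- Keeps execution time reasonable (see Execution Time Targets under Performance Guidelines)
- A pragmatic repeat count, not a formal guarantee; it complements the cross-platform runs rather than replacing them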
Pattern 2: Golden File Test
Purpose: Verify output matches a known-good baseline value.
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Golden)]
public async Task Feature_WithKnownInput_MatchesGoldenHash()
{
    // Arrange
    var input = CreateKnownInput(); // MUST be completely deterministic
    var service = CreateService();
    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);
    // Assert
    var goldenHash = "sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3";
    _output.WriteLine($"Computed Hash: {result.Hash}");
    _output.WriteLine($"Golden Hash: {goldenHash}");
    result.Hash.Should().Be(goldenHash, "hash must match golden file");
}
Golden file best practices:
- Document how golden value was established (date, platform, .NET version)
- Include golden value directly in test code (not external file) for visibility
- Add comment explaining what golden value represents
- Test golden value on all platforms before merging
Pattern 3: Order Independence Test
Purpose: Verify that input ordering doesn't affect output.
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_InputOrder_DoesNotAffectOutput()
{
    // Arrange
    var item1 = CreateItem("A");
    var item2 = CreateItem("B");
    var item3 = CreateItem("C");
    var service = CreateService();
    // Act - Process items in different orders
    var result1 = await service.ProcessAsync(new[] { item1, item2, item3 }, CancellationToken.None);
    var result2 = await service.ProcessAsync(new[] { item3, item1, item2 }, CancellationToken.None);
    var result3 = await service.ProcessAsync(new[] { item2, item3, item1 }, CancellationToken.None);
    // Assert - All should produce same hash
    result1.Hash.Should().Be(result2.Hash, "input order should not affect output");
    result1.Hash.Should().Be(result3.Hash, "input order should not affect output");
    _output.WriteLine($"Order-independent hash: {result1.Hash}");
}
When to use:
- Collections that should be sorted internally (VEX documents, rules, dependencies)
- APIs that accept unordered inputs (dictionary keys, sets)
- Parallel processing where order is undefined
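One way a feature can satisfy this test is to sort its inputs ordinally before hashing. The following is a minimal sketch under that assumption, not the StellaOps implementation: `ComputeOrderIndependentHash` and its length-prefixed encoding are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: sorts items ordinally, then hashes a length-prefixed encoding.
string ComputeOrderIndependentHash(IEnumerable<string> items)
{
    var builder = new StringBuilder();
    foreach (var item in items.OrderBy(x => x, StringComparer.Ordinal))
    {
        // The length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
        builder.Append(item.Length.ToString(CultureInfo.InvariantCulture))
               .Append(':')
               .Append(item)
               .Append('\n');
    }

    var digest = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
    return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
}

var hash1 = ComputeOrderIndependentHash(new[] { "A", "B", "C" });
var hash2 = ComputeOrderIndependentHash(new[] { "C", "A", "B" });
Console.WriteLine(hash1 == hash2); // True
```

Sorting at the hashing boundary means callers do not have to agree on an input order, which is the property the order independence test checks.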
Pattern 4: Deterministic Timestamp Test
Purpose: Verify that fixed timestamps produce deterministic results.
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_WithFixedTimestamp_IsDeterministic()
{
    // Arrange - Use FIXED timestamp (not DateTimeOffset.Now!)
    var timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z");
    var input = CreateInputWithTimestamp(timestamp);
    var service = CreateService();
    // Act
    var result1 = await service.ProcessAsync(input, CancellationToken.None);
    var result2 = await service.ProcessAsync(input, CancellationToken.None);
    // Assert
    result1.Hash.Should().Be(result2.Hash, "fixed timestamp should produce deterministic output");
}
Timestamp guidelines:
- ❌ Never use: DateTimeOffset.Now, DateTime.UtcNow, Guid.NewGuid()
- ✅ Always use: DateTimeOffset.Parse("2025-01-01T00:00:00Z") for tests
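When the feature itself needs the current time, passing the clock in lets tests keep it fixed. The sketch below assumes a hypothetical `StampRecord` function that receives the clock as a `Func<DateTimeOffset>`; the names are illustrative and not part of StellaOps.

```csharp
using System;
using System.Globalization;

// Hypothetical feature code: the clock is a parameter, so nothing reads the system time.
string StampRecord(string id, Func<DateTimeOffset> clock)
{
    // "O" is the round-trip format; the invariant culture keeps it locale-independent.
    return id + "@" + clock().ToString("O", CultureInfo.InvariantCulture);
}

// Test code: a fixed instant instead of DateTimeOffset.Now or DateTime.UtcNow.
var fixedInstant = new DateTimeOffset(2025, 1, 1, 0, 0, 0, TimeSpan.Zero);
Func<DateTimeOffset> fixedClock = () => fixedInstant;

var first = StampRecord("test-001", fixedClock);
var second = StampRecord("test-001", fixedClock);
Console.WriteLine(first == second); // True
```

Production code can pass `() => DateTimeOffset.UtcNow` for the same parameter, so only the tests pin the value.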
Pattern 5: Empty/Minimal Input Test
Purpose: Verify that minimal or empty inputs don't cause non-determinism.
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_EmptyInput_ProducesDeterministicHash()
{
    // Arrange - Minimal input
    var input = CreateEmptyInput();
    var service = CreateService();
    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);
    // Assert - Verify format (hash may not be golden yet)
    result.Hash.Should().StartWith("sha256:");
    result.Hash.Length.Should().Be(71); // "sha256:" + 64 hex chars
    _output.WriteLine($"Empty input hash: {result.Hash}");
}
Edge cases to test:
- Empty collections (Array.Empty<string>())
- Null optional fields
- Zero-length strings
- Default values
Anti-Patterns to Avoid
❌ Anti-Pattern 1: Using Current Time
// BAD - Non-deterministic!
var input = new Input
{
    Timestamp = DateTimeOffset.Now // ❌ Different every run!
};
Fix:
// GOOD - Deterministic
var input = new Input
{
    Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z") // ✅ Same every run
};
❌ Anti-Pattern 2: Using Random Values
// BAD - Non-deterministic!
var random = new Random();
var input = new Input
{
    Id = random.Next() // ❌ Different every run!
};
Fix:
// GOOD - Deterministic
var input = new Input
{
    Id = 12345 // ✅ Same every run
};
❌ Anti-Pattern 3: Using GUID Generation
// BAD - Non-deterministic!
var input = new Input
{
    Id = Guid.NewGuid().ToString() // ❌ Different every run!
};
Fix:
// GOOD - Deterministic
var input = new Input
{
    Id = "00000000-0000-0000-0000-000000000001" // ✅ Same every run
};
❌ Anti-Pattern 4: Using Unordered Collections
// BAD - Dictionary iteration order is NOT guaranteed!
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};
foreach (var kvp in dict) // ❌ Order may vary!
{
    hash.Update(kvp.Key);
}
Fix:
// GOOD - Explicit ordering
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};
foreach (var kvp in dict.OrderBy(x => x.Key, StringComparer.Ordinal)) // ✅ Consistent order
{
    hash.Update(kvp.Key);
}
❌ Anti-Pattern 5: Platform-Specific Paths
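// BAD - Platform-specific!
var path = "dir\\file.txt"; // ❌ Windows-only!
Fix:
// GOOD - Cross-platform for file access
var path = Path.Combine("dir", "file.txt"); // ✅ Resolves on every platform
// Path.Combine still emits the platform's own separator, so normalize before hashing
var hashedPath = path.Replace('\\', '/'); // ✅ Same string everywhere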
❌ Anti-Pattern 6: Culture-Dependent Formatting
// BAD - Culture-dependent!
var formatted = value.ToString(); // ❌ Locale-specific!
Fix:
// GOOD - Culture-invariant
var formatted = value.ToString(CultureInfo.InvariantCulture); // ✅ Same everywhere
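To make the difference concrete, the sketch below formats the same number with a comma-decimal format and with the invariant culture. The `NumberFormatInfo` instance stands in for a locale such as German so that the example does not depend on which cultures are installed on the machine; it is an illustration only.

```csharp
using System;
using System.Globalization;

double value = 1234.5;

// Stand-in for a locale that uses a comma as the decimal separator.
var commaFormat = new NumberFormatInfo { NumberDecimalSeparator = "," };

var localized = value.ToString(commaFormat);
var invariant = value.ToString(CultureInfo.InvariantCulture);

Console.WriteLine(localized); // 1234,5
Console.WriteLine(invariant); // 1234.5

// Parsing must use the same culture that was used for formatting.
var roundTripped = double.Parse(invariant, CultureInfo.InvariantCulture);
Console.WriteLine(roundTripped == value); // True
```

If the localized string were fed into a hash, two machines with different locales would disagree even though the underlying value is identical.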
Adding New Tests
Step 1: Identify Determinism Requirement
Ask yourself:
- Does this feature produce a hash, signature, or cryptographic output?
- Will this feature's output be stored and verified later?
- Does this feature need to be reproducible across platforms?
- Is this feature part of an audit trail?
If YES to any → Add determinism test.
Step 2: Create Test File
cd src/__Tests/Determinism
touch MyFeatureDeterminismTests.cs
Step 3: Write Test Class
using FluentAssertions;
using Microsoft.Extensions.Logging.Abstractions; // NullLogger<T>
using StellaOps.TestKit;
using Xunit;
using Xunit.Abstractions;
namespace StellaOps.Tests.Determinism;
/// <summary>
/// Determinism tests for [Feature Name].
/// Verifies that [specific behavior] is deterministic across platforms and runs.
/// </summary>
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public sealed class MyFeatureDeterminismTests
{
    private readonly ITestOutputHelper _output;
    public MyFeatureDeterminismTests(ITestOutputHelper output)
    {
        _output = output;
    }
    [Fact]
    public async Task MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations()
    {
        // Arrange
        var input = CreateDeterministicInput();
        var service = CreateMyFeatureService();
        var outputs = new List<string>();
        // Act - Execute 10 times
        for (int i = 0; i < 10; i++)
        {
            var result = await service.ProcessAsync(input, CancellationToken.None);
            outputs.Add(result.Hash);
            _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
        }
        // Assert - All hashes should be identical
        outputs.Distinct().Should().HaveCount(1,
            "same input should produce identical output across all iterations");
    }
    #region Helper Methods
    private static MyInput CreateDeterministicInput()
    {
        return new MyInput
        {
            // ✅ Use fixed values
            Id = "test-001",
            Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z"),
            Data = new[] { "item1", "item2", "item3" }
        };
    }
    private static MyFeatureService CreateMyFeatureService()
    {
        return new MyFeatureService(NullLogger<MyFeatureService>.Instance);
    }
    #endregion
}
Step 4: Run Test Locally 10 Times
for i in {1..10}; do
  echo "=== Run $i ==="
  dotnet test --filter "FullyQualifiedName~MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations"
done
Expected: All 10 runs pass with identical output.
Step 5: Add to CI/CD
Test is automatically included when pushed (no configuration needed).
CI/CD workflow .gitea/workflows/cross-platform-determinism.yml runs all Category=Determinism tests on 5 platforms.
Step 6: Document in README
Update src/__Tests/Determinism/README.md:
### MyFeature Determinism
Tests that verify [feature] hash computation is deterministic:
- **10-Iteration Stability**: Same input produces identical hash 10 times
- **Order Independence**: Input ordering doesn't affect hash
- **Empty Input**: Minimal input produces deterministic hash
Cross-Platform Considerations
Platform Matrix
Tests run on:
- Windows (windows-latest): Microsoft C runtime (no glibc), CRLF line endings
- macOS (macos-latest): BSD libc, LF line endings
- Linux Ubuntu (ubuntu-latest): glibc, LF line endings
- Linux Alpine (Alpine Docker): musl libc, LF line endings
- Linux Debian (Debian Docker): glibc, LF line endings
Common Cross-Platform Issues
Issue 1: String Sorting (musl vs glibc)
Symptom: Alpine produces different hash than Ubuntu.
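Cause: Default string comparison in .NET is culture-sensitive and relies on the platform's globalization data (ICU on Linux). Alpine-based images often ship without ICU and fall back to invariant globalization mode, so the default sort order can differ from glibc-based distributions.
Solution: Always use StringComparer.Ordinal for sorting:
// ❌ Wrong - Culture-sensitive default comparer
items.Sort();
// ✅ Correct - Ordinal sorting, independent of culture and platform
items = items.OrderBy(x => x, StringComparer.Ordinal).ToList();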
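Ordinal sorting compares UTF-16 code units, so uppercase ASCII letters sort before lowercase ones on every platform. A short sketch with made-up items:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<string> { "b", "A", "a", "B" };

// Ordinal order follows code-unit values: 'A' (65) < 'B' (66) < 'a' (97) < 'b' (98).
var sorted = items.OrderBy(x => x, StringComparer.Ordinal).ToList();
Console.WriteLine(string.Join(",", sorted)); // A,B,a,b

// In-place variant with the same comparer.
items.Sort(StringComparer.Ordinal);
Console.WriteLine(string.Join(",", items)); // A,B,a,b
```

The result may look unnatural to a human reader, but it is the same byte-for-byte ordering everywhere, which is what a hash input needs.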
Issue 2: Path Separators
Symptom: Windows produces different hash than macOS/Linux.
Cause: Windows uses \, Unix uses /.
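Solution: Use Path.Combine for file access, and normalize separators before a path is hashed or serialized (Path.Combine emits the platform's own separator):
// ❌ Wrong - Hardcoded separator
var path = "dir\\file.txt";
// ✅ Correct for file access - yields dir\file.txt on Windows, dir/file.txt on Unix
var combined = Path.Combine("dir", "file.txt");
// ✅ Correct for hashing - Normalize to forward slash
var normalizedPath = combined.Replace('\\', '/');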
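The normalization step can live in one small helper that is applied wherever a path enters a hash. `NormalizePathForHashing` below is a hypothetical name used for illustration, and the sketch assumes the paths under test never contain a literal backslash in a file name (which is legal on Unix).

```csharp
using System;
using System.IO;

// Hypothetical helper: forces forward slashes so the hashed string is platform-independent.
string NormalizePathForHashing(string path) => path.Replace('\\', '/');

// Path.Combine uses '\' on Windows and '/' on Unix; the helper removes that difference.
var combined = Path.Combine("dir", "file.txt");
var normalized = NormalizePathForHashing(combined);
Console.WriteLine(normalized); // dir/file.txt
```

Keeping the original path for file access and the normalized one for hashing avoids changing how files are opened.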
Issue 3: Line Endings
Symptom: Hash includes file content with different line endings.
Cause: Windows uses CRLF (\r\n), Unix uses LF (\n).
Solution: Normalize to LF:
// ❌ Wrong - Platform line endings
var content = File.ReadAllText(path);
// ✅ Correct - Normalized to LF
var content = File.ReadAllText(path).Replace("\r\n", "\n");
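As a sketch of how this fits into hashing, the helper names below (`NormalizeLineEndings`, `HashText`) are hypothetical. The version here also maps a lone CR to LF, which goes slightly beyond the one-line fix above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: maps CRLF and lone CR to LF.
string NormalizeLineEndings(string text) =>
    text.Replace("\r\n", "\n").Replace('\r', '\n');

// Hypothetical helper: hashes the normalized text as UTF-8.
string HashText(string text)
{
    var bytes = Encoding.UTF8.GetBytes(NormalizeLineEndings(text));
    return "sha256:" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
}

var windowsContent = "line1\r\nline2\r\n";
var unixContent = "line1\nline2\n";
Console.WriteLine(HashText(windowsContent) == HashText(unixContent)); // True
```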
Issue 4: Floating-Point Precision
Symptom: Different platforms produce slightly different floating-point values.
Cause: JIT compiler optimizations, FPU rounding modes.
Solution: Use decimal for exact arithmetic, or round explicitly:
// ❌ Wrong - Binary floating-point cannot represent 0.1 or 0.2 exactly
var sum = 0.1 + 0.2; // 0.30000000000000004, not 0.3
// ✅ Correct - Decimal for exact values
var exact = 0.1m + 0.2m; // Always 0.3
// ✅ Alternative - Round explicitly
var rounded = Math.Round(0.1 + 0.2, 2); // 0.3
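When a floating-point value has to enter a hash, one option is to convert it to a fixed-precision, culture-invariant string first. The sketch below assumes two decimal places are enough for the value in question; the precision is an illustrative choice, not a project rule.

```csharp
using System;
using System.Globalization;

double binarySum = 0.1 + 0.2;
decimal decimalSum = 0.1m + 0.2m;

Console.WriteLine(binarySum == 0.3);   // False
Console.WriteLine(decimalSum == 0.3m); // True

// Fixed precision plus the invariant culture gives a stable string to hash.
var canonical = Math.Round(binarySum, 2).ToString("F2", CultureInfo.InvariantCulture);
Console.WriteLine(canonical); // 0.30
```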
Performance Guidelines
Execution Time Targets
| Test Type | Target | Max |
|---|---|---|
| Single iteration | <100ms | <500ms |
| 10-iteration stability | <1s | <3s |
| Golden file test | <100ms | <500ms |
| Full test suite | <5s | <15s |
Optimization Tips
- Avoid unnecessary I/O: Create test data in memory
- Use Task.CompletedTask: For synchronous operations
- Minimize allocations: Reuse test data across assertions
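- Parallel test execution: xUnit runs test collections in parallel by default (each test class is its own collection), so tests must not share mutable state across classes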
Performance Regression Detection
If test execution time increases by >2x:
- Profile with dotnet-trace or BenchmarkDotNet
- Identify bottleneck (I/O, CPU, memory)
- Optimize or split into separate test
- Document performance expectations in test comments
Troubleshooting
Problem: Test Passes 9/10 Times, Fails 1/10
Cause: Non-deterministic input or race condition.
Debug Steps:
- Add logging to each iteration:
  _output.WriteLine($"Iteration {i}: Input={JsonSerializer.Serialize(input)}, Output={output}");
- Look for differences in input or output
- Check for Guid.NewGuid(), Random, DateTimeOffset.Now
- Check for unsynchronized parallel operations
Problem: Test Fails on Alpine but Passes Elsewhere
Cause: musl libc vs glibc difference.
Debug Steps:
- Run test locally with Alpine Docker:
  docker run -it --rm -v $(pwd):/app mcr.microsoft.com/dotnet/sdk:10.0-alpine sh
  cd /app
  dotnet test --filter "FullyQualifiedName~MyTest"
- Compare output with local (glibc) output
- Check for string sorting, culture-dependent formatting
- Use StringComparer.Ordinal and CultureInfo.InvariantCulture
Problem: Golden Hash Changes After .NET Upgrade
Cause: .NET runtime change in JSON serialization or hash algorithm.
Debug Steps:
- Compare .NET versions:
  dotnet --version # Should be same in CI/CD
- Check JsonSerializer behavior:
  var json1 = JsonSerializer.Serialize(input, options);
  var json2 = JsonSerializer.Serialize(input, options);
  json1.Should().Be(json2);
- If intentional .NET change, follow Breaking Change Process
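Logging the serialized form next to the hash makes it easier to see what changed after an upgrade. The sketch below pins the serializer options explicitly and uses an ordinally sorted dictionary so that key order is fixed; the payload keys and values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Options are pinned explicitly so the test does not rely on defaults.
var options = new JsonSerializerOptions { WriteIndented = false };

// Ordinally sorted keys give a stable property order regardless of insertion order.
var payload = new SortedDictionary<string, string>(StringComparer.Ordinal)
{
    ["policy"] = "lock-v1",
    ["evidence"] = "pack-001",
};

var canonicalJson = JsonSerializer.Serialize(payload, options);
var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonicalJson));
var hash = "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();

Console.WriteLine(canonicalJson); // {"evidence":"pack-001","policy":"lock-v1"}
Console.WriteLine(hash);
```

If the golden hash changes, diffing the logged JSON from the old and new runtime shows whether serialization or the hash step is responsible.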
References
- Test README: src/__Tests/Determinism/README.md
- Golden File Guide: docs/implplan/archived/2025-12-29-completed-sprints/GOLDEN_FILE_ESTABLISHMENT_GUIDE.md
- ADR 0042: CGS Merkle Tree Implementation
- ADR 0043: Fulcio Keyless Signing
- CI/CD Workflow: .gitea/workflows/cross-platform-determinism.yml
Getting Help
- Slack: #determinism-testing
- Issue Label: determinism, testing
- Priority: High (determinism bugs affect audit trails)