# Determinism Developer Guide

## Overview

This guide helps developers add new determinism tests to StellaOps. Deterministic behavior is critical for:
- Reproducible verdicts
- Auditable evidence chains
- Cryptographic verification
- Cross-platform consistency

## Table of Contents

1. [Core Principles](#core-principles)
2. [Test Structure](#test-structure)
3. [Common Patterns](#common-patterns)
4. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)
5. [Adding New Tests](#adding-new-tests)
6. [Cross-Platform Considerations](#cross-platform-considerations)
7. [Performance Guidelines](#performance-guidelines)
8. [Troubleshooting](#troubleshooting)

## Core Principles

### 1. Determinism Guarantee

**Definition**: Same inputs always produce identical outputs, regardless of:
- Platform (Windows, macOS, Linux, Alpine, Debian)
- Runtime (.NET version, JIT compiler)
- Execution order (parallel vs sequential)
- Time of day
- System locale

### 2. Golden File Philosophy

**Golden files** are baseline reference values that lock in correct behavior:
- Established after careful verification
- Never changed without an ADR and a migration plan
- Verified on all platforms before acceptance

### 3. Test Independence

Each test must:
- Not depend on other tests' execution or order
- Clean up resources after completion
- Use isolated data (no shared state)

## Test Structure

### Standard Test Template

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public async Task Feature_Behavior_ExpectedOutcome()
{
    // Arrange - Create deterministic inputs
    var input = CreateDeterministicInput();

    // Act - Execute feature
    var output1 = await ExecuteFeature(input);
    var output2 = await ExecuteFeature(input);

    // Assert - Verify determinism
    output1.Should().Be(output2, "same input should produce identical output");
}
```

### Test Organization

```
src/__Tests/Determinism/
├── CgsDeterminismTests.cs          # CGS hash tests
├── LineageDeterminismTests.cs      # SBOM lineage tests
├── VexDeterminismTests.cs          # VEX consensus tests (future)
├── README.md                       # Test documentation
└── Fixtures/                       # Test data
    ├── known-evidence-pack.json
    ├── known-policy-lock.json
    └── golden-hashes/
        └── cgs-v1.txt
```

## Common Patterns

### Pattern 1: 10-Iteration Stability Test

**Purpose**: Verify that executing the same operation 10 times produces identical results.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_SameInput_ProducesIdenticalOutput_Across10Iterations()
{
    // Arrange
    var input = CreateDeterministicInput();
    var service = CreateService();
    var outputs = new List<string>();

    // Act - Execute 10 times
    for (int i = 0; i < 10; i++)
    {
        var result = await service.ProcessAsync(input, CancellationToken.None);
        outputs.Add(result.Hash);
        _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
    }

    // Assert - All hashes should be identical
    outputs.Distinct().Should().HaveCount(1,
        "same input should produce identical output across all iterations");
}
```

**Why 10 iterations?**
- Catches common sources of non-determinism (e.g., GUID generation, random values)
- Keeps execution time low (see [Performance Guidelines](#performance-guidelines) for targets)
- A practical convention, not a proof: rare, timing-dependent failures may need more iterations to surface
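
The loop in Pattern 1 can be factored into a small reusable helper when several tests need it. The following is a minimal sketch, not an existing StellaOps API: `DeterminismProbe` and `CollectDistinctAsync` are hypothetical names introduced here for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper (not part of StellaOps.TestKit): runs an operation
// repeatedly and collects the distinct outputs it produced.
public static class DeterminismProbe
{
    public static async Task<IReadOnlyCollection<string>> CollectDistinctAsync(
        Func<Task<string>> operation, int iterations = 10)
    {
        if (iterations < 2)
        {
            throw new ArgumentOutOfRangeException(
                nameof(iterations), "need at least two runs to compare");
        }

        // Ordinal comparison: outputs must match character for character.
        var distinct = new HashSet<string>(StringComparer.Ordinal);
        for (var i = 0; i < iterations; i++)
        {
            distinct.Add(await operation());
        }

        return distinct;
    }
}
```

A test would then assert that the returned collection has exactly one element, and could raise `iterations` for code paths that involve concurrency.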

### Pattern 2: Golden File Test

**Purpose**: Verify output matches a known-good baseline value.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Golden)]
public async Task Feature_WithKnownInput_MatchesGoldenHash()
{
    // Arrange
    var input = CreateKnownInput(); // MUST be completely deterministic
    var service = CreateService();

    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    var goldenHash = "sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3";

    _output.WriteLine($"Computed Hash: {result.Hash}");
    _output.WriteLine($"Golden Hash: {goldenHash}");

    result.Hash.Should().Be(goldenHash, "hash must match golden file");
}
```

**Golden file best practices:**
- Document how the golden value was established (date, platform, .NET version)
- Include the golden value directly in test code (not in an external file) for visibility
- Add a comment explaining what the golden value represents
- Test the golden value on all platforms before merging
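
One way to keep a golden value and its provenance together in code is a small record, sketched below. `GoldenHash` and `GoldenHashes` are hypothetical names, and the provenance strings are deliberately left as placeholders to be filled in when a value is actually established.

```csharp
using System;
using System.Linq;

// Hypothetical helper type; illustrative only, not an existing StellaOps API.
public sealed record GoldenHash(
    string Value, string EstablishedOn, string Platform, string DotNetVersion)
{
    // True when Value is "sha256:" followed by 64 lowercase hex characters.
    public bool IsWellFormed()
    {
        const string prefix = "sha256:";
        if (!Value.StartsWith(prefix, StringComparison.Ordinal)
            || Value.Length != prefix.Length + 64)
        {
            return false;
        }

        return Value.Skip(prefix.Length)
            .All(c => (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'));
    }
}

public static class GoldenHashes
{
    // Golden value from the Pattern 2 example. The provenance fields are
    // placeholders, not real data.
    public static readonly GoldenHash Example = new(
        Value: "sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
        EstablishedOn: "<date established>",
        Platform: "<platform used>",
        DotNetVersion: "<.NET version used>");
}
```

Checking `IsWellFormed()` before comparing catches copy-paste damage (a truncated or upper-cased hash) with a clearer failure than a plain mismatch.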

### Pattern 3: Order Independence Test

**Purpose**: Verify that input ordering doesn't affect output.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_InputOrder_DoesNotAffectOutput()
{
    // Arrange
    var item1 = CreateItem("A");
    var item2 = CreateItem("B");
    var item3 = CreateItem("C");

    var service = CreateService();

    // Act - Process items in different orders
    var result1 = await service.ProcessAsync(new[] { item1, item2, item3 }, CancellationToken.None);
    var result2 = await service.ProcessAsync(new[] { item3, item1, item2 }, CancellationToken.None);
    var result3 = await service.ProcessAsync(new[] { item2, item3, item1 }, CancellationToken.None);

    // Assert - All should produce same hash
    result1.Hash.Should().Be(result2.Hash, "input order should not affect output");
    result1.Hash.Should().Be(result3.Hash, "input order should not affect output");

    _output.WriteLine($"Order-independent hash: {result1.Hash}");
}
```

**When to use:**
- Collections that should be sorted internally (VEX documents, rules, dependencies)
- APIs that accept unordered inputs (dictionary keys, sets)
- Parallel processing where order is undefined

### Pattern 4: Deterministic Timestamp Test

**Purpose**: Verify that fixed timestamps produce deterministic results.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_WithFixedTimestamp_IsDeterministic()
{
    // Arrange - Use FIXED timestamp (not DateTimeOffset.Now!)
    var timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z");
    var input = CreateInputWithTimestamp(timestamp);
    var service = CreateService();

    // Act
    var result1 = await service.ProcessAsync(input, CancellationToken.None);
    var result2 = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    result1.Hash.Should().Be(result2.Hash, "fixed timestamp should produce deterministic output");
}
```

**Timestamp guidelines:**
- ❌ **Never use**: `DateTimeOffset.Now`, `DateTime.UtcNow`, `Guid.NewGuid()`
- ✅ **Always use**: `DateTimeOffset.Parse("2025-01-01T00:00:00Z")` for tests
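
When the code under test reads the clock itself rather than receiving a timestamp as input, a fixed clock can be injected instead. The sketch below assumes .NET 8 or later (where `System.TimeProvider` exists) and assumes the service accepts a `TimeProvider`; `FixedTimeProvider` and `TestClock` are hypothetical names.

```csharp
using System;

// Minimal sketch: a clock that always reports the same instant.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _utcNow;

    public FixedTimeProvider(DateTimeOffset utcNow) => _utcNow = utcNow;

    public override DateTimeOffset GetUtcNow() => _utcNow;
}

public static class TestClock
{
    // Built from components, so no string parsing is involved.
    public static readonly DateTimeOffset Epoch =
        new DateTimeOffset(2025, 1, 1, 0, 0, 0, TimeSpan.Zero);

    public static TimeProvider Fixed() => new FixedTimeProvider(Epoch);
}
```

Every call to `TestClock.Fixed().GetUtcNow()` returns the same instant as the `2025-01-01T00:00:00Z` value used in Pattern 4.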

### Pattern 5: Empty/Minimal Input Test

**Purpose**: Verify that minimal or empty inputs don't cause non-determinism.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_EmptyInput_ProducesDeterministicHash()
{
    // Arrange - Minimal input
    var input = CreateEmptyInput();
    var service = CreateService();

    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);

    // Assert - Verify format (hash may not be golden yet)
    result.Hash.Should().StartWith("sha256:");
    result.Hash.Length.Should().Be(71); // "sha256:" + 64 hex chars

    _output.WriteLine($"Empty input hash: {result.Hash}");
}
```

**Edge cases to test:**
- Empty collections (`Array.Empty<string>()`)
- Null optional fields
- Zero-length strings
- Default values

## Anti-Patterns to Avoid

### ❌ Anti-Pattern 1: Using Current Time

```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Timestamp = DateTimeOffset.Now // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z") // ✅ Same every run
};
```

### ❌ Anti-Pattern 2: Using Random Values

```csharp
// BAD - Non-deterministic!
var random = new Random();
var input = new Input
{
    Id = random.Next() // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = 12345 // ✅ Same every run
};
```

### ❌ Anti-Pattern 3: Using GUID Generation

```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Id = Guid.NewGuid().ToString() // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = "00000000-0000-0000-0000-000000000001" // ✅ Same every run
};
```

### ❌ Anti-Pattern 4: Using Unordered Collections

```csharp
// BAD - Dictionary iteration order is NOT guaranteed!
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};

foreach (var kvp in dict) // ❌ Order may vary!
{
    hash.Update(kvp.Key);
}
```

**Fix:**
```csharp
// GOOD - Explicit ordering
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};

foreach (var kvp in dict.OrderBy(x => x.Key, StringComparer.Ordinal)) // ✅ Consistent order
{
    hash.Update(kvp.Key);
}
```
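
The `hash.Update` calls above are schematic. A runnable version of the same idea might look like the sketch below, which uses `IncrementalHash` from `System.Security.Cryptography`. The `CanonicalDictionaryHasher` name, the zero-byte separator, and the choice to hash values as well as keys are assumptions made for illustration, not the StellaOps implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class CanonicalDictionaryHasher
{
    public static string ComputeHash(IReadOnlyDictionary<string, string> values)
    {
        using var hash = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);

        // Ordinal key order makes the result independent of insertion order.
        foreach (var kvp in values.OrderBy(x => x.Key, StringComparer.Ordinal))
        {
            Append(hash, kvp.Key);
            Append(hash, kvp.Value);
        }

        return "sha256:" + Convert.ToHexString(hash.GetHashAndReset()).ToLowerInvariant();
    }

    private static void Append(IncrementalHash hash, string text)
    {
        hash.AppendData(Encoding.UTF8.GetBytes(text));
        // Separator byte so that ("ab", "c") and ("a", "bc") hash differently.
        hash.AppendData(new byte[] { 0 });
    }
}
```

Two dictionaries holding the same pairs produce the same hash regardless of the order in which the pairs were added.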

### ❌ Anti-Pattern 5: Platform-Specific Paths

```csharp
// BAD - Platform-specific!
var path = "dir\\file.txt"; // ❌ Windows-only!
```

**Fix:**
```csharp
// GOOD - Cross-platform file access
var path = Path.Combine("dir", "file.txt"); // ✅ Valid on every platform
// The separator still differs per platform, so normalize before hashing
var hashedPath = path.Replace('\\', '/'); // ✅ Same string everywhere
```

### ❌ Anti-Pattern 6: Culture-Dependent Formatting

```csharp
// BAD - Culture-dependent!
var formatted = value.ToString(); // ❌ Locale-specific!
```

**Fix:**
```csharp
// GOOD - Culture-invariant
var formatted = value.ToString(CultureInfo.InvariantCulture); // ✅ Same everywhere
```
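
The same rule applies to dates as well as numbers. As a minimal sketch (the `InvariantFormat` class is a hypothetical name), both can be routed through helpers that always pass `CultureInfo.InvariantCulture`:

```csharp
using System;
using System.Globalization;

public static class InvariantFormat
{
    // Decimal separator is always '.', whatever the current culture is.
    public static string Number(decimal value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // "O" is the round-trip (ISO 8601) format specifier.
    public static string Timestamp(DateTimeOffset value) =>
        value.ToString("O", CultureInfo.InvariantCulture);
}
```

For example, `InvariantFormat.Number(1234.5m)` returns `"1234.5"`, and formatting midnight UTC on 2025-01-01 returns `"2025-01-01T00:00:00.0000000+00:00"`.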

## Adding New Tests

### Step 1: Identify Determinism Requirement

**Ask yourself:**
- Does this feature produce a hash, signature, or cryptographic output?
- Will this feature's output be stored and verified later?
- Does this feature need to be reproducible across platforms?
- Is this feature part of an audit trail?

If **YES** to any → add a determinism test.

### Step 2: Create Test File

```bash
cd src/__Tests/Determinism
touch MyFeatureDeterminismTests.cs
```

### Step 3: Write Test Class

```csharp
using FluentAssertions;
using Microsoft.Extensions.Logging.Abstractions; // NullLogger<T>
using StellaOps.TestKit;
using Xunit;
using Xunit.Abstractions;

namespace StellaOps.Tests.Determinism;

/// <summary>
/// Determinism tests for [Feature Name].
/// Verifies that [specific behavior] is deterministic across platforms and runs.
/// </summary>
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public sealed class MyFeatureDeterminismTests
{
    private readonly ITestOutputHelper _output;

    public MyFeatureDeterminismTests(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public async Task MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations()
    {
        // Arrange
        var input = CreateDeterministicInput();
        var service = CreateMyFeatureService();
        var outputs = new List<string>();

        // Act - Execute 10 times
        for (int i = 0; i < 10; i++)
        {
            var result = await service.ProcessAsync(input, CancellationToken.None);
            outputs.Add(result.Hash);
            _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
        }

        // Assert - All hashes should be identical
        outputs.Distinct().Should().HaveCount(1,
            "same input should produce identical output across all iterations");
    }

    #region Helper Methods

    private static MyInput CreateDeterministicInput()
    {
        return new MyInput
        {
            // ✅ Use fixed values
            Id = "test-001",
            Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z"),
            Data = new[] { "item1", "item2", "item3" }
        };
    }

    private static MyFeatureService CreateMyFeatureService()
    {
        return new MyFeatureService(NullLogger<MyFeatureService>.Instance);
    }

    #endregion
}
```

### Step 4: Run Test Locally 10 Times

```bash
for i in {1..10}; do
  echo "=== Run $i ==="
  dotnet test --filter "FullyQualifiedName~MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations"
done
```

**Expected:** All 10 runs pass with identical output.

### Step 5: Add to CI/CD

The test is picked up automatically once pushed (no configuration needed).

The CI/CD workflow `.gitea/workflows/cross-platform-determinism.yml` runs all `Category=Determinism` tests on all five platforms.

### Step 6: Document in README

Update `src/__Tests/Determinism/README.md`:

```markdown
### MyFeature Determinism

Tests that verify [feature] hash computation is deterministic:

- **10-Iteration Stability**: Same input produces identical hash 10 times
- **Order Independence**: Input ordering doesn't affect hash
- **Empty Input**: Minimal input produces deterministic hash
```

## Cross-Platform Considerations

### Platform Matrix

Tests run on:
- **Windows** (windows-latest): Microsoft C runtime (UCRT), CRLF line endings
- **macOS** (macos-latest): BSD libc, LF line endings
- **Linux Ubuntu** (ubuntu-latest): glibc, LF line endings
- **Linux Alpine** (Alpine Docker): musl libc, LF line endings
- **Linux Debian** (Debian Docker): glibc, LF line endings

### Common Cross-Platform Issues

#### Issue 1: String Sorting (musl vs glibc)

**Symptom**: Alpine produces different hash than Ubuntu.

**Cause**: Culture-sensitive string comparison depends on the platform's globalization data, which commonly differs between musl-based Alpine images and glibc-based distributions (for example, a different ICU version or invariant-globalization mode).

**Solution**: Always use `StringComparer.Ordinal` for sorting:

```csharp
// ❌ Wrong - Culture-sensitive, platform-dependent sorting
items.Sort();

// ✅ Correct - Ordinal (culture-independent) sorting
items = items.OrderBy(x => x, StringComparer.Ordinal).ToList();
```

#### Issue 2: Path Separators

**Symptom**: Windows produces different hash than macOS/Linux.

**Cause**: Windows uses `\`, Unix uses `/`.

**Solution**: Use `Path.Combine` for file access, and normalize separators before a path contributes to a hash:

```csharp
// ❌ Wrong - Hardcoded separator
var windowsOnlyPath = "dir\\file.txt";

// ✅ Correct for file access - Uses the platform separator
var path = Path.Combine("dir", "file.txt");

// ✅ Required before hashing - Normalize to forward slash
var normalizedPath = path.Replace('\\', '/');
```

#### Issue 3: Line Endings

**Symptom**: Hashes of file content differ because the files have different line endings.

**Cause**: Windows uses CRLF (`\r\n`), Unix uses LF (`\n`).

**Solution**: Normalize to LF:

```csharp
// ❌ Wrong - Platform line endings
var raw = File.ReadAllText(path);

// ✅ Correct - Normalized to LF
var content = File.ReadAllText(path).Replace("\r\n", "\n");
```
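
Putting normalization and hashing in one place makes it harder to hash un-normalized text by accident. The sketch below is illustrative: `TextContentHasher` is a hypothetical name, and handling a lone `\r` goes slightly beyond the one-line fix above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class TextContentHasher
{
    // Converts CRLF and lone CR to LF.
    public static string NormalizeLineEndings(string text) =>
        text.Replace("\r\n", "\n").Replace('\r', '\n');

    public static string ComputeHash(string text)
    {
        // GetBytes never emits a byte-order mark, so the bytes depend only on the text.
        var bytes = Encoding.UTF8.GetBytes(NormalizeLineEndings(text));
        return "sha256:" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
    }
}
```

With this helper, `ComputeHash("a\r\nb\r\n")` and `ComputeHash("a\nb\n")` return the same value.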

#### Issue 4: Floating-Point Precision

**Symptom**: Different platforms produce slightly different floating-point values.

**Cause**: JIT compiler optimizations, FPU rounding modes, and platform math library differences. In addition, binary floating point cannot represent values such as 0.1 exactly.

**Solution**: Use `decimal` for exact arithmetic, or round explicitly:

```csharp
// ❌ Wrong - Binary floating point is inexact
var sum = 0.1 + 0.2; // 0.30000000000000004, not 0.3

// ✅ Correct - Decimal for exact values
var exact = 0.1m + 0.2m; // Always 0.3

// ✅ Alternative - Round explicitly
var rounded = Math.Round(0.1 + 0.2, 2); // 0.3
```
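
When a `double` has to contribute to a hash, rounding and formatting can be combined so that the value has a single textual spelling. This is a sketch under assumptions: `CanonicalNumber` is a hypothetical name and six decimal places is an arbitrary choice. Rounding narrows the problem but does not remove it, since two values on opposite sides of a rounding boundary still format differently, so `decimal` remains the safer option.

```csharp
using System;
using System.Globalization;

public static class CanonicalNumber
{
    // Rounds to a fixed number of decimal places and formats with the
    // invariant culture.
    public static string Format(double value, int decimals = 6)
    {
        var rounded = Math.Round(value, decimals, MidpointRounding.AwayFromZero);
        var specifier = "F" + decimals.ToString(CultureInfo.InvariantCulture);
        return rounded.ToString(specifier, CultureInfo.InvariantCulture);
    }
}
```

Both `CanonicalNumber.Format(0.1 + 0.2)` and `CanonicalNumber.Format(0.3)` return `"0.300000"`.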

## Performance Guidelines

### Execution Time Targets

| Test Type | Target | Max |
|-----------|--------|-----|
| Single iteration | <100ms | <500ms |
| 10-iteration stability | <1s | <3s |
| Golden file test | <100ms | <500ms |
| **Full test suite** | **<5s** | **<15s** |

### Optimization Tips

1. **Avoid unnecessary I/O**: Create test data in memory
2. **Use `Task.CompletedTask`**: Return it from synchronous operations instead of starting a new task
3. **Minimize allocations**: Reuse test data across assertions
4. **Parallel test execution**: xUnit runs test collections (by default, one per class) in parallel; tests within a class run sequentially

### Performance Regression Detection

If test execution time more than doubles:
1. Profile with `dotnet-trace` or BenchmarkDotNet
2. Identify the bottleneck (I/O, CPU, memory)
3. Optimize or split into separate tests
4. Document performance expectations in test comments

## Troubleshooting

### Problem: Test Passes 9/10 Times, Fails 1/10

**Cause**: Non-deterministic input or race condition.

**Debug Steps:**
1. Add logging to each iteration:
   ```csharp
   _output.WriteLine($"Iteration {i}: Input={JsonSerializer.Serialize(input)}, Output={output}");
   ```
2. Look for differences in input or output
3. Check for `Guid.NewGuid()`, `Random`, `DateTimeOffset.Now`
4. Check for unsynchronized parallel operations
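
For step 2, long serialized outputs are hard to compare by eye. A small helper that reports where two strings first diverge can narrow the search; `OutputDiff` below is a hypothetical name and a minimal sketch.

```csharp
using System;

public static class OutputDiff
{
    // Returns the index of the first differing character, or -1 if the
    // strings are identical. A length mismatch counts as a difference
    // at the end of the shorter string.
    public static int FirstDifference(string expected, string actual)
    {
        var shared = Math.Min(expected.Length, actual.Length);
        for (var i = 0; i < shared; i++)
        {
            if (expected[i] != actual[i])
            {
                return i;
            }
        }

        return expected.Length == actual.Length ? -1 : shared;
    }
}
```

Logging the surrounding characters at the returned index from a passing and a failing iteration usually points at the field that changed.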

### Problem: Test Fails on Alpine but Passes Elsewhere

**Cause**: A platform difference between the musl-based Alpine image and glibc-based platforms, typically in string comparison or formatting.

**Debug Steps:**
1. Run the test locally with Alpine Docker:
   ```bash
   docker run -it --rm -v "$(pwd)":/app mcr.microsoft.com/dotnet/sdk:10.0-alpine sh
   cd /app
   dotnet test --filter "FullyQualifiedName~MyTest"
   ```
2. Compare the output with the output from a glibc-based platform
3. Check for string sorting and culture-dependent formatting
4. Use `StringComparer.Ordinal` and `CultureInfo.InvariantCulture`

### Problem: Golden Hash Changes After .NET Upgrade

**Cause**: .NET runtime change in JSON serialization or hash algorithm.

**Debug Steps:**
1. Compare .NET versions:
   ```bash
   dotnet --version # Should match the version used in CI/CD
   ```
2. Check `JsonSerializer` behavior:
   ```csharp
   var json1 = JsonSerializer.Serialize(input, options);
   var json2 = JsonSerializer.Serialize(input, options);
   json1.Should().Be(json2);
   ```
3. If the .NET change is intentional, follow the [Breaking Change Process](./GOLDEN_FILE_ESTABLISHMENT_GUIDE.md#breaking-change-process)

## References

- **Test README**: `src/__Tests/Determinism/README.md`
- **Golden File Guide**: `docs/implplan/archived/2025-12-29-completed-sprints/GOLDEN_FILE_ESTABLISHMENT_GUIDE.md`
- **ADR 0042**: CGS Merkle Tree Implementation
- **ADR 0043**: Fulcio Keyless Signing
- **CI/CD Workflow**: `.gitea/workflows/cross-platform-determinism.yml`

## Getting Help

- **Slack**: #determinism-testing
- **Issue Labels**: `determinism`, `testing`
- **Priority**: High (determinism bugs affect audit trails)