# Determinism Developer Guide

## Overview

This guide helps developers add new determinism tests to StellaOps. Deterministic behavior is critical for:
- Reproducible verdicts
- Auditable evidence chains
- Cryptographic verification
- Cross-platform consistency

## Table of Contents

1. [Core Principles](#core-principles)
2. [Test Structure](#test-structure)
3. [Common Patterns](#common-patterns)
4. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)
5. [Adding New Tests](#adding-new-tests)
6. [Cross-Platform Considerations](#cross-platform-considerations)
7. [Performance Guidelines](#performance-guidelines)
8. [Troubleshooting](#troubleshooting)

## Core Principles

### 1. Determinism Guarantee

**Definition**: Same inputs always produce identical outputs, regardless of:
- Platform (Windows, macOS, and Linux distributions such as Ubuntu, Alpine, and Debian)
- Runtime (.NET version, JIT compiler)
- Execution order (parallel vs sequential)
- Time of day
- System locale

### 2. Golden File Philosophy

**Golden files** are baseline reference values that lock in correct behavior:
- Established after careful verification
- Never changed without an ADR and a migration plan
- Verified on all platforms before acceptance

### 3. Test Independence

Each test must:
- Not depend on other tests' execution or order
- Clean up resources after completion
- Use isolated data (no shared state)

## Test Structure

### Standard Test Template

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public async Task Feature_Behavior_ExpectedOutcome()
{
    // Arrange - Create deterministic inputs
    var input = CreateDeterministicInput();

    // Act - Execute feature
    var output1 = await ExecuteFeature(input);
    var output2 = await ExecuteFeature(input);

    // Assert - Verify determinism
    output1.Should().Be(output2, "same input should produce identical output");
}
```

### Test Organization

```
src/__Tests/Determinism/
├── CgsDeterminismTests.cs          # CGS hash tests
├── LineageDeterminismTests.cs      # SBOM lineage tests
├── VexDeterminismTests.cs          # VEX consensus tests (future)
├── README.md                       # Test documentation
└── Fixtures/                       # Test data
    ├── known-evidence-pack.json
    ├── known-policy-lock.json
    └── golden-hashes/
        └── cgs-v1.txt
```

## Common Patterns

### Pattern 1: 10-Iteration Stability Test

**Purpose**: Verify that executing the same operation 10 times produces identical results.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_SameInput_ProducesIdenticalOutput_Across10Iterations()
{
    // Arrange
    var input = CreateDeterministicInput();
    var service = CreateService();
    var outputs = new List<string>();

    // Act - Execute 10 times
    for (int i = 0; i < 10; i++)
    {
        var result = await service.ProcessAsync(input, CancellationToken.None);
        outputs.Add(result.Hash);
        _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
    }

    // Assert - All hashes should be identical
    outputs.Distinct().Should().HaveCount(1,
        "same input should produce identical output across all iterations");
}
```

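**Why 10 iterations?**
- Catches most run-to-run non-determinism (e.g., GUID generation, random values, current time)
- Keeps execution time reasonable (within the <3 second maximum listed in [Performance Guidelines](#performance-guidelines))
- A pragmatic project convention rather than a formal standard; rare race conditions may need more runs or a dedicated stress test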
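
The repeated-run idea can be factored into a small helper. The following is a minimal sketch, assuming a hypothetical `StabilityCheck` class that is not part of `StellaOps.TestKit`; it runs an operation a fixed number of times and returns the distinct results, so a deterministic operation yields exactly one entry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class StabilityCheck
{
    // Runs the operation `iterations` times and returns the distinct results,
    // sorted ordinally so the returned list itself is stable.
    public static async Task<IReadOnlyList<string>> DistinctResultsAsync(
        Func<Task<string>> operation, int iterations = 10)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        for (var i = 0; i < iterations; i++)
        {
            seen.Add(await operation());
        }

        return seen.OrderBy(s => s, StringComparer.Ordinal).ToList();
    }
}
```

A test could then assert that the returned list has a single element. An operation that embeds `Guid.NewGuid()` would return ten distinct entries, which is the kind of failure this pattern is meant to surface.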

### Pattern 2: Golden File Test

**Purpose**: Verify output matches a known-good baseline value.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Golden)]
public async Task Feature_WithKnownInput_MatchesGoldenHash()
{
    // Arrange
    var input = CreateKnownInput();  // MUST be completely deterministic
    var service = CreateService();

    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    var goldenHash = "sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3";

    _output.WriteLine($"Computed Hash: {result.Hash}");
    _output.WriteLine($"Golden Hash:   {goldenHash}");

    result.Hash.Should().Be(goldenHash, "hash must match golden file");
}
```

**Golden file best practices:**
- Document how the golden value was established (date, platform, .NET version)
- Include the golden value directly in the test code (not in an external file) for visibility
- Add a comment explaining what the golden value represents
- Verify the golden value on all platforms before merging

### Pattern 3: Order Independence Test

**Purpose**: Verify that input ordering doesn't affect output.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_InputOrder_DoesNotAffectOutput()
{
    // Arrange
    var item1 = CreateItem("A");
    var item2 = CreateItem("B");
    var item3 = CreateItem("C");

    var service = CreateService();

    // Act - Process items in different orders
    var result1 = await service.ProcessAsync(new[] { item1, item2, item3 }, CancellationToken.None);
    var result2 = await service.ProcessAsync(new[] { item3, item1, item2 }, CancellationToken.None);
    var result3 = await service.ProcessAsync(new[] { item2, item3, item1 }, CancellationToken.None);

    // Assert - All should produce same hash
    result1.Hash.Should().Be(result2.Hash, "input order should not affect output");
    result1.Hash.Should().Be(result3.Hash, "input order should not affect output");

    _output.WriteLine($"Order-independent hash: {result1.Hash}");
}
```

**When to use:**
- Collections that should be sorted internally (VEX documents, rules, dependencies)
- APIs that accept unordered inputs (dictionary keys, sets)
- Parallel processing where order is undefined
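
For a test like this to pass, the implementation has to canonicalize its input before hashing. The following is a minimal sketch of one way to do that, assuming a hypothetical `CanonicalHash` class (the real StellaOps services may canonicalize differently): sort with `StringComparer.Ordinal`, length-prefix each item so item boundaries stay unambiguous, then hash the UTF-8 bytes.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class CanonicalHash
{
    public static string Compute(IEnumerable<string> items)
    {
        var builder = new StringBuilder();

        // Ordinal sorting removes any dependence on input order or culture.
        foreach (var item in items.OrderBy(x => x, StringComparer.Ordinal))
        {
            // The length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
            builder.Append(item.Length.ToString(CultureInfo.InvariantCulture))
                   .Append(':')
                   .Append(item)
                   .Append('\n');
        }

        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

With this shape, `CanonicalHash.Compute(new[] { "A", "B", "C" })` and `CanonicalHash.Compute(new[] { "C", "A", "B" })` return the same string.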

### Pattern 4: Deterministic Timestamp Test

**Purpose**: Verify that fixed timestamps produce deterministic results.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_WithFixedTimestamp_IsDeterministic()
{
    // Arrange - Use FIXED timestamp (not DateTimeOffset.Now!)
    var timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z");
    var input = CreateInputWithTimestamp(timestamp);
    var service = CreateService();

    // Act
    var result1 = await service.ProcessAsync(input, CancellationToken.None);
    var result2 = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    result1.Hash.Should().Be(result2.Hash, "fixed timestamp should produce deterministic output");
}
```

**Timestamp guidelines:**
- ❌ **Never use**: `DateTimeOffset.Now`, `DateTimeOffset.UtcNow`, `DateTime.Now`, `DateTime.UtcNow` (or other per-run values such as `Guid.NewGuid()`)
- ✅ **Always use**: a fixed value such as `DateTimeOffset.Parse("2025-01-01T00:00:00Z")` for tests
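
Fixed timestamps in tests only help if production code does not read the system clock directly. One option, sketched here under the assumption that the project targets .NET 8 or later, is to inject `TimeProvider` and substitute a fixed clock in tests. `FixedTimeProvider` and `ReportStamper` are hypothetical names used for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical test double: always reports the same instant.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _now;

    public FixedTimeProvider(DateTimeOffset now) => _now = now;

    public override DateTimeOffset GetUtcNow() => _now;
}

// Hypothetical component that stamps output with the injected clock
// instead of calling DateTimeOffset.UtcNow itself.
public sealed class ReportStamper
{
    private readonly TimeProvider _clock;

    public ReportStamper(TimeProvider clock) => _clock = clock;

    public string Stamp(string body)
    {
        // Invariant culture keeps the calendar and separators identical on every host.
        var timestamp = _clock.GetUtcNow()
            .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
        return timestamp + "|" + body;
    }
}
```

In production the component would receive `TimeProvider.System`; in a determinism test it would receive `new FixedTimeProvider(new DateTimeOffset(2025, 1, 1, 0, 0, 0, TimeSpan.Zero))`.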

### Pattern 5: Empty/Minimal Input Test

**Purpose**: Verify that minimal or empty inputs don't cause non-determinism.

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_EmptyInput_ProducesDeterministicHash()
{
    // Arrange - Minimal input
    var input = CreateEmptyInput();
    var service = CreateService();

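    // Act - Run twice so the test checks determinism, not just format
    var result1 = await service.ProcessAsync(input, CancellationToken.None);
    var result2 = await service.ProcessAsync(input, CancellationToken.None);

    // Assert - Verify format and stability (hash may not be golden yet)
    result1.Hash.Should().StartWith("sha256:");
    result1.Hash.Length.Should().Be(71); // "sha256:" (7 chars) + 64 hex chars
    result1.Hash.Should().Be(result2.Hash, "empty input should produce a deterministic hash");

    _output.WriteLine($"Empty input hash: {result1.Hash}");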
}
```

**Edge cases to test:**
- Empty collections (`Array.Empty<string>()`)
- Null optional fields
- Zero-length strings
- Default values

## Anti-Patterns to Avoid

### ❌ Anti-Pattern 1: Using Current Time

```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Timestamp = DateTimeOffset.Now  // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z")  // ✅ Same every run
};
```

### ❌ Anti-Pattern 2: Using Random Values

```csharp
// BAD - Non-deterministic!
var random = new Random();
var input = new Input
{
    Id = random.Next()  // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = 12345  // ✅ Same every run
};
```

### ❌ Anti-Pattern 3: Using GUID Generation

```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Id = Guid.NewGuid().ToString()  // ❌ Different every run!
};
```

**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = "00000000-0000-0000-0000-000000000001"  // ✅ Same every run
};
```
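
When a test needs many distinct but reproducible identifiers, hardcoding each one becomes tedious. A possible alternative, sketched below with a hypothetical `TestIds` helper, is to derive each GUID from a stable name. Note the assumption: this is a plain truncated SHA-256 digest, not an RFC-compliant name-based (version 5) UUID, so it is only suitable for test data.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class TestIds
{
    // Maps a name to the same Guid on every run and every platform.
    public static Guid FromName(string name)
    {
        var digest = SHA256.HashData(Encoding.UTF8.GetBytes(name));

        // Guid needs exactly 16 bytes; take the first half of the digest.
        return new Guid(digest.AsSpan(0, 16).ToArray());
    }
}
```

For example, `TestIds.FromName("component-1")` always returns the same value, while different names give different values.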

### ❌ Anti-Pattern 4: Using Unordered Collections

```csharp
// BAD - Dictionary iteration order is NOT guaranteed!
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};

foreach (var kvp in dict)  // ❌ Order may vary!
{
    hash.Update(kvp.Key);
}
```

**Fix:**
```csharp
// GOOD - Explicit ordering
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};

foreach (var kvp in dict.OrderBy(x => x.Key, StringComparer.Ordinal))  // ✅ Consistent order
{
    hash.Update(kvp.Key);
}
```
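
The same ordering concern applies when a dictionary is serialized to JSON before hashing, because `System.Text.Json` writes dictionary entries in enumeration order. A minimal sketch, assuming a hypothetical `CanonicalJson` helper, copies the entries into a `SortedDictionary` with an ordinal comparer first:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class CanonicalJson
{
    public static string Serialize(IDictionary<string, string> values)
    {
        // SortedDictionary enumerates keys in comparer order, so the JSON
        // property order no longer depends on insertion order.
        var sorted = new SortedDictionary<string, string>(values, StringComparer.Ordinal);
        return JsonSerializer.Serialize(sorted);
    }
}
```

Two dictionaries with the same entries inserted in different orders then serialize to the same text.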

### ❌ Anti-Pattern 5: Platform-Specific Paths

```csharp
// BAD - Platform-specific!
var path = "dir\\file.txt";  // ❌ Windows-only!
```

**Fix:**
```csharp
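// GOOD - Cross-platform file access
var path = Path.Combine("dir", "file.txt");  // ✅ Uses the host's separator

// If the path string itself is hashed or serialized, normalize the separator first
var canonicalPath = path.Replace('\\', '/');  // ✅ Same string on every platform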
```

### ❌ Anti-Pattern 6: Culture-Dependent Formatting

```csharp
// BAD - Culture-dependent!
var formatted = value.ToString();  // ❌ Locale-specific!
```

**Fix:**
```csharp
// GOOD - Culture-invariant
var formatted = value.ToString(CultureInfo.InvariantCulture);  // ✅ Same everywhere
```
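
To see why the format provider matters, the sketch below formats the same number with two providers. It builds a `NumberFormatInfo` with a decimal comma by hand (an assumption made so the example does not depend on which cultures are installed on the host); `FormatDemo` is a hypothetical name.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class for illustration only.
public static class FormatDemo
{
    // Mimics locales that use a decimal comma, without requiring culture data.
    public static readonly NumberFormatInfo CommaDecimal =
        new NumberFormatInfo { NumberDecimalSeparator = "," };

    public static string Format(double value, IFormatProvider provider) =>
        value.ToString(provider);
}
```

`FormatDemo.Format(1234.5, FormatDemo.CommaDecimal)` gives `"1234,5"`, while `FormatDemo.Format(1234.5, CultureInfo.InvariantCulture)` gives `"1234.5"`. If either string feeds a hash, the digests differ.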

## Adding New Tests

### Step 1: Identify Determinism Requirement

**Ask yourself:**
- Does this feature produce a hash, signature, or cryptographic output?
- Will this feature's output be stored and verified later?
- Does this feature need to be reproducible across platforms?
- Is this feature part of an audit trail?

If **YES** to any → Add determinism test.

### Step 2: Create Test File

```bash
cd src/__Tests/Determinism
touch MyFeatureDeterminismTests.cs
```

### Step 3: Write Test Class

```csharp
using FluentAssertions;
using Microsoft.Extensions.Logging.Abstractions;  // NullLogger<T>
using StellaOps.TestKit;
using Xunit;
using Xunit.Abstractions;

namespace StellaOps.Tests.Determinism;

/// <summary>
/// Determinism tests for [Feature Name].
/// Verifies that [specific behavior] is deterministic across platforms and runs.
/// </summary>
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public sealed class MyFeatureDeterminismTests
{
    private readonly ITestOutputHelper _output;

    public MyFeatureDeterminismTests(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public async Task MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations()
    {
        // Arrange
        var input = CreateDeterministicInput();
        var service = CreateMyFeatureService();
        var outputs = new List<string>();

        // Act - Execute 10 times
        for (int i = 0; i < 10; i++)
        {
            var result = await service.ProcessAsync(input, CancellationToken.None);
            outputs.Add(result.Hash);
            _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
        }

        // Assert - All hashes should be identical
        outputs.Distinct().Should().HaveCount(1,
            "same input should produce identical output across all iterations");
    }

    #region Helper Methods

    private static MyInput CreateDeterministicInput()
    {
        return new MyInput
        {
            // ✅ Use fixed values
            Id = "test-001",
            Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z"),
            Data = new[] { "item1", "item2", "item3" }
        };
    }

    private static MyFeatureService CreateMyFeatureService()
    {
        return new MyFeatureService(NullLogger<MyFeatureService>.Instance);
    }

    #endregion
}
```

### Step 4: Run Test Locally 10 Times

```bash
for i in {1..10}; do
  echo "=== Run $i ==="
  dotnet test --filter "FullyQualifiedName~MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations"
done
```

**Expected:** All 10 runs pass with identical output.

### Step 5: Add to CI/CD

Test is automatically included when pushed (no configuration needed).

CI/CD workflow `.gitea/workflows/cross-platform-determinism.yml` runs all `Category=Determinism` tests on 5 platforms.

### Step 6: Document in README

Update `src/__Tests/Determinism/README.md`:

```markdown
### MyFeature Determinism

Tests that verify [feature] hash computation is deterministic:

- **10-Iteration Stability**: Same input produces identical hash 10 times
- **Order Independence**: Input ordering doesn't affect hash
- **Empty Input**: Minimal input produces deterministic hash
```

## Cross-Platform Considerations

### Platform Matrix

Tests run on:
- **Windows** (windows-latest): Microsoft C runtime (UCRT), CRLF line endings
- **macOS** (macos-latest): libSystem (BSD-derived libc), LF line endings
- **Linux Ubuntu** (ubuntu-latest): glibc, LF line endings
- **Linux Alpine** (Alpine Docker): musl libc, LF line endings
- **Linux Debian** (Debian Docker): glibc, LF line endings

### Common Cross-Platform Issues

#### Issue 1: String Sorting (musl vs glibc)

**Symptom**: Alpine produces different hash than Ubuntu.

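**Cause**: Culture-sensitive string comparison depends on the platform's globalization data (the ICU version, or invariant-globalization mode, which Alpine-based images often enable), so a default sort can order the same strings differently.

**Solution**: Always sort with `StringComparer.Ordinal`, which compares UTF-16 code units and ignores culture: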

```csharp
// ❌ Wrong - Default string comparer is culture-sensitive
items.Sort();

// ✅ Correct - Ordinal sorting, independent of culture and platform
items = items.OrderBy(x => x, StringComparer.Ordinal).ToList();
```

#### Issue 2: Path Separators

**Symptom**: Windows produces different hash than macOS/Linux.

**Cause**: Windows uses `\`, Unix uses `/`.

**Solution**: Use `Path.Combine` for file access, and normalize separators before a path string is hashed or serialized:

```csharp
// ❌ Wrong - Hardcoded separator
var path = "dir\\file.txt";

// OK for file access only - yields "dir\file.txt" on Windows, "dir/file.txt" elsewhere
var path = Path.Combine("dir", "file.txt");

// ✅ Correct for hashing - Normalize to forward slash
var normalizedPath = path.Replace('\\', '/');
```
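
The normalization step can live in one place so every hashing call site uses it. A minimal sketch, assuming a hypothetical `PathCanonicalizer` helper and assuming that test fixtures never contain a literal backslash in a Unix file name:

```csharp
// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class PathCanonicalizer
{
    // Returns the form of the path that is safe to hash or serialize.
    // Use the original path (not this one) for actual file access.
    public static string ToHashForm(string path) => path.Replace('\\', '/');
}
```

Both `PathCanonicalizer.ToHashForm("dir\\file.txt")` and `PathCanonicalizer.ToHashForm("dir/file.txt")` return `"dir/file.txt"`.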

#### Issue 3: Line Endings

**Symptom**: A hash computed over file content differs between Windows and macOS/Linux.

**Cause**: Windows uses CRLF (`\r\n`), Unix uses LF (`\n`).

**Solution**: Normalize to LF:

```csharp
// ❌ Wrong - Platform line endings
var content = File.ReadAllText(path);

// ✅ Correct - Normalized to LF
var content = File.ReadAllText(path).Replace("\r\n", "\n");
```
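
A slightly more defensive variant also handles a lone `\r`, which the single `Replace` above leaves in place. The following is a sketch with a hypothetical `TextCanonicalizer` helper, not the project's actual implementation:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class TextCanonicalizer
{
    // Converts CRLF first, then any remaining lone CR, to LF.
    public static string NormalizeLineEndings(string text) =>
        text.Replace("\r\n", "\n").Replace('\r', '\n');

    // Hashes the normalized text as UTF-8 so the digest ignores line-ending style.
    public static string HashText(string text)
    {
        var bytes = Encoding.UTF8.GetBytes(NormalizeLineEndings(text));
        return "sha256:" + Convert.ToHexString(SHA256.HashData(bytes)).ToLowerInvariant();
    }
}
```

With this helper, `HashText("a\r\nb")` and `HashText("a\nb")` produce the same digest regardless of how the file was checked out.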

#### Issue 4: Floating-Point Precision

**Symptom**: Different platforms produce slightly different floating-point values.

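**Cause**: Platform math libraries (for example behind `Math.Sin` or `Math.Pow`) and JIT optimizations can differ in the last bits of a result, and binary floating point cannot represent values such as 0.1 exactly.

**Solution**: Use `decimal` for exact arithmetic, or round explicitly: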

```csharp
// ❌ Wrong - Floating-point non-determinism
var value = 0.1 + 0.2;  // 0.30000000000000004 as an IEEE 754 double, not 0.3

// ✅ Correct - Decimal for exact values
var value = 0.1m + 0.2m;  // Always 0.3

// ✅ Alternative - Round explicitly
var value = Math.Round(0.1 + 0.2, 2);  // 0.30
```
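
Whichever numeric type is used, the value still has to be turned into text or bytes before hashing. A minimal sketch of culture-invariant formatting, assuming a hypothetical `NumberCanonicalizer` helper and a .NET Core 3.0 or later runtime:

```csharp
using System.Globalization;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class NumberCanonicalizer
{
    // Decimal keeps exact base-10 digits; invariant culture fixes the separator.
    public static string Format(decimal value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // "R" requests a round-trippable representation of the double.
    public static string Format(double value) =>
        value.ToString("R", CultureInfo.InvariantCulture);
}
```

`NumberCanonicalizer.Format(0.1m + 0.2m)` gives `"0.3"`, while `NumberCanonicalizer.Format(0.1 + 0.2)` gives `"0.30000000000000004"`. One caveat with `decimal`: `1.0m` and `1.00m` compare equal but format as `"1.0"` and `"1.00"`, so inputs may also need a fixed scale before hashing.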

## Performance Guidelines

### Execution Time Targets

| Test Type | Target | Max |
|-----------|--------|-----|
| Single iteration | <100ms | <500ms |
| 10-iteration stability | <1s | <3s |
| Golden file test | <100ms | <500ms |
| **Full test suite** | **<5s** | **<15s** |

### Optimization Tips

1. **Avoid unnecessary I/O**: Create test data in memory
2. **Use `Task.CompletedTask`**: For synchronous operations that must return a `Task`
3. **Minimize allocations**: Reuse test data across assertions
4. **Parallel test execution**: xUnit runs test classes (collections) in parallel by default; tests within a single class run sequentially

### Performance Regression Detection

If test execution time increases by >2x:
1. Profile with `dotnet-trace` or BenchmarkDotNet
2. Identify bottleneck (I/O, CPU, memory)
3. Optimize or split into separate test
4. Document performance expectations in test comments

## Troubleshooting

### Problem: Test Passes 9/10 Times, Fails 1/10

**Cause**: Non-deterministic input or race condition.

**Debug Steps:**
1. Add logging to each iteration:
   ```csharp
   _output.WriteLine($"Iteration {i}: Input={JsonSerializer.Serialize(input)}, Output={output}");
   ```
2. Look for differences in input or output
3. Check for `Guid.NewGuid()`, `Random`, `DateTimeOffset.Now`
4. Check for unsynchronized parallel operations
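
When comparing logged inputs or outputs from a passing and a failing iteration, it can help to locate the first position where two serialized strings diverge. A small sketch, assuming a hypothetical `DiffHelper` class:

```csharp
using System;

// Hypothetical helper for illustration only; not an existing StellaOps API.
public static class DiffHelper
{
    // Returns the index of the first differing character,
    // or -1 when the two strings are identical.
    public static int FirstDifference(string a, string b)
    {
        var shared = Math.Min(a.Length, b.Length);
        for (var i = 0; i < shared; i++)
        {
            if (a[i] != b[i])
            {
                return i;
            }
        }

        // One string is a prefix of the other (or they are equal).
        return a.Length == b.Length ? -1 : shared;
    }
}
```

Logging the surrounding characters at the returned index usually points at the offending field, for example a timestamp or generated identifier.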

### Problem: Test Fails on Alpine but Passes Elsewhere

**Cause**: Usually a globalization (ICU) or native-library difference between musl-based Alpine and glibc-based distributions.

**Debug Steps:**
1. Run test locally with Alpine Docker:
   ```bash
   docker run -it --rm -v "$(pwd)":/app mcr.microsoft.com/dotnet/sdk:10.0-alpine sh
   cd /app
   dotnet test --filter "FullyQualifiedName~MyTest"
   ```
2. Compare output with local (glibc) output
3. Check for string sorting, culture-dependent formatting
4. Use `StringComparer.Ordinal` and `CultureInfo.InvariantCulture`

### Problem: Golden Hash Changes After .NET Upgrade

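**Cause**: A .NET runtime change in the bytes being hashed, typically in JSON serialization output (escaping, number formatting); SHA-256 itself produces the same digest for the same bytes on every runtime version.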

**Debug Steps:**
1. Compare .NET versions:
   ```bash
   dotnet --version  # Should be same in CI/CD
   ```
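2. Check JsonSerializer behavior. The snippet below only rules out run-to-run variation; to detect a runtime change, capture the serialized JSON under both .NET versions and diff it: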
   ```csharp
   var json1 = JsonSerializer.Serialize(input, options);
   var json2 = JsonSerializer.Serialize(input, options);
   json1.Should().Be(json2);
   ```
3. If intentional .NET change, follow [Breaking Change Process](./GOLDEN_FILE_ESTABLISHMENT_GUIDE.md#breaking-change-process)

## References

- **Test README**: `src/__Tests/Determinism/README.md`
- **Golden File Guide**: `docs/implplan/archived/2025-12-29-completed-sprints/GOLDEN_FILE_ESTABLISHMENT_GUIDE.md`
- **ADR 0042**: CGS Merkle Tree Implementation
- **ADR 0043**: Fulcio Keyless Signing
- **CI/CD Workflow**: `.gitea/workflows/cross-platform-determinism.yml`

## Getting Help

- **Slack**: #determinism-testing
- **Issue Labels**: `determinism`, `testing`
- **Priority**: High (determinism bugs affect audit trails)