# Determinism Tests

This test project verifies that StellaOps produces deterministic outputs across platforms, runs, and configurations. Deterministic behavior is critical for reproducible verdicts, auditable evidence chains, and cryptographic verification.

## Test Categories

### CGS (Canonical Graph Signature) Determinism

Tests that verify verdict hash computation is deterministic:

- **Golden File Tests**: Known evidence produces expected hash
- **10-Iteration Stability**: Same input produces identical hash 10 times
- **VEX Order Independence**: VEX document ordering doesn't affect hash
- **Reachability Graph Tests**: Reachability inclusion changes hash predictably
- **Policy Lock Tests**: Different policy versions produce different hashes
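
Order independence of this kind usually comes from canonicalizing inputs before hashing. The following is a minimal sketch, not the actual `VerdictBuilderService` logic: `ComputeOrderIndependentHash` is a hypothetical helper, and it assumes each VEX document can be reduced to a string identifier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: hashes a set of VEX document identifiers
// so that the result does not depend on the order of the input.
string ComputeOrderIndependentHash(IEnumerable<string> vexDocumentIds)
{
    // An ordinal sort yields the same order on every platform and culture.
    var canonical = string.Join("\n", vexDocumentIds.OrderBy(id => id, StringComparer.Ordinal));
    var digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonical));
    return "cgs:sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
}

var forward = ComputeOrderIndependentHash(new[] { "vex-a", "vex-b", "vex-c" });
var shuffled = ComputeOrderIndependentHash(new[] { "vex-c", "vex-a", "vex-b" });
Console.WriteLine(forward == shuffled); // True
```

Sorting with `StringComparer.Ordinal` rather than a culture-aware comparer keeps the canonical order identical on every platform.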

### Cross-Platform Verification

Tests run on multiple platforms via CI/CD:
- Windows (Microsoft C runtime, UCRT)
- macOS (BSD libc)
- Linux Ubuntu (glibc)
- Linux Alpine (musl libc)
- Linux Debian (glibc)

## Running Tests Locally

### Prerequisites

- .NET 10 SDK
- Docker (for Testcontainers, if needed)

### Run All Determinism Tests

```bash
cd src/__Tests/Determinism
dotnet test
```

### Run Specific Test Category

```bash
# Run only determinism tests
dotnet test --filter "Category=Determinism"

# Run only unit tests
dotnet test --filter "Category=Unit"
```

### Run with Detailed Output

```bash
dotnet test --logger "console;verbosity=detailed"
```

### Run and Generate TRX Report

```bash
dotnet test --logger "trx;LogFileName=determinism-results.trx" --results-directory ./test-results
```

## Test Structure

```
CgsDeterminismTests.cs
├── Golden File Tests
│   ├── CgsHash_WithKnownEvidence_MatchesGoldenHash
│   └── CgsHash_EmptyEvidence_ProducesDeterministicHash
├── 10-Iteration Stability Tests
│   ├── CgsHash_SameInput_ProducesIdenticalHash_Across10Iterations
│   ├── CgsHash_VexOrderIndependent_ProducesIdenticalHash
│   └── CgsHash_WithReachability_IsDifferentFromWithout
└── Policy Lock Determinism Tests
    └── CgsHash_DifferentPolicyVersion_ProducesDifferentHash
```

## Golden File Workflow

### Initial Baseline (First Time)

1. Run tests locally to compute initial hash:
   ```bash
   dotnet test --filter "FullyQualifiedName~CgsHash_WithKnownEvidence_MatchesGoldenHash"
   ```

2. Observe the computed CGS hash in test output:
   ```
   Computed CGS: cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
   Golden CGS:   cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
   ```

3. Verify that the computed hash matches the expected value (line 59 in `CgsDeterminismTests.cs`)

4. Uncomment golden hash assertion (line 69):
   ```csharp
   result.CgsHash.Should().Be(goldenHash, "CGS hash must match golden file");
   ```

5. Commit the change to lock in the golden hash

### Verifying Golden Hash Stability

After establishing the baseline:

```bash
# Run 10 times to verify stability; stop at the first failing iteration
for i in {1..10}; do
  echo "Iteration $i"
  dotnet test --filter "FullyQualifiedName~CgsHash_WithKnownEvidence_MatchesGoldenHash" --logger "console;verbosity=minimal" || break
done
```

All 10 iterations should pass and report the same hash.

### Golden Hash Changes

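⚠️ **BREAKING CHANGE**: If a golden hash test fails, the computed CGS hash no longer matches the committed baseline. Unless the test inputs or the runtime changed, this means the CGS algorithm or its canonical serialization has changed!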

**Impact**:
- All historical verdicts become unverifiable
- Stored CGS hashes no longer match recomputed values
- Audit trails are broken

**Process for Intentional Changes**:
1. Document the reason for algorithm change in ADR
2. Create migration guide for existing verdicts
3. Update golden hash in test
4. Coordinate with all deployments
5. Plan for dual-algorithm support during transition
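
Dual-algorithm support during a transition can be pictured as a verifier that accepts a stored hash if any supported algorithm version reproduces it. This is only an illustrative sketch: `HashV1`, `HashV2`, and `MatchAlgorithm` are hypothetical, and the two "algorithms" differ merely in their field separator.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

string Sha256Hex(string text) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text))).ToLowerInvariant();

// Hypothetical algorithm versions; they differ only in how fields are joined.
string HashV1(string[] fields) => "cgs:sha256:" + Sha256Hex(string.Join("|", fields));
string HashV2(string[] fields) => "cgs:sha256:" + Sha256Hex(string.Join("\n", fields));

// Returns the version that reproduces the stored hash, or null if none does.
string? MatchAlgorithm(string storedHash, string[] fields)
{
    if (string.Equals(storedHash, HashV2(fields), StringComparison.Ordinal)) return "v2";
    if (string.Equals(storedHash, HashV1(fields), StringComparison.Ordinal)) return "v1";
    return null;
}

var fields = new[] { "evidence-1", "policy-v1" };
var historical = HashV1(fields); // a verdict stored before the change
Console.WriteLine(MatchAlgorithm(historical, fields)); // v1
```

Checking the newest version first keeps the common path cheap once most verdicts have been migrated.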

## CI/CD Integration

### Cross-Platform Workflow

File: `.gitea/workflows/cross-platform-determinism.yml`

**Triggers**:
- Push to `main` branch
- Pull requests targeting `main`
- Manual dispatch

**Platform Matrix**:
- Windows: `windows-latest`
- macOS: `macos-latest`
- Linux: `ubuntu-latest`
- Alpine: `mcr.microsoft.com/dotnet/sdk:10.0-alpine` (musl libc)
- Debian: `mcr.microsoft.com/dotnet/sdk:10.0-bookworm-slim`

**Outputs**:
- TRX test results per platform
- Cross-platform hash comparison report
- Divergence detection (fails if hashes differ)
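
Divergence detection boils down to checking that every platform reports the same hash. The sketch below shows one way to do that; `FindDivergentPlatforms`, the platform names, and the shortened placeholder hashes are assumptions for illustration and are not taken from the workflow file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the platforms whose hash differs from the most common one.
IReadOnlyList<string> FindDivergentPlatforms(IReadOnlyDictionary<string, string> hashesByPlatform)
{
    if (hashesByPlatform.Count == 0) return Array.Empty<string>();

    // Pick the most frequent hash; break ties ordinally so the result is stable.
    var majority = hashesByPlatform.Values
        .GroupBy(h => h, StringComparer.Ordinal)
        .OrderByDescending(g => g.Count())
        .ThenBy(g => g.Key, StringComparer.Ordinal)
        .First().Key;

    return hashesByPlatform
        .Where(kv => !string.Equals(kv.Value, majority, StringComparison.Ordinal))
        .Select(kv => kv.Key)
        .OrderBy(p => p, StringComparer.Ordinal)
        .ToList();
}

var divergent = FindDivergentPlatforms(new Dictionary<string, string>
{
    ["ubuntu"] = "cgs:sha256:aaa", // placeholder values, not real hashes
    ["debian"] = "cgs:sha256:aaa",
    ["alpine"] = "cgs:sha256:bbb",
});
Console.WriteLine(string.Join(", ", divergent)); // alpine
```

A CI step built on this idea would fail the job whenever the returned list is non-empty.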

### Running CI/CD Locally

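Use [act](https://github.com/nektos/act) to run the workflow locally. `act` was written for GitHub Actions, and Gitea Actions uses a compatible workflow syntax. Because `act` runs jobs in Docker containers, expect only the Linux-based matrix entries to run: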

```bash
# Install act (if not already installed)
# macOS: brew install act
# Linux: curl https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash

# Run cross-platform determinism workflow
act -W .gitea/workflows/cross-platform-determinism.yml
```

## Troubleshooting

### Tests Fail with "Hashes Don't Match"

**Symptoms**:
```
Expected hashes.Distinct() to have count 1, but found 2.
```

**Cause**: Non-deterministic input or platform-specific behavior

**Solutions**:
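1. Check for timestamp usage (use a fixed value such as `DateTimeOffset.Parse("2025-01-01T00:00:00Z")`, not `DateTimeOffset.UtcNow`)
2. Check for dictionary ordering (the enumeration order of `Dictionary<TKey, TValue>` is undefined; sort keys with `OrderBy` and `StringComparer.Ordinal`)
3. Check for GUID generation (use fixed GUIDs in tests)
4. Check for floating-point arithmetic (use `decimal` for determinism)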
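
The four checks above can be combined in one small example. This is a sketch under stated assumptions: `Canonicalize` is a hypothetical helper and the `key=value;...|timestamp|guid` layout is invented for illustration, not the real CGS input format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper: renders inputs as a string that is stable across runs.
string Canonicalize(IReadOnlyDictionary<string, decimal> scores, DateTimeOffset evaluatedAt, Guid runId)
{
    var parts = scores
        .OrderBy(kv => kv.Key, StringComparer.Ordinal) // never rely on dictionary order
        .Select(kv => kv.Key + "=" + kv.Value.ToString(CultureInfo.InvariantCulture));
    return string.Join(";", parts)
        + "|" + evaluatedAt.ToString("O", CultureInfo.InvariantCulture)
        + "|" + runId.ToString("D");
}

// Fixed timestamp and GUID instead of DateTimeOffset.UtcNow and Guid.NewGuid().
var fixedTime = DateTimeOffset.Parse("2025-01-01T00:00:00Z", CultureInfo.InvariantCulture);
var fixedId = Guid.Parse("00000000-0000-0000-0000-000000000001");

// decimal arithmetic is exact here: 0.1m + 0.2m equals 0.3m.
var first = Canonicalize(new Dictionary<string, decimal> { ["b"] = 0.1m + 0.2m, ["a"] = 1m }, fixedTime, fixedId);
var second = Canonicalize(new Dictionary<string, decimal> { ["a"] = 1m, ["b"] = 0.3m }, fixedTime, fixedId);
Console.WriteLine(first == second); // True
```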

### Tests Fail on Alpine (musl libc)

**Symptoms**:
```
Hash divergence detected: Alpine produces different hash than Ubuntu
```

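**Cause**: Differences between musl libc and glibc environments in string handling, sorting, or crypto (in .NET these typically surface through culture-sensitive comparison and formatting)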

**Solutions**:
1. Use `StringComparer.Ordinal` for all sorting
2. Use `Encoding.UTF8.GetBytes()` explicitly (don't rely on platform default)
3. Use `CultureInfo.InvariantCulture` for number/date formatting
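
The three rules above can be applied together when turning values into bytes for hashing. A minimal sketch, assuming a hypothetical `EncodeCanonical` helper and an invented `names:score` layout:

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text;

// Hypothetical helper: produces the same bytes regardless of host culture or libc.
byte[] EncodeCanonical(string[] names, double score)
{
    // Ordinal comparison sorts by UTF-16 code unit, so uppercase 'B' sorts before 'a'.
    // A culture-aware comparer may order these differently depending on the host.
    var sorted = names.OrderBy(n => n, StringComparer.Ordinal);

    // Invariant culture always uses '.' as the decimal separator.
    var text = string.Join(",", sorted) + ":" + score.ToString("F2", CultureInfo.InvariantCulture);

    // Explicit UTF-8 instead of a platform-dependent default encoding.
    return Encoding.UTF8.GetBytes(text);
}

var bytes = EncodeCanonical(new[] { "b", "a", "B" }, 1.5);
Console.WriteLine(Encoding.UTF8.GetString(bytes)); // B,a,b:1.50
```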

### Golden Hash Test Fails After Upgrade

**Symptoms**:
```
Expected "cgs:sha256:abc123..." but found "cgs:sha256:def456..."
```

**Cause**: .NET upgrade changed hash computation or JSON serialization

**Solutions**:
1. Verify that the .NET SDK version in CI/CD matches the local one (should be 10.0.100)
2. Check the `CanonicalJsonOptions` configuration (line 33 in `CgsDeterminismTests.cs`)
3. Review recent changes to `VerdictBuilderService.cs`

### Flaky Tests (Intermittent Failures)

**Symptoms**:
```
Test passes 9/10 times, fails 1/10
```

**Cause**: Race condition, timing dependency, or non-deterministic input

**Solutions**:
1. Add `Interlocked` for thread-safe counters
2. Use `TaskCompletionSource` instead of `Task.Delay` for synchronization
3. Remove randomness (no `Random` or `Guid.NewGuid()` in test inputs)
4. Fix ordering of parallel operations
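
The first two fixes can be sketched together: an explicit `TaskCompletionSource` signal replaces a timing guess, and `Interlocked.Increment` replaces an unsynchronized `counter++`. `RunWorkersAsync` is a hypothetical helper written for this illustration only.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: workers wait for one explicit start signal,
// then increment a shared counter atomically.
async Task<int> RunWorkersAsync(int workerCount, int incrementsPerWorker)
{
    var counter = 0;
    var start = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);

    var workers = Enumerable.Range(0, workerCount)
        .Select(_ => Task.Run(async () =>
        {
            await start.Task; // explicit signal instead of a Task.Delay guess
            for (var i = 0; i < incrementsPerWorker; i++)
            {
                Interlocked.Increment(ref counter); // a plain counter++ could lose updates
            }
        }))
        .ToArray();

    start.SetResult(); // release all workers
    await Task.WhenAll(workers);
    return counter;
}

Console.WriteLine(await RunWorkersAsync(8, 1000)); // 8000
```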

## Adding New Determinism Tests

### Step 1: Create Test Method

```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task MyNewFeature_IsDeterministic()
{
    // Arrange
    var evidence = CreateKnownEvidencePack();
    var policyLock = CreateKnownPolicyLock();
    var service = CreateVerdictBuilder();

    // Act
    var result1 = await service.BuildAsync(evidence, policyLock, CancellationToken.None);
    var result2 = await service.BuildAsync(evidence, policyLock, CancellationToken.None);

    // Assert
    result1.CgsHash.Should().Be(result2.CgsHash, "same input should produce same hash");
}
```

### Step 2: Run Locally 10 Times

```bash
# Stop at the first failing iteration
for i in {1..10}; do
  dotnet test --filter "FullyQualifiedName~MyNewFeature_IsDeterministic" || break
done
```

### Step 3: Verify Cross-Platform

Push to branch and check CI/CD results:
- Windows ✅
- macOS ✅
- Linux ✅
- Alpine ✅
- Debian ✅

### Step 4: Document Edge Cases

Add comments explaining:
- What makes this test deterministic
- Any platform-specific considerations
- Expected hash format/structure
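
When documenting the expected hash format, a small validity check can make the assumption explicit. The pattern below is inferred from the sample output shown earlier in this README (`cgs:sha256:` followed by 64 lowercase hex characters); treat it as an assumption rather than a specification, and note that `IsWellFormedCgsHash` is a hypothetical helper.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed format: "cgs:sha256:" followed by 64 lowercase hex characters.
// \A and \z anchor the whole string, so a trailing newline is rejected.
bool IsWellFormedCgsHash(string value) =>
    Regex.IsMatch(value, @"\Acgs:sha256:[0-9a-f]{64}\z", RegexOptions.CultureInvariant);

Console.WriteLine(IsWellFormedCgsHash(
    "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3")); // True
Console.WriteLine(IsWellFormedCgsHash("cgs:sha256:ABC123")); // False
```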

## Performance Baselines

Typical test execution times (on CI/CD runners):

| Test | Windows | macOS | Linux | Alpine | Debian |
|------|---------|-------|-------|--------|--------|
| Golden File Test | <100ms | <100ms | <100ms | <150ms | <100ms |
| 10-Iteration Stability | <1s | <1s | <1s | <1.5s | <1s |
| VEX Order Independence | <200ms | <200ms | <200ms | <300ms | <200ms |
| **Total Suite** | **<3s** | **<3s** | **<3s** | **<4s** | **<3s** |

If a test takes more than twice its baseline, investigate a possible performance regression.

## References

- **Architecture**: `docs/modules/verdict/architecture.md` (CGS section)
- **Sprint Documentation**: `docs/implplan/archived/SPRINT_20251229_001_001_BE_cgs_infrastructure.md`
- **Batch Summary**: `docs/implplan/archived/2025-12-29-completed-sprints/BATCH_20251229_BE_COMPLETION_SUMMARY.md`
- **CI/CD Workflow**: `.gitea/workflows/cross-platform-determinism.yml`

## Contact

For questions or issues:
- Create an issue in the repository
- Tag it with `determinism`, `testing`, `cgs`
- Set priority to High (determinism bugs affect audit trails)