UI work to fill SBOM sourcing management gap. UI planning remaining functionality exposure. Work on CI/Tests stabilization

Introduces CGS determinism test runs to CI workflows for Windows, macOS, Linux, Alpine, and Debian, fulfilling CGS-008 cross-platform requirements. Updates local-ci scripts to support new smoke steps, test timeouts, progress intervals, and project slicing for improved test isolation and diagnostics.
2025-12-29 19:12:38 +02:00
parent 41552d26ec
commit a4badc275e
286 changed files with 50918 additions and 992 deletions

# Determinism Developer Guide
## Overview
This guide helps developers add new determinism tests to StellaOps. Deterministic behavior is critical for:
- Reproducible verdicts
- Auditable evidence chains
- Cryptographic verification
- Cross-platform consistency
## Table of Contents
1. [Core Principles](#core-principles)
2. [Test Structure](#test-structure)
3. [Common Patterns](#common-patterns)
4. [Anti-Patterns to Avoid](#anti-patterns-to-avoid)
5. [Adding New Tests](#adding-new-tests)
6. [Cross-Platform Considerations](#cross-platform-considerations)
7. [Performance Guidelines](#performance-guidelines)
8. [Troubleshooting](#troubleshooting)
## Core Principles
### 1. Determinism Guarantee
**Definition**: Same inputs always produce identical outputs, regardless of:
- Platform (Windows, macOS, Linux, Alpine, Debian)
- Runtime (.NET version, JIT compiler)
- Execution order (parallel vs sequential)
- Time of day
- System locale
### 2. Golden File Philosophy
**Golden files** are baseline reference values that lock in correct behavior:
- Established after careful verification
- Never changed without ADR and migration plan
- Verified on all platforms before acceptance
### 3. Test Independence
Each test must:
- Not depend on other tests' execution or order
- Clean up resources after completion
- Use isolated data (no shared state)
## Test Structure
### Standard Test Template
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public async Task Feature_Behavior_ExpectedOutcome()
{
    // Arrange - Create deterministic inputs
    var input = CreateDeterministicInput();

    // Act - Execute feature
    var output1 = await ExecuteFeature(input);
    var output2 = await ExecuteFeature(input);

    // Assert - Verify determinism
    output1.Should().Be(output2, "same input should produce identical output");
}
```
### Test Organization
```
src/__Tests/Determinism/
├── CgsDeterminismTests.cs          # CGS hash tests
├── LineageDeterminismTests.cs      # SBOM lineage tests
├── VexDeterminismTests.cs          # VEX consensus tests (future)
├── README.md                       # Test documentation
└── Fixtures/                       # Test data
    ├── known-evidence-pack.json
    ├── known-policy-lock.json
    └── golden-hashes/
        └── cgs-v1.txt
```
## Common Patterns
### Pattern 1: 10-Iteration Stability Test
**Purpose**: Verify that executing the same operation 10 times produces identical results.
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_SameInput_ProducesIdenticalOutput_Across10Iterations()
{
    // Arrange
    var input = CreateDeterministicInput();
    var service = CreateService();
    var outputs = new List<string>();

    // Act - Execute 10 times
    for (int i = 0; i < 10; i++)
    {
        var result = await service.ProcessAsync(input, CancellationToken.None);
        outputs.Add(result.Hash);
        _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
    }

    // Assert - All hashes should be identical
    outputs.Distinct().Should().HaveCount(1,
        "same input should produce identical output across all iterations");
}
```
**Why 10 iterations?**
- Catches non-deterministic behavior (e.g., GUID generation, random values)
- Reasonable execution time (<5 seconds for most tests)
- A common convention for determinism checks (a pragmatic choice, not a formal standard)
### Pattern 2: Golden File Test
**Purpose**: Verify output matches a known-good baseline value.
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Golden)]
public async Task Feature_WithKnownInput_MatchesGoldenHash()
{
    // Arrange
    var input = CreateKnownInput(); // MUST be completely deterministic
    var service = CreateService();

    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    var goldenHash = "sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3";
    _output.WriteLine($"Computed Hash: {result.Hash}");
    _output.WriteLine($"Golden Hash: {goldenHash}");
    result.Hash.Should().Be(goldenHash, "hash must match golden file");
}
```
**Golden file best practices:**
- Document how golden value was established (date, platform, .NET version)
- Include golden value directly in test code (not external file) for visibility
- Add comment explaining what golden value represents
- Test golden value on all platforms before merging
### Pattern 3: Order Independence Test
**Purpose**: Verify that input ordering doesn't affect output.
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_InputOrder_DoesNotAffectOutput()
{
    // Arrange
    var item1 = CreateItem("A");
    var item2 = CreateItem("B");
    var item3 = CreateItem("C");
    var service = CreateService();

    // Act - Process items in different orders
    var result1 = await service.ProcessAsync(new[] { item1, item2, item3 }, CancellationToken.None);
    var result2 = await service.ProcessAsync(new[] { item3, item1, item2 }, CancellationToken.None);
    var result3 = await service.ProcessAsync(new[] { item2, item3, item1 }, CancellationToken.None);

    // Assert - All should produce same hash
    result1.Hash.Should().Be(result2.Hash, "input order should not affect output");
    result1.Hash.Should().Be(result3.Hash, "input order should not affect output");
    _output.WriteLine($"Order-independent hash: {result1.Hash}");
}
```
**When to use:**
- Collections that should be sorted internally (VEX documents, rules, dependencies)
- APIs that accept unordered inputs (dictionary keys, sets)
- Parallel processing where order is undefined
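An order-independence test only passes if the code under test canonicalizes its input before hashing. The following is a minimal sketch of one way to do that, assuming the items can be reduced to strings; `CanonicalHash` is a hypothetical helper for illustration, not an existing StellaOps type.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper (not part of StellaOps): hashes a set of strings
// independently of the order in which they were supplied.
public static class CanonicalHash
{
    public static string Compute(IEnumerable<string> items)
    {
        // Ordinal ordering compares UTF-16 code units, so it does not depend
        // on the current culture or the platform's globalization data.
        var ordered = items.OrderBy(x => x, StringComparer.Ordinal);

        // Length-prefix each item so that ["ab", "c"] and ["a", "bc"] differ.
        var builder = new StringBuilder();
        foreach (var item in ordered)
        {
            builder.Append(item.Length.ToString(CultureInfo.InvariantCulture))
                   .Append(':')
                   .Append(item)
                   .Append('\n');
        }

        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(builder.ToString()));
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

With this shape, `CanonicalHash.Compute(new[] { "A", "B", "C" })` and `CanonicalHash.Compute(new[] { "C", "A", "B" })` return the same string, which is the property Pattern 3 asserts.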
### Pattern 4: Deterministic Timestamp Test
**Purpose**: Verify that fixed timestamps produce deterministic results.
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_WithFixedTimestamp_IsDeterministic()
{
    // Arrange - Use FIXED timestamp (not DateTimeOffset.Now!)
    var timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z");
    var input = CreateInputWithTimestamp(timestamp);
    var service = CreateService();

    // Act
    var result1 = await service.ProcessAsync(input, CancellationToken.None);
    var result2 = await service.ProcessAsync(input, CancellationToken.None);

    // Assert
    result1.Hash.Should().Be(result2.Hash, "fixed timestamp should produce deterministic output");
}
```
**Timestamp guidelines:**
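- **Never use**: `DateTimeOffset.Now`, `DateTime.UtcNow`, or any other wall-clock read (the same rule applies to `Guid.NewGuid()` for identifiers)
- **Always use**: a fixed value such as `DateTimeOffset.Parse("2025-01-01T00:00:00Z")` for tests; passing `CultureInfo.InvariantCulture` to `Parse` keeps parsing independent of the machine's locale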
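When the code under test reads the clock itself rather than receiving a timestamp as input, a fixed value has to be injected. The sketch below assumes .NET 8 or later (for `System.TimeProvider`) and assumes the service accepts a `TimeProvider`; `FixedTimeProvider` and `TestClock` are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical test double: a clock that always reports the same instant.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _now;

    public FixedTimeProvider(DateTimeOffset now) => _now = now;

    // Every call returns the same instant, so code under test sees a frozen clock.
    public override DateTimeOffset GetUtcNow() => _now;
}

public static class TestClock
{
    // Built from components rather than parsed, so no culture is involved.
    public static readonly DateTimeOffset Epoch =
        new DateTimeOffset(2025, 1, 1, 0, 0, 0, TimeSpan.Zero);

    public static TimeProvider Create() => new FixedTimeProvider(Epoch);
}
```

A service that calls `timeProvider.GetUtcNow()` instead of `DateTimeOffset.UtcNow` can then be constructed with `TestClock.Create()` in determinism tests.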
### Pattern 5: Empty/Minimal Input Test
**Purpose**: Verify that minimal or empty inputs don't cause non-determinism.
```csharp
[Fact]
[Trait("Category", TestCategories.Determinism)]
public async Task Feature_EmptyInput_ProducesDeterministicHash()
{
    // Arrange - Minimal input
    var input = CreateEmptyInput();
    var service = CreateService();

    // Act
    var result = await service.ProcessAsync(input, CancellationToken.None);

    // Assert - Verify format (hash may not be golden yet)
    result.Hash.Should().StartWith("sha256:");
    result.Hash.Length.Should().Be(71); // "sha256:" + 64 hex chars
    _output.WriteLine($"Empty input hash: {result.Hash}");
}
```
**Edge cases to test:**
- Empty collections (`Array.Empty<string>()`)
- Null optional fields
- Zero-length strings
- Default values
## Anti-Patterns to Avoid
### ❌ Anti-Pattern 1: Using Current Time
```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Timestamp = DateTimeOffset.Now // ❌ Different every run!
};
```
**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z") // ✅ Same every run
};
```
### ❌ Anti-Pattern 2: Using Random Values
```csharp
// BAD - Non-deterministic!
var random = new Random();
var input = new Input
{
    Id = random.Next() // ❌ Different every run!
};
```
**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = 12345 // ✅ Same every run
};
```
### ❌ Anti-Pattern 3: Using GUID Generation
```csharp
// BAD - Non-deterministic!
var input = new Input
{
    Id = Guid.NewGuid().ToString() // ❌ Different every run!
};
```
**Fix:**
```csharp
// GOOD - Deterministic
var input = new Input
{
    Id = "00000000-0000-0000-0000-000000000001" // ✅ Same every run
};
```
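If a test needs many distinct identifiers, hardcoding each one becomes unwieldy. One option, sketched here under the assumption that identifiers only need to be stable and distinct per name, is to derive them from a fixed name by hashing. `DeterministicId` is a hypothetical helper, not an existing StellaOps API.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a stable identifier from a caller-chosen name.
public static class DeterministicId
{
    public static string FromName(string name)
    {
        // UTF-8 bytes and SHA-256 are platform-independent, so the same name
        // yields the same digest on every platform and on every run.
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(name));

        // Keep the first 16 bytes (128 bits), rendered as 32 lowercase hex characters.
        return Convert.ToHexString(digest, 0, 16).ToLowerInvariant();
    }
}
```

For example, `DeterministicId.FromName("evidence-pack:test-001")` returns the same 32-character string on every call, while a different name gives a different identifier.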
### ❌ Anti-Pattern 4: Using Unordered Collections
```csharp
// BAD - Dictionary iteration order is NOT guaranteed!
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};
foreach (var kvp in dict) // ❌ Order may vary!
{
    hash.Update(kvp.Key);
}
```
**Fix:**
```csharp
// GOOD - Explicit ordering
var dict = new Dictionary<string, string>
{
    ["key1"] = "value1",
    ["key2"] = "value2"
};
foreach (var kvp in dict.OrderBy(x => x.Key, StringComparer.Ordinal)) // ✅ Consistent order
{
    hash.Update(kvp.Key);
}
```
### ❌ Anti-Pattern 5: Platform-Specific Paths
```csharp
// BAD - Platform-specific!
var path = "dir\\file.txt"; // ❌ Windows-only!
```
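**Fix:**
```csharp
// GOOD - Cross-platform. Path.Combine uses the platform separator, so
// normalize to '/' whenever the path itself is hashed or serialized.
var path = Path.Combine("dir", "file.txt").Replace('\\', '/'); // ✅ "dir/file.txt" everywhere
```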
### ❌ Anti-Pattern 6: Culture-Dependent Formatting
```csharp
// BAD - Culture-dependent!
var formatted = value.ToString(); // ❌ Locale-specific!
```
**Fix:**
```csharp
// GOOD - Culture-invariant
var formatted = value.ToString(CultureInfo.InvariantCulture); // ✅ Same everywhere
```
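For values that feed into a hash, it can help to route all formatting through one place so the invariant choice is explicit. The sketch below assumes .NET Core 3.0 or later (where `"R"` yields the shortest round-trippable string for `double`); `InvariantFormat` is a hypothetical helper name.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: formats values identically regardless of the current culture.
public static class InvariantFormat
{
    // Round-trippable number text with '.' as the decimal separator.
    public static string Number(double value) =>
        value.ToString("R", CultureInfo.InvariantCulture);

    // ISO 8601 round-trip format, normalized to UTC so the offset never varies.
    public static string Timestamp(DateTimeOffset value) =>
        value.ToUniversalTime().ToString("O", CultureInfo.InvariantCulture);
}
```

Normalizing to UTC before formatting means two `DateTimeOffset` values that denote the same instant with different offsets produce the same text.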
## Adding New Tests
### Step 1: Identify Determinism Requirement
**Ask yourself:**
- Does this feature produce a hash, signature, or cryptographic output?
- Will this feature's output be stored and verified later?
- Does this feature need to be reproducible across platforms?
- Is this feature part of an audit trail?
If the answer to any of these is **YES**, add a determinism test.
### Step 2: Create Test File
```bash
cd src/__Tests/Determinism
touch MyFeatureDeterminismTests.cs
```
### Step 3: Write Test Class
```csharp
using FluentAssertions;
using Microsoft.Extensions.Logging.Abstractions; // NullLogger<T>
using StellaOps.TestKit;
using Xunit;
using Xunit.Abstractions;

namespace StellaOps.Tests.Determinism;

/// <summary>
/// Determinism tests for [Feature Name].
/// Verifies that [specific behavior] is deterministic across platforms and runs.
/// </summary>
[Trait("Category", TestCategories.Determinism)]
[Trait("Category", TestCategories.Unit)]
public sealed class MyFeatureDeterminismTests
{
    private readonly ITestOutputHelper _output;

    public MyFeatureDeterminismTests(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public async Task MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations()
    {
        // Arrange
        var input = CreateDeterministicInput();
        var service = CreateMyFeatureService();
        var outputs = new List<string>();

        // Act - Execute 10 times
        for (int i = 0; i < 10; i++)
        {
            var result = await service.ProcessAsync(input, CancellationToken.None);
            outputs.Add(result.Hash);
            _output.WriteLine($"Iteration {i + 1}: {result.Hash}");
        }

        // Assert - All hashes should be identical
        outputs.Distinct().Should().HaveCount(1,
            "same input should produce identical output across all iterations");
    }

    #region Helper Methods

    private static MyInput CreateDeterministicInput()
    {
        return new MyInput
        {
            // ✅ Use fixed values
            Id = "test-001",
            Timestamp = DateTimeOffset.Parse("2025-01-01T00:00:00Z"),
            Data = new[] { "item1", "item2", "item3" }
        };
    }

    private static MyFeatureService CreateMyFeatureService()
    {
        return new MyFeatureService(NullLogger<MyFeatureService>.Instance);
    }

    #endregion
}
```
### Step 4: Run Test Locally 10 Times
```bash
for i in {1..10}; do
echo "=== Run $i ==="
dotnet test --filter "FullyQualifiedName~MyFeature_SameInput_ProducesIdenticalOutput_Across10Iterations"
done
```
**Expected:** All 10 runs pass with identical output.
### Step 5: Add to CI/CD
Test is automatically included when pushed (no configuration needed).
CI/CD workflow `.gitea/workflows/cross-platform-determinism.yml` runs all `Category=Determinism` tests on 5 platforms.
### Step 6: Document in README
Update `src/__Tests/Determinism/README.md`:
```markdown
### MyFeature Determinism
Tests that verify [feature] hash computation is deterministic:
- **10-Iteration Stability**: Same input produces identical hash 10 times
- **Order Independence**: Input ordering doesn't affect hash
- **Empty Input**: Minimal input produces deterministic hash
```
## Cross-Platform Considerations
### Platform Matrix
Tests run on:
- **Windows** (windows-latest): Microsoft C runtime (UCRT), CRLF line endings
- **macOS** (macos-latest): BSD libc, LF line endings
- **Linux Ubuntu** (ubuntu-latest): glibc, LF line endings
- **Linux Alpine** (Alpine Docker): musl libc, LF line endings
- **Linux Debian** (Debian Docker): glibc, LF line endings
### Common Cross-Platform Issues
#### Issue 1: String Sorting (musl vs glibc)
**Symptom**: Alpine produces different hash than Ubuntu.
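**Cause**: Default string sorting is culture-sensitive, so its result depends on the globalization data available at run time rather than on the code under test. On Linux, .NET uses ICU for culture-sensitive comparison, and Alpine (musl) images often ship without ICU or run in globalization-invariant mode, so ordering can differ from glibc-based images such as Ubuntu.
**Solution**: Always use `StringComparer.Ordinal` for sorting:
```csharp
// ❌ Wrong - Culture-sensitive sorting (platform-dependent)
items.Sort();
// ✅ Correct - Ordinal sorting (independent of culture and platform)
items = items.OrderBy(x => x, StringComparer.Ordinal).ToList();
```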
#### Issue 2: Path Separators
**Symptom**: Windows produces different hash than macOS/Linux.
**Cause**: Windows uses `\`, Unix uses `/`.
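**Solution**: Use `Path.Combine` for file system access, and normalize separators to `/` before a path is hashed or serialized. `Path.Combine` alone returns `dir\file.txt` on Windows and `dir/file.txt` elsewhere, so it does not make hashed paths identical:
```csharp
// ❌ Wrong - Hardcoded separator
var path = "dir\\file.txt";
// ✅ Correct for file access - Uses the platform separator
var combined = Path.Combine("dir", "file.txt");
// ✅ Correct for hashing - Normalize to forward slash
var normalizedPath = combined.Replace('\\', '/');
```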
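The normalization step can be wrapped in a small helper so that every hashed path goes through it. This is a minimal sketch; `HashablePath` is a hypothetical name, and it assumes backslashes never appear as literal characters in file names (they are legal in Unix file names, where this helper would alter them).

```csharp
using System;
using System.IO;

// Hypothetical helper: produces a separator-neutral form of a relative path for hashing.
public static class HashablePath
{
    // Converts any separator style to '/', so "dir\\file.txt" and "dir/file.txt"
    // yield the same string on every platform.
    public static string Normalize(string path) => path.Replace('\\', '/');

    // Combines segments with the platform separator, then normalizes the result.
    public static string Combine(params string[] segments) =>
        Normalize(Path.Combine(segments));
}
```

`HashablePath.Combine("dir", "file.txt")` returns `dir/file.txt` on Windows, macOS, and Linux alike.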
#### Issue 3: Line Endings
**Symptom**: Hash includes file content with different line endings.
**Cause**: Windows uses CRLF (`\r\n`), Unix uses LF (`\n`).
**Solution**: Normalize to LF:
```csharp
// ❌ Wrong - Platform line endings
var content = File.ReadAllText(path);
// ✅ Correct - Normalized to LF
var content = File.ReadAllText(path).Replace("\r\n", "\n");
```
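The one-line replacement above handles CRLF but leaves a lone `\r` (classic Mac style, or a stray carriage return) untouched. A slightly more defensive sketch follows; `LineEndings` is a hypothetical helper name.

```csharp
using System;

// Hypothetical helper: normalizes every line-ending style to LF before hashing.
public static class LineEndings
{
    public static string ToLf(string text) =>
        text
            // Replace CRLF first so each pair collapses to a single '\n'...
            .Replace("\r\n", "\n")
            // ...then convert any remaining lone CR.
            .Replace('\r', '\n');
}
```

The order of the two replacements matters: converting lone `\r` first would turn each `\r\n` into two line feeds.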
#### Issue 4: Floating-Point Precision
**Symptom**: Different platforms produce slightly different floating-point values.
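**Cause**: JIT compiler optimizations, FPU rounding modes, and platform-specific math library implementations (results of `Math` functions can differ between operating systems and architectures). Binary floating point also cannot represent most decimal fractions exactly.
**Solution**: Use `decimal` for exact arithmetic, or round explicitly:
```csharp
// ❌ Wrong - Binary floating point is inexact
var value = 0.1 + 0.2; // 0.30000000000000004 as an IEEE 754 double
// ✅ Correct - Decimal for exact values
var exact = 0.1m + 0.2m; // Always 0.3
// ✅ Alternative - Round explicitly
var rounded = Math.Round(0.1 + 0.2, 2); // 0.3
```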
## Performance Guidelines
### Execution Time Targets
| Test Type | Target | Max |
|-----------|--------|-----|
| Single iteration | <100ms | <500ms |
| 10-iteration stability | <1s | <3s |
| Golden file test | <100ms | <500ms |
| **Full test suite** | **<5s** | **<15s** |
### Optimization Tips
1. **Avoid unnecessary I/O**: Create test data in memory
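2. **Use Task.CompletedTask**: Return it (or `Task.FromResult`) from synchronous test doubles instead of starting real asynchronous work
3. **Minimize allocations**: Reuse test data across assertions
4. **Parallel test execution**: xUnit runs test collections in parallel by default (tests within a single class run sequentially), so keep tests free of shared state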
### Performance Regression Detection
If test execution time increases by >2x:
1. Profile with `dotnet-trace` or BenchmarkDotNet
2. Identify bottleneck (I/O, CPU, memory)
3. Optimize or split into separate test
4. Document performance expectations in test comments
## Troubleshooting
### Problem: Test Passes 9/10 Times, Fails 1/10
**Cause**: Non-deterministic input or race condition.
**Debug Steps:**
1. Add logging to each iteration:
```csharp
_output.WriteLine($"Iteration {i}: Input={JsonSerializer.Serialize(input)}, Output={output}");
```
2. Look for differences in input or output
3. Check for `Guid.NewGuid()`, `Random`, `DateTimeOffset.Now`
4. Check for unsynchronized parallel operations
### Problem: Test Fails on Alpine but Passes Elsewhere
**Cause**: musl libc vs glibc difference.
**Debug Steps:**
1. Run test locally with Alpine Docker:
```bash
docker run -it --rm -v "$(pwd)":/app mcr.microsoft.com/dotnet/sdk:10.0-alpine sh
cd /app
dotnet test --filter "FullyQualifiedName~MyTest"
```
2. Compare output with local (glibc) output
3. Check for string sorting, culture-dependent formatting
4. Use `StringComparer.Ordinal` and `CultureInfo.InvariantCulture`
### Problem: Golden Hash Changes After .NET Upgrade
**Cause**: .NET runtime change in JSON serialization or hash algorithm.
**Debug Steps:**
1. Compare .NET versions:
```bash
dotnet --version # Should be same in CI/CD
```
2. Check JsonSerializer behavior:
```csharp
var json1 = JsonSerializer.Serialize(input, options);
var json2 = JsonSerializer.Serialize(input, options);
json1.Should().Be(json2);
```
3. If intentional .NET change, follow [Breaking Change Process](./GOLDEN_FILE_ESTABLISHMENT_GUIDE.md#breaking-change-process)
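The `JsonSerializer` check in step 2 only shows that serialization is stable within one process. When the serialized data includes dictionaries, one way to remove insertion order as a variable is to sort keys ordinally before serializing. The sketch below uses `System.Text.Json`; `StableJson` is a hypothetical helper and only covers flat string-to-string maps.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: serializes a string map with keys in ordinal order.
public static class StableJson
{
    // Compact output, so whitespace cannot vary between callers.
    private static readonly JsonSerializerOptions Options = new()
    {
        WriteIndented = false
    };

    public static string Serialize(IEnumerable<KeyValuePair<string, string>> values)
    {
        // SortedDictionary enumerates in comparer order, which fixes the
        // property order regardless of how the source map was populated.
        var sorted = new SortedDictionary<string, string>(StringComparer.Ordinal);
        foreach (var pair in values)
        {
            sorted[pair.Key] = pair.Value;
        }

        return JsonSerializer.Serialize(sorted, Options);
    }
}
```

Two dictionaries holding the same entries in different insertion order then serialize to the same text, for example `{"a":"1","b":"2"}`.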
## References
- **Test README**: `src/__Tests/Determinism/README.md`
- **Golden File Guide**: `docs/implplan/archived/2025-12-29-completed-sprints/GOLDEN_FILE_ESTABLISHMENT_GUIDE.md`
- **ADR 0042**: CGS Merkle Tree Implementation
- **ADR 0043**: Fulcio Keyless Signing
- **CI/CD Workflow**: `.gitea/workflows/cross-platform-determinism.yml`
## Getting Help
- **Slack**: #determinism-testing
- **Issue Label**: `determinism`, `testing`
- **Priority**: High (determinism bugs affect audit trails)