Golden File Establishment Guide
Overview
Golden files are baseline reference values that verify deterministic behavior remains stable over time. This guide explains how to establish, verify, and maintain golden hashes for CGS (Canonical Graph Signature) and other deterministic subsystems.
Table of Contents
- Prerequisites
- Initial Baseline Setup
- Cross-Platform Verification
- Golden Hash Maintenance
- Troubleshooting
- Breaking Change Process
Prerequisites
Local Environment
- .NET 10 SDK (10.0.100 or later)
- Git access to repository
- Write access to CI/CD workflows
CI/CD Environment
- Gitea Actions enabled
- Cross-platform runners configured:
- Windows (windows-latest)
- macOS (macos-latest)
- Linux (ubuntu-latest)
- Alpine (mcr.microsoft.com/dotnet/sdk:10.0-alpine)
- Debian (mcr.microsoft.com/dotnet/sdk:10.0-bookworm-slim)
Initial Baseline Setup
Step 1: Run Tests Locally
cd src/__Tests/Determinism
# Run CGS determinism tests
dotnet test --filter "Category=Determinism" --logger "console;verbosity=detailed"
Expected Output:
Test Name: CgsHash_WithKnownEvidence_MatchesGoldenHash
Outcome: Passed
Duration: 87ms
Standard Output Messages:
Computed CGS: cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
Golden CGS: cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
Step 2: Verify Hash Format
Computed CGS hash should match this format:
- Prefix: cgs:sha256:
- Hash: 64 hexadecimal characters (lowercase)
- Total length: 75 characters
Example:
cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
|---------||--------------------------------------------------------------|
  prefix                             64 hex chars
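The format rules above can be checked mechanically before a hash is recorded as golden. The following is a minimal sketch, not part of the test suite; the helper name is_valid_cgs_hash is hypothetical.

```shell
# Hypothetical helper: succeeds only when the argument has the documented CGS format.
is_valid_cgs_hash() {
  # Prefix "cgs:sha256:" (11 characters) followed by exactly 64 lowercase hex characters.
  printf '%s\n' "$1" | grep -Eq '^cgs:sha256:[0-9a-f]{64}$'
}

hash='cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3'
if is_valid_cgs_hash "$hash"; then
  echo "valid (${#hash} characters)"   # prints: valid (75 characters)
else
  echo "invalid"
fi
```

A check like this catches truncated, uppercase, or non-hex values before they are copied into a baseline record.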
Step 3: Run 10-Iteration Stability Test
# Run 10 times to verify determinism
for i in {1..10}; do
  echo "=== Iteration $i ==="
  dotnet test \
    --filter "FullyQualifiedName~CgsHash_SameInput_ProducesIdenticalHash_Across10Iterations" \
    --logger "console;verbosity=minimal" || { echo "Iteration $i failed" >&2; break; }
done
Expected Result: All 10 iterations should pass.
If any iteration fails with:
Expected hashes.Distinct() to have count 1, but found 2 or more.
This indicates non-deterministic behavior. DO NOT proceed until determinism is fixed.
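The same stability idea can be applied to any command that prints only a hash. The sketch below is an illustration under that assumption; assert_stable_output is a hypothetical helper, and it is not suitable for raw dotnet test output, which contains varying durations.

```shell
# Hypothetical helper: run a command N times and fail if its output ever changes.
assert_stable_output() {
  runs=$1
  shift
  first=$("$@") || return 1
  i=2
  while [ "$i" -le "$runs" ]; do
    current=$("$@") || return 1
    if [ "$current" != "$first" ]; then
      echo "iteration $i diverged: $current != $first" >&2
      return 1
    fi
    i=$((i + 1))
  done
  echo "stable across $runs runs: $first"
}

# Stand-in for a command that prints a computed hash.
assert_stable_output 10 printf '%s' 'cgs:sha256:example'
```

Comparing the outputs directly mirrors the hashes.Distinct() assertion: a single differing run is enough to fail.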
Step 4: Verify VEX Order Independence
dotnet test --filter "FullyQualifiedName~CgsHash_VexOrderIndependent_ProducesIdenticalHash"
This test creates evidence packs with the VEX documents in different orders (1-2-3, 3-1-2, 2-3-1) and verifies that all of them produce an identical hash.
Expected Output:
Test Passed
VEX order-independent CGS: cgs:sha256:...
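Order independence typically comes from putting the inputs into a canonical order before hashing. The sketch below shows the idea only and is not the CGS algorithm: order_independent_hash and sha256_hex are hypothetical helpers, and the approach assumes each document fits on one line.

```shell
# Use whichever SHA-256 tool is installed (sha256sum on Linux, shasum on macOS).
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

# Hypothetical helper: hash a set of single-line documents regardless of argument order.
order_independent_hash() {
  # LC_ALL=C forces byte-wise sorting, so the canonical order is the same on every platform.
  printf '%s\n' "$@" | LC_ALL=C sort | sha256_hex
}

a=$(order_independent_hash vex-1 vex-2 vex-3)
b=$(order_independent_hash vex-3 vex-1 vex-2)
if [ "$a" = "$b" ]; then
  echo "order-independent: $a"
fi
```

If the sort step is removed, the two calls produce different digests, which is the failure this test is designed to catch.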
Step 5: Document Baseline
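Create a baseline record. Run this from the repository root, because Step 1 changed into src/__Tests/Determinism:
cd "$(git rev-parse --show-toplevel)"
mkdir -p docs/testing/baselines
cat > docs/testing/baselines/cgs-golden-hash-$(date +%Y%m%d).md <<EOF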
# CGS Golden Hash Baseline - $(date +%Y-%m-%d)
## Environment
- .NET Version: $(dotnet --version)
- Platform: $(uname -s)
- Machine: $(uname -m)
## Golden Hash
\`\`\`
cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
\`\`\`
## Verification
- 10-iteration stability: ✅ PASS
- VEX order independence: ✅ PASS
- Empty evidence test: ✅ PASS
## Evidence Pack
\`\`\`json
{
"sbomCanonJson": "{\"spdxVersion\":\"SPDX-3.0.1\",\"name\":\"test-sbom\"}",
"vexCanonJson": ["{\"id\":\"vex-1\",\"cve\":\"CVE-2024-0001\",\"status\":\"not_affected\"}"],
"reachabilityGraphJson": null,
"feedSnapshotDigest": "sha256:0000000000000000000000000000000000000000000000000000000000000001"
}
\`\`\`
## Policy Lock
\`\`\`json
{
"schemaVersion": "1.0",
"policyVersion": "1.0.0",
"ruleHashes": {
"rule-001": "sha256:aaaa",
"rule-002": "sha256:bbbb"
},
"engineVersion": "1.0.0",
"generatedAt": "2025-01-01T00:00:00Z"
}
\`\`\`
## Established By
- Name: [Your Name]
- Date: $(date +%Y-%m-%d)
- Commit: $(git rev-parse --short HEAD)
EOF
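Once a baseline record exists, the golden hash can be read back from it and compared with a freshly computed value. This is a minimal sketch; extract_golden_hash is a hypothetical helper, and the temporary file stands in for a real baseline document.

```shell
# Hypothetical helper: print the first well-formed CGS hash found in a baseline file.
extract_golden_hash() {
  grep -Eo 'cgs:sha256:[0-9a-f]{64}' "$1" | head -n 1
}

# Temporary stand-in for docs/testing/baselines/cgs-golden-hash-<date>.md.
baseline=$(mktemp)
printf '## Golden Hash\ncgs:sha256:%s\n' \
  'd4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3' > "$baseline"

golden=$(extract_golden_hash "$baseline")
# Stand-in for the value printed by the determinism test run.
computed='cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3'

if [ "$golden" = "$computed" ]; then
  echo "computed hash matches baseline"
else
  echo "MISMATCH: baseline=$golden computed=$computed"
fi
rm -f "$baseline"
```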
Cross-Platform Verification
Step 1: Push to Feature Branch
git checkout -b feature/establish-golden-hash
git add src/__Tests/Determinism/CgsDeterminismTests.cs docs/testing/baselines/
git commit -m "chore: establish CGS golden hash baseline

- Verified 10-iteration stability locally
- Verified VEX order independence
- Ready for cross-platform verification"
git push origin feature/establish-golden-hash
Step 2: Create Pull Request
Create PR with description:
## Golden Hash Baseline Establishment
This PR establishes the golden hash baseline for CGS determinism testing.
### Local Verification ✅
- [x] 10-iteration stability test (all identical)
- [x] VEX order independence test
- [x] Empty evidence test
- [x] Policy lock version test
### Expected CI/CD Verification
- [ ] Windows: golden hash matches
- [ ] macOS: golden hash matches
- [ ] Linux (Ubuntu): golden hash matches
- [ ] Linux (Alpine, musl libc): golden hash matches
- [ ] Linux (Debian): golden hash matches
### Golden Hash
cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
### References
- Baseline documentation: `docs/testing/baselines/cgs-golden-hash-20251229.md`
- Sprint: `docs/implplan/archived/SPRINT_20251229_001_001_BE_cgs_infrastructure.md`
Step 3: Monitor CI/CD Pipeline
Watch the cross-platform determinism workflow: .gitea/workflows/cross-platform-determinism.yml
Expected Workflow Jobs:
- ✅ determinism-windows
- ✅ determinism-macos
- ✅ determinism-linux
- ✅ determinism-alpine
- ✅ determinism-debian
- ✅ compare-hashes
Step 4: Review Hash Comparison Report
After all platform tests complete, the compare-hashes job generates a report:
Successful Output:
{
"divergences": [],
"platforms": {
"windows": "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
"macos": "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
"linux": "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
"alpine": "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3",
"debian": "cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3"
},
"status": "SUCCESS",
"message": "All hashes match across platforms."
}
Divergence Detected (❌ FAILURE):
{
"divergences": [
{
"key": "cgs_hash",
"linux": "cgs:sha256:abc123...",
"alpine": "cgs:sha256:def456...",
"windows": "cgs:sha256:abc123...",
"macos": "cgs:sha256:abc123...",
"debian": "cgs:sha256:abc123..."
}
],
"status": "FAILURE",
"message": "Hash divergence detected on Alpine platform (musl libc)"
}
If divergences are detected, DO NOT merge. See Troubleshooting.
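The comparison performed by the compare-hashes job boils down to counting distinct hashes. The sketch below illustrates that logic on a simplified "platform hash" line format, which is an assumption; the real job emits the JSON report shown above, and compare_hashes here is a hypothetical helper.

```shell
# Hypothetical helper: reads "platform hash" lines on stdin and reports divergence.
compare_hashes() {
  input=$(cat)
  # Count distinct values in the second column.
  distinct=$(printf '%s\n' "$input" | awk '{print $2}' | LC_ALL=C sort -u | wc -l | tr -d ' ')
  if [ "$distinct" -eq 1 ]; then
    echo "SUCCESS: all hashes match across platforms"
  else
    echo "FAILURE: $distinct distinct hashes found"
    printf '%s\n' "$input" | sed 's/^/  /'
    return 1
  fi
}

compare_hashes <<'EOF' || echo "Do not merge until the divergence is resolved."
linux cgs:sha256:abc123
alpine cgs:sha256:def456
windows cgs:sha256:abc123
EOF
```

The non-zero return status makes the helper usable as a merge gate in a pipeline step.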
Step 5: Uncomment Golden Hash Assertion
After successful cross-platform verification:
# Edit CgsDeterminismTests.cs
vi src/__Tests/Determinism/CgsDeterminismTests.cs
Lines 68-69: Uncomment the assertion:
// Before:
// Uncomment when golden hash is established:
// result.CgsHash.Should().Be(goldenHash, "CGS hash must match golden file");
// After:
// Golden hash established 2025-12-29 (all platforms verified)
result.CgsHash.Should().Be(goldenHash, "CGS hash must match golden file");
Commit:
git add src/__Tests/Determinism/CgsDeterminismTests.cs
git commit -m "test: enable golden hash assertion

All platforms verified:
- Windows: ✅
- macOS: ✅
- Linux (Ubuntu): ✅
- Linux (Alpine): ✅
- Linux (Debian): ✅

Golden hash locked: cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3"
git push origin feature/establish-golden-hash
Step 6: Merge to Main
After PR approval and final CI/CD run:
git checkout main
git merge feature/establish-golden-hash
git push origin main
Golden Hash Maintenance
Regular Verification
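Run cross-platform tests on a regular schedule, at least weekly, to detect drift:
# Trigger manual workflow dispatch
# Note: gh is the GitHub CLI; if the repository is hosted only on Gitea, trigger the run from the Actions UI instead
gh workflow run cross-platform-determinism.yml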
Monitoring
Set up alerts for:
- Hash divergence detected
- Golden hash test failures
- Cross-platform workflow failures
Slack/Email Alert Example:
⚠️ CGS Golden Hash Failure
Platform: Alpine (musl libc)
Expected: cgs:sha256:d4e56740f876aef8c010b86a40d5f56745a118d0906a34e69aec8c0db1cb8fa3
Actual: cgs:sha256:e5f67851...
Investigate immediately - audit trail integrity at risk!
Version Tracking
Maintain golden hash changelog:
# CGS Golden Hash Changelog
## v1.0.0 (2025-01-01)
- Initial baseline: `cgs:sha256:d4e56740...`
- Established by: Team
- All platforms verified
## v1.1.0 (2025-02-15) - BREAKING CHANGE
- Updated to: `cgs:sha256:e5f67851...`
- Reason: Fixed VEX ordering bug in VerdictBuilderService
- Migration: Recompute all verdicts after 2025-02-01
- ADR: docs/adr/0042-cgs-vex-ordering-fix.md
Troubleshooting
Divergence on Alpine (musl libc)
Symptom:
Alpine: cgs:sha256:abc123...
Others: cgs:sha256:def456...
Likely Causes:
- String sorting differences (musl vs glibc strcoll)
- JSON serialization differences
- Floating-point formatting differences
Solutions:
- Use Ordinal String Comparison:
// ❌ Wrong (culture-dependent)
leaves.Sort();
// ✅ Correct (culture-independent)
leaves = leaves.OrderBy(l => l, StringComparer.Ordinal).ToList();
- Explicit UTF-8 Encoding:
// ❌ Avoid (implicit default: UTF-8 on .NET Core and later, but the system code page on .NET Framework)
var bytes = Encoding.Default.GetBytes(input);
// ✅ Correct (explicit UTF-8)
var bytes = Encoding.UTF8.GetBytes(input);
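- Fixed Serializer Options and Invariant Culture for Numbers:
// ❌ Wrong (relies on default options; hand-formatted numbers such as value.ToString() are culture-dependent)
var json = JsonSerializer.Serialize(data);
// ✅ Correct (explicit, fixed options; System.Text.Json itself always writes numbers culture-invariantly)
var json = JsonSerializer.Serialize(data, new JsonSerializerOptions
{
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    WriteIndented = false,
    // Controls character escaping only, not culture; keep it fixed so the escaped output stays stable
    Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping
});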
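The difference between ordinal and locale-aware ordering can be seen from the shell as well. In this small sketch, LC_ALL=C plays the role that StringComparer.Ordinal plays in C#; the ordinal_sort helper name is hypothetical.

```shell
# Hypothetical helper: byte-wise (ordinal) sort, independent of the system locale.
ordinal_sort() {
  LC_ALL=C sort
}

# Uppercase letters have lower byte values than lowercase ones, so they sort first.
printf '%s\n' b A a B | ordinal_sort
# prints:
# A
# B
# a
# b
```

Without LC_ALL=C, the result depends on the active locale and on the libc collation tables, which is one way musl and glibc systems can end up disagreeing.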
Divergence on Windows
Symptom:
Windows: cgs:sha256:abc123...
macOS/Linux: cgs:sha256:def456...
Likely Causes:
- Path separator differences (\ vs /)
- Line ending differences (CRLF vs LF)
- Case sensitivity in file paths
Solutions:
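- Use Path.Combine for File Access, and Normalize Paths That Feed the Hash:
// ❌ Wrong (hardcoded separator)
var path = "dir\\file.txt";
// ✅ Correct for file access (cross-platform)
var path = Path.Combine("dir", "file.txt");
// Path.Combine returns platform-specific separators, so normalize before hashing a path
var hashedPath = path.Replace('\\', '/');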
- Normalize Line Endings:
// ❌ Wrong (platform line endings)
var text = File.ReadAllText(path);
// ✅ Correct (normalized to \n)
var text = File.ReadAllText(path).Replace("\r\n", "\n");
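The effect of line-ending normalization can be reproduced without .NET. The sketch below is an approximation of the C# fix: normalize_newlines is a hypothetical helper, and tr removes every carriage return, including lone ones, whereas the C# snippet only rewrites CRLF pairs.

```shell
# Hypothetical helper: strip carriage returns so CRLF input becomes LF-only.
normalize_newlines() {
  tr -d '\r'
}

crlf=$(printf 'line1\r\nline2\r\n')
lf=$(printf 'line1\nline2\n')
normalized=$(printf 'line1\r\nline2\r\n' | normalize_newlines)

# The raw CRLF text differs from the LF text, so it would hash differently.
[ "$crlf" != "$lf" ] && echo "raw CRLF text differs from LF text"
# After normalization the two are byte-identical.
[ "$normalized" = "$lf" ] && echo "normalized text matches LF text"
```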
Golden Hash Changes After .NET Upgrade
Symptom: After upgrading from .NET 10.0.100 to 10.0.101:
Expected: cgs:sha256:abc123...
Actual: cgs:sha256:def456...
Investigation:
- Check .NET Version:
dotnet --version # Should be consistent across platforms
- Check JsonSerializer Behavior:
// Test JSON serialization consistency (also compare this output across SDK versions, not only within one run)
var test = new { name = "test", value = 123 };
var json1 = JsonSerializer.Serialize(test, CanonicalJsonOptions);
var json2 = JsonSerializer.Serialize(test, CanonicalJsonOptions);
Assert.Equal(json1, json2);
- Check Hash Algorithm:
// Verify SHA256 produces expected output
var input = "test";
var hash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(input))).ToLowerInvariant();
// Should be: 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08
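The expected SHA-256 value can be cross-checked outside .NET, which helps separate a runtime regression from a change in the hashed input. This is a minimal sketch; sha256_hex is a hypothetical wrapper that uses sha256sum where available and shasum otherwise.

```shell
# Use whichever SHA-256 tool is installed (sha256sum on Linux, shasum on macOS).
sha256_hex() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum | cut -d' ' -f1
  else
    shasum -a 256 | cut -d' ' -f1
  fi
}

expected='9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08'
# printf '%s' avoids the trailing newline that echo would add to the input.
actual=$(printf '%s' 'test' | sha256_hex)

if [ "$actual" = "$expected" ]; then
  echo "SHA-256 reference value matches"
else
  echo "unexpected digest: $actual"
fi
```

If this check passes on every platform while the CGS hash still differs, the divergence most likely lies in how the input is canonicalized, not in the hash function.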
Resolution:
- If the .NET change is intentional and breaking, follow the Breaking Change Process
- If the .NET change is unintentional, file a bug with the .NET team
Breaking Change Process
When Golden Hash MUST Change
Golden hash changes are breaking changes that affect audit trail integrity. Only change for:
- Security Fixes: Vulnerability in hash computation
- Correctness Bugs: Hash not deterministic or incorrect
- Platform Incompatibility: Hash diverges across platforms
Change Process
Step 1: Document in ADR
Create docs/adr/NNNN-cgs-hash-algorithm-change.md:
# ADR NNNN: CGS Hash Algorithm Change
## Status
ACCEPTED (2025-03-15)
## Context
The current CGS hash computation has a bug in VEX document ordering that causes non-deterministic results when VEX documents have identical timestamps.
## Decision
Update VerdictBuilderService to sort VEX documents by (timestamp, cve_id, issuer_id) instead of just (timestamp).
## Consequences
### Breaking Changes
- Golden hash will change from `cgs:sha256:abc123...` to `cgs:sha256:def456...`
- All historical verdicts computed before 2025-03-15 will have old hash format
- Audit trail verification requires dual-algorithm support during transition
### Migration Strategy
1. Deploy dual-algorithm support (v1 and v2 hash computation)
2. Recompute all verdicts created after 2025-02-01 with v2 algorithm
3. Store both v1 and v2 hashes for 90-day transition period
4. Deprecate v1 algorithm on 2025-06-15
### Testing
- Verify v2 hash is deterministic across all platforms
- Verify v1 verdicts can still be verified during transition
- Load test recomputation of 1M+ verdicts
Step 2: Implement Dual-Algorithm Support
public enum CgsHashVersion
{
V1 = 1, // Original algorithm (superseded 2025-03-15; supported until the transition period ends)
V2 = 2 // Fixed VEX ordering (current)
}
public string ComputeCgsHash(EvidencePack evidence, PolicyLock policyLock, CgsHashVersion version = CgsHashVersion.V2)
{
return version switch
{
CgsHashVersion.V1 => ComputeCgsHashV1(evidence, policyLock),
CgsHashVersion.V2 => ComputeCgsHashV2(evidence, policyLock),
_ => throw new ArgumentException($"Unsupported CGS hash version: {version}")
};
}
Step 3: Update Tests with Both Versions
[Theory]
[InlineData(CgsHashVersion.V1, "cgs:sha256:abc123...")] // Old golden hash
[InlineData(CgsHashVersion.V2, "cgs:sha256:def456...")] // New golden hash
public async Task CgsHash_WithKnownEvidence_MatchesGoldenHash_BothVersions(
CgsHashVersion version,
string expectedHash)
{
// Test both algorithms during transition period
var evidence = CreateKnownEvidencePack();
var policyLock = CreateKnownPolicyLock();
var service = CreateVerdictBuilder();
var result = await service.BuildAsync(evidence, policyLock, version, CancellationToken.None);
result.CgsHash.Should().Be(expectedHash);
}
Step 4: Create Migration Script
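// tools/migrate-cgs-hashes.cs
// Sketch: IVerdictRepository is an assumed abstraction over verdict storage, not a confirmed project type.
// ComputeCgsHashV2 is expected to be provided by the hashing service shown in Step 2.
public class CgsHashMigrator
{
    private readonly IVerdictRepository _repository;

    public CgsHashMigrator(IVerdictRepository repository)
    {
        _repository = repository;
    }

    public async Task MigrateVerdicts(DateTimeOffset since)
    {
        var verdicts = await _repository.GetVerdictsSince(since);
        foreach (var verdict in verdicts)
        {
            // Recompute with V2 algorithm
            var newHash = ComputeCgsHashV2(verdict.Evidence, verdict.PolicyLock);
            // Store both hashes during transition
            await _repository.UpdateVerdict(verdict.Id, new
            {
                CgsHashV1 = verdict.CgsHash,
                CgsHashV2 = newHash,
                MigratedAt = DateTimeOffset.UtcNow
            });
        }
    }
}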
Step 5: Coordinate Deployment
Timeline:
- Week 1: Deploy dual-algorithm support to staging
- Week 2: Run migration script on staging data
- Week 3: Verify all verdicts have both v1 and v2 hashes
- Week 4: Deploy to production
- Week 5-16: transition period of roughly 90 days (both algorithms supported)
- Week 17: Deprecate v1, remove from codebase
Step 6: Update Golden Hash
After successful migration:
// src/__Tests/Determinism/CgsDeterminismTests.cs
[Fact]
public async Task CgsHash_WithKnownEvidence_MatchesGoldenHash()
{
// Arrange
var evidence = CreateKnownEvidencePack();
var policyLock = CreateKnownPolicyLock();
var service = CreateVerdictBuilder();
// Act
var result = await service.BuildAsync(evidence, policyLock, CancellationToken.None);
// Assert - Updated golden hash (2025-03-15)
var goldenHash = "cgs:sha256:def456..."; // V2 algorithm
result.CgsHash.Should().Be(goldenHash, "CGS hash must match golden file (V2 algorithm)");
}
Step 7: Document in Changelog
## CHANGELOG
### [2.0.0] - 2025-03-15 - BREAKING CHANGE
#### Changed
- **CGS Hash Algorithm**: Fixed VEX ordering bug (#1234)
- Old: `cgs:sha256:abc123...`
- New: `cgs:sha256:def456...`
- Migration: All verdicts after 2025-02-01 recomputed
- Dual-algorithm support: 90 days (until 2025-06-15)
#### Migration Guide
See: `docs/migrations/cgs-hash-v2-migration.md`
#### ADR
See: `docs/adr/0042-cgs-hash-algorithm-change.md`
Best Practices
1. Never Change Golden Hash Without ADR
Every golden hash change MUST have an ADR documenting:
- Why the change is necessary
- Impact on historical data
- Migration strategy
- Testing plan
2. Always Support Dual Algorithms During Transition
For 90 days after a change, support both the old and new algorithms to avoid breaking existing integrations.
3. Run Cross-Platform Tests Before Merge
Never merge golden hash changes without verifying all 5 platforms produce identical results.
4. Version Golden Hashes in Baseline Files
Maintain historical record:
docs/testing/baselines/
├── cgs-golden-hash-20250101-v1.md # Original
└── cgs-golden-hash-20250315-v2.md # Updated
5. Automate Monitoring
Set up daily cross-platform runs to detect drift early:
# .gitea/workflows/golden-hash-monitor.yml
on:
schedule:
- cron: '0 0 * * *' # Daily at midnight UTC
References
- Sprint Documentation: docs/implplan/archived/SPRINT_20251229_001_001_BE_cgs_infrastructure.md
- Test README: src/__Tests/Determinism/README.md
- CI/CD Workflow: .gitea/workflows/cross-platform-determinism.yml
- Batch Summary: docs/implplan/archived/2025-12-29-completed-sprints/BATCH_20251229_BE_COMPLETION_SUMMARY.md
Support
For questions or issues:
- Create issue with label: determinism, golden-file
- Priority: Critical (affects audit trail integrity)
- Slack: #determinism-testing