# ADR 0042: CGS Merkle Tree Implementation

## Status

ACCEPTED (2025-12-29)

## Context

The CGS (Canonical Graph Signature) system requires deterministic hash computation for verdicts. We need to decide whether to:

1. Reuse the existing `StellaOps.Attestor.ProofChain` Merkle tree builder
2. Build a custom Merkle tree implementation in `VerdictBuilderService`

### Requirements

- **Determinism**: The same evidence must always produce an identical CGS hash
- **Order Independence**: VEX document ordering must not affect the hash (documents are sorted internally)
- **Cross-Platform**: Identical hash on Windows, macOS, Linux (glibc), Linux (musl), BSD
- **Leaf Composition**: Fixed ordering of evidence components (SBOM, feed snapshot digest, VEX sorted, reachability, policy lock)

### Existing ProofChain Merkle Builder

Located at: `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/`

**Pros:**

- Already implements Merkle tree construction
- Tested and proven in production
- Handles parent/child attestation chains

**Cons:**

- Designed for attestation chains, not evidence hashing
- Includes attestation-specific metadata in the hash
- Doesn't support the custom leaf ordering required for CGS
- Would require modifications that might break existing attestation behavior

## Decision

**Build custom Merkle tree implementation in `VerdictBuilderService`.**

### Rationale

1. **Separation of Concerns**: CGS hash computation has different requirements than attestation chain verification
2. **Full Control Over Determinism**: A custom implementation allows:
   - Explicit leaf ordering: SBOM → Feed snapshot digest → VEX (sorted) → Reachability → PolicyLock
   - VEX documents sorted by their canonical JSON content (not insertion order)
   - Culture-invariant string comparison (`StringComparer.Ordinal`)
3. **Simplicity**: ~50 lines of code vs modifying 500+ lines in ProofChain
4. **No Breaking Changes**: Doesn't affect existing attestation infrastructure

### Implementation

```csharp
// VerdictBuilderService.cs
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// CanonicalJsonOptions is defined elsewhere in VerdictBuilderService.
private static string ComputeCgsHash(EvidencePack evidence, PolicyLock policyLock)
{
    // Build Merkle tree leaves from evidence components in a fixed order
    var leaves = new List<string>
    {
        ComputeHash(evidence.SbomCanonJson),
        ComputeHash(evidence.FeedSnapshotDigest)
    };

    // Add VEX digests in sorted order (ORDER-CRITICAL for determinism!)
    foreach (var vex in evidence.VexCanonJson.OrderBy(v => v, StringComparer.Ordinal))
    {
        leaves.Add(ComputeHash(vex));
    }

    // Add reachability if present
    if (!string.IsNullOrEmpty(evidence.ReachabilityGraphJson))
    {
        leaves.Add(ComputeHash(evidence.ReachabilityGraphJson));
    }

    // Add policy lock hash
    var policyLockJson = JsonSerializer.Serialize(policyLock, CanonicalJsonOptions);
    leaves.Add(ComputeHash(policyLockJson));

    // Build Merkle root
    var merkleRoot = BuildMerkleRoot(leaves);
    return $"cgs:sha256:{merkleRoot}";
}

private static string BuildMerkleRoot(List<string> leaves)
{
    if (leaves.Count == 0) return ComputeHash("");
    if (leaves.Count == 1) return leaves[0];

    var level = leaves.ToList();
    while (level.Count > 1)
    {
        var nextLevel = new List<string>();
        for (int i = 0; i < level.Count; i += 2)
        {
            if (i + 1 < level.Count)
            {
                // Combine two hex-encoded hashes and hash the concatenation
                var combined = level[i] + level[i + 1];
                nextLevel.Add(ComputeHash(combined));
            }
            else
            {
                // Odd number of nodes, promote last one unchanged
                nextLevel.Add(level[i]);
            }
        }
        level = nextLevel;
    }

    return level[0];
}

private static string ComputeHash(string input)
{
    // SHA-256 over the UTF-8 bytes, returned as lowercase hex
    var bytes = Encoding.UTF8.GetBytes(input);
    var hashBytes = SHA256.HashData(bytes);
    return Convert.ToHexString(hashBytes).ToLowerInvariant();
}
```

## Consequences

### Positive

- ✅ Full control over CGS hash computation logic
- ✅ No risk of breaking existing attestation chains
- ✅ Simple, testable implementation (~50 lines)
- ✅ Explicit ordering guarantees determinism
- ✅ Cross-platform verified (Windows, macOS, Linux, Alpine, Debian)

### Negative

- ⚠️ Code duplication with ProofChain (minimal - different use case)
- ⚠️ Need to maintain a separate Merkle tree implementation (low maintenance burden)

### Neutral

- 📝 Custom implementation documented in tests (CgsDeterminismTests.cs)
- 📝 Future: Could extract shared Merkle tree primitives if needed

## Alternatives Considered

### Alternative 1: Modify ProofChain Builder

**Rejected because:**

- Would require adding configuration options to ProofChain
- Risk of breaking existing attestation behavior
- Increased complexity for both use cases
- Tight coupling between verdict and attestation systems

### Alternative 2: Use Third-Party Merkle Tree Library

**Rejected because:**

- External dependency for ~50 lines of code
- Less control over ordering and hash format
- Potential platform-specific issues
- Security review overhead

### Alternative 3: Single-Level Hash (No Merkle Tree)

**Rejected because:**

- Loses incremental verification capability
- Can't prove individual evidence components without the full evidence pack
- Less efficient for large evidence packs (can't skip unchanged components)

## Verification

### Test Coverage

File: `src/__Tests/Determinism/CgsDeterminismTests.cs`

1. **Golden File Test**: Known evidence produces the expected hash
2. **10-Iteration Stability**: Same input produces an identical hash 10 times
3. **VEX Order Independence**: VEX document ordering doesn't affect the hash
4. **Reachability Inclusion**: Adding or changing the reachability graph changes the hash
5. **Policy Lock Versioning**: Different policy versions produce different hashes

### Cross-Platform Verification

CI/CD Workflow: `.gitea/workflows/cross-platform-determinism.yml`

- ✅ Windows (Microsoft C runtime)
- ✅ macOS (BSD libc)
- ✅ Linux Ubuntu (glibc)
- ✅ Linux Alpine (musl libc)
- ✅ Linux Debian (glibc)

All platforms produce an identical CGS hash for the same input.

## Migration

No migration required - this is a new feature.
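As an illustration of the VEX order-independence check listed under Test Coverage, the following stand-alone sketch hashes two differently ordered VEX inputs and compares the resulting roots. It is not the code in `CgsDeterminismTests.cs`: the helper names `RootOfSortedVex`, `MerkleRoot`, and `Sha256Hex` and the sample JSON strings are assumptions made for this example, and only the VEX leaves are modeled (no SBOM, feed snapshot, reachability, or policy lock).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sample data: the same two VEX documents in different orders.
string[] vexA = { "{\"id\":\"vex-2\"}", "{\"id\":\"vex-1\"}" };
string[] vexB = { "{\"id\":\"vex-1\"}", "{\"id\":\"vex-2\"}" };

string rootA = RootOfSortedVex(vexA);
string rootB = RootOfSortedVex(vexB);

Console.WriteLine(rootA == rootB ? "order-independent" : "order-dependent");

// Sorts documents with ordinal comparison before hashing them into leaves,
// so the caller's input order cannot influence the root.
static string RootOfSortedVex(IEnumerable<string> vexDocs)
{
    var leaves = vexDocs
        .OrderBy(v => v, StringComparer.Ordinal)
        .Select(Sha256Hex)
        .ToList();
    return MerkleRoot(leaves);
}

// Pairwise-combines hex hashes level by level; an unpaired last node is
// promoted unchanged, mirroring the scheme described in this ADR.
static string MerkleRoot(IReadOnlyList<string> leaves)
{
    if (leaves.Count == 0) return Sha256Hex("");
    var level = leaves.ToList();
    while (level.Count > 1)
    {
        var next = new List<string>();
        for (int i = 0; i < level.Count; i += 2)
        {
            next.Add(i + 1 < level.Count
                ? Sha256Hex(level[i] + level[i + 1])
                : level[i]);
        }
        level = next;
    }
    return level[0];
}

// SHA-256 over UTF-8 bytes, returned as lowercase hex.
static string Sha256Hex(string input)
{
    byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
    return Convert.ToHexString(hash).ToLowerInvariant();
}
```

Because both inputs sort to the same sequence before hashing, the two roots are equal and the program prints `order-independent`. Removing the `OrderBy` call would make the root depend on the caller's ordering, which is the failure mode the test guards against.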
## References

- **Sprint**: `docs/implplan/archived/SPRINT_20251229_001_001_BE_cgs_infrastructure.md`
- **Implementation**: `src/__Libraries/StellaOps.Verdict/VerdictBuilderService.cs`
- **Tests**: `src/__Tests/Determinism/CgsDeterminismTests.cs`
- **ProofChain**: `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/`

## Decision Date

2025-12-29

## Decision Makers

- Backend Team
- Security Team
- Attestation Team (consulted)

## Review Date

2026-06-29 (6 months) - Evaluate if code duplication warrants shared library