# Canonicalization and Content Addressing

## Module
Attestor

## Status
VERIFIED

## Description
RFC 8785 JSON canonicalization, deterministic Merkle tree building, and content-addressed ID generation for all proof chain artifacts ensuring stable hashing.

## Implementation Details
- **RFC 8785 Canonicalizer**: `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/Rfc8785JsonCanonicalizer.cs` -- implements `IJsonCanonicalizer`. Partials:
  - `.DecimalPoint` -- decimal point handling per RFC 8785
  - `.NumberSerialization` -- IEEE 754 number serialization
  - `.StringNormalization` -- Unicode escape and string normalization
  - `.WriteMethods` -- low-level write operations
- **SBOM Canonicalizer**: `src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Canonicalization/SbomCanonicalizer.cs` (with `.Elements`) -- implements `ISbomCanonicalizer` for deterministic SBOM element ordering.
- **Content-Addressed ID Generator**: `__Libraries/StellaOps.Attestor.ProofChain/Identifiers/ContentAddressedIdGenerator.cs` (with `.Graph`) -- `IContentAddressedIdGenerator` implementation. Generates SHA-256 content-addressed IDs.
- **ID Types**: `ContentAddressedId.cs` (base), `GenericContentAddressedId.cs`, `ArtifactId.cs`, `EvidenceId.cs`, `ProofBundleId.cs`, `VexVerdictId.cs`, `ReasoningId.cs`, `SbomEntryId.cs`, `TrustAnchorId.cs`, `GraphRevisionId.cs`.
- **SHA-256 Parser**: `Sha256IdParser.cs` -- parses `sha256:` formatted IDs.
- **Proof Hashing**: `ProofHashing.cs` -- utility methods for proof chain hashing.
- **Merkle Tree**: `Merkle/DeterministicMerkleTreeBuilder.cs` (with `.Helpers`, `.Proof`) -- deterministic tree construction.
- **Tests**: `__Tests/StellaOps.Attestor.ProofChain.Tests/JsonCanonicalizerTests.cs`, `ContentAddressedIdTests.cs`, `ContentAddressedIdGeneratorTests.cs`, `MerkleTreeBuilderTests.cs`

## E2E Test Plan
- [ ] Canonicalize JSON with out-of-order keys and verify output has keys in lexicographic order per RFC 8785
- [ ] Canonicalize JSON with Unicode escapes (e.g., `\u00e9`) and verify normalization to UTF-8
- [ ] Canonicalize JSON with floating-point numbers and verify IEEE 754 serialization
- [ ] Generate a content-addressed ID for a proof blob and verify it matches `sha256:<64-hex-chars>` format
- [ ] Verify `Sha256IdParser` correctly parses valid IDs and rejects malformed ones
- [ ] Canonicalize an SBOM document via `SbomCanonicalizer` and verify element ordering is deterministic
- [ ] Build a Merkle tree from canonicalized artifacts and verify the root hash is stable across invocations
- [ ] Generate `SbomEntryId` for identical SBOM component content and verify ID equality

## Verification
| Check | Result |
|-------|--------|
| Tier 0 - Source Verification | PASS |
| Tier 1 - Build + Code Review | PASS |
| Tier 2 - Behavioral Verification | PASS |
| Verified Date | 2026-02-13 |
| Run ID | run-001 |
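
## Illustrative Sketch

The properties exercised by the E2E test plan (key ordering, `sha256:` ID format, parser rejection of malformed IDs, and a stable Merkle root) can be illustrated with a small Python sketch. This is not the C# implementation: the function names `canonicalize`, `content_id`, `parse_id`, and `merkle_root` are hypothetical, and several behaviors are assumptions. In particular, the canonicalization here is a simplification that sorts keys and strips whitespace but does not implement RFC 8785 number serialization or its UTF-16 code unit key ordering, and the Merkle pairing rule (duplicating the last node on odd levels) is assumed, since this document does not state how `DeterministicMerkleTreeBuilder` combines nodes.

```python
import hashlib
import json
import re
from typing import Any, Sequence

# Shape checked by the test plan: "sha256:" followed by 64 lowercase hex chars.
_ID_PATTERN = re.compile(r"sha256:[0-9a-f]{64}")


def canonicalize(value: Any) -> bytes:
    """Simplified canonical form: sorted keys, no whitespace, UTF-8 output.

    Assumption: this approximates RFC 8785 only for strings, integers,
    booleans, null, and BMP-only keys. Floats are NOT handled per the RFC.
    """
    text = json.dumps(
        value,
        sort_keys=True,          # deterministic key order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit real UTF-8 instead of \uXXXX escapes
    )
    return text.encode("utf-8")


def content_id(value: Any) -> str:
    """Hash the canonical bytes and format the digest as a sha256: ID."""
    return "sha256:" + hashlib.sha256(canonicalize(value)).hexdigest()


def parse_id(text: str) -> bytes:
    """Return the raw 32-byte digest, or raise ValueError if malformed."""
    if not _ID_PATTERN.fullmatch(text):
        raise ValueError(f"not a valid sha256 content ID: {text!r}")
    return bytes.fromhex(text[len("sha256:"):])


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Compute a Merkle root over the leaves in the order given.

    Assumption: leaves are hashed with SHA-256, parents are
    SHA-256(left || right), and an odd level duplicates its last node.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


if __name__ == "__main__":
    # Same content with different key order yields the same ID.
    a = content_id({"name": "pkg", "version": "1.0"})
    b = content_id({"version": "1.0", "name": "pkg"})
    print(a == b)
    # The root depends only on the ordered leaf bytes.
    artifacts = [canonicalize({"n": i}) for i in range(3)]
    print(merkle_root(artifacts).hex())
```

Because the ID is derived only from canonical bytes, two producers that serialize the same logical document differently still agree on the ID. The Merkle root is order-sensitive, so callers need a fixed leaf ordering for the root to stay stable across invocations.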