Canonicalization and Content Addressing

Module

Attestor

Status

IMPLEMENTED

Description

Provides RFC 8785 JSON canonicalization, deterministic Merkle tree building, and content-addressed ID generation for all proof chain artifacts, so that identical content always hashes to the same value.
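
To make the relationship between canonicalization and content addressing concrete, here is a minimal Python sketch. It is not the StellaOps C# implementation: `canonical_bytes` and `content_addressed_id` are hypothetical names, and the standard-library `json` module only approximates RFC 8785 for simple inputs (no floating-point numbers, no keys outside the Basic Multilingual Plane).

```python
import hashlib
import json


def canonical_bytes(value):
    """Approximate canonical JSON: sorted keys, no whitespace, UTF-8 output.

    Assumption: this matches RFC 8785 only for simple inputs. Floats and
    keys containing non-BMP characters need a real RFC 8785 implementation.
    """
    text = json.dumps(
        value,
        sort_keys=True,          # stable key order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit literal characters, not \uXXXX escapes
    )
    return text.encode("utf-8")


def content_addressed_id(value):
    """Hash the canonical bytes and format the digest as sha256:<hex>."""
    return "sha256:" + hashlib.sha256(canonical_bytes(value)).hexdigest()


# Two documents that differ only in key order and escape style.
doc_a = json.loads('{"b": 1, "a": "\\u00e9"}')
doc_b = json.loads('{"a": "é", "b": 1}')

print(canonical_bytes(doc_a).decode("utf-8"))  # {"a":"é","b":1}
print(content_addressed_id(doc_a) == content_addressed_id(doc_b))  # True
```

Because the ID is derived from the canonical bytes rather than from the original text, formatting differences in the input do not change the ID.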

Implementation Details

  • RFC 8785 Canonicalizer: src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/Rfc8785JsonCanonicalizer.cs -- implements IJsonCanonicalizer. Partials:
    • .DecimalPoint -- decimal point handling per RFC 8785
    • .NumberSerialization -- IEEE 754 number serialization
    • .StringNormalization -- Unicode escape and string normalization
    • .WriteMethods -- low-level write operations
  • SBOM Canonicalizer: src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Canonicalization/SbomCanonicalizer.cs (with .Elements) -- implements ISbomCanonicalizer for deterministic SBOM element ordering.
  • Content-Addressed ID Generator: __Libraries/StellaOps.Attestor.ProofChain/Identifiers/ContentAddressedIdGenerator.cs (with .Graph) -- IContentAddressedIdGenerator implementation. Generates SHA-256 content-addressed IDs.
  • ID Types: ContentAddressedId.cs (base), GenericContentAddressedId.cs, ArtifactId.cs, EvidenceId.cs, ProofBundleId.cs, VexVerdictId.cs, ReasoningId.cs, SbomEntryId.cs, TrustAnchorId.cs, GraphRevisionId.cs.
  • SHA-256 Parser: Sha256IdParser.cs -- parses sha256:<hex> formatted IDs.
  • Proof Hashing: ProofHashing.cs -- utility methods for proof chain hashing.
  • Merkle Tree: Merkle/DeterministicMerkleTreeBuilder.cs (with .Helpers, .Proof) -- deterministic tree construction.
  • Tests: __Tests/StellaOps.Attestor.ProofChain.Tests/JsonCanonicalizerTests.cs, ContentAddressedIdTests.cs, ContentAddressedIdGeneratorTests.cs, MerkleTreeBuilderTests.cs
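
The source does not describe how DeterministicMerkleTreeBuilder combines nodes, so the following Python sketch shows only one common way to build a deterministic Merkle root. Everything in it is an assumption made for illustration: the function name `merkle_root`, the 0x00/0x01 domain-separation prefixes (borrowed from RFC 6962), and the choice to duplicate the last node when a level has an odd count.

```python
import hashlib


def _sha256(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute a Merkle root over byte-string leaves, in the order given.

    Assumptions (not taken from the StellaOps code):
      - leaf hash     = SHA-256(0x00 || leaf)
      - interior hash = SHA-256(0x01 || left || right)
      - an unpaired node at the end of a level is paired with itself
    """
    if not leaves:
        raise ValueError("at least one leaf is required")

    level = [_sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return "sha256:" + level[0].hex()


artifacts = [b'{"id":1}', b'{"id":2}', b'{"id":3}']
print(merkle_root(artifacts) == merkle_root(list(artifacts)))  # True
```

The root depends on leaf order in this sketch, so callers that want order-independent results would need to sort the leaves (for example by their hashes) before building the tree.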

E2E Test Plan

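  • Canonicalize JSON with out-of-order keys and verify the output has keys sorted by UTF-16 code units, as RFC 8785 requires
  • Canonicalize JSON with Unicode escapes (e.g., \u00e9) and verify the output contains the literal character (é) encoded as UTF-8
  • Canonicalize JSON with floating-point numbers and verify they are serialized as RFC 8785 specifies (the ECMAScript shortest round-trip form of IEEE 754 doubles, e.g., 1.0 becomes 1)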
  • Generate a content-addressed ID for a proof blob and verify it matches sha256:<64-hex-chars> format
  • Verify Sha256IdParser correctly parses valid IDs and rejects malformed ones
  • Canonicalize an SBOM document via SbomCanonicalizer and verify element ordering is deterministic
  • Build a Merkle tree from canonicalized artifacts and verify the root hash is stable across invocations
  • Generate an SbomEntryId for each of two SBOM components with identical content and verify the IDs are equal
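
The format and parser checks in this plan can be sketched in Python as follows. This is an illustration rather than the behavior of Sha256IdParser itself: `parse_sha256_id` and `is_valid_sha256_id` are hypothetical names, and accepting only lowercase hex is an assumption, since the source states only the sha256:<64-hex-chars> shape.

```python
import re

# Assumption: exactly 64 lowercase hex characters after the "sha256:" prefix.
_SHA256_ID = re.compile(r"sha256:([0-9a-f]{64})")


def parse_sha256_id(text):
    """Return the raw 32-byte digest, or raise ValueError if malformed."""
    match = _SHA256_ID.fullmatch(text)
    if match is None:
        raise ValueError("not a sha256:<64-hex-chars> ID: %r" % (text,))
    return bytes.fromhex(match.group(1))


def is_valid_sha256_id(text):
    """Convenience wrapper for tests that only need a yes/no answer."""
    try:
        parse_sha256_id(text)
    except ValueError:
        return False
    return True


valid = "sha256:" + "ab" * 32
malformed = [
    "sha256:" + "ab" * 31,   # too short
    "sha256:" + "zz" * 32,   # not hex
    "md5:" + "ab" * 32,      # wrong algorithm prefix
    valid + "\n",            # trailing newline
]

print(is_valid_sha256_id(valid))                      # True
print(any(is_valid_sha256_id(m) for m in malformed))  # False
```

Using `fullmatch` rather than `match` rejects trailing characters, including a trailing newline, which a `$` anchor alone would let through.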