sprints work
# Sprint 8100.0012.0001 · Canonicalizer Versioning for Content-Addressed Identifiers

## Topic & Scope

Embed canonicalization version markers in content-addressed hashes to prevent future hash collisions when canonicalization logic evolves. This sprint delivers:

1. **Canonicalizer Version Constant**: Define `CanonVersion.V1 = "stella:canon:v1"` as a stable version identifier.
2. **Version-Prefixed Hashing**: Update `ContentAddressedIdGenerator` to include the version marker in canonicalized payloads before hashing.
3. **Backward Compatibility**: Existing hashes remain valid; new hashes include the version marker; verification can detect and handle both formats.
4. **Documentation**: Update architecture docs with canonicalization versioning rationale and upgrade path.

**Working directory:** `src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/`, `src/__Libraries/StellaOps.Canonical.Json/`, `src/__Libraries/__Tests/`.

**Evidence:** All content-addressed IDs include a version marker; determinism tests pass; backward compatibility verified; no hash collisions between v0 (legacy) and v1 (versioned).

---

## Dependencies & Concurrency

- **Depends on:** None (foundational change).
- **Blocks:** Sprint 8100.0012.0002 (Unified Evidence Model), Sprint 8100.0012.0003 (Graph Root Attestation); both depend on stable versioned hashing.
- **Safe to run in parallel with:** Unrelated module work.

---

## Documentation Prerequisites

- `docs/modules/attestor/README.md` (Attestor architecture)
- `docs/modules/attestor/proof-chain.md` (Proof chain design)
- Product Advisory: Merkle-Hash REG (this sprint's origin)

---

## Problem Statement

### Current State

The `ContentAddressedIdGenerator` computes hashes by:

1. Serializing predicates to JSON with `JsonSerializer`
2. Canonicalizing via `IJsonCanonicalizer` (RFC 8785)
3. Computing SHA-256 of canonical bytes

**Problem:** If the canonicalization algorithm ever changes (bug fix, spec update, optimization), existing hashes become invalid with no way to distinguish which version produced them.
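
The ambiguity can be illustrated with a minimal sketch. It assumes two hypothetical canonical renderings of the same logical value (one writes the number as `1.0`, the other as `1`). `LegacyHashAmbiguityDemo` and its members are illustrative names, not part of the StellaOps codebase.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Illustrative only: a bare SHA-256 digest carries no hint about
// which canonicalization rules produced the hashed bytes.
public static class LegacyHashAmbiguityDemo
{
    // Lowercase hex SHA-256 of the UTF-8 bytes of an already-canonical JSON string.
    public static string Sha256Hex(string canonicalJson)
    {
        byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(canonicalJson));
        return Convert.ToHexString(digest).ToLowerInvariant();
    }

    // True when two canonical renderings hash to the same identifier.
    public static bool SameId(string renderingA, string renderingB) =>
        string.Equals(Sha256Hex(renderingA), Sha256Hex(renderingB), StringComparison.Ordinal);
}
```

Calling `LegacyHashAmbiguityDemo.SameId("{\"score\":1.0}", "{\"score\":1}")` compares the identifiers of the two renderings. Because the bytes differ, the digests differ, and neither digest records which rendering rule was applied.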

### Target State

Include a version marker in the canonical representation:

```json
{
  "_canonVersion": "stella:canon:v1",
  "evidenceSource": "...",
  "sbomEntryId": "...",
  ...
}
```

The version marker:

- Is written first (under ordinal ordering the underscore prefix sorts before lowercase ASCII letters, so it precedes property names such as `evidenceSource`; it would not precede names starting with a digit or an uppercase letter)
- Identifies the exact canonicalization algorithm used
- Enables verifiers to select the correct algorithm
- Allows graceful migration to future versions
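
Where an underscore-prefixed key lands relative to other keys under ordinal comparison can be checked with a short sketch. `OrdinalOrderDemo` and the key `Zeta` are hypothetical and used only for illustration.

```csharp
using System;
using System.Linq;

// Illustrative only: sorts property names the way an ordinal key sort would.
public static class OrdinalOrderDemo
{
    // Returns the keys ordered by ordinal (code unit) comparison.
    public static string[] SortOrdinal(params string[] keys) =>
        keys.OrderBy(k => k, StringComparer.Ordinal).ToArray();
}
```

`OrdinalOrderDemo.SortOrdinal("sbomEntryId", "_canonVersion", "Zeta", "evidenceSource")` yields `Zeta`, `_canonVersion`, `evidenceSource`, `sbomEntryId`: `_` (U+005F) sorts after `Z` (U+005A) and before `e` (U+0065). Writing the marker explicitly as the first property keeps its position independent of the other property names.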

---

## Design Specification

### CanonVersion Constants

```csharp
// src/__Libraries/StellaOps.Canonical.Json/CanonVersion.cs
namespace StellaOps.Canonical.Json;

/// <summary>
/// Canonicalization version identifiers for content-addressed hashing.
/// </summary>
public static class CanonVersion
{
    /// <summary>
    /// Version 1: RFC 8785 JSON canonicalization with:
    /// - Ordinal key sorting
    /// - No whitespace
    /// - UTF-8 encoding without BOM
    /// - IEEE 754 number formatting
    /// </summary>
    public const string V1 = "stella:canon:v1";

    /// <summary>
    /// Field name for version marker in canonical JSON.
    /// Underscore prefix sorts before lowercase ASCII property names under ordinal ordering.
    /// </summary>
    public const string VersionFieldName = "_canonVersion";

    /// <summary>
    /// Current default version for new hashes.
    /// </summary>
    public const string Current = V1;
}
```

### Updated CanonJson API

```csharp
// src/__Libraries/StellaOps.Canonical.Json/CanonJson.cs (additions)

/// <summary>
/// Canonicalizes an object with version marker for content-addressed hashing.
/// </summary>
/// <typeparam name="T">The type to serialize.</typeparam>
/// <param name="obj">The object to canonicalize.</param>
/// <param name="version">Canonicalization version (default: Current).</param>
/// <returns>UTF-8 encoded canonical JSON bytes with version marker.</returns>
public static byte[] CanonicalizeVersioned<T>(T obj, string version = CanonVersion.Current)
{
    var json = JsonSerializer.SerializeToUtf8Bytes(obj, DefaultOptions);
    using var doc = JsonDocument.Parse(json);

    // The marker is a top-level property, so the serialized root must be a JSON object.
    if (doc.RootElement.ValueKind != JsonValueKind.Object)
    {
        throw new ArgumentException("Versioned canonicalization requires a JSON object root.", nameof(obj));
    }

    using var ms = new MemoryStream();
    using var writer = new Utf8JsonWriter(ms, new JsonWriterOptions { Indented = false });

    writer.WriteStartObject();
    // Written explicitly before all other properties, independent of sort order
    writer.WriteString(CanonVersion.VersionFieldName, version);

    // Write sorted properties from original object
    foreach (var prop in doc.RootElement.EnumerateObject()
        .OrderBy(p => p.Name, StringComparer.Ordinal))
    {
        writer.WritePropertyName(prop.Name);
        WriteElementSorted(prop.Value, writer);
    }

    writer.WriteEndObject();
    writer.Flush();
    return ms.ToArray();
}

/// <summary>
/// Computes SHA-256 hash with version marker.
/// </summary>
public static string HashVersioned<T>(T obj, string version = CanonVersion.Current)
{
    var canonical = CanonicalizeVersioned(obj, version);
    return Sha256Hex(canonical);
}

/// <summary>
/// Computes prefixed SHA-256 hash with version marker.
/// </summary>
public static string HashVersionedPrefixed<T>(T obj, string version = CanonVersion.Current)
{
    var canonical = CanonicalizeVersioned(obj, version);
    return Sha256Prefixed(canonical);
}
```
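
The snippet above relies on a `WriteElementSorted` helper that is not shown. The following is a self-contained sketch of one way such a recursive writer could look, operating on a raw JSON string instead of a typed object. `VersionedCanonSketch` and `WriteSorted` are hypothetical names, and the sketch does not implement the full RFC 8785 rules (number formatting and string escaping are left to `Utf8JsonWriter` defaults).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text.Json;

// Illustrative only: version marker first, then properties sorted ordinally at every level.
public static class VersionedCanonSketch
{
    public const string VersionField = "_canonVersion";

    public static byte[] Canonicalize(string json, string version)
    {
        using var doc = JsonDocument.Parse(json);
        if (doc.RootElement.ValueKind != JsonValueKind.Object)
        {
            throw new ArgumentException("Root must be a JSON object.", nameof(json));
        }

        using var ms = new MemoryStream();
        using (var writer = new Utf8JsonWriter(ms))
        {
            writer.WriteStartObject();
            writer.WriteString(VersionField, version);
            foreach (var prop in doc.RootElement.EnumerateObject()
                .OrderBy(x => x.Name, StringComparer.Ordinal))
            {
                writer.WritePropertyName(prop.Name);
                WriteSorted(prop.Value, writer);
            }
            writer.WriteEndObject();
        } // Disposing the writer flushes it into the stream.

        return ms.ToArray();
    }

    // Recursively writes an element, sorting object properties; array order is preserved.
    private static void WriteSorted(JsonElement element, Utf8JsonWriter writer)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.Object:
                writer.WriteStartObject();
                foreach (var prop in element.EnumerateObject()
                    .OrderBy(x => x.Name, StringComparer.Ordinal))
                {
                    writer.WritePropertyName(prop.Name);
                    WriteSorted(prop.Value, writer);
                }
                writer.WriteEndObject();
                break;
            case JsonValueKind.Array:
                writer.WriteStartArray();
                foreach (var item in element.EnumerateArray())
                {
                    WriteSorted(item, writer);
                }
                writer.WriteEndArray();
                break;
            default:
                element.WriteTo(writer); // strings, numbers, booleans, null
                break;
        }
    }
}
```

For the input `{"b":1,"a":{"y":2,"x":1}}` and version `stella:canon:v1`, the sketch emits `{"_canonVersion":"stella:canon:v1","a":{"x":1,"y":2},"b":1}`.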

### Updated ContentAddressedIdGenerator

```csharp
// src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Identifiers/ContentAddressedIdGenerator.cs

public EvidenceId ComputeEvidenceId(EvidencePredicate predicate)
{
    ArgumentNullException.ThrowIfNull(predicate);
    // Clear self-referential field, add version marker
    var toHash = predicate with { EvidenceId = null };
    var canonical = CanonicalizeVersioned(toHash, CanonVersion.Current);
    return new EvidenceId(HashSha256Hex(canonical));
}

// Similar updates for ComputeReasoningId, ComputeVexVerdictId, etc.

private byte[] CanonicalizeVersioned<T>(T value, string version)
{
    var json = JsonSerializer.SerializeToUtf8Bytes(value, SerializerOptions);
    return _canonicalizer.CanonicalizeWithVersion(json, version);
}
```
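
Clearing the self-referential field before hashing means the computed ID does not depend on whatever ID the predicate already carries. A minimal sketch of that property follows. `EvidenceSketch` is a hypothetical stand-in for `EvidencePredicate`, and plain `JsonSerializer` output stands in for the canonicalizer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical stand-in for EvidencePredicate with a self-referential ID field.
public sealed record EvidenceSketch(string Source, string? EvidenceId = null);

public static class SelfReferenceDemo
{
    // Computes an ID over the record with its own ID field cleared.
    public static string ComputeId(EvidenceSketch evidence)
    {
        ArgumentNullException.ThrowIfNull(evidence);
        EvidenceSketch toHash = evidence with { EvidenceId = null };
        byte[] json = JsonSerializer.SerializeToUtf8Bytes(toHash); // not canonicalized; illustration only
        return "sha256:" + Convert.ToHexString(SHA256.HashData(json)).ToLowerInvariant();
    }
}
```

With this sketch, `ComputeId(new EvidenceSketch("scanner"))` and `ComputeId(new EvidenceSketch("scanner", "sha256:stale"))` return the same value, since the stale ID is removed before serialization.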

### IJsonCanonicalizer Extension

```csharp
// src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Json/IJsonCanonicalizer.cs

public interface IJsonCanonicalizer
{
    /// <summary>
    /// Canonicalizes JSON bytes per RFC 8785.
    /// </summary>
    byte[] Canonicalize(ReadOnlySpan<byte> json);

    /// <summary>
    /// Canonicalizes JSON bytes with version marker prepended.
    /// </summary>
    byte[] CanonicalizeWithVersion(ReadOnlySpan<byte> json, string version);
}
```

---

## Backward Compatibility Strategy

### Phase 1: Dual-Mode (This Sprint)

- **Generation:** Always emit versioned hashes (v1)
- **Verification:** Accept both legacy (unversioned) and v1 hashes
- **Detection:** Check if canonical JSON starts with `{"_canonVersion":` to determine format

```csharp
public static bool IsVersionedHash(ReadOnlySpan<byte> canonicalJson)
{
    // Check for the version field at the start: the marker is written first,
    // and under ordinal sorting `_` precedes lowercase property names.
    return canonicalJson.Length > 20 &&
           canonicalJson.StartsWith("{\"_canonVersion\":"u8);
}
```
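
Beyond detecting that a marker is present, a verifier needs the marker's value to select an algorithm. The following sketch reads it with `Utf8JsonReader`. `CanonVersionReader` and `TryReadVersion` are hypothetical names, not part of the sprint's API.

```csharp
using System;
using System.Text.Json;

// Illustrative only: extracts the version marker when it is the first property.
public static class CanonVersionReader
{
    // Returns the marker value, or null for legacy, malformed, or non-object input.
    public static string? TryReadVersion(ReadOnlySpan<byte> canonicalJson)
    {
        try
        {
            var reader = new Utf8JsonReader(canonicalJson);
            if (!reader.Read() || reader.TokenType != JsonTokenType.StartObject) return null;
            if (!reader.Read() || reader.TokenType != JsonTokenType.PropertyName) return null;
            if (!reader.ValueTextEquals("_canonVersion"u8)) return null;
            if (!reader.Read() || reader.TokenType != JsonTokenType.String) return null;
            return reader.GetString();
        }
        catch (JsonException)
        {
            return null; // empty or invalid JSON is treated as "no version"
        }
    }
}
```

A caller could then branch on the returned string, for example treating `null` as the legacy format and `stella:canon:v1` as the v1 canonicalizer.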

### Phase 2: Migration (Future Sprint)

- Emit migration warnings for legacy hashes in logs
- Provide tooling to rehash attestations with version marker
- Document upgrade path in `docs/operations/canon-version-migration.md`

### Phase 3: Deprecation (Future Sprint)

- Remove legacy hash acceptance
- Fail verification for unversioned hashes
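
The three phases differ mainly in whether a legacy match is still accepted. A minimal sketch of that switch, assuming the caller has already produced both canonical forms of the payload, follows. `DualModeVerifierSketch`, `MatchedFormat`, and the `allowLegacy` flag are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

public enum MatchedFormat { None, Legacy, V1 }

// Illustrative only: compares an expected digest against v1 and (optionally) legacy forms.
public static class DualModeVerifierSketch
{
    public static string Sha256Hex(ReadOnlySpan<byte> data) =>
        Convert.ToHexString(SHA256.HashData(data)).ToLowerInvariant();

    // allowLegacy = true models Phases 1 and 2; false models Phase 3.
    public static MatchedFormat Verify(
        string expectedHex,
        ReadOnlySpan<byte> legacyCanonical,
        ReadOnlySpan<byte> v1Canonical,
        bool allowLegacy)
    {
        if (string.Equals(expectedHex, Sha256Hex(v1Canonical), StringComparison.OrdinalIgnoreCase))
        {
            return MatchedFormat.V1;
        }

        if (allowLegacy &&
            string.Equals(expectedHex, Sha256Hex(legacyCanonical), StringComparison.OrdinalIgnoreCase))
        {
            return MatchedFormat.Legacy; // Phase 2 would log a migration warning here
        }

        return MatchedFormat.None;
    }
}
```

Returning the matched format instead of a plain boolean lets the caller decide whether to warn, rehash, or reject.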

---

## Delivery Tracker

| # | Task ID | Status | Key dependency | Owners | Task Definition |
|---|---------|--------|----------------|--------|-----------------|
| **Wave 0 (Constants & Types)** | | | | | |
| 1 | CANON-8100-001 | DONE | None | Platform Guild | Create `CanonVersion.cs` with V1 constant and field name. |
| 2 | CANON-8100-002 | DONE | Task 1 | Platform Guild | Add `CanonicalizeVersioned<T>()` to `CanonJson.cs`. |
| 3 | CANON-8100-003 | DONE | Task 1 | Platform Guild | Add `HashVersioned<T>()` and `HashVersionedPrefixed<T>()` to `CanonJson.cs`. |
| **Wave 1 (Canonicalizer Updates)** | | | | | |
| 4 | CANON-8100-004 | DONE | Task 2 | Attestor Guild | Extend `IJsonCanonicalizer` with `CanonicalizeWithVersion()` method. |
| 5 | CANON-8100-005 | DONE | Task 4 | Attestor Guild | Implement `CanonicalizeWithVersion()` in `Rfc8785JsonCanonicalizer`. |
| 6 | CANON-8100-006 | DONE | Task 5 | Attestor Guild | Add `IsVersionedHash()` detection utility. |
| **Wave 2 (Generator Updates)** | | | | | |
| 7 | CANON-8100-007 | DONE | Tasks 4-6 | Attestor Guild | Update `ComputeEvidenceId()` to use versioned canonicalization. |
| 8 | CANON-8100-008 | DONE | Task 7 | Attestor Guild | Update `ComputeReasoningId()` to use versioned canonicalization. |
| 9 | CANON-8100-009 | DONE | Task 7 | Attestor Guild | Update `ComputeVexVerdictId()` to use versioned canonicalization. |
| 10 | CANON-8100-010 | DONE | Task 7 | Attestor Guild | Update `ComputeProofBundleId()` to use versioned canonicalization. |
| 11 | CANON-8100-011 | DONE | Task 7 | Attestor Guild | Update `ComputeGraphRevisionId()` to use versioned canonicalization. |
| **Wave 3 (Tests)** | | | | | |
| 12 | CANON-8100-012 | DONE | Tasks 7-11 | QA Guild | Add unit tests: versioned hash differs from legacy hash for same input. |
| 13 | CANON-8100-013 | DONE | Task 12 | QA Guild | Add determinism tests: same input + same version = same hash. |
| 14 | CANON-8100-014 | DONE | Task 12 | QA Guild | Add backward compatibility tests: verify both legacy and v1 hashes accepted. |
| 15 | CANON-8100-015 | DONE | Task 12 | QA Guild | Add golden file tests: snapshot of v1 canonical output for known inputs. |
| **Wave 4 (Documentation)** | | | | | |
| 16 | CANON-8100-016 | DONE | Tasks 7-11 | Docs Guild | Update `docs/modules/attestor/proof-chain.md` with versioning rationale. |
| 17 | CANON-8100-017 | DONE | Task 16 | Docs Guild | Create `docs/operations/canon-version-migration.md` with upgrade path. |
| 18 | CANON-8100-018 | DONE | Task 16 | Docs Guild | Update API reference with new `CanonJson` methods. |

---

## Wave Coordination

| Wave | Tasks | Focus | Evidence |
|------|-------|-------|----------|
| **Wave 0** | 1-3 | Constants and CanonJson API | `CanonVersion.cs` exists; `CanonJson` has versioned methods |
| **Wave 1** | 4-6 | Canonicalizer implementation | `IJsonCanonicalizer.CanonicalizeWithVersion()` works; detection utility works |
| **Wave 2** | 7-11 | Generator updates | All `Compute*Id()` methods use versioned hashing |
| **Wave 3** | 12-15 | Tests | All tests pass; golden files stable |
| **Wave 4** | 16-18 | Documentation | Docs updated; migration guide complete |

---

## Test Cases

### TC-001: Versioned Hash Differs from Legacy

```csharp
[Fact]
public void VersionedHash_DiffersFromLegacy_ForSameInput()
{
    var predicate = new EvidencePredicate { /* ... */ };

    var legacyHash = CanonJson.Hash(predicate);
    var versionedHash = CanonJson.HashVersioned(predicate, CanonVersion.V1);

    Assert.NotEqual(legacyHash, versionedHash);
}
```

### TC-002: Determinism Across Environments

```csharp
[Fact]
public void VersionedHash_IsDeterministic()
{
    var predicate = new EvidencePredicate { /* ... */ };

    var hash1 = CanonJson.HashVersioned(predicate, CanonVersion.V1);
    var hash2 = CanonJson.HashVersioned(predicate, CanonVersion.V1);

    Assert.Equal(hash1, hash2);
}
```

### TC-003: Version Field Sorts First

```csharp
[Fact]
public void VersionedCanonical_HasVersionFieldFirst()
{
    var predicate = new EvidencePredicate { Source = "test" };
    var canonical = CanonJson.CanonicalizeVersioned(predicate, CanonVersion.V1);
    var json = Encoding.UTF8.GetString(canonical);

    Assert.StartsWith("{\"_canonVersion\":\"stella:canon:v1\"", json);
}
```

### TC-004: Golden File Stability

```csharp
[Fact]
public async Task VersionedCanonical_MatchesGoldenFile()
{
    var predicate = CreateKnownPredicate();
    var canonical = CanonJson.CanonicalizeVersioned(predicate, CanonVersion.V1);

    await Verify(Encoding.UTF8.GetString(canonical))
        .UseDirectory("Golden")
        .UseFileName("EvidencePredicate_v1");
}
```

---

## Decisions & Risks

### Decisions

| Decision | Rationale |
|----------|-----------|
| Use underscore prefix for version field | Sorts ahead of lowercase property names under ordinal ordering; the writer also emits it first explicitly |
| Version string format `stella:canon:v1` | Namespaced, unambiguous, extensible |
| Dual-mode verification initially | Backward compatibility for existing attestations |
| Version field in payload, not hash prefix | Keeps hash format consistent (sha256:...) |

### Risks

| Risk | Impact | Mitigation | Owner |
|------|--------|------------|-------|
| Existing attestations invalidated | Verification failures | Dual-mode verification; migration tooling | Attestor Guild |
| Performance overhead of version injection | Latency | Minimal (~100 bytes); benchmark | Platform Guild |
| Version field conflicts with user data | Hash collision | Reserved `_` prefix; schema validation | Attestor Guild |
| Future canonicalization changes | V2 needed | Design allows unlimited versions | Platform Guild |
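
The "version field conflicts with user data" risk could be checked before canonicalization. The sketch below shows one possible guard, assuming the payload is available as a JSON string. `ReservedFieldGuard` is a hypothetical helper, not the schema validation the table refers to.

```csharp
using System.Text.Json;

// Illustrative only: detects payloads that already carry the reserved marker field.
public static class ReservedFieldGuard
{
    public const string ReservedField = "_canonVersion";

    // True when the root object already defines the reserved field.
    public static bool ContainsReservedField(string json)
    {
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.ValueKind == JsonValueKind.Object
            && doc.RootElement.TryGetProperty(ReservedField, out _);
    }
}
```

A generator could reject or flag such payloads, since writing the marker on top of an existing `_canonVersion` property would produce a duplicate key in the canonical output.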

---

## Execution Log

| Date (UTC) | Update | Owner |
|------------|--------|-------|
| 2025-12-24 | Sprint created from Merkle-Hash REG product advisory gap analysis. | Project Mgmt |
| 2025-12-24 | Wave 0-2 completed: CanonVersion.cs, CanonJson versioned methods, IJsonCanonicalizer.CanonicalizeWithVersion(), ContentAddressedIdGenerator updated. | Platform Guild |
| 2025-12-24 | Wave 3 completed: 33 unit tests added covering versioned vs legacy, determinism, backward compatibility, golden files, edge cases. All tests pass. | QA Guild |
| 2025-12-24 | Wave 4 completed: Updated proof-chain-specification.md with versioning section, created canon-version-migration.md guide, created canon-json.md API reference. Sprint complete. | Docs Guild |