# Canonicalization Version Migration Guide

**Version**: 1.0
**Status**: Active
**Last Updated**: 2025-12-24

---

## Overview

This guide describes how content-addressed identifiers migrate from legacy (unversioned) canonicalization to versioned canonicalization (`stella:canon:v1`). Versioned canonicalization embeds an algorithm version marker in the canonical JSON before hashing, so a verifier can tell which algorithm produced a given hash and a future algorithm change does not silently invalidate existing identifiers.

## Why Versioning?

### Problem

Legacy content-addressed IDs (EvidenceID, ReasoningID, VEXVerdictID, ProofBundleID) were computed as:

```
hash = SHA256(canonicalize(payload))
```

If the canonicalization algorithm ever changed (bug fix, specification update, optimization), existing hashes would become invalid with no way to detect which algorithm produced them.

### Solution

Versioned content-addressed IDs embed a version marker:

```
hash = SHA256(canonicalize_with_version(payload, "stella:canon:v1"))
```

The canonical JSON now includes `_canonVersion` as the first field:

```json
{
  "_canonVersion": "stella:canon:v1",
  "sbomEntryId": "sha256:91f2ab3c:pkg:npm/lodash@4.17.21",
  "vulnerabilityId": "CVE-2021-23337"
}
```
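The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the StellaOps implementation: the function names are hypothetical, the `sha256:` prefix on the result is assumed from the hash strings shown elsewhere in this guide, and `json.dumps` with sorted keys only approximates RFC 8785 (it matches for simple ASCII string payloads like this one, but RFC 8785 also fixes number formatting and sorts keys by UTF-16 code units).

```python
import hashlib
import json

CANON_V1 = "stella:canon:v1"  # version identifier from this guide


def canonicalize(payload: dict) -> bytes:
    """Approximate canonical form: sorted keys, no insignificant whitespace, UTF-8."""
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def canonicalize_with_version(payload: dict, version: str = CANON_V1) -> bytes:
    """Add the _canonVersion marker to the payload, then canonicalize."""
    return canonicalize({"_canonVersion": version, **payload})


def content_hash(canonical: bytes) -> str:
    """SHA-256 over the canonical bytes, hex-encoded (prefix format is an assumption)."""
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


payload = {
    "vulnerabilityId": "CVE-2021-23337",
    "sbomEntryId": "sha256:91f2ab3c:pkg:npm/lodash@4.17.21",
}
legacy = canonicalize(payload)
versioned = canonicalize_with_version(payload)

print(versioned.decode("utf-8"))
# The marker is part of the hashed bytes, so the two IDs differ.
print(content_hash(legacy) != content_hash(versioned))  # True
```

Because the marker is hashed together with the payload, the same payload has a different ID under each scheme, which is why legacy attestations need re-attestation rather than relabeling.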

## Migration Phases

### Phase 1: Dual-Mode (Current)

**Timeline**: Now

**Behavior**:
- **Generation**: All new content-addressed IDs use versioned canonicalization (v1)
- **Verification**: Accept both legacy and v1 hashes
- **Detection**: `CanonVersion.IsVersioned()` distinguishes the two formats

**Impact**:
- Zero-downtime migration
- Existing attestations remain valid
- New attestations get version markers

### Phase 2: Deprecation Warning

**Timeline**: +6 months from Phase 1

**Behavior**:
- Log warnings when verifying legacy hashes
- Emit metrics for legacy hash encounters
- Continue accepting legacy hashes

**Operator Action**:
- Monitor the `canon_legacy_hash_verified_total` metric
- Plan re-attestation of critical assets

### Phase 3: Legacy Rejection

**Timeline**: +12 months from Phase 1

**Behavior**:
- Reject legacy hashes during verification
- Accept only v1 (or newer) hashes

**Operator Action**:
- Re-attest any remaining legacy attestations before cutoff
- Use `stella rehash` CLI command for bulk re-attestation
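The per-phase verification behavior above can be summarized as a small decision function. The sketch below is illustrative only: the `Phase` enum, `accept_hash_format`, and the in-memory `metrics` counter are hypothetical names standing in for whatever the verifier actually uses; only the metric name `canon_legacy_hash_verified_total` comes from this guide. Versioned hashes pass in every phase, and legacy hashes pass silently in Phase 1, pass with a warning and a metric in Phase 2, and are refused in Phase 3.

```python
import logging
from collections import Counter
from enum import Enum

log = logging.getLogger("canon.migration")
metrics = Counter()  # in-memory stand-in for a real metrics client


class Phase(Enum):
    DUAL_MODE = 1            # Phase 1
    DEPRECATION_WARNING = 2  # Phase 2
    LEGACY_REJECTION = 3     # Phase 3


def accept_hash_format(phase: Phase, is_versioned: bool) -> bool:
    """Return True if a hash in this format may proceed to verification."""
    if is_versioned:
        return True  # v1 (or newer) is accepted in every phase
    if phase is Phase.LEGACY_REJECTION:
        return False  # Phase 3: legacy hashes are rejected
    if phase is Phase.DEPRECATION_WARNING:
        # Phase 2: still accepted, but surfaced to operators
        log.warning("legacy (unversioned) hash encountered during verification")
        metrics["canon_legacy_hash_verified_total"] += 1
    return True  # Phases 1 and 2 accept legacy hashes


for phase in Phase:
    print(phase.name, accept_hash_format(phase, is_versioned=False))
```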

---

## Detection and Verification

### Detecting Versioned Hashes

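Versioned canonical JSON always starts with `{"_canonVersion":` because keys are sorted lexicographically and the underscore (U+005F) sorts before every lowercase ASCII letter. Uppercase letters and digits sort before the underscore, so this holds only while no other top-level key begins with a character below U+005F (the payload keys shown in this guide are all lowerCamelCase).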

```csharp
using StellaOps.Canonical.Json;

// Check if canonical JSON is versioned
byte[] canonicalJson = GetCanonicalPayload();
bool isVersioned = CanonVersion.IsVersioned(canonicalJson);

// Extract version if present
string? version = CanonVersion.ExtractVersion(canonicalJson);
if (version == CanonVersion.V1)
{
    // Use V1 verification algorithm
}
```
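To make the prefix rule concrete, here is a standalone Python sketch of the same two checks. It does not reproduce the `CanonVersion` API; `is_versioned` and `extract_version` are hypothetical helpers that assume the input is already canonical JSON bytes. The last lines show the sort order the prefix check relies on: a key starting with an uppercase letter sorts ahead of `_canonVersion`, while lowercase keys sort after it.

```python
import json
from typing import Optional

VERSION_PREFIX = b'{"_canonVersion":'


def is_versioned(canonical_json: bytes) -> bool:
    """Prefix check on canonical bytes; valid only if the marker sorts first."""
    return canonical_json.startswith(VERSION_PREFIX)


def extract_version(canonical_json: bytes) -> Optional[str]:
    """Return the marker value, or None for legacy (unversioned) payloads."""
    if not is_versioned(canonical_json):
        return None
    return json.loads(canonical_json).get("_canonVersion")


v1 = b'{"_canonVersion":"stella:canon:v1","sbomEntryId":"x"}'
legacy = b'{"sbomEntryId":"x"}'
print(is_versioned(v1), extract_version(v1))          # True stella:canon:v1
print(is_versioned(legacy), extract_version(legacy))  # False None

# Code-point order: 'Z' (0x5A) < '_' (0x5F) < 's' (0x73)
print(sorted(["sbomEntryId", "_canonVersion", "Zeta"]))
# ['Zeta', '_canonVersion', 'sbomEntryId']
```

A prefix check is cheap because it never parses the document, but it is only as reliable as the guarantee that `_canonVersion` is the first key.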

### Verifying Both Formats

During Phases 1 and 2, verifiers should accept both formats:

```csharp
public bool VerifyContentAddressedId(byte[] payload, string expectedHash)
{
    // Try versioned first (new format)
    if (CanonVersion.IsVersioned(payload))
    {
        var hash = CanonJson.HashVersioned(payload, CanonVersion.Current);
        return hash == expectedHash;
    }

    // Fall back to legacy (unversioned)
    var legacyHash = CanonJson.Hash(payload);
    return legacyHash == expectedHash;
}
```

---

## Re-Attestation Procedure

### When to Re-Attest

Re-attestation is required when:
- Preparing for Phase 3, when legacy hashes are rejected
- Migrating attestations between systems
- Archiving for long-term storage

### CLI Re-Attestation

```bash
# Re-attest a single attestation bundle
stella rehash --input attestation.json --output attestation-v1.json

# Bulk re-attest all attestations in a directory
stella rehash --input-dir /var/stella/attestations \
  --output-dir /var/stella/attestations-v1 \
  --version stella:canon:v1

# Dry-run to preview changes
stella rehash --input attestation.json --dry-run
```

### Database Migration

For PostgreSQL-stored attestations:

```sql
-- Find legacy attestations (those without version marker).
-- "_" is a single-character wildcard in LIKE, so it is escaped to match literally.
SELECT id, content_hash, created_at
FROM attestations
WHERE content NOT LIKE '{"\_canonVersion":%'
ORDER BY created_at;

-- Export for re-processing.
-- COPY ... TO writes the file on the database server; psql's \copy writes it client-side.
COPY (
  SELECT id, content
  FROM attestations
  WHERE content NOT LIKE '{"\_canonVersion":%'
) TO '/tmp/legacy_attestations.csv' WITH CSV HEADER;
```
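One possible way to re-process the export is to split it into one file per attestation so that a directory-based tool can pick the files up. The Python sketch below is an assumption-laden illustration: `split_export` and the `<id>.json` naming are hypothetical, it assumes `id` values are safe to use as file names, and it does not claim that `stella rehash --input-dir` expects this exact layout. It relies only on the CSV having the `id` and `content` columns selected above; a real export file should be opened with `newline=""` as the `csv` module recommends.

```python
import csv
import io
import tempfile
from pathlib import Path
from typing import List

VERSION_PREFIX = '{"_canonVersion":'


def split_export(csv_file, out_dir: Path) -> List[Path]:
    """Write each legacy row's content to <out_dir>/<id>.json and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for row in csv.DictReader(csv_file):
        if row["content"].startswith(VERSION_PREFIX):
            continue  # defensive: skip rows that are already versioned
        path = out_dir / f"{row['id']}.json"
        path.write_text(row["content"], encoding="utf-8")
        written.append(path)
    return written


# Small in-memory stand-in for /tmp/legacy_attestations.csv
sample = io.StringIO(
    'id,content\n'
    '1,"{""sbomEntryId"":""a"",""vulnerabilityId"":""CVE-2021-23337""}"\n'
    '2,"{""_canonVersion"":""stella:canon:v1"",""sbomEntryId"":""b""}"\n'
)
with tempfile.TemporaryDirectory() as tmp:
    paths = split_export(sample, Path(tmp))
    print([p.name for p in paths])  # ['1.json']
```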

---

## Troubleshooting

### Hash Mismatch Errors

**Symptom**: Verification fails with "hash mismatch" error.

**Diagnosis**:
1. Check if the stored hash was computed with legacy or versioned canonicalization
2. Check the current verification mode (Phase 1/2/3)

**Resolution**:
```bash
# Inspect the attestation format
stella inspect attestation.json --show-version

# Output:
# Canonicalization Version: stella:canon:v1
# Hash Algorithm: SHA-256
# Computed Hash: sha256:7b8c9d0e...
```

### Legacy Hash in Phase 3

**Symptom**: Verification rejected with "legacy hash not accepted" error.

**Resolution**:
1. Re-attest the content with versioned canonicalization
2. Update any references to the old hash

```bash
stella rehash --input old.json --output new.json
stella verify new.json  # Should pass
```

### Performance Considerations

Versioned canonicalization adds 34 bytes to each canonical payload (the `"_canonVersion":"stella:canon:v1",` member inserted after the opening brace). This has negligible impact on:
- Hash computation time (<1μs overhead)
- Storage size (<0.1% increase for payloads larger than about 34 KB; proportionally more for small payloads)
- Network transfer (compression eliminates overhead)
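The per-payload overhead can be checked by measuring it directly. The Python sketch below builds the versioned form of the example payload from this guide by hand (no canonicalization library involved) and reports the byte difference, then the relative overhead for a few payload sizes chosen purely for illustration; `relative_overhead` is a hypothetical helper.

```python
MARKER = '"_canonVersion":"stella:canon:v1",'

legacy = (
    '{"sbomEntryId":"sha256:91f2ab3c:pkg:npm/lodash@4.17.21",'
    '"vulnerabilityId":"CVE-2021-23337"}'
)
# The marker is inserted right after the opening brace.
versioned = "{" + MARKER + legacy[1:]

overhead = len(versioned.encode("utf-8")) - len(legacy.encode("utf-8"))
print(overhead)  # 34


def relative_overhead(payload_bytes: int) -> float:
    """Marker size as a fraction of the original payload size."""
    return overhead / payload_bytes


# Illustrative payload sizes in bytes; not measurements from a real deployment.
for size in (1_000, 34_000, 1_000_000):
    print(size, f"{relative_overhead(size):.4%}")
```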

---

## Version History

| Version | Identifier | Status | Notes |
|---------|------------|--------|-------|
| V1 | `stella:canon:v1` | **Current** | RFC 8785 JSON canonicalization |
| Legacy | (none) | Deprecated | Pre-versioning; no version marker |

## Related Documentation

- [Proof Chain Specification](../modules/attestor/proof-chain-specification.md)
- [Content-Addressed Identifier System](../modules/attestor/proof-chain-specification.md#content-addressed-identifier-system)
- [CanonJson API Reference](../api/canon-json.md)