Determinism Specification
Status: Living document
Version: 1.0
Created: 2025-12-26
Owners: Policy Guild, Platform Guild
Related: CONSOLIDATED - Deterministic Evidence and Verdict Architecture.md
Overview
This specification defines the determinism guarantees for StellaOps verdict computation, including digest algorithms, canonicalization rules, and migration strategies. All services that produce or verify verdicts MUST comply with this specification.
1. Digest Algorithms
1.1 VerdictId
Purpose: Uniquely identifies a verdict computation result.
Algorithm:
VerdictId = SHA256(CanonicalJson(verdict_payload))
Input Structure:
{
  "_canonVersion": "stella:canon:v1",
  "evidence_refs": ["sha256:..."],
  "explanations": [...],
  "risk_score": 42,
  "status": "pass",
  "unknowns_count": 0
}
Implementation: StellaOps.Attestor.ProofChain.Identifiers.VerdictIdGenerator
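The formula above can be sketched as follows. This is a minimal illustration and not the actual VerdictIdGenerator: the class name VerdictIdSketch, the method name, and the "sha256:" prefix on the result are assumptions (the prefix mirrors the evidence_refs values shown in the input structure). The input is expected to be canonical already; canonicalization itself is covered in section 2.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; not the real VerdictIdGenerator.
public static class VerdictIdSketch
{
    // Hashes JSON text that is ALREADY canonical (sorted keys, no whitespace).
    public static string FromCanonicalJson(string canonicalJson)
    {
        // Encoding.UTF8.GetBytes never emits a BOM, matching the "UTF-8 without BOM" rule.
        byte[] utf8 = Encoding.UTF8.GetBytes(canonicalJson);
        byte[] digest = SHA256.HashData(utf8);

        // Assumption: IDs are rendered as "sha256:" followed by lowercase hex.
        return "sha256:" + Convert.ToHexString(digest).ToLowerInvariant();
    }
}
```

Because the digest is taken over exact bytes, two payloads that differ only in whitespace or key order produce different IDs unless both are canonicalized first.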
1.2 EvidenceId
Purpose: Uniquely identifies an evidence artifact (SBOM, VEX, graph, etc.).
Algorithm:
EvidenceId = SHA256(raw_bytes)
Notes:
- For JSON artifacts, use JCS-canonical bytes
- For binary artifacts, use raw bytes
- For multi-file bundles, use Merkle root
Implementation: StellaOps.Attestor.ProofChain.Identifiers.EvidenceIdGenerator
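This specification does not define how the Merkle root for multi-file bundles is built, so the following is only one plausible construction, shown to make the idea concrete. Every detail is an assumption: the class name MerkleSketch, leaves ordered by ordinal path, leaf hash = SHA256(file bytes), parent hash = SHA256(left || right), and an unpaired node promoted unchanged. Consult EvidenceIdGenerator for the actual rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical construction; the real bundle hashing rules may differ.
public static class MerkleSketch
{
    public static byte[] Root(IReadOnlyDictionary<string, byte[]> files)
    {
        if (files.Count == 0)
            throw new ArgumentException("Bundle is empty.", nameof(files));

        // Order leaves by ordinal path so dictionary enumeration order never matters.
        List<byte[]> level = files
            .OrderBy(f => f.Key, StringComparer.Ordinal)
            .Select(f => SHA256.HashData(f.Value))
            .ToList();

        while (level.Count > 1)
        {
            var next = new List<byte[]>();
            for (int i = 0; i < level.Count; i += 2)
            {
                if (i + 1 == level.Count)
                {
                    next.Add(level[i]); // unpaired node is promoted unchanged
                    continue;
                }

                // Parent = SHA256(left || right); each digest is 32 bytes.
                byte[] pair = new byte[64];
                level[i].CopyTo(pair, 0);
                level[i + 1].CopyTo(pair, 32);
                next.Add(SHA256.HashData(pair));
            }
            level = next;
        }

        return level[0];
    }
}
```

In this sketch the paths only determine leaf order and are not hashed themselves, so a rename that preserves the ordering leaves the root unchanged. A production scheme would likely bind paths into the leaves.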
1.3 GraphRevisionId
Purpose: Uniquely identifies a call graph or reachability graph snapshot.
Algorithm:
GraphRevisionId = SHA256(CanonicalJson({
  nodes: SortedBy(nodes, n => n.id),
  edges: SortedBy(edges, e => (e.source, e.target, e.kind))
}))
Sorting Rules:
- Nodes: lexicographic by id (Ordinal)
- Edges: tuple sort by (source, target, kind)
Implementation: StellaOps.Scanner.CallGraph.Identifiers.GraphRevisionIdGenerator
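The sorting rules can be illustrated with LINQ. The Node and Edge records and the GraphOrderingSketch class are hypothetical stand-ins for the real graph types. The sketch also assumes that each component of the edge tuple is compared ordinally, which the rules above state explicitly only for node IDs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration only.
public sealed record Node(string Id);
public sealed record Edge(string Source, string Target, string Kind);

public static class GraphOrderingSketch
{
    // Ordinal comparison is culture-independent, so the order is the same on every host.
    public static IReadOnlyList<Node> SortNodes(IEnumerable<Node> nodes) =>
        nodes.OrderBy(n => n.Id, StringComparer.Ordinal).ToList();

    // Tuple sort: source first, then target, then kind.
    public static IReadOnlyList<Edge> SortEdges(IEnumerable<Edge> edges) =>
        edges.OrderBy(e => e.Source, StringComparer.Ordinal)
             .ThenBy(e => e.Target, StringComparer.Ordinal)
             .ThenBy(e => e.Kind, StringComparer.Ordinal)
             .ToList();
}
```

Note that ordinal order places uppercase ASCII letters before lowercase ones, so "B" sorts before "a". A culture-aware comparer would order these differently and could vary by machine locale.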
1.4 ManifestId
Purpose: Uniquely identifies a scan manifest (all inputs for an evaluation).
Algorithm:
ManifestId = SHA256(CanonicalJson(manifest_payload))
Input Structure:
{
  "_canonVersion": "stella:canon:v1",
  "engine_version": "1.0.0",
  "feeds_snapshot_sha256": "sha256:...",
  "options_hash": "sha256:...",
  "policy_bundle_sha256": "sha256:...",
  "policy_semver": "2025.12",
  "reach_subgraph_sha256": "sha256:...",
  "sbom_sha256": "sha256:...",
  "vex_set_sha256": ["sha256:..."]
}
Implementation: StellaOps.Replay.Core.ManifestIdGenerator
1.5 PolicyBundleId
Purpose: Uniquely identifies a compiled policy bundle.
Algorithm:
PolicyBundleId = SHA256(CanonicalJson({
  rules: SortedBy(rules, r => r.id),
  version: semver,
  lattice_config: {...}
}))
Implementation: StellaOps.Policy.Engine.PolicyBundleIdGenerator
2. Canonicalization Rules
2.1 JSON Canonicalization (JCS - RFC 8785)
All JSON artifacts MUST be canonicalized before hashing or signing.
Rules:
- Object keys sorted by UTF-16 code unit (Ordinal comparison), applied recursively to nested objects
- No whitespace between tokens
- No trailing commas
- UTF-8 encoding without BOM
- Numbers: IEEE 754 double-precision, no unnecessary trailing zeros, no exponent for integers < 10^21 (10^21 itself serializes as 1e+21)
Example:
// Before
{ "b": 1, "a": 2, "c": { "z": true, "y": false } }
// After (canonical)
{"a":2,"b":1,"c":{"y":false,"z":true}}
Implementation: StellaOps.Canonical.Json.Rfc8785JsonCanonicalizer
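To make the key-ordering and whitespace rules concrete, here is a deliberately partial sketch built on System.Text.Json. It is not the Rfc8785JsonCanonicalizer and is not RFC 8785 compliant: it only sorts keys and strips whitespace, while number formatting and string escaping are left to System.Text.Json defaults, which differ from JCS in several cases. The class name CanonSketch is hypothetical.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.Json;
using System.Text.Json.Nodes;

// Partial illustration only: key ordering and whitespace, nothing else.
public static class CanonSketch
{
    public static string Canonicalize(string json)
    {
        var output = new StringBuilder();
        Write(JsonNode.Parse(json), output);
        return output.ToString();
    }

    private static void Write(JsonNode? node, StringBuilder output)
    {
        switch (node)
        {
            case null:
                output.Append("null");
                break;

            case JsonObject obj:
            {
                output.Append('{');
                bool first = true;
                // Ordinal comparison orders keys by UTF-16 code unit.
                foreach (var member in obj.OrderBy(m => m.Key, StringComparer.Ordinal))
                {
                    if (!first) output.Append(',');
                    first = false;
                    output.Append(JsonSerializer.Serialize(member.Key));
                    output.Append(':');
                    Write(member.Value, output);
                }
                output.Append('}');
                break;
            }

            case JsonArray array:
            {
                // Array order is significant and is preserved as-is.
                output.Append('[');
                for (int i = 0; i < array.Count; i++)
                {
                    if (i > 0) output.Append(',');
                    Write(array[i], output);
                }
                output.Append(']');
                break;
            }

            default:
                // Leaf value: serialized with System.Text.Json defaults (NOT JCS rules).
                output.Append(node.ToJsonString());
                break;
        }
    }
}
```

Applied to the "Before" document in the example above, this sketch yields the "After" form. Payloads containing non-ASCII strings or non-integer numbers need the full canonicalizer.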
2.2 String Normalization (Unicode NFC)
All string values MUST be normalized to Unicode NFC before canonicalization.
Why: Different Unicode representations of the same visual character produce different hashes.
Example:
// Before: é as e + combining acute (U+0065 U+0301)
// After NFC: é as single codepoint (U+00E9)
Implementation: StellaOps.Resolver.NfcStringNormalizer
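The effect of NFC normalization on hash input can be shown with the built-in string.Normalize API. The class NfcSketch and its method names are hypothetical and are not the NfcStringNormalizer API.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not the real NfcStringNormalizer.
public static class NfcSketch
{
    // Returns the NFC form, skipping the work when the input is already normalized.
    public static string ToNfc(string value) =>
        value.IsNormalized(NormalizationForm.FormC)
            ? value
            : value.Normalize(NormalizationForm.FormC);

    // Bytes that would be fed to the hash: UTF-8 of the NFC form.
    public static byte[] ToHashInput(string value) =>
        Encoding.UTF8.GetBytes(ToNfc(value));
}
```

With this helper, "e\u0301" (two code points) and "\u00E9" (one code point) both yield the same two UTF-8 bytes, so they hash identically. Without normalization the first form encodes to three bytes and the digests diverge.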
2.3 Version Markers
All canonical JSON MUST include a version marker for migration safety:
{
  "_canonVersion": "stella:canon:v1",
  ...
}
Current Version: stella:canon:v1
Migration Path: When canonicalization rules change:
- Introduce new version marker (e.g., stella:canon:v2)
- Support both versions during transition period
- Re-hash legacy artifacts once, store old_hash → new_hash mapping
- Deprecate old version after migration window
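A verifier that supports a transition period first has to read the marker and decide whether it can handle that version. The following is a minimal sketch of that check; the class CanonVersionSketch and its members are hypothetical, and the set of supported versions is assumed to contain only stella:canon:v1, the current version.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class CanonVersionSketch
{
    public const string V1 = "stella:canon:v1";

    // Reads the _canonVersion marker from the top level of a JSON document.
    public static string ReadVersion(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        if (doc.RootElement.ValueKind == JsonValueKind.Object &&
            doc.RootElement.TryGetProperty("_canonVersion", out JsonElement marker) &&
            marker.ValueKind == JsonValueKind.String)
        {
            return marker.GetString()!;
        }

        throw new InvalidOperationException("Canonical JSON is missing the _canonVersion marker.");
    }

    // During a migration window this would accept both the old and the new marker.
    public static bool IsSupported(string version) =>
        string.Equals(version, V1, StringComparison.Ordinal);
}
```

Rejecting unknown markers outright, rather than falling back to v1 rules, avoids silently producing a hash under the wrong canonicalization.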
3. Determinism Guards
3.1 Forbidden Operations
The following operations are FORBIDDEN during verdict evaluation:
| Operation | Reason | Alternative |
|---|---|---|
| DateTime.Now / DateTimeOffset.Now | Non-deterministic | Use TimeProvider from manifest |
| Random / Guid.NewGuid() | Non-deterministic | Use content-based IDs |
| Dictionary<K,V> iteration | Unstable order | Use SortedDictionary or explicit ordering |
| HashSet<T> iteration | Unstable order | Use SortedSet or explicit ordering |
| Parallel.ForEach (unordered) | Race conditions | Use ordered parallel with merge |
| HTTP calls | External dependency | Use pre-fetched snapshots |
| File system reads | External dependency | Use CAS-cached blobs |
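Two of the alternatives in the table can be sketched briefly. The first type pins the clock by subclassing the .NET 8 TimeProvider; where the pinned instant comes from (for example, a timestamp carried alongside the manifest) is an assumption, since the manifest structure in section 1.4 shows no time field. The second shows explicit ordering when iterating a dictionary. Both type names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// A clock that always returns the instant it was constructed with.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _now;

    public FixedTimeProvider(DateTimeOffset now) => _now = now;

    public override DateTimeOffset GetUtcNow() => _now;
}

public static class DeterministicIterationSketch
{
    // Renders entries in ordinal key order regardless of how the dictionary enumerates.
    public static string Render(IReadOnlyDictionary<string, int> scores) =>
        string.Join(";", scores
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => kv.Key + "=" + kv.Value.ToString(CultureInfo.InvariantCulture)));
}
```

Code under evaluation would take a TimeProvider parameter and call GetUtcNow() on it instead of reading DateTime.Now, so a replay can supply the same instant as the original run. The invariant culture in Render keeps number formatting independent of the host locale.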
3.2 Runtime Enforcement
The DeterminismGuard class provides runtime enforcement:
using StellaOps.Policy.Engine.DeterminismGuard;

// Wraps evaluation in a determinism context
var result = await DeterminismGuard.ExecuteAsync(async () =>
{
    // Any forbidden operation throws DeterminismViolationException
    return await evaluator.EvaluateAsync(manifest);
});
Implementation: StellaOps.Policy.Engine.DeterminismGuard.DeterminismGuard
3.3 Compile-Time Enforcement (Planned)
A Roslyn analyzer will flag determinism violations at compile time:
// This will produce a compiler warning/error
public Verdict Evaluate(Manifest m)
{
    var now = DateTime.Now; // STELLA001: Forbidden in deterministic context
    ...
}
Status: Planned for Q1 2026 (SPRINT_20251226_007 DET-GAP-18)
4. Replay Contract
4.1 Requirements
For deterministic replay, the following MUST be pinned and recorded:
| Input | Storage | Notes |
|---|---|---|
| Feed snapshots | CAS by hash | CVE, VEX advisories |
| Scanner version | Manifest | Exact semver |
| Rule packs | CAS by hash | Policy rules |
| Lattice/policy version | Manifest | Semver |
| SBOM generator version | Manifest | For generator-specific quirks |
| Reachability engine settings | Manifest | Language analyzers, depth limits |
| Merge semantics ID | Manifest | Lattice configuration |
4.2 Replay Verification
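// Load the original manifest
var manifest = await manifestStore.GetAsync(manifestId);

// Replay the evaluation from the pinned inputs
var replayVerdict = await engine.ReplayAsync(manifest);

// Verify determinism. originalVerdict is the verdict recorded by the original
// evaluation; loading it is outside the scope of this snippet.
// The hashes are assumed to be strings, so != compares values, not references.
var originalHash = CanonJson.Hash(originalVerdict);
var replayHash = CanonJson.Hash(replayVerdict);
if (originalHash != replayHash)
{
    throw new DeterminismViolationException(
        $"Replay produced different verdict: {originalHash} vs {replayHash}");
}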
4.3 Replay API
GET /replay?manifest_sha=sha256:...
Response:
{
  "verdict": {...},
  "replay_manifest_sha": "sha256:...",
  "verdict_sha": "sha256:...",
  "determinism_verified": true
}
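A client consuming this endpoint might build the request path and check the response as sketched below. The class ReplayResponseSketch and its methods are hypothetical; only the path, the query parameter, and the response field names come from this specification. No HTTP call is made here, so the transport (base address, authentication) is left out entirely.

```csharp
using System;
using System.Text.Json;

// Hypothetical client-side helper for illustration only.
public static class ReplayResponseSketch
{
    // Builds the relative request path; the ':' in the digest is percent-encoded.
    public static string BuildPath(string manifestSha) =>
        "/replay?manifest_sha=" + Uri.EscapeDataString(manifestSha);

    // True only if the service reports success AND the verdict hash is the one expected.
    public static bool IsVerified(string responseJson, string expectedVerdictSha)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        JsonElement root = doc.RootElement;

        return root.GetProperty("determinism_verified").GetBoolean()
            && string.Equals(
                root.GetProperty("verdict_sha").GetString(),
                expectedVerdictSha,
                StringComparison.Ordinal);
    }
}
```

Comparing verdict_sha against a locally known value, instead of trusting determinism_verified alone, lets the caller detect a replay that is self-consistent but differs from the verdict it originally received.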
5. Testing Requirements
5.1 Golden Tests
Every service that produces verdicts MUST maintain golden test fixtures:
tests/fixtures/golden/
├── manifest-001.json
├── verdict-001.json (expected)
├── manifest-002.json
├── verdict-002.json (expected)
└── ...
Test Pattern:
[Theory]
[MemberData(nameof(GoldenTestCases))]
public async Task Verdict_MatchesGolden(string manifestPath, string expectedPath)
{
    var manifest = await LoadManifest(manifestPath);
    var actual = await engine.EvaluateAsync(manifest);
    var expected = await File.ReadAllBytesAsync(expectedPath);

    // Golden files store canonical bytes, so the comparison is byte-for-byte
    Assert.Equal(expected, CanonJson.Canonicalize(actual));
}
5.2 Chaos Tests
Chaos tests verify determinism under varying conditions:
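[Fact]
public async Task Verdict_IsDeterministic_UnderChaos()
{
    var manifest = CreateTestManifest();
    var baseline = await engine.EvaluateAsync(manifest);

    // Remember process-wide settings so other tests are not affected
    ThreadPool.GetMinThreads(out int workerThreads, out int ioThreads);
    try
    {
        // Vary conditions
        for (int i = 0; i < 100; i++)
        {
            Environment.SetEnvironmentVariable("RANDOM_SEED", i.ToString());
            ThreadPool.SetMinThreads(i % 16 + 1, i % 16 + 1);

            var verdict = await engine.EvaluateAsync(manifest);
            Assert.Equal(
                CanonJson.Hash(baseline),
                CanonJson.Hash(verdict));
        }
    }
    finally
    {
        // Clear the variable and restore the thread pool configuration
        Environment.SetEnvironmentVariable("RANDOM_SEED", null);
        ThreadPool.SetMinThreads(workerThreads, ioThreads);
    }
}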
5.3 Cross-Platform Tests
Verdicts MUST be identical across:
- Windows / Linux / macOS
- x64 / ARM64
- .NET versions (within the same major version)
6. Troubleshooting Guide
6.1 "Why are my verdicts different?"
Symptom: Same inputs produce different verdict hashes.
Checklist:
- ✅ Are all inputs content-addressed? Check manifest hashes.
- ✅ Is canonicalization version the same? Check _canonVersion.
- ✅ Is engine version the same? Check engine_version in manifest.
- ✅ Are feeds from the same snapshot? Check feeds_snapshot_sha256.
- ✅ Is policy bundle the same? Check policy_bundle_sha256.
Debug Logging: Enable pre-canonical hash logging to compare inputs:
{
  "Logging": {
    "DeterminismDebug": {
      "LogPreCanonicalHashes": true
    }
  }
}
6.2 Common Causes
| Symptom | Likely Cause | Fix |
|---|---|---|
| Different verdict hash, same risk score | Explanation order | Sort explanations by template + params |
| Different verdict hash, same findings | Evidence ref order | Sort evidence_refs lexicographically |
| Different graph hash | Node iteration order | Use SortedDictionary for nodes |
| Different VEX merge | Feed freshness | Pin feeds to exact snapshot |
6.3 Reporting Issues
When reporting determinism issues, include:
- Both manifest JSONs (canonical form)
- Both verdict JSONs (canonical form)
- Engine versions
- Platform details (OS, architecture, .NET version)
- Pre-canonical hash logs (if available)
7. Migration History
v1 (2025-12-26)
- Initial specification
- RFC 8785 JCS + Unicode NFC
- Version marker: stella:canon:v1
Appendix A: Reference Implementations
| Component | Location |
|---|---|
| JCS Canonicalizer | src/__Libraries/StellaOps.Canonical.Json/ |
| NFC Normalizer | src/__Libraries/StellaOps.Resolver/NfcStringNormalizer.cs |
| Determinism Guard | src/Policy/__Libraries/StellaOps.Policy.Engine/DeterminismGuard/ |
| Content-Addressed IDs | src/Attestor/__Libraries/StellaOps.Attestor.ProofChain/Identifiers/ |
| Replay Core | src/__Libraries/StellaOps.Replay.Core/ |
| Golden Test Base | src/__Libraries/StellaOps.TestKit/Determinism/ |
Appendix B: Compliance Checklist
Services producing verdicts MUST complete this checklist:
- All JSON outputs use JCS canonicalization
- All strings are NFC-normalized before hashing
- Version marker included in all canonical JSON
- Determinism guard enabled for evaluation code
- Golden tests cover all verdict paths
- Chaos tests verify multi-threaded determinism
- Cross-platform tests pass on CI
- Replay API returns identical verdicts
- Documentation references this specification