SBOM Determinism Guide
Sprint: SPRINT_20260118_025_ReleaseOrchestrator_sbom_release_association
Task: TASK-025-005
Status: Living Document
This document consolidates all determinism requirements for Stella Ops SBOMs. Deterministic SBOMs are critical for reproducible builds, verifiable release gates, and trust chain integrity.
1. Why Determinism Matters
1.1 Reproducibility
Deterministic SBOMs ensure that scanning the same artifact multiple times produces identical output. This is essential for:
- CI/CD Reliability: Re-running a pipeline should produce the same SBOM hash
- Audit Trails: Evidence submitted to compliance frameworks must be reproducible
- Caching: Content-addressed storage can deduplicate identical SBOMs
- Debugging: Engineers can reproduce exact SBOM state from artifact digest
1.2 Verifiable Gates
Policy gates rely on SBOM hashes for trust verification:
Artifact Digest → SBOM Generation → Canonical Hash → DSSE Signature → Policy Evaluation
If SBOM generation is non-deterministic, the same artifact could produce different hashes, breaking:
- Signature verification (hash mismatch)
- Gate decisions (different vulnerability sets)
- Attestation chains (broken proof lineage)
1.3 Trust Chaining
Evidence chains require stable identifiers. A release component's SbomDigest must match the SBOM retrieved later for verification. Non-determinism breaks this chain:
Release Finalization: SbomDigest = sha256:abc123...
Later Verification: sha256(regenerated-sbom) = sha256:xyz789... ← BROKEN
2. Canonicalization Rules
Stella Ops uses RFC 8785 JSON Canonicalization Scheme (JCS) for deterministic JSON serialization.
2.1 Core JCS Rules
- No Whitespace: Output has no formatting, newlines, or indentation
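- Sorted Keys: Object keys are sorted by their UTF-16 code units, as RFC 8785 specifies (locale-independent; identical to code point order except for characters outside the Basic Multilingual Plane)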
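- Normalized Numbers: Serialized in the ECMAScript shortest form that RFC 8785 requires: no leading zeros, no trailing decimal zeros, and, where an exponent is needed, a lowercase e with an explicit sign (e.g., 1e+30)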
- UTF-8 Encoding: All strings encoded as UTF-8 without BOM
- No Duplicate Keys: Object keys must be unique
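The rules above can be sketched in a few lines of Python. This is an illustrative approximation, not the StellaOps.Canonical.Json implementation: the function names are hypothetical, and the sketch deliberately rejects floating-point numbers because RFC 8785 number formatting follows ECMAScript rules that Python's json module does not reproduce.

```python
import hashlib
import json


def _utf16_key(name: str) -> bytes:
    # RFC 8785 orders property names by their UTF-16 code units.
    return name.encode("utf-16-be")


def _reject_duplicates(pairs):
    # Called by json.loads for every object; enforces unique keys.
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate object key")
    return dict(pairs)


def _prepare(value):
    """Return a copy of value with every object's keys in JCS order."""
    if isinstance(value, dict):
        return {k: _prepare(value[k]) for k in sorted(value, key=_utf16_key)}
    if isinstance(value, list):
        return [_prepare(item) for item in value]  # array order is preserved
    if isinstance(value, float):
        # ECMAScript number formatting is out of scope for this sketch.
        raise TypeError("floats are not supported by this sketch")
    return value


def canonicalize(raw: bytes) -> bytes:
    """Canonicalize a UTF-8 JSON document that contains no floats."""
    parsed = json.loads(raw.decode("utf-8"), object_pairs_hook=_reject_duplicates)
    text = json.dumps(_prepare(parsed), separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")  # UTF-8, no BOM


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()
```

Two inputs that differ only in whitespace or key order canonicalize to the same bytes, and therefore to the same digest.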
2.2 Implementation
// Using StellaOps.Canonical.Json
using StellaOps.Canonical.Json;
// Canonicalize raw JSON bytes
byte[] canonical = CanonJson.CanonicalizeParsedJson(jsonBytes);
// Compute SHA-256 of canonical form
string digest = CanonJson.Sha256Hex(canonical);
2.3 SBOM-Specific Ordering
Beyond JCS, Stella Ops applies additional ordering for SBOM elements:
| Element | Ordering Strategy |
|---|---|
| components | Sorted by bom-ref (Ordinal) |
| dependencies | Sorted by ref (Ordinal) |
| hashes | Sorted by alg (Ordinal) |
| licenses | Sorted by license ID (Ordinal) |
| dependsOn | Sorted lexicographically |
This ensures component order doesn't affect the canonical hash.
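As a rough illustration of the ordering table, the following Python sketch sorts a CycloneDX-shaped dictionary before canonicalization. The function names are hypothetical, and two points are assumptions: ordinal comparison is modelled as UTF-16 code unit order (the .NET meaning of the term), and dependsOn is sorted the same way.

```python
import copy


def ordinal(text: str) -> bytes:
    # Ordinal comparison, modelled as UTF-16 code unit order.
    return text.encode("utf-16-be")


def license_id(entry: dict) -> str:
    # CycloneDX wraps named licenses as {"license": {"id": ...}};
    # entries without an id (e.g. expressions) sort first in this sketch.
    return entry.get("license", {}).get("id", "")


def order_sbom(bom: dict) -> dict:
    """Return a copy of bom with the list elements in a stable order."""
    out = copy.deepcopy(bom)
    for component in out.get("components", []):
        component.get("hashes", []).sort(key=lambda h: ordinal(h["alg"]))
        component.get("licenses", []).sort(key=lambda e: ordinal(license_id(e)))
    out.get("components", []).sort(key=lambda c: ordinal(c["bom-ref"]))
    for dependency in out.get("dependencies", []):
        dependency.get("dependsOn", []).sort(key=ordinal)
    out.get("dependencies", []).sort(key=lambda d: ordinal(d["ref"]))
    return out
```

Because the function returns a sorted copy, a scanner that discovers components in a different order on each run still yields the same structure.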
3. Identity Field Derivation
3.1 serialNumber (CycloneDX)
Rule: Use urn:sha256:<artifact-digest> format for deterministic identification.
{
"bomFormat": "CycloneDX",
"specVersion": "1.7",
"serialNumber": "urn:sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
Benefits:
- Directly ties SBOM identity to the artifact it describes
- Enables verification: serialNumber == urn:sha256:$(sha256sum artifact)
- Content-addressed: identical artifacts produce identical serialNumbers
Fallback: If artifact digest is unavailable, UUIDv5 derived from sorted components is used for backwards compatibility. This produces a warning during validation.
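The rule and its fallback might look like the Python sketch below. The UUIDv5 namespace and the way component references are joined into a name are assumptions made for illustration; this guide does not specify them, and the actual writer may derive the fallback differently.

```python
import re
import uuid

_SHA256_HEX = re.compile(r"^[0-9a-f]{64}$")

# Assumption: the real namespace UUID is not documented here.
FALLBACK_NAMESPACE = uuid.NAMESPACE_URL


def serial_number(artifact_digest, component_refs):
    """Derive a deterministic serialNumber for a CycloneDX document."""
    if artifact_digest:
        digest = artifact_digest.lower()
        if digest.startswith("sha256:"):
            digest = digest[len("sha256:"):]
        if not _SHA256_HEX.match(digest):
            raise ValueError("artifact digest must be 64 hex characters")
        return f"urn:sha256:{digest}"
    # Fallback: hash the sorted component references so that discovery
    # order does not influence the resulting UUID.
    name = "\n".join(sorted(component_refs))
    return f"urn:uuid:{uuid.uuid5(FALLBACK_NAMESPACE, name)}"
```

A caller would pass the artifact digest whenever it is known and rely on the UUID branch only for legacy inputs, which is the case that triggers the validation warning.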
3.2 bom-ref
Rule: Use deterministic derivation based on purl or component identity.
bom-ref = sha256(purl || name || version)[:12] // truncated hash
Or use the package URL directly if available:
{
"bom-ref": "pkg:npm/lodash@4.17.21",
"name": "lodash",
"version": "4.17.21",
"purl": "pkg:npm/lodash@4.17.21"
}
Anti-pattern: Random UUIDs or incrementing counters as bom-ref.
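A minimal Python sketch of both derivations follows. It assumes that || in the formula above means concatenation, and it inserts a NUL separator between fields so that different field splits cannot collide; the separator and the function name are illustrative choices, not documented behavior.

```python
import hashlib


def derive_bom_ref(name: str, version: str, purl: str = "") -> str:
    """Return a stable bom-ref for a component."""
    if purl:
        # Preferred: the package URL is already a stable identifier.
        return purl
    # Otherwise hash the identity fields and keep the first 12 hex digits.
    material = "\x00".join([purl, name, version]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:12]
```

The same inputs always give the same reference, so dependency edges that point at a bom-ref stay valid across regenerations.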
3.3 SPDX Document Namespace
Rule: Use artifact-derived namespace for SPDX documents.
DocumentNamespace: https://stella-ops.org/spdx/sha256/<artifact-digest>
4. Ephemeral Data Policy
Certain SBOM fields are inherently non-deterministic and should be handled carefully.
4.1 Prunable Fields
These fields should be omitted or normalized before hashing:
| Field | Treatment |
|---|---|
| metadata.timestamp | Use fixed epoch or artifact build time |
| metadata.tools[].version | Optional: pin tool versions |
| File paths (absolute) | Convert to relative paths |
| Environment variables | Exclude from SBOM |
4.2 Timestamp Strategy
Option 1: Fixed Epoch (Recommended)
"timestamp": "1970-01-01T00:00:00Z"
Option 2: Artifact Build Time
"timestamp": "<artifact-created-at>"
Option 3: Omit Field
// No timestamp field - allowed by CycloneDX
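The three options can be applied as a normalization pass before hashing. The Python sketch below is illustrative only; the function name and the strategy labels are hypothetical.

```python
import copy

FIXED_EPOCH = "1970-01-01T00:00:00Z"


def normalize_timestamp(bom: dict, strategy: str = "fixed-epoch", build_time: str = "") -> dict:
    """Return a copy of bom whose metadata.timestamp is deterministic."""
    out = copy.deepcopy(bom)
    if strategy == "fixed-epoch":
        out.setdefault("metadata", {})["timestamp"] = FIXED_EPOCH
    elif strategy == "build-time":
        if not build_time:
            raise ValueError("build_time is required for the build-time strategy")
        out.setdefault("metadata", {})["timestamp"] = build_time
    elif strategy == "omit":
        # Remove the field if present; leave the document otherwise untouched.
        out.get("metadata", {}).pop("timestamp", None)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return out
```

Whichever strategy is chosen, it has to be applied identically at generation and at verification time, otherwise the two digests will not match.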
4.3 Tool Metadata
Tool information aids debugging but affects hashes:
"tools": [
{
"vendor": "Stella Ops",
"name": "stella-scanner",
"version": "1.0.0" // Pin this version
}
]
Recommendation: Pin tool versions in CI configuration to ensure reproducibility.
5. Verification Workflow
5.1 CLI Commands
Verify Canonical Form:
stella sbom verify input.json --canonical
# Exit 0: Input is canonical
# Exit 1: Input is not canonical (outputs SHA-256 of canonical form)
Canonicalize and Output:
stella sbom verify input.json --canonical --output bom.canonical.json
# Writes: bom.canonical.json (canonical SBOM)
# Writes: bom.canonical.json.sha256 (digest sidecar)
Verbose Output:
stella sbom verify input.json --canonical --verbose
# SHA-256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
# Canonical: yes
# Input size: 15234 bytes
# Canonical size: 12456 bytes
5.2 CI Gate Integration
# .gitea/workflows/sbom-gate.yaml
steps:
  - name: Generate SBOM
    run: stella sbom generate --artifact ${{ artifact }} --output bom.json
  - name: Verify Canonical
    run: |
      # Test the command directly; under bash -e a separate $? check
      # after a failing command would never be reached.
      if ! stella sbom verify bom.json --canonical --output bom.canonical.json; then
        echo "SBOM is not in canonical form"
        exit 1
      fi
  - name: Sign SBOM
    run: stella sbom sign bom.canonical.json --key ${{ signing_key }}
  - name: Store Digest
    run: |
      DIGEST=$(cat bom.canonical.json.sha256)
      echo "SBOM_DIGEST=$DIGEST" >> "$GITHUB_ENV"
5.3 Release Finalization
At release finalization, the SBOM digest is captured:
1. Lookup SBOM for artifact: ISbomService.GetByDigestAsync(artifact.Digest)
2. Extract canonical digest: sbom.SbomSha256
3. Store on ReleaseComponent: component.SbomDigest = sbom.SbomSha256
4. Include in release manifest hash computation
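The four steps can be modelled in Python as follows. Everything here is a stand-in: a dictionary replaces the SBOM service lookup, the ReleaseComponent class mirrors only the fields named above, and the manifest hash layout (sorted, tab-separated lines) is an assumption, since this guide does not define it.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ReleaseComponent:
    name: str
    artifact_digest: str
    sbom_digest: Optional[str] = None


def finalize(components: List[ReleaseComponent], sbom_index: Dict[str, str]) -> str:
    """Attach SBOM digests to components and return a manifest hash.

    sbom_index maps artifact digest -> canonical SBOM SHA-256.
    """
    for component in components:
        # Steps 1-2: look up the SBOM for the artifact and take its digest.
        sbom_sha256 = sbom_index.get(component.artifact_digest)
        if sbom_sha256 is None:
            raise LookupError(f"no SBOM for {component.artifact_digest}")
        # Step 3: store the digest on the release component.
        component.sbom_digest = sbom_sha256
    # Step 4: fold the digests into the manifest hash (hypothetical layout).
    lines = sorted(
        f"{c.name}\t{c.artifact_digest}\t{c.sbom_digest}" for c in components
    )
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()
```

Sorting the lines before hashing keeps the manifest hash independent of the order in which components were added to the release.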
6. KPIs and Monitoring
6.1 Byte-Identical Rate
Metric: Percentage of SBOM regenerations that produce identical bytes.
Target: 100% for same artifact + same scanner version
Alert: < 99.9% indicates non-determinism bug
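One way to compute this metric is sketched below in Python. It assumes each scan is recorded as an (artifact digest, scanner version, SBOM hash) tuple and that the first scan in each group is the baseline against which later regenerations are compared; both are illustrative choices rather than the documented definition.

```python
from collections import defaultdict


def byte_identical_rate(runs) -> float:
    """Percentage of regenerations whose hash matches the group's first run."""
    groups = defaultdict(list)
    for artifact_digest, scanner_version, sbom_hash in runs:
        groups[(artifact_digest, scanner_version)].append(sbom_hash)

    regenerations = 0
    identical = 0
    for hashes in groups.values():
        baseline = hashes[0]
        for sbom_hash in hashes[1:]:
            regenerations += 1
            if sbom_hash == baseline:
                identical += 1
    if regenerations == 0:
        return 100.0  # nothing was regenerated, so nothing diverged
    return 100.0 * identical / regenerations


def should_alert(rate: float, threshold: float = 99.9) -> bool:
    return rate < threshold
```

Grouping by scanner version matters: a hash change after a scanner upgrade is expected and should not count against the target.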
6.2 Stable-Field Coverage
Metric: Percentage of SBOM fields that are deterministic.
| Field Type | Target |
|---|---|
| Component identifiers | 100% |
| Hashes | 100% |
| Dependencies | 100% |
| Metadata timestamps | 95%+ (fixed epoch) |
| Tool versions | 90%+ (pinned) |
6.3 Gate False Positives
Metric: Signature verification failures due to hash mismatch.
Target: 0% for valid artifacts
Investigation: Any mismatch indicates a canonicalization or regeneration issue.
7. Troubleshooting
7.1 Hash Mismatch on Regeneration
Symptom: Same artifact produces different SBOM hashes.
Causes:
- Timestamp drift: Check if metadata.timestamp varies
- Tool version change: Check scanner/tool versions
- Ordering instability: Check component/dependency ordering
- Unicode normalization: Check for composed vs decomposed characters
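The Unicode cause is the least visible of these, because the two strings render identically. The Python sketch below shows how composed and decomposed forms of the same text hash differently, plus a hypothetical helper for spotting the problem when comparing two field values. Note that RFC 8785 itself does not normalize Unicode, so canonicalization will not hide the difference.

```python
import hashlib
import unicodedata


def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differs_only_by_normalization(left: str, right: str) -> bool:
    """True when two strings differ as code points but match after NFC."""
    if left == right:
        return False
    return unicodedata.normalize("NFC", left) == unicodedata.normalize("NFC", right)


composed = unicodedata.normalize("NFC", "caf\u00e9")  # e-acute as one code point
decomposed = unicodedata.normalize("NFD", composed)   # e + combining accent

# Same rendered text, different bytes, therefore different digests.
hashes_differ = sha256_text(composed) != sha256_text(decomposed)
```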
Debug:
# Compare two SBOMs
stella sbom diff bom1.json bom2.json
# Check canonical form
stella sbom verify bom1.json --canonical --verbose
stella sbom verify bom2.json --canonical --verbose
7.2 serialNumber Warning
Symptom: Warning CDX_SERIAL_NON_DETERMINISTIC during validation.
Cause: SBOM uses urn:uuid: format instead of urn:sha256:.
Fix: Ensure ArtifactDigest is provided when generating SBOM:
var document = new SbomDocument
{
Name = "my-app",
ArtifactDigest = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
// ...
};
7.3 Canonical vs Pretty-Printed
Symptom: SBOM appears valid but fails canonical verification.
Cause: SBOM was saved with indentation/formatting.
Fix:
# Convert to canonical form
stella sbom verify input.json --canonical --output output.json
# Use output.json for signing and storage
7.4 Platform-Specific Differences
Symptom: Same code produces different SBOMs on Windows vs Linux.
Causes:
- Line endings: CR+LF vs LF in embedded content
- Path separators: \ vs / in file paths
- Locale differences: Number formatting, date formatting
Prevention:
- Normalize line endings in CI
- Use forward slashes for paths
- Use invariant culture for formatting
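The first two prevention steps can be sketched in Python as follows. The helper names are hypothetical, and the sketch covers only line endings and path separators; locale-invariant formatting depends on the runtime in use and is not shown.

```python
from pathlib import PureWindowsPath


def normalize_line_endings(text: str) -> str:
    # Convert CR+LF first, then any remaining bare CR, to LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_forward_slashes(path: str) -> str:
    # PureWindowsPath accepts both separators and never touches the
    # filesystem, so this behaves the same on every platform.
    return PureWindowsPath(path).as_posix()
```

Applying both helpers to embedded content and file paths before serialization removes the most common Windows versus Linux differences from the hashed bytes.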
References
- RFC 8785: JSON Canonicalization Scheme
- CycloneDX 1.7 Specification
- SPDX 2.3 Specification
- docs/modules/scanner/signed-sbom-archive-spec.md - Archive format
- docs/modules/scanner/deterministic-sbom-compose.md - Composition rules
- src/Attestor/__Libraries/StellaOps.Attestor.StandardPredicates/Writers/CycloneDxWriter.cs - Implementation
- src/__Libraries/StellaOps.Canonical.Json/CanonJson.cs - Canonicalization library
Changelog
| Date | Change |
|---|---|
| 2026-01-19 | Initial creation (TASK-025-005) |