Add tests for SBOM generation determinism across multiple formats
- Created `StellaOps.TestKit.Tests` project for unit tests related to determinism.
- Implemented `DeterminismManifestTests` to validate deterministic output for canonical bytes and strings, file read/write operations, and error handling for invalid schema versions.
- Added `SbomDeterminismTests` to ensure identical inputs produce consistent SBOMs across SPDX 3.0.1 and CycloneDX 1.6/1.7 formats, including parallel execution tests.
- Updated project references in `StellaOps.Integration.Determinism` to include the new determinism testing library.
docs/testing/determinism-verification.md
# Determinism Verification Guide

**Sprint:** 5100.0007.0003 (Epic B)
**Last Updated:** 2025-12-23

## Overview

StellaOps enforces deterministic artifact generation across all exported formats. This ensures:

1. **Reproducibility**: Given the same inputs, outputs are byte-for-byte identical
2. **Auditability**: Hash verification proves artifact integrity
3. **Compliance**: Regulated environments can replay and verify builds
4. **CI Gating**: Drift detection prevents unintended changes

## Supported Artifact Types

| Type | Format(s) | Test File |
|------|-----------|-----------|
| SBOM | SPDX 3.0.1, CycloneDX 1.6, CycloneDX 1.7 | `SbomDeterminismTests.cs` |
| VEX | OpenVEX, CSAF 2.0 | `VexDeterminismTests.cs` |
| Policy Verdicts | JSON | `PolicyDeterminismTests.cs` |
| Evidence Bundles | JSON, DSSE, in-toto | `EvidenceBundleDeterminismTests.cs` |
| AirGap Bundles | NDJSON | `AirGapBundleDeterminismTests.cs` |
| Advisory Normalization | Canonical JSON | `IngestionDeterminismTests.cs` |

## Determinism Manifest Format

Every deterministic artifact can produce a manifest describing its content hash and generation context.

### Schema (v1.0)

```json
{
  "schemaVersion": "1.0",
  "artifact": {
    "type": "sbom | vex | policy-verdict | evidence-bundle | airgap-bundle",
    "name": "artifact-identifier",
    "version": "1.0.0",
    "format": "SPDX 3.0.1 | CycloneDX 1.6 | OpenVEX | CSAF 2.0 | ..."
  },
  "canonicalHash": {
    "algorithm": "SHA-256",
    "value": "abc123..."
  },
  "toolchain": {
    "platform": ".NET 10.0",
    "components": [
      { "name": "StellaOps.Scanner", "version": "1.0.0" }
    ]
  },
  "inputs": {
    "feedSnapshotHash": "def456...",
    "policyManifestHash": "ghi789...",
    "configHash": "jkl012..."
  },
  "generatedAt": "2025-12-23T18:00:00Z"
}
```

### Field Descriptions

| Field | Description |
|-------|-------------|
| `schemaVersion` | Manifest schema version (currently `1.0`) |
| `artifact.type` | Category of the artifact |
| `artifact.name` | Identifier for the artifact |
| `artifact.version` | Version of the artifact (if applicable) |
| `artifact.format` | Specific format/spec version |
| `canonicalHash.algorithm` | Hash algorithm (always `SHA-256`) |
| `canonicalHash.value` | Lowercase hex hash of canonical bytes |
| `toolchain.platform` | Runtime platform |
| `toolchain.components` | List of generating components with versions |
| `inputs` | Hashes of input artifacts (feed snapshots, policies, etc.) |
| `generatedAt` | ISO-8601 UTC timestamp of generation |

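As a minimal sketch of how a value for `canonicalHash.value` could be derived, the snippet below hashes a byte sequence with SHA-256 and renders the digest as lowercase hex. The helper name `CanonicalHashHex` is hypothetical and uses only the .NET base class library; the project's own `CanonJson.Sha256Hex` presumably does something equivalent, but that is an assumption.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: SHA-256 over the canonical bytes, rendered as lowercase hex.
static string CanonicalHashHex(ReadOnlySpan<byte> canonicalBytes)
{
    byte[] digest = SHA256.HashData(canonicalBytes);
    // Convert.ToHexString emits uppercase, so normalize to the manifest's lowercase form.
    return Convert.ToHexString(digest).ToLowerInvariant();
}

byte[] bytes = Encoding.UTF8.GetBytes("{\"name\":\"demo\"}");
string hashValue = CanonicalHashHex(bytes);
Console.WriteLine(hashValue.Length); // 64 hex characters for a 32-byte digest
```

Normalizing the case in one place keeps string comparisons between manifests simple, since two hex strings that differ only in case would otherwise look like drift.
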
## Creating a Determinism Manifest

Use `DeterminismManifestWriter` from `StellaOps.Testing.Determinism`:

```csharp
using StellaOps.Testing.Determinism;

// Generate artifact bytes
var sbomBytes = GenerateSbom(input, frozenTime);

// Create artifact info
var artifactInfo = new ArtifactInfo
{
    Type = "sbom",
    Name = "my-container-sbom",
    Version = "1.0.0",
    Format = "CycloneDX 1.6"
};

// Create toolchain info
var toolchain = new ToolchainInfo
{
    Platform = ".NET 10.0",
    Components = new[]
    {
        new ComponentInfo { Name = "StellaOps.Scanner", Version = "1.0.0" }
    }
};

// Create manifest
var manifest = DeterminismManifestWriter.CreateManifest(
    sbomBytes,
    artifactInfo,
    toolchain);

// Save manifest
DeterminismManifestWriter.Save(manifest, "determinism.json");
```

## Reading and Verifying Manifests

Use `DeterminismManifestReader` to load a manifest and check artifact bytes against its recorded hash:

```csharp
// Load manifest
var manifest = DeterminismManifestReader.Load("determinism.json");

// Verify artifact bytes match manifest hash
var currentBytes = File.ReadAllBytes("artifact.json");
var isValid = DeterminismManifestReader.Verify(manifest, currentBytes);

if (!isValid)
{
    throw new DeterminismDriftException(
        $"Artifact hash mismatch. Expected: {manifest.CanonicalHash.Value}");
}
```

## Determinism Rules

### 1. Canonical JSON Serialization

All JSON output must use canonical serialization via `StellaOps.Canonical.Json`:

```csharp
using System.Text;
using StellaOps.Canonical.Json;

var json = CanonJson.Serialize(myObject);
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(json));
```

Rules:
- Keys sorted lexicographically
- No trailing whitespace
- Unix line endings (`\n`)
- No BOM
- UTF-8 encoding

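To illustrate the key-sorting rule, here is a rough sketch built on `System.Text.Json.Nodes` rather than on `StellaOps.Canonical.Json`, whose internals are not shown in this guide. The helpers `SortKeys` and `Canonicalize` are hypothetical, and the sketch covers only key order and compact UTF-8 output; it is not a full canonicalization scheme.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json.Nodes;

// Recursively rebuild a JSON tree with object keys in ordinal order.
static JsonNode? SortKeys(JsonNode? node) => node switch
{
    JsonObject obj => new JsonObject(
        obj.OrderBy(p => p.Key, StringComparer.Ordinal)
           .Select(p => KeyValuePair.Create(p.Key, SortKeys(p.Value)))),
    JsonArray arr => new JsonArray(arr.Select(n => SortKeys(n)).ToArray()),
    // Leaf values are cloned because a JsonNode can only have one parent.
    _ => node?.DeepClone()
};

// Parse, sort keys, and emit compact UTF-8 bytes (GetBytes never writes a BOM).
static byte[] Canonicalize(string json)
{
    JsonNode? sorted = SortKeys(JsonNode.Parse(json));
    string compact = sorted?.ToJsonString() ?? "null";
    return Encoding.UTF8.GetBytes(compact);
}

byte[] a = Canonicalize("{\"b\":1,\"a\":{\"d\":2,\"c\":3}}");
byte[] b = Canonicalize("{ \"a\": { \"c\": 3, \"d\": 2 }, \"b\": 1 }");

Console.WriteLine(Encoding.UTF8.GetString(a)); // {"a":{"c":3,"d":2},"b":1}
Console.WriteLine(a.AsSpan().SequenceEqual(b)); // True
```

Array elements keep their original order here, because array order usually carries meaning; sorting collections before they are serialized is a separate concern.
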
### 2. Frozen Timestamps

All timestamps must be provided externally or use `DeterministicTime`:

```csharp
// ❌ BAD - Non-deterministic
var timestamp = DateTimeOffset.UtcNow;

// ✅ GOOD - Deterministic
var timestamp = frozenTime; // Passed as parameter
```

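Passing the time in is only half of the rule; the way it is rendered also has to be stable. The sketch below, with a hypothetical `FormatTimestamp` helper, normalizes an injected timestamp to UTC and formats it with the invariant culture, so the same instant yields the same string regardless of offset or machine locale.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: render an injected timestamp in a fixed, culture-independent UTC form.
static string FormatTimestamp(DateTimeOffset frozenTime) =>
    frozenTime.ToUniversalTime()
              .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

var frozenTime = new DateTimeOffset(2025, 12, 23, 18, 0, 0, TimeSpan.Zero);
// Same instant, expressed with a +02:00 offset.
var sameInstantElsewhere = frozenTime.ToOffset(TimeSpan.FromHours(2));

Console.WriteLine(FormatTimestamp(frozenTime));           // 2025-12-23T18:00:00Z
Console.WriteLine(FormatTimestamp(sameInstantElsewhere)); // 2025-12-23T18:00:00Z
```
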
### 3. Deterministic IDs

UUIDs and other IDs must be derived from content, not generated randomly:

```csharp
// ❌ BAD - Random UUID
var id = Guid.NewGuid();

// ✅ GOOD - Content-derived ID
var seed = $"{input.Name}:{input.Version}:{timestamp:O}";
var hash = CanonJson.Sha256Hex(Encoding.UTF8.GetBytes(seed));
// The first 32 hex characters are the 16 bytes a Guid needs
var id = new Guid(Convert.FromHexString(hash[..32]));
```

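The same idea can be tried without the StellaOps libraries. This is a minimal sketch using only the .NET base class library; `DeriveId` and the seed strings are hypothetical names chosen for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derive a stable Guid from a seed string via SHA-256.
static Guid DeriveId(string seed)
{
    byte[] digest = SHA256.HashData(Encoding.UTF8.GetBytes(seed));
    // A Guid holds 16 bytes, so only the first half of the 32-byte digest is used.
    return new Guid(digest.AsSpan(0, 16));
}

Guid first = DeriveId("libexample:1.2.3:2025-12-23T18:00:00Z");
Guid second = DeriveId("libexample:1.2.3:2025-12-23T18:00:00Z");
Guid other = DeriveId("libexample:1.2.4:2025-12-23T18:00:00Z");

Console.WriteLine(first == second); // True
Console.WriteLine(first == other);  // False
```

Note that an ID built this way does not set the UUID version or variant bits, so it is stable but not a standards-conforming name-based UUID. Whether that matters depends on what consumes the ID.
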
### 4. Stable Ordering

Collections must be sorted before serialization:

```csharp
// ❌ BAD - Non-deterministic order
var items = dictionary.Values;

// ✅ GOOD - Sorted by dictionary key using ordinal comparison
var items = dictionary
    .OrderBy(kv => kv.Key, StringComparer.Ordinal)
    .Select(kv => kv.Value);
```

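A small self-contained example of why the comparer matters, using made-up component names: ordinal comparison orders by UTF-16 code unit, so uppercase letters sort before lowercase ones and the result does not vary with the machine's culture settings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data only; the names and versions are not from the project.
var components = new Dictionary<string, string>
{
    ["zlib"] = "1.3.1",
    ["OpenSSL"] = "3.0.13",
    ["curl"] = "8.5.0",
};

// Sort by key with an ordinal comparer, independent of dictionary enumeration order.
string[] ordered = components
    .OrderBy(kv => kv.Key, StringComparer.Ordinal)
    .Select(kv => $"{kv.Key}@{kv.Value}")
    .ToArray();

Console.WriteLine(string.Join(", ", ordered)); // OpenSSL@3.0.13, curl@8.5.0, zlib@1.3.1
```
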
### 5. Parallel Safety

Determinism must hold under parallel execution:

```csharp
// Generate the same artifact 20 times concurrently
var tasks = Enumerable.Range(0, 20)
    .Select(_ => Task.Run(() => GenerateArtifact(input, frozenTime)))
    .ToArray();

var results = await Task.WhenAll(tasks);
results.Should().AllBeEquivalentTo(results[0]); // All identical
```

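For readers without the test project at hand, the check can be reproduced with a stand-in generator and a plain LINQ comparison. In this sketch `GenerateArtifact` is a hypothetical pure function (a hash of its inputs), not the real generator.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

// Stand-in generator (hypothetical): a pure function of its inputs, with no shared state.
static string GenerateArtifact(string input, DateTimeOffset frozenTime) =>
    Convert.ToHexString(SHA256.HashData(
        Encoding.UTF8.GetBytes($"{input}|{frozenTime.UtcDateTime:O}")));

var frozenTime = new DateTimeOffset(2025, 12, 23, 18, 0, 0, TimeSpan.Zero);

// Run the same generation 20 times concurrently and collect every result.
string[] results = await Task.WhenAll(
    Enumerable.Range(0, 20)
        .Select(_ => Task.Run(() => GenerateArtifact("demo-input", frozenTime))));

bool allIdentical = results.All(r => r == results[0]);
Console.WriteLine(allIdentical); // True
```

A generator that reads a shared counter, a static cache, or the system clock would be the kind of code this test is meant to catch.
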
## CI Integration

### PR Merge Gate

The determinism gate runs on pull requests (on each new push and when a draft is marked ready for review) and must pass before merge:

```yaml
# .gitea/workflows/determinism-gate.yaml
name: Determinism Gate
on:
  pull_request:
    types: [synchronize, ready_for_review]
jobs:
  determinism:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'
      - name: Run Determinism Tests
        run: |
          dotnet test tests/integration/StellaOps.Integration.Determinism \
            --logger "trx;LogFileName=determinism.trx"
      - name: Generate Determinism Manifest
        run: |
          dotnet run --project tools/DeterminismManifestGenerator -- \
            --output determinism.json
      - name: Upload Determinism Artifact
        uses: actions/upload-artifact@v4
        with:
          name: determinism-manifest
          path: determinism.json
```

### Baseline Storage

Determinism baselines are stored as CI artifacts:

```
ci-artifacts/
  determinism/
    baseline/
      sbom-spdx-3.0.1.json
      sbom-cyclonedx-1.6.json
      sbom-cyclonedx-1.7.json
      vex-openvex.json
      vex-csaf.json
      policy-verdict.json
      evidence-bundle.json
      airgap-bundle.json
```

### Drift Detection

When a PR changes artifact output:

1. CI compares the new manifest hash against the baseline
2. If they differ, CI fails with a diff report
3. The developer must either:
   - Fix the regression (restore determinism)
   - Update the baseline (if the change is intentional)

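The comparison in steps 1 and 2 could look roughly like the sketch below. It assumes the baseline and current hashes have already been loaded into dictionaries keyed by baseline file name; `FindDrift` and the sample hash values are hypothetical, and the real gate's report format is not documented here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical drift check: compare per-artifact hashes and report every mismatch.
static List<string> FindDrift(
    IReadOnlyDictionary<string, string> baseline,
    IReadOnlyDictionary<string, string> current)
{
    var report = new List<string>();
    // Visit the union of names in ordinal order so the report itself is deterministic.
    foreach (string name in baseline.Keys.Union(current.Keys).OrderBy(n => n, StringComparer.Ordinal))
    {
        baseline.TryGetValue(name, out var expected);
        current.TryGetValue(name, out var actual);
        if (!string.Equals(expected, actual, StringComparison.Ordinal))
        {
            report.Add($"{name}: expected {expected ?? "<missing>"}, actual {actual ?? "<missing>"}");
        }
    }
    return report;
}

var baseline = new Dictionary<string, string>
{
    ["sbom-cyclonedx-1.6.json"] = "abc123",
    ["vex-openvex.json"] = "def456",
};
var current = new Dictionary<string, string>
{
    ["sbom-cyclonedx-1.6.json"] = "abc123",
    ["vex-openvex.json"] = "0ff1ce",
};

List<string> drift = FindDrift(baseline, current);
drift.ForEach(Console.WriteLine); // vex-openvex.json: expected def456, actual 0ff1ce
```

A CI step built on this would exit with a non-zero code whenever the report is non-empty, which is what turns drift into a failed check.
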
### Baseline Update Process

To intentionally update a baseline:

```bash
# 1. Run determinism tests to generate new manifests
dotnet test tests/integration/StellaOps.Integration.Determinism

# 2. Update baseline files
cp determinism/*.json ci-artifacts/determinism/baseline/

# 3. Commit with explicit message
git add ci-artifacts/determinism/baseline/
git commit -m "chore(determinism): update baselines for [reason]

Breaking: [explain what changed]
Justification: [explain why this is correct]"
```

## Replay Verification

To verify an artifact was produced deterministically:

```bash
# 1. Get the manifest
curl -O https://releases.stellaops.io/v1.0.0/sbom.determinism.json

# 2. Get the artifact
curl -O https://releases.stellaops.io/v1.0.0/sbom.cdx.json

# 3. Verify (arguments after "--" are passed to the tool, not to dotnet)
dotnet run --project tools/DeterminismVerifier -- \
  --manifest sbom.determinism.json \
  --artifact sbom.cdx.json
```

Output:
```
Determinism Verification
========================
Artifact: sbom.cdx.json
Manifest: sbom.determinism.json
Expected Hash: abc123...
Actual Hash: abc123...
Status: ✅ VERIFIED
```

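The core of such a check is small. The sketch below is not the `DeterminismVerifier` tool; it is a hypothetical `VerifyAgainstManifest` function that reads `canonicalHash.value` from manifest JSON and compares it with a freshly computed SHA-256 of the artifact bytes, assuming the published artifact is already in canonical form.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical verifier: recompute the artifact hash and compare it with the manifest.
static bool VerifyAgainstManifest(string manifestJson, byte[] artifactBytes)
{
    using JsonDocument manifest = JsonDocument.Parse(manifestJson);
    string expected = manifest.RootElement
        .GetProperty("canonicalHash")
        .GetProperty("value")
        .GetString() ?? "";
    string actual = Convert.ToHexString(SHA256.HashData(artifactBytes)).ToLowerInvariant();
    return string.Equals(expected, actual, StringComparison.Ordinal);
}

// Build a tiny in-memory artifact and a manifest that records its hash.
byte[] artifact = Encoding.UTF8.GetBytes("{\"bomFormat\":\"CycloneDX\"}");
string hash = Convert.ToHexString(SHA256.HashData(artifact)).ToLowerInvariant();
string manifestJson =
    $"{{\"schemaVersion\":\"1.0\",\"canonicalHash\":{{\"algorithm\":\"SHA-256\",\"value\":\"{hash}\"}}}}";

bool intact = VerifyAgainstManifest(manifestJson, artifact);
bool tampered = VerifyAgainstManifest(manifestJson, Encoding.UTF8.GetBytes("tampered"));

Console.WriteLine(intact);   // True
Console.WriteLine(tampered); // False
```
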
## Test Files Reference

All determinism tests are in `tests/integration/StellaOps.Integration.Determinism/`:

| File | Tests | Description |
|------|-------|-------------|
| `DeterminismValidationTests.cs` | 16 | Manifest format and reader/writer |
| `SbomDeterminismTests.cs` | 14 | SPDX 3.0.1, CycloneDX 1.6/1.7 |
| `VexDeterminismTests.cs` | 17 | OpenVEX, CSAF 2.0 |
| `PolicyDeterminismTests.cs` | 18 | Policy verdict artifacts |
| `EvidenceBundleDeterminismTests.cs` | 15 | DSSE, in-toto attestations |
| `AirGapBundleDeterminismTests.cs` | 14 | NDJSON bundles, manifests |
| `IngestionDeterminismTests.cs` | 17 | NVD/OSV/GHSA/CSAF normalization |

## Troubleshooting

### Hash Mismatch

If you see a hash mismatch:

1. **Check timestamps**: Ensure frozen time is used
2. **Check ordering**: Ensure all collections are sorted
3. **Check IDs**: Ensure IDs are content-derived
4. **Check encoding**: Ensure UTF-8 without BOM

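The encoding check in particular is easy to automate. As a sketch, the hypothetical `FindEncodingIssues` helper below inspects raw bytes for a UTF-8 byte order mark and for CRLF line endings, two byte-level differences that change a hash without changing how the file looks in an editor.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical diagnostic: flag byte-level issues that commonly change a hash.
static string[] FindEncodingIssues(ReadOnlySpan<byte> bytes)
{
    var issues = new List<string>();
    // Encoding.UTF8.Preamble is the three-byte BOM: EF BB BF.
    if (bytes.StartsWith(Encoding.UTF8.Preamble)) issues.Add("UTF-8 BOM present");
    if (bytes.IndexOf("\r\n"u8) >= 0) issues.Add("CRLF line endings present");
    return issues.ToArray();
}

byte[] suspicious = { 0xEF, 0xBB, 0xBF, (byte)'{', (byte)'}', (byte)'\r', (byte)'\n' };
byte[] clean = Encoding.UTF8.GetBytes("{}\n");

Console.WriteLine(string.Join("; ", FindEncodingIssues(suspicious))); // UTF-8 BOM present; CRLF line endings present
Console.WriteLine(FindEncodingIssues(clean).Length);                  // 0
```
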
### Flaky Tests

If determinism tests are flaky:

1. **Check parallelism**: Ensure no shared mutable state
2. **Check time zones**: Use UTC explicitly
3. **Check random sources**: Remove all random number generation
4. **Check hash inputs**: Ensure all inputs are captured

### CI Failures

If the CI determinism gate fails:

1. Compare the diff between expected and actual
2. Identify which field changed
3. Trace back to the code change that caused it
4. Either fix the regression or update the baseline with justification

## Related Documentation

- [Testing Strategy Models](testing-strategy-models.md) - Overview of testing models
- [Canonical JSON Specification](../11_DATA_SCHEMAS.md#canonical-json) - JSON serialization rules
- [CI/CD Workflows](../modules/devops/architecture.md) - CI pipeline details
- [Evidence Bundle Schema](../modules/evidence-locker/architecture.md) - Bundle format reference