# StellaOps Reproducibility Specification
This document defines the reproducibility guarantees, verdict identity computation, and replay procedures for StellaOps artifacts.
## Overview
StellaOps provides **deterministic, reproducible outputs** for all security artifacts:
- SBOM generation (CycloneDX 1.6, SPDX 3.0.1)
- VEX statements (OpenVEX)
- Policy verdicts
- Delta computations
- DSSE attestations and Sigstore bundles

**Core Guarantee:** Given identical inputs (image digest, advisory feeds, policies, tool versions), StellaOps produces byte-for-byte identical outputs with matching content-addressed identifiers.
## Verdict Identity Formula
### Content-Addressed Verdict ID
All policy verdicts use content-addressed identifiers computed as:
```
VerdictId = "verdict:sha256:" + HexLower(SHA256(CanonicalJson(VerdictPayload)))
```
Where `VerdictPayload` is a JSON object with the following structure:
```json
{
  "_canonVersion": "stella:canon:v1",
  "deltaId": "<content-addressed delta ID>",
  "blockingDrivers": [
    {
      "cveId": "CVE-...",
      "description": "...",
      "purl": "pkg:...",
      "severity": "Critical|High|Medium|Low",
      "type": "new-reachable-cve|..."
    }
  ],
  "warningDrivers": [...],
  "appliedExceptions": ["EXCEPTION-001", ...],
  "gateLevel": "G0|G1|G2|G3|G4"
}
```
**Determinism guarantees:**
- `blockingDrivers` and `warningDrivers` are sorted by `type`, then `cveId`, then `purl`, then `severity`
- `appliedExceptions` are sorted lexicographically
- All string comparisons use Ordinal (case-sensitive, lexicographic)
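- Canonical JSON follows RFC 8785 (JCS), with object keys sorted by UTF-16 code unit (ordinal order, so `_canonVersion` sorts before the lowercase keys)
- The `_canonVersion` field records the canonicalization algorithm version inside the hashed payload, so a change to the algorithm is marked by a new version value rather than silently changing the IDs of existing verdicts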
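
The formula and ordering rules above can be illustrated with a minimal Python sketch. It is not the StellaOps implementation: `canonical_json` only approximates RFC 8785 (it is adequate for payloads made of strings, lists, and objects, but does not implement JCS number formatting or UTF-16 key ordering), and `normalize_payload` and `compute_verdict_id` are hypothetical helper names. Treating an absent driver list as empty is also an assumption.

```python
import hashlib
import json

DRIVER_SORT_FIELDS = ("type", "cveId", "purl", "severity")


def canonical_json(value) -> bytes:
    # Approximation of RFC 8785: sorted keys, no insignificant whitespace,
    # UTF-8 output. Python compares strings by code point, which matches
    # ordinal ordering for the ASCII keys used in this payload.
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def normalize_payload(payload: dict) -> dict:
    def driver_key(driver: dict) -> tuple:
        # Sort by type, then cveId, then purl, then severity.
        return tuple(driver.get(field, "") for field in DRIVER_SORT_FIELDS)

    normalized = dict(payload)
    for name in ("blockingDrivers", "warningDrivers"):
        normalized[name] = sorted(payload.get(name, []), key=driver_key)
    normalized["appliedExceptions"] = sorted(payload.get("appliedExceptions", []))
    return normalized


def compute_verdict_id(payload: dict) -> str:
    digest = hashlib.sha256(canonical_json(normalize_payload(payload))).hexdigest()
    return "verdict:sha256:" + digest  # hexdigest() is already lowercase


payload = {
    "_canonVersion": "stella:canon:v1",
    "deltaId": "delta:sha256:abc123",
    "blockingDrivers": [
        {"type": "new-reachable-cve", "cveId": "CVE-2024-0002",
         "purl": "pkg:npm/b@1.0.0", "severity": "High", "description": "second"},
        {"type": "new-reachable-cve", "cveId": "CVE-2024-0001",
         "purl": "pkg:npm/a@1.0.0", "severity": "Critical", "description": "first"},
    ],
    "warningDrivers": [],
    "appliedExceptions": ["EXCEPTION-002", "EXCEPTION-001"],
    "gateLevel": "G2",
}

# Same content supplied in a different order.
shuffled = dict(
    payload,
    blockingDrivers=payload["blockingDrivers"][::-1],
    appliedExceptions=["EXCEPTION-001", "EXCEPTION-002"],
)

verdict_id = compute_verdict_id(payload)
assert verdict_id == compute_verdict_id(shuffled)
print(verdict_id)
```

Because the lists are sorted before hashing, the order in which drivers and exceptions are collected does not affect the resulting ID.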
### VerdictIdGenerator Implementation
The `VerdictIdGenerator` class in `StellaOps.Policy.Deltas` computes deterministic verdict IDs:
```csharp
using System.Diagnostics;       // Debug.Assert
using StellaOps.Policy.Deltas;  // VerdictIdGenerator; the builder types are assumed to live here too

// Create a verdict with content-addressed ID
var verdict = new DeltaVerdictBuilder()
    .AddBlockingDriver(new DeltaDriver
    {
        Type = "new-reachable-cve",
        CveId = "CVE-2024-001",
        Severity = DeltaDriverSeverity.Critical,
        Description = "Critical CVE is now reachable"
    })
    .Build("delta:sha256:abc123...");

// VerdictId is deterministic:
// verdict.VerdictId == "verdict:sha256:..."

// Recompute for verification.
// Note: Debug.Assert is compiled out of Release builds.
var generator = new VerdictIdGenerator();
var recomputed = generator.ComputeVerdictId(verdict);
Debug.Assert(recomputed == verdict.VerdictId);
```
### Input Stamps
Every artifact includes `InputStamps` capturing the provenance of all inputs:
```json
{
  "feedSnapshotHash": "sha256:abc123...",
  "policyManifestHash": "sha256:def456...",
  "sourceCodeHash": "sha256:789ghi...",
  "baseImageDigest": "sha256:jkl012...",
  "vexDocumentHashes": ["sha256:mno345..."],
  "toolchainVersion": "1.0.0",
  "custom": {}
}
```
### Determinism Manifest
The `DeterminismManifest` (schema v1.0) tracks artifact reproducibility:
```json
{
  "schemaVersion": "1.0",
  "artifact": {
    "type": "verdict",
    "name": "scan-verdict",
    "version": "2025-12-24T12:00:00Z",
    "format": "StellaOps.DeltaVerdict@1"
  },
  "canonicalHash": {
    "algorithm": "SHA-256",
    "value": "abc123def456...",
    "encoding": "hex"
  },
  "inputs": {
    "feedSnapshotHash": "sha256:...",
    "policyManifestHash": "sha256:...",
    "baseImageDigest": "sha256:..."
  },
  "toolchain": {
    "platform": ".NET 10.0",
    "components": [
      {"name": "StellaOps.Scanner", "version": "1.0.0"},
      {"name": "StellaOps.Policy", "version": "1.0.0"}
    ]
  },
  "reproducibility": {
    "clockFixed": true,
    "orderingGuarantee": "stable-sort",
    "normalizationRules": ["UTF-8", "LF", "canonical-json"]
  },
  "generatedAt": "2025-12-24T12:00:00Z"
}
```
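A manifest is most useful when the manifest of a replayed run is compared against the original. The following Python sketch shows one way to do that comparison. The function name `reproducibility_mismatches` and the choice of which sections to compare are assumptions, not part of the schema; timestamps are left out here because they match only when the clock is fixed.

```python
import copy

# Sections assumed to be identical for a faithful replay.
COMPARED_SECTIONS = ("canonicalHash", "inputs", "toolchain", "reproducibility")


def reproducibility_mismatches(original: dict, replayed: dict) -> list:
    """Return the names of the manifest sections that differ between two runs."""
    return [s for s in COMPARED_SECTIONS if original.get(s) != replayed.get(s)]


original = {
    "schemaVersion": "1.0",
    "canonicalHash": {"algorithm": "SHA-256", "value": "abc123", "encoding": "hex"},
    "inputs": {"feedSnapshotHash": "sha256:feed-a",
               "policyManifestHash": "sha256:policy-a"},
    "toolchain": {"platform": ".NET 10.0",
                  "components": [{"name": "StellaOps.Scanner", "version": "1.0.0"}]},
    "reproducibility": {"clockFixed": True, "orderingGuarantee": "stable-sort"},
    "generatedAt": "2025-12-24T12:00:00Z",
}

# A replay that differs only in its timestamp.
replayed = copy.deepcopy(original)
replayed["generatedAt"] = "2025-12-25T08:30:00Z"

# A run against a different feed snapshot, which also changes the hash.
drifted = copy.deepcopy(original)
drifted["inputs"]["feedSnapshotHash"] = "sha256:feed-b"
drifted["canonicalHash"]["value"] = "def456"

print(reproducibility_mismatches(original, replayed))  # []
print(reproducibility_mismatches(original, drifted))   # ['canonicalHash', 'inputs']
```

When `inputs` appears in the result, a differing `canonicalHash` is expected and does not by itself indicate non-determinism.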
## Canonical JSON Serialization
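All JSON outputs follow RFC 8785 (JSON Canonicalization Scheme):
1. Object keys sorted lexicographically by UTF-16 code unit
2. No whitespace between tokens
3. Minimal string escaping: only `"`, `\`, and control characters are escaped; non-ASCII characters are written literally
4. Numbers in shortest ECMAScript form (no leading zeros, no trailing fractional zeros)
5. UTF-8 encoding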
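
The practical effect of canonicalization is that two documents with the same content serialize to the same bytes, whatever their original key order or whitespace. The Python sketch below demonstrates this with the standard library. It approximates RFC 8785 rather than fully implementing it (number formatting and UTF-16 key ordering are not handled), and `canonicalize` is a hypothetical helper name.

```python
import json


def canonicalize(value) -> bytes:
    # sort_keys        -> deterministic key order
    # separators       -> no whitespace between tokens
    # ensure_ascii=False -> RFC 8785 writes non-ASCII characters as UTF-8,
    #                       not as \u escapes
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


# The same object written with different key order and whitespace.
a = json.loads(
    '{"gateLevel": "G2", "deltaId": "delta:sha256:abc", '
    '"_canonVersion": "stella:canon:v1"}'
)
b = json.loads(
    '{\n  "_canonVersion": "stella:canon:v1",\n'
    '  "deltaId": "delta:sha256:abc",\n  "gateLevel": "G2"\n}'
)

assert canonicalize(a) == canonicalize(b)
print(canonicalize(a).decode("utf-8"))
# {"_canonVersion":"stella:canon:v1","deltaId":"delta:sha256:abc","gateLevel":"G2"}
```

For production use, a dedicated RFC 8785 library is a safer choice than this approximation, particularly when payloads contain floating-point numbers.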
## DSSE Attestation Format
### Envelope Structure
```json
{
  "payloadType": "application/vnd.in-toto+json",
  "payload": "<base64url-encoded statement>",
  "signatures": [
    {
      "keyid": "sha256:...",
      "sig": "<base64url-encoded signature>"
    }
  ]
}
```
### In-toto Statement
```json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "registry.example.com/image:tag",
      "digest": {"sha256": "abc123..."}
    }
  ],
  "predicateType": "https://stellaops.io/attestation/verdict/v1",
  "predicate": {
    "verdictId": "verdict:sha256:...",
    "status": "pass",
    "gate": "G2",
    "inputs": {...},
    "evidence": [...]
  }
}
```
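In DSSE, the signature does not cover the raw payload but its pre-authentication encoding (PAE), which binds the payload type to the payload bytes. The Python sketch below wraps a statement in an envelope of the shape shown above. `build_envelope`, `demo_sign`, and the `demo-key` key ID are hypothetical, and the HMAC signer is only a stand-in so the example runs with the standard library; a real deployment would sign with an asymmetric key.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the byte string that gets signed."""
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"), type_bytes,
        str(len(payload)).encode("ascii"), payload,
    ])


def build_envelope(statement: dict, keyid: str, sign) -> dict:
    # Serialize deterministically so the same statement yields the same payload.
    payload = json.dumps(statement, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = sign(pae(PAYLOAD_TYPE, payload))
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.urlsafe_b64encode(payload).decode("ascii"),
        "signatures": [
            {"keyid": keyid, "sig": base64.urlsafe_b64encode(signature).decode("ascii")}
        ],
    }


DEMO_KEY = b"demo-key-not-for-production"


def demo_sign(message: bytes) -> bytes:
    # Stand-in signer (HMAC-SHA256), used only to keep the sketch self-contained.
    return hmac.new(DEMO_KEY, message, hashlib.sha256).digest()


statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{"name": "registry.example.com/image:tag",
                 "digest": {"sha256": "abc123"}}],
    "predicateType": "https://stellaops.io/attestation/verdict/v1",
    "predicate": {"status": "pass", "gate": "G2"},
}

envelope = build_envelope(statement, "demo-key", demo_sign)
print(json.dumps(envelope, indent=2))
```

A verifier reverses the steps: it decodes `payload`, recomputes the PAE from `payloadType` and the decoded bytes, and checks each signature against that encoding.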
## Sigstore Bundle Format
StellaOps produces Sigstore bundles (v0.3) for offline verification:
```json
{
  "mediaType": "application/vnd.dev.sigstore.bundle.v0.3+json",
  "verificationMaterial": {
    "certificate": {...},
    "tlogEntries": [{
      "logIndex": "12345",
      "logId": {...},
      "inclusionProof": {...}
    }]
  },
  "dsseEnvelope": {...}
}
```
## Replay Procedure
### Prerequisites
1. Offline bundle containing:
   - Advisory feed snapshot
   - Policy pack
   - VEX documents
   - Tool binaries (pinned versions)
### Steps
```bash
# 1. Extract offline bundle
stella offline extract --bundle offline-kit.tar.gz --output ./replay

# 2. Set deterministic environment
export STELLAOPS_DETERMINISTIC_SEED=42
export STELLAOPS_CLOCK_FIXED=2025-12-24T12:00:00Z

# 3. Run scan with pinned inputs
stella scan \
  --image registry.example.com/image@sha256:abc123 \
  --feeds ./replay/feeds \
  --policies ./replay/policies \
  --output ./replay/output

# 4. Verify hash matches original
stella verify \
  --manifest ./replay/output/manifest.json \
  --expected-hash sha256:def456...

# 5. Verify DSSE attestation
stella attest verify \
  --bundle ./replay/output/bundle.sigstore \
  --policy verification-policy.yaml
```
### Verification Policy
```yaml
apiVersion: stellaops.io/v1
kind: VerificationPolicy
metadata:
  name: audit-verification
spec:
  requiredPredicateTypes:
    - https://stellaops.io/attestation/verdict/v1
  trustedIssuers:
    - https://accounts.stellaops.io
  maxAge: 90d
  requireRekorEntry: true
  unknownBudget:
    maxTotal: 5
    action: fail
```
## Unknown Budget Attestation
Policy thresholds are attested in verdict bundles:
```json
{
  "predicateType": "https://stellaops.io/attestation/budget-check/v1",
  "predicate": {
    "environment": "production",
    "budgetConfig": {
      "maxUnknownCount": 5,
      "maxCumulativeUncertainty": 2.0,
      "reasonLimits": {
        "Reachability": 0,
        "Identity": 2,
        "Provenance": 2
      }
    },
    "actualCounts": {
      "total": 3,
      "byReason": {"Identity": 2, "Provenance": 1}
    },
    "result": "pass",
    "configHash": "sha256:..."
  }
}
```
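The `result` field can be read as the outcome of comparing `actualCounts` against `budgetConfig`. The Python sketch below shows one plausible evaluation. The semantics are assumptions: limits are treated as inclusive, reasons without an entry in `reasonLimits` are treated as unconstrained, and `maxCumulativeUncertainty` is not evaluated because the example carries no uncertainty figures. `check_unknown_budget` is a hypothetical name.

```python
def check_unknown_budget(budget_config: dict, actual_counts: dict) -> tuple:
    """Return ("pass" | "fail", list of violation messages)."""
    violations = []

    total = actual_counts.get("total", 0)
    max_total = budget_config.get("maxUnknownCount")
    if max_total is not None and total > max_total:
        violations.append(f"total {total} exceeds maxUnknownCount {max_total}")

    limits = budget_config.get("reasonLimits", {})
    # Iterate in sorted order so the messages are deterministic.
    for reason, count in sorted(actual_counts.get("byReason", {}).items()):
        limit = limits.get(reason)
        if limit is not None and count > limit:
            violations.append(f"{reason}: {count} exceeds limit {limit}")

    return ("fail" if violations else "pass", violations)


budget_config = {
    "maxUnknownCount": 5,
    "maxCumulativeUncertainty": 2.0,
    "reasonLimits": {"Reachability": 0, "Identity": 2, "Provenance": 2},
}
actual_counts = {"total": 3, "byReason": {"Identity": 2, "Provenance": 1}}

print(check_unknown_budget(budget_config, actual_counts))
# ('pass', [])

print(check_unknown_budget(budget_config, {"total": 1, "byReason": {"Reachability": 1}}))
# ('fail', ['Reachability: 1 exceeds limit 0'])
```

With the figures from the attestation above, every count is within its limit, which agrees with the attested `"result": "pass"`.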
## Schema Versions
| Format | Version | Schema Location |
|--------|---------|-----------------|
| CycloneDX | 1.6 | `docs/schemas/cyclonedx-bom-1.6.schema.json` |
| SPDX | 3.0.1 | `docs/schemas/spdx-3.0.1.schema.json` |
| OpenVEX | 0.2.0 | `docs/schemas/openvex-0.2.0.schema.json` |
| Sigstore Bundle | 0.3 | `docs/schemas/sigstore-bundle-0.3.schema.json` |
| DeterminismManifest | 1.0 | `docs/schemas/determinism-manifest-1.0.schema.json` |
## CI Integration
### Schema Validation
```yaml
# .gitea/workflows/schema-validation.yml
- name: Validate CycloneDX
  run: |
    sbom-utility validate \
      --input-file ${{ matrix.fixture }} \
      --schema docs/schemas/cyclonedx-bom-1.6.schema.json
```
### Determinism Gate
```yaml
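# .gitea/workflows/determinism-gate.yml
- name: Verify Verdict Hash
  run: |
    # Fail if either scan fails or if jq finds no verdictId (-e exits non-zero on null)
    set -euo pipefail
    HASH1=$(stella scan --image test:latest --output /tmp/run1 | jq -er '.verdictId')
    HASH2=$(stella scan --image test:latest --output /tmp/run2 | jq -er '.verdictId')
    [ "$HASH1" = "$HASH2" ] || { echo "Verdict IDs differ: $HASH1 vs $HASH2" >&2; exit 1; }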
```
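The gate above compares verdict IDs only. Since the core guarantee is byte-for-byte identical output, a stricter check could hash every file written by the two runs. The Python sketch below is one way to do that. It assumes each run writes its artifacts under its own output directory (such as `/tmp/run1` and `/tmp/run2`); `hash_tree` and `diff_runs` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict:
    """Map each file's path relative to root to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        relative = path.relative_to(root).as_posix()
        digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_runs(run1: Path, run2: Path) -> list:
    """Return the sorted relative paths that are missing from one run or differ."""
    a, b = hash_tree(run1), hash_tree(run2)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `diff_runs(Path("/tmp/run1"), Path("/tmp/run2"))` returns an empty list when the two output trees are identical, and a CI step could fail the job whenever the list is non-empty.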
## Related Documentation
- [Testing Strategy](testing/testing-strategy-models.md)
- [Determinism Verification](testing/determinism-verification.md)
- [DSSE Attestation Guide](modules/attestor/README.md)
- [Offline Operation](24_OFFLINE_KIT.md)
- [Proof Bundle Spec](modules/triage/proof-bundle-spec.md)
## Changelog
| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-24 | Initial specification based on product advisory gap analysis |