Stella Ops — Deterministic Replay Specification

Version: 1.0
Status: Draft / Internal Technical Reference
Audience: Core developers, module maintainers, audit engineers.


1. Purpose

Deterministic Replay allows any completed Stella Ops scan to be reproduced byte-for-byte with full cryptographic validation.
It guarantees that SBOMs, Findings, and VEX evaluations can be re-executed later to:

  • prove historical compliance decisions,
  • attribute changes precisely to feeds, rules, or tools,
  • support dual-signing (FIPS + regional crypto),
  • and anchor cryptographic evidence in offline or public ledgers.

Replay requires that all inputs and environmental conditions are captured, hashed, and sealed at scan time.


2. Architecture Overview

graph TD
A[Scanner.WebService] --> B[Replay Manifest]
A --> C[InputBundle]
A --> D[OutputBundle]
B --> E[DSSE Envelope]
C --> F[Feedser Snapshot Export]
C --> G[Policy/Lattice Bundle]
D --> H["DSSE Outputs (SBOM, Findings, VEX)"]
E --> I[MongoDB: replay_runs]
C --> J[Blob Store: Input/Output Bundles]

Core Artifacts

| Artifact | Description | Format |
|---|---|---|
| Replay Manifest | Immutable JSON describing all scan inputs and outputs. | JSON (canonicalized) |
| InputBundle | Feeds, rules, policies, tool binaries (hashed). | .tar.zst |
| OutputBundle | SBOM, Findings, VEX, logs. | .tar.zst |
| DSSE Envelope | Signed metadata for each artifact. | JSON / JWS |
| Merkle Map | Layer and feed chunk trees. | JSON (embedded or sidecar) |

3. Replay Manifest Schema (v1)

3.1 Top-level Layout

{
  "schemaVersion": "1.0",
  "scan": {
    "id": "uuid",
    "time": "2025-10-29T13:05:33Z",
    "mode": "record",
    "scannerVersion": "10.1.3",
    "cryptoProfile": "FIPS-140-3+GOST-R-34.10-2012"
  },
  "subject": {
    "ociDigest": "sha256:abcd...",
    "layers": [
      { "layerDigest": "...", "merkleRoot": "...", "leafCount": 144 }
    ]
  },
  "inputs": {
    "feeds": [
      {
        "name": "nvd",
        "snapshotHash": "sha256:...",
        "snapshotTime": "2025-10-29T12:00:00Z",
        "merkleRoot": "..."
      }
    ],
    "rulesBundleHash": "sha256:...",
    "tools": [
      { "name": "sbomer", "version": "10.1.3", "sha256": "..." },
      { "name": "scanner", "version": "10.1.3", "sha256": "..." },
      { "name": "vexer", "version": "10.1.3", "sha256": "..." }
    ],
    "env": {
      "os": "linux",
      "arch": "x64",
      "locale": "en_US.UTF-8",
      "tz": "UTC",
      "seed": "H(scan.id||merkleRootAllLayers)",
      "flags": ["offline"]
    }
  },
  "policy": {
    "latticeHash": "sha256:...",
    "mutes": [
      { "id": "MUTE-1234", "reason": "vendor ack", "approvedBy": "authority@example.com", "approvedAt": "2025-10-29T12:55:00Z" }
    ],
    "trustProfile": "sha256:..."
  },
  "outputs": {
    "sbomHash": "sha256:...",
    "findingsHash": "sha256:...",
    "vexHash": "sha256:...",
    "logHash": "sha256:..."
  },
  "provenance": {
    "signer": "scanner.authority",
    "dsseEnvelopeHash": "sha256:...",
    "rekorEntry": "optional"
  }
}
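
Because the manifest is stored as canonicalized JSON and referenced by hash (for example `manifestHash` in `replay_runs`), two semantically equal manifests must serialize to the same bytes. The following is a minimal sketch of that idea, not the project's canonicalizer. It assumes the canonical form is lexically sorted keys, no insignificant whitespace, and UTF-8 output, and that the digest is SHA-256; `canonical_bytes` and `manifest_hash` are hypothetical names.

```python
import hashlib
import json


def canonical_bytes(doc: dict) -> bytes:
    """Serialize with lexically sorted keys and no insignificant whitespace."""
    # Assumption: this approximates the spec's canonical form. A full
    # canonicalizer (e.g. RFC 8785) also pins number and string encoding.
    return json.dumps(doc, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def manifest_hash(doc: dict) -> str:
    """Return the digest in the 'sha256:<hex>' style used in the manifest."""
    return "sha256:" + hashlib.sha256(canonical_bytes(doc)).hexdigest()


# Key order in the source document does not affect the result.
a = {"schemaVersion": "1.0", "scan": {"mode": "record", "id": "uuid"}}
b = {"scan": {"id": "uuid", "mode": "record"}, "schemaVersion": "1.0"}
assert canonical_bytes(a) == canonical_bytes(b)
assert manifest_hash(a) == manifest_hash(b)
```

Hashing the canonical bytes rather than the file as written means that reformatting or re-ordering a manifest does not change its identity.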

4. Deterministic Execution Rules

4.1 Environment Normalization

  • Clock: frozen to scan.time unless a rule explicitly requires “now”.

  • Random seed: derived as H(scan.id || merkleRootAllLayers), matching inputs.env.seed in the manifest.

  • Locale/TZ: enforced per manifest; deviations cause validation error.

  • Filesystem normalization:

    • Normalize perms to 0644/0755.
    • Path separators = /.
    • Newlines = LF.
    • JSON key order = lexical.
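
The clock and seed rules above can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: H is taken to be SHA-256, `||` is taken to be plain UTF-8 concatenation, and `derive_seed` and `frozen_clock` are hypothetical helper names.

```python
import hashlib
import random
from datetime import datetime, timezone


def derive_seed(scan_id: str, merkle_root_all_layers: str) -> int:
    """Compute H(scan.id || merkleRootAllLayers) as an integer seed."""
    # Assumptions: H is SHA-256 and '||' is byte concatenation with no
    # delimiter. The spec does not pin either choice.
    data = (scan_id + merkle_root_all_layers).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def frozen_clock(scan_time: str):
    """Return a now() replacement that always yields scan.time (UTC)."""
    fixed = datetime.strptime(scan_time, "%Y-%m-%dT%H:%M:%SZ")
    fixed = fixed.replace(tzinfo=timezone.utc)
    return lambda: fixed


# Two generators seeded from the same manifest values agree exactly.
rng_a = random.Random(derive_seed("uuid", "b81f"))
rng_b = random.Random(derive_seed("uuid", "b81f"))
assert [rng_a.random() for _ in range(3)] == [rng_b.random() for _ in range(3)]

# Every call to the frozen clock returns the recorded scan time.
now = frozen_clock("2025-10-29T13:05:33Z")
assert now() == now()
```

Passing the clock and the generator into pipeline stages explicitly, rather than reading global state, keeps a stage from picking up wall-clock time or an unseeded generator by accident.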

4.2 Concurrency & I/O

  • File traversal: stable lexicographic order.
  • Parallel jobs: ordered reduction by subject path.
  • Temporary directories: ephemeral, but with deterministically derived hash seeds.
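
A minimal sketch of stable traversal with an ordered reduction is shown below. It assumes per-file SHA-256 digests combined in path order; `stable_walk`, `hash_file`, and `tree_digest` are hypothetical names, and the combining format is an illustration rather than the product's on-disk format.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def stable_walk(root: str) -> list[str]:
    """Return relative file paths with '/' separators in lexicographic order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            paths.append(rel.replace(os.sep, "/"))
    # Sort the full list so the result is independent of directory listing order.
    return sorted(paths)


def hash_file(root: str, rel: str) -> str:
    """SHA-256 of one file, addressed by its normalized relative path."""
    with open(os.path.join(root, *rel.split("/")), "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def tree_digest(root: str, workers: int = 4) -> str:
    """Hash files in parallel, then reduce the results in path order."""
    paths = stable_walk(root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order regardless of completion order.
        digests = list(pool.map(lambda rel: hash_file(root, rel), paths))
    combined = hashlib.sha256()
    for rel, digest in zip(paths, digests):
        combined.update(f"{rel}\0{digest}\n".encode("utf-8"))
    return combined.hexdigest()
```

The parallel work may finish in any order; only the reduction step has to be ordered, which is why the digests are collected first and combined afterwards.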

4.3 Feeds & Policies

  • All network I/O disabled; feeds must be read from snapshot bundles.
  • Policies and suppressions must resolve by hash, not name.
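
Resolving by hash rather than by name amounts to a content-addressed lookup. The sketch below uses a hypothetical in-memory `PolicyStore` to show the idea; the real storage backend and digest format are not specified here, so SHA-256 with a `sha256:` prefix is an assumption.

```python
import hashlib


class PolicyStore:
    """Hypothetical in-memory content-addressed store for policy bundles."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        """Store a bundle and return the digest that now identifies it."""
        digest = "sha256:" + hashlib.sha256(blob).hexdigest()
        self._blobs[digest] = blob
        return digest

    def resolve(self, digest: str) -> bytes:
        """Fetch a bundle by digest, re-checking its content on the way out."""
        blob = self._blobs[digest]  # KeyError if the bundle is missing
        # Re-hash on read so a modified blob is never served under an old digest.
        if "sha256:" + hashlib.sha256(blob).hexdigest() != digest:
            raise ValueError(f"content does not match {digest}")
        return blob


store = PolicyStore()
ref = store.put(b"deny if severity >= high")
assert store.resolve(ref) == b"deny if severity >= high"
```

A name such as "default-policy" can point at different content over time; a digest cannot, which is what makes a later replay resolve exactly the policy that was used at scan time.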

5. DSSE and Signing

5.1 Envelope Structure

{
  "payloadType": "application/vnd.stella.replay.manifest+json",
  "payload": "<base64-encoded canonical JSON>",
  "signatures": [
    { "keyid": "authority-root-fips", "sig": "..." },
    { "keyid": "authority-root-gost", "sig": "..." }
  ]
}

5.2 Verification Steps

  1. Decode payload → verify canonical form.
  2. Verify each signature chain against RootPack (offline trust anchors).
  3. Recompute the envelope hash and compare it to provenance.dsseEnvelopeHash in the manifest.
  4. Optionally verify Rekor inclusion proof.
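
Step 1 and the input to step 2 can be sketched as follows. The `pae` function implements the DSSE pre-authentication encoding, which is the byte string each signature covers. The canonical-form check assumes sorted keys, compact separators, and UTF-8, and `decode_and_check` is a hypothetical helper. Signature verification itself is omitted because it depends on the crypto profile and the RootPack keys.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes that signatures cover."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(t), t, len(payload), payload)


def decode_and_check(envelope: dict) -> bytes:
    """Decode the payload and confirm it is already in canonical form."""
    payload = base64.b64decode(envelope["payload"])
    # Assumption: canonical form = sorted keys, compact separators, UTF-8.
    canonical = json.dumps(json.loads(payload), sort_keys=True,
                           separators=(",", ":"),
                           ensure_ascii=False).encode("utf-8")
    if payload != canonical:
        raise ValueError("payload is not in canonical form")
    return payload


envelope = {
    "payloadType": "application/vnd.stella.replay.manifest+json",
    "payload": base64.b64encode(b'{"schemaVersion":"1.0"}').decode("ascii"),
    "signatures": [],
}
payload = decode_and_check(envelope)
to_verify = pae(envelope["payloadType"], payload)
assert to_verify.startswith(b"DSSEv1 43 application/vnd.stella")
```

Each entry in `signatures` would then be checked against `to_verify` with the algorithm that its `keyid` implies.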

6. CLI Interface

6.1 Recording a Scan

stella scan image:tag --record ./out/

Produces:

out/
 ├─ manifest.json
 ├─ manifest.dsse.json
 ├─ inputbundle.tar.zst
 ├─ outputbundle.tar.zst
 └─ signatures/

6.2 Verifying

stella verify manifest.json

  • Checks all hashes and DSSE envelopes.
  • Prints summary:

    ✅ Verified: SBOM, Findings, VEX, Tools, Feeds, Policy

6.3 Replaying

stella replay manifest.json --strict
stella replay manifest.json --what-if --vary=feeds

  • --strict: all inputs locked; identical result expected.
  • --what-if: varies only specified dimension(s).

6.4 Diffing

stella diff manifestA.json manifestB.json

Shows field-level differences (feed snapshot, tool, or policy hash).
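
A field-level diff of two manifests can be sketched by flattening each document into dotted paths and comparing the leaves. This is an illustration of the idea only; `flatten` and `diff_manifests` are hypothetical names, and the actual output format of `stella diff` is not described here.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {dotted.path: leaf value}."""
    if isinstance(node, dict):
        items = ((f"{prefix}.{k}" if prefix else k, v) for k, v in node.items())
    elif isinstance(node, list):
        items = ((f"{prefix}[{i}]", v) for i, v in enumerate(node))
    else:
        return {prefix: node}
    flat = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def diff_manifests(a: dict, b: dict) -> dict:
    """Map each differing path to its (old, new) pair; None marks absence."""
    fa, fb = flatten(a), flatten(b)
    return {p: (fa.get(p), fb.get(p))
            for p in sorted(fa.keys() | fb.keys())
            if fa.get(p) != fb.get(p)}


old = {"inputs": {"feeds": [{"name": "nvd", "snapshotHash": "sha256:aaa"}]},
       "policy": {"latticeHash": "sha256:111"}}
new = {"inputs": {"feeds": [{"name": "nvd", "snapshotHash": "sha256:bbb"}]},
       "policy": {"latticeHash": "sha256:111"}}
assert diff_manifests(old, new) == {
    "inputs.feeds[0].snapshotHash": ("sha256:aaa", "sha256:bbb")
}
```

Because every input is recorded as a hash, a single changed path is enough to attribute an output difference to a feed, a tool, or a policy.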


7. MongoDB Schema

7.1 replay_runs

{
  "_id": "uuid",
  "manifestHash": "sha256:...",
  "status": "verified|failed|replayed",
  "createdAt": "...",
  "updatedAt": "...",
  "signatures": [{ "profile": "FIPS", "verified": true }],
  "outputs": {
    "sbom": "sha256:...",
    "findings": "sha256:..."
  }
}

7.2 bundles

{
  "_id": "sha256:...",
  "type": "input|output|rootpack",
  "size": 4123123,
  "location": "/var/lib/stella/bundles/<sha>.tar.zst"
}
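
A record of this shape can be derived from the bundle file itself. The sketch below assumes that `_id` is the SHA-256 of the bundle bytes and that `<sha>` in `location` is the bare hex digest; `bundle_record` is a hypothetical helper and does not write to MongoDB.

```python
import hashlib
import os


def bundle_record(path: str, bundle_type: str,
                  store_dir: str = "/var/lib/stella/bundles") -> dict:
    """Build a document for the bundles collection from a file on disk."""
    if bundle_type not in {"input", "output", "rootpack"}:
        raise ValueError(f"unknown bundle type: {bundle_type}")
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Stream in 1 MiB blocks so large bundles are not loaded into memory.
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    hex_digest = digest.hexdigest()
    return {
        "_id": f"sha256:{hex_digest}",
        "type": bundle_type,
        "size": os.path.getsize(path),
        # Assumption: <sha> in the location template is the bare hex digest.
        "location": f"{store_dir}/{hex_digest}.tar.zst",
    }
```

Using the content digest as `_id` makes inserts idempotent: storing the same bundle twice yields the same key.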

7.3 subjects

{
  "ociDigest": "sha256:abcd...",
  "layers": [
    { "layerDigest": "...", "merkleRoot": "...", "leafCount": 120 }
  ]
}

8. Layer Merkle Implementation

8.1 Algorithm

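using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static string ComputeMerkleRoot(string layerTarPath)
{
    const int ChunkSize = 4 * 1024 * 1024; // 4 MiB leaves
    var hashes = new List<byte[]>();
    using var fs = File.OpenRead(layerTarPath);
    var buffer = new byte[ChunkSize];
    int read;
    // ReadAtLeast fills the buffer unless EOF is reached, so leaf boundaries
    // do not depend on how the OS happens to split reads.
    while ((read = fs.ReadAtLeast(buffer, ChunkSize, throwOnEndOfStream: false)) > 0)
        hashes.Add(SHA256.HashData(buffer.AsSpan(0, read)));
    // Assumed convention: an empty layer gets a single leaf over empty input,
    // so Single() below cannot throw.
    if (hashes.Count == 0)
        hashes.Add(SHA256.HashData(ReadOnlySpan<byte>.Empty));
    // Pair adjacent nodes level by level; an unpaired last node is hashed alone.
    while (hashes.Count > 1)
        hashes = hashes
            .Chunk(2)
            .Select(pair => SHA256.HashData(pair.SelectMany(h => h).ToArray()))
            .ToList();
    // Lowercase hex, matching the stored merkleRoot format.
    return Convert.ToHexString(hashes.Single()).ToLowerInvariant();
}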

8.2 Stored Values

{
  "layerDigest": "sha256:...",
  "merkleRoot": "b81f...",
  "leafCount": 240,
  "leavesHash": "sha256:..."
}
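
The following sketch mirrors the algorithm in 8.1 and produces the stored values, so the pairing rule can be checked on a small input. Two points are assumptions rather than part of the specification: `leavesHash` is taken to be the SHA-256 of the concatenated leaf digests, and an empty stream is given a single leaf over empty input. `merkle_record` is a hypothetical name.

```python
import hashlib
import io


def merkle_record(stream, chunk_size: int = 4 * 1024 * 1024) -> dict:
    """Build merkleRoot, leafCount, and leavesHash for one layer stream."""
    leaves = []
    while chunk := stream.read(chunk_size):
        leaves.append(hashlib.sha256(chunk).digest())
    if not leaves:
        # Assumed convention for an empty layer: one leaf over empty input.
        leaves.append(hashlib.sha256(b"").digest())
    level = leaves
    while len(level) > 1:
        # Pair adjacent nodes; an unpaired last node is hashed on its own.
        level = [hashlib.sha256(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
    return {
        "merkleRoot": level[0].hex(),
        "leafCount": len(leaves),
        # Assumption: leavesHash covers the concatenated leaf digests.
        "leavesHash": "sha256:" + hashlib.sha256(b"".join(leaves)).hexdigest(),
    }


# Worked example with 4-byte chunks: leaves are "abcd", "efgh", "ij".
record = merkle_record(io.BytesIO(b"abcdefghij"), chunk_size=4)
assert record["leafCount"] == 3
```

With three leaves L0, L1, L2, the first level is H(L0 || L1) and H(L2), and the root is the hash of those two digests concatenated.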

9. Replay Engine Implementation Notes (.NET 10)

9.1 Manifest Parsing

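Use System.Text.Json with deterministic property ordering:

var options = new JsonSerializerOptions {
    WriteIndented = false,
    PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
    // OrderedResolver is not defined in this document; it is expected to be a
    // custom IJsonTypeInfoResolver that emits properties in a fixed order.
    TypeInfoResolverChain = { new OrderedResolver() }
};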

9.2 Stable Output

Normalize SBOM/Findings/VEX JSON:

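// JsonDocument round-trips keep the input key order, so sort keys explicitly.
static JsonNode? Sort(JsonNode? n) => n switch {
    JsonObject o => new JsonObject(o.OrderBy(p => p.Key, StringComparer.Ordinal)
        .Select(p => KeyValuePair.Create(p.Key, Sort(p.Value?.DeepClone())))),
    JsonArray a => new JsonArray(a.Select(x => Sort(x?.DeepClone())).ToArray()),
    _ => n };
string Canonicalize(string json) =>
    Sort(JsonNode.Parse(json))?.ToJsonString() ?? "null";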

9.3 Verification Flow

var manifest = Manifest.Load("manifest.json");
VerifySignatures(manifest);
VerifyHashes(manifest);
if (mode == Strict) RunPipeline(manifest);
else RunPipelineWithVariation(manifest, vary);

9.4 Failure Modes

| Condition | Action |
|---|---|
| Missing snapshot or bundle | Error: InputBundleMissing |
| Feed hash mismatch | Error: FeedSnapshotDrift |
| Tool binary hash mismatch | Reject replay |
| Output hash drift in strict mode | Mark as failed, emit diff log |
| Invalid signature | Reject manifest |
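
The first two failure modes can be sketched as a feed check that raises the named errors. This is a minimal illustration, assuming `snapshotHash` is the SHA-256 of the raw snapshot bytes; `verify_feeds` and the `snapshots` mapping are hypothetical, and the exception classes simply reuse the error names from the table.

```python
import hashlib


class InputBundleMissing(Exception):
    """A snapshot or bundle named in the manifest is not available."""


class FeedSnapshotDrift(Exception):
    """A feed snapshot's content no longer matches the recorded hash."""


def verify_feeds(manifest: dict, snapshots: dict[str, bytes]) -> None:
    """Check every feed listed in the manifest against its recorded hash."""
    for feed in manifest["inputs"]["feeds"]:
        blob = snapshots.get(feed["name"])
        if blob is None:
            raise InputBundleMissing(feed["name"])
        # Assumption: snapshotHash is SHA-256 over the raw snapshot bytes.
        actual = "sha256:" + hashlib.sha256(blob).hexdigest()
        if actual != feed["snapshotHash"]:
            raise FeedSnapshotDrift(
                f"{feed['name']}: expected {feed['snapshotHash']}, got {actual}")


snapshot = b"nvd snapshot bytes"
manifest = {"inputs": {"feeds": [{
    "name": "nvd",
    "snapshotHash": "sha256:" + hashlib.sha256(snapshot).hexdigest(),
}]}}
verify_feeds(manifest, {"nvd": snapshot})  # passes silently
```

Distinct exception types let the caller map each condition to its action in the table, for example rejecting the replay outright versus marking the run as failed.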

10. Crypto Profiles and RootPack

10.1 Example Profiles

| Profile | Algorithms | Notes |
|---|---|---|
| FIPS-140-3 | ECDSA-P256 / SHA-256 / AES-GCM | Default for US/EU |
| GOST | GOST R 34.10-2012 / GOST R 34.11-2012 | Russia |
| SM | SM2 / SM3 / SM4 | China |
| eIDAS | RSA-PSS / SHA-256 | EU qualified signatures |

10.2 Dual-Signing Example

stella sign manifest.json --profiles=FIPS,GOST

Produces:

signatures/
 ├─ manifest.dsse.fips.json
 └─ manifest.dsse.gost.json

11. Test Strategy

| Test | Description | Expected Result |
|---|---|---|
| Golden Replay | Repeat identical scan → same outputs | Identical hashes |
| Feed Drift Test | Replay with updated feeds | Only inputs.feeds changes |
| Tool Upgrade Test | Replay with new scanner version | Reject or diff by tools |
| Policy Change Test | Different lattice/mutes | Diff by policy section |
| Cross-Arch Test | x64 vs arm64 | Identical outputs |
| Corrupted Bundle | Tamper bundle | Verification fails |

12. Example Verification Output

$ stella verify manifest.json

[✓] Manifest integrity: OK
[✓] DSSE signatures (FIPS,GOST): OK
[✓] Feeds snapshot hash: OK
[✓] Policy + mutes hash: OK
[✓] Toolchain hash: OK
[✓] SBOM/VEX outputs: OK

Result: VERIFIED

13. Future Extensions

  • Support SPDX 3.0.1 alongside CycloneDX 1.6.
  • Add per-file Merkle proofs for local scans.
  • Ledger anchoring (Rekor, distributed Proof-Market).
  • Post-quantum signatures (Dilithium/Falcon).
  • Replay orchestration API (/api/replay/:id).

14. Summary

Deterministic Replay freezes every element of a scan:

image → feeds → policy → toolchain → environment → outputs → signatures.

By enforcing canonical input/output states and verifiable cryptographic bindings, Stella Ops achieves regulatory-grade replayability, regional crypto compliance, and immutable provenance across all scans.