docs consolidation and others
473
docs/modules/replay/guides/DETERMINISTIC_REPLAY.md
Normal file
@@ -0,0 +1,473 @@
|
||||
# Stella Ops — Deterministic Replay Specification
|
||||
|
||||
Version: 1.0
|
||||
Status: Draft / Internal Technical Reference
|
||||
Audience: Core developers, module maintainers, audit engineers.
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
Deterministic Replay allows any completed Stella Ops scan to be **reproduced byte-for-byte** with full cryptographic validation.
|
||||
It guarantees that SBOMs, Findings, and VEX evaluations can be re-executed later to:
|
||||
|
||||
- prove historical compliance decisions,
|
||||
- attribute changes precisely to feeds, rules, or tools,
|
||||
- support dual-signing (FIPS + regional crypto),
|
||||
- and anchor cryptographic evidence in offline or public ledgers.
|
||||
|
||||
Replay requires that all inputs and environmental conditions are **captured, hashed, and sealed** at scan time.
|
||||
|
||||
---
|
||||
|
||||
## 2. Architecture Overview
|
||||
|
||||
```mermaid
graph TD
    A[Scanner.WebService] --> B[Replay Manifest]
    A --> C[InputBundle]
    A --> D[OutputBundle]
    B --> E[DSSE Envelope]
    C --> F[Feedser Snapshot Export]
    C --> G[Policy/Lattice Bundle]
    D --> H["DSSE Outputs (SBOM, Findings, VEX)"]
    E --> I[PostgreSQL: replay_runs]
    C --> J[Blob Store: Input/Output Bundles]
```
|
||||
|
||||
### Core Artifacts
|
||||
|
||||
| Artifact            | Description                                            | Format                     |
| ------------------- | ------------------------------------------------------ | -------------------------- |
| **Replay Manifest** | Immutable JSON describing all scan inputs and outputs. | JSON (canonicalized)       |
| **InputBundle**     | Feeds, rules, policies, tool binaries (hashed).        | `.tar.zst`                 |
| **OutputBundle**    | SBOM, Findings, VEX, logs.                             | `.tar.zst`                 |
| **DSSE Envelope**   | Signed metadata for each artifact.                     | JSON / JWS                 |
| **Merkle Map**      | Layer and feed chunk trees.                            | JSON (embedded or sidecar) |
|
||||
|
||||
---
|
||||
|
||||
## 3. Replay Manifest Schema (v1)
|
||||
|
||||
### 3.1 Top-level Layout
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"schemaVersion": "1.0",
|
||||
"scan": {
|
||||
"id": "uuid",
|
||||
"time": "2025-10-29T13:05:33Z",
|
||||
"mode": "record",
|
||||
"scannerVersion": "10.1.3",
|
||||
"cryptoProfile": "FIPS-140-3+GOST-R-34.10-2012"
|
||||
},
|
||||
"subject": {
|
||||
"ociDigest": "sha256:abcd...",
|
||||
"layers": [
|
||||
{ "layerDigest": "...", "merkleRoot": "...", "leafCount": 144 }
|
||||
]
|
||||
},
|
||||
"inputs": {
|
||||
"feeds": [
|
||||
{
|
||||
"name": "nvd",
|
||||
"snapshotHash": "sha256:...",
|
||||
"snapshotTime": "2025-10-29T12:00:00Z",
|
||||
"merkleRoot": "..."
|
||||
}
|
||||
],
|
||||
"rulesBundleHash": "sha256:...",
|
||||
"tools": [
|
||||
{ "name": "sbomer", "version": "10.1.3", "sha256": "..." },
|
||||
{ "name": "scanner", "version": "10.1.3", "sha256": "..." },
|
||||
{ "name": "vexer", "version": "10.1.3", "sha256": "..." }
|
||||
],
|
||||
"env": {
|
||||
"os": "linux",
|
||||
"arch": "x64",
|
||||
"locale": "en_US.UTF-8",
|
||||
"tz": "UTC",
|
||||
"seed": "H(scan.id||merkleRootAllLayers)",
|
||||
"flags": ["offline"]
|
||||
}
|
||||
},
|
||||
"policy": {
|
||||
"latticeHash": "sha256:...",
|
||||
"mutes": [
|
||||
{ "id": "MUTE-1234", "reason": "vendor ack", "approvedBy": "authority@example.com", "approvedAt": "2025-10-29T12:55:00Z" }
|
||||
],
|
||||
"trustProfile": "sha256:..."
|
||||
},
|
||||
"outputs": {
|
||||
"sbomHash": "sha256:...",
|
||||
"findingsHash": "sha256:...",
|
||||
"vexHash": "sha256:...",
|
||||
"logHash": "sha256:..."
|
||||
},
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{
|
||||
"kind": "static",
|
||||
"analyzer": "scanner/java@sha256:...",
|
||||
"casUri": "cas://replay/scan-123/reachability/static-graph.tar.zst",
|
||||
"sha256": "abc123"
|
||||
},
|
||||
{
|
||||
"kind": "framework",
|
||||
"analyzer": "scanner/framework@sha256:...",
|
||||
"casUri": "cas://replay/scan-123/reachability/framework-graph.tar.zst",
|
||||
"sha256": "def456"
|
||||
}
|
||||
],
|
||||
"runtimeTraces": [
|
||||
{
|
||||
"source": "zastava",
|
||||
"casUri": "cas://replay/scan-123/reachability/runtime-trace.ndjson.zst",
|
||||
"sha256": "feedface",
|
||||
"recordedAt": "2025-11-07T11:10:00Z"
|
||||
}
|
||||
]
|
||||
},
|
||||
"provenance": {
|
||||
"signer": "scanner.authority",
|
||||
"dsseEnvelopeHash": "sha256:...",
|
||||
"rekorEntry": "optional"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 3.2 Reachability Section
|
||||
|
||||
The optional `reachability` block captures the inputs needed to replay explainability decisions:
|
||||
|
||||
| Field | Description |
|-------|-------------|
| `reachability.graphs[]` | References to static/framework callgraph bundles. Each entry records the producing analyzer (`analyzer`/`version`), the CAS URI under `cas://replay/<scan-id>/reachability/graphs/`, and the SHA-256 digest of the tarball. |
| `reachability.runtimeTraces[]` | References to runtime observation bundles (e.g., Zastava ND-JSON traces). Each item stores the emitting source, CAS URI (typically `cas://replay/<scan-id>/reachability/traces/`), SHA-256, and capture timestamp. |
|
||||
|
||||
Replay engines MUST verify every referenced artifact hash before re-evaluating reachability. Missing graphs downgrade affected signals to `reachability:unknown` and should raise policy warnings.
|
||||
|
||||
Producer note: default clock values in `StellaOps.Replay.Core` are `UnixEpoch` to avoid hidden time drift; producers MUST set `scan.time` and `reachability.runtimeTraces[].recordedAt` explicitly.
|
||||
|
||||
---
|
||||
|
||||
## 4. Deterministic Execution Rules
|
||||
|
||||
### 4.1 Environment Normalization
|
||||
|
||||
* **Clock:** frozen to `scan.time` unless a rule explicitly requires “now”.
|
||||
* **Random seed:** derived as `H(scan.id || MerkleRootAllLayers)`.
|
||||
* **Locale/TZ:** enforced per manifest; deviations cause validation error.
|
||||
* **Filesystem normalization:**
|
||||
|
||||
* Normalize perms to 0644/0755.
|
||||
* Path separators = `/`.
|
||||
* Newlines = LF.
|
||||
* JSON key order = lexical.
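
The seed rule above can be sketched as follows. This is a minimal illustration, not the `StellaOps.Replay.Core` implementation: it assumes `H` is SHA-256, that `scan.id` is encoded as UTF-8, and that the Merkle root is concatenated as raw bytes. The helper names `derive_seed` and `rng_seed_int` are hypothetical.

```python
import hashlib


def derive_seed(scan_id: str, merkle_root_all_layers_hex: str) -> str:
    """Return H(scan.id || MerkleRootAllLayers) as lowercase hex.

    Assumptions: H is SHA-256, scan.id is UTF-8 encoded, and the
    Merkle root is appended as raw bytes (decoded from hex).
    """
    data = scan_id.encode("utf-8") + bytes.fromhex(merkle_root_all_layers_hex)
    return hashlib.sha256(data).hexdigest()


def rng_seed_int(seed_hex: str) -> int:
    """Fold the first 8 bytes of the digest into a 64-bit integer seed."""
    return int(seed_hex[:16], 16)
```

Because the seed depends only on values already recorded in the manifest, a replay engine can recompute it instead of trusting a stored number.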
|
||||
|
||||
### 4.2 Concurrency & I/O
|
||||
|
||||
* File traversal: stable lexicographic order.
|
||||
* Parallel jobs: ordered reduction by subject path.
|
||||
* Temporary directories: ephemeral but deterministic hash seeds.
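
A small sketch of the two ordering rules above: traverse files in a stable order, then reduce parallel results in path order so the outcome does not depend on worker scheduling. The function names and the `"<digest>  <path>"` reduction format are assumptions made for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def list_files_stable(root: Path) -> list[str]:
    """Relative POSIX paths of all files under root, sorted by code point."""
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


def hash_file(root: Path, rel: str) -> tuple[str, str]:
    """Return (relative path, SHA-256 hex digest) for one file."""
    return rel, hashlib.sha256((root / rel).read_bytes()).hexdigest()


def hash_tree(root: Path, workers: int = 4) -> str:
    """Hash files in parallel, then fold the results in path order."""
    paths = list_files_stable(root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, not completion order.
        results = list(pool.map(lambda rel: hash_file(root, rel), paths))
    acc = hashlib.sha256()
    for rel, digest in results:
        acc.update(f"{digest}  {rel}\n".encode("utf-8"))
    return acc.hexdigest()
```

The worker count changes only how fast the leaves are computed; the reduction always sees them in the same order.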
|
||||
|
||||
### 4.3 Feeds & Policies
|
||||
|
||||
* All network I/O disabled; feeds must be read from snapshot bundles.
|
||||
* Policies and suppressions must resolve by hash, not name.
|
||||
|
||||
### 4.4 Library hooks (StellaOps.Replay.Core)
|
||||
|
||||
Use the shared helpers in `src/__Libraries/StellaOps.Replay.Core` to keep outputs deterministic:
|
||||
|
||||
- `CanonicalJson.Serialize(...)` → lexicographic key ordering with relaxed escaping, arrays preserved as-is.
|
||||
- `DeterministicHash.Sha256Hex(...)` and `DeterministicHash.MerkleRootHex(...)` → lowercase digests and stable Merkle roots for bundle manifests.
|
||||
- `DssePayloadBuilder.BuildUnsigned(...)` → DSSE payloads for replay manifests using payload type `application/vnd.stellaops.replay+json`.
|
||||
- `ReplayManifestExtensions.ComputeCanonicalSha256()` → convenience for CAS naming of manifest blobs.
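
For readers outside .NET, the behaviour described for `CanonicalJson.Serialize` and `DeterministicHash.Sha256Hex` can be approximated as below. This is a hedged sketch: the exact escaping and number formatting of the real helpers are not specified here, and `canonical_json` / `sha256_hex` are illustrative names.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Sorted keys, compact separators, UTF-8 output; arrays keep their order.

    ensure_ascii=False leaves non-ASCII characters unescaped, used here as a
    stand-in for the "relaxed escaping" mentioned above (an assumption).
    """
    text = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_hex(data: bytes) -> str:
    """Lowercase SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()
```

Two manifests that differ only in key order hash to the same value, while reordering an array changes the hash.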
|
||||
|
||||
---
|
||||
|
||||
## 5. DSSE and Signing
|
||||
|
||||
### 5.1 Envelope Structure
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"payloadType": "application/vnd.stellaops.replay+json",
|
||||
"payload": "<base64-encoded canonical JSON>",
|
||||
"signatures": [
|
||||
{ "keyid": "authority-root-fips", "sig": "..." },
|
||||
{ "keyid": "authority-root-gost", "sig": "..." }
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### 5.2 Verification Steps
|
||||
|
||||
1. Decode payload → verify canonical form.
|
||||
2. Verify each signature chain against RootPack (offline trust anchors).
|
||||
3. Recompute hash and compare to `dsseEnvelopeHash` in manifest.
|
||||
4. Optionally verify Rekor inclusion proof.
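
Steps 1 and 2 can be sketched as follows. The pre-authentication encoding (PAE) is the byte string that DSSE signatures cover, as defined by the DSSE specification. The canonical-form check assumes "canonical" means sorted keys with compact separators. Signature verification against RootPack is out of scope for this sketch.

```python
import base64
import json


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the bytes each signature is computed over."""
    t = payload_type.encode("utf-8")
    return b"DSSEv1 %d %s %d %s" % (len(t), t, len(payload), payload)


def decode_and_check_canonical(envelope: dict) -> dict:
    """Step 1: decode the payload and reject it unless it is already canonical."""
    raw = base64.b64decode(envelope["payload"])
    manifest = json.loads(raw)
    canonical = json.dumps(
        manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    if raw != canonical:
        raise ValueError("payload is not canonical JSON")
    return manifest
```

A verifier would pass `pae(envelope["payloadType"], raw_payload)` to the signature check for each entry in `signatures`.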
|
||||
|
||||
### 5.3 Default payload type
|
||||
|
||||
Replay DSSE envelopes emitted by `DssePayloadBuilder` use payload type `application/vnd.stellaops.replay+json`. Consumers should treat this as canonical unless a future manifest revision increments the schema and payload type together.
|
||||
|
||||
---
|
||||
|
||||
## 6. CLI Interface
|
||||
|
||||
### 6.1 Recording a Scan
|
||||
|
||||
```bash
|
||||
stella scan image:tag --record ./out/
|
||||
```
|
||||
|
||||
Produces:
|
||||
|
||||
```
|
||||
out/
|
||||
├─ manifest.json
|
||||
├─ manifest.dsse.json
|
||||
├─ inputbundle.tar.zst
|
||||
├─ outputbundle.tar.zst
|
||||
└─ signatures/
|
||||
```
|
||||
|
||||
### 6.2 Verifying
|
||||
|
||||
```bash
|
||||
stella verify manifest.json
|
||||
```
|
||||
|
||||
* Checks all hashes and DSSE envelopes.
|
||||
* Prints summary:
|
||||
|
||||
```
|
||||
✅ Verified: SBOM, Findings, VEX, Tools, Feeds, Policy
|
||||
```
|
||||
|
||||
### 6.3 Replaying
|
||||
|
||||
```bash
|
||||
stella replay manifest.json --strict
|
||||
stella replay manifest.json --what-if --vary=feeds
|
||||
```
|
||||
|
||||
* `--strict`: all inputs locked; identical result expected.
|
||||
* `--what-if`: varies only specified dimension(s).
|
||||
|
||||
### 6.4 Diffing
|
||||
|
||||
```bash
|
||||
stella diff manifestA.json manifestB.json
|
||||
```
|
||||
|
||||
Shows field-level differences (feed snapshot, tool, or policy hash).
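
A field-level diff of this kind can be sketched by flattening both manifests into path/value pairs and comparing them. The path syntax (`inputs.feeds[0].snapshotHash`) and function names are assumptions for illustration; the actual `stella diff` output format is not shown in this document.

```python
def flatten(value, prefix=""):
    """Yield (path, leaf) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key in sorted(value):
            yield from flatten(value[key], f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for i, item in enumerate(value):
            yield from flatten(item, f"{prefix}[{i}]")
    else:
        yield prefix, value


def diff_manifests(a: dict, b: dict) -> list[tuple[str, object, object]]:
    """Return (path, old, new) for every leaf that differs, sorted by path."""
    fa, fb = dict(flatten(a)), dict(flatten(b))
    paths = sorted(fa.keys() | fb.keys())
    return [(p, fa.get(p), fb.get(p)) for p in paths if fa.get(p) != fb.get(p)]
```

Because every input is recorded as a hash, a single changed path is usually enough to attribute a drift to feeds, tools, or policy.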
|
||||
|
||||
---
|
||||
|
||||
## 7. PostgreSQL Schema
|
||||
|
||||
### 7.1 `replay_runs`
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"_id": "uuid",
|
||||
"manifestHash": "sha256:...",
|
||||
"status": "verified|failed|replayed",
|
||||
"createdAt": "...",
|
||||
"updatedAt": "...",
|
||||
"signatures": [{ "profile": "FIPS", "verified": true }],
|
||||
"outputs": {
|
||||
"sbom": "sha256:...",
|
||||
"findings": "sha256:..."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 7.2 `bundles`
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"_id": "sha256:...",
|
||||
"type": "input|output|rootpack",
|
||||
"size": 4123123,
|
||||
"location": "/var/lib/stella/bundles/<sha>.tar.zst"
|
||||
}
|
||||
```
|
||||
|
||||
### 7.3 `subjects`
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"ociDigest": "sha256:abcd...",
|
||||
"layers": [
|
||||
{ "layerDigest": "...", "merkleRoot": "...", "leafCount": 120 }
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Layer Merkle Implementation
|
||||
|
||||
### 8.1 Algorithm
|
||||
|
||||
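```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

static string ComputeMerkleRoot(string layerTarPath)
{
    const int ChunkSize = 4 * 1024 * 1024; // 4 MiB leaves
    var hashes = new List<byte[]>();
    using var fs = File.OpenRead(layerTarPath);
    var buffer = new byte[ChunkSize];
    int read;

    // ReadAtLeast fills the buffer unless EOF is reached, so chunk boundaries
    // (and therefore leaf hashes) never depend on short reads.
    while ((read = fs.ReadAtLeast(buffer, ChunkSize, throwOnEndOfStream: false)) > 0)
        hashes.Add(SHA256.HashData(buffer.AsSpan(0, read)));

    if (hashes.Count == 0)
        throw new InvalidDataException("Layer is empty; no Merkle root can be computed.");

    // Hash adjacent pairs until one node remains; an unpaired last node is re-hashed alone.
    while (hashes.Count > 1)
        hashes = hashes
            .Chunk(2)
            .Select(pair => SHA256.HashData(pair.SelectMany(h => h).ToArray()))
            .ToList();

    // Lowercase hex, matching the digests stored in the manifest.
    return Convert.ToHexString(hashes[0]).ToLowerInvariant();
}
```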
|
||||
|
||||
### 8.2 Stored Values
|
||||
|
||||
```json
|
||||
{
|
||||
"layerDigest": "sha256:...",
|
||||
"merkleRoot": "b81f...",
|
||||
"leafCount": 240,
|
||||
"leavesHash": "sha256:..."
|
||||
}
|
||||
```
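
As a cross-check for the algorithm in 8.1, the same tree can be computed in a few lines of Python, which is convenient for building test vectors. This is a sketch under the same assumptions as the C# code (SHA-256, 4 MiB leaves, an unpaired node re-hashed alone). It returns only `merkleRoot` and `leafCount`; how `leavesHash` is derived is not specified here, so it is left out.

```python
import hashlib
import io

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB leaves, as in 8.1


def merkle_root(stream, chunk_size: int = CHUNK_SIZE) -> tuple[str, int]:
    """Return (merkleRoot as lowercase hex, leafCount) for a binary stream."""
    level = []
    while chunk := stream.read(chunk_size):
        level.append(hashlib.sha256(chunk).digest())
    if not level:
        raise ValueError("empty layer has no Merkle root")
    leaf_count = len(level)
    while len(level) > 1:
        # Pair adjacent nodes; an unpaired last node is hashed on its own.
        level = [
            hashlib.sha256(b"".join(level[i:i + 2])).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex(), leaf_count
```

Passing a small `chunk_size` with an `io.BytesIO` input makes it easy to exercise odd leaf counts without multi-megabyte fixtures.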
|
||||
|
||||
---
|
||||
|
||||
## 9. Replay Engine Implementation Notes (.NET 10)
|
||||
|
||||
### 9.1 Manifest Parsing
|
||||
|
||||
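Use `System.Text.Json` with deterministic ordering. `OrderedResolver` below is not a built-in type; it stands for a custom `IJsonTypeInfoResolver` that emits properties in a fixed order: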
|
||||
|
||||
```csharp
|
||||
var options = new JsonSerializerOptions {
|
||||
WriteIndented = false,
|
||||
PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
|
||||
TypeInfoResolverChain = { new OrderedResolver() }
|
||||
};
|
||||
```
|
||||
|
||||
### 9.2 Stable Output
|
||||
|
||||
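Normalize SBOM/Findings/VEX JSON. Round-tripping through `JsonDocument` preserves the source key order, so keys are sorted explicitly (ordinal comparison) before serializing:

```csharp
using System.Text.Json.Nodes;
static JsonNode? Sort(JsonNode? n) => n switch {
    JsonObject o => new JsonObject(o.OrderBy(p => p.Key, StringComparer.Ordinal)
        .Select(p => KeyValuePair.Create(p.Key, Sort(p.Value)))),
    JsonArray a => new JsonArray(a.Select(x => Sort(x)).ToArray()),
    _ => n?.DeepClone()
};
string Canonicalize(string json) => Sort(JsonNode.Parse(json))!.ToJsonString(options);
```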
|
||||
|
||||
### 9.3 Verification Flow
|
||||
|
||||
```csharp
|
||||
var manifest = Manifest.Load("manifest.json");
|
||||
VerifySignatures(manifest);
|
||||
VerifyHashes(manifest);
|
||||
if (mode == Strict) RunPipeline(manifest);
|
||||
else RunPipelineWithVariation(manifest, vary);
|
||||
```
|
||||
|
||||
### 9.4 Failure Modes
|
||||
|
||||
| Condition                        | Action                        |
| -------------------------------- | ----------------------------- |
| Missing snapshot or bundle       | Error: `InputBundleMissing`   |
| Feed hash mismatch               | Error: `FeedSnapshotDrift`    |
| Tool binary hash mismatch        | Reject replay                 |
| Output hash drift in strict mode | Mark as failed, emit diff log |
| Invalid signature                | Reject manifest               |
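
The first two rows of the table can be illustrated with a small check. This is a sketch only: the error names come from the table, while the exception hierarchy, the `check_feed_snapshot` helper, and the in-memory `snapshots` mapping are hypothetical.

```python
import hashlib


class ReplayError(Exception):
    """Base class for replay failures."""


class InputBundleMissing(ReplayError):
    pass


class FeedSnapshotDrift(ReplayError):
    pass


def check_feed_snapshot(name: str, expected: str, snapshots: dict[str, bytes]) -> None:
    """Raise the error named in the table if a feed snapshot is absent or altered."""
    if name not in snapshots:
        raise InputBundleMissing(f"feed snapshot '{name}' not found in input bundle")
    actual = "sha256:" + hashlib.sha256(snapshots[name]).hexdigest()
    if actual != expected:
        raise FeedSnapshotDrift(f"{name}: expected {expected}, got {actual}")
```

Distinct error types let callers separate "cannot replay at all" from "replayed against different inputs".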
|
||||
|
||||
---
|
||||
|
||||
## 10. Crypto Profiles and RootPack
|
||||
|
||||
### 10.1 Example Profiles
|
||||
|
||||
| Profile | Algorithms | Notes |
|
||||
| -------------- | ------------------------------------- | ----------------------- |
|
||||
| **FIPS-140-3** | ECDSA-P256 / SHA-256 / AES-GCM | Default for US/EU |
|
||||
| **GOST** | GOST R 34.10-2012 / GOST R 34.11-2012 | Russia |
|
||||
| **SM** | SM2 / SM3 / SM4 | China |
|
||||
| **eIDAS** | RSA-PSS / SHA-256 | EU qualified signatures |
|
||||
|
||||
### 10.2 Dual-Signing Example
|
||||
|
||||
```bash
|
||||
stella sign manifest.json --profiles=FIPS,GOST
|
||||
```
|
||||
|
||||
Produces:
|
||||
|
||||
```
|
||||
signatures/
|
||||
├─ manifest.dsse.fips.json
|
||||
└─ manifest.dsse.gost.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 11. Test Strategy
|
||||
|
||||
| Test | Description | Expected Result |
|
||||
| ---------------------- | ------------------------------------ | --------------------------- |
|
||||
| **Golden Replay** | Repeat identical scan → same outputs | ✅ identical hashes |
|
||||
| **Feed Drift Test** | Replay with updated feeds | Only `inputs.feeds` changes |
|
||||
| **Tool Upgrade Test** | Replay with new scanner version | Reject or diff by `tools` |
|
||||
| **Policy Change Test** | Different lattice/mutes | Diff by `policy` section |
|
||||
| **Cross-Arch Test** | x64 vs arm64 | Identical outputs |
|
||||
| **Corrupted Bundle** | Tamper bundle | Verification fails |
|
||||
|
||||
---
|
||||
|
||||
## 12. Example Verification Output
|
||||
|
||||
```
|
||||
$ stella verify manifest.json
|
||||
|
||||
[✓] Manifest integrity: OK
|
||||
[✓] DSSE signatures (FIPS,GOST): OK
|
||||
[✓] Feeds snapshot hash: OK
|
||||
[✓] Policy + mutes hash: OK
|
||||
[✓] Toolchain hash: OK
|
||||
[✓] SBOM/VEX outputs: OK
|
||||
|
||||
Result: VERIFIED
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 13. Future Extensions
|
||||
|
||||
* Support **SPDX 3.0.1** alongside CycloneDX 1.6.
|
||||
* Add **per-file Merkle proofs** for local scans.
|
||||
* Ledger anchoring (Rekor, distributed Proof-Market).
|
||||
* Post-quantum signatures (Dilithium/Falcon).
|
||||
* Replay orchestration API (`/api/replay/:id`).
|
||||
|
||||
---
|
||||
|
||||
## 14. Summary
|
||||
|
||||
Deterministic Replay freezes every element of a scan:
|
||||
|
||||
> *image → feeds → policy → toolchain → environment → outputs → signatures.*
|
||||
|
||||
By enforcing canonical input/output states and verifiable cryptographic bindings, Stella Ops achieves **regulatory-grade replayability**, **regional crypto compliance**, and **immutable provenance** across all scans.
|
||||
|
||||
---
|
||||
116
docs/modules/replay/guides/DEVS_GUIDE_REPLAY.md
Normal file
@@ -0,0 +1,116 @@
|
||||
# Stella Ops — Developer Guide: Deterministic Replay
|
||||
|
||||
## Purpose
|
||||
Deterministic Replay ensures that any past scan can be re-executed to produce byte-for-byte identical SBOM, Findings, and VEX results that are cryptographically verifiable for audits and compliance.
|
||||
|
||||
Replay is the foundation for:
|
||||
- **Audit proofs** (exact past state reproduction)
|
||||
- **Diff analysis** (feeds, policies, tool versions)
|
||||
- **Cross-region verification** (same outputs on different hosts)
|
||||
- **Long-term cryptographic trust** (re-sign with new crypto profiles)
|
||||
|
||||
---
|
||||
|
||||
## Core Concepts
|
||||
|
||||
| Term | Description |
|
||||
|------|--------------|
|
||||
| **Replay Manifest** | Immutable JSON describing all inputs, tools, env, and outputs of a scan. |
|
||||
| **InputBundle** | Snapshot of feeds, rules, policies, and toolchain binaries used. |
|
||||
| **OutputBundle** | SBOM, Findings, VEX, and logs from a completed scan. |
|
||||
| **Layer Merkle** | Per-layer hash tree for precise deduplication and drift detection. |
|
||||
| **DSSE Envelope** | Digital signature wrapper for each attestation (SBOM, Findings, Manifest, etc.). |
|
||||
|
||||
---
|
||||
|
||||
## What to Freeze
|
||||
|
||||
| Category | Example Contents | Required in Manifest |
|
||||
|-----------|------------------|----------------------|
|
||||
| **Subject** | OCI image digest, per-layer Merkle roots | ✅ |
|
||||
| **Outputs** | SBOM, Findings, VEX, logs (content hashes) | ✅ |
|
||||
| **Toolchain** | Sbomer, Scanner, Vexer binaries + versions + SHA256 | ✅ |
|
||||
| **Feeds/VEX sources** | Full or pruned snapshot with Merkle proofs | ✅ |
|
||||
| **Policy Bundle** | Lattice rules, mutes, trust profiles, thresholds | ✅ |
|
||||
| **Environment** | OS, arch, locale, TZ, deterministic seed, runtime flags | ✅ |
|
||||
| **Reachability Evidence** | Callgraphs (`graphs[]`), runtime traces (`runtimeTraces[]`), analyzer/version hashes | ✅ |
|
||||
| **Crypto Profile** | Algorithm suites (FIPS, GOST, SM, eIDAS) | ✅ |
|
||||
|
||||
---
|
||||
|
||||
## Replay Modes
|
||||
|
||||
| Mode | Purpose | Input Variation | Expected Output |
|
||||
|------|----------|-----------------|-----------------|
|
||||
| **Strict Replay** | Audit proof | None | Bit-for-bit identical |
|
||||
| **What-If Replay** | Change impact analysis | One dimension (feeds/tools/policy) | Deterministic diff |
|
||||
|
||||
Example:
|
||||
```bash
stella replay manifest.json --strict
stella replay manifest.json --what-if --vary=feeds
```
|
||||
|
||||
---
|
||||
|
||||
## Developer Responsibilities
|
||||
|
||||
| Module | Role |
|
||||
|---------|------|
|
||||
| **Scanner.WebService** | Capture full input set and produce Replay Manifest + DSSE sigs. |
|
||||
| **Sbomer** | Generate deterministic SBOM; normalize ordering and JSON formatting. |
|
||||
| **Vexer/Excititor** | Apply lattice and mutes from policy bundle; record gating logic. |
|
||||
| **Feedser/Concelier** | Freeze and export feed snapshots or Merkle proofs. |
|
||||
| **Authority** | Manage signer keys and crypto profiles; issue DSSE envelopes. |
|
||||
| **CLI** | Provide `scan --record`, `replay`, `verify`, `diff` commands. |
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
1. `stella scan image:tag --record out/`
   - Generates Replay Manifest, InputBundle, OutputBundle, DSSE sigs.
   - Captures reachability graphs/traces (if enabled) and references them via `reachability.graphs[]` + `runtimeTraces[]`.
2. `stella verify manifest.json`
   - Validates hashes, signatures, and completeness.
3. `stella replay manifest.json --strict`
   - Re-executes in sealed mode; expect byte-identical results.
4. `stella replay manifest.json --what-if --vary=feeds`
   - Runs with new feeds; diff is attributed to feeds only.
5. `stella diff manifestA manifestB`
   - Attributes differences by hash comparison.
|
||||
|
||||
---
|
||||
|
||||
## Storage
|
||||
|
||||
- **PostgreSQL tables** (see `docs/db/SPECIFICATION.md` for schema details)
  - `replay.runs`: manifest hash, status, signatures, outputs
  - `replay.bundles`: digest, type, CAS location, size
  - `replay.subjects`: OCI digests + per-layer Merkle roots
- **Indexes** (canonical names): `runs_manifest_hash_unique`, `runs_status_created_at`, `bundles_type`, `bundles_location`, `subjects_layer_digest`
- **File store**
  - Bundles stored as `<sha256>.tar.zst` in CAS (`cas://replay/<shard>/<digest>.tar.zst`); shard = first two hex chars
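
The sharding rule can be sketched as below. The sketch assumes that `<digest>` in the URI is the bare 64-character hex string (any `sha256:` prefix stripped) and that it is lowercased; `cas_uri` is a hypothetical helper name.

```python
def cas_uri(digest: str) -> str:
    """Build cas://replay/<shard>/<digest>.tar.zst with shard = first two hex chars."""
    hex_digest = digest.removeprefix("sha256:").lower()
    if len(hex_digest) != 64 or any(c not in "0123456789abcdef" for c in hex_digest):
        raise ValueError(f"not a SHA-256 hex digest: {digest!r}")
    return f"cas://replay/{hex_digest[:2]}/{hex_digest}.tar.zst"
```

Two-character shards spread bundles over at most 256 prefixes, which keeps any single directory or key prefix from growing without bound.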
|
||||
|
||||
---
|
||||
|
||||
## Developer Checklist
|
||||
|
||||
- [ ] All inputs (feeds, policies, tools, env) hashed and recorded.
|
||||
- [ ] JSON normalization: key order, number format, newline mode.
|
||||
- [ ] Random seed = `H(scan.id || MerkleRootAllLayers)`.
|
||||
- [ ] Clock fixed to `scan.time` unless policy requires “now”.
|
||||
- [ ] DSSE multi-sig supported (FIPS + regional).
|
||||
- [ ] Manifest signed + optionally anchored to Rekor ledger.
|
||||
- [ ] Replay comparison mode tested across x64/arm64.
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
See also:
|
||||
- `DETERMINISTIC_REPLAY.md` — detailed manifest schema & CLI examples.
|
||||
- `../docs/CRYPTO_SOVEREIGN_READY.md` — RootPack and dual-signature handling.
|
||||
|
||||
---
|
||||
30
docs/modules/replay/guides/README.md
Normal file
@@ -0,0 +1,30 @@
|
||||
# Policy Simulation Gaps (PS1–PS10) — Lockfile, Quotas, and Shadow Safety
|
||||
|
||||
This note closes POLICY-GAPS-185-006 by defining a signed inputs lock, offline verifier, and shadow isolation guardrails for policy simulations.
|
||||
|
||||
## Lockfile
|
||||
- Schema: `docs/modules/replay/schemas/policy-sim/lock.schema.json`
|
||||
- Sample: `docs/modules/replay/samples/policy-sim/inputs.lock.sample.json`
|
||||
- Fields cover policy bundle, graph, SBOM, time anchor, dataset digests; shadowIsolation flag; requiredScopes.
|
||||
- Recommended signing: DSSE over the lockfile with Ed25519; record envelope digest alongside artefacts.
|
||||
|
||||
## Validation
|
||||
- Library helper: `PolicySimulationInputLockValidator` in `StellaOps.Replay.Core` compares materialized digests and enforces shadow mode + scope `policy:simulate:shadow`.
|
||||
- Staleness: pass `maxAge` (suggested 24h) to reject outdated locks.
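
The checks described above can be sketched as follows. This is not the `PolicySimulationInputLockValidator` API: the lockfile field names `digests` and `generatedAt` are hypothetical placeholders for whatever the schema defines, and only `requiredScopes` and the `shadow` run mode are taken from this note.

```python
from datetime import datetime, timedelta, timezone


def validate_lock(lock: dict, materialized: dict, scopes: set, run_mode: str,
                  now: datetime, max_age: timedelta = timedelta(hours=24)) -> list[str]:
    """Return a list of validation errors; an empty list means the lock is usable."""
    errors = []
    # 1. Every digest recorded in the lock must match the materialized input.
    for name, expected in lock["digests"].items():
        if materialized.get(name) != expected:
            errors.append(f"digest mismatch: {name}")
    # 2. Reject locks older than max_age.
    created = datetime.fromisoformat(lock["generatedAt"].replace("Z", "+00:00"))
    if now - created > max_age:
        errors.append("stale lock")
    # 3. Simulations must run in shadow mode with the required scopes.
    if run_mode != "shadow":
        errors.append("shadow mode required")
    missing = set(lock.get("requiredScopes", [])) - set(scopes)
    if missing:
        errors.append("missing scopes: " + ", ".join(sorted(missing)))
    return errors
```

Passing `now` explicitly keeps the validator itself deterministic, in line with the fixed-clock rule used elsewhere in the replay module.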
|
||||
|
||||
## CLI / CI contract
|
||||
- Script: `scripts/replay/verify-policy-sim-lock.sh` (offline). Exit codes: 0 OK, 2 missing tools/args, 3 schema/hash mismatch, 4 stale, 5 shadow/scope failure.
|
||||
- CI should run verifier before simulations and fail fast on non-zero exit.
|
||||
|
||||
## Quotas & backpressure
|
||||
- Default limits: max 10 concurrent shadow runs per tenant; queue depth 100; reject when `policy:simulate:shadow` scope missing.
|
||||
- Simulators must be read-only: no writes to policy stores; only emit shadow metrics.
|
||||
|
||||
## Offline policy-sim kit
|
||||
- Lockfile + DSSE, digests of policy/graph/sbom/time-anchor/dataset.
|
||||
- Bundle alongside replay packs; verifier uses local SHA256 only (no network).
|
||||
|
||||
## Shadow isolation & redaction
|
||||
- Always run in `shadow` mode; block if requested runMode != `shadow`.
|
||||
- Redact PII fields (`user`, `ip`, `headers`, `secrets`) before storing fixtures; keep only hashes.
|
||||
- Require DSSE evidence when storing fixtures or responding to API clients.
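
The redaction rule can be sketched as a recursive pass that replaces the listed fields with a hash of their value. The field list comes from the bullet above; the `sha256:` output format and the `redact` helper are assumptions.

```python
import hashlib
import json

PII_FIELDS = {"user", "ip", "headers", "secrets"}


def _digest(value) -> str:
    """Hash the canonical JSON form of a value so the result is stable."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def redact(value):
    """Return a copy of a fixture with PII fields replaced by their hashes."""
    if isinstance(value, dict):
        return {k: _digest(v) if k in PII_FIELDS else redact(v) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value
```

Plain hashes of low-entropy values such as IP addresses can be recovered by brute force, so a keyed hash may be preferable where that matters.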
|
||||
57
docs/modules/replay/guides/TEST_STRATEGY.md
Normal file
@@ -0,0 +1,57 @@
|
||||
# Replay Test Strategy
|
||||
|
||||
> **Imposed rule:** Replay tests must use frozen inputs (SBOM, advisories, VEX, feeds, policy, tools) and fixed seeds/clocks; any non-determinism is a test failure.
|
||||
|
||||
This strategy defines how we validate replayability of Scanner outputs and attestations across tool/definition updates and environments.
|
||||
|
||||
## 1. Goals
|
||||
- Prove that a recorded scan bundle (inputs + manifests) replays bit-for-bit across environments.
|
||||
- Detect drift from feeds, policy, or tooling changes before shipping releases.
|
||||
- Provide auditors with evidence (hashes, DSSE bundles) that replays are deterministic.
|
||||
|
||||
## 2. Test layers
|
||||
1) **Golden replay**: take a recorded bundle (SBOM/VEX/feeds/policy/tool hashes) and rerun; assert hash equality for SBOM, findings, VEX, logs. Fail on any difference.
|
||||
2) **Feed drift guard**: rerun bundle after feed update; expect differences; ensure drift is surfaced (hash mismatch, diff report) not silently masked.
|
||||
3) **Tool upgrade**: rerun with new scanner version; expect stable outputs if no functional change, otherwise require documented diffs.
|
||||
4) **Policy change**: rerun with updated policy; expect explain trace to show changed rules and hash delta; diff must be recorded.
|
||||
5) **Offline**: replay in sealed mode using only bundle contents; no network access permitted.
|
||||
|
||||
## 3. Inputs
|
||||
- Replay bundle contents: `sbom`, `feeds.tar.gz`, `policy.tar.gz`, `scanner-image`, `reachability.graph`, `runtime-trace` (optional), `replay.yaml`.
|
||||
- Hash manifest: SHA-256 for every file; top-level Merkle root.
|
||||
- DSSE attestations (optional): for replay manifest and artifacts.
|
||||
|
||||
## 4. Determinism settings
|
||||
- Fixed clock (`--fixed-clock` ISO-8601), RNG seed (`RNG_SEED`), single-threaded mode (`SCANNER_MAX_CONCURRENCY=1`), stable ordering (sorted inputs), log filtering (strip timestamps/PIDs).
|
||||
- Disable network/egress; rely on bundled feeds/policy.
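
Log filtering can be sketched with two substitutions. The log layout is an assumption (ISO-8601 timestamps and `pid=<n>` tokens); real Scanner logs may need different patterns.

```python
import re

# ISO-8601 timestamps such as 2025-11-01T00:00:00.123Z or 2025-11-01T00:00:00+02:00
_TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?")
# Process IDs written as pid=1234 (assumed format)
_PID = re.compile(r"\bpid=\d+\b")


def filter_log_line(line: str) -> str:
    """Replace run-specific tokens with fixed placeholders before hashing logs."""
    return _PID.sub("pid=<pid>", _TS.sub("<ts>", line))
```

Replacing tokens with placeholders, rather than deleting them, keeps line structure intact so diffs of filtered logs stay readable.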
|
||||
|
||||
## 5. Assertions
|
||||
- Hash equality for outputs: SBOMs, findings, VEX, logs (canonicalised), determinism.json (if present).
|
||||
- When the bundle includes DSSE signatures or Rekor proofs, verify them; fail on any mismatch, and fail if an attestation referenced by the manifest is missing.
|
||||
- Report diff summary when hashes differ (feed/tool/policy drift).
|
||||
|
||||
## 6. Tooling
|
||||
- CLI: `stella replay run --bundle <path> --fixed-clock 2025-11-01T00:00:00Z --seed 1337 --single-threaded`.
|
||||
- Scripts: `scripts/replay/verify_bundle.sh` (hash/manifest check), `scripts/replay/run_replay.sh` (orchestrates fixed settings), `scripts/replay/diff_outputs.py` (canonical diffs).
|
||||
- CI: `bench:determinism` target executes golden replay on reference bundles; fails on hash delta.
|
||||
|
||||
## 7. Outputs
|
||||
- `replay-results.json` with per-artifact hashes, pass/fail, diff counts.
|
||||
- `replay.log` filtered (no timestamps/PIDs), `replay.hashes` (sha256sum of outputs).
|
||||
- Optional DSSE attestation for replay results.
|
||||
|
||||
## 8. Reporting
|
||||
- Publish results to CI artifacts; store in Evidence Locker for audit.
|
||||
- Add summary to release notes when replay is part of a release gate.
|
||||
|
||||
## 9. Checklists
|
||||
- [ ] Bundle verified (hash manifest, DSSE if present).
|
||||
- [ ] Fixed clock/seed/concurrency applied.
|
||||
- [ ] Network disabled; feeds/policy/tooling from bundle only.
|
||||
- [ ] Outputs hashed and compared to baseline; diffs recorded.
|
||||
- [ ] Replay results stored + (optionally) attested.
|
||||
|
||||
## References
|
||||
- `docs/modules/scanner/determinism-score.md`
|
||||
- `docs/modules/replay/guides/DETERMINISTIC_REPLAY.md`
|
||||
- `docs/modules/scanner/entropy.md`
|
||||
455
docs/modules/replay/guides/replay-manifest-guide.md
Normal file
@@ -0,0 +1,455 @@
|
||||
# Replay Manifest Guide
|
||||
|
||||
> **Sprint:** SPRINT_20251228_001_BE_replay_manifest_ci (T6)
|
||||
> **Purpose:** Complete reference for Replay Manifest export, verification, and CI integration.
|
||||
|
||||
## Overview
|
||||
|
||||
The Replay Manifest is a self-contained JSON document that captures everything needed to reproduce a scan: inputs, toolchain versions, policies, and expected outputs. When verified, it provides cryptographic proof that a scan is deterministic and reproducible.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Export replay manifest after scanning
|
||||
stella replay export --scan-id <scan-uuid> --output replay.json
|
||||
|
||||
# Or export for a specific image
|
||||
stella replay export --image myregistry/app:v1.0.0 --output replay.json
|
||||
|
||||
# Verify determinism (strict mode)
|
||||
stella replay verify --manifest replay.json --strict-mode
|
||||
|
||||
# Verify with drift failure (for CI)
|
||||
stella replay verify --manifest replay.json --fail-on-drift
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Schema Reference
|
||||
|
||||
### Schema Version
|
||||
|
||||
Current version: `1.0.0`
|
||||
|
||||
Schema location: `src/__Libraries/StellaOps.Replay.Core/Schemas/replay-export.schema.json`
|
||||
|
||||
### Top-Level Structure
|
||||
|
||||
```json
|
||||
{
|
||||
"version": "1.0.0",
|
||||
"snapshot": { ... },
|
||||
"toolchain": { ... },
|
||||
"inputs": { ... },
|
||||
"outputs": { ... },
|
||||
"verification": { ... }
|
||||
}
|
||||
```
|
||||
|
||||
### `snapshot` Object
|
||||
|
||||
Identifies the scan snapshot this manifest represents.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `id` | string | Unique snapshot ID (`snapshot:<sha256>`) |
|
||||
| `createdAt` | ISO 8601 | UTC timestamp when scan completed |
|
||||
| `artifact` | object | Reference to scanned artifact (digest, repository, tag) |
|
||||
|
||||
Example:
|
||||
```json
|
||||
{
|
||||
"id": "snapshot:a1b2c3d4e5f6...",
|
||||
"createdAt": "2025-12-28T14:30:00Z",
|
||||
"artifact": {
|
||||
"digest": "sha256:abc123...",
|
||||
"repository": "myregistry/app",
|
||||
"tag": "v1.0.0"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### `toolchain` Object
|
||||
|
||||
Captures exact versions of all tools used during the scan.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `scannerVersion` | string | StellaOps Scanner version |
|
||||
| `policyEngineVersion` | string | Policy Engine version |
|
||||
| `platform` | string | Platform identifier (e.g., `linux-x64`) |
|
||||
| `sbomerVersion` | string | SBOM generator version |
|
||||
| `vexerVersion` | string | VEX processor version |
|
||||
|
||||
Example:
|
||||
```json
|
||||
{
|
||||
"scannerVersion": "0.42.0",
|
||||
"policyEngineVersion": "0.42.0",
|
||||
"platform": "linux-x64",
|
||||
"sbomerVersion": "0.42.0",
|
||||
"vexerVersion": "0.42.0"
|
||||
}
|
||||
```
|
||||
|
||||
### `inputs` Object
|
||||
|
||||
All inputs consumed during the scan, with content hashes.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `sboms` | array | SBOM inputs (if layered) |
|
||||
| `vex` | array | VEX documents used |
|
||||
| `feeds` | array | Vulnerability feed snapshots |
|
||||
| `policies` | object | Policy bundle reference |
|
||||
|
||||
Feed snapshot example:
|
||||
```json
|
||||
{
|
||||
"feeds": [
|
||||
{
|
||||
"name": "nvd",
|
||||
"snapshotId": "nvd:2025-12-28T00:00:00Z",
|
||||
"digest": "sha256:def456...",
|
||||
"recordCount": 245678
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
### `outputs` Object
|
||||
|
||||
Expected outputs from the scan, used for verification.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `verdictDigest` | string | SHA256 of verdict JSON |
|
||||
| `decision` | enum | `allow`, `deny`, or `review` |
|
||||
| `sbomDigest` | string | SHA256 of generated SBOM |
|
||||
| `findingsDigest` | string | SHA256 of findings JSON |
|
||||
|
||||
### `verification` Object
|
||||
|
||||
Helper commands and expected hashes for verification.
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `command` | string | CLI command to reproduce scan |
|
||||
| `expectedSbomHash` | string | Expected SBOM content hash |
|
||||
| `expectedVerdictHash` | string | Expected verdict content hash |
|
||||
|
||||
---
|
||||
|
||||
## CLI Commands
|
||||
|
||||
### `stella replay export`
|
||||
|
||||
Export a replay manifest from a completed scan.
|
||||
|
||||
```bash
|
||||
stella replay export [OPTIONS]
|
||||
```
|
||||
|
||||
| Option | Required | Description |
|
||||
|--------|----------|-------------|
|
||||
| `--scan-id <uuid>` | One of `--scan-id` or `--image` | Scan ID to export |
|
||||
| `--image <ref>` | One of `--scan-id` or `--image` | Image reference (uses latest scan) |
|
||||
| `--output <path>` | No | Output path (default: `replay.json`) |
|
||||
| `--include-feed-snapshots` | No | Include full feed snapshot refs |
|
||||
| `--no-verification-script` | No | Skip verification command generation |
|
||||
|
||||
### `stella replay verify`
|
||||
|
||||
Verify a replay manifest by re-executing the scan and comparing outputs.
|
||||
|
||||
```bash
|
||||
stella replay verify [OPTIONS]
|
||||
```
|
||||
|
||||
| Option | Required | Description |
|
||||
|--------|----------|-------------|
|
||||
| `--manifest <path>` | Yes | Path to replay manifest |
|
||||
| `--strict-mode` | No | Require bit-for-bit identical outputs |
|
||||
| `--fail-on-drift` | No | Exit code 1 on any drift |
|
||||
| `--output-diff <path>` | No | Write diff report to file |
|
||||
|
||||
### Exit Codes
|
||||
|
||||
| Code | Meaning |
|
||||
|------|---------|
|
||||
| `0` | Verification passed, outputs match |
|
||||
| `1` | Drift detected, outputs differ |
|
||||
| `2` | Verification error (missing inputs, invalid manifest, etc.) |
|
||||
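In scripts it can help to branch on the exit status rather than parse output. A small sketch, using only the three codes from the table above; the labels and the function name `describe_replay_exit` are illustrative choices, not CLI output:

```bash
# Translate a `stella replay verify` exit status into a short label.
describe_replay_exit() {
  case "$1" in
    0) echo "match" ;;    # verification passed, outputs match
    1) echo "drift" ;;    # outputs differ
    2) echo "error" ;;    # missing inputs, invalid manifest, etc.
    *) echo "unknown" ;;  # anything outside the documented table
  esac
}
```

A wrapper might run the verify command, capture `$?`, and pass it to this function so that drift (`1`) and setup errors (`2`) are reported differently.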
|
||||
---
|
||||
|
||||
## CI Integration
|
||||
|
||||
### Gitea Actions
|
||||
|
||||
```yaml
|
||||
name: SBOM Replay Verification
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
pull_request:
|
||||
|
||||
jobs:
|
||||
verify-determinism:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Build image
|
||||
run: docker build -t ${{ github.repository }}:${{ github.sha }} .
|
||||
|
||||
- name: Scan with replay export
|
||||
run: |
|
||||
          stella scan \
|
||||
--image ${{ github.repository }}:${{ github.sha }} \
|
||||
--output-sbom sbom.json \
|
||||
--output-replay replay.json
|
||||
|
||||
- name: Verify determinism
|
||||
run: |
|
||||
          stella replay verify \
|
||||
--manifest replay.json \
|
||||
--fail-on-drift \
|
||||
--strict-mode
|
||||
|
||||
- name: Upload replay manifest
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: replay-manifest
|
||||
path: replay.json
|
||||
retention-days: 90
|
||||
```
|
||||
|
||||
### GitHub Actions
|
||||
|
||||
```yaml
|
||||
name: SBOM Replay Verification
|
||||
|
||||
on:
|
||||
push:
|
||||
branches: [main]
|
||||
pull_request:
|
||||
|
||||
jobs:
|
||||
verify-determinism:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
|
||||
- name: Set up StellaOps
|
||||
uses: stellaops/setup-stella@v1
|
||||
with:
|
||||
version: '0.42.0'
|
||||
|
||||
- name: Build and scan
|
||||
run: |
|
||||
docker build -t myapp:${{ github.sha }} .
|
||||
stella scan --image myapp:${{ github.sha }} \
|
||||
--output-sbom sbom.json \
|
||||
--output-replay replay.json
|
||||
|
||||
- name: Verify replay
|
||||
run: stella replay verify --manifest replay.json --fail-on-drift
|
||||
|
||||
- name: Upload attestations
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: sbom-attestations
|
||||
path: |
|
||||
sbom.json
|
||||
replay.json
|
||||
```
|
||||
|
||||
### GitLab CI
|
||||
|
||||
```yaml
|
||||
sbom-replay:
|
||||
stage: security
|
||||
  image: stellaops/cli:0.42.0
|
||||
script:
|
||||
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
|
||||
- stella scan --image $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA --output-replay replay.json
|
||||
- stella replay verify --manifest replay.json --fail-on-drift
|
||||
artifacts:
|
||||
paths:
|
||||
- replay.json
|
||||
expire_in: 90 days
|
||||
rules:
|
||||
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
|
||||
- if: $CI_COMMIT_BRANCH == "main"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting Drift Detection
|
||||
|
||||
### Common Drift Causes
|
||||
|
||||
| Cause | Symptom | Fix |
|
||||
|-------|---------|-----|
|
||||
| Feed update | `findingsDigest` differs | Pin feed snapshot version |
|
||||
| Policy change | `verdictDigest` differs | Version policy bundles |
|
||||
| Tool upgrade | All digests differ | Lock toolchain versions |
|
||||
| Non-deterministic SBOM | `sbomDigest` differs | Enable deterministic mode |
|
||||
| Timezone issues | Timestamps drift | Ensure UTC everywhere |
|
||||
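The table can be turned into a rough triage helper. The sketch below takes three flags (1 = changed, 0 = unchanged) for `sbomDigest`, `findingsDigest`, and `verdictDigest`, and suggests the most likely cause. The precedence order is a heuristic assumed for illustration, and `likely_drift_cause` is a hypothetical name; real drift can have more than one cause.

```bash
# Suggest a likely drift cause from which digests changed.
# Arguments: sbom_changed findings_changed verdict_changed (each 0 or 1).
likely_drift_cause() {
  case "$1$2$3" in
    000) echo "no drift" ;;
    111) echo "tool upgrade" ;;            # all digests differ
    1??) echo "non-deterministic SBOM" ;;  # SBOM changed, but not everything
    01?) echo "feed update" ;;             # findings changed, SBOM stable
    001) echo "policy change" ;;           # only the verdict changed
    *)   echo "unknown" ;;                 # malformed or missing flags
  esac
}
```

Timestamp drift from timezone handling is not covered, since it does not map to a single digest field in the table.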
|
||||
### Debugging Steps
|
||||
|
||||
1. **Export diff report:**
|
||||
```bash
|
||||
stella replay verify --manifest replay.json --output-diff drift-report.json
|
||||
```
|
||||
|
||||
2. **Compare inputs:**
|
||||
```bash
|
||||
stella replay diff --manifest-a old.json --manifest-b new.json --show-inputs
|
||||
```
|
||||
|
||||
3. **Check feed versions:**
|
||||
```bash
|
||||
stella feeds list --show-snapshots
|
||||
```
|
||||
|
||||
4. **Verify toolchain:**
|
||||
```bash
|
||||
stella version --all
|
||||
```
|
||||
|
||||
### Feed Snapshot Pinning
|
||||
|
||||
For reproducible CI, pin feed snapshots:
|
||||
|
||||
```bash
|
||||
# List available snapshots
|
||||
stella feeds snapshots --feed nvd
|
||||
|
||||
# Pin specific snapshot
|
||||
stella scan --image myapp:v1.0.0 \
|
||||
--feed-snapshot nvd:2025-12-28T00:00:00Z \
|
||||
--output-replay replay.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Best Practices for Deterministic Builds
|
||||
|
||||
### 1. Lock All Dependencies
|
||||
|
||||
```yaml
|
||||
# In CI, always specify exact versions
|
||||
image: stellaops/cli:0.42.0  # Not :latest
|
||||
```
|
||||
|
||||
### 2. Pin Feed Snapshots
|
||||
|
||||
```bash
|
||||
# Export current snapshot ID
|
||||
stella feeds export-snapshot --output feeds-snapshot.json
|
||||
|
||||
# Use in subsequent scans
|
||||
stella scan --feed-snapshot-file feeds-snapshot.json
|
||||
```
|
||||
|
||||
### 3. Version Policy Bundles
|
||||
|
||||
```bash
|
||||
# Store policies in version control
|
||||
git add policies/
|
||||
git commit -m "Policy bundle v2.3.0"
|
||||
|
||||
# Reference by commit in manifest
|
||||
stella scan --policy-ref policies@abc123
|
||||
```
|
||||
|
||||
### 4. Use Strict Mode in CI
|
||||
|
||||
```bash
|
||||
# Always use strict mode in CI pipelines
|
||||
stella replay verify --manifest replay.json --strict-mode --fail-on-drift
|
||||
```
|
||||
|
||||
### 5. Archive Replay Manifests
|
||||
|
||||
Store replay manifests alongside release artifacts to preserve an audit trail:
|
||||
|
||||
```bash
|
||||
# Archive with release
|
||||
cp replay.json releases/v1.0.0/replay.json
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## API Reference
|
||||
|
||||
### `IReplayManifestExporter`
|
||||
|
||||
```csharp
|
||||
public interface IReplayManifestExporter
|
||||
{
|
||||
/// <summary>
|
||||
/// Exports a replay manifest for a completed scan.
|
||||
/// </summary>
|
||||
Task<ReplayExportResult> ExportAsync(
|
||||
string scanId,
|
||||
ReplayExportOptions options,
|
||||
CancellationToken ct = default);
|
||||
}
|
||||
```
|
||||
|
||||
### `ReplayExportOptions`
|
||||
|
||||
```csharp
|
||||
public sealed record ReplayExportOptions
|
||||
{
|
||||
/// <summary>Include exact toolchain versions.</summary>
|
||||
public bool IncludeToolchainVersions { get; init; } = true;
|
||||
|
||||
/// <summary>Include feed snapshot references.</summary>
|
||||
public bool IncludeFeedSnapshots { get; init; } = true;
|
||||
|
||||
/// <summary>Generate verification shell command.</summary>
|
||||
public bool GenerateVerificationScript { get; init; } = true;
|
||||
|
||||
/// <summary>Output file path.</summary>
|
||||
public string OutputPath { get; init; } = "replay.json";
|
||||
}
|
||||
```
|
||||
|
||||
### `ReplayExportResult`
|
||||
|
||||
```csharp
|
||||
public sealed record ReplayExportResult
|
||||
{
|
||||
/// <summary>Path to exported manifest.</summary>
|
||||
public required string ManifestPath { get; init; }
|
||||
|
||||
/// <summary>SHA256 digest of manifest content.</summary>
|
||||
public required string ManifestDigest { get; init; }
|
||||
|
||||
/// <summary>Path to verification script (if generated).</summary>
|
||||
public string? VerificationScriptPath { get; init; }
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Deterministic Replay](DETERMINISTIC_REPLAY.md) - Core concepts and architecture
|
||||
- [Developer Guide: Replay](DEVS_GUIDE_REPLAY.md) - Implementation details
|
||||
- [Replay Manifest v2 Acceptance](replay-manifest-v2-acceptance.md) - Schema evolution
|
||||
- [Test Strategy](TEST_STRATEGY.md) - Replay testing approach
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
| Version | Date | Changes |
|
||||
|---------|------|---------|
|
||||
| 1.0.0 | 2025-12-28 | Initial schema and CLI commands |
|
||||
311
docs/modules/replay/guides/replay-manifest-v2-acceptance.md
Normal file
@@ -0,0 +1,311 @@
|
||||
# Replay Manifest v2 Acceptance Contract
|
||||
|
||||
_Last updated: 2025-12-13. Owner: BE-Base Platform Guild._
|
||||
|
||||
This document defines the acceptance criteria and test vectors for replay manifest v2, enabling Task 19 (GAP-REP-004) to proceed with implementation.
|
||||
|
||||
---
|
||||
|
||||
## 1. Overview
|
||||
|
||||
Replay manifest v2 introduces:
|
||||
|
||||
- **BLAKE3 graph hashes:** Primary hash algorithm for reachability graphs
|
||||
- **Sorted CAS entries:** Deterministic ordering of all CAS references
|
||||
- **hashAlg fields:** Explicit algorithm declarations for forward compatibility
|
||||
- **code_id coverage:** Coverage metrics for stripped binary handling
|
||||
|
||||
---
|
||||
|
||||
## 2. Schema Changes (v1 → v2)
|
||||
|
||||
### 2.1 Version Field
|
||||
|
||||
```json
|
||||
{
|
||||
"schemaVersion": "2.0",
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### 2.2 Hash Algorithm Declaration
|
||||
|
||||
All hash fields now include an explicit algorithm:
|
||||
|
||||
```json
|
||||
{
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{
|
||||
"hash": "blake3:a1b2c3d4e5f6...",
|
||||
"hashAlg": "blake3-256",
|
||||
"casUri": "cas://reachability/graphs/blake3:a1b2c3d4..."
|
||||
}
|
||||
],
|
||||
"runtimeTraces": [
|
||||
{
|
||||
"hash": "sha256:feedface...",
|
||||
"hashAlg": "sha256",
|
||||
"casUri": "cas://reachability/runtime/sha256:feedface..."
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 2.3 Sorted CAS Entries
|
||||
|
||||
All arrays must be sorted by a deterministic key:
|
||||
|
||||
| Array | Sort Key |
|
||||
|-------|----------|
|
||||
| `reachability.graphs[]` | `casUri` (lexicographic) |
|
||||
| `reachability.runtimeTraces[]` | `casUri` (lexicographic) |
|
||||
| `inputs.feeds[]` | `name` (lexicographic) |
|
||||
| `inputs.tools[]` | `name` (lexicographic) |
|
||||
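A quick way to check the ordering rule outside the writer is to feed the sort keys, one per line, to `sort -c`. This sketch assumes "lexicographic" means byte-wise order, which is why it forces the `C` locale; under other locales the collation could differ between machines. The function name `keys_are_sorted` is illustrative.

```bash
# Succeed when stdin (one sort key per line, e.g. casUri values)
# is already in byte-wise ascending order.
keys_are_sorted() {
  # -c only checks the order; LC_ALL=C makes the comparison locale-independent.
  LC_ALL=C sort -c 2>/dev/null
}
```

Extracting the keys from a manifest (for example the `casUri` of each graph) is left to whatever JSON tooling is at hand.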
|
||||
### 2.4 Code ID Coverage
|
||||
|
||||
New field for stripped binary support:
|
||||
|
||||
```json
|
||||
{
|
||||
"reachability": {
|
||||
"code_id_coverage": {
|
||||
"total_nodes": 1247,
|
||||
"nodes_with_symbol_id": 1189,
|
||||
"nodes_with_code_id": 58,
|
||||
"coverage_percent": 100.0
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
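How `coverage_percent` relates to the other three fields is not spelled out here. The example values are consistent with coverage being the share of nodes that carry either identifier (1189 + 58 = 1247, hence 100.0). Under that assumption, a sketch of the calculation follows; the formula, the zero-node behavior, and the name `coverage_percent` as a shell function are all assumptions.

```bash
# coverage_percent under the assumed formula:
#   (nodes_with_symbol_id + nodes_with_code_id) / total_nodes * 100
# Arguments: total_nodes nodes_with_symbol_id nodes_with_code_id
coverage_percent() {
  LC_ALL=C awk -v total="$1" -v sym="$2" -v code="$3" 'BEGIN {
    # An empty graph is reported as 0.0 here; the real behavior is unspecified.
    if (total <= 0) { print "0.0"; exit }
    printf "%.1f\n", (sym + code) * 100 / total
  }'
}
```

With the values from the example above, `coverage_percent 1247 1189 58` reproduces the listed `100.0`.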
|
||||
---
|
||||
|
||||
## 3. CAS Registration Gates
|
||||
|
||||
### 3.1 Required Registration
|
||||
|
||||
All referenced artifacts must be registered in CAS before manifest finalization:
|
||||
|
||||
| Artifact Type | CAS Path Pattern | Required |
|
||||
|---------------|------------------|----------|
|
||||
| Graph body | `cas://reachability/graphs/{hash}` | Yes |
|
||||
| Graph DSSE | `cas://reachability/graphs/{hash}.dsse` | Yes |
|
||||
| Runtime trace | `cas://reachability/runtime/{hash}` | Conditional |
|
||||
| Edge bundle | `cas://reachability/edges/{graph_hash}/{bundle_id}` | Conditional |
|
||||
|
||||
### 3.2 Registration Validation
|
||||
|
||||
Before signing a replay manifest:
|
||||
|
||||
1. Verify all `casUri` references resolve to existing CAS objects
|
||||
2. Verify hash matches CAS content
|
||||
3. Verify DSSE envelope exists for all graph references
|
||||
4. Fail manifest creation if any reference is missing
|
||||
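Steps 1, 2, and 4 can be approximated for a single object that has already been fetched to local disk. The sketch below is an assumption-laden illustration: it handles only `sha256:` hashes (BLAKE3 would need an external tool), treats a missing file as an unresolved reference, and reuses the error codes from section 4.4. `check_cas_object` is a hypothetical helper, not an existing command.

```bash
# Check one locally materialized CAS object against its manifest hash.
# Prints "ok" or an error code; returns non-zero on any failure.
check_cas_object() (
  path=$1
  expected=$2   # e.g. "sha256:<hex>"

  # Step 1: the reference must resolve to an existing object.
  if [ ! -f "$path" ]; then
    echo "REPLAY_MANIFEST_CAS_NOT_FOUND"
    exit 1
  fi

  # Step 2: the content hash must match the declared hash.
  case "$expected" in
    sha256:*) actual="sha256:$(sha256sum < "$path" | cut -d' ' -f1)" ;;
    *) echo "unsupported hash algorithm"; exit 2 ;;
  esac
  if [ "$actual" != "$expected" ]; then
    echo "REPLAY_MANIFEST_HASH_MISMATCH"
    exit 1
  fi

  echo "ok"
)
```

A manifest writer would run such a check for every reference and abort finalization on the first non-zero status (step 4). The DSSE envelope check in step 3 is not modeled.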
|
||||
### 3.3 Validation API
|
||||
|
||||
```csharp
|
||||
public interface ICasValidator
|
||||
{
|
||||
Task<CasValidationResult> ValidateAsync(string casUri, string expectedHash);
|
||||
Task<CasValidationResult> ValidateBatchAsync(IEnumerable<CasReference> refs);
|
||||
}
|
||||
|
||||
public record CasValidationResult(
|
||||
bool IsValid,
|
||||
string? ActualHash,
|
||||
string? Error
|
||||
);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Acceptance Test Vectors
|
||||
|
||||
### 4.1 Minimal Valid Manifest v2
|
||||
|
||||
```json
|
||||
{
|
||||
"schemaVersion": "2.0",
|
||||
"scan": {
|
||||
"id": "scan-test-001",
|
||||
"time": "2025-12-13T10:00:00Z",
|
||||
"mode": "record",
|
||||
"scannerVersion": "10.2.0"
|
||||
},
|
||||
"subject": {
|
||||
"ociDigest": "sha256:abc123..."
|
||||
},
|
||||
"inputs": {
|
||||
"feeds": [],
|
||||
"tools": []
|
||||
},
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{
|
||||
"kind": "static",
|
||||
"analyzer": "scanner.java@10.2.0",
|
||||
"hash": "blake3:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2",
|
||||
"hashAlg": "blake3-256",
|
||||
"casUri": "cas://reachability/graphs/blake3:a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2"
|
||||
}
|
||||
],
|
||||
"runtimeTraces": [],
|
||||
"code_id_coverage": {
|
||||
"total_nodes": 100,
|
||||
"nodes_with_symbol_id": 100,
|
||||
"nodes_with_code_id": 0,
|
||||
"coverage_percent": 100.0
|
||||
}
|
||||
},
|
||||
"outputs": {},
|
||||
"provenance": {}
|
||||
}
|
||||
```
|
||||
|
||||
**Expected canonical hash:** `sha256:e7f8a9b0...` (computed from canonical JSON)
|
||||
|
||||
### 4.2 Manifest with Runtime Traces
|
||||
|
||||
```json
|
||||
{
|
||||
"schemaVersion": "2.0",
|
||||
"scan": {
|
||||
"id": "scan-test-002",
|
||||
"time": "2025-12-13T11:00:00Z",
|
||||
"mode": "record",
|
||||
"scannerVersion": "10.2.0"
|
||||
},
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{
|
||||
"kind": "static",
|
||||
"analyzer": "scanner.java@10.2.0",
|
||||
"hash": "blake3:1111111111111111111111111111111111111111111111111111111111111111",
|
||||
"hashAlg": "blake3-256",
|
||||
"casUri": "cas://reachability/graphs/blake3:1111111111111111111111111111111111111111111111111111111111111111"
|
||||
}
|
||||
],
|
||||
"runtimeTraces": [
|
||||
{
|
||||
"source": "eventpipe",
|
||||
"hash": "sha256:2222222222222222222222222222222222222222222222222222222222222222",
|
||||
"hashAlg": "sha256",
|
||||
"casUri": "cas://reachability/runtime/sha256:2222222222222222222222222222222222222222222222222222222222222222",
|
||||
"recordedAt": "2025-12-13T10:30:00Z"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 4.3 Sorting Validation Vector
|
||||
|
||||
Input (unsorted):
|
||||
|
||||
```json
|
||||
{
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{"casUri": "cas://reachability/graphs/blake3:zzzz...", "kind": "framework"},
|
||||
{"casUri": "cas://reachability/graphs/blake3:aaaa...", "kind": "static"}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Expected output (sorted):
|
||||
|
||||
```json
|
||||
{
|
||||
"reachability": {
|
||||
"graphs": [
|
||||
{"casUri": "cas://reachability/graphs/blake3:aaaa...", "kind": "static"},
|
||||
{"casUri": "cas://reachability/graphs/blake3:zzzz...", "kind": "framework"}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 4.4 Invalid Manifest Vectors
|
||||
|
||||
| Test Case | Input | Expected Error |
|
||||
|-----------|-------|----------------|
|
||||
| Missing schemaVersion | `{}` | `REPLAY_MANIFEST_MISSING_VERSION` |
|
||||
| Invalid version | `{"schemaVersion": "1.0"}` | `REPLAY_MANIFEST_VERSION_MISMATCH` (when v2 required) |
|
||||
| Missing hashAlg | `{"hash": "blake3:..."}` | `REPLAY_MANIFEST_MISSING_HASH_ALG` |
|
||||
| Unsorted graphs | See 4.3 input | `REPLAY_MANIFEST_UNSORTED_ENTRIES` |
|
||||
| Missing CAS reference | `{"casUri": "cas://missing/..."}` | `REPLAY_MANIFEST_CAS_NOT_FOUND` |
|
||||
| Hash mismatch | CAS content differs | `REPLAY_MANIFEST_HASH_MISMATCH` |
|
||||
|
||||
---
|
||||
|
||||
## 5. Migration Path
|
||||
|
||||
### 5.1 v1 → v2 Upgrade
|
||||
|
||||
```csharp
|
||||
public static ReplayManifest UpgradeToV2(ReplayManifest v1)
|
||||
{
|
||||
return v1 with
|
||||
{
|
||||
SchemaVersion = "2.0",
|
||||
Reachability = v1.Reachability with
|
||||
{
|
||||
Graphs = v1.Reachability.Graphs
|
||||
.Select(g => g with { HashAlg = InferHashAlg(g.Hash) })
|
||||
                .OrderBy(g => g.CasUri, StringComparer.Ordinal)
|
||||
.ToList(),
|
||||
RuntimeTraces = v1.Reachability.RuntimeTraces
|
||||
.Select(t => t with { HashAlg = InferHashAlg(t.Hash) })
|
||||
                .OrderBy(t => t.CasUri, StringComparer.Ordinal)
|
||||
.ToList()
|
||||
}
|
||||
};
|
||||
}
|
||||
```
|
||||
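`InferHashAlg` is referenced in the upgrade code but not defined in this document. One plausible mapping, based only on the prefixes and `hashAlg` values shown in section 2.2, is sketched below in shell for quick experimentation. The handling of unrecognized prefixes is an assumption.

```bash
# Map a prefixed hash string to the hashAlg value used in v2 manifests.
infer_hash_alg() {
  case "$1" in
    blake3:*) echo "blake3-256" ;;
    sha256:*) echo "sha256" ;;
    *) echo "unknown"; return 1 ;;  # assumed: unknown prefixes are an error
  esac
}
```

Failing on an unknown prefix keeps an upgrade from silently emitting a manifest that would later be rejected for a missing or wrong `hashAlg`.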
|
||||
### 5.2 Backward Compatibility
|
||||
|
||||
- v2 readers MUST accept v1 manifests with a warning
|
||||
- v2 writers MUST always emit v2 format
|
||||
- v1 writers are deprecated after 2026-03-01
|
||||
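The reader rule above and the "when v2 required" case in section 4.4 can be combined into one decision. The sketch below is one possible reading: it assumes v1 manifests carry `schemaVersion` `1.0`, that a caller can opt into a strict mode that requires v2, and that any other version is rejected. The function name and the `strict` argument are hypothetical.

```bash
# Decide how a v2 reader treats a manifest's schemaVersion.
# $1 = schemaVersion value ("" when absent), $2 = "strict" to require v2.
reader_decision() {
  case "$1" in
    "")  echo "REPLAY_MANIFEST_MISSING_VERSION" ;;
    2.0) echo "accept" ;;
    1.0) if [ "${2:-}" = "strict" ]; then
           echo "REPLAY_MANIFEST_VERSION_MISMATCH"
         else
           echo "accept-with-warning"
         fi ;;
    *)   echo "REPLAY_MANIFEST_VERSION_MISMATCH" ;;  # assumed for unknown versions
  esac
}
```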
|
||||
---
|
||||
|
||||
## 6. Test Fixture Locations
|
||||
|
||||
```
|
||||
tests/Replay/
|
||||
fixtures/
|
||||
manifest-v2-minimal.json
|
||||
manifest-v2-with-runtime.json
|
||||
manifest-v2-sorted.json
|
||||
manifest-v2-code-id-coverage.json
|
||||
invalid/
|
||||
manifest-missing-version.json
|
||||
manifest-unsorted.json
|
||||
manifest-missing-hashalg.json
|
||||
golden/
|
||||
manifest-v2-canonical.golden.json
|
||||
manifest-v2-hash.golden.txt
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Implementation Checklist
|
||||
|
||||
- [ ] Update `ReplayManifest` record with v2 fields
|
||||
- [ ] Add `hashAlg` to all hash-bearing types
|
||||
- [ ] Implement sorting in `ReachabilityReplayWriter`
|
||||
- [ ] Add CAS registration validation
|
||||
- [ ] Create test fixtures
|
||||
- [ ] Update `DETERMINISTIC_REPLAY.md` section 3
|
||||
- [ ] Wire into RecordModeService
|
||||
|
||||
---
|
||||
|
||||
_Last updated: 2025-12-13. See Sprint 0401 GAP-REP-004 for implementation._
|
||||
@@ -0,0 +1,27 @@
|
||||
# Replay Retention Schema Freeze - 2025-12-10
|
||||
|
||||
## Why
|
||||
- Unblock EvidenceLocker replay ingestion tasks (EVID-REPLAY-187-001) and downstream CLI/runbook work by freezing a retention declaration schema.
|
||||
- Keep outputs deterministic and tenant-scoped while offline/air-gap friendly.
|
||||
|
||||
## Scope & Decisions
|
||||
- Schema path: `docs/schemas/replay-retention.schema.json`.
|
||||
- Fields:
|
||||
- `retention_policy_id` (string, stable ID for policy version).
|
||||
- `tenant_id` (string, required).
|
||||
- `dataset` (string; e.g., evidence_bundle, replay_log, advisory_payload).
|
||||
- `bundle_type` (enum: portable_bundle, sealed_bundle, replay_log, advisory_payload).
|
||||
- `retention_days` (int 1-3650).
|
||||
- `legal_hold` (bool).
|
||||
- `purge_after` (ISO-8601 UTC; derived from ingest + retention_days unless legal_hold=true).
|
||||
- `checksum` (algorithm: sha256/sha512, value hex).
|
||||
- `created_at` (ISO-8601 UTC).
|
||||
- Determinism: no additionalProperties; checksum recorded for audit; UTC timestamps only.
|
||||
- Tenant isolation: tenant_id mandatory; policy IDs may be per-tenant.
|
||||
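The `purge_after` derivation can be sketched in shell. This is an illustration under stated assumptions: the ingest time is passed in explicitly, a legal hold results in no purge date being printed (the schema note does not say what value the field takes in that case), and `date` accepts `-d 'YYYY-MM-DD hh:mm:ss'` and `-d @<epoch>` as GNU coreutils does. The function name `purge_after` is illustrative.

```bash
# Derive purge_after from an ingest timestamp and retention_days.
# Arguments: ingest (ISO-8601 UTC), retention_days (1-3650), legal_hold (true|false)
purge_after() (
  ingest=$1
  days=$2
  legal_hold=$3

  # Under legal hold no purge date applies; print nothing.
  [ "$legal_hold" = "true" ] && exit 0

  # Enforce the documented retention_days range.
  { [ "$days" -ge 1 ] && [ "$days" -le 3650 ]; } 2>/dev/null || exit 1

  # Normalize "2025-12-10T00:00:00Z" to "2025-12-10 00:00:00" for date parsing.
  plain=$(printf '%s' "$ingest" | tr 'T' ' ' | tr -d 'Z')
  epoch=$(date -u -d "$plain" +%s) || exit 1

  # Add whole days in seconds and format back as UTC ISO-8601.
  date -u -d "@$((epoch + days * 86400))" +%Y-%m-%dT%H:%M:%SZ
)
```

For instance, an ingest time of `2025-12-10T00:00:00Z` with `retention_days` of 30 and no legal hold gives a purge date 30 days later, in UTC.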
|
||||
## Impacted Tasks
|
||||
- EVID-REPLAY-187-001, CLI-REPLAY-187-002, RUNBOOK-REPLAY-187-004 are unblocked on retention shape; implementation still required in corresponding modules.
|
||||
|
||||
## Next Steps
|
||||
- Wire schema validation in EvidenceLocker ingest and CLI replay commands.
|
||||
- Document retention defaults and legal-hold overrides in `docs/operations/runbooks/replay_ops.md`.
|
||||
44
docs/modules/replay/schemas/replay_schema.md
Normal file
@@ -0,0 +1,44 @@
|
||||
# Replay PostgreSQL Schema
|
||||
|
||||
Status: draft · applies to net10 replay pipeline (Sprint 0185)
|
||||
|
||||
## Tables
|
||||
|
||||
### replay_runs
|
||||
- **id**: scan UUID (string, primary key)
|
||||
- **manifest_hash**: `sha256:<hex>` (unique)
|
||||
- **status**: `pending|verified|failed|replayed`
|
||||
- **created_at / updated_at**: UTC ISO-8601
|
||||
- **signatures**: JSONB `[{ profile, verified }]` (multi-profile DSSE verification)
|
||||
- **outputs**: JSONB `{ sbom, findings, vex?, log? }` (all SHA-256 digests)
|
||||
|
||||
**Indexes**
|
||||
- `runs_manifest_hash_unique`: `(manifest_hash)` (unique)
|
||||
- `runs_status_created_at`: `(status, created_at DESC)`
|
||||
|
||||
### replay_bundles
|
||||
- **id**: bundle digest hex (no `sha256:` prefix)
|
||||
- **type**: `input|output|rootpack|reachability`
|
||||
- **size**: bytes
|
||||
- **location**: CAS URI `cas://replay/<prefix>/<digest>.tar.zst`
|
||||
- **created_at**: UTC ISO-8601
|
||||
|
||||
**Indexes**
|
||||
- `bundles_type`: `(type, created_at DESC)`
|
||||
- `bundles_location`: `(location)`
|
||||
|
||||
### replay_subjects
|
||||
- **id**: OCI image digest (`sha256:<hex>`)
|
||||
- **layers**: JSONB `[{ layer_digest, merkle_root, leaf_count }]`
|
||||
|
||||
**Indexes**
|
||||
- `subjects_layer_digest`: GIN index on `layers` for layer_digest lookups
|
||||
|
||||
## Determinism & constraints
|
||||
- All timestamps stored as UTC.
|
||||
- Digests are lowercase hex; CAS URIs must follow `cas://<prefix>/<shard>/<digest>.tar.zst` where `<shard>` = first two hex chars.
|
||||
- No external references; embed minimal metadata only (feed/policy hashes live in replay manifest).
|
||||
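The CAS URI rule in the constraints above is mechanical enough to sketch. The helper below lowercases the digest, rejects anything that is not plain hex (including a `sha256:` prefix), and derives the shard from the first two characters. `cas_uri` is a hypothetical name, and the choice to normalize case rather than reject uppercase input is an assumption.

```bash
# Build cas://<prefix>/<shard>/<digest>.tar.zst, where <shard> is the
# first two hex characters of the lowercase digest.
cas_uri() (
  prefix=$1
  digest=$(printf '%s' "$2" | tr 'A-F' 'a-f')

  # Only bare hex digests are accepted (no "sha256:" prefix).
  case "$digest" in
    ""|*[!0-9a-f]*) echo "digest must be hex" >&2; exit 1 ;;
  esac

  shard=$(printf '%s' "$digest" | cut -c1-2)
  printf 'cas://%s/%s/%s.tar.zst\n' "$prefix" "$shard" "$digest"
)
```

Sharding by the leading two characters spreads bundles across up to 256 sub-paths per prefix, which keeps any single directory or key range from growing without bound.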
|
||||
## Client models
|
||||
- Implemented in `src/__Libraries/StellaOps.Replay.Core/ReplayPostgresModels.cs` with matching index name constants (`ReplayIndexes`).
|
||||
- Serialization uses System.Text.Json with snake_case property naming; field names match table schema above.
|
||||