# Replay Verification

_Last updated: 2025-12-22. Owner: Scanner Guild._

This document describes the **replay verification** workflow that ensures reachability slices are reproducible and tamper-evident.

---

## 1. Overview

Replay verification answers: *"Given the same inputs, do we get the exact same slice?"*

This is critical for:
- **Audit trails**: Prove analysis results are genuine
- **Tamper detection**: Detect modified inputs or results
- **Debugging**: Identify sources of non-determinism
- **Compliance**: Demonstrate reproducible security analysis

---

## 2. Replay Workflow

```
┌─────────────────┐     ┌──────────────────┐     ┌───────────────────┐
│    Original     │     │    Rehydrate     │     │     Recompute     │
│     Slice       │────►│     Inputs       │────►│       Slice       │
│  (with digest)  │     │    from CAS      │     │      (fresh)      │
└─────────────────┘     └──────────────────┘     └───────────────────┘
                                                           │
                                                           ▼
                                                 ┌───────────────────┐
                                                 │      Compare      │
                                                 │   byte-for-byte   │
                                                 └───────────────────┘
                                                           │
                                             ┌─────────────┴─────────────┐
                                             ▼                           ▼
                                        ┌──────────┐                ┌──────────┐
                                        │  MATCH   │                │ MISMATCH │
                                        │    ✓     │                │  + diff  │
                                        └──────────┘                └──────────┘
```
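
The three stages in the diagram can be sketched as a small function. This is only an illustration under stated assumptions: `ReplayResult`, `replay`, `rehydrate`, and `recompute` are hypothetical names, and SHA-256 from the standard library stands in for the `blake3` digests used in this document's examples.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReplayResult:
    match: bool
    original_digest: str
    recomputed_digest: str


def digest_of(data: bytes) -> str:
    # Stand-in digest: the examples in this document use blake3, which is
    # not in the standard library, so SHA-256 is used here instead.
    return "sha256:" + hashlib.sha256(data).hexdigest()


def replay(original_bytes: bytes,
           rehydrate: Callable[[], dict],
           recompute: Callable[[dict], bytes]) -> ReplayResult:
    inputs = rehydrate()              # stage 1: fetch the inputs from CAS
    fresh_bytes = recompute(inputs)   # stage 2: run the analysis again
    return ReplayResult(
        match=fresh_bytes == original_bytes,  # stage 3: byte-for-byte compare
        original_digest=digest_of(original_bytes),
        recomputed_digest=digest_of(fresh_bytes),
    )
```

Passing the two stages in as callables keeps the comparison logic independent of how CAS access and slice computation are actually implemented.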

---

## 3. API Reference

### 3.1 Replay Endpoint

```http
POST /api/slices/replay
Content-Type: application/json

{
  "sliceDigest": "blake3:a1b2c3d4..."
}
```

### 3.2 Response Format

**Match Response (200 OK)**:
```json
{
  "match": true,
  "originalDigest": "blake3:a1b2c3d4...",
  "recomputedDigest": "blake3:a1b2c3d4...",
  "replayedAt": "2025-12-22T10:00:00Z",
  "inputsVerified": true
}
```

**Mismatch Response (200 OK)**:
```json
{
  "match": false,
  "originalDigest": "blake3:a1b2c3d4...",
  "recomputedDigest": "blake3:e5f6g7h8...",
  "replayedAt": "2025-12-22T10:00:00Z",
  "diff": {
    "missingNodes": ["node:5"],
    "extraNodes": ["node:6"],
    "missingEdges": [{"from": "node:1", "to": "node:5"}],
    "extraEdges": [{"from": "node:1", "to": "node:6"}],
    "verdictDiff": {
      "original": "unreachable",
      "recomputed": "reachable"
    },
    "confidenceDiff": {
      "original": 0.95,
      "recomputed": 0.72
    }
  },
  "possibleCauses": [
    "Input graph may have been modified",
    "Analyzer version mismatch: 1.2.0 vs 1.2.1",
    "Feed version changed: nvd-2025-12-20 vs nvd-2025-12-22"
  ]
}
```

**Error Response (404 Not Found)**:
```json
{
  "error": "slice_not_found",
  "message": "Slice with digest blake3:a1b2c3d4... not found in CAS",
  "sliceDigest": "blake3:a1b2c3d4..."
}
```
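
Because a match and a mismatch both return `200 OK`, a client has to branch on the `match` field rather than on the status code alone. The following is a minimal sketch of that handling; `summarize_replay` is a hypothetical helper, and it reads only the fields shown in the examples above.

```python
import json


def summarize_replay(status_code: int, body: str) -> str:
    """Turn a replay response into a one-line summary."""
    payload = json.loads(body)

    if status_code == 404:
        return f"error: {payload['error']} ({payload['sliceDigest']})"

    # Match and mismatch share status 200, so inspect the "match" flag.
    if payload["match"]:
        return f"match: {payload['recomputedDigest']}"

    diff = payload.get("diff", {})
    parts = [
        f"{len(diff.get(field, []))} {label}"
        for field, label in (
            ("missingNodes", "missing nodes"),
            ("extraNodes", "extra nodes"),
            ("missingEdges", "missing edges"),
            ("extraEdges", "extra edges"),
        )
    ]
    verdict = diff.get("verdictDiff")
    if verdict:
        parts.append(f"verdict {verdict['original']} -> {verdict['recomputed']}")
    return "mismatch: " + ", ".join(parts)
```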

---

## 4. Input Rehydration

Every input must be CAS-addressed so that replay can fetch exactly what the original run used.

### 4.1 Required Inputs

| Input | CAS Key | Description |
|-------|---------|-------------|
| Graph | `cas://graphs/{digest}` | Full RichGraph JSON |
| Binaries | `cas://binaries/{digest}` | Binary file hashes |
| SBOM | `cas://sboms/{digest}` | CycloneDX/SPDX document |
| Policy | `cas://policies/{digest}` | Policy DSL |
| Feeds | `cas://feeds/{version}` | Advisory feed snapshot |
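
As an illustration of the key layout in the table, the sketch below builds a CAS key from an input kind and its reference. `CAS_PREFIXES` and `cas_key` are hypothetical names introduced for this example, not part of any documented API.

```python
# Path segment for each input kind, taken from the table above.
CAS_PREFIXES = {
    "graph": "graphs",
    "binaries": "binaries",
    "sbom": "sboms",
    "policy": "policies",
    "feeds": "feeds",
}


def cas_key(kind: str, ref: str) -> str:
    """Build a CAS key; `ref` is a digest, or a version for feeds."""
    if kind not in CAS_PREFIXES:
        raise ValueError(f"unknown input kind: {kind!r}")
    if not ref:
        raise ValueError("empty digest or version")
    return f"cas://{CAS_PREFIXES[kind]}/{ref}"
```

For example, `cas_key("graph", "blake3:xyz...")` yields `cas://graphs/blake3:xyz...`. Note that feeds are keyed by version rather than by digest.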

### 4.2 Manifest Contents

```json
{
  "manifest": {
    "analyzerVersion": "scanner.native:1.2.0",
    "rulesetHash": "sha256:abc123...",
    "feedVersions": {
      "nvd": "2025-12-20",
      "osv": "2025-12-20",
      "ghsa": "2025-12-20"
    },
    "createdAt": "2025-12-22T10:00:00Z",
    "toolchain": "iced-x86:1.21.0",
    "environment": {
      "os": "linux",
      "arch": "x86_64"
    }
  }
}
```

---

## 5. Determinism Requirements

Byte-for-byte reproducibility depends on the following serialization rules.

### 5.1 JSON Canonicalization

```
1. Keys sorted alphabetically at all levels
2. No whitespace (compact JSON)
3. UTF-8 encoding
4. Lowercase hex for all hashes
5. Numbers: no trailing zeros, scientific notation for large values
```
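
Rules 1 to 3 map directly onto options of Python's standard `json` module, as sketched below. `canonical_json` is a hypothetical helper name. The sketch does not enforce rule 4 (hash casing) or rule 5 (number formatting); those would need handling before or after serialization.

```python
import json


def canonical_json(value) -> bytes:
    """Serialize with sorted keys, no whitespace, UTF-8 (rules 1 to 3 only)."""
    text = json.dumps(
        value,
        sort_keys=True,          # rule 1: keys sorted at every nesting level
        separators=(",", ":"),   # rule 2: no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
        allow_nan=False,         # NaN/Infinity are not valid JSON; fail loudly
    )
    return text.encode("utf-8")  # rule 3: UTF-8 bytes
```

Note that `sort_keys=True` orders keys by Unicode code point, which matches alphabetical order only for plain ASCII keys of the same case.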

### 5.2 Graph Ordering

```
Nodes: sorted by symbolId (lexicographic)
Edges: sorted by (from, to) tuple (lexicographic)
Paths: sorted by first node, then path length
```
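
A minimal sketch of these three orderings, assuming nodes and edges are plain dictionaries with `symbolId`, `from`, and `to` keys and that each path is a non-empty list of node IDs. These shapes and the name `order_graph` are assumptions made for illustration.

```python
def order_graph(nodes, edges, paths):
    """Return (nodes, edges, paths) in the order described above."""
    sorted_nodes = sorted(nodes, key=lambda n: n["symbolId"])
    sorted_edges = sorted(edges, key=lambda e: (e["from"], e["to"]))
    # First node, then path length. Paths that tie on both keep their
    # input order, because Python's sort is stable.
    sorted_paths = sorted(paths, key=lambda p: (p[0], len(p)))
    return sorted_nodes, sorted_edges, sorted_paths
```

The rules above do not say how to break ties between paths with the same first node and length, so a real implementation would likely need a further key to be fully deterministic.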

### 5.3 Timestamp Handling

```
All timestamps: UTC, ISO-8601, with 'Z' suffix
Example: "2025-12-22T10:00:00Z"
No milliseconds unless significant
```
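
One way to produce this format with the standard library is sketched below. `format_timestamp` is a hypothetical helper, and it reads "unless significant" as "emit milliseconds only when they are non-zero", which is an assumption.

```python
from datetime import datetime, timezone


def format_timestamp(moment: datetime) -> str:
    """Format an aware datetime as UTC ISO-8601 with a 'Z' suffix."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: a timezone is required")
    utc = moment.astimezone(timezone.utc)
    base = utc.strftime("%Y-%m-%dT%H:%M:%S")
    millis = utc.microsecond // 1000
    if millis:
        # Assumption: milliseconds are "significant" when non-zero.
        return f"{base}.{millis:03d}Z"
    return base + "Z"
```

Rejecting naive datetimes avoids silently interpreting local time as UTC, which would be a source of non-determinism across machines.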

### 5.4 Floating Point

```
Confidence values: round to 6 decimal places
Example: 0.950000, not 0.95 or 0.9500001
```
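
Fixed six-decimal output can be produced with Python's format specification, as in the sketch below. `format_confidence` is a hypothetical helper that returns the textual form. How this fixed-width form interacts with the "no trailing zeros" number rule in section 5.1 is not stated here, so treat the sketch as covering confidence values only.

```python
def format_confidence(value: float) -> str:
    """Render a confidence value with exactly six decimal places."""
    return f"{value:.6f}"
```

For instance, `format_confidence(0.95)` and `format_confidence(0.9500001)` both give `"0.950000"`. A standard JSON encoder would write `0.95` for the first value, so the serializer has to emit this text directly rather than pass a float through.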
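
---

## 6. Diff Computation

When slices don't match, the replay computes a structured diff and a list of possible causes.

### 6.1 Diff Algorithm

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SliceDiff:
    missing_nodes: list = field(default_factory=list)
    extra_nodes: list = field(default_factory=list)
    missing_edges: list = field(default_factory=list)
    extra_edges: list = field(default_factory=list)
    verdict_diff: Optional[dict] = None


def edge_key(edge):
    # "from" is a reserved word in Python, so read it with getattr.
    return (getattr(edge, "from"), edge.to)


def compute_diff(original, recomputed):
    diff = SliceDiff()

    # Node diff: IDs present on one side only (sorted for stable output)
    orig_nodes = {n.id for n in original.subgraph.nodes}
    new_nodes = {n.id for n in recomputed.subgraph.nodes}
    diff.missing_nodes = sorted(orig_nodes - new_nodes)
    diff.extra_nodes = sorted(new_nodes - orig_nodes)

    # Edge diff: compare (from, to) pairs
    orig_edges = {edge_key(e) for e in original.subgraph.edges}
    new_edges = {edge_key(e) for e in recomputed.subgraph.edges}
    diff.missing_edges = sorted(orig_edges - new_edges)
    diff.extra_edges = sorted(new_edges - orig_edges)

    # Verdict diff: recorded only when the status changed
    if original.verdict.status != recomputed.verdict.status:
        diff.verdict_diff = {
            "original": original.verdict.status,
            "recomputed": recomputed.verdict.status,
        }

    return diff
```
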
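### 6.2 Cause Analysis

```python
def analyze_causes(original, recomputed, manifest):
    """Return likely reasons for a mismatch, based on the stored manifest.

    current_version(), current_feed_versions() and fetch_graph_digest() are
    assumed to be provided by the surrounding replay service.
    """
    causes = []

    running = current_version()
    if manifest.analyzerVersion != running:
        causes.append(f"Analyzer version mismatch: {manifest.analyzerVersion} vs {running}")

    feeds = current_feed_versions()
    for name, version in sorted(manifest.feedVersions.items()):
        if feeds.get(name) != version:
            causes.append(f"Feed version changed: {name}-{version} vs {name}-{feeds.get(name)}")

    if original.inputs.graphDigest != fetch_graph_digest():
        causes.append("Input graph may have been modified")

    return causes
```
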
---

## 7. CLI Usage

### 7.1 Replay Command

```bash
# Replay and verify a slice
stella slice replay --digest blake3:a1b2c3d4...

# Output:
# ✓ Slice verified: digest matches
#   Original:   blake3:a1b2c3d4...
#   Recomputed: blake3:a1b2c3d4...
```

### 7.2 Verbose Mode

```bash
stella slice replay --digest blake3:a1b2c3d4... --verbose

# Output:
# Fetching slice from CAS...
# Rehydrating inputs:
#   - Graph: cas://graphs/blake3:xyz... ✓
#   - SBOM: cas://sboms/sha256:abc... ✓
#   - Policy: cas://policies/sha256:def... ✓
# Recomputing slice...
# Comparing results...
# ✓ Match confirmed
```

### 7.3 Mismatch Handling

```bash
stella slice replay --digest blake3:a1b2c3d4...

# Output:
# ✗ Slice mismatch detected!
#
# Differences:
#   Nodes: 1 missing, 0 extra
#   Edges: 1 missing, 1 extra
#   Verdict: unreachable → reachable
#
# Possible causes:
#   - Input graph may have been modified
#   - Analyzer version: 1.2.0 → 1.2.1
#
# Run with --diff-file to export detailed diff
```

---

## 8. Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| `slice_not_found` | Slice not in CAS | Check digest, verify upload |
| `input_not_found` | Referenced input missing | Reupload inputs |
| `version_mismatch` | Analyzer version differs | Pin version or accept drift |
| `feed_stale` | Feed snapshot unavailable | Use latest or pin version |

---

## 9. Security Considerations

1. **Input integrity**: Verify CAS digests before replay
2. **Audit logging**: Log all replay attempts
3. **Rate limiting**: Prevent replay DoS
4. **Access control**: Same permissions as slice access
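
The input-integrity check in item 1 amounts to re-hashing each fetched blob and comparing the result with the digest embedded in its CAS key. The sketch below shows this for `sha256:` keys only, since SHA-256 is available in the standard library; `verify_cas_blob` is a hypothetical helper, and `blake3:` keys would need a BLAKE3 implementation that is not shown here.

```python
import hashlib


def verify_cas_blob(cas_key: str, blob: bytes) -> bool:
    """Check that `blob` hashes to the digest at the end of `cas_key`."""
    # Key shape assumed from section 4.1: cas://<kind>/<algorithm>:<hex>
    digest = cas_key.rpartition("/")[2]
    algorithm, _, expected_hex = digest.partition(":")
    if algorithm != "sha256":
        # Only SHA-256 is handled in this sketch.
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(blob).hexdigest() == expected_hex.lower()
```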

---

## 10. Performance Targets

| Metric | Target |
|--------|--------|
| Replay latency | <5s for typical slice |
| Input fetch | <2s (parallel CAS fetches) |
| Comparison | <100ms |
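
The input-fetch target assumes CAS reads run in parallel. A minimal sketch of that pattern with a thread pool follows; `fetch_all` and the `fetch` callable are hypothetical, and no timeout or retry handling is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(keys, fetch, max_workers=8):
    """Fetch every CAS key concurrently and return {key: blob}."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order; an exception raised by any
        # fetch is re-raised here when its result is consumed.
        blobs = list(pool.map(fetch, keys))
    return dict(zip(keys, blobs))
```

Threads suit this case because the work is I/O-bound: total wall time approaches that of the slowest single fetch rather than the sum of all fetches.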

---

## 11. Related Documentation

- [Slice Schema](./slice-schema.md)
- [Binary Reachability Schema](./binary-reachability-schema.md)
- [Determinism Requirements](../contracts/determinism.md)
- [CAS Architecture](../modules/platform/cas.md)

---

_Created: 2025-12-22. See Sprint 3820 for implementation details._