# Score Replay Operations Runbook

> **Version**: 1.0.0
> **Sprint**: 3500.0004.0004
> **Last Updated**: 2025-12-20

This runbook covers operational procedures for Score Replay, including deterministic score computation verification, proof bundle validation, and troubleshooting replay discrepancies.

---

## Table of Contents

1. [Overview](#1-overview)
2. [Score Replay Operations](#2-score-replay-operations)
3. [Determinism Verification](#3-determinism-verification)
4. [Proof Bundle Management](#4-proof-bundle-management)
5. [Troubleshooting](#5-troubleshooting)
6. [Monitoring & Alerting](#6-monitoring--alerting)
7. [Escalation Procedures](#7-escalation-procedures)

---

## 1. Overview

### What is Score Replay?

Score Replay is the ability to re-execute a vulnerability score computation using the exact same inputs (SBOM, rules, policies, feeds) that were used in the original scan. This provides:

- **Auditability**: Prove that a score was computed correctly
- **Determinism verification**: Confirm that identical inputs produce identical outputs
- **Compliance evidence**: Generate proof bundles for regulatory requirements
- **Dispute resolution**: Verify contested scan results

### Key Concepts

| Term | Definition |
|------|------------|
| **Manifest** | Content-addressed record of all scoring inputs (SBOM hash, rules hash, policy hash, feed hash) |
| **Proof Bundle** | Signed attestation containing manifest, score, and Merkle proof |
| **Root Hash** | Merkle tree root computed from all input hashes |
| **DSSE Envelope** | Dead Simple Signing Envelope containing the signed proof |
| **Freeze Timestamp** | Optional timestamp to replay scoring at a specific point in time |

### Architecture Components

| Component | Purpose | Location |
|-----------|---------|----------|
| Score Engine | Computes vulnerability scores | Scanner Worker |
| Manifest Store | Persists scoring manifests | `scanner.manifest` table |
| Proof Chain | Generates Merkle proofs | Attestor library |
| Signer | Signs proof bundles (DSSE) | Signer service |

---

## 2. Score Replay Operations

### 2.1 Triggering a Score Replay

#### Via CLI

```bash
# Basic replay
stella score replay --scan <scan-id>

# Replay with specific manifest
stella score replay --scan <scan-id> --manifest-hash sha256:abc123...

# Replay with frozen timestamp (for determinism testing)
stella score replay --scan <scan-id> --freeze 2025-01-15T00:00:00Z

# Output as JSON
stella score replay --scan <scan-id> --output json
```

#### Via API

```bash
# POST /api/v1/scanner/score/{scanId}/replay
curl -X POST "https://scanner.stellaops.local/api/v1/scanner/score/scan-123/replay" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "manifestHash": "sha256:abc123...",
    "freezeTimestamp": "2025-01-15T00:00:00Z"
  }'
```

#### Expected Response

```json
{
  "scanId": "scan-123",
  "score": 7.5,
  "rootHash": "sha256:def456...",
  "bundleUri": "/api/v1/scanner/scans/scan-123/proofs/sha256:def456...",
  "manifestHash": "sha256:abc123...",
  "replayedAt": "2025-01-16T10:30:00Z",
  "deterministic": true
}
```

### 2.2 Retrieving Proof Bundles

#### Via CLI

```bash
# Get bundle for a scan
stella score bundle --scan <scan-id>

# Download bundle to file
stella score bundle --scan <scan-id> --output bundle.tar.gz
```

#### Via API

```bash
# GET /api/v1/scanner/score/{scanId}/bundle
curl "https://scanner.stellaops.local/api/v1/scanner/score/scan-123/bundle" \
  -H "Authorization: Bearer $TOKEN" \
  -o bundle.tar.gz
```

### 2.3 Verifying Score Integrity

#### Via CLI

```bash
# Verify against expected root hash
stella score verify --scan <scan-id> --root-hash sha256:def456...

# Verify downloaded bundle
stella proof verify --bundle bundle.tar.gz
```

#### Via API

```bash
# POST /api/v1/scanner/score/{scanId}/verify
curl -X POST "https://scanner.stellaops.local/api/v1/scanner/score/scan-123/verify" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"expectedRootHash": "sha256:def456..."}'
```

---

## 3. Determinism Verification

### 3.1 What Affects Determinism?

Score computation is deterministic when:

| Input | Requirement |
|-------|-------------|
| SBOM | Identical content (same hash) |
| Rules | Same rule version and configuration |
| Policy | Same policy document |
| Feeds | Same feed snapshot (freeze timestamp) |
| Ordering | Findings sorted deterministically |

### 3.2 Running Determinism Checks

```bash
# Run replay twice and compare
REPLAY1=$(stella score replay --scan "$SCAN_ID" --output json)
REPLAY2=$(stella score replay --scan "$SCAN_ID" --output json)

# Extract root hashes
HASH1=$(echo "$REPLAY1" | jq -r '.rootHash')
HASH2=$(echo "$REPLAY2" | jq -r '.rootHash')

# Compare
if [ "$HASH1" = "$HASH2" ]; then
  echo "✓ Determinism verified: $HASH1"
else
  echo "✗ Non-deterministic! $HASH1 != $HASH2"
  exit 1
fi
```

### 3.3 Common Determinism Issues

| Issue | Cause | Resolution |
|-------|-------|------------|
| Different root hash | Feed data changed between replays | Use `--freeze` timestamp |
| Score drift | Rule version mismatch | Pin rules version in manifest |
| Ordering differences | Non-stable sort in findings | Check Scanner version (fixed in v2.1+) |
| Timestamp in output | Current time in computation | Ensure frozen time mode |

### 3.4 Feed Freeze for Reproducibility

```bash
# Replay with feed state frozen to original scan time
stella score replay --scan "$SCAN_ID" \
  --freeze "$(stella scan show "$SCAN_ID" --output json | jq -r '.scannedAt')"
```

---

## 4. Proof Bundle Management

### 4.1 Bundle Contents

A proof bundle (`.tar.gz`) contains:

```
bundle/
├── manifest.json        # Input hashes and metadata
├── score.json           # Computed score and findings summary
├── merkle-proof.json    # Merkle tree with inclusion proofs
├── dsse-envelope.json   # Signed attestation (DSSE format)
└── certificate.pem      # Signing certificate (optional)
```

### 4.2 Inspecting Bundles

```bash
# Extract and view manifest
tar -xzf bundle.tar.gz
jq . bundle/manifest.json

# Verify DSSE signature
stella proof verify --bundle bundle.tar.gz --verbose

# Check Merkle proof
stella proof spine --bundle bundle.tar.gz
```

### 4.3 Bundle Retention Policy

| Environment | Retention | Notes |
|-------------|-----------|-------|
| Production | 7 years | Regulatory compliance |
| Staging | 90 days | Testing purposes |
| Development | 30 days | Cleaned up automatically |

### 4.4 Archiving Bundles

```bash
# Export bundle to long-term storage
stella score bundle --scan "$SCAN_ID" --output "/archive/proofs/$SCAN_ID.tar.gz"

# Bulk export for compliance audit
stella score bundle-export \
  --since 2024-01-01 \
  --until 2024-12-31 \
  --output /archive/2024-proofs/
```

---

## 5. Troubleshooting

### 5.1 Replay Returns Different Score

**Symptoms**: Replayed score differs from original scan score.

**Diagnostic Steps**:

1. Check manifest integrity:
   ```bash
   stella scan show $SCAN_ID --output json | jq '.manifest'
   ```

2. Verify feed state:
   ```bash
   # Compare feed hashes
   stella score replay --scan $SCAN_ID --freeze $ORIGINAL_TIME --output json | jq '.manifestHash'
   ```

3. Check for rule updates:
   ```bash
   stella rules show --version --output json
   ```

**Resolution**:

- Use a `--freeze` timestamp matching the original scan
- Pin rule versions in policy
- Regenerate the manifest if inputs changed legitimately

### 5.2 Proof Verification Fails

**Symptoms**: `stella proof verify` returns validation errors.

**Diagnostic Steps**:

1. Check DSSE signature:
   ```bash
   stella proof verify --bundle bundle.tar.gz --verbose 2>&1 | grep -i signature
   ```

2. Verify certificate validity:
   ```bash
   openssl x509 -in bundle/certificate.pem -noout -dates
   ```

3. Check Merkle proof:
   ```bash
   stella proof spine --bundle bundle.tar.gz --verify
   ```

**Common Errors**:

| Error | Cause | Fix |
|-------|-------|-----|
| `SIGNATURE_INVALID` | Bundle tampered or wrong key | Re-download bundle |
| `CERTIFICATE_EXPIRED` | Signing cert expired | Check signing key rotation |
| `MERKLE_MISMATCH` | Root hash doesn't match | Verify correct bundle version |
| `MANIFEST_MISSING` | Incomplete bundle | Re-export from API |

### 5.3 Replay Timeout

**Symptoms**: Replay request times out or takes too long.

**Diagnostic Steps**:

1. Check scan size:
   ```bash
   stella scan show $SCAN_ID --output json | jq '.findingsCount'
   ```

2. Monitor replay progress:
   ```bash
   stella score replay --scan $SCAN_ID --verbose
   ```

**Resolution**:

- For large scans (>10k findings), increase the timeout
- Check Scanner Worker health
- Consider async replay for very large scans

### 5.4 Missing Manifest

**Symptoms**: `Manifest not found` error on replay.

**Diagnostic Steps**:

1. Verify scan exists:
   ```bash
   stella scan show $SCAN_ID
   ```

2. Check manifest table:
   ```sql
   SELECT * FROM scanner.manifest WHERE scan_id = 'scan-123';
   ```

**Resolution**:

- The manifest may have been purged (check retention policy)
- Restore from backup if available
- Re-run the scan if the original inputs are available

---

## 6. Monitoring & Alerting

### 6.1 Key Metrics

| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `score_replay_duration_ms` | Time to complete replay | p99 > 30s |
| `score_replay_determinism_failures` | Non-deterministic replays | > 0 |
| `proof_verification_failures` | Failed verifications | > 5/hour |
| `manifest_storage_size_bytes` | Manifest table size | > 100GB |

### 6.2 Grafana Dashboard Queries

```promql
# Replay latency
histogram_quantile(0.99,
  rate(score_replay_duration_ms_bucket[5m])
)

# Determinism failure rate
rate(score_replay_determinism_failures_total[1h])

# Proof verification success rate
sum(rate(proof_verification_success_total[1h]))
/
sum(rate(proof_verification_total[1h]))
```

### 6.3 Alert Rules

```yaml
groups:
  - name: score-replay
    rules:
      - alert: ScoreReplayLatencyHigh
        expr: histogram_quantile(0.99, rate(score_replay_duration_ms_bucket[5m])) > 30000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Score replay latency exceeds 30s at p99

      - alert: DeterminismFailure
        expr: increase(score_replay_determinism_failures_total[1h]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Non-deterministic score replay detected
```

---

## 7. Escalation Procedures

### 7.1 Escalation Matrix

| Severity | Condition | Response Time | Escalate To |
|----------|-----------|---------------|-------------|
| P1 - Critical | Determinism failure in production | 15 minutes | Platform Team Lead |
| P2 - High | Proof verification failures > 10/hour | 1 hour | Scanner Team |
| P3 - Medium | Replay latency degradation | 4 hours | Scanner Team |
| P4 - Low | Single replay failure | Next business day | Support Queue |

### 7.2 P1: Determinism Failure Response

1. **Immediate Actions** (0-15 min):
   - Capture affected scan IDs
   - Preserve original manifest data
   - Check for recent deployments

2. **Investigation** (15-60 min):
   - Compare input hashes between replays
   - Check feed synchronization status
   - Review rule engine logs

3. **Remediation**:
   - Roll back if deployment-related
   - Freeze feeds if data drift
   - Hotfix if code bug identified

### 7.3 Contacts

| Role | Contact | Availability |
|------|---------|--------------|
| Scanner Team Lead | scanner-lead@stellaops.io | Business hours |
| Platform On-Call | platform-oncall@stellaops.io | 24/7 |
| Security Team | security@stellaops.io | Business hours |

---

## Appendix A: SQL Queries

### Check Manifest History

```sql
SELECT
  scan_id,
  manifest_hash,
  sbom_hash,
  rules_hash,
  policy_hash,
  feed_hash,
  created_at
FROM scanner.manifest
WHERE scan_id = 'scan-123'
ORDER BY created_at DESC;
```

### Find Non-Deterministic Replays

```sql
SELECT
  scan_id,
  COUNT(DISTINCT root_hash) AS unique_hashes,
  MIN(replayed_at) AS first_replay,
  MAX(replayed_at) AS last_replay
FROM scanner.replay_log
GROUP BY scan_id
HAVING COUNT(DISTINCT root_hash) > 1;
```

### Proof Bundle Statistics

```sql
SELECT
  DATE_TRUNC('day', created_at) AS day,
  COUNT(*) AS bundles_created,
  AVG(bundle_size_bytes) AS avg_size,
  SUM(bundle_size_bytes) AS total_size
FROM scanner.proof_bundle
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY DATE_TRUNC('day', created_at)
ORDER BY day DESC;
```

---

## Appendix B: CLI Quick Reference

```bash
# Score Replay Commands
stella score replay --scan <scan-id>                       # Replay score computation
stella score replay --scan <scan-id> --freeze <timestamp>  # Replay with frozen time
stella score bundle --scan <scan-id>                       # Get proof bundle
stella score verify --scan <scan-id> --root-hash <hash>    # Verify score

# Proof Commands
stella proof verify --bundle <file>                        # Verify bundle file
stella proof verify --bundle <file> --offline              # Offline verification
stella proof spine --bundle <file>                         # Show Merkle spine

# Output Formats
--output json                                              # JSON output
--output table                                             # Table output (default)
--output yaml                                              # YAML output
```

---

## Revision History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-20 | Agent | Initial release |
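
---

## Appendix C: Determinism Check Helper (Sketch)

The two-run comparison in section 3.2 generalizes to any number of replays. The function below is a minimal sketch of that comparison only: `all_hashes_match` is a hypothetical helper name, not part of the `stella` CLI, and it assumes the caller has already collected root hashes (for example from the `.rootHash` field of the replay JSON output shown in section 2.1).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a stella command): succeeds only when every
# root hash passed as an argument is identical to the first one.
#
# Example wiring, assuming the replay command and jq are available:
#   hashes=()
#   for i in 1 2 3; do
#     hashes+=("$(stella score replay --scan "$SCAN_ID" --output json | jq -r '.rootHash')")
#   done
#   all_hashes_match "${hashes[@]}"
all_hashes_match() {
  # A single hash cannot demonstrate determinism, so treat it as a usage error.
  if [ "$#" -lt 2 ]; then
    echo "need at least two hashes to compare" >&2
    return 2
  fi

  local first="$1" h
  shift

  # Compare every remaining hash against the first; stop at the first mismatch.
  for h in "$@"; do
    if [ "$h" != "$first" ]; then
      echo "Non-deterministic: $first != $h" >&2
      return 1
    fi
  done

  echo "Determinism verified: $first"
}
```

Returning distinct exit codes (1 for a mismatch, 2 for too few inputs) lets a calling script tell a genuine determinism failure apart from a usage error before deciding whether to escalate under section 7.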