Files
git.stella-ops.org/docs/airgap/offline-parity-verification.md
StellaOps Bot b058dbe031 up
2025-12-14 23:20:14 +02:00

519 lines
15 KiB
Markdown

# Offline Parity Verification
**Last Updated:** 2025-12-14
**Next Review:** 2026-03-14
---
## Overview
This document defines the methodology for verifying that StellaOps scanner produces **identical results** in offline/air-gapped environments compared to connected deployments. Parity verification ensures that security decisions made in disconnected environments are equivalent to those made with full network access.
---
## 1. PARITY VERIFICATION OBJECTIVES
### 1.1 Core Guarantees
| Guarantee | Description | Target |
|-----------|-------------|--------|
| **Bitwise Fidelity** | Scan outputs are byte-identical offline vs online | 100% |
| **Semantic Fidelity** | Same vulnerabilities, severities, and verdicts | 100% |
| **Temporal Parity** | Same results given identical feed snapshots | 100% |
| **Policy Parity** | Same pass/fail decisions with identical policies | 100% |
### 1.2 What Parity Does NOT Cover
- **Feed freshness**: Offline feeds may be hours/days behind live feeds (by design)
- **Network-only enrichment**: EPSS lookups, live KEV checks (graceful degradation applies)
- **Transparency log submission**: Rekor entries created only when connected
---
## 2. TEST METHODOLOGY
### 2.1 Environment Configuration
#### Connected Environment
```yaml
environment:
mode: connected
network: enabled
feeds:
sources: [osv, ghsa, nvd]
refresh: live
rekor: enabled
epss: enabled
timestamp_source: ntp
```
#### Offline Environment
```yaml
environment:
mode: offline
network: disabled
feeds:
sources: [local-bundle]
refresh: none
rekor: offline-snapshot
epss: bundled-cache
timestamp_source: frozen
timestamp_value: "2025-12-14T00:00:00Z"
```
### 2.2 Test Procedure
```
PARITY VERIFICATION PROCEDURE v1.0
══════════════════════════════════
PHASE 1: BUNDLE CAPTURE (Connected Environment)
─────────────────────────────────────────────────
1. Capture current feed state:
- Record feed version/digest
- Snapshot EPSS scores (top 1000 CVEs)
- Record KEV list state
2. Run connected scan:
stellaops scan --image <test-image> \
--format json \
--output connected-scan.json \
--receipt connected-receipt.json
3. Export offline bundle:
stellaops offline bundle export \
--feeds-snapshot \
--epss-cache \
--output parity-bundle-$(date +%Y%m%d).tar.zst
PHASE 2: OFFLINE SCAN (Air-Gapped Environment)
───────────────────────────────────────────────
1. Import bundle:
stellaops offline bundle import parity-bundle-*.tar.zst
2. Freeze clock to bundle timestamp:
export STELLAOPS_DETERMINISM_TIMESTAMP="2025-12-14T00:00:00Z"
3. Run offline scan:
stellaops scan --image <test-image> \
--format json \
--output offline-scan.json \
--receipt offline-receipt.json \
--offline-mode
PHASE 3: PARITY COMPARISON
──────────────────────────
1. Compare findings digests:
diff <(jq -S '.findings | sort_by(.id)' connected-scan.json) \
<(jq -S '.findings | sort_by(.id)' offline-scan.json)
2. Compare policy decisions:
diff <(jq -S '.policyDecision' connected-scan.json) \
<(jq -S '.policyDecision' offline-scan.json)
3. Compare receipt input hashes:
jq '.inputHash' connected-receipt.json
jq '.inputHash' offline-receipt.json
# MUST be identical if same bundle used
PHASE 4: RECORD RESULTS
───────────────────────
1. Generate parity report:
stellaops parity report \
--connected connected-scan.json \
--offline offline-scan.json \
--output parity-report-$(date +%Y%m%d).json
```
### 2.3 Test Image Matrix
Run parity tests against this representative image set:
| Image | Category | Expected Vulns | Notes |
|-------|----------|----------------|-------|
| `alpine:3.19` | Minimal | ~5 | Fast baseline |
| `debian:12-slim` | Standard | ~40 | OS package focus |
| `node:20-alpine` | Application | ~100 | npm + OS packages |
| `python:3.12` | Application | ~150 | pip + OS packages |
| `dotnet/aspnet:8.0` | Application | ~75 | NuGet + OS packages |
| `postgres:16-alpine` | Database | ~70 | Database + OS |
---
## 3. COMPARISON CRITERIA
### 3.1 Bitwise Comparison
Compare canonical JSON outputs after normalization:
```bash
# Canonical comparison script
canonical_compare() {
local connected="$1"
local offline="$2"
# Normalize both outputs
jq -S . "$connected" > /tmp/connected-canonical.json
jq -S . "$offline" > /tmp/offline-canonical.json
# Compute hashes
CONNECTED_HASH=$(sha256sum /tmp/connected-canonical.json | cut -d' ' -f1)
OFFLINE_HASH=$(sha256sum /tmp/offline-canonical.json | cut -d' ' -f1)
if [[ "$CONNECTED_HASH" == "$OFFLINE_HASH" ]]; then
echo "PASS: Bitwise identical"
return 0
else
echo "FAIL: Hash mismatch"
echo " Connected: $CONNECTED_HASH"
echo " Offline: $OFFLINE_HASH"
diff --color /tmp/connected-canonical.json /tmp/offline-canonical.json
return 1
fi
}
```
### 3.2 Semantic Comparison
When bitwise comparison fails, perform semantic comparison:
| Field | Comparison Rule | Allowed Variance |
|-------|-----------------|------------------|
| `findings[].id` | Exact match | None |
| `findings[].severity` | Exact match | None |
| `findings[].cvss.score` | Exact match | None |
| `findings[].cvss.vector` | Exact match | None |
| `findings[].affected` | Exact match | None |
| `findings[].reachability` | Exact match | None |
| `sbom.components[].purl` | Exact match | None |
| `sbom.components[].version` | Exact match | None |
| `metadata.timestamp` | Ignored | Expected to differ |
| `metadata.scanId` | Ignored | Expected to differ |
| `metadata.environment` | Ignored | Expected to differ |
### 3.3 Fields Excluded from Comparison
These fields are expected to differ and are excluded from parity checks:
```json
{
"excludedFields": [
"$.metadata.scanId",
"$.metadata.timestamp",
"$.metadata.hostname",
"$.metadata.environment.network",
"$.attestations[*].rekorEntry",
"$.metadata.epssEnrichedAt"
]
}
```
### 3.4 Graceful Degradation Fields
Fields that may be absent in offline mode (acceptable):
| Field | Online | Offline | Parity Rule |
|-------|--------|---------|-------------|
| `epssScore` | Present | May be stale/absent | Check if bundled |
| `kevStatus` | Live | Bundled snapshot | Compare against bundle date |
| `rekorEntry` | Present | Absent | Exclude from comparison |
| `fulcioChain` | Present | Absent | Exclude from comparison |
---
## 4. AUTOMATED PARITY CI
### 4.1 CI Workflow
```yaml
# .gitea/workflows/offline-parity.yml
name: Offline Parity Verification
on:
schedule:
- cron: '0 3 * * 1' # Weekly Monday 3am
workflow_dispatch:
jobs:
parity-test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup .NET
uses: actions/setup-dotnet@v4
with:
dotnet-version: '10.0.x'
- name: Set determinism environment
run: |
echo "TZ=UTC" >> $GITHUB_ENV
echo "LC_ALL=C" >> $GITHUB_ENV
echo "STELLAOPS_DETERMINISM_SEED=42" >> $GITHUB_ENV
- name: Capture connected baseline
run: scripts/parity/capture-connected.sh
- name: Export offline bundle
run: scripts/parity/export-bundle.sh
- name: Run offline scan (sandboxed)
run: |
docker run --network none \
-v $(pwd)/bundle:/bundle:ro \
-v $(pwd)/results:/results \
stellaops/scanner:latest \
scan --offline-mode --bundle /bundle
- name: Compare parity
run: scripts/parity/compare-parity.sh
- name: Upload parity report
uses: actions/upload-artifact@v4
with:
name: parity-report
path: results/parity-report-*.json
```
### 4.2 Parity Test Script
```bash
#!/bin/bash
# scripts/parity/compare-parity.sh
set -euo pipefail
CONNECTED_DIR="results/connected"
OFFLINE_DIR="results/offline"
REPORT_FILE="results/parity-report-$(date +%Y%m%d).json"
declare -a IMAGES=(
"alpine:3.19"
"debian:12-slim"
"node:20-alpine"
"python:3.12"
"mcr.microsoft.com/dotnet/aspnet:8.0"
"postgres:16-alpine"
)
TOTAL=0
PASSED=0
FAILED=0
RESULTS=()
for image in "${IMAGES[@]}"; do
TOTAL=$((TOTAL + 1))
image_hash=$(echo "$image" | sha256sum | cut -c1-12)
connected_file="${CONNECTED_DIR}/${image_hash}-scan.json"
offline_file="${OFFLINE_DIR}/${image_hash}-scan.json"
# Compare findings
connected_findings=$(jq -S '.findings | sort_by(.id) | map(del(.metadata.timestamp))' "$connected_file")
offline_findings=$(jq -S '.findings | sort_by(.id) | map(del(.metadata.timestamp))' "$offline_file")
connected_hash=$(echo "$connected_findings" | sha256sum | cut -d' ' -f1)
offline_hash=$(echo "$offline_findings" | sha256sum | cut -d' ' -f1)
if [[ "$connected_hash" == "$offline_hash" ]]; then
PASSED=$((PASSED + 1))
status="PASS"
else
FAILED=$((FAILED + 1))
status="FAIL"
fi
RESULTS+=("{\"image\":\"$image\",\"status\":\"$status\",\"connectedHash\":\"$connected_hash\",\"offlineHash\":\"$offline_hash\"}")
done
# Generate report
cat > "$REPORT_FILE" <<EOF
{
"reportDate": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
"bundleVersion": "$(cat bundle/version.txt)",
"summary": {
"total": $TOTAL,
"passed": $PASSED,
"failed": $FAILED,
"parityRate": $(echo "scale=4; $PASSED / $TOTAL" | bc)
},
"results": [$(IFS=,; echo "${RESULTS[*]}")]
}
EOF
echo "Parity Report: $PASSED/$TOTAL passed ($(echo "scale=2; $PASSED * 100 / $TOTAL" | bc)%)"
if [[ $FAILED -gt 0 ]]; then
echo "PARITY VERIFICATION FAILED"
exit 1
fi
```
---
## 5. PARITY RESULTS
### 5.1 Latest Verification Results
| Date | Bundle Version | Images Tested | Parity Rate | Notes |
|------|---------------|---------------|-------------|-------|
| 2025-12-14 | 2025.12.0 | 6 | 100% | Baseline established |
| — | — | — | — | — |
### 5.2 Historical Parity Tracking
```sql
-- Query for parity trend analysis
SELECT
date_trunc('week', report_date) AS week,
AVG(parity_rate) AS avg_parity,
MIN(parity_rate) AS min_parity,
COUNT(*) AS test_runs
FROM parity_reports
WHERE report_date >= NOW() - INTERVAL '90 days'
GROUP BY 1
ORDER BY 1 DESC;
```
### 5.3 Parity Database Schema
```sql
CREATE TABLE scanner.parity_reports (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
report_date TIMESTAMPTZ NOT NULL,
bundle_version TEXT NOT NULL,
bundle_digest TEXT NOT NULL,
total_images INT NOT NULL,
passed_images INT NOT NULL,
failed_images INT NOT NULL,
parity_rate NUMERIC(5,4) NOT NULL,
results JSONB NOT NULL,
ci_run_id TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_parity_reports_date ON scanner.parity_reports(report_date DESC);
CREATE INDEX idx_parity_reports_bundle ON scanner.parity_reports(bundle_version);
```
---
## 6. KNOWN LIMITATIONS
### 6.1 Acceptable Differences
| Scenario | Expected Behavior | Parity Impact |
|----------|-------------------|---------------|
| **EPSS scores** | Use bundled cache (may be stale) | None if cache bundled |
| **KEV status** | Use bundled snapshot | None if snapshot bundled |
| **Rekor entries** | Not created offline | Excluded from comparison |
| **Timestamp fields** | Differ by design | Excluded from comparison |
| **Network-only advisories** | Not available offline | Feed drift (documented) |
### 6.2 Known Edge Cases
1. **Race conditions during bundle capture**: If feeds update during bundle export, connected scan may include newer data than bundle. Mitigation: Capture bundle first, then run connected scan.
2. **Clock drift**: Offline environments with drifted clocks may compute different freshness scores. Mitigation: Always use frozen timestamps from bundle.
3. **Locale differences**: String sorting may differ across locales. Mitigation: Force `LC_ALL=C` in both environments.
4. **Floating point rounding**: CVSS v4 MacroVector interpolation may have micro-differences. Mitigation: Use integer basis points throughout.
### 6.3 Out of Scope
The following are intentionally NOT covered by parity verification:
- Real-time threat intelligence (requires network)
- Live vulnerability disclosure (requires network)
- Transparency log inclusion proofs (requires Rekor)
- OIDC/Fulcio certificate chains (requires network)
---
## 7. TROUBLESHOOTING
### 7.1 Common Parity Failures
| Symptom | Likely Cause | Resolution |
|---------|--------------|------------|
| Different vulnerability counts | Feed version mismatch | Verify bundle digest matches |
| Different CVSS scores | CVSS v4 calculation issue | Check MacroVector lookup parity |
| Different severity labels | Threshold configuration | Compare policy bundles |
| Missing EPSS data | EPSS cache not bundled | Re-export with `--epss-cache` |
| Different component counts | SBOM generation variance | Check analyzer versions |
### 7.2 Debug Commands
```bash
# Compare feed versions
stellaops feeds version --connected
stellaops feeds version --offline --bundle ./bundle
# Compare policy digests
stellaops policy digest --connected
stellaops policy digest --offline --bundle ./bundle
# Detailed diff of findings
stellaops parity diff \
--connected connected-scan.json \
--offline offline-scan.json \
--verbose
```
---
## 8. METRICS AND MONITORING
### 8.1 Prometheus Metrics
```
# Parity verification metrics
parity_test_total{status="pass|fail"}
parity_test_duration_seconds (histogram)
parity_bundle_age_seconds (gauge)
parity_findings_diff_count (gauge)
```
### 8.2 Alerting Rules
```yaml
groups:
- name: offline-parity
rules:
- alert: ParityTestFailed
expr: parity_test_total{status="fail"} > 0
for: 0m
labels:
severity: critical
annotations:
summary: "Offline parity test failed"
- alert: ParityRateDegraded
expr: |
(sum(parity_test_total{status="pass"}) /
sum(parity_test_total)) < 0.95
for: 1h
labels:
severity: warning
annotations:
summary: "Parity rate below 95%"
```
---
## 9. REFERENCES
- [Offline Update Kit (OUK)](../24_OFFLINE_KIT.md)
- [Offline and Air-Gap Technical Reference](../product-advisories/14-Dec-2025%20-%20Offline%20and%20Air-Gap%20Technical%20Reference.md)
- [Determinism and Reproducibility Technical Reference](../product-advisories/14-Dec-2025%20-%20Determinism%20and%20Reproducibility%20Technical%20Reference.md)
- [Determinism CI Harness](../modules/scanner/design/determinism-ci-harness.md)
- [Performance Baselines](../benchmarks/performance-baselines.md)
---
**Document Version**: 1.0
**Target Platform**: .NET 10, PostgreSQL >=16