git.stella-ops.org/docs/airgap/offline-parity-verification.md

# Offline Parity Verification

**Last Updated:** 2025-12-14
**Next Review:** 2026-03-14

---

## Overview

This document defines the methodology for verifying that StellaOps scanner produces **identical results** in offline/air-gapped environments compared to connected deployments. Parity verification ensures that security decisions made in disconnected environments are equivalent to those made with full network access.

---

## 1. PARITY VERIFICATION OBJECTIVES

### 1.1 Core Guarantees

| Guarantee | Description | Target |
|-----------|-------------|--------|
| **Bitwise Fidelity** | Scan outputs are byte-identical offline vs online | 100% |
| **Semantic Fidelity** | Same vulnerabilities, severities, and verdicts | 100% |
| **Temporal Parity** | Same results given identical feed snapshots | 100% |
| **Policy Parity** | Same pass/fail decisions with identical policies | 100% |

### 1.2 What Parity Does NOT Cover

- **Feed freshness**: Offline feeds may be hours/days behind live feeds (by design)
- **Network-only enrichment**: EPSS lookups, live KEV checks (graceful degradation applies)
- **Transparency log submission**: Rekor entries created only when connected

---

## 2. TEST METHODOLOGY

### 2.1 Environment Configuration

#### Connected Environment

```yaml
environment:
  mode: connected
  network: enabled
  feeds:
    sources: [osv, ghsa, nvd]
    refresh: live
  rekor: enabled
  epss: enabled
  timestamp_source: ntp
```

#### Offline Environment

```yaml
environment:
  mode: offline
  network: disabled
  feeds:
    sources: [local-bundle]
    refresh: none
  rekor: offline-snapshot
  epss: bundled-cache
  timestamp_source: frozen
  timestamp_value: "2025-12-14T00:00:00Z"
```

### 2.2 Test Procedure

```
PARITY VERIFICATION PROCEDURE v1.0
══════════════════════════════════

PHASE 1: BUNDLE CAPTURE (Connected Environment)
─────────────────────────────────────────────────
1. Capture current feed state:
   - Record feed version/digest
   - Snapshot EPSS scores (top 1000 CVEs)
   - Record KEV list state

2. Run connected scan:
   stellaops scan --image <test-image> \
     --format json \
     --output connected-scan.json \
     --receipt connected-receipt.json

3. Export offline bundle:
   stellaops offline bundle export \
     --feeds-snapshot \
     --epss-cache \
     --output parity-bundle-$(date +%Y%m%d).tar.zst

PHASE 2: OFFLINE SCAN (Air-Gapped Environment)
───────────────────────────────────────────────
1. Import bundle:
   stellaops offline bundle import parity-bundle-*.tar.zst

2. Freeze clock to bundle timestamp:
   export STELLAOPS_DETERMINISM_TIMESTAMP="2025-12-14T00:00:00Z"

3. Run offline scan:
   stellaops scan --image <test-image> \
     --format json \
     --output offline-scan.json \
     --receipt offline-receipt.json \
     --offline-mode

PHASE 3: PARITY COMPARISON
──────────────────────────
1. Compare findings digests:
   diff <(jq -S '.findings | sort_by(.id)' connected-scan.json) \
        <(jq -S '.findings | sort_by(.id)' offline-scan.json)

2. Compare policy decisions:
   diff <(jq -S '.policyDecision' connected-scan.json) \
        <(jq -S '.policyDecision' offline-scan.json)

3. Compare receipt input hashes:
   jq '.inputHash' connected-receipt.json
   jq '.inputHash' offline-receipt.json
   # MUST be identical if same bundle used

PHASE 4: RECORD RESULTS
───────────────────────
1. Generate parity report:
   stellaops parity report \
     --connected connected-scan.json \
     --offline offline-scan.json \
     --output parity-report-$(date +%Y%m%d).json
```

### 2.3 Test Image Matrix

Run parity tests against this representative image set:

| Image | Category | Expected Vulns | Notes |
|-------|----------|----------------|-------|
| `alpine:3.19` | Minimal | ~5 | Fast baseline |
| `debian:12-slim` | Standard | ~40 | OS package focus |
| `node:20-alpine` | Application | ~100 | npm + OS packages |
| `python:3.12` | Application | ~150 | pip + OS packages |
| `dotnet/aspnet:8.0` | Application | ~75 | NuGet + OS packages |
| `postgres:16-alpine` | Database | ~70 | Database + OS |

---

## 3. COMPARISON CRITERIA

### 3.1 Bitwise Comparison

Compare canonical JSON outputs after normalization:

```bash
# Canonical comparison script
canonical_compare() {
    local connected="$1"
    local offline="$2"

    # Normalize both outputs
    jq -S . "$connected" > /tmp/connected-canonical.json
    jq -S . "$offline" > /tmp/offline-canonical.json

    # Compute hashes
    CONNECTED_HASH=$(sha256sum /tmp/connected-canonical.json | cut -d' ' -f1)
    OFFLINE_HASH=$(sha256sum /tmp/offline-canonical.json | cut -d' ' -f1)

    if [[ "$CONNECTED_HASH" == "$OFFLINE_HASH" ]]; then
        echo "PASS: Bitwise identical"
        return 0
    else
        echo "FAIL: Hash mismatch"
        echo "  Connected: $CONNECTED_HASH"
        echo "  Offline:   $OFFLINE_HASH"
        diff --color /tmp/connected-canonical.json /tmp/offline-canonical.json
        return 1
    fi
}
```

### 3.2 Semantic Comparison

When bitwise comparison fails, perform semantic comparison:

| Field | Comparison Rule | Allowed Variance |
|-------|-----------------|------------------|
| `findings[].id` | Exact match | None |
| `findings[].severity` | Exact match | None |
| `findings[].cvss.score` | Exact match | None |
| `findings[].cvss.vector` | Exact match | None |
| `findings[].affected` | Exact match | None |
| `findings[].reachability` | Exact match | None |
| `sbom.components[].purl` | Exact match | None |
| `sbom.components[].version` | Exact match | None |
| `metadata.timestamp` | Ignored | Expected to differ |
| `metadata.scanId` | Ignored | Expected to differ |
| `metadata.environment` | Ignored | Expected to differ |

### 3.3 Fields Excluded from Comparison

These fields are expected to differ and are excluded from parity checks:

```json
{
  "excludedFields": [
    "$.metadata.scanId",
    "$.metadata.timestamp",
    "$.metadata.hostname",
    "$.metadata.environment.network",
    "$.attestations[*].rekorEntry",
    "$.metadata.epssEnrichedAt"
  ]
}
```

### 3.4 Graceful Degradation Fields

Fields that may be absent in offline mode (acceptable):

| Field | Online | Offline | Parity Rule |
|-------|--------|---------|-------------|
| `epssScore` | Present | May be stale/absent | Check if bundled |
| `kevStatus` | Live | Bundled snapshot | Compare against bundle date |
| `rekorEntry` | Present | Absent | Exclude from comparison |
| `fulcioChain` | Present | Absent | Exclude from comparison |

---

## 4. AUTOMATED PARITY CI

### 4.1 CI Workflow

```yaml
# .gitea/workflows/offline-parity.yml
name: Offline Parity Verification

on:
  schedule:
    - cron: '0 3 * * 1'  # Weekly Monday 3am
  workflow_dispatch:

jobs:
  parity-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'

      - name: Set determinism environment
        run: |
          echo "TZ=UTC" >> $GITHUB_ENV
          echo "LC_ALL=C" >> $GITHUB_ENV
          echo "STELLAOPS_DETERMINISM_SEED=42" >> $GITHUB_ENV

      - name: Capture connected baseline
        run: scripts/parity/capture-connected.sh

      - name: Export offline bundle
        run: scripts/parity/export-bundle.sh

      - name: Run offline scan (sandboxed)
        run: |
          docker run --network none \
            -v $(pwd)/bundle:/bundle:ro \
            -v $(pwd)/results:/results \
            stellaops/scanner:latest \
            scan --offline-mode --bundle /bundle

      - name: Compare parity
        run: scripts/parity/compare-parity.sh

      - name: Upload parity report
        uses: actions/upload-artifact@v4
        with:
          name: parity-report
          path: results/parity-report-*.json
```

### 4.2 Parity Test Script

```bash
#!/bin/bash
# scripts/parity/compare-parity.sh

set -euo pipefail

CONNECTED_DIR="results/connected"
OFFLINE_DIR="results/offline"
REPORT_FILE="results/parity-report-$(date +%Y%m%d).json"

declare -a IMAGES=(
    "alpine:3.19"
    "debian:12-slim"
    "node:20-alpine"
    "python:3.12"
    "mcr.microsoft.com/dotnet/aspnet:8.0"
    "postgres:16-alpine"
)

TOTAL=0
PASSED=0
FAILED=0
RESULTS=()

for image in "${IMAGES[@]}"; do
    TOTAL=$((TOTAL + 1))
    image_hash=$(echo "$image" | sha256sum | cut -c1-12)

    connected_file="${CONNECTED_DIR}/${image_hash}-scan.json"
    offline_file="${OFFLINE_DIR}/${image_hash}-scan.json"

    # Compare findings
    connected_findings=$(jq -S '.findings | sort_by(.id) | map(del(.metadata.timestamp))' "$connected_file")
    offline_findings=$(jq -S '.findings | sort_by(.id) | map(del(.metadata.timestamp))' "$offline_file")

    connected_hash=$(echo "$connected_findings" | sha256sum | cut -d' ' -f1)
    offline_hash=$(echo "$offline_findings" | sha256sum | cut -d' ' -f1)

    if [[ "$connected_hash" == "$offline_hash" ]]; then
        PASSED=$((PASSED + 1))
        status="PASS"
    else
        FAILED=$((FAILED + 1))
        status="FAIL"
    fi

    RESULTS+=("{\"image\":\"$image\",\"status\":\"$status\",\"connectedHash\":\"$connected_hash\",\"offlineHash\":\"$offline_hash\"}")
done

# Generate report
cat > "$REPORT_FILE" <<EOF
{
  "reportDate": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "bundleVersion": "$(cat bundle/version.txt)",
  "summary": {
    "total": $TOTAL,
    "passed": $PASSED,
    "failed": $FAILED,
    "parityRate": $(echo "scale=4; $PASSED / $TOTAL" | bc)
  },
  "results": [$(IFS=,; echo "${RESULTS[*]}")]
}
EOF

echo "Parity Report: $PASSED/$TOTAL passed ($(echo "scale=2; $PASSED * 100 / $TOTAL" | bc)%)"

if [[ $FAILED -gt 0 ]]; then
    echo "PARITY VERIFICATION FAILED"
    exit 1
fi
```

---

## 5. PARITY RESULTS

### 5.1 Latest Verification Results

| Date | Bundle Version | Images Tested | Parity Rate | Notes |
|------|---------------|---------------|-------------|-------|
| 2025-12-14 | 2025.12.0 | 6 | 100% | Baseline established |
| — | — | — | — | — |

### 5.2 Historical Parity Tracking

```sql
-- Query for parity trend analysis
SELECT
    date_trunc('week', report_date) AS week,
    AVG(parity_rate) AS avg_parity,
    MIN(parity_rate) AS min_parity,
    COUNT(*) AS test_runs
FROM parity_reports
WHERE report_date >= NOW() - INTERVAL '90 days'
GROUP BY 1
ORDER BY 1 DESC;
```

### 5.3 Parity Database Schema

```sql
CREATE TABLE scanner.parity_reports (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    report_date TIMESTAMPTZ NOT NULL,
    bundle_version TEXT NOT NULL,
    bundle_digest TEXT NOT NULL,
    total_images INT NOT NULL,
    passed_images INT NOT NULL,
    failed_images INT NOT NULL,
    parity_rate NUMERIC(5,4) NOT NULL,
    results JSONB NOT NULL,
    ci_run_id TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_parity_reports_date ON scanner.parity_reports(report_date DESC);
CREATE INDEX idx_parity_reports_bundle ON scanner.parity_reports(bundle_version);
```

---

## 6. KNOWN LIMITATIONS

### 6.1 Acceptable Differences

| Scenario | Expected Behavior | Parity Impact |
|----------|-------------------|---------------|
| **EPSS scores** | Use bundled cache (may be stale) | None if cache bundled |
| **KEV status** | Use bundled snapshot | None if snapshot bundled |
| **Rekor entries** | Not created offline | Excluded from comparison |
| **Timestamp fields** | Differ by design | Excluded from comparison |
| **Network-only advisories** | Not available offline | Feed drift (documented) |

### 6.2 Known Edge Cases

1. **Race conditions during bundle capture**: If feeds update during bundle export, connected scan may include newer data than bundle. Mitigation: Capture bundle first, then run connected scan.

2. **Clock drift**: Offline environments with drifted clocks may compute different freshness scores. Mitigation: Always use frozen timestamps from bundle.

3. **Locale differences**: String sorting may differ across locales. Mitigation: Force `LC_ALL=C` in both environments.

4. **Floating point rounding**: CVSS v4 MacroVector interpolation may have micro-differences. Mitigation: Use integer basis points throughout.

### 6.3 Out of Scope

The following are intentionally NOT covered by parity verification:

- Real-time threat intelligence (requires network)
- Live vulnerability disclosure (requires network)
- Transparency log inclusion proofs (requires Rekor)
- OIDC/Fulcio certificate chains (requires network)

---

## 7. TROUBLESHOOTING

### 7.1 Common Parity Failures

| Symptom | Likely Cause | Resolution |
|---------|--------------|------------|
| Different vulnerability counts | Feed version mismatch | Verify bundle digest matches |
| Different CVSS scores | CVSS v4 calculation issue | Check MacroVector lookup parity |
| Different severity labels | Threshold configuration | Compare policy bundles |
| Missing EPSS data | EPSS cache not bundled | Re-export with `--epss-cache` |
| Different component counts | SBOM generation variance | Check analyzer versions |

### 7.2 Debug Commands

```bash
# Compare feed versions
stellaops feeds version --connected
stellaops feeds version --offline --bundle ./bundle

# Compare policy digests
stellaops policy digest --connected
stellaops policy digest --offline --bundle ./bundle

# Detailed diff of findings
stellaops parity diff \
  --connected connected-scan.json \
  --offline offline-scan.json \
  --verbose
```

---

## 8. METRICS AND MONITORING

### 8.1 Prometheus Metrics

```
# Parity verification metrics
parity_test_total{status="pass|fail"}
parity_test_duration_seconds (histogram)
parity_bundle_age_seconds (gauge)
parity_findings_diff_count (gauge)
```

### 8.2 Alerting Rules

```yaml
groups:
  - name: offline-parity
    rules:
      - alert: ParityTestFailed
        expr: parity_test_total{status="fail"} > 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Offline parity test failed"

      - alert: ParityRateDegraded
        expr: |
          (sum(parity_test_total{status="pass"}) /
           sum(parity_test_total)) < 0.95
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Parity rate below 95%"
```

---

## 9. REFERENCES

- [Offline Update Kit (OUK)](../24_OFFLINE_KIT.md)
- [Offline and Air-Gap Technical Reference](../product-advisories/14-Dec-2025%20-%20Offline%20and%20Air-Gap%20Technical%20Reference.md)
- [Determinism and Reproducibility Technical Reference](../product-advisories/14-Dec-2025%20-%20Determinism%20and%20Reproducibility%20Technical%20Reference.md)
- [Determinism CI Harness](../modules/scanner/design/determinism-ci-harness.md)
- [Performance Baselines](../benchmarks/performance-baselines.md)

---

**Document Version**: 1.0
**Target Platform**: .NET 10, PostgreSQL >=16