Files
git.stella-ops.org/docs/operations/airgap-operations-runbook.md

777 lines
21 KiB
Markdown

# Air-Gap Operations Runbook
> **Version**: 1.0.0
> **Sprint**: 3500.0004.0004
> **Last Updated**: 2025-12-20
This runbook covers operational procedures for running StellaOps in air-gapped (offline) environments, including offline kit management, feed updates, and isolated verification workflows.
---
## Table of Contents
1. [Overview](#1-overview)
2. [Offline Kit Management](#2-offline-kit-management)
3. [Feed Updates](#3-feed-updates)
4. [Scanning in Air-Gap Mode](#4-scanning-in-air-gap-mode)
5. [Verification in Air-Gap Mode](#5-verification-in-air-gap-mode)
6. [Troubleshooting](#6-troubleshooting)
7. [Monitoring & Health Checks](#7-monitoring--health-checks)
8. [Escalation Procedures](#8-escalation-procedures)
---
## 1. Overview
### What is Air-Gap Mode?
Air-gap mode allows StellaOps to operate in environments with no external network connectivity. This is required for:
- Classified or sensitive environments
- High-security facilities
- Regulatory compliance (certain industries)
- Disaster recovery scenarios
### Air-Gap Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Connected Environment │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Feed Sync │───►│ Bundle │───►│ Offline Kit │ │
│ │ Service │ │ Generator │ │ (.tar.gz) │ │
│ └─────────────┘ └─────────────┘ └──────┬──────┘ │
└─────────────────────────────────────────────────┼───────────┘
│ Physical
│ Transfer
┌─────────────────────────────────────────────────┼───────────┐
│ Air-Gapped Environment │ │
│ ┌──────────────┐ ┌─────────────┐ ┌──────▼──────┐ │
│ │ StellaOps │◄───│ Offline │◄───│ Import │ │
│ │ Scanner │ │ Data Store │ │ Service │ │
│ └──────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Offline Kit Contents
| Component | Description | Update Frequency |
|-----------|-------------|------------------|
| Vulnerability Database | NVD, OSV, vendor advisories | Daily/Weekly |
| Advisory Feeds | CVE details, EPSS scores | Daily |
| Trust Bundles | CA certificates, signing keys | Quarterly |
| Rules Engine | Scoring rules and policies | Monthly |
| Offline Binaries | CLI tools, extractors | Per release |
---
## 2. Offline Kit Management
### 2.1 Generating an Offline Kit
On the connected system:
```bash
# Generate full offline kit
stella offline-kit create \
--output /path/to/offline-kit.tar.gz \
--include-all
# Generate minimal kit (feeds only)
stella offline-kit create \
--output /path/to/offline-kit-feeds.tar.gz \
--feeds-only
# Generate with specific components
stella offline-kit create \
--output /path/to/offline-kit.tar.gz \
--include vuln-db \
--include advisories \
--include trust-bundles \
--include rules
```
### 2.2 Kit Manifest
Each kit includes a manifest for verification:
```json
{
"version": "1.0.0",
"createdAt": "2025-01-15T00:00:00Z",
"expiresAt": "2025-02-15T00:00:00Z",
"components": {
"vulnerability-database": {
"hash": "sha256:abc123...",
"size": 1073741824,
"records": 245000
},
"advisory-feeds": {
"hash": "sha256:def456...",
"size": 536870912,
"lastUpdate": "2025-01-15T00:00:00Z"
},
"trust-bundles": {
"hash": "sha256:ghi789...",
"size": 65536,
"certificates": 12
},
"signing-keys": {
"hash": "sha256:jkl012...",
"keyIds": ["key-001", "key-002"]
}
},
"signature": "base64-signature..."
}
```
### 2.3 Transferring to Air-Gapped Environment
#### Physical Media Transfer
```bash
# On connected system - write to media
cp offline-kit.tar.gz /media/secure-usb/
sha256sum offline-kit.tar.gz > /media/secure-usb/offline-kit.tar.gz.sha256
# On air-gapped system - verify and import
cd /media/secure-usb/
sha256sum -c offline-kit.tar.gz.sha256
stella offline-kit import --kit offline-kit.tar.gz
```
#### Secure File Transfer (if available)
```bash
# Using data diode or one-way transfer
scp offline-kit.tar.gz airgap-gateway:/incoming/
```
### 2.4 Installing Offline Kit
```bash
# Import and install kit
stella offline-kit import \
--kit /path/to/offline-kit.tar.gz \
--verify \
--install
# Verify installation
stella offline-kit status
# List installed components
stella offline-kit list
```
Expected output:
```
Offline Kit Status
══════════════════════════════════════════
Mode: AIR-GAP
Kit Version: 1.0.0
Installed At: 2025-01-15T10:30:00Z
Expires At: 2025-02-15T00:00:00Z
Components:
✓ vulnerability-database 2025-01-15 245,000 records
✓ advisory-feeds 2025-01-15 Active
✓ trust-bundles 2025-01-15 12 certificates
✓ signing-keys 2025-01-15 2 keys
✓ rules-engine 2025-01-15 v2.3.0
Health: HEALTHY
Days Until Expiry: 31
```
---
## 3. Feed Updates
### 3.1 Update Workflow
```
┌────────────────────────────────────────────────────────────┐
│ Weekly Feed Update Process │
├────────────────────────────────────────────────────────────┤
│ Day 1: Generate kit on connected system │
│ Day 2: Security review and approval │
│ Day 3: Transfer to air-gapped environment │
│ Day 4: Import and verify │
│ Day 5: Activate new feeds │
└────────────────────────────────────────────────────────────┘
```
### 3.2 Generating Delta Updates
For faster updates, generate delta kits:
```bash
# On connected system
stella offline-kit create \
--output delta-kit.tar.gz \
--delta-from 2025-01-08 \
--delta-to 2025-01-15
# Delta kit is smaller, contains only changes
```
### 3.3 Applying Updates
```bash
# Import delta update
stella offline-kit import \
--kit delta-kit.tar.gz \
--delta \
--verify
# Verify feed freshness
stella feeds status
```
### 3.4 Rollback Procedure
If an update causes issues:
```bash
# List available snapshots
stella offline-kit snapshots
# Rollback to previous version
stella offline-kit rollback --to 2025-01-08
# Verify rollback
stella feeds status
```
---
## 4. Scanning in Air-Gap Mode
### 4.1 Enabling Air-Gap Mode
```bash
# Enable air-gap mode
stella config set mode air-gap
# Verify mode
stella config get mode
# Output: air-gap
```
### 4.2 Running Scans
```bash
# Scan a local image (no registry pull)
stella scan image --local /path/to/image.tar
# Scan from local registry
stella scan image localhost:5000/myapp:v1.0
# Scan with offline feeds explicitly
stella scan image myapp:v1.0 --offline-feeds
```
### 4.3 Local Image Preparation
For images that need to be scanned:
```bash
# On connected system - save image
docker save myapp:v1.0 -o myapp-v1.0.tar
sha256sum myapp-v1.0.tar > myapp-v1.0.tar.sha256
# Transfer to air-gapped system
# ... physical transfer ...
# On air-gapped system - verify and load
sha256sum -c myapp-v1.0.tar.sha256
docker load -i myapp-v1.0.tar
stella scan image myapp:v1.0 --local
```
### 4.4 SBOM Generation
```bash
# Generate SBOM for local image
stella sbom generate --image myapp:v1.0 --output sbom.json
# Scan existing SBOM
stella scan sbom --file sbom.json
```
---
## 5. Verification in Air-Gap Mode
### 5.1 Offline Proof Verification
```bash
# Verify proof bundle offline
stella proof verify --bundle bundle.tar.gz --offline
# Verify with explicit trust store
stella proof verify --bundle bundle.tar.gz \
--offline \
--trust-store /etc/stellaops/offline/trust-roots.json
```
### 5.2 Preparing Trust Store
Before air-gapped deployment:
```bash
# On connected system - export trust configuration
stella trust export \
--output trust-roots.json \
--include-ca \
--include-signing-keys
# Transfer to air-gapped system
# ... physical transfer ...
# On air-gapped system - import trust
stella trust import --file trust-roots.json
```
### 5.3 Score Replay Offline
```bash
# Replay score using offline data
stella score replay --scan $SCAN_ID --offline
# Replay with frozen time
stella score replay --scan $SCAN_ID \
--offline \
--freeze 2025-01-15T00:00:00Z
```
---
## 6. Troubleshooting
### 6.1 Kit Import Fails
**Symptoms**: `Failed to import offline kit`
**Diagnostic Steps**:
1. Verify kit integrity:
```bash
sha256sum offline-kit.tar.gz
# Compare with manifest
```
2. Check kit signature:
```bash
stella offline-kit verify --kit offline-kit.tar.gz
```
3. Check disk space:
```bash
df -h /var/lib/stellaops/
```
**Common Causes**:
| Cause | Resolution |
|-------|------------|
| Corrupted transfer | Re-transfer, verify checksum |
| Invalid signature | Regenerate kit with valid signing key |
| Insufficient space | Free disk space or expand volume |
| Expired kit | Generate fresh kit |
### 6.2 Stale Feed Data
**Symptoms**: Scans report old vulnerabilities, miss new CVEs
**Diagnostic Steps**:
1. Check feed age:
```bash
stella feeds status
```
2. Verify last update:
```bash
stella offline-kit status | grep "Installed At"
```
**Resolution**:
- Generate and import fresh offline kit
- Establish regular update schedule
- Set up expiry alerts
### 6.3 Trust Verification Fails
**Symptoms**: `Certificate chain verification failed` in offline mode
**Diagnostic Steps**:
1. Check trust store:
```bash
stella trust list
```
2. Verify CA bundle:
```bash
openssl verify -CAfile /etc/stellaops/offline/ca-bundle.pem \
/path/to/certificate.pem
```
3. Check for expired roots:
```bash
stella trust check-expiry
```
**Resolution**:
- Update trust bundle in offline kit
- Import new CA certificates
- Rotate expired signing keys
### 6.4 Network Access Attempted
**Symptoms**: Air-gapped system attempts network connection
**Diagnostic Steps**:
1. Check mode configuration:
```bash
stella config get mode
```
2. Audit network attempts:
```bash
journalctl -u stellaops | grep -i "network\|connect\|http"
```
3. Verify no external URLs in config:
```bash
grep -r "http" /etc/stellaops/
```
**Resolution**:
- Ensure `mode: air-gap` in configuration
- Remove any hardcoded URLs
- Block outbound traffic at firewall level
---
## 7. Monitoring & Health Checks
### 7.1 Air-Gap Health Checks
```bash
# Run comprehensive health check
stella health check --air-gap
# Output
Air-Gap Health Check
══════════════════════════════════════════
✓ Mode: air-gap
✓ Feed Freshness: 7 days old (OK)
✓ Trust Store: Valid
✓ Signing Keys: 2 active
✓ Disk Space: 45% used
✓ Database: Healthy
⚠ Kit Expiry: 24 days remaining
Overall: HEALTHY (1 warning)
```
### 7.2 Automated Monitoring Script
```bash
#!/bin/bash
# /etc/stellaops/scripts/airgap-health.sh
set -e
# Check feed age
FEED_AGE=$(stella feeds age --days)
if [ "$FEED_AGE" -gt 14 ]; then
echo "CRITICAL: Feeds are $FEED_AGE days old"
exit 2
fi
# Check kit expiry
DAYS_LEFT=$(stella offline-kit days-until-expiry)
if [ "$DAYS_LEFT" -lt 7 ]; then
echo "WARNING: Kit expires in $DAYS_LEFT days"
exit 1
fi
# Check trust store
if ! stella trust verify --quiet; then
echo "CRITICAL: Trust store verification failed"
exit 2
fi
echo "OK: Air-gap health check passed"
exit 0
```
### 7.3 Metrics for Air-Gap
| Metric | Description | Alert Threshold |
|--------|-------------|-----------------|
| `offline_kit_age_days` | Days since kit import | > 14 days |
| `offline_kit_expiry_days` | Days until kit expires | < 7 days |
| `feed_freshness_days` | Age of vulnerability feeds | > 7 days |
| `trust_store_valid` | Trust store validity (0/1) | = 0 |
| `disk_usage_percent` | Data store disk usage | > 80% |
### 7.4 Alert Configuration
```yaml
# /etc/stellaops/alerts/airgap.yaml
alerts:
- name: offline_kit_expiring
condition: offline_kit_expiry_days < 7
severity: warning
message: "Offline kit expires in {{ .Value }} days"
- name: offline_kit_expired
condition: offline_kit_expiry_days <= 0
severity: critical
message: "Offline kit has expired"
- name: feeds_stale
condition: feed_freshness_days > 14
severity: warning
message: "Vulnerability feeds are {{ .Value }} days old"
```
---
## 8. Escalation Procedures
### 8.1 Escalation Matrix
| Severity | Condition | Response Time | Action |
|----------|-----------|---------------|--------|
| P1 - Critical | Offline kit expired | 4 hours | Emergency kit transfer |
| P1 - Critical | Trust store invalid | 4 hours | Restore from backup |
| P2 - High | Feeds > 14 days old | 24 hours | Schedule kit update |
| P3 - Medium | Kit expiring in < 7 days | 48 hours | Plan kit update |
| P4 - Low | Minor health check warnings | Next maintenance | Review and address |
### 8.2 Emergency Kit Update Process
When kit expires before scheduled update:
1. **On Connected System** (0-2 hours):
```bash
stella offline-kit create --output emergency-kit.tar.gz --include-all
```
2. **Security Review** (2-4 hours):
- Verify kit signature
- Check for known vulnerabilities in kit
- Get approval for transfer
3. **Transfer** (4-6 hours):
- Physical media preparation
- Chain of custody documentation
- Transfer to air-gapped environment
4. **Import** (6-8 hours):
```bash
stella offline-kit import --kit emergency-kit.tar.gz --verify --install
```
### 8.3 Contacts
| Role | Contact | Availability |
|------|---------|--------------|
| Air-Gap Operations | airgap-ops@stellaops.io | Business hours |
| Security Team | security@stellaops.io | Business hours |
| Platform On-Call | platform-oncall@stellaops.io | 24/7 |
---
## Appendix A: Configuration Reference
### Air-Gap Mode Configuration
```yaml
# /etc/stellaops/config.yaml
mode: air-gap
offline:
dataDir: /var/lib/stellaops/offline
feedsDir: /var/lib/stellaops/offline/feeds
trustStore: /etc/stellaops/offline/trust-roots.json
caBundle: /etc/stellaops/offline/ca-bundle.pem
# Disable all external network calls
disableNetworking: true
# Kit expiry settings
kit:
expiryWarningDays: 7
maxAgeDays: 30
# Feed freshness settings
feeds:
maxAgeDays: 14
warnAgeDays: 7
```
### Environment Variables
```bash
# Air-gap specific
export STELLAOPS_MODE=air-gap
export STELLAOPS_OFFLINE_DATA_DIR=/var/lib/stellaops/offline
export STELLAOPS_DISABLE_NETWORKING=true
# Trust configuration
export STELLAOPS_TRUST_STORE=/etc/stellaops/offline/trust-roots.json
export STELLAOPS_CA_BUNDLE=/etc/stellaops/offline/ca-bundle.pem
```
---
## Appendix B: CLI Quick Reference
```bash
# Offline Kit Commands
stella offline-kit create --output <path> # Generate kit
stella offline-kit import --kit <path> # Import kit
stella offline-kit status # Show status
stella offline-kit verify --kit <path> # Verify kit
stella offline-kit rollback --to <date> # Rollback feeds
# Feed Commands (Air-Gap)
stella feeds status # Show feed status
stella feeds age --days # Get feed age
# Trust Commands
stella trust list # List trust roots
stella trust import --file <path> # Import trust config
stella trust verify # Verify trust store
stella trust export --output <path> # Export trust config
# Scanning (Air-Gap)
stella scan image --local <path> # Scan local image
stella scan sbom --file <path> # Scan SBOM file
# Verification (Air-Gap)
stella proof verify --bundle <path> --offline # Offline verification
stella score replay --scan <id> --offline # Offline replay
# Health
stella health check --air-gap # Air-gap health check
stella config get mode # Check current mode
```
---
## Appendix C: Update Schedule Template
| Week | Activity | Owner | Notes |
|------|----------|-------|-------|
| 1 | Generate kit | Connected Ops | Monday |
| 1 | Security review | Security Team | Tuesday |
| 1 | Approval | Security Lead | Wednesday |
| 2 | Transfer | Air-Gap Ops | Monday |
| 2 | Import & verify | Air-Gap Ops | Tuesday |
| 2 | Activate | Air-Gap Ops | Wednesday |
| 2 | Validate scans | QA Team | Thursday |
---
## Revision History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2025-12-20 | Agent | Initial release |
| 1.1.0 | 2026-01-07 | Agent | Added HLC job sync section |
---
## Appendix D: HLC-Based Job Synchronization
> **Sprint**: SPRINT_20260105_002_003_ROUTER
> **Added**: 2026-01-07
### Overview
When running in air-gap mode, scheduled jobs are queued locally using Hybrid Logical Clocks (HLC). Upon reconnection or physical transfer, these jobs can be merged deterministically with the central scheduler while preserving global ordering.
### Key Concepts
| Term | Description |
|------|-------------|
| **HLC Timestamp** | Tuple of (PhysicalTime, LogicalCounter, NodeId) for total ordering |
| **Job Log** | Append-only log of jobs enqueued while offline |
| **Chain Link** | Cryptographic hash linking each job to its predecessor |
| **Air-Gap Bundle** | Exportable package containing job logs from one or more offline nodes |
### Exporting Jobs from Offline Node
```bash
# Export all pending jobs to a bundle
stella airgap export \
--output /media/usb/jobs-bundle.json \
--tenant <tenant-id>
# Export with DSSE signature (requires signing key)
stella airgap export \
--output /media/usb/jobs-bundle.json \
--tenant <tenant-id> \
--sign
```
### Importing Jobs to Central Scheduler
```bash
# Import bundle and merge jobs
stella airgap import \
--bundle /media/usb/jobs-bundle.json
# Import with signature verification
stella airgap import \
--bundle /media/usb/jobs-bundle.json \
--verify-signature
# Dry-run to preview merge results
stella airgap import \
--bundle /media/usb/jobs-bundle.json \
--dry-run
```
### Merge Behavior
1. **Total Ordering**: Jobs are merged by HLC order key (PhysicalTime, LogicalCounter, NodeId, JobId)
2. **Duplicate Detection**: Jobs with the same JobId are deduplicated
3. **Chain Verification**: Each job's chain link is verified against its predecessor
4. **Conflict Resolution**: Same-payload jobs from different nodes are kept; conflicting payloads are flagged
### Monitoring Metrics
| Metric | Description |
|--------|-------------|
| `airgap_bundles_exported_total` | Total bundles exported |
| `airgap_bundles_imported_total` | Total bundles imported |
| `airgap_jobs_synced_total` | Total jobs synchronized |
| `airgap_merge_conflicts_total` | Merge conflicts by type |
| `airgap_sync_duration_seconds` | Sync operation duration |
### Troubleshooting
**Bundle signature verification fails**
1. Check that signing key is correctly configured
2. Verify bundle was not modified during transfer
3. Check key ID matches expected signer
**Merge conflicts detected**
1. Review conflicts in import output
2. Check for clock skew between offline nodes
3. Use `--conflict-strategy` flag to auto-resolve
**Jobs missing after import**
1. Check for duplicate JobIds (same job, same node)
2. Verify chain integrity with `stella airgap verify --bundle <path>`
3. Review import logs for skipped entries