Files
git.stella-ops.org/docs/operations/runbooks/crypto-ops.md

371 lines
8.4 KiB
Markdown

# Sprint: SPRINT_20260117_029_Runbook_coverage_expansion
# Task: RUN-002 - Crypto Subsystem Runbook
# Regional Crypto Operations Runbook
Status: PRODUCTION-READY (2026-01-17 UTC)
## Scope
Cryptographic subsystem operations including HSM management, regional crypto profile configuration, key rotation, and certificate management for all supported crypto profiles (International, FIPS, eIDAS, GOST, SM).
---
## Pre-flight Checklist
### Environment Verification
```bash
# Check crypto subsystem health
stella doctor --category crypto
# Verify active crypto profile
stella crypto profile show
# List loaded crypto providers
stella crypto providers list
# Check key status
stella crypto keys status
```
### Metrics to Watch
- `stella_crypto_operations_total` - Crypto operation count by type
- `stella_crypto_operation_duration_seconds` - Signing/verification latency
- `stella_hsm_availability` - HSM availability (if configured)
- `stella_cert_expiry_days` - Certificate expiration countdown
---
## Regional Crypto Profiles
### Profile Overview
| Profile | Use Case | Key Algorithms | Compliance |
|---------|----------|----------------|------------|
| `international` | Default, most deployments | RSA-2048+, ECDSA P-256/P-384, Ed25519 | General |
| `fips` | US Government / FedRAMP | FIPS 140-2 approved algorithms only | FIPS 140-2 |
| `eidas` | European Union | RSA-PSS, ECDSA, Ed25519 per ETSI TS 119 312 | eIDAS |
| `gost` | Russian Federation | GOST R 34.10-2012, GOST R 34.11-2012 | Russian standards |
| `sm` | China | SM2, SM3, SM4 | GM/T 0003-2012 |
### Switching Profiles
1. **Pre-switch verification:**
```bash
# Verify target profile is available
stella crypto profile verify --profile <target-profile>
# Check for incompatible existing signatures
stella crypto audit --check-compatibility --target-profile <target-profile>
```
2. **Profile switch:**
```bash
# Switch profile (requires service restart)
stella crypto profile set --profile <target-profile>
# Restart services to apply
stella service restart --graceful
```
3. **Post-switch verification:**
```bash
stella doctor --check check.crypto.fips,check.crypto.eidas,check.crypto.gost,check.crypto.sm
```
---
## Standard Procedures
### SP-001: Key Rotation
**Frequency:** Quarterly or per policy
**Duration:** ~15 minutes (no downtime)
1. Generate new key:
```bash
# For software keys
stella crypto keys generate --type signing --algorithm ecdsa-p256 --name signing-$(date +%Y%m)
# For HSM-backed keys
stella crypto keys generate --type signing --algorithm ecdsa-p256 --provider hsm --name signing-$(date +%Y%m)
```
2. Activate new key:
```bash
stella crypto keys activate --name signing-$(date +%Y%m)
```
3. Verify signing with new key:
```bash
echo "test" | stella crypto sign --output /dev/null
```
4. Schedule old key deactivation:
```bash
stella crypto keys schedule-deactivation --name <old-key-name> --in 30d
```
### SP-002: Certificate Renewal
**When:** Certificate expiring within 30 days
1. Check expiration:
```bash
stella crypto certs check-expiry
```
2. Generate CSR:
```bash
stella crypto certs csr --subject "CN=stellaops.example.com,O=Example Corp" --output cert.csr
```
3. Install renewed certificate:
```bash
stella crypto certs install --cert renewed-cert.pem --chain ca-chain.pem
```
4. Verify certificate chain:
```bash
stella doctor --check check.crypto.certchain
```
5. Restart services:
```bash
stella service restart --graceful
```
### SP-003: HSM Health Check
**Frequency:** Daily (automated) or on-demand
1. Check HSM connectivity:
```bash
stella crypto hsm status
```
2. Verify slot access:
```bash
stella crypto hsm slots list
```
3. Test signing operation:
```bash
stella crypto hsm test-sign
```
4. Check HSM metrics:
- Free objects/sessions
- Temperature/health (vendor-specific)
---
## Incident Procedures
### INC-001: HSM Unavailable
**Symptoms:**
- Alert: `StellaHsmUnavailable`
- Signing operations failing with "HSM connection error"
**Investigation:**
```bash
# Check HSM status
stella crypto hsm status
# Test PKCS#11 module
stella crypto hsm test-module
# Check network to HSM
stella network test --host <hsm-host> --port <hsm-port>
```
**Resolution:**
1. **Network issue:**
- Verify network path to HSM
- Check firewall rules
- Verify HSM appliance is powered on
2. **Session exhaustion:**
```bash
# Release stale sessions
stella crypto hsm sessions release --stale
# Restart crypto service
stella service restart --service crypto-signer
```
3. **HSM failure:**
- Fail over to secondary HSM (if configured)
- Contact HSM vendor support
- Consider temporary fallback to software keys (with approval)
### INC-002: Signing Key Compromised
**CRITICAL - Follow incident response procedure**
1. **Immediate containment:**
```bash
# Revoke compromised key
stella crypto keys revoke --name <compromised-key> --reason compromise
# Block signing with compromised key
stella crypto keys block --name <compromised-key>
```
2. **Generate replacement key:**
```bash
stella crypto keys generate --type signing --algorithm ecdsa-p256 --name emergency-signing
stella crypto keys activate --name emergency-signing
```
3. **Notify downstream:**
- Update trust registries with new key
- Notify relying parties
- Publish key revocation notice
4. **Forensics:**
```bash
# Export key usage audit log
stella crypto audit export --key <compromised-key> --output /secure/key-audit.json
```
### INC-003: Certificate Expired
**Symptoms:**
- TLS connection failures
- Alert: `StellaCertExpired`
**Immediate Resolution:**
1. If renewed certificate is available:
```bash
stella crypto certs install --cert renewed-cert.pem --chain ca-chain.pem
stella service restart --graceful
```
2. If renewal not ready - emergency self-signed (temporary):
```bash
# Generate emergency certificate (NOT for production use)
stella crypto certs generate-self-signed --days 7 --name emergency
stella crypto certs install --cert emergency.pem
stella service restart --graceful
```
3. Expedite certificate renewal process
### INC-004: FIPS Mode Not Enabled
**Symptoms:**
- Alert: `StellaFipsNotEnabled`
- Compliance audit failure
**Resolution:**
1. **Linux:**
```bash
# Enable FIPS mode
sudo fips-mode-setup --enable
# Reboot required
sudo reboot
# Verify after reboot
fips-mode-setup --check
```
2. **Windows:**
- Enable via Group Policy
- Or via registry:
```powershell
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy" -Name "Enabled" -Value 1
Restart-Computer
```
3. Restart Stella services:
```bash
stella service restart
stella doctor --check check.crypto.fips
```
---
## Regional-Specific Procedures
### GOST Configuration (Russian Federation)
1. Install GOST engine:
```bash
sudo apt install libengine-gost-openssl1.1
```
2. Configure Stella:
```bash
stella crypto profile set --profile gost
stella crypto config set --gost-engine-path /usr/lib/x86_64-linux-gnu/engines-3/gost.so
```
3. Verify:
```bash
stella doctor --check check.crypto.gost
```
### SM Configuration (China)
1. Ensure OpenSSL 1.1.1+ with SM support:
```bash
openssl version
openssl list -cipher-algorithms | grep -i sm
```
2. Configure Stella:
```bash
stella crypto profile set --profile sm
```
3. Verify:
```bash
stella doctor --check check.crypto.sm
```
---
## Monitoring Dashboard
Access: Grafana → Dashboards → Stella Ops → Crypto Subsystem
Key panels:
- Signing operation latency
- Key usage by key ID
- HSM availability
- Certificate expiration countdown
- Crypto profile in use
---
## Evidence Capture
```bash
# Comprehensive crypto diagnostics
stella crypto diagnostics --output /tmp/crypto-diag-$(date +%Y%m%dT%H%M%S).tar.gz
```
Bundle includes:
- Active crypto profile
- Key inventory (public keys only)
- Certificate chain
- HSM status
- Operation audit log (last 24h)
---
## Escalation Path
1. **L1 (On-call):** Certificate installs, key activation
2. **L2 (Security team):** Key rotation, HSM issues
3. **L3 (Crypto SME):** Algorithm issues, compliance questions
4. **HSM Vendor:** Hardware failures
---
_Last updated: 2026-01-17 (UTC)_