old sprints work, new sprints for exposing functionality via cli, improve code_of_conduct and other agents instructions
docs/operations/blue-green-deployment.md

# Blue/Green Deployment Guide

This guide documents the blue/green deployment strategy for Stella Ops platform upgrades with evidence continuity preservation.

## Overview

Blue/green deployment maintains two identical production environments:
- **Blue**: Current production environment
- **Green**: New version deployment target

This approach enables zero-downtime upgrades and instant rollback capability while preserving all evidence integrity.

## Prerequisites

### Infrastructure Requirements

| Component | Blue Environment | Green Environment |
|-----------|-----------------|-------------------|
| Kubernetes namespace | `stellaops-prod` | `stellaops-green` |
| PostgreSQL | Shared (with migration support) | Shared |
| Redis/Valkey | Separate instance | Separate instance |
| Object Storage | Shared (evidence bundles) | Shared |
| Load Balancer | Traffic routing | Traffic routing |

### Version Compatibility

Before upgrading, verify version compatibility:

```bash
# Check current version
stella version

# Check target version compatibility
stella upgrade check --target 2027.Q2
```

See `docs/releases/VERSIONING.md` for the full compatibility matrix.

## Deployment Phases

### Phase 1: Preparation

#### 1.1 Environment Assessment

```bash
# Verify current health
stella doctor --full

# Check pending migrations
stella system migrations-status

# Verify evidence integrity baseline
stella evidence verify-all --output pre-upgrade-baseline.json
```

#### 1.2 Backup Procedures

```bash
# PostgreSQL backup
pg_dump -Fc stellaops > "backup-$(date +%Y%m%d-%H%M%S).dump"

# Evidence bundle export
stella evidence export --all --output evidence-backup/

# Configuration backup
# (secrets-backup.yaml contains base64-encoded secret values; store it securely)
kubectl get configmap -n stellaops-prod -o yaml > configmaps-backup.yaml
kubectl get secret -n stellaops-prod -o yaml > secrets-backup.yaml
```

#### 1.3 Pre-Flight Checklist

- [ ] All services healthy
- [ ] No active scans or attestations in progress
- [ ] Queue depths at zero
- [ ] Backup completed and verified
- [ ] Evidence baseline captured
- [ ] Maintenance window communicated

### Phase 2: Green Environment Deployment

#### 2.1 Deploy New Version

```bash
# Deploy to green namespace
# (--install creates the release on first deploy; --create-namespace only applies together with it)
helm upgrade --install stellaops-green ./helm/stellaops \
  --namespace stellaops-green \
  --create-namespace \
  --values values-production.yaml \
  --set image.tag=2027.Q2 \
  --wait

# Verify deployment
kubectl get pods -n stellaops-green
```

#### 2.2 Run Migrations

```bash
# Apply startup migrations (Category A)
stella system migrations-run --category A --namespace stellaops-green

# Verify migration status
stella system migrations-status --namespace stellaops-green
```

#### 2.3 Health Validation

```bash
# Run health checks on green
stella doctor --full --namespace stellaops-green

# Run smoke tests
stella test smoke --namespace stellaops-green
```

### Phase 3: Traffic Cutover

#### 3.1 Gradual Cutover (Recommended)

```yaml
# Update ingress for gradual traffic shift
# ingress-canary.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: stellaops-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"  # Start with 10%
spec:
  rules:
  - host: stellaops.company.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: stellaops-green
            port:
              number: 80
```

Increase the weight gradually (10% -> 25% -> 50% -> 100%), checking the metrics in section 3.3 before each increase.

#### 3.2 Instant Cutover

```bash
# Switch DNS/load balancer to green
kubectl patch ingress stellaops-main \
  -n stellaops-prod \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "stellaops-green"}]'
```

#### 3.3 Monitoring During Cutover

Monitor these metrics during cutover:
- 5xx error rate: `rate(http_requests_total{status=~"5.."}[1m])`
- Latency p99: `histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))`
- Evidence operations: `rate(evidence_operations_total[1m])`
- Attestation success: `rate(attestation_success_total[1m])`

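The gradual schedule from section 3.1 (10% -> 25% -> 50% -> 100%) can be scripted so that each step is explicit and repeatable. The following is a minimal sketch, not part of the Stella Ops tooling: the helper names `next_canary_weight` and `canary_weight_patch` are hypothetical, and it assumes the ingress-nginx `canary-weight` annotation used in section 3.1. In the JSON Patch path, `~1` is the escaped form of `/` inside the annotation key.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for stepping the canary weight; names are illustrative.

# Print the next weight in the 10 -> 25 -> 50 -> 100 schedule.
# Returns non-zero when there is no further step (for example at 100).
next_canary_weight() {
  case "$1" in
    0)  echo 10 ;;
    10) echo 25 ;;
    25) echo 50 ;;
    50) echo 100 ;;
    *)  return 1 ;;
  esac
}

# Print a JSON Patch document that sets the canary-weight annotation.
canary_weight_patch() {
  printf '[{"op": "replace", "path": "/metadata/annotations/nginx.ingress.kubernetes.io~1canary-weight", "value": "%s"}]' "$1"
}
```

A step could then be applied with `kubectl patch ingress stellaops-canary --type=json -p "$(canary_weight_patch "$(next_canary_weight 10)")"`, pausing between steps to review the metrics in section 3.3.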
### Phase 4: Post-Upgrade Validation

#### 4.1 Evidence Continuity Verification

```bash
# Verify evidence chain-of-custody
stella evidence verify-continuity \
  --baseline pre-upgrade-baseline.json \
  --output post-upgrade-report.html

# Generate audit report
# (--iso-8601=seconds keeps the time of day; plain --iso-8601 prints only the date)
stella evidence audit-report \
  --since "$(date -d '1 hour ago' --iso-8601=seconds)" \
  --output upgrade-audit.pdf
```

#### 4.2 Functional Validation

```bash
# Run full test suite
stella test integration

# Verify scan capability
stella scan --image test-image:latest --dry-run

# Verify a test attestation bundle
stella attest verify --bundle test-bundle.tar.gz
```

#### 4.3 Documentation Update

- Update `CURRENT_VERSION.md` with the new version
- Record the upgrade in `CHANGELOG.md`
- Archive upgrade artifacts

### Phase 5: Cleanup

#### 5.1 Observation Period

Keep the blue environment running for at least 72 hours before decommissioning it.

#### 5.2 Blue Environment Decommission

```bash
# After observation period, remove blue
# (release and namespace names must match your blue install; the
#  Infrastructure Requirements table lists the blue namespace as stellaops-prod)
helm uninstall stellaops-blue -n stellaops-prod

# Clean up resources
kubectl delete namespace stellaops-blue
```

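The 72-hour observation period can be enforced with a small guard before any decommission command runs. This is a sketch only: `observation_complete` and `OBSERVATION_HOURS` are hypothetical names, and the cutover time is assumed to have been recorded as Unix epoch seconds when traffic was switched.

```bash
#!/usr/bin/env bash
# Minimum observation period in hours, taken from section 5.1.
OBSERVATION_HOURS=72

# Succeeds (exit 0) once at least OBSERVATION_HOURS have passed since cutover.
# Arguments: cutover time and current time, both as Unix epoch seconds.
observation_complete() {
  local cutover_epoch="$1" now_epoch="$2"
  local elapsed=$(( now_epoch - cutover_epoch ))
  [ "$elapsed" -ge $(( OBSERVATION_HOURS * 3600 )) ]
}
```

For example, `observation_complete "$CUTOVER_EPOCH" "$(date +%s)" && echo "blue can be decommissioned"`, where `CUTOVER_EPOCH` is whatever value was saved at cutover.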
## Rollback Procedures

### Immediate Rollback (During Cutover)

```bash
# Revert traffic to blue
kubectl patch ingress stellaops-main \
  -n stellaops-prod \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "stellaops-blue"}]'
```

### Post-Cutover Rollback

If a rollback is needed after cutover is complete:

1. **Assess impact**: Run `stella evidence verify-continuity` to check evidence state
2. **Database considerations**: Backward-compatible migrations allow rollback; breaking migrations require a restore from backup
3. **Evidence preservation**: Evidence bundles created during green operation remain valid

```bash
# If database rollback needed
# (--clean drops existing objects before recreating them from the dump)
pg_restore --clean -d stellaops backup-YYYYMMDD-HHMMSS.dump

# Redeploy blue version
helm upgrade stellaops ./helm/stellaops \
  --namespace stellaops-prod \
  --set image.tag=2027.Q1 \
  --wait
```

## Evidence Continuity Guarantees

### Preserved During Upgrade

| Artifact | Guarantee |
|----------|-----------|
| OCI digests | Unchanged |
| SBOM content hashes | Unchanged |
| Merkle roots | Recomputed if schema changes (cross-reference maintained) |
| Attestation signatures | Valid |
| Rekor log entries | Immutable |

### Verification Commands

```bash
# Verify OCI digests unchanged
stella evidence verify-digests --report digests.json

# Verify attestation validity
stella attest verify-all --since "$(date -d '7 days ago' --iso-8601)"

# Generate compliance report
stella evidence compliance-report --format pdf
```

## Troubleshooting

### Common Issues

| Issue | Symptom | Resolution |
|-------|---------|------------|
| Migration timeout | Pod stuck in init | Increase `migrationTimeoutSeconds` |
| Health check failure | Ready probe failing | Check database connectivity |
| Evidence mismatch | Continuity check fails | Run `stella evidence reindex` |
| Traffic not routing | 502 errors | Verify service selector labels |

### Support Escalation

If upgrade issues cannot be resolved:
1. Capture diagnostics: `stella doctor --export diagnostics.tar.gz`
2. Roll back to the blue environment
3. Contact support with the diagnostics bundle

## Related Documentation

- [Upgrade Runbook](upgrade-runbook.md)
- [Evidence Migration](evidence-migration.md)
- [Database Migration Strategy](../db/MIGRATION_STRATEGY.md)
- [Versioning Policy](../releases/VERSIONING.md)

docs/operations/hsm-setup-runbook.md

# HSM Setup and Configuration Runbook

This runbook provides step-by-step procedures for configuring Hardware Security Module (HSM) integration with Stella Ops.

## Overview

Stella Ops supports PKCS#11-compatible HSMs for cryptographic key storage and signing operations. This includes:
- YubiHSM 2
- Thales Luna Network HSM
- AWS CloudHSM
- SoftHSM2 (development/testing)

## Prerequisites

### Hardware Requirements

| Component | Requirement |
|-----------|-------------|
| HSM Device | PKCS#11 compatible |
| Network | HSM accessible from Stella Ops services |
| Backup | Secondary HSM for key backup |

### Software Requirements

```bash
# PKCS#11 library for your HSM
# Example for SoftHSM2 (development)
apt-get install softhsm2 opensc

# Verify installation
softhsm2-util --version
pkcs11-tool --version
```

## SoftHSM2 Setup (Development)

### Step 1: Initialize SoftHSM

```bash
# Create token directory
mkdir -p /var/lib/softhsm/tokens
chmod 700 /var/lib/softhsm/tokens

# Initialize token
# (the PINs below are development placeholders; never reuse them in production)
softhsm2-util --init-token \
  --slot 0 \
  --label "StellaOps-Dev" \
  --so-pin 12345678 \
  --pin 87654321

# Verify token
softhsm2-util --show-slots
```

### Step 2: Generate Signing Key

```bash
# Generate ECDSA P-256 key
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin 87654321 \
  --keypairgen \
  --key-type EC:prime256v1 \
  --id 01 \
  --label "stellaops-signing-2026"

# List keys
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin 87654321 \
  --list-objects
```

### Step 3: Export Public Key

```bash
# Export public key for distribution
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin 87654321 \
  --read-object \
  --type pubkey \
  --id 01 \
  --output-file stellaops-signing-2026.pub.der

# Convert to PEM
openssl ec -pubin -inform DER \
  -in stellaops-signing-2026.pub.der \
  -outform PEM \
  -out stellaops-signing-2026.pub.pem
```

## YubiHSM 2 Setup

### Step 1: Install YubiHSM SDK

```bash
# Download YubiHSM SDK
wget https://developers.yubico.com/YubiHSM2/Releases/yubihsm2-sdk-2023.01-ubuntu2204-amd64.tar.gz
tar xzf yubihsm2-sdk-*.tar.gz
cd yubihsm2-sdk
sudo ./install.sh

# Start connector
sudo systemctl enable yubihsm-connector
sudo systemctl start yubihsm-connector
```

### Step 2: Initialize YubiHSM

```bash
# Connect to YubiHSM shell
yubihsm-shell

# Authenticate with default auth key (key ID 1)
connect
session open 1 password

# Create authentication key for Stella Ops
# (arguments: session, key ID, label, domains, capabilities, delegated capabilities;
#  exact syntax can vary between SDK versions, so confirm with `help` in the shell)
put authkey 0 100 "StellaOps-Auth" 1 generate-asymmetric-key:sign-ecdsa:delete-asymmetric-key sign-ecdsa

# Generate signing key
generate asymmetric 0 200 "StellaOps-Signing" 1 sign-ecdsa ecp256

# Export public key
get pubkey 0 200 stellaops-yubihsm.pub

session close 0
quit
```

### Step 3: Configure PKCS#11

```bash
# Create PKCS#11 configuration (writing under /etc requires root)
sudo tee /etc/yubihsm_pkcs11.conf > /dev/null <<EOF
connector = http://127.0.0.1:12345
EOF

# Point the PKCS#11 module at the configuration file
export YUBIHSM_PKCS11_CONF=/etc/yubihsm_pkcs11.conf

# Test PKCS#11 access
pkcs11-tool --module /usr/lib/libyubihsm_pkcs11.so \
  --list-slots
```

## Stella Ops Configuration

### Basic HSM Configuration

```yaml
# etc/stellaops.yaml
signing:
  provider: "hsm"
  hsm:
    type: "pkcs11"
    libraryPath: "/usr/lib/softhsm/libsofthsm2.so"  # or /usr/lib/libyubihsm_pkcs11.so
    slotId: 0
    tokenLabel: "StellaOps-Dev"
    pin: "${HSM_PIN}"  # Use environment variable
    keyId: "01"
    keyLabel: "stellaops-signing-2026"

    # Connection settings
    connectionTimeoutSeconds: 30
    maxSessions: 10
    sessionIdleTimeoutSeconds: 300

    # Retry settings
    maxRetries: 3
    retryDelayMs: 100
```

### Environment Variables

```bash
# Development only: set the SoftHSM2 development PIN directly
export HSM_PIN="87654321"

# Production: read the PIN from a secrets manager
export HSM_PIN=$(aws secretsmanager get-secret-value \
  --secret-id stellaops/hsm-pin \
  --query SecretString --output text)
```

### Kubernetes Secret

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: stellaops-hsm
  namespace: stellaops
type: Opaque
stringData:
  HSM_PIN: "87654321"  # development placeholder; do not commit real PINs
```

## Verification

### Step 1: Connectivity Check

```bash
# Run HSM connectivity doctor check
stella doctor --check hsm

# Expected output:
# [PASS] HSM Connectivity
# - Library loaded: /usr/lib/softhsm/libsofthsm2.so
# - Slot available: 0 (StellaOps-Dev)
# - Key found: stellaops-signing-2026
# - Sign/verify test: PASSED
```

### Step 2: Signing Test

```bash
# Test signing operation
stella sign test \
  --message "test message" \
  --key-label "stellaops-signing-2026"

# Expected output:
# Signature: base64...
# Algorithm: ECDSA-P256
# Key ID: 01
```

### Step 3: Integration Test

```bash
# Run HSM integration tests
stella test integration --filter "HSM*"
```

## Key Rotation

### Step 1: Generate New Key

```bash
# Generate new key in HSM
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin "${HSM_PIN}" \
  --keypairgen \
  --key-type EC:prime256v1 \
  --id 02 \
  --label "stellaops-signing-2027"
```

### Step 2: Add to Trust Anchor

```bash
# Export the new public key first (same procedure as SoftHSM2 Step 3, using --id 02)
# so that stellaops-signing-2027.pub.pem exists.

# Add new key to Stella Ops
stella key add \
  --key-id "stellaops-signing-2027" \
  --algorithm EC-P256 \
  --public-key stellaops-signing-2027.pub.pem
```

### Step 3: Transition Period

```yaml
# Update configuration for dual-key
signing:
  activeKeyId: "stellaops-signing-2027"
  additionalKeys:
    - keyId: "stellaops-signing-2026"
      keyLabel: "stellaops-signing-2026"
```

### Step 4: Revoke Old Key

```bash
# After transition period (2-4 weeks)
stella key revoke \
  --key-id "stellaops-signing-2026" \
  --reason "scheduled-rotation"
```

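The 2-4 week transition period translates into a concrete revocation window once the rotation start date is known. The sketch below is illustrative only: `revocation_window` is a hypothetical helper, it treats 2 and 4 weeks as 14 and 28 days, and it assumes GNU `date` (the `-d` option is not available in BSD/macOS `date`).

```bash
#!/usr/bin/env bash
# Print the earliest and latest revocation dates (YYYY-MM-DD, UTC) for a
# rotation that started on the given date, using the 2-4 week window.
revocation_window() {
  local start="$1" earliest latest
  earliest=$(date -u -d "$start +14 days" +%F) || return 1
  latest=$(date -u -d "$start +28 days" +%F) || return 1
  echo "$earliest $latest"
}
```

Calling `revocation_window 2027-01-01` prints the two dates separated by a space, which can be recorded alongside the rotation ticket before running `stella key revoke`.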
## Troubleshooting

### Common Issues

| Issue | Symptom | Resolution |
|-------|---------|------------|
| Library not found | `PKCS11 library not found` | Verify `libraryPath` in config |
| Slot not available | `Slot 0 not found` | Run `pkcs11-tool --list-slots` |
| Key not found | `Key stellaops-signing not found` | Verify key label with `--list-objects` |
| PIN incorrect | `CKR_PIN_INCORRECT` | Check the `HSM_PIN` environment variable |
| Session limit | `CKR_SESSION_COUNT` | Increase `maxSessions` or restart the service |

### Debug Logging

```yaml
# Enable HSM debug logging
logging:
  levels:
    StellaOps.Cryptography.Hsm: Debug
```

### Session Recovery

```bash
# If sessions exhausted, restart service
kubectl rollout restart deployment stellaops-signer -n stellaops
```

## Security Best Practices

1. **PIN Management**
   - Never hardcode PINs in configuration files
   - Use secrets management (Vault, AWS Secrets Manager)
   - Rotate PINs periodically

2. **Key Backup**
   - Configure HSM key backup/replication
   - Test key recovery procedures regularly
   - Document recovery process

3. **Access Control**
   - Limit HSM access to required services only
   - Use separate authentication keys per service
   - Audit HSM access logs

4. **Network Security**
   - Use TLS for network HSM connections
   - Firewall HSM to authorized hosts only
   - Monitor for unauthorized access attempts

## Related Documentation

- [Key Rotation Runbook](key-rotation-runbook.md)
- [Dual-Control Ceremonies](dual-control-ceremonies.md)
- [Signer Architecture](../modules/signer/architecture.md)

docs/operations/upgrade-runbook.md

# Stella Ops Upgrade Runbook

This runbook provides step-by-step procedures for upgrading Stella Ops with evidence continuity preservation.

## Quick Reference

| Phase | Duration | Owner | Rollback Point |
|-------|----------|-------|----------------|
| Pre-Upgrade | 2-4 hours | Platform Team | N/A |
| Backup | 1-2 hours | DBA | Full restore |
| Deploy Green | 30-60 min | Platform Team | Abort deploy |
| Cutover | 15-30 min | Platform Team | Instant rollback |
| Validation | 1-2 hours | QA + Security | 72h observation |
| Cleanup | 30 min | Platform Team | N/A |

## Pre-Upgrade Checklist

### Environment Verification

```bash
# Step 1: Record current version
stella version > /tmp/pre-upgrade-version.txt
echo "Current version: $(cat /tmp/pre-upgrade-version.txt)"

# Step 2: Verify system health
if ! stella doctor --full --output /tmp/pre-upgrade-health.json; then
  echo "ABORT: System health check failed"
  exit 1
fi

# Step 3: Check pending migrations
stella system migrations-status
# Ensure no pending migrations before upgrade

# Step 4: Verify queue depths
stella queue status --all
# All queues should be empty or near-empty
```

### Evidence Integrity Baseline

```bash
# Step 5: Capture evidence baseline
stella evidence verify-all \
  --output /backup/pre-upgrade-evidence-baseline.json \
  --include-merkle-roots

# Step 6: Export Merkle root summary
stella evidence roots-export \
  --output /backup/pre-upgrade-merkle-roots.json

# Step 7: Record evidence counts
stella evidence stats > /backup/pre-upgrade-evidence-stats.txt
```

### Backup Procedures

```bash
# Step 8: PostgreSQL backup
BACKUP_TIMESTAMP=$(date +%Y%m%d-%H%M%S)
pg_dump -Fc -d stellaops -f "/backup/stellaops-${BACKUP_TIMESTAMP}.dump"

# Step 9: Verify backup integrity
if ! pg_restore --list "/backup/stellaops-${BACKUP_TIMESTAMP}.dump" > /dev/null; then
  echo "ABORT: Backup verification failed"
  exit 1
fi

# Step 10: Evidence bundle backup
stella evidence export \
  --all \
  --output "/backup/evidence-bundles-${BACKUP_TIMESTAMP}/"

# Step 11: Configuration backup
kubectl get configmap -n stellaops -o yaml > "/backup/configmaps-${BACKUP_TIMESTAMP}.yaml"
kubectl get secret -n stellaops -o yaml > "/backup/secrets-${BACKUP_TIMESTAMP}.yaml"
```

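Step 9 confirms that the dump is readable at backup time. As a supplementary measure, a checksum recorded next to the dump lets a later restore confirm the file has not changed in the meantime. This is a sketch, not part of the runbook's required steps: `record_checksum` and `verify_checksum` are hypothetical helpers, and `sha256sum` from GNU coreutils is assumed to be installed.

```bash
#!/usr/bin/env bash
# Write a SHA-256 checksum file next to the backup (for example foo.dump.sha256).
# The path is stored as given, so pass the same path when verifying.
record_checksum() {
  sha256sum "$1" > "$1.sha256"
}

# Succeeds only if the backup still matches its recorded checksum.
verify_checksum() {
  sha256sum -c "$1.sha256" > /dev/null 2>&1
}
```

For example, run `record_checksum "/backup/stellaops-${BACKUP_TIMESTAMP}.dump"` after Step 9, and `verify_checksum` on the same path before any `pg_restore`.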
### Pre-Flight Approval

Complete this checklist before proceeding:

- [ ] Current version documented
- [ ] System health: GREEN
- [ ] Evidence baseline captured
- [ ] PostgreSQL backup completed and verified
- [ ] Evidence bundles exported
- [ ] Configuration backed up
- [ ] Maintenance window approved
- [ ] Stakeholders notified
- [ ] Rollback plan reviewed

**Approver signature**: __________________ **Date**: __________

## Upgrade Execution

### Deploy Green Environment

```bash
# Step 12: Create green namespace
kubectl create namespace stellaops-green

# Step 13: Copy secrets to green namespace
kubectl get secret stellaops-secrets -n stellaops -o yaml | \
  sed 's/namespace: stellaops/namespace: stellaops-green/' | \
  kubectl apply -f -

# Step 14: Deploy new version
# (TARGET_VERSION must be set beforehand, for example: export TARGET_VERSION=2027.Q2;
#  --install creates the release if it does not exist yet)
helm upgrade --install stellaops-green ./helm/stellaops \
  --namespace stellaops-green \
  --values values-production.yaml \
  --set image.tag="${TARGET_VERSION}" \
  --wait --timeout 10m

# Step 15: Verify deployment
kubectl get pods -n stellaops-green -w
# Wait for all pods to be Running and Ready
```

### Run Migrations

```bash
# Step 16: Apply Category A migrations (startup)
stella system migrations-run \
  --category A \
  --namespace stellaops-green

# Step 17: Verify migration success
stella system migrations-status --namespace stellaops-green
# All migrations should show "Applied"

# Step 18: Apply Category B migrations if needed (manual)
# Review migration list first
stella system migrations-pending --category B

# Apply after review
stella system migrations-run \
  --category B \
  --namespace stellaops-green \
  --confirm
```

### Evidence Migration (If Required)

```bash
# Step 19: Check if evidence migration needed
stella evidence migrate --dry-run --namespace stellaops-green

# Step 20: If migration needed, execute
stella evidence migrate \
  --namespace stellaops-green \
  --batch-size 100 \
  --progress

# Step 21: Verify evidence integrity post-migration
stella evidence verify-all \
  --namespace stellaops-green \
  --output /tmp/post-migration-evidence.json
```

### Health Validation

```bash
# Step 22: Run health checks on green
stella doctor --full --namespace stellaops-green

# Step 23: Run smoke tests
stella test smoke --namespace stellaops-green

# Step 24: Verify critical paths
stella test critical-paths --namespace stellaops-green
```

## Traffic Cutover

### Gradual Cutover

```bash
# Step 25: Enable canary (10%)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: stellaops-canary
  namespace: stellaops-green
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx
  rules:
  - host: stellaops.company.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: stellaops-api
            port:
              number: 80
EOF

# Step 26: Monitor for 15 minutes
# Check error rates, latency, evidence operations

# Step 27: Increase to 50%
kubectl patch ingress stellaops-canary -n stellaops-green \
  --type='json' \
  -p='[{"op": "replace", "path": "/metadata/annotations/nginx.ingress.kubernetes.io~1canary-weight", "value": "50"}]'

# Step 28: Monitor for 15 minutes

# Step 29: Complete cutover (100%)
kubectl patch ingress stellaops-canary -n stellaops-green \
  --type='json' \
  -p='[{"op": "replace", "path": "/metadata/annotations/nginx.ingress.kubernetes.io~1canary-weight", "value": "100"}]'
```

### Monitoring During Cutover

Watch these dashboards:
- Grafana: Stella Ops Overview
- Grafana: Evidence Operations
- Grafana: Attestation Pipeline

Alert thresholds:
- Error rate > 1%: Pause cutover
- p99 latency > 5s: Investigate
- Evidence failures > 0: Rollback

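The alert thresholds can be expressed as a single decision function so that the same rule is applied at every monitoring step. The following is a sketch under stated assumptions: `cutover_action` is a hypothetical helper, the three metric values are assumed to have been read from the dashboards by the operator, and checking the most severe condition first (evidence failures, then error rate, then latency) is a design choice of this example rather than something the runbook specifies.

```bash
#!/usr/bin/env bash
# Map cutover metrics to an action using the alert thresholds.
# Arguments: error rate (percent), p99 latency (seconds), evidence failure count.
# Prints one of: rollback, pause, investigate, continue.
cutover_action() {
  local error_rate_pct="$1" p99_seconds="$2" evidence_failures="$3"
  awk -v e="$error_rate_pct" -v l="$p99_seconds" -v f="$evidence_failures" 'BEGIN {
    if (f > 0)      print "rollback"      # Evidence failures > 0
    else if (e > 1) print "pause"         # Error rate > 1%
    else if (l > 5) print "investigate"   # p99 latency > 5s
    else            print "continue"
  }'
}
```

For example, `cutover_action 0.4 2.1 0` reports that the cutover may continue, while any non-zero evidence failure count yields `rollback` regardless of the other values.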
## Post-Upgrade Validation

### Evidence Continuity Verification

```bash
# Step 30: Verify chain-of-custody
stella evidence verify-continuity \
  --baseline /backup/pre-upgrade-evidence-baseline.json \
  --output /reports/continuity-report.html

# Step 31: Verify Merkle roots
stella evidence verify-roots \
  --baseline /backup/pre-upgrade-merkle-roots.json \
  --output /reports/roots-verification.json

# Step 32: Compare evidence stats
stella evidence stats > /tmp/post-upgrade-evidence-stats.txt
diff /backup/pre-upgrade-evidence-stats.txt /tmp/post-upgrade-evidence-stats.txt

# Step 33: Generate audit report
# (UPGRADE_START_TIME must hold the timestamp at which the upgrade began)
stella evidence audit-report \
  --since "${UPGRADE_START_TIME}" \
  --format pdf \
  --output "/reports/upgrade-audit-$(date +%Y%m%d).pdf"
```

### Functional Validation

```bash
# Step 34: Full integration test
stella test integration --full

# Step 35: Scan test
stella scan \
  --image registry.company.com/test-app:latest \
  --sbom-format spdx-2.3

# Step 36: Attestation test
stella attest \
  --subject sha256:test123 \
  --predicate-type slsa-provenance

# Step 37: Policy evaluation test
stella policy evaluate \
  --artifact sha256:test123 \
  --environment production
```

### Post-Upgrade Checklist

- [ ] Evidence continuity verified
- [ ] Merkle roots consistent
- [ ] All services healthy
- [ ] Integration tests passing
- [ ] Scan capability verified
- [ ] Attestation generation working
- [ ] Policy evaluation working
- [ ] No elevated error rates
- [ ] Latency within SLO

**Validator signature**: __________________ **Date**: __________

## Rollback Procedures

### Immediate Rollback (During Cutover)

```bash
# Revert canary to 0%
kubectl patch ingress stellaops-canary -n stellaops-green \
  --type='json' \
  -p='[{"op": "replace", "path": "/metadata/annotations/nginx.ingress.kubernetes.io~1canary-weight", "value": "0"}]'

# Or delete canary entirely
kubectl delete ingress stellaops-canary -n stellaops-green
```

### Full Rollback (After Cutover)

```bash
# PREVIOUS_VERSION and BACKUP_TIMESTAMP must be set in this shell
# (BACKUP_TIMESTAMP is the value generated in Step 8)

# Step R1: Assess database state
stella system migrations-status

# Step R2: If migrations are backward-compatible
# Simply redeploy previous version
helm upgrade stellaops ./helm/stellaops \
  --namespace stellaops \
  --set image.tag="${PREVIOUS_VERSION}" \
  --wait

# Step R3: If database restore needed
# Stop all services first
kubectl scale deployment --all --replicas=0 -n stellaops

# Restore database
pg_restore -d stellaops -c "/backup/stellaops-${BACKUP_TIMESTAMP}.dump"

# Redeploy previous version
helm upgrade stellaops ./helm/stellaops \
  --namespace stellaops \
  --set image.tag="${PREVIOUS_VERSION}" \
  --wait

# Step R4: Verify rollback
stella doctor --full
stella evidence verify-all
```

## Cleanup

### After 72-Hour Observation

```bash
# Step 40: Verify stable operation
stella doctor --full
stella evidence verify-all

# Step 41: Remove blue environment
# (use the namespace of your previous install; earlier steps in this runbook use stellaops)
kubectl delete namespace stellaops-blue

# Step 42: Archive upgrade artifacts
# (UPGRADE_TIMESTAMP must be set; the /tmp glob also picks up pre-upgrade-health.json)
tar -czf "/archive/upgrade-${UPGRADE_TIMESTAMP}.tar.gz" \
  /backup/ \
  /reports/ \
  /tmp/pre-upgrade-*

# Step 43: Update documentation
echo "${TARGET_VERSION}" > docs/CURRENT_VERSION.md
```

## Appendix

### Version-Specific Notes

See `docs/releases/{version}/MIGRATION.md` for version-specific migration notes.

### Breaking Changes Matrix

| From | To | Breaking Changes | Migration Required |
|------|-----|-----------------|-------------------|
| 2027.Q1 | 2027.Q2 | None | No |
| 2026.Q4 | 2027.Q1 | Policy schema v2 | Yes |

### Support Contacts

- Platform Team: platform@company.com
- DBA Team: dba@company.com
- Security Team: security@company.com
- On-Call: +1-555-OPS-CALL