old sprints work, new sprints for exposing functionality via cli, improve code_of_conduct and other agents instructions
This commit is contained in:
294
docs/operations/blue-green-deployment.md
Normal file
294
docs/operations/blue-green-deployment.md
Normal file
@@ -0,0 +1,294 @@
|
||||
# Blue/Green Deployment Guide
|
||||
|
||||
This guide documents the blue/green deployment strategy for Stella Ops platform upgrades with evidence continuity preservation.
|
||||
|
||||
## Overview
|
||||
|
||||
Blue/green deployment maintains two identical production environments:
|
||||
- **Blue**: Current production environment
|
||||
- **Green**: New version deployment target
|
||||
|
||||
This approach enables zero-downtime upgrades and instant rollback capability while preserving all evidence integrity.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Infrastructure Requirements
|
||||
|
||||
| Component | Blue Environment | Green Environment |
|
||||
|-----------|-----------------|-------------------|
|
||||
| Kubernetes namespace | `stellaops-prod` | `stellaops-green` |
|
||||
| PostgreSQL | Shared (with migration support) | Shared |
|
||||
| Redis/Valkey | Separate instance | Separate instance |
|
||||
| Object Storage | Shared (evidence bundles) | Shared |
|
||||
| Load Balancer | Traffic routing | Traffic routing |
|
||||
|
||||
### Version Compatibility
|
||||
|
||||
Before upgrading, verify version compatibility:
|
||||
|
||||
```bash
|
||||
# Check current version
|
||||
stella version
|
||||
|
||||
# Check target version compatibility
|
||||
stella upgrade check --target 2027.Q2
|
||||
```
|
||||
|
||||
See `docs/releases/VERSIONING.md` for the full compatibility matrix.
|
||||
|
||||
## Deployment Phases
|
||||
|
||||
### Phase 1: Preparation
|
||||
|
||||
#### 1.1 Environment Assessment
|
||||
|
||||
```bash
|
||||
# Verify current health
|
||||
stella doctor --full
|
||||
|
||||
# Check pending migrations
|
||||
stella system migrations-status
|
||||
|
||||
# Verify evidence integrity baseline
|
||||
stella evidence verify-all --output pre-upgrade-baseline.json
|
||||
```
|
||||
|
||||
#### 1.2 Backup Procedures
|
||||
|
||||
```bash
|
||||
# PostgreSQL backup
|
||||
pg_dump -Fc stellaops > backup-$(date +%Y%m%d-%H%M%S).dump
|
||||
|
||||
# Evidence bundle export
|
||||
stella evidence export --all --output evidence-backup/
|
||||
|
||||
# Configuration backup
|
||||
kubectl get configmap -n stellaops-prod -o yaml > configmaps-backup.yaml
|
||||
kubectl get secret -n stellaops-prod -o yaml > secrets-backup.yaml
|
||||
```
|
||||
|
||||
#### 1.3 Pre-Flight Checklist
|
||||
|
||||
- [ ] All services healthy
|
||||
- [ ] No active scans or attestations in progress
|
||||
- [ ] Queue depths at zero
|
||||
- [ ] Backup completed and verified
|
||||
- [ ] Evidence baseline captured
|
||||
- [ ] Maintenance window communicated
|
||||
|
||||
### Phase 2: Green Environment Deployment
|
||||
|
||||
#### 2.1 Deploy New Version
|
||||
|
||||
```bash
|
||||
# Deploy to green namespace
|
||||
helm upgrade stellaops-green ./helm/stellaops \
|
||||
--namespace stellaops-green \
|
||||
--create-namespace \
|
||||
--values values-production.yaml \
|
||||
--set image.tag=2027.Q2 \
|
||||
--wait
|
||||
|
||||
# Verify deployment
|
||||
kubectl get pods -n stellaops-green
|
||||
```
|
||||
|
||||
#### 2.2 Run Migrations
|
||||
|
||||
```bash
|
||||
# Apply startup migrations (Category A)
|
||||
stella system migrations-run --category A
|
||||
|
||||
# Verify migration status
|
||||
stella system migrations-status
|
||||
```
|
||||
|
||||
#### 2.3 Health Validation
|
||||
|
||||
```bash
|
||||
# Run health checks on green
|
||||
stella doctor --full --namespace stellaops-green
|
||||
|
||||
# Run smoke tests
|
||||
stella test smoke --namespace stellaops-green
|
||||
```
|
||||
|
||||
### Phase 3: Traffic Cutover
|
||||
|
||||
#### 3.1 Gradual Cutover (Recommended)
|
||||
|
||||
```yaml
|
||||
# Update ingress for gradual traffic shift
|
||||
# ingress-canary.yaml
|
||||
apiVersion: networking.k8s.io/v1
|
||||
kind: Ingress
|
||||
metadata:
|
||||
name: stellaops-canary
|
||||
annotations:
|
||||
nginx.ingress.kubernetes.io/canary: "true"
|
||||
nginx.ingress.kubernetes.io/canary-weight: "10" # Start with 10%
|
||||
spec:
|
||||
rules:
|
||||
- host: stellaops.company.com
|
||||
http:
|
||||
paths:
|
||||
- path: /
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: stellaops-green
|
||||
port:
|
||||
number: 80
|
||||
```
|
||||
|
||||
Increase weight gradually: 10% -> 25% -> 50% -> 100%
|
||||
|
||||
#### 3.2 Instant Cutover
|
||||
|
||||
```bash
|
||||
# Switch DNS/load balancer to green
|
||||
kubectl patch ingress stellaops-main \
|
||||
-n stellaops-prod \
|
||||
--type='json' \
|
||||
-p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "stellaops-green"}]'
|
||||
```
|
||||
|
||||
#### 3.3 Monitoring During Cutover
|
||||
|
||||
Monitor these metrics during cutover:
|
||||
- Error rate: `rate(http_requests_total{status=~"5.."}[1m])`
|
||||
- Latency p99: `histogram_quantile(0.99, http_request_duration_seconds_bucket)`
|
||||
- Evidence operations: `rate(evidence_operations_total[1m])`
|
||||
- Attestation success: `rate(attestation_success_total[1m])`
|
||||
|
||||
### Phase 4: Post-Upgrade Validation
|
||||
|
||||
#### 4.1 Evidence Continuity Verification
|
||||
|
||||
```bash
|
||||
# Verify evidence chain-of-custody
|
||||
stella evidence verify-continuity \
|
||||
--baseline pre-upgrade-baseline.json \
|
||||
--output post-upgrade-report.html
|
||||
|
||||
# Generate audit report
|
||||
stella evidence audit-report \
|
||||
--since $(date -d '1 hour ago' --iso-8601) \
|
||||
--output upgrade-audit.pdf
|
||||
```
|
||||
|
||||
#### 4.2 Functional Validation
|
||||
|
||||
```bash
|
||||
# Run full test suite
|
||||
stella test integration
|
||||
|
||||
# Verify scan capability
|
||||
stella scan --image test-image:latest --dry-run
|
||||
|
||||
# Verify attestation generation
|
||||
stella attest verify --bundle test-bundle.tar.gz
|
||||
```
|
||||
|
||||
#### 4.3 Documentation Update
|
||||
|
||||
- Update `CURRENT_VERSION.md` with new version
|
||||
- Record upgrade in `CHANGELOG.md`
|
||||
- Archive upgrade artifacts
|
||||
|
||||
### Phase 5: Cleanup
|
||||
|
||||
#### 5.1 Observation Period
|
||||
|
||||
Maintain blue environment for 72 hours minimum before decommission.
|
||||
|
||||
#### 5.2 Blue Environment Decommission
|
||||
|
||||
```bash
|
||||
# After observation period, remove blue
|
||||
helm uninstall stellaops-blue -n stellaops-prod
|
||||
|
||||
# Clean up resources
|
||||
kubectl delete namespace stellaops-blue
|
||||
```
|
||||
|
||||
## Rollback Procedures
|
||||
|
||||
### Immediate Rollback (During Cutover)
|
||||
|
||||
```bash
|
||||
# Revert traffic to blue
|
||||
kubectl patch ingress stellaops-main \
|
||||
-n stellaops-prod \
|
||||
--type='json' \
|
||||
-p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "stellaops-blue"}]'
|
||||
```
|
||||
|
||||
### Post-Cutover Rollback
|
||||
|
||||
If rollback needed after cutover complete:
|
||||
|
||||
1. **Assess impact**: Run `stella evidence verify-continuity` to check evidence state
|
||||
2. **Database considerations**: Backward-compatible migrations allow rollback; breaking migrations require restore
|
||||
3. **Evidence preservation**: Evidence bundles created during green operation remain valid
|
||||
|
||||
```bash
|
||||
# If database rollback needed
|
||||
pg_restore -d stellaops backup-YYYYMMDD-HHMMSS.dump
|
||||
|
||||
# Redeploy blue version
|
||||
helm upgrade stellaops ./helm/stellaops \
|
||||
--namespace stellaops-prod \
|
||||
--set image.tag=2027.Q1 \
|
||||
--wait
|
||||
```
|
||||
|
||||
## Evidence Continuity Guarantees
|
||||
|
||||
### Preserved During Upgrade
|
||||
|
||||
| Artifact | Guarantee |
|
||||
|----------|-----------|
|
||||
| OCI digests | Unchanged |
|
||||
| SBOM content hashes | Unchanged |
|
||||
| Merkle roots | Recomputed if schema changes (cross-reference maintained) |
|
||||
| Attestation signatures | Valid |
|
||||
| Rekor log entries | Immutable |
|
||||
|
||||
### Verification Commands
|
||||
|
||||
```bash
|
||||
# Verify OCI digests unchanged
|
||||
stella evidence verify-digests --report digests.json
|
||||
|
||||
# Verify attestation validity
|
||||
stella attest verify-all --since $(date -d '7 days ago' --iso-8601)
|
||||
|
||||
# Generate compliance report
|
||||
stella evidence compliance-report --format pdf
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
| Issue | Symptom | Resolution |
|
||||
|-------|---------|------------|
|
||||
| Migration timeout | Pod stuck in init | Increase `migrationTimeoutSeconds` |
|
||||
| Health check failure | Ready probe failing | Check database connectivity |
|
||||
| Evidence mismatch | Continuity check fails | Run `stella evidence reindex` |
|
||||
| Traffic not routing | 502 errors | Verify service selector labels |
|
||||
|
||||
### Support Escalation
|
||||
|
||||
If upgrade issues cannot be resolved:
|
||||
1. Capture diagnostics: `stella doctor --export diagnostics.tar.gz`
|
||||
2. Rollback to blue environment
|
||||
3. Contact support with diagnostics bundle
|
||||
|
||||
## Related Documentation
|
||||
|
||||
- [Upgrade Runbook](upgrade-runbook.md)
|
||||
- [Evidence Migration](evidence-migration.md)
|
||||
- [Database Migration Strategy](../db/MIGRATION_STRATEGY.md)
|
||||
- [Versioning Policy](../releases/VERSIONING.md)
|
||||
Reference in New Issue
Block a user