synergy moats product advisory implementations

2026-01-17 01:30:03 +02:00
parent 77ff029205
commit 702a27ac83
112 changed files with 21356 additions and 127 deletions

@@ -0,0 +1,189 @@
# Runbook: Release Orchestrator - Rollback Operation Failed
> **Sprint:** SPRINT_20260117_029_DOCS_runbook_coverage
> **Task:** RUN-004 - Release Orchestrator Runbooks
## Metadata
| Field | Value |
|-------|-------|
| **Component** | Release Orchestrator |
| **Severity** | Critical |
| **On-call scope** | Platform team, Release team |
| **Last updated** | 2026-01-17 |
| **Doctor check** | `check.orchestrator.rollback-health` |
---
## Symptoms
- [ ] Rollback operation failing or stuck
- [ ] Alert `OrchestratorRollbackFailed` firing
- [ ] Error: "rollback failed" or "cannot restore previous version"
- [ ] Target environment in inconsistent state
- [ ] Previous artifact not available for deployment
---
## Impact
| Impact Type | Description |
|-------------|-------------|
| **User-facing** | Rollback blocked; potentially broken release in production |
| **Data integrity** | Environment may be in partial rollback state |
| **SLA impact** | Incident resolution blocked; extended outage |
---
## Diagnosis
### Quick checks
1. **Check Doctor diagnostics:**
```bash
stella doctor --check check.orchestrator.rollback-health
```
2. **Check rollback status:**
```bash
stella rollback status <rollback-id>
```
3. **Check previous deployment history:**
```bash
stella orch deployments list --env <env-name> --last 10
```
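The three quick checks can be wrapped in one function so that a failing command does not hide the output of the others. This is a minimal sketch rather than an official tool: the function name `quick_checks` is hypothetical, and it assumes `stella` exits non-zero when a check or lookup fails, which this runbook does not state.

```bash
#!/usr/bin/env bash
# Sketch: run the quick checks in order and count failures instead of
# stopping at the first one.
# Assumption: `stella` returns a non-zero exit code on failure.

quick_checks() {
  local rollback_id="$1" env_name="$2" failed=0

  echo "== Doctor diagnostics =="
  stella doctor --check check.orchestrator.rollback-health || failed=$((failed + 1))

  echo "== Rollback status =="
  stella rollback status "$rollback_id" || failed=$((failed + 1))

  echo "== Last 10 deployments =="
  stella orch deployments list --env "$env_name" --last 10 || failed=$((failed + 1))

  echo "quick checks finished: $failed command(s) returned non-zero"
  [ "$failed" -eq 0 ]
}
```

Call it as `quick_checks <rollback-id> <env-name>`. The function returns non-zero if any of the three commands did, so it can gate the move to deep diagnosis.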
### Deep diagnosis
1. **Check why rollback failed:**
```bash
stella rollback trace <rollback-id> --verbose
```
Look for: the step that failed and its error message
2. **Check previous artifact availability:**
```bash
stella orch artifacts get <previous-digest> --check
```
Problem if: the artifact was deleted or is missing from the registry
3. **Check environment state:**
```bash
stella orch env status <env-name> --detailed
```
4. **Check for deployment locks:**
```bash
stella orch locks list --env <env-name>
```
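During an incident it helps to keep the output of the deep-diagnosis commands for the post-incident review. The sketch below saves each command's output to its own file. The helper names (`save_step`, `collect_rollback_evidence`), the directory naming, and the file names are hypothetical, and it again assumes that `stella` signals failure through its exit code.

```bash
#!/usr/bin/env bash
# Sketch: capture deep-diagnosis output into one directory.
# Assumption: `stella` returns a non-zero exit code on failure.

# save_step OUT_DIR NAME CMD...: write stdout and stderr of CMD to NAME.txt
# and record NAME in failed.txt if CMD exits non-zero.
save_step() {
  local out_dir="$1" name="$2"
  shift 2
  "$@" > "$out_dir/$name.txt" 2>&1 || echo "$name" >> "$out_dir/failed.txt"
}

collect_rollback_evidence() {
  local rollback_id="$1" env_name="$2" prev_digest="$3"
  # Default to a UTC-timestamped directory when none is given.
  local out_dir="${4:-rollback-evidence-$(date -u +%Y%m%dT%H%M%SZ)}"
  mkdir -p "$out_dir" || return 1

  save_step "$out_dir" trace    stella rollback trace "$rollback_id" --verbose
  save_step "$out_dir" artifact stella orch artifacts get "$prev_digest" --check
  save_step "$out_dir" env      stella orch env status "$env_name" --detailed
  save_step "$out_dir" locks    stella orch locks list --env "$env_name"

  # Print the directory so the caller can attach it to the incident.
  echo "$out_dir"
}
```

Call it as `collect_rollback_evidence <rollback-id> <env-name> <previous-digest>`. A `failed.txt` file appears only if at least one command exited non-zero.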
---
## Resolution
### Immediate mitigation
1. **Force-release the lock if it is stuck:**
```bash
stella orch locks release --env <env-name> --force
```
2. **Manual rollback using specific artifact:**
```bash
stella deploy --env <env-name> --artifact <previous-digest> --force
```
3. **If the previous artifact is unavailable, deploy the last known good version:**
```bash
# Find the most recent successful deployment and note its digest
stella orch deployments list --env <env-name> --status success
# Deploy that digest
stella deploy --env <env-name> --artifact <last-good-digest>
```
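Two of the mitigation commands above use `--force` against a live environment. One way to reduce the risk of targeting the wrong environment is a small confirmation wrapper. This is a sketch only: `run_forced` is a hypothetical helper, not part of the `stella` CLI, and it simply asks the operator to retype the environment name before running the command it was given.

```bash
#!/usr/bin/env bash
# Sketch: require the operator to retype the environment name before a
# forced command runs. `run_forced` is a hypothetical helper.

run_forced() {
  local env_name="$1" answer
  shift

  # Show the exact command, then prompt on stderr so stdout stays clean.
  echo "About to run: $*" >&2
  printf 'Type "%s" to confirm: ' "$env_name" >&2
  read -r answer || return 1

  if [ "$answer" != "$env_name" ]; then
    echo "aborted: confirmation did not match" >&2
    return 1
  fi

  "$@"
}
```

For example, `run_forced <env-name> stella deploy --env <env-name> --artifact <previous-digest> --force` runs the deploy only after the environment name has been typed back correctly.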
### Root cause fix
**If previous artifact not in registry:**
1. Check artifact retention policy:
```bash
stella registry retention show
```
2. Restore from backup registry:
```bash
stella registry restore --artifact <digest> --from backup
```
3. Increase artifact retention:
```bash
stella registry retention set --min-versions 10
```
**If deployment service unavailable:**
1. Check deployment target connectivity:
```bash
stella orch connectivity --target <env-name>
```
2. Check deployment agent status:
```bash
stella orch agent status --env <env-name>
```
**If configuration drift:**
1. Check environment configuration:
```bash
stella orch env config diff <env-name>
```
2. Reset environment to known state:
```bash
stella orch env reset <env-name> --to-baseline
```
**If database state inconsistent:**
1. Check orchestrator database:
```bash
stella orch db verify
```
2. Repair deployment state:
```bash
stella orch repair --deployment <deployment-id>
```
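Each root-cause branch above starts with a read-only check before any change is made. The lookup below prints that first command for a given branch so it can be reviewed before running. It is a sketch: the function name `first_diagnostic` and the cause labels (`artifact-missing`, `service-unavailable`, `config-drift`, `db-inconsistent`) are hypothetical shorthand for the four headings, and only commands already listed in this runbook are used.

```bash
#!/usr/bin/env bash
# Sketch: map a suspected root cause to the first (read-only) command of the
# matching branch. Cause labels are hypothetical shorthand.

first_diagnostic() {
  local cause="$1" env_name="$2"
  case "$cause" in
    artifact-missing)    echo "stella registry retention show" ;;
    service-unavailable) echo "stella orch connectivity --target $env_name" ;;
    config-drift)        echo "stella orch env config diff $env_name" ;;
    db-inconsistent)     echo "stella orch db verify" ;;
    *)
      echo "unknown cause: $cause" >&2
      return 1
      ;;
  esac
}
```

The function only prints the command. Running it, and deciding whether to proceed to the corrective step that follows, stays with the operator.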
### Verification
```bash
# Verify rollback completed
stella rollback status <rollback-id>
# Verify environment state
stella orch env status <env-name>
# Verify correct version deployed
stella orch deployments current --env <env-name>
# Health check the environment
stella orch health-check --env <env-name>
```
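The verification commands can be chained so that the first failure is reported by name. This is a minimal sketch: `verify_rollback` is a hypothetical function name, and it assumes `stella` exits non-zero on failure. A zero exit code does not prove that the right version is running, so the output of `stella orch deployments current` still needs to be read and compared against the expected digest.

```bash
#!/usr/bin/env bash
# Sketch: run the verification commands in order and stop at the first one
# that fails.
# Assumption: `stella` returns a non-zero exit code on failure.

verify_rollback() {
  local rollback_id="$1" env_name="$2"

  stella rollback status "$rollback_id" \
    || { echo "FAILED: rollback status" >&2; return 1; }
  stella orch env status "$env_name" \
    || { echo "FAILED: environment status" >&2; return 1; }
  stella orch deployments current --env "$env_name" \
    || { echo "FAILED: current deployment" >&2; return 1; }
  stella orch health-check --env "$env_name" \
    || { echo "FAILED: health check" >&2; return 1; }

  echo "all four verification commands exited zero"
}
```

Call it as `verify_rollback <rollback-id> <env-name>` once the mitigation or root-cause fix has been applied.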
---
## Prevention
- [ ] **Retention:** Maintain at least 5 previous versions in registry
- [ ] **Testing:** Test rollback procedure in staging regularly
- [ ] **Monitoring:** Alert on rollback failures immediately
- [ ] **Documentation:** Document manual rollback procedures per environment
---
## Related Resources
- **Architecture:** `docs/modules/release-orchestrator/rollback.md`
- **Related runbooks:** `orchestrator-promotion-stuck.md`, `orchestrator-evidence-missing.md`
- **Rollback procedures:** `docs/operations/rollback-procedures.md`