doctor: complete runtime check documentation sprint

Signed-off-by: master <>
master
2026-03-31 23:26:24 +03:00
parent 404d50bcb7
commit 152c1b1357
54 changed files with 2210 additions and 258 deletions


@@ -0,0 +1,52 @@
---
checkId: check.db.migrations.failed
plugin: stellaops.doctor.database
severity: fail
tags: [database, migrations, postgres, schema]
---
# Failed Migrations
## What It Checks
Reads the `stella_migration_history` table, when present, and reports rows marked `failed` or `incomplete`.
If the tracking table does not exist, the check returns an informational result and assumes the service uses a different migration mechanism.
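The logic described above can be approximated by hand. The following is a minimal sketch, not the check's actual implementation: the helper names `table_exists_sql` and `failed_migrations_sql` are hypothetical, the column names are taken from the query in the Docker Compose section, and the real check may filter or order rows differently.

```bash
# Sketch only: approximates the check's logic; the real query may differ.

# Prints SQL that is true when the tracking table exists.
# to_regclass returns NULL when the relation does not exist.
table_exists_sql() {
  echo "SELECT to_regclass('stella_migration_history') IS NOT NULL;"
}

# Prints SQL listing rows marked failed or incomplete, newest first.
failed_migrations_sql() {
  cat <<'SQL'
SELECT migration_id, status, error_message, applied_at
FROM stella_migration_history
WHERE status IN ('failed', 'incomplete')
ORDER BY applied_at DESC;
SQL
}
```

Either statement can be passed to `psql`, for example `psql -U stellaops -d stellaops -Atc "$(table_exists_sql)"`, which prints `t` or `f`. Run the second query only when the first reports that the table exists.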
## Why It Matters
Partially applied migrations leave schemas in undefined states. That is a common cause of startup failures and runtime `500` errors after upgrades.
## Common Causes
- A migration script failed during deployment
- The database user lacks DDL permissions
- Two processes attempted to apply migrations concurrently
- An interrupted deployment left the migration history half-written
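To rule out the DDL permission cause, PostgreSQL's built-in `has_schema_privilege` function reports whether a role may create objects in a schema. This is a sketch under assumptions: the helper name `ddl_privilege_sql` is hypothetical, and the default schema `public` is a guess, since this document does not say which schema the migrations target.

```bash
# Sketch: build a query reporting whether a role may create objects in a schema.
# The default schema "public" is an assumption; pass the real schema as $2.
# Arguments are interpolated as-is, so use trusted values only.
ddl_privilege_sql() {
  role="$1"
  schema="${2:-public}"
  printf "SELECT has_schema_privilege('%s', '%s', 'CREATE');\n" "$role" "$schema"
}
```

For example, `psql -U stellaops -d stellaops -Atc "$(ddl_privilege_sql stellaops)"` prints `t` when the role holds the `CREATE` privilege and `f` otherwise.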
## How to Fix
### Docker Compose
```bash
docker compose -f devops/compose/docker-compose.stella-ops.yml logs --tail 200 doctor-web
docker compose -f devops/compose/docker-compose.stella-ops.yml exec postgres psql -U stellaops -d stellaops -c "SELECT migration_id, status, error_message, applied_at FROM stella_migration_history ORDER BY applied_at DESC LIMIT 10;"
```
Fix the underlying SQL or permission problem, then restart the owning service so startup migrations run again.
### Bare Metal / systemd
```bash
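# Inspect recent service logs for the migration error
journalctl -u <service-name> -n 200
# Re-apply migrations; run from the project that owns them (assumes EF Core migrations)
dotnet ef database update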
```
### Kubernetes / Helm
```bash
kubectl logs deploy/<service-name> -n <namespace> --tail=200
kubectl exec -n <namespace> <postgres-pod> -- psql -U <db-user> -d <db-name> -c "SELECT migration_id, status FROM stella_migration_history;"
```
## Verification
```bash
stella doctor --check check.db.migrations.failed
```
## Related Checks
- `check.db.migrations.pending` - pending migrations often follow a failed rollout
- `check.db.schema.version` - schema consistency should be rechecked after cleanup