Files
git.stella-ops.org/docs/runbooks/vuln-ops.md
2025-12-07 00:09:24 +00:00

41 lines
2.8 KiB
Markdown

# Vuln / Findings Ops Runbook (dev-mock ready)
Status: DRAFT (2025-12-06 UTC). Safe for dev/mock exercises; production steps need final digests and schema from DEPLOY-VULN-29-001.
## Scope
- Findings Ledger + projector + Vuln Explorer API deployment/rollback, plus common incident drills (lag, storms, export failures).
## Pre-flight (dev vs. prod)
1) Release manifest guard
- Dev/mock: `python ops/devops/release/check_release_manifest.py deploy/releases/2025.09-mock-dev.yaml --downloads deploy/downloads/manifest.json`
- Prod: rerun against `deploy/releases/2025.09-stable.yaml` once ledger/api digests land.
2) Render plan
- Helm (mock overlay): `helm template vuln-mock ./deploy/helm/stellaops -f deploy/helm/stellaops/values-mock.yaml --debug --validate > /tmp/vuln-mock.yaml`
- Compose (dev with overlay): `USE_MOCK=1 deploy/compose/scripts/quickstart.sh env/dev.env.example && docker compose --env-file env/dev.env.example -f docker-compose.dev.yaml -f docker-compose.mock.yaml config > /tmp/vuln-compose.yaml`
3) Backups (prod only)
- Postgres dump for Findings Ledger DB; Mongo dump if projector uses Mongo cache; copy object-store buckets tied to projector anchors.
## Deploy (mock path)
- Helm apply (dev): `helm upgrade --install stellaops ./deploy/helm/stellaops -f deploy/helm/stellaops/values-mock.yaml --atomic --timeout 10m`.
- Compose: quickstart already starts ledger + vuln API with mock pins; validate health at `https://localhost:8443/swagger` (dev certs).
## Incident drills
- Projector lag: scale projector worker up (`kubectl scale deploy/findings-ledger -n stellaops --replicas=2`) then back down; monitor queue length (metric hook pending).
- Resolver storms: temporarily set `ASPNETCORE_THREADPOOL_MINTHREADS` higher or scale API horizontally; in compose, use `docker compose restart vuln-explorer-api` after bumping `VULNEXPLORER__MAX_CONCURRENCY` env once schema lands.
- Export failures: re-run export job after verifying hashes in `deploy/releases/*`; mock path skips signing but still exercises checksum validation via `ops/devops/release/check_release_manifest.py`.
## Rollback
- Helm: `helm rollback stellaops 1` to previous revision.
- Compose: `docker compose --env-file env/dev.env.example -f docker-compose.dev.yaml -f docker-compose.mock.yaml down`.
## Evidence capture
- Keep `/tmp/vuln-mock.yaml`, `/tmp/vuln-compose.yaml`, and the release manifest used.
- `kubectl logs deployment/findings-ledger -n stellaops --since=30m > /tmp/ledger-logs.txt`
- DB snapshot checksums if taken; bundle into `vuln-evidence-$(date -u +%Y%m%dT%H%M%SZ).tar.gz`.
## Open TODOs
- Replace mock digests with production pins; add concrete env knobs for projector and API when schemas publish.
- Hook Prometheus counters for projector lag and resolver storm dashboards once metrics are exported.
_Last updated: 2025-12-06 (UTC)_