Doctor plugin checks: implement health check classes and documentation
Implement remediation-aware health checks across all Doctor plugin modules (Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment, EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release, Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation, Authority, Core, Cryptography, Database, Docker, Integration, Notify, Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category, runbook links, and fix suggestions) consumed by the Doctor dashboard remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,89 @@
---
checkId: check.integration.secrets.manager
plugin: stellaops.doctor.integration
severity: fail
tags: [integration, secrets, vault, security, keyvault]
---
# Secrets Manager Connectivity

## What It Checks
Iterates over all secrets managers defined under `Secrets:Managers` (or the legacy single-manager keys `Secrets:Vault:Url` / `Vault:Url`). For each manager, the check sends an HTTP GET to a type-specific health endpoint: Vault uses `/v1/sys/health?standbyok=true&sealedcode=200&uninitcode=200`, Azure Key Vault uses `/healthstatus`, and all other types use `/health`. The request carries the matching auth header (`X-Vault-Token` for Vault, `Bearer` for others), and the check records reachability, authentication success, and latency. For Vault, it also parses the response JSON for the `sealed`, `initialized`, and `version` fields; because `sealedcode=200` and `uninitcode=200` make Vault answer with HTTP 200 even when it is sealed or uninitialized, those states are read from the response body rather than from the status code. The check **fails** if any manager is unreachable or returns 401/403, **fails** if any Vault instance is sealed, and **passes** if all managers are healthy and unsealed.

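The decision rules above can be sketched as a small shell helper. This is a minimal illustration under stated assumptions, not the plugin's actual implementation: the function names `health_path` and `classify_probe` are hypothetical, the type identifier `azurekeyvault` is assumed, `000` stands for "no HTTP response" (the value curl reports for `%{http_code}` in that case), and treating any other non-2xx status as a failure is an assumption, since the text only names unreachable, 401/403, and sealed.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; names below are hypothetical and not part of Stella Ops.

# Map a manager type to the health endpoint path described in this article.
# The identifier "azurekeyvault" is an assumed spelling of the type name.
health_path() {
  case "$1" in
    vault)         echo '/v1/sys/health?standbyok=true&sealedcode=200&uninitcode=200' ;;
    azurekeyvault) echo '/healthstatus' ;;
    *)             echo '/health' ;;
  esac
}

# Classify one probe result.
#   $1 = HTTP status code ("000" when no response was received)
#   $2 = value of Vault's "sealed" field ("true"/"false"); omit for non-Vault managers
classify_probe() {
  local status="$1" sealed="${2:-}"
  case "$status" in
    000)     echo 'fail: unreachable'; return ;;
    401|403) echo 'fail: authentication rejected'; return ;;
    2??)     ;;  # healthy so far; a Vault response still needs the sealed check
    *)       echo "fail: unexpected status $status"; return ;;  # assumption, see lead-in
  esac
  if [ "$sealed" = "true" ]; then
    echo 'fail: vault sealed'
  else
    echo 'pass'
  fi
}
```

For example, `classify_probe 200 true` prints `fail: vault sealed`, which mirrors why the Vault query parameters matter: the status code alone looks healthy, and only the body reveals the sealed state.
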
## Why It Matters
Secrets managers store registry credentials, signing keys, API tokens, and encryption keys. If a secrets manager is unreachable, Stella Ops cannot retrieve credentials for deployments, cannot sign attestations, and cannot decrypt sensitive configuration. A sealed Vault is equally critical: all secret reads fail until it is manually unsealed. This is a hard blocker for any release operation.

## Common Causes
- Secrets manager service is down or restarting
- Network connectivity issue between Stella Ops and the secrets manager
- Authentication token has expired or been revoked
- TLS certificate issue (expired, untrusted CA)
- Vault was restarted and needs manual unseal
- Vault auto-seal triggered due to HSM connectivity loss

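When probing a manager by hand with curl, the exit code often narrows down which of the causes above applies. The helper below is a rough triage sketch: `explain_curl_exit` is a hypothetical name, the exit codes are curl's documented ones, and the mapping to causes is only a first guess, since a single code can have several explanations.

```bash
# Hypothetical helper: translate a curl exit code into the most likely cause.
explain_curl_exit() {
  case "$1" in
    0)  echo 'request completed; check the HTTP status (401/403 suggests an expired or revoked token)' ;;
    6)  echo 'could not resolve host: DNS or configuration issue' ;;
    7)  echo 'could not connect: service down, restarting, or blocked by the network' ;;
    28) echo 'timed out: network connectivity issue' ;;
    35) echo 'TLS handshake failed' ;;
    60) echo 'TLS certificate not trusted or expired' ;;
    *)  echo "unrecognized curl exit code $1" ;;
  esac
}

# Example (URL taken from the Docker Compose instructions in this article):
#   curl -s -o /dev/null --max-time 5 http://vault:8200/v1/sys/health
#   explain_curl_exit "$?"
```
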
## How to Fix

### Docker Compose
```bash
# Check secrets manager configuration (-i: keys may be written in any case)
grep -i 'SECRETS__\|VAULT__' .env

# Test Vault health (without query parameters, 503 = sealed and 501 = not initialized)
docker compose exec gateway curl -sv \
  http://vault:8200/v1/sys/health

# Unseal Vault if sealed
docker compose exec vault vault operator unseal <key1>
docker compose exec vault vault operator unseal <key2>
docker compose exec vault vault operator unseal <key3>

# Refresh Vault token
docker compose exec vault vault token create -policy=stellaops
echo 'Secrets__Managers__0__Token=<new-token>' >> .env
# Recreate the container so the updated .env is applied;
# `docker compose restart` keeps the old environment
docker compose up -d platform
```

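Appending with `echo ... >> .env` leaves any earlier entry for the same key in the file. A minimal sketch of a replace-or-append helper is shown below; `set_env_var` is a hypothetical name, not a Stella Ops or Docker command, and it assumes the key contains no regular-expression metacharacters.

```bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing
# assignment of KEY (matched case-insensitively) instead of duplicating it.
set_env_var() {
  local file="$1" key="$2" value="$3" tmp
  touch "$file"                      # create the file if it does not exist yet
  tmp="$(mktemp)"
  # Keep every line that does not assign KEY; grep exits non-zero when nothing
  # is left, which is not an error here.
  grep -vi "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through the original path so its ownership and mode are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Usage:
#   set_env_var .env Secrets__Managers__0__Token '<new-token>'
```
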
### Bare Metal / systemd
```bash
# Check Vault status
vault status

# Unseal if needed
vault operator unseal

# Renew the Vault token
vault token renew

# Check Azure Key Vault health
curl -v https://myvault.vault.azure.net/healthstatus

# Update configuration
sudo nano /etc/stellaops/appsettings.Production.json
sudo systemctl restart stellaops-platform
```

### Kubernetes / Helm
```yaml
# values.yaml
secrets:
  managers:
    - name: vault-prod
      url: http://vault.vault.svc.cluster.local:8200
      type: vault
      existingSecret: stellaops-vault-token
```
```bash
# Update Vault token secret
kubectl create secret generic stellaops-vault-token \
  --from-literal=token=<new-token> \
  --dry-run=client -o yaml | kubectl apply -f -

helm upgrade stellaops ./chart -f values.yaml
```

## Verification
```bash
stella doctor run --check check.integration.secrets.manager
```

## Related Checks
- `check.integration.oci.credentials` -- registry credentials that may be sourced from the secrets manager