Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules
(Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment,
EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release,
Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation,
Authority, Core, Cryptography, Database, Docker, Integration, Notify,
Observability, Security, ServiceGraph, Sources, Verification).

Each check now emits structured remediation metadata (severity, category,
runbook links, and fix suggestions) consumed by the Doctor dashboard
remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
master
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions


@@ -0,0 +1,114 @@
---
checkId: check.security.tls.certificate
plugin: stellaops.doctor.security
severity: fail
tags: [security, tls, certificate]
---
# TLS Certificate
## What It Checks
Verifies that the configured TLS certificate can be loaded and is within its validity period. The check only runs when a certificate path is configured (`Tls:CertificatePath` or `Kestrel:Certificates:Default:Path`). It loads the certificate file and applies the following rules:
| Condition | Result |
|---|---|
| Certificate file not found | `fail` |
| Certificate cannot be loaded (corrupt, wrong password) | `fail` |
| Certificate not yet valid (`NotBefore` in the future) | `fail` |
| Certificate has expired (`NotAfter` in the past) | `fail` |
| Certificate expires in less than **30 days** | `warn` |
| Certificate valid for 30+ days | `pass` |
The check supports both PEM certificates and PKCS#12 (.pfx/.p12) files with optional passwords (`Tls:CertificatePassword` or `Kestrel:Certificates:Default:Password`).
Evidence collected includes: subject, issuer, NotBefore, NotAfter, days until expiry, and thumbprint.
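
The decision table above can be sketched as a small function. This is an illustrative Python sketch, not the plugin's implementation: the name `classify` and the constant `WARN_WINDOW` are hypothetical, and it assumes the certificate has already been loaded successfully (the "file not found" and "cannot be loaded" rows are outside its scope). How the real check treats the exact boundary instants is also an assumption here.

```python
from datetime import datetime, timedelta, timezone

# 30-day threshold taken from the table above.
WARN_WINDOW = timedelta(days=30)


def classify(not_before: datetime, not_after: datetime, now: datetime) -> str:
    """Return 'fail', 'warn', or 'pass' for an already-loaded certificate."""
    if now < not_before:
        return "fail"  # NotBefore is in the future: not yet valid
    if now > not_after:
        return "fail"  # NotAfter is in the past: expired
    if not_after - now < WARN_WINDOW:
        return "warn"  # still valid, but fewer than 30 days remain
    return "pass"      # valid for 30 days or more


now = datetime(2026, 3, 27, tzinfo=timezone.utc)
print(classify(now - timedelta(days=300), now + timedelta(days=10), now))  # warn
print(classify(now - timedelta(days=300), now + timedelta(days=90), now))  # pass
```

Passing `now` explicitly keeps the function deterministic, which makes the threshold easy to test without touching the system clock.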
## Why It Matters
An expired or invalid TLS certificate causes all HTTPS connections to fail. Browsers display security warnings, API clients reject responses, and inter-service communication breaks. In a release control plane, TLS failures prevent:
- Console access for operators.
- API calls from CI/CD pipelines.
- Inter-service communication via HTTPS.
- OIDC authentication flows with the Authority.
Certificate expiration is a common cause of production outages, and one that monitoring can prevent entirely.
## Common Causes
- Certificate file path is incorrect or the file was deleted
- Certificate has exceeded its validity period (expired)
- Certificate validity period has not started yet (clock skew or pre-dated certificate)
- Certificate file is corrupted
- Certificate password is incorrect (for PKCS#12 files)
- Certificate format not supported
## How to Fix
### Docker Compose
Mount the certificate and configure the path:
```yaml
services:
  platform:
    environment:
      Tls__CertificatePath: "/app/certs/stellaops.pfx"
      Tls__CertificatePassword: "${TLS_CERT_PASSWORD}"
    volumes:
      - ./certs/stellaops.pfx:/app/certs/stellaops.pfx:ro
```
Generate a new self-signed certificate for development:
```bash
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes \
-subj "/CN=stella-ops.local"
openssl pkcs12 -export -out stellaops.pfx -inkey key.pem -in cert.pem
```
### Bare Metal / systemd
Renew the certificate (e.g., with Let's Encrypt):
```bash
sudo certbot renew
sudo systemctl restart stellaops-platform
```
Or update `appsettings.json` to point at the new certificate:
```json
{
  "Tls": {
    "CertificatePath": "/etc/ssl/stellaops/cert.pfx",
    "CertificatePassword": "<password>"
  }
}
```
### Kubernetes / Helm
Use cert-manager for automatic certificate management:
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: stellaops-tls
spec:
  secretName: stellaops-tls-secret
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - stella-ops.yourdomain.com
```
Reference in Helm values:
```yaml
tls:
  secretName: "stellaops-tls-secret"
```
## Verification
```bash
stella doctor run --check check.security.tls.certificate
```
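
To cross-check the result independently of Doctor, the expiry of the certificate actually served by an endpoint can be read with Python's standard library. This is a minimal sketch under stated assumptions: the helper names `days_until_expiry` and `served_not_after` are hypothetical, and it inspects a live HTTPS endpoint rather than the certificate file that the check itself loads.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` (epoch seconds) and a notAfter string such as
    'Jan  1 00:00:00 2027 GMT', the format returned by getpeercert()."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def served_not_after(host: str, port: int = 443) -> str:
    """Fetch the notAfter field of the certificate served at host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(served_not_after("stella-ops.yourdomain.com"))` returns the remaining days for the placeholder host used above. Because the default context verifies the chain, an expired or self-signed certificate raises `ssl.SSLCertVerificationError` during the handshake, which is itself a useful signal.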
## Related Checks
- `check.security.headers` — HSTS requires a valid TLS certificate
- `check.security.encryption` — validates encryption at rest (TLS handles encryption in transit)
- `check.core.crypto.available` — RSA/ECDSA must be available for certificate operations