Doctor plugin checks: implement health check classes and documentation

Implement remediation-aware health checks across all Doctor plugin modules (Agent, Attestor, Auth, BinaryAnalysis, Compliance, Crypto, Environment, EvidenceLocker, Notify, Observability, Operations, Policy, Postgres, Release, Scanner, Storage, Vex) and their backing library counterparts (AI, Attestation, Authority, Core, Cryptography, Database, Docker, Integration, Notify, Observability, Security, ServiceGraph, Sources, Verification). Each check now emits structured remediation metadata (severity, category, runbook links, and fix suggestions) consumed by the Doctor dashboard remediation panel.

Also adds:
- docs/doctor/articles/ knowledge base for check explanations
- Advisory AI search seed and allowlist updates for doctor content
- Sprint plan for doctor checks documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:28:00 +02:00
parent fbd24e71de
commit c58a236d70
326 changed files with 18500 additions and 463 deletions
--- a/docs/doctor/articles/integration/ci-system-connectivity.md
+++ b/docs/doctor/articles/integration/ci-system-connectivity.md
@@ -0,0 +1,71 @@
+---
+checkId: check.integration.ci.system
+plugin: stellaops.doctor.integration
+severity: warn
+tags: [integration, ci, cd, jenkins, gitlab, github]
+---
+# CI System Connectivity
+
+## What It Checks
+Iterates over all CI/CD systems defined under `CI:Systems` (or the legacy single-system key `CI:Url`). For each system, the check sends an HTTP GET to a type-specific health endpoint (Jenkins `/api/json`, GitLab `/api/v4/version`, GitHub `/rate_limit`, Azure DevOps `/_apis/connectionData`, or generic `/health`) with the appropriate auth header (Bearer for GitHub/generic, `PRIVATE-TOKEN` for GitLab), then records reachability, authentication success, and latency. If the system is reachable and authenticated, it optionally queries runner/agent status (Jenkins `/computer/api/json`, GitLab `/api/v4/runners?status=online`). The check **fails** when any system is unreachable or returns 401/403, **warns** when every system is reachable and authenticated but at least one has zero available runners out of a non-zero total, and **passes** otherwise.
+
+## Why It Matters
+CI/CD systems are the trigger point for automated builds, tests, and release pipelines. If a CI system is unreachable or its credentials have expired, new commits will not be built, security scans will not run, and promotions will stall. Runner exhaustion has the same effect: pipelines queue indefinitely, delaying releases and blocking evidence collection.
+
+## Common Causes
+- CI system is down or undergoing maintenance
+- Network connectivity issue between Stella Ops and the CI host
+- API credentials (token or password) have expired or been rotated
+- Firewall or security group blocking the CI API port
+- All CI runners/agents are offline or busy
+
+## How to Fix
+
+### Docker Compose
+```bash
+# Verify the CI URL is correct in your environment file
+grep -E '^CI__' .env
+
+# Test connectivity from within the Docker network
+docker compose exec gateway curl -sv https://ci.example.com/api/json
+
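+# Rotate or set a new API token (edit the existing line instead if the key is already in .env)
+echo 'CI__Systems__0__ApiToken=<new-token>' >> .env
+docker compose up -d gateway   # recreates the container; `restart` would not pick up the .env change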
+```
+
+### Bare Metal / systemd
+```bash
+# Check config in appsettings
+jq '.CI' /etc/stellaops/appsettings.Production.json
+
+# Test connectivity
+curl -H "Authorization: Bearer $CI_TOKEN" https://ci.example.com/api/json
+
+# Update the token
+sudo nano /etc/stellaops/appsettings.Production.json
+sudo systemctl restart stellaops-platform
+```
+
+### Kubernetes / Helm
+```yaml
+# values.yaml
+ci:
+  systems:
+    - name: jenkins-prod
+      url: https://ci.example.com
+      type: jenkins
+      apiToken: <token>     # or use existingSecret
+```
+```bash
+helm upgrade stellaops ./chart -f values.yaml
+```
+
+## Verification
+```bash
+stella doctor run --check check.integration.ci.system
+```
+
+## Related Checks
+- `check.integration.webhooks` -- validates webhook delivery from CI events
+- `check.integration.git` -- validates Git provider reachability (often same host as CI)
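
The pass/warn/fail rule described under "What It Checks" in the article above can be sketched in a few shell functions, which may help when reproducing a result by hand. This is a minimal sketch, not the plugin's implementation: the function names (`health_path`, `system_status`, `overall_status`) and the type keys other than `jenkins` are hypothetical, and the article does not say how HTTP status codes other than 401/403 are treated, so the sketch simply counts them as reachable.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for reasoning about the check by hand.
# They are NOT part of the stella CLI or the Doctor plugin.

# Map a CI system type to the health endpoint listed in the article.
# Type keys other than "jenkins" are assumed spellings.
health_path() {
  case "$1" in
    jenkins)     echo "/api/json" ;;
    gitlab)      echo "/api/v4/version" ;;
    github)      echo "/rate_limit" ;;
    azuredevops) echo "/_apis/connectionData" ;;
    *)           echo "/health" ;;
  esac
}

# Classify one system.
# Args: http_code (curl reports 000 when no response was received),
#       available runners, total runners.
system_status() {
  local code="$1" available="${2:-0}" total="${3:-0}"
  case "$code" in
    000|401|403) echo "fail"; return ;;   # unreachable or auth rejected
  esac
  # Assumption: any other status code counts as reachable.
  if [ "$total" -gt 0 ] && [ "$available" -eq 0 ]; then
    echo "warn"                           # runners exist, none available
  else
    echo "pass"
  fi
}

# Combine per-system results: any fail -> fail, else any warn -> warn, else pass.
overall_status() {
  local result="pass" s
  for s in "$@"; do
    case "$s" in
      fail) echo "fail"; return ;;
      warn) result="warn" ;;
    esac
  done
  echo "$result"
}

# Example (requires network access, so it is left commented out):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     "https://ci.example.com$(health_path jenkins)")
#   system_status "$code" 0 4
```

Keeping the per-system classification separate from the aggregation mirrors the wording of the article: a single failing system decides the overall result, while a warning only surfaces when nothing has failed.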