consolidation of some of the modules, localization fixes, product advisories work, qa work

This commit is contained in:
master
2026-03-05 03:54:22 +02:00
parent 7bafcc3eef
commit 8e1cb9448d
3878 changed files with 72600 additions and 46861 deletions

View File

@@ -0,0 +1,6 @@
{
"_note": "Placeholder Grafana dashboard stub for offline import. Populate with panel definitions when metrics endpoints are available; see runbooks/observability.md for expected panels.",
"schemaVersion": 39,
"title": "Vuln Explorer Observability (stub)",
"panels": []
}

View File

@@ -0,0 +1,41 @@
# Vuln Explorer observability runbook (demo snapshot · 2025-11-29)
## Dashboards (offline-friendly)
- Grafana JSON: `docs/modules/vuln-explorer/runbooks/dashboards/vuln-explorer-observability.json` (import locally; no external data sources assumed).
- Ops dashboards: `ops/devops/vuln/dashboards/vuln-explorer.json` (CI/staging) adds API latency p95, projection lag, error rate, query budget enforcement.
## Key metrics
- `vuln_projection_lag_seconds{tenant}` seconds between latest ledger event and projector head.
- `vuln_findings_open_total{severity,tenant}` count of open findings by severity.
- `vuln_export_duration_seconds_bucket` histogram for export job runtime.
- `vuln_projection_backlog_total` queued events awaiting projection.
- `vuln_triage_actions_total{type}` immutable triage actions (assign, comment, risk_accept, remediation_note).
- `vuln_api_request_duration_seconds_bucket{route}` API latency for `GET /v1/findings*` and `POST /v1/reports`.
- `vuln_query_hashes_total{tenant,query_hash}` hashed query shapes (no PII) to observe cache effectiveness.
- `vuln_api_payload_bytes_bucket{direction}` request/response size histograms to spot oversized payloads.
## Logs & traces
- Correlate by `correlationId` and `findingId`. Structured fields: `tenant`, `advisoryKey`, `policyVersion`, `projectId`, `route`.
- Query PII guardrail: request filters are hashed (SHA-256 with deployment salt); raw filters are not logged. Strings longer than 128 chars are truncated; known PII fields (`email`, `userId`) are dropped before logging.
- Trace exemplar anchors: `traceparent` headers are copied into logs; exporters stay disabled by default for air-gap. Enable by setting `Telemetry:ExportEnabled=true` and pointing to on-prem Tempo/Jaeger.
## Health/diagnostics
- `/health/liveness` and `/health/readiness` (HTTP 200 expected; readiness checks PostgreSQL + cache reachability).
- `/status` returns build version, git commit, and enabled features; safe for anonymous fetch in sealed environments.
- Ledger replay check: `GET /v1/findings?projectionMode=verify` emits `X-Vuln-Projection-Head` for quick consistency probes.
## Alert hints (wire to local Alertmanager or watchdog)
- Projection lag > 120s for any tenant.
- API p99 latency > 800ms for `GET /v1/findings` or `POST /v1/reports`.
- Export failure rate > 2% over 10m window.
- Accepted-risk approaching expiry within 7d (emit Notify event `vuln.accepted_risk.expiring`).
## Offline verification steps
1) Import Grafana JSON locally and point to Prometheus scrape job `vuln-explorer`.
2) Run `stella vuln export --format json --manifest out/manifest.json` and validate hashes using `jq -r '.files[].sha256'` against generated bundle.
3) Use `curl -s "$BASEURL/status" | jq '{commit,version,features}'` to confirm expected build metadata matches the exported bundle manifest.
## Evidence locations
- Sprint alignment: `docs/implplan/SPRINT_0334_0001_0001_docs_modules_vuln_explorer.md`.
- API contract draft: `docs/modules/vuln-explorer/api.md` and OpenAPI at `docs/modules/vuln-explorer/openapi/vuln-explorer.v1.yaml`.
- Schema references: `docs/modules/vuln-explorer/architecture.md` (ledger model, VEX decision schemas).