VEX Lens observability runbook (stub · 2025-11-29 demo)
Dashboards (offline import)
- Grafana JSON: `docs/modules/vex-lens/runbooks/dashboards/vex-lens-observability.json` (import locally; no external data sources assumed).
- Planned panels: consensus latency, conflict backlog, recompute duration, issuer trust changes, export job success rate, and DSSE verification failures.
Key metrics
- `vex_consensus_latency_seconds_bucket`: latency from observation intake to consensus write.
- `vex_conflict_queue_depth`: size of unresolved conflict queue.
- `vex_recompute_duration_seconds_bucket{reason}`: recompute times by trigger (issuer update, policy knob, ingestion delta).
- `vex_export_duration_seconds_bucket`: export job runtime.
- `vex_dsse_verification_failures_total`: failed attestations during export/ingest.
- `vex_consensus_conflicts_total{reason}`: conflict counts by reason (status disagreement, scope mismatch, missing provenance).
Logs & traces
- Correlate by `correlationId`, `artifactKey`, `advisoryKey`, and `issuer`. Include `trustTier`, `weightBefore`, `weightAfter`, and `justification` fields for audits.
- Traces disabled by default for air-gap; enable by setting `Telemetry:ExportEnabled=true` and pointing OTLP endpoint to on-prem collector.
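
The correlation guidance above could be applied to exported logs along the following lines. This is a minimal sketch that assumes logs are available as JSON lines, which the runbook does not state; the helper names `group_by_correlation` and `missing_audit_fields` are hypothetical, and only the field names come from the list above.

```python
import json
from collections import defaultdict

# Audit field names as listed in the runbook.
AUDIT_FIELDS = ("trustTier", "weightBefore", "weightAfter", "justification")


def group_by_correlation(lines):
    """Group parsed log records by correlationId (JSON-lines input is an assumption)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON lines in mixed log output
        if not isinstance(record, dict):
            continue
        key = record.get("correlationId")
        if key is not None:
            groups[key].append(record)
    return dict(groups)


def missing_audit_fields(record):
    """Return the audit fields that are absent from a single record."""
    return [name for name in AUDIT_FIELDS if name not in record]
```

Records that report missing audit fields are candidates for follow-up before an audit relies on them.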
Health/diagnostics
- `/health/liveness` and `/health/readiness` (service) must return 200; readiness checks projection store (PostgreSQL or in-memory), cache, and event bus reachability.
- `/status` exposes build version, commit, feature flags; verify it matches offline bundle manifest.
- Export self-check: run `stella vex export --format json --manifest out/manifest.json` and validate hashes against manifest entries.
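
The liveness and readiness checks above can be scripted for an air-gapped host with the Python standard library alone. This is a sketch: the two paths come from the runbook, while the helper names and the base URL used when calling it are hypothetical.

```python
import urllib.error
import urllib.request

# Endpoint paths from the runbook; both are expected to return HTTP 200.
HEALTH_PATHS = ("/health/liveness", "/health/readiness")


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout


def failing_health_checks(base_url, get_status=http_status):
    """Map each health path that did not return 200 to the status observed."""
    base = base_url.rstrip("/")
    failures = {}
    for path in HEALTH_PATHS:
        status = get_status(base + path)
        if status != 200:
            failures[path] = status
    return failures
```

An empty result from `failing_health_checks("http://localhost:8080")` (the address is a placeholder) means both probes returned 200. The `get_status` parameter exists so the check can be exercised without a running service.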
Alert hints
- Consensus latency p99 > 1.5s over 5m.
- Conflict queue depth > 500 for any tenant.
- DSSE verification failures > 0 in a 10m window.
- Export failure rate > 2% over 10m.
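
To make the thresholds above concrete, the sketch below estimates p99 from cumulative histogram buckets and evaluates the four hints. It assumes the bucket counts and failure figures have already been reduced to the relevant window (5m or 10m) upstream. The interpolation is a simplified stand-in for what a Prometheus query would compute, not the service's own alerting logic, and the function and alert names are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound_seconds, count) buckets.

    Interpolates linearly inside the bucket that contains the target rank.
    """
    buckets = sorted(buckets)
    if not buckets or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # fall back to the highest finite bound
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


def firing_alert_hints(p99_latency_s, max_queue_depth,
                       dsse_failures_10m, export_failure_ratio_10m):
    """Return the names of alert hints whose runbook thresholds are exceeded."""
    checks = {
        "consensus_latency_p99": p99_latency_s > 1.5,
        "conflict_queue_depth": max_queue_depth > 500,  # worst tenant
        "dsse_verification_failures": dsse_failures_10m > 0,
        "export_failure_rate": export_failure_ratio_10m > 0.02,
    }
    return [name for name, firing in checks.items() if firing]
```

Passing the largest per-tenant queue depth as `max_queue_depth` covers the "for any tenant" condition.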
Offline verification steps
- Import Grafana JSON locally; point to Prometheus scrape labeled `vex-lens`.
- Run export CLI above and verify `manifest.json` hashes via `jq -r '.files[].sha256'`.
- Fetch `/status` and confirm commit/version match the exported manifest and offline kit bundle metadata.
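
The hash verification step can be automated with a short script. In this sketch the `files[].sha256` layout follows the `jq` path above. The `path` key, and the choice to resolve paths relative to the manifest's directory, are assumptions about the manifest format that should be checked against a real export.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Return (path, problem) tuples; an empty list means every hash matched."""
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = []
    for entry in manifest.get("files", []):
        rel = entry["path"]  # assumed key name, not confirmed by the runbook
        target = manifest_path.parent / rel
        if not target.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(target) != entry["sha256"].lower():
            problems.append((rel, "hash mismatch"))
    return problems
```

Calling `verify_manifest("out/manifest.json")` after the export returns an empty list when all listed files are present and unmodified.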
Evidence locations
- Sprint tracker: `docs/implplan/SPRINT_0332_0001_0001_docs_modules_vex_lens.md`.
- Module docs: `README.md`, `architecture.md`, `implementation_plan.md`.
- Dashboard stub: `runbooks/dashboards/vex-lens-observability.json`.