up
StellaOps Bot
2025-12-01 21:16:22 +02:00
parent c11d87d252
commit 909d9b6220
208 changed files with 860954 additions and 832 deletions

@@ -123,12 +123,18 @@ It aligns with `Sprint 12 Runtime Guardrails` and assumes components consume
- Extract Prometheus rules into offline monitoring cluster (`/etc/prometheus/rules.d`).
- Import Grafana dashboard via `grafana-cli --config ...`.
## 6. Observability assets
- Prometheus alert rules: `docs/modules/zastava/operations/runtime-prometheus-rules.yaml`.
- Grafana dashboard JSON: `docs/modules/zastava/operations/runtime-grafana-dashboard.json`.
- Add both to the monitoring repo (`ops/monitoring/zastava`) and reference them in
the Offline Kit manifest.
### 6.1 Surface manifest troubleshooting
- Metrics: `zastava_surface_manifest_failures_total{reason=not_found|fetch_error}` increments when Observer cannot resolve cached `cas://` pointers or digests; correlate with Scanner cache health.
- Evidence: Observer appends `runtime.surface.manifest{resolved|not_found|fetch_error}` plus `runtime.surface.manifestUri`/`manifestDigest` and up to five artifact metadata keys per manifest; view via drift diagnostics or runtime posture evidence.
- Checklist: ensure `Surface:Manifest:RootDirectory` points to the Scanner cache mount, tenant matches `ZASTAVA_SURFACE_TENANT`, and `cas://` URIs from drift/entrytrace events exist on disk (`<root>/manifests/<hh>/<tt>/<digest>.json`).
- Offline: if manifests are missing, sync the manifests directory from the Offline Kit bundle into the Observer node cache and rerun the drift check. Avoid network fetches.
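
The on-disk check from the checklist can be scripted. The following is a minimal sketch, not part of Zastava: `find_manifest` is a hypothetical helper, and it assumes the file name is the digest with any `algo:` prefix (for example `sha256:`) stripped. Because this document does not say how `<hh>` and `<tt>` are derived, the sketch matches those two directory levels with wildcards instead of computing them.

```python
from pathlib import Path
from typing import Optional


def find_manifest(root: str, digest: str) -> Optional[Path]:
    """Return the cached manifest path for a digest, or None if it is absent.

    Hypothetical helper: searches <root>/manifests/<hh>/<tt>/<digest>.json.
    """
    # Assumption: the file name omits the "algo:" prefix of the digest.
    stem = digest.split(":", 1)[-1]
    # Wildcards stand in for <hh>/<tt>, so no guess about their derivation
    # is needed. A missing manifests directory simply yields no matches.
    matches = sorted(Path(root, "manifests").glob(f"*/*/{stem}.json"))
    return matches[0] if matches else None
```

Calling `find_manifest(root, digest)` with the value of `Surface:Manifest:RootDirectory` and a `manifestDigest` taken from the evidence returns `None` when the file is not in the cache, which is the case the offline sync step addresses.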
## 7. Build-id correlation & symbol retrieval