up
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Mirror Thin Bundle Sign & Verify / mirror-sign (push) Has been cancelled
api-governance / spectral-lint (push) Has been cancelled
Some checks failed
AOC Guard CI / aoc-guard (push) Has been cancelled
AOC Guard CI / aoc-verify (push) Has been cancelled
Docs CI / lint-and-preview (push) Has been cancelled
Mirror Thin Bundle Sign & Verify / mirror-sign (push) Has been cancelled
api-governance / spectral-lint (push) Has been cancelled
This commit is contained in:
@@ -37,6 +37,24 @@ Excititor’s evidence APIs now emit first-class OpenTelemetry metrics so Lens,
|
||||
3. **Alerting**: add rules for high guard violation rates, missing signatures, and abnormal chunk bytes/record counts. Tie alerts back to connectors via tenant metadata.
|
||||
4. **Post-deploy checks**: after each release, verify metrics emit by curling `/v1/vex/observations/...` and `/v1/vex/evidence/chunks`, watching the console exporter (dev) or OTLP (prod).
|
||||
|
||||
## SLOs (Sprint 119 – OBS-51-001)
|
||||
|
||||
The following SLOs apply to Excititor evidence read paths when telemetry is enabled. Record them in the shared SLO registry and alert via the platform alertmanager.
|
||||
|
||||
| Surface | SLI | Target | Window | Burn alert | Notes |
|
||||
| --- | --- | --- | --- | --- | --- |
|
||||
| `/v1/vex/observations` | p95 latency | ≤ 450 ms | 7d | 2 % over 1h | Measured on successful responses only; tenant scoped. |
|
||||
| `/v1/vex/observations` | freshness | ≥ 99 % within 5 min of upstream ingest | 7d | 5 % over 4h | Derived from arrival minus `createdAt`; requires ingest clocks in UTC. |
|
||||
| `/v1/vex/observations` | signature presence | ≥ 98 % statements with signature present | 7d | 3 % over 24h | Use `excititor.vex.signature.status{status="missing"}`. |
|
||||
| `/v1/vex/evidence/chunks` | p95 stream duration | ≤ 600 ms | 7d | 2 % over 1h | From request start to last NDJSON write; excludes client disconnects. |
|
||||
| `/v1/vex/evidence/chunks` | truncation rate | ≤ 1 % truncated streams | 7d | 1 % over 1h | `excititor.vex.chunks.records` with `truncated=true`. |
|
||||
| AOC guardrail | zero hard violations | 0 | continuous | immediate | Any `excititor.vex.aoc.guard_violations` with severity `error` pages ops. |
|
||||
|
||||
Implementation notes:
|
||||
- Emit latency/freshness SLOs via OTEL views that pre-aggregate by tenant and route to the platform SLO backend; keep bucket boundaries aligned with 50/100/250/450/650/1000 ms.
|
||||
- Freshness SLI derived from ingest timestamps; ensure clocks are synchronized (NTP) and stored in UTC.
|
||||
- For air-gapped deployments without OTEL sinks, scrape console exporter and push to offline Prometheus; same thresholds apply.
|
||||
|
||||
## Related documents
|
||||
|
||||
- `docs/modules/excititor/architecture.md` – API contract, AOC guardrails, connector responsibilities.
|
||||
|
||||
Reference in New Issue
Block a user